Gentoo Archives: gentoo-user

From: Steve <gentoo_sjh@×××××××.uk>
To: gentoo-user@l.g.o
Subject: [gentoo-user] Solid state disks...
Date: Sun, 22 Feb 2009 13:48:40
Message-Id: 49A157BA.6010903@shic.co.uk
I'm playing around with an application that requires me to manage a
large (multi-gigabyte to terabyte), bespoke, frequently updated data
structure in real time... key concerns are durability and efficiency.
While a traditional approach might be to employ an expensive DBMS on
expensive hardware... I'm looking to be more innovative. I want to
achieve big-iron-beating performance on a shoestring budget... and I'm
optimistic, since the problem domain doesn't translate well to
traditional RDBMS approaches.

An obvious alternative to a DBMS is to use the file system directly...
in principle this could work - but it would be a laborious process
fraught with potential pitfalls with respect to atomicity of updates,
transactional recovery (in case of a fail-stop while processing a large
update), etc. Another issue is that, in order to establish an efficient
and reliable implementation, it becomes necessary to second-guess
details of how file systems are implemented... this vastly complicates
any implementation and might render it unacceptably fragile (subject to
unexpected deviations in behaviour as the implementation is moved
between hardware, OS versions, etc.).

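To make the atomicity concern concrete, here is a minimal sketch of the
usual "write a temporary file, fsync, rename" idiom on a POSIX file
system. The function name atomic_write is hypothetical, and rewriting a
whole file like this would only suit small pieces of state, not a
terabyte structure; it merely illustrates how much care even one
atomic update takes.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a reader sees the old or the new
    content, never a partial mix (assumes a POSIX file system)."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same file
    # system) so that the final rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push file contents to the device
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Sync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even with this idiom, whether the data is truly on stable media still
depends on the drive honouring the flush, which is the kind of
second-guessing described above.
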
I've recently discovered that SSDs are becoming more affordable... and
this might present new options. There were major hurdles in attempting
to establish a strategy for interacting with hard-disk block devices...
including, but not limited to, a significant difficulty in establishing
the extent to which locality of reference affected performance. Another
worry was that it might be difficult to establish that a write had
actually completed (i.e. that the data was reliably and durably stored,
not just that responsibility for recording the data now lay exclusively
with the drive). My hope is that SSD technology simplifies some of
these concerns, allowing a clear model of access performance that
should permit an efficient and reliable implementation.

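One rough way to probe the locality question is to time the same number
of block-sized reads issued sequentially and at random offsets. The
sketch below is only an illustration: time_reads and its parameters are
hypothetical names, and it reads through the page cache, so a
meaningful measurement would need a file or device much larger than RAM
(or caches dropped, or O_DIRECT with aligned buffers).

```python
import os
import random
import time


def time_reads(path, block_size=4096, count=256, sequential=True, seed=0):
    """Time `count` reads of `block_size` bytes from `path`.

    Returns (elapsed_seconds, bytes_read). Offsets are either
    consecutive blocks or pseudo-random blocks chosen with `seed`.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek to the end also works for block devices, where
        # st_size is reported as 0.
        size = os.lseek(fd, 0, os.SEEK_END)
        n_blocks = size // block_size
        if n_blocks == 0:
            raise ValueError("target is smaller than one block")
        if sequential:
            indices = [i % n_blocks for i in range(count)]
        else:
            rng = random.Random(seed)
            indices = [rng.randrange(n_blocks) for _ in range(count)]
        total = 0
        start = time.perf_counter()
        for i in indices:
            total += len(os.pread(fd, block_size, i * block_size))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed, total
```

Comparing time_reads(dev, sequential=True) with
time_reads(dev, sequential=False) over a range of block sizes would
give a first, crude picture of how much locality matters on a given
device.
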
I'd like to hear from anyone who has experience of configuring SSDs
for use with (Gentoo) Linux - and especially from anyone who's
investigated performance issues. I've read that SSDs typically have a
64 KiB block size... this would work fine for me (though I understand
that it is a significant impediment to high performance with existing
file systems). I'd be interested to know if anyone has done performance
analysis of SSDs at the device level under Linux... and I'm curious
whether there is more to interacting with them than establishing the
block size from manufacturer data, then reading/writing appropriately
many bytes from/to block devices... and/or flushing appropriately
aligned and sized blocks of memory-mapped data. For example, is there
an interface to query an SSD about its block size? I'd also like to
establish whether I can rely upon my data being durably stored on an
SSD when a flush/write returns.

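As a sketch of what such an interface could look like from user space:
reasonably recent Linux kernels expose the I/O limits a drive
advertises under /sys/block/<device>/queue/. The helper names below
(queue_limits, write_block_durably) are hypothetical, the values are
whatever the drive reports and need not match the flash erase-block
size, and a returning fsync only means the kernel has asked the drive
to flush; whether the drive honours that is outside the program's
control.

```python
import os

QUEUE_ATTRS = (
    "logical_block_size",
    "physical_block_size",
    "minimum_io_size",
    "optimal_io_size",
)


def queue_limits(device, sysfs_root="/sys/block"):
    """Return the advertised I/O limits for e.g. device="sda".

    Attributes that are missing or unreadable map to None.
    """
    limits = {}
    for name in QUEUE_ATTRS:
        attr_path = os.path.join(sysfs_root, device, "queue", name)
        try:
            with open(attr_path) as f:
                limits[name] = int(f.read().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


def write_block_durably(path, offset, data, block_size):
    """Write block-aligned `data` at `offset`, then fsync before returning."""
    if offset % block_size or len(data) % block_size:
        raise ValueError("offset and length must be multiples of block_size")
    fd = os.open(path, os.O_WRONLY)
    try:
        view = memoryview(data)
        written = 0
        while written < len(data):  # pwrite may write fewer bytes than asked
            written += os.pwrite(fd, view[written:], offset + written)
        os.fsync(fd)  # ask the kernel to flush data to the device
    finally:
        os.close(fd)
```

For example, queue_limits("sda") returns a dictionary of the four
values, which could then be used to pick the block_size passed to
write_block_durably.
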
In a practical sense, I'd like to experiment with some SSD hardware -
but there seems to be a lot to choose from. For development purposes,
I'd not need more than, say, 32 GB - and I'm not all that fussed about
absolute performance, as long as the relative performance of various
operations would scale proportionally were I to move to more expensive
SSDs in future. I'm interested in any practical anecdotes (or hard
statistical data) about the relative merits of various interfaces for
SSDs - and in establishing whether RAID needs to be taken into account
when building a performance model.

Any feedback would be appreciated... especially from any Gentooist who
is interested in SSD performance/reliability/configuration.