SSD’s, Journaling, and noatime/relatime

On occasion, you will see the advice that the ext3 file system is not suitable for Solid State Disks (SSD’s) due to the extra writes caused by journaling, and that Linux users with SSD’s should therefore use ext2 instead. But is this folk wisdom actually true? This weekend, I decided to measure exactly what the write overhead of journaling is in practice.

For this experiment I used ext4, since I recently added a feature that tracks the amount of data written to the file system over its lifetime (to better gauge the wear and tear on an SSD). Ext4 also has the advantage that, starting in 2.6.29, it can operate both with and without a journal, allowing me to do a controlled experiment in which I manipulate only that one variable. The test workload I chose was a simple one:

  • Clone a git repository containing a linux source tree
  • Compile the linux source tree using make -j2
  • Remove the object files by running make clean

For the first test, I ran the workload with no special mount options, the only difference between the two runs being the presence or absence of the has_journal feature. (That is, the first file system was created using mke2fs -t ext4 /dev/closure/testext4, while the second was created using mke2fs -t ext4 -O ^has_journal /dev/closure/testext4.)
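For anyone who wants to reproduce a similar measurement, here is a rough sketch of how such a run can be set up. The device name comes from the text above; the mount point and the repository URL are placeholders of my own, and it assumes an e2fsprogs recent enough for dumpe2fs to report the superblock’s lifetime-writes counter.

    # Journaled run (the default); for the journal-less run use -O ^has_journal instead.
    mke2fs -t ext4 /dev/closure/testext4
    # mke2fs -t ext4 -O ^has_journal /dev/closure/testext4

    # Lifetime-writes counter in the superblock before the workload.
    dumpe2fs -h /dev/closure/testext4 2>/dev/null | grep -i 'lifetime writes'

    mount /dev/closure/testext4 /mnt/test
    cd /mnt/test
    git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    (cd linux-2.6 && make -j2 && make clean)
    cd / && umount /mnt/test

    # ...and again afterwards; the difference is the data written by the whole workload.
    dumpe2fs -h /dev/closure/testext4 2>/dev/null | grep -i 'lifetime writes'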


Amount of data written (in megabytes) on an ext4 filesystem

    Operation      with journal    w/o journal    percent change
    git clone          367.7          353.0            4.00%
    make               231.1          203.4           12.0%
    make clean          14.6            7.7           47.3%

What the results show is that metadata-heavy workloads, such as make clean, do result in almost twice the amount of data written to disk. This is to be expected, since all changes to metadata blocks are first written to the journal and the journal transaction committed before the metadata is written to its final location on disk. However, for more common workloads where we are writing data as well as modifying filesystem metadata blocks, the difference is much smaller: 4% for the git clone, and 12% for the actual kernel compile.

The noatime mount option

Can we do better? Yes, if we mount the file system using the noatime mount option:


Amount of data written (in megabytes) on an ext4 filesystem mounted with noatime

    Operation      with journal    w/o journal    percent change
    git clone          367.0          353.0            3.81%
    make               207.6          199.4            3.95%
    make clean           6.45           3.73          42.17%

This reduces the extra cost of the journal in the git clone and make steps to just under 4%. What this shows is that most of the extra metadata cost without the noatime mount option was caused by updates to the last access time (atime) of the kernel source files and directories.
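For reference, noatime is just a mount option, so enabling it requires no reformatting; a one-off remount or an /etc/fstab entry will do (the device and mount point below are placeholders):

    mount -o remount,noatime /home
    # or persistently, via /etc/fstab:
    # /dev/sda2  /home  ext4  defaults,noatime  0  2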

The relatime mount option

There is a newer alternative to the noatime mount option: relatime. The relatime mount option updates the last access time of a file only if the last modified time or the last inode changed time is newer than the last accessed time. This allows programs to determine whether a file has been read since it was last modified. The usual (actually, only) example that is given of such an application is the mutt mail reader, which uses the last accessed time to determine whether new mail has been delivered to Unix mail spool files. Unfortunately, relatime is not free. As the table below shows, it has roughly double the overhead of noatime (but roughly half the overhead of using the standard Posix atime semantics).
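To make the rule concrete, here is a purely illustrative bash sketch of the same comparison the kernel performs internally on each read; it uses stat to read the three timestamps and is not how any real tool works:

    would_update_atime() {
        local atime mtime ctime
        read -r atime mtime ctime < <(stat -c '%X %Y %Z' "$1")
        # relatime: a read updates atime only if mtime or ctime is newer than atime
        if [ "$mtime" -gt "$atime" ] || [ "$ctime" -gt "$atime" ]; then
            echo "$1: the next read would update atime"
        else
            echo "$1: the next read would leave atime alone"
        fi
    }
    would_update_atime /var/mail/$USER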


Amount of data written (in megabytes) on an ext4 filesystem mounted with relatime

    Operation      with journal    w/o journal    percent change
    git clone          366.6          353.0            3.71%
    make               216.8          203.7            6.04%
    make clean          13.34           6.97          47.75%

Personally, I don’t think relatime is worth it. There are other ways of working around the issue with mutt; for example, you can use Maildir-style mailboxes, or you can use mutt’s check_mbox_size option. If the goal is to reduce unnecessary disk writes, I would mount my file systems using noatime, and use other workarounds as necessary. Alternatively, you can use chattr +A to set the noatime attribute on all files and directories where you don’t need atime updates, and then clear the attribute on the Unix mbox files where you do care about atime updates. Since the noatime attribute is inherited by default, you can get this behaviour by running chattr +A /mntpt right after the filesystem is first created and mounted; all files and directories created in that file system will then inherit the noatime attribute.
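In practice that looks something like this (the mount point and mbox path are placeholders):

    # Right after creating and mounting the filesystem: new files and
    # directories created underneath will inherit the 'A' (no atime) attribute.
    chattr +A /mntpt

    # Clear the attribute on the mbox file whose atime you do care about,
    # so mutt can keep using atime to detect newly delivered mail.
    chattr -A /mntpt/var/mail/tytso

    # lsattr shows which files currently carry the attribute.
    lsattr -d /mntpt /mntpt/var/mail/tytso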

Comparing ext3 and ext2 filesystems


Amount of data written (in megabytes) on an ext3 and ext2 filesystem

    Operation       ext3       ext2     percent change
    git clone      374.6      357.2          4.64%
    make           230.9      204.4         11.48%
    make clean      14.56       6.54        55.08%

Finally, just to round things out, I tried the same experiment using the ext3 and ext2 file systems. The differences between these results and the ones involving ext4 come from the fact that ext2 does not have the directory index feature (aka htree support), and that both ext2 and ext3 lack extents support, instead using the less efficient indirect block scheme. The ext2 and ext3 allocators are also somewhat different from each other, and from ext4’s. Still, the results are substantially similar to the first set of Posix-compliant atime numbers. (I didn’t bother to do noatime and relatime benchmark runs with ext2 and ext3, but I expect the results would be similar.)

Conclusion

So given all of this, where did the common folk wisdom that ext3 was not suitable for SSD’s come from? Some of it may have been from people worrying too much about extreme workloads such as “make clean”; but while doubling the write load sounds bad, going from 4MB to 7MB worth of writes isn’t much compared to the write load of actually doing the kernel compile or populating the kernel source tree. No, the problem was that first-generation SSD’s had a very bad problem with what has been called the “write amplification effect”, where a 4k write might cause a 128k region of the SSD to be erased and rewritten. In addition, in order to provide safety against system crashes, ext3 issues more synchronous write operations, meaning that ext3 waits for those writes to complete before moving on, and on those early SSD’s this caused a very pronounced and noticeable stuttering effect which was fairly annoying to users. However, the next generation of SSD’s, such as Intel’s X25-M, have worked around the write amplification effect.
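To put a rough number on that worst case: a 4k write that triggers a 128k erase-and-rewrite cycle corresponds to a write amplification factor of 128/4 = 32, which dwarfs the 4-12% journaling overhead measured above.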

What else have we learned? First of all, for normal workloads that include data writes, the overhead from journaling is relatively small (between 4% and 12%, depending on the workload). Further, much of this overhead can be reduced by using the noatime mount option, with relatime providing some of the same benefit. Ultimately, if the goal is to reduce your file system’s write load, especially where an SSD is involved, I would strongly recommend the use of noatime over relatime.

81 thoughts on “SSD’s, Journaling, and noatime/relatime”

  1. hey, very nice entry!
    i am using a lenovo thinkpad t400s with a 128gb toshiba ssd and ubuntu 9.04. as filesystem, i use ext4 with noatime instead of relatime, and with the journal. should i rather use this ssd without the journal? thanks!

  2. Thank you for all your educational posts on block devices and filesystems.

    As you, and probably a lot of other people, including myself, prefer noatime over relatime, is there any chance this could be implemented as an option to write in the super block? I often use live systems on CD’s or USB sticks, and it would be very nice to only have to specify it once per filesystem and then ensure it will always be mounted with that option. Same goes for any mount option, I guess.

  3. @gutman

    I have basically the same system. Do you have any more information on the quality/performance of this SSD? Are you still happy with ext4 with journal, or have you found something better? Thanks, if you reply it will be appreciated. Can anyone else comment on these SSD? Thanks.

  4. @14-ish

    With kernels 2.6.27 through 2.6.31 and an Acer AspireOne AOA 101, I have found that the geometry of the block device is configured differently after a resume from RAM, which causes massive filesystem corruption with all EXT variants. The symptoms are in the kernel messages – after resume I get numerous out of range errors accessing the device, and on next FSCK I lose just about every file that has been touched since resume, as well as random corruption of other files.

    Mostly because of my tiny SSD, I keep my /home on a 16G SD card in the recessed SD slot on the left. Needless to say I don’t use suspend to RAM on this netbook any more. Resume from RAM is about 14 seconds anyhow. Booting only takes twice that long.

    I’m not sure how to report the block device configuration problem, and I’m not really looking forward to corrupting my /home again to replicate the error to create a bug report, lol. If it does happen again, I’ll try to be a better community member and get that bug report in.

    Next time any of you are reporting corruption of an SD card’s filesystem ask yourself: have I suspended to RAM? Have I got an EXT filesystem?

  5. @55 Some relevant details:

    In the first paragraph the block device I’m referring to is /dev/mmcblk0, which is a SanDisk 16G card in a JMicron controller with PCI ID 197b:2381

    mmc0: SDHCI controller on PCI [0000:01:00.0] using ADMA
    sdhci-pci 0000:01:00.2: SDHCI controller found [197b:2381] (rev 0)
    sdhci-pci 0000:01:00.2: PCI INT A -> GSI 16 (level, low) -> IRQ 16
    sdhci-pci 0000:01:00.2: Refusing to bind to secondary interface.
    sdhci-pci 0000:01:00.2: PCI INT A disabled

    [I'm guessing that explains the high CPU use when writing to this device]

    mmc0: new SDHC card at address 0007
    mmcblk0: mmc0:0007 SD16G 15.3 GiB
    mmcblk0: p1
    jmb38x_ms 0000:01:00.3: PCI INT A -> GSI 16 (level, low) -> IRQ 16
    jmb38x_ms 0000:01:00.3: setting latency timer to 64

    [This contradicts the interrupt rejection above]

    EXT2-fs warning (device mmcblk0p1): ext2_fill_super: mounting ext3 filesystem as ext2

    [expect this is because I turned the journal off]

    pciehp 0000:00:1c.0:pcie04: Device 0000:01:00.0 already exists at 0000:01:00, cannot hot-add
    pciehp 0000:00:1c.0:pcie04: Cannot add device at 0000:01:00
    pciehp 0000:00:1c.0:pcie04: service driver pciehp loaded
    pciehp 0000:00:1c.1:pcie04: Bypassing BIOS check for pciehp use on 0000:00:1c.1
    pciehp 0000:00:1c.1:pcie04: HPC vendor_id 8086 device_id 27d2 ss_vid 0 ss_did 0

    [not sure why pciehp is trying to reconfigure all my PCI devices at this late stage]

  6. Hi Ted,

    After sleeping on it, I am not sold on the stats given in the tables. These are for workloads a normal user would not likely see in their day-to-day use. In reality writes will be more sparse over time, resulting in many more syncs per write, which in turn will prevent a lot of write combining.

    How bad can bad be?

    For each of these I deleted the file, synced, and re-ran the test 20 times, then took the lowest time (trying to find the hardware limit):

    $ time bash -c 'dd if=/dev/zero of=nominal_case bs=4096 count=1; sync'
    reveals a 'real' time of 92ms

    $ time bash -c 'dd if=/dev/zero of=nominal_case bs=409600 count=1; sync'
    reveals a 'real' time of 687ms

    Running sync a few thousand times reveals that it requires 58ms to run, on average. Subtracting that from the above numbers I get:

    Create a 4K file with a single 4K write: 34ms
    Create a 400k file with a single 400k write: 629ms = 6.29ms per block

    Now for the worst-case scenario: create a 4k file with 4096 single-char writes:

    $ time bash -c 'for (( a=0 ; a<4096 ; a++ )); do echo -n x >>pathological_case; sync; done'
    reveals a 'real' time of 5m35.320s on an idle machine.

    Subtracting the sync times I get 96168ms. That’s a 2828:1 performance ratio for the single-block file. For the 100 block file, it’s 15289:1.

    This is not hard to predict. For every write to the file, the file’s mtime will change and the file data will change. If journaling is on, the journal will be written too. Depending on the filesystem, the block bitmaps may change as well, so if 1 char is written per sync period we may well see on the order of 4 blocks written = 16K:1 write amplification, which in turn explains much of the performance penalty seen in this test.

    Even the ideal X25 hardware will suffer from this, unless it has some internal heuristic which recognises the incremental writes and takes advantage of some flash devices’ ability to write very small blocks, even a byte at a time, and also assuming that the filesystem does not prevent it from doing this by allocating a new block for every write, as some of the journaling filesystems can/do.

    So again, I don’t believe that the tables presented are indicative of normal use.

    The reality is probably somewhere around 3x the overhead suggested, with journaling plus atime resulting in perhaps more than double the write volume, and journaling alone contributing perhaps more like 30-50% write load for lighter use such as browsing, which I think most people do a lot more of than compiling…

  7. Wil,

    The X-25 hardware will demonstrate minimal write amplification since it has an indirection layer which maps 512 byte sectors into partially written flash blocks. Intel doesn’t say what size erase blocks it is using, but let’s say it’s 64k. That means there are 128 sectors in each erase block. Each time the system writes a 512 byte sector, the new contents of that sector will be written into a partially written erase block, and then the indirection layer will be updated so that the sector which had contained the previous contents is marked as no longer in use. Eventually, when an erase block consists entirely of superseded sectors, it is erased and then made available for new contents. If the X-25 comes close to completely exhausting its supply of unwritten sectors, the controller will pick a flash block that contains the largest number of superseded sectors, copy out the still-in-use sectors, and then erase that flash block. In effect, it’s a garbage collection pass.

    As a result, it doesn’t matter whether the writes are contiguous or sparsely separated. Writing non-contiguous sectors will have some overhead, since it results in more garbage collections, and copying still-in-use sectors is extra overhead. Still, the X-25’s write amplification factor is claimed by Intel to be 1.1, as compared to a factor of 20 found in most naively implemented flash devices. (This includes most USB thumb drives, SD cards, and most older SSD’s. There are a few newer SSD’s that are competently implemented, but there are many SSD’s which are complete crap — and after the X-25, any SSD which isn’t implementing this kind of flash translation layer is complete crap. :-)

    As far as your results are concerned, the time needed to do certain operations can be an indication of write amplification, yes. But note that most of the time, applications are not constantly forcing blocks out to disk using the sync command. If you do force a sync, it’s true that it will force writes to the allocation bitmaps, to the journal, etc. But if you aren’t forcing a sync after every single 4k write, then multiple updates to the file system metadata and the journal are very likely to be combined into far fewer disk writes. Most desktop workloads don’t look like mail server workloads, which tend to force a sync after writing typically two files (the qf* and df* files if you’re running sendmail, or the *-D and *-H files if you are using exim, etc).

  8. @55: Wil,

    I doubt the problem is due to a different hard drive geometry, since the Linux kernel doesn’t care about the hard drive geometry. The fdisk program cares about it, but only because the bootloaders care about HD geometry if they are using the oldest BIOS interfaces.

    It does sound like the SDHCI controller isn’t getting its state properly saved before the system is suspended, such that some or all reads and writes aren’t being accepted afterwards. This is going to cause problems no matter what file system you are using. I can’t really help you debug this; I’d suggest sending a note to LKML. Maybe Rafael Wysocki (who does a lot with suspend/resume). Or you might send a note to the linux-mmc@vger mailing list. According to the MAINTAINERS file, the sdhci driver is orphaned, which means there is no active maintainer, but maybe someone on the LKML or the linux-mmc list will be able to help you.

    Good luck!

    @58 Sorry, I should have been more specific. I mean the data-to-write-bandwidth amplification, not hardware write amplification after the fact due to rewriting entire erase blocks for a 512-byte disk block flush. Again, to be clear, I was talking about the ratio between a single byte written to a file followed by a sync, and the number of bytes that pass over the interface to the block device as a result. This ratio is inherent in the way the filesystem uses blocks to represent storage, and doesn’t even have anything to do with SSDs.

    When files are being updated slowly, ie not at Bonnie-like speeds, then the system’s natural sync rate applies. If I only modify a small number of files in that period then the write combining done between each flush can only merge that limited number of changes. Ie if I write 1000 files over the course of 1 hour, that will average to about 1 file every 3 seconds. This cannot be handled as efficiently as if all 1000 are written in 15 seconds.

    I hope this clarifies what I’m talking about. Real-world usage is far more sparse in the time domain than your git-clone; make; make clean example, and that is why I’m skeptical of your numbers when applied to an average user.

  10. Aha, someone has figured out the Acer AspireOne SD card problem. This probably applies to a lot of other SD controllers as well.

    http://en.gentoo-wiki.com/wiki/Acer_Aspire_One_A110L#SD_Cards_and_suspend

    Long story short, the kernel unmounts and remounts the device during suspend/resume. EXT2 and EXT3 fail to unmount, the kernel increments the device number (ie mmcblk2p1 instead of mmcblk1p1) and therefore you end up writing to a nonexistent device.

    2 workarounds include : use UNSAFE_RESUME kernel option for the MMC driver, which will prevent the kernel from messing with the device, or use LVM, which will redirect the mount seamlessly, and add lovely things like dm_crypt, snapshotting, etc. :-)

  11. @60:

    Well, I still don’t believe that most users have workloads that use a huge number of syncs. And very often people will have workloads where a number of files are written in groups (i.e., when they type “make”, or when updating the software on their laptop to fix a security bug in firefox or GNOME). If the user is editing a document in a word processor, they won’t be writing 1000 files an hour. Maybe they hit the save button once every five minutes, but even if they hit the save button once a minute, that’s still only 60 files an hour.

    So what workload do you think users would have where they will both (a) write 1000 files a minute, and (b) spread those writes evenly over the hour, so there is no write-combining, or where they will be writing files slowly? In fact, the vast majority of files are written all at once, and are not appended to slowly. The only exceptions to this are log files and mail spool files. But those are really the exceptions that prove the rule….

  12. @62: The 1000 files in a minute is the ideal circumstance, like your make example. That allows lots of write combining, even preventing some temporary files from ever making it to disk…

    It’s pretty easy to list examples of slow updates. A couple dozen RSS feeds open in an RSS aggregator (these creep in increments of around 300 bytes). A bittorrent client keeping a client list for a number of active torrents. A web developer with their browser set to refresh a page every 15 seconds. 2 chat clients logging all friend messages and status changes. The same web developer running their own Apache with PHP, editing their script and debugging bits here and there in a relatively constant stream. Many of these programs keeping their own logs and sending notices to the system message log…. it’s really not that difficult to generate a write every 3 seconds.

    I still think you’re largely right, but I still think it’s going to be a lot worse than your tables, which are representative of the ideal circumstances for which the filesystems and VFS are tuned – a heavy workload.

    Probably the only way to really prove this would be to have counters in the VFS, FS, and block layers which track the total number of block writes generated by each of:

    1) data changes
    2) journal changes
    3) metadata changes

    … and reveal them as counters in /proc for a given mount so that benchmarks can be run on various scenarios, and so that the write bandwidth of the journal can be compared directly to the volume of input data.

    This sounds pretty similar to what you started out the whole article with.
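    In the meantime, a very rough proxy is the per-device total that the block layer already exposes in /proc/diskstats, where field 10 is the number of 512-byte sectors written to the device since boot. For example, with sda standing in for the device under test:

    $ awk '$3 == "sda" { printf "%.1f MB written since boot\n", $10 * 512 / 1048576 }' /proc/diskstats

    That lumps data, journal, and metadata writes together, so it gives no per-cause breakdown, but it is enough to compare the total write volume of two benchmark runs.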

    PS – running without a journal on something as flaky as an SD card, I do keep regular backups. I use rsync on an hourly cron job to an external disk (when present) and duplicity nightly to a remote server, so I feel pretty comfortable.

    PPS, thanks for all your pointers re MMC / SDHCI, which put me on the track of a good solution – using LVM to manage the removable media. I wish it came configured out of the box like that. Once I get it tweaked to my tastes I will pass it on as a suggestion to UBUNTU/MID team and Debian. Very much appreciated.

  13. @54, ZNiP

    Yes, I am still happy with this solution! I am using ext4 with noatime and journal on the toshiba ssd (THNS128GG4BAAA).

    Small write-benchmark with ubuntu 9.10/64bit:

    root@t400s:/home/user# dd if=/dev/zero of=test.dat bs=1M count=5000
    5000+0 records in
    5000+0 records out
    5242880000 bytes (5.2 GB) copied, 23.9574 s, 219 MB/s

  14. That’s cached (write-back) performance, not real sustainable write performance.

    To find the real sustainable sequential write speed, try this:

    dd if=/dev/zero of=_filler bs=1M count=10 oflag=sync; rm -f _filler

    My AOA SSD’s performance (Z-P230, model SSDPAMM0008G1) is roughly 5.4M/s, a little over half what Intel claim(ed) it is (38M/s read, 10M/s write). The drive is at 88% capacity, so that’s really not too bad. SSD drives tend to slow down as they fill up, and get older, for a variety of reasons. Intel has pulled all specs for this device from their site. I’m guessing they’re not specially proud of it.

  15. @65:

    Actually, Intel’s X25-M Product Manual claims a sustained write performance of up to 70MB/s, but that’s not the important figure. The much more interesting number, which most SSD’s, including your SD flash card, can’t match, is the random 4k write benchmark using iometer. Traditionally, flash devices, including most SD cards, were used in cameras, where you tended to write nice big images, and so you could get away with a relatively primitive filesystem such as FAT. But a Linux/Unix system (or even Windows or MacOS) will be writing small files and big files, and will tend to have a much more random workload that includes many more single-block writes.

    Using an iometer queue depth of 3, with 4k random writes, the X25-M can sustain a write bandwidth of 54.5 MB/s. In contrast, a crappy JMicron JMF602B-based SSD can do maybe 21 k/s, with an average latency of 500ms and a worst-case latency of 2 seconds. That means the system could take up to 2 seconds to write a random 4k block!!! This is what really matters if you’re trying to engineer for performance. (The X25-M has an average random 4k write latency of 0.089ms, and a worst case write latency of around 100ms.) 3 orders of magnitude difference when measuring 4k random write latency is what separates the big boys from the also-rans. Sequential write speed matters only if your workload is a digital camera taking pictures, or something equivalent.

    Try running iometer with a 4k random write size and a queue depth of 3, and see what you get on your SD card, and report back if you dare. :-)

  16. hehe, yes, I know the X25 is fast. I have been following it, the OCZ Vertex series, and some other pretenders.

    The SuperTalent FusionIO PCIE 8x card, blows away all the SSD competition. It’s not really an SSD – it’s more like a NAND memory expansion with up to 2TB of address space. It beats the X25m by a factor of 25 or more on random 4k block writes (off the top of my head.)

    Having no Windows platform to run iometer on, with the Linux binary having no command-line options to directly run tests (that I can find), and with the documentation for the latest version of IoMeter currently dumping Perl code instead of running the script on the IoMeter website, I’m afraid I’m not going to get any mileage out of that app…

    http://www.iometer.org/cgi-bin/jump.cgi?URL=http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/iometer/iometer/Docs/Iometer.pdf?rev=HEAD

  17. @68: Wil,

    It wasn’t clear to me that you understood the X25-M is fast, if you’re comparing it to an SD card. :-(

    The Fusion IOXtreme drive is fast, yes, but it’s also about 4 times the cost of an X25-M drive, and you can’t boot off of one, and it won’t fit in a laptop. But sure, if you have a PCIe slot, and you can afford it, and you need its speed, then it’s a good choice.

    The trick with using Iometer under Linux is apparently to put the options in the configuration file, but it’s not at all well documented, and I don’t have time to try to figure out the config file format from the sources. But iometer is open source, so hopefully someone will get around to fixing it up and documenting it better for operation under Linux.
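    For what it’s worth, a roughly comparable test can be run under Linux with fio (not something discussed in this thread, and the numbers won’t line up exactly with Iometer’s); the test file path below is a placeholder:

    $ fio --name=randwrite-qd3 --filename=/mnt/ssd/fio.test --size=1g \
        --rw=randwrite --bs=4k --ioengine=libaio --iodepth=3 --direct=1 \
        --runtime=60 --time_based --group_reporting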

  18. That X25-M SSDSA2MH160G2R5 part is looking pretty inviting. There’s an OCZ part that’s a little cheaper, faster, smaller.

    Really, if you have a laptop in the $1000+ range, what can anyone add to it that will improve the overall performance as much for the money as putting one of the above SSDs into it?

    I’m sure the manufacturers know this, but it’s hard to sell 1/10th the storage for the same money, regardless of the performance.

    There’s still the question of reliability. Both Intel and OCZ state “1.5M hours MTBF” then offer a 1 or 2 year warranty. Hello? 1.5M hours is 171 years! If there’s a shred of honesty behind their statements then they would back them with lifetime unrated replacement warranties. So take both with a sack of salt.

    I’ll leave checking OCZ’s track record on SSD parts as an exercise for the reader. Spending that much money without doing research is something I would never encourage anyone to do. BTW, that was a subtle hint to check OCZ’s track record before buying their parts…

  19. @69: I was comparing the Intel Z-P230 to an SD card. They have very similar performance characteristics, in fact a high-end SD card outperforms the Intel SSD in all but max linear read, and even then by <10%. Pretty sad.

  20. @69: Wil,

    Sorry, I missed the Z-P230 reference. I must have been reading your post too quickly, and I probably mistook it for an Atom CPU part number…. My bad. Yeah, as far as I can tell that Intel SSD was something cheap cheap cheap that was intended for sale directly to netbook manufacturers. It has since disappeared without a trace.

    Note BTW, that if price is an issue and you don’t need that much capacity (how much do you need for a netbook or even a laptop if most of your software is on the cloud? I need 80GB because I’m a developer; if I nuked all of the source trees I could probably live with 40GB) you can also get a 40GB Kingston SSDNow V SSD which uses the Intel controller, but with half the flash channels. You have to be careful though; the 64G and 128G Kingston SSDNow V use the JMicron controller. (V stands for value, which can sometimes also be another way of saying, buyer needs to be careful. :-)

    With OCZ, yes, you have to be careful. They produce a large number of devices at different price points, and clearly at different levels of quality.

  21. I should add that if you have a larger laptop, such as a Lenovo Thinkpad T series, you can use two drives: a 40GB or 80GB SSD where you have your OS and your home directory, and a 500GB 5400rpm hard drive where you have your build directories, music, images, etc. That’s what I do these days… the source tree is on the SSD for speed, but the build trees where the compiler deposits the object files are on the 500GB disk. That way I get the best of both worlds. I use the SSD for frequently accessed files or files where fast access will improve the “feel” of my system, and I use the hard drive for bulk storage and for writes which, even if slower (such as writing object files), won’t hinder the overall speed of my system.

    This is also very easy to do for a desktop, of course — although these days I generally use my laptop instead of a desktop.
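    One convenient way to arrange that split for a kernel tree is the kernel’s out-of-tree build support, where make O= points the object files at a separate output directory; the paths below are placeholders:

    $ cd /ssd/src/linux
    $ make O=/hdd/build/linux defconfig
    $ make O=/hdd/build/linux -j2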

  22. Hi Ted,
    How do you use the ext4 lifetime-writes information in practice?
    I’ve seen your kernel commit:

    commit afc32f7ee9febc020c73da61402351d4c90437f3
    Author: Theodore Ts’o
    Date: Sat Feb 28 19:39:58 2009 -0500

    ext4: Track lifetime disk writes

    Add a new superblock value which tracks the lifetime amount of writes
    to the filesystem. This is useful in estimating the amount of wear on
    solid state drives (SSD’s) caused by writes to the filesystem.

    but I don’t know how I can actually see that information? Does it appear
    in the output of some tool made for ext4?

  23. Ted,

    I notice that all of the benchmarks do writes. However, I wonder what might happen if you included one that was read-only, such as a recursive word count.
    Something like “find /fs -type f | xargs wc”.

    There you would see the atime/relatime factor really show up. But by how much? I would guess A Lot. I think atime is useless (mutt bedamned) personally, but by how much?

  24. Another program that depends on atime is tmpwatch. Took me a while to figure out why /var/tmp/nginx/* kept disappearing on me. Turns out noatime + tmpwatch guarantees issues for more than just mutt.
