<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Don&#8217;t fear the fsync!</title>
	<atom:link href="http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/feed/" rel="self" type="application/rss+xml" />
	<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/</link>
	<description>Musings about Open Source, Linux, and Life by Theodore Tso</description>
	<lastBuildDate>Mon, 22 Feb 2010 22:39:59 -0500</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Dioktos</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2778</link>
		<dc:creator>Dioktos</dc:creator>
		<pubDate>Sat, 07 Nov 2009 11:18:43 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2778</guid>
		<description>@6: Regarding &quot;Ubuntu Jaunty and Firefox 11 beta kernels&quot;:

I knew Firefox was getting bloated, but that&#039;s a bit excessive... :P</description>
		<content:encoded><![CDATA[<p>@6: Regarding &#8220;Ubuntu Jaunty and Firefox 11 beta kernels&#8221;:</p>
<p>I knew Firefox was getting bloated, but that&#8217;s a bit excessive&#8230; <img src='http://thunk.org/tytso/blog/wp-includes/images/smilies/icon_razz.gif' alt=':P' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Glenn Maynard</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2611</link>
		<dc:creator>Glenn Maynard</dc:creator>
		<pubDate>Mon, 22 Jun 2009 22:25:19 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2611</guid>
		<description>Linux isn&#039;t a lowest-common-denominator kernel that doesn&#039;t do anything not already possible on other platforms.  Developers do the best they can manage on each platform; that&#039;s just part of porting.

(If the kernel can&#039;t implement fbarrier() for a particular scenario, it should return an error.  In practice, it would probably need to take an array of fds, and it should be possible to tell in advance whether fbarrier() will work with a particular set of FDs, to select an appropriate writing strategy.  Anyhow, we&#039;re not here to design an API that will probably never be implemented, but none of this is very difficult to define sensibly.)</description>
		<content:encoded><![CDATA[<p>Linux isn&#8217;t a lowest-common-denominator kernel that doesn&#8217;t do anything not already possible on other platforms.  Developers do the best they can manage on each platform; that&#8217;s just part of porting.</p>
<p>(If the kernel can&#8217;t implement fbarrier() for a particular scenario, it should return an error.  In practice, it would probably need to take an array of fds, and it should be possible to tell in advance whether fbarrier() will work with a particular set of FDs, to select an appropriate writing strategy.  Anyhow, we&#8217;re not here to design an API that will probably never be implemented, but none of this is very difficult to define sensibly.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2610</link>
		<dc:creator>Robert</dc:creator>
		<pubDate>Mon, 22 Jun 2009 14:58:26 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2610</guid>
		<description>Consider the case of NFS, or SSHFS, or any random not-an-ssd flash filesystem. Or ext2 on an older kernel (there are still environments that prefer kernel 2.0.current because of resource limits or new regressions). No matter what nice tools you have on a preferred filesystem, you cannot assume that your application is running in such an environment.

I accept that an fbarrier() api would be convenient and would substantially improve things for many usage profiles. But what is the fallback strategy when running in an environment where the fbarrier() api is missing, ineffective, or where the api knows it cannot provide the expected guarantees? Merely having the fbarrier execute flush and sync type operations will result in worse behavior, because the app writer just blindly called fbarrier assuming that it would do the right thing.

I generally find that hiding complexity from the developer is a bad thing. It leads to him being unaware of what the machine is actually doing, and often leads to wildly inaccurate assumptions about performance and safety.

From #16 above (I think) 
&gt; If we are peppering our code with fsync’s, even if it doesn’t hurt “that much”, we are violating the abstraction that says the kernel is supposed to take care of buffering, caching, and writing things out to disk in a sane way.

The problem is that the kernel cannot know what each developer means by &quot;a sane way&quot;, and kernel behavior that is correct for one usage case is totally wrong for another. The behavior that is correct for a high load webserver is almost certainly wrong for a critical log path, and neither behavior is really quite right for a responsive medium-importance gui app. Adapting the kernel and libs to assume any one of these jobs is going to break more things than it fixes. This suggests to me that the expected kernel abstraction has gotten a little too abstract.

Perhaps what is truly needed is to document correct recipes for all the different intended-behavior cases, then make sure the kernel doesn&#039;t suddenly regress any of the expected behaviors. It seems to me that a lot of people are counting on behavior that may or may not be in   but that certainly is NOT in ext2. When something other than their favorite assumed fs behaves differently, they go around crying &quot;Bug!&quot;. I agree that there is a bug, but perhaps we disagree as to which code contains it ;)</description>
		<content:encoded><![CDATA[<p>Consider the case of NFS, or SSHFS, or any random not-an-ssd flash filesystem. Or ext2 on an older kernel (there are still environments that prefer kernel 2.0.current because of resource limits or new regressions). No matter what nice tools you have on a preferred filesystem, you cannot assume that your application is running in such an environment.</p>
<p>I accept that an fbarrier() api would be convenient and would substantially improve things for many usage profiles. But what is the fallback strategy when running in an environment where the fbarrier() api is missing, ineffective, or where the api knows it cannot provide the expected guarantees? Merely having the fbarrier execute flush and sync type operations will result in worse behavior, because the app writer just blindly called fbarrier assuming that it would do the right thing. </p>
<p>I generally find that hiding complexity from the developer is a bad thing. It leads to him being unaware of what the machine is actually doing, and often leads to wildly inaccurate assumptions about performance and safety.</p>
<p>From #16 above (I think)<br />
&gt; If we are peppering our code with fsync’s, even if it doesn’t hurt “that much”, we are violating the abstraction that says the kernel is supposed to take care of buffering, caching, and writing things out to disk in a sane way.</p>
<p>The problem is that the kernel cannot know what each developer means by &#8220;a sane way&#8221;, and kernel behavior that is correct for one usage case is totally wrong for another. The behavior that is correct for a high load webserver is almost certainly wrong for a critical log path, and neither behavior is really quite right for a responsive medium-importance gui app. Adapting the kernel and libs to assume any one of these jobs is going to break more things than it fixes. This suggests to me that the expected kernel abstraction has gotten a little too abstract.</p>
<p>Perhaps what is truly needed is to document correct recipes for all the different intended-behavior cases, then make sure the kernel doesn&#8217;t suddenly regress any of the expected behaviors. It seems to me that a lot of people are counting on behavior that may or may not be in   but that certainly is NOT in ext2. When something other than their favorite assumed fs behaves differently, they go around crying &#8220;Bug!&#8221;. I agree that there is a bug, but perhaps we disagree as to which code contains it <img src='http://thunk.org/tytso/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Glenn Maynard</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2609</link>
		<dc:creator>Glenn Maynard</dc:creator>
		<pubDate>Sun, 21 Jun 2009 17:08:39 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2609</guid>
		<description>I&#039;ve used SQLite extensively.  It&#039;s absolutely a correct--and in my experience, the best--tool for this task.

Adding an fbarrier()-like API would not inherently make anything less suitable for any other task.  It might in practice, because implementing it may be difficult and cause internal design changes that could have adverse effects; but there&#039;s nothing *inherent* about it that would do that.

Right now, we have a drill, and the hammer hasn&#039;t been invented.  The only means  we have available to bang nails (safely write files) is to hit them with a drill (call fsync).  There are no hammers (fbarrier).</description>
		<content:encoded><![CDATA[<p>I&#8217;ve used SQLite extensively.  It&#8217;s absolutely a correct&#8211;and in my experience, the best&#8211;tool for this task.</p>
<p>Adding an fbarrier()-like API would not inherently make anything less suitable for any other task.  It might in practice, because implementing it may be difficult and cause internal design changes that could have adverse effects; but there&#8217;s nothing *inherent* about it that would do that.</p>
<p>Right now, we have a drill, and the hammer hasn&#8217;t been invented.  The only means  we have available to bang nails (safely write files) is to hit them with a drill (call fsync).  There are no hammers (fbarrier).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2607</link>
		<dc:creator>Robert</dc:creator>
		<pubDate>Sun, 21 Jun 2009 10:39:58 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2607</guid>
		<description>&gt; So now SQLite is braindead and unrobust. Right.

I&#039;d argue that SQLite is (when correctly configured) quite smart and robust. However, I&#039;d also argue that it is not the correct tool for the job the Firefox developers had in mind. It is, however, quite simple to use and that apparently makes it the best choice for the task. 



Part of the problem here is that everyone seems to assume that there is only one valid kind of problem, and that only the solutions that trade other things to maximize capability for *that* kind of problem are valid ones. 

Another problem is that plenty of app developers seem willing to assume that the product will always run on Linux (by which they really mean &quot;Always run on a linux box using a given filesystem, configured in a given way&quot;). Many of them also seem to assume that &quot;Linux&quot; (see above) should return the favor by becoming perfectly optimal for their task at the expense of every other problemspace that might potentially ALSO be using the Linux kernel and libraries.

It is frustrating to watch someone take a perfectly good electric drill, use it as a hammer, complain vociferously that it&#039;s awkward and far too complex for the job,  and proceed to re-engineer it permanently into a bad hammer. Some of us occasionally need to drill holes or install drywall, and would really have liked to use the drill as a drill.</description>
		<content:encoded><![CDATA[<p>&gt; So now SQLite is braindead and unrobust. Right.</p>
<p>I&#8217;d argue that SQLite is (when correctly configured) quite smart and robust. However, I&#8217;d also argue that it is not the correct tool for the job the Firefox developers had in mind. It is, however, quite simple to use and that apparently makes it the best choice for the task. </p>
<p>Part of the problem here is that everyone seems to assume that there is only one valid kind of problem, and that only the solutions that trade other things to maximize capability for *that* kind of problem are valid ones. </p>
<p>Another problem is that plenty of app developers seem willing to assume that the product will always run on Linux (by which they really mean &#8220;Always run on a linux box using a given filesystem, configured in a given way&#8221;). Many of them also seem to assume that &#8220;Linux&#8221; (see above) should return the favor by becoming perfectly optimal for their task at the expense of every other problemspace that might potentially ALSO be using the Linux kernel and libraries.</p>
<p>It is frustrating to watch someone take a perfectly good electric drill, use it as a hammer, complain vociferously that it&#8217;s awkward and far too complex for the job,  and proceed to re-engineer it permanently into a bad hammer. Some of us occasionally need to drill holes or install drywall, and would really have liked to use the drill as a drill.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Glenn Maynard</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2605</link>
		<dc:creator>Glenn Maynard</dc:creator>
		<pubDate>Sun, 21 Jun 2009 00:27:03 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2605</guid>
		<description>Having a stable OS doesn&#039;t mean your laptop battery won&#039;t fail, or that the dog won&#039;t yank the plug.

There&#039;s a similar issue using Vim on a busy system.  Writing a file can block the editor for several seconds, because it--correctly--uses a usual safe write sequence (a different one, since it&#039;s overwriting the whole file).  With write barriers, it could get safe writes without blocking, so I wouldn&#039;t have to wait for several seconds, breaking my train of thought, as Vim freezes up on fsync.

&gt; I wish the firefox developers would design their data I/O in a more robust and less braindead way overall

So now SQLite is braindead and unrobust.  Right.</description>
		<content:encoded><![CDATA[<p>Having a stable OS doesn&#8217;t mean your laptop battery won&#8217;t fail, or that the dog won&#8217;t yank the plug.</p>
<p>There&#8217;s a similar issue using Vim on a busy system.  Writing a file can block the editor for several seconds, because it&#8211;correctly&#8211;uses a usual safe write sequence (a different one, since it&#8217;s overwriting the whole file).  With write barriers, it could get safe writes without blocking, so I wouldn&#8217;t have to wait for several seconds, breaking my train of thought, as Vim freezes up on fsync.</p>
<p>&gt; I wish the firefox developers would design their data I/O in a more robust and less braindead way overall</p>
<p>So now SQLite is braindead and unrobust.  Right.</p>
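<p>For readers unfamiliar with the term, one common form of the safe write sequence mentioned above is: write the new contents to a temporary file, fsync it, then rename it over the original. The following is a minimal Python sketch under stated assumptions: a POSIX system, a hypothetical helper name <code>safe_write</code>, and no claim that this is the exact sequence Vim uses. The <code>os.fsync</code> call on the temporary file is the step that can stall on a busy system.</p>

```python
import os
import tempfile


def safe_write(path, data):
    """Replace the file at `path` with `data` (bytes): temp file, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # blocks until the data reaches stable storage
        os.replace(tmp_path, path)  # atomically swap the new file into place
    except BaseException:
        # Clean up the temporary file if anything above failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX. fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

<p>After a crash, a reader of <code>path</code> should see either the complete old contents or the complete new contents, which is the property the fsync is paying for.</p>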
]]></content:encoded>
	</item>
	<item>
		<title>By: Ralph Corderoy</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2601</link>
		<dc:creator>Ralph Corderoy</dc:creator>
		<pubDate>Thu, 18 Jun 2009 15:38:09 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2601</guid>
		<description>@ads, I think Firefox needs to distinguish between protecting the user from Firefox crashing, and the OS crashing.  The former can be common depending on version, plugins, etc.  The latter is a lot more unusual with Linux.  As long as FF has handed the data to the OS then I don&#039;t mind if it doesn&#039;t reach the platters for a while;  the OS can keep it in RAM for a bit if it, and the user, prefers.</description>
		<content:encoded><![CDATA[<p>@ads, I think Firefox needs to distinguish between protecting the user from Firefox crashing, and the OS crashing.  The former can be common depending on version, plugins, etc.  The latter is a lot more unusual with Linux.  As long as FF has handed the data to the OS then I don&#8217;t mind if it doesn&#8217;t reach the platters for a while;  the OS can keep it in RAM for a bit if it, and the user, prefers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Timothy Miller</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2600</link>
		<dc:creator>Timothy Miller</dc:creator>
		<pubDate>Thu, 18 Jun 2009 12:46:20 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2600</guid>
		<description>The problem with Firefox is that it accesses the disk _constantly_.  I&#039;m not as concerned about the disk activity when you&#039;re actually DOING something.  But Firefox accesses the disk even when you&#039;re doing NOTHING.  This prevents my Mac from sleeping, for instance.</description>
		<content:encoded><![CDATA[<p>The problem with Firefox is that it accesses the disk _constantly_.  I&#8217;m not as concerned about the disk activity when you&#8217;re actually DOING something.  But Firefox accesses the disk even when you&#8217;re doing NOTHING.  This prevents my Mac from sleeping, for instance.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ads</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2599</link>
		<dc:creator>ads</dc:creator>
		<pubDate>Thu, 18 Jun 2009 12:20:28 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2599</guid>
		<description>@Ralph,
For any system (firefox or otherwise), if you say &quot;we MUST preserve the complete state every few seconds&quot;, then we are back to fsync every few seconds, and the whole argument runs back to the beginning w.r.t. laptop-mode, slow fsyncs on certain systems, et cetera.  I personally can&#039;t imagine how a few tabs can be so important, but to each his own.

Maybe an eventual solution is to migrate /home to a log-based FS (say, nilfs2, now in 2.6.30) which internally does all this &quot;snapshotting&quot; as part of normal operations.  Alternatively, one could write a userland library which does this for configuration files.  Really, this means &quot;open config files in append mode, and use a re-playable format&quot;.</description>
		<content:encoded><![CDATA[<p>@Ralph,<br />
For any system (firefox or otherwise), if you say &#8220;we MUST preserve the complete state every few seconds&#8221;, then we are back to fsync every few seconds, and the whole argument runs back to the beginning w.r.t. laptop-mode, slow fsyncs on certain systems, et cetera.  I personally can&#8217;t imagine how a few tabs can be so important, but to each his own.</p>
<p>Maybe an eventual solution is to migrate /home to a log-based FS (say, nilfs2, now in 2.6.30) which internally does all this &#8220;snapshotting&#8221; as part of normal operations.  Alternatively, one could write a userland library which does this for configuration files.  Really, this means &#8220;open config files in append mode, and use a re-playable format&#8221;.</p>
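<p>The "append mode, re-playable format" idea above can be sketched roughly as follows. This is an illustration only, assuming one JSON record per line; the names <code>append_setting</code> and <code>replay</code> are hypothetical and do not come from any existing library.</p>

```python
import json
import os


def append_setting(log_path, key, value):
    """Record one change by appending a JSON line; earlier records are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"key": key, "value": value}) + "\n")
        log.flush()
        # Optional: call os.fsync(log.fileno()) here if this particular
        # change must survive a power loss.


def replay(log_path):
    """Rebuild the current settings by replaying the log from the start."""
    settings = {}
    if not os.path.exists(log_path):
        return settings
    with open(log_path, "r", encoding="utf-8") as log:
        for line in log:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # a torn final record left by a crash; stop replaying here
            settings[record["key"]] = record["value"]
    return settings
```

<p>Because existing records are only ever appended to, a crash in the middle of a write can at worst leave one incomplete trailing record, which the replay step skips; the earlier state is still recoverable.</p>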
]]></content:encoded>
	</item>
	<item>
		<title>By: Ralph Corderoy</title>
		<link>http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/comment-page-5/#comment-2598</link>
		<dc:creator>Ralph Corderoy</dc:creator>
		<pubDate>Thu, 18 Jun 2009 10:51:08 +0000</pubDate>
		<guid isPermaLink="false">http://thunk.org/tytso/blog/?p=355#comment-2598</guid>
		<description>@ads, I agree Firefox developers need to re-think, but I&#039;m not sure how your suggestion of /dev/shm helps.  If Firefox crashes I want to get it back just as it was when it crashed, including those half a dozen new tabs I&#039;ve opened that I wouldn&#039;t be able to find again easily from this morning&#039;s email/RSS reading.  The reason Firefox is making lots of effort to keep saving the current state is that users, including me, would find it very annoying if it restored itself as it was half an hour ago.</description>
		<content:encoded><![CDATA[<p>@ads, I agree Firefox developers need to re-think, but I&#8217;m not sure how your suggestion of /dev/shm helps.  If Firefox crashes I want to get it back just as it was when it crashed, including those half a dozen new tabs I&#8217;ve opened that I wouldn&#8217;t be able to find again easily from this morning&#8217;s email/RSS reading.  The reason Firefox is making lots of effort to keep saving the current state is that users, including me, would find it very annoying if it restored itself as it was half an hour ago.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
