ext4, application expectations and power management
Mar 17, 2009, 08:32
"Asking the filesystem to truncate a file and then writing to it
is an invitation to failure - there's clearly no way for it to
intuit the correct answer here. In the end this has been avoided by
avoiding delayed allocation when writing to a file that's just been
truncated, so everything's fine.
"However, there's another case that also breaks. A common way of
saving files is to open("foo.tmp"), write(), close() and then
rename("foo.tmp", "foo"). The mindset here is that a crash will
either result in foo.tmp being zero length, foo still being the
original file or foo being your new data. The important aspect of
this is that the desired behaviour of this code is that foo will
contain either the original data or the new data. You may suffer
data loss, but you won't suffer complete data loss - the
application state will be consistent.
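
For readers who want to see the pattern concretely, here is a minimal sketch of the write-then-rename save described above, using Python's os module, whose calls wrap the same system calls. The function name save_without_fsync and the ".tmp" suffix are illustrative assumptions, not taken from the quoted post.

```python
import os

def save_without_fsync(path: str, data: bytes) -> None:
    """Save data to path via a temp file and rename, with no fsync().

    Hypothetical helper: the name and the ".tmp" suffix are assumptions.
    """
    tmp_path = path + ".tmp"
    # open("foo.tmp"): create or truncate the temporary file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # write(): loop because os.write may write fewer bytes than asked.
        offset = 0
        while offset < len(data):
            offset += os.write(fd, data[offset:])
    finally:
        # close(): closing does not force the data to disk.
        os.close(fd)
    # rename("foo.tmp", "foo"): replace the old file with the new one.
    os.rename(tmp_path, path)
```

The application's expectation is that a crash at any point leaves either the old contents or the new contents under the final name.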
"When used with its (default) data=ordered journal option, ext3
provided these semantics. ext4 doesn't. Instead, if you want to
ensure that your data doesn't get trampled, it's necessary to
fsync() before closing in order to make sure it hits disk.
Otherwise the rename can occur before the data is written, and
you're back to a zero length file. ext4 doesn't make guarantees
about whether data will be flushed before metadata is written."
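
The fix the post describes, calling fsync() before closing the temporary file, might look roughly like the sketch below. It uses Python's os module, which exposes fsync and rename directly; the function name save_with_fsync and the ".tmp" suffix are assumptions made for illustration.

```python
import os

def save_with_fsync(path: str, data: bytes) -> None:
    """Save data to path via a temp file, fsync() it, then rename.

    Hypothetical helper: the name and the ".tmp" suffix are assumptions.
    """
    tmp_path = path + ".tmp"
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        # Loop because os.write may write fewer bytes than requested.
        offset = 0
        while offset < len(data):
            offset += os.write(fd, data[offset:])
        # Ask the kernel to push the file's data to disk before the
        # rename is issued, so the rename cannot land ahead of the data.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Only now swap the new file into place under the final name.
    os.rename(tmp_path, path)
```

Calling save_with_fsync("foo", b"new contents") costs a synchronous flush on every save, which is the trade-off the headline's mention of power management alludes to.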