Atomic shared log file writes with FILE_APPEND_DATA
TL;DR: Serilog’s file sinks are getting multi-process sharing courtesy of this little-known Windows API feature.
I was flicking through my copy of Michael Kerrisk’s The Linux Programming Interface a few nights back and stumbled upon the Linux O_APPEND
file open flag. Files opened with this option don’t track a current position in the file but instead append all writes, atomically, to the end of the file.
This is just what concurrent writers to a shared log file need in order to avoid clobbering each other’s writes, or interleaving chunks of events in unexpected ways.
Serilog’s simple file and rolling file sinks had so far not implemented multi-process file sharing, because all of the synchronization mechanisms I’d seen elsewhere had some annoying edge case or major performance flaw. Wouldn’t it be great, I thought, if O_APPEND
existed on Windows?
Some investigation along these lines led me to FILE_APPEND_DATA
. The documentation for this Windows API flag describes it as an access right granting the right to append data to the file. It’s a little strange to consider it an “access right” rather than a feature of the file handle you’ve opened, but if you only have the right to append data to a file - not to write it at an arbitrary position - then I guess it makes sense that data should end up written to the end of the file in a reliable fashion.
What about using this from C#/.NET? This is where things get a little tricky.
Of course, C# libraries can PInvoke any Windows functions they please, including CreateFile()
with FILE_APPEND_DATA
specified. In practice, maintaining this sort of code is a pain. A quick search revealed this great StackOverflow thread discussing how FILE_APPEND_DATA
maps to the FileSystemRights.AppendData
constant in .NET.
var file = new FileStream(
"some_file.txt",
FileMode.Append,
FileSystemRights.AppendData, // <- Atomic append
FileShare.Write,
4096,
FileOptions.None);
While the FileStream
constructor accepts this flag, the internal buffering in both FileStream
and StreamWriter
make it awkward to get atomic writes working correctly. In order for writes to be atomic using FILE_APPEND_DATA
, they need to be performed in a single call to the underling Windows API.
To get atomically-appended writes to shared log files working:
- The
FileStream
internal buffer needs to be big enough to hold all of the data for a single write - Data has to be written into the buffer from position 0 so that it fits
- Text written through the
StreamWriter
wrapping theFileStream
has to be completely flushed before each write
Of these requirements, the first one (1) is the least pleasant to work around. The buffer size is fixed when a FileStream
is created (the 4096
argument above), so the only way to atomically write a larger event is to close and reopen the file with a larger buffer. For Serilog, determining how many bytes an event requires means formatting it into yet another buffer first.
Flushing the StreamWriter
and FileStream
between writes neatly takes care of requirements (2) and (3).
You can check out the slightly nasty choreography of buffers and flushing here in SharedFileSink.cs
.
How does it perform?
Writing events through a StreamWriter
’s buffer, to a MemoryStream
, to a FileStream
’s buffer, to Windows, doesn’t sound like a recipe for good performance. Surprisingly, my simple test of writing 1,000,000 small events to file on my Windows 10, i7, SSD laptop showed a difference of less than 10% between the “shared” and “exclusive” code paths (about 10 µs/event with sharing, 11 µs/event without).
Differences will be more significant when large events are encountered (FileStream
re-opening) and there’s some additional memory required, so sharing is off by default in both the file and rolling file sinks.
Turning off the default Flush()
that the sinks normally perform (buffered: true
) boosts performance to about 5 µs/event, and this option is not available in conjunction with sharing, so the performance difference between the approaches is more marked if that configuration is in use.
How do I get it?
These changes haven’t made it through to final released packages yet, but pre-release builds are on NuGet. If you’re in a situation requiring multi-process shared log files, update to the latest pre-release version of the appropriate package:
Update-Package Serilog.Sinks.RollingFile -Pre
Then pass shared: true
to the configuration method:
Log.Logger = new LoggerConfiguration()
.WriteTo.RollingFile("log-{Date}.txt", shared: true)
.CreateLogger();
Unfortunately, FileSystemRights.AppendData
isn’t exposed in .NET Core, so this feature only works on the full .NET Framework at the moment.
In all this was a fun little feature to investigate and implement, and I’d love to hear from you if you find it useful.