Message templates and properties – structured logging concepts in .NET (3)

This post is the third in a series; start here.

Trojan eventing

A long time ago I encountered a library called Bamboo.Prevalence, by the amazing Rodrigo “Bamboo” de Oliveira. Following Java’s Prevayler, the idea underpinning object prevalence is that rather than persisting your application’s object model as a point-in-time snapshot, a log of commands affecting the model (Add item 5 to order 1!) is written instead. Later, to recreate the model at a point in time, the application can just replay those commands from the beginning to arrive at the desired state.

I was fascinated by the idea of object prevalence. It turned the whole notion of application state from a static, opaque thing into a dynamic and detailed view of the application’s history.

Prevalence never really took off in .NET because of the practical challenges associated with data migration, reporting, integration points and so-on. Existing infrastructure expected a SQL database, and moving away from that model would have necessitated “rewriting the world”.

Still, a little later on I encountered this idea again in the guise of event sourcing, an architectural pattern based on a similar application-state-as-transaction-log model.

As with object prevalence before, I found event sourcing very interesting, but as with object prevalence again, the divide between the architecture and the status quo in typical enterprise software environments kept it out of most projects I was involved with.

Events are exciting because they capture a whole dimension that’s otherwise invisible, but equally if not more important than the static view that the current system state describes. In .NET, messaging architectures sometimes capture an event-oriented view of an application, and Reactive Extensions (Rx) has had some success with getting a first class treatment of events into .NET apps, but events are still usually a secondary, transient consideration and hence a lost opportunity for diagnostics and analysis.

Eventually, down the line, I stumbled across what you might call Trojan eventing, a sneaky way of grafting a fully-structured, persistent, queryable event stream onto an application, without a change of architecture and in most cases, without many of your teammates even noticing! Hey, your app right now might be producing a first-class event stream! To help you recognize it, this post describes what it looks like 😉

Capturing vs. formatting log event data

The classic workflow associated with text logging goes something like this:

  1. Take the description of an event and associated parameters
  2. Format it into human-readable text
  3. Collect the text
  4. Parse the text to recover individual parameter values
  5. Filter based on parameters
  6. View matching events

If you break this down, the formatting step (2) actually achieves two things. The event is made into something human readable (the formatting part), but by serializing the whole thing as text, the associated parameters are also being captured in a form that can survive outside the application.

Step 4, where we need to expend effort to recover information from the event for machine processing, comes about because we threw all of the structural information away at step 2. This is painful because practically every interesting thing we can do with a log event revolves around the specific parameter values associated with the event. Writing regular expressions to extract those values gets tiring quickly.
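To make that cost concrete, recovering the parameter values from a rendered line typically means something like this (a contrived sketch, assuming line holds one rendered log line):

using System.Text.RegularExpressions;

// line is one rendered log line, e.g. "Checked out order o12345 for $15.30"
var match = Regex.Match(line, @"Checked out order (?<id>\S+) for \$(?<total>[\d.]+)");
var orderId = match.Success ? match.Groups["id"].Value : null;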

To fix this, structured logging approaches data capture and formatting separately. In the structured logging workflow, capturing happens first while formatting an event into human-readable text is deferred until it’s needed later on, after any property-based filtering has been applied:

  1. Take the description of an event and associated parameters
  2. Capture the description and parameter values
  3. Collect a serialized representation like JSON
  4. Filter based on parameters
  5. Format into human-readable text*
  6. View matching events

*Formatting can be deferred, but it doesn’t necessarily have to be – the collected JSON can include the human-readable representation alongside the template and parameters, if it makes processing easier in subsequent steps.

What this all amounts to is that the log statements already in your application:

Log.Information("Checked out order {0} for ${1:C}", order.Id, order.Total);
// -> Checked out order o12345 for $15.30

Can be converted into fully-structured, first-class events with negligible effort:

Log.Information("Checked out order {OrderId} for ${Total:C}", order.Id, order.Total);
// -> Checked out order o12345 for $15.30
// -> { … "OrderId": "o12345", "Total":"15.3" … }

The message template "Checked out order {OrderId} for ${Total:C}" names the parameters of the log event (and in some cases describes how they are captured) so that they can be represented as properties in a structured format like JSON, and easily manipulated without the need for free-text parsing.

Message template basics

Message templates are a superset of .NET format strings. That means every valid .NET format string is also a valid message template.

Log.Information("Checked out order {0} for ${1:C}", order.Id, order.Total);

The property names captured in the example above will be 0 and 1.

While numeric property names are a useful compatibility feature when moving code from a format-based logging implementation like log4net, descriptive names are much more pleasant to work with. Message templates extend format strings by allowing textual names to be used instead of numeric placeholders.

In a message template with named placeholders, the parameters specified alongside the template are always matched from left to right, that is, the two properties First and Second will have the values 1 and 2:

Log.Information("Capturing {First} and {Second}", 1, 2);

When writing a message template, either numeric or named placeholders should be used, not a mixture of both. While libraries like Serilog will try to do something reasonable when confronted with a mix of named and numeric placeholders, doing so doesn’t make much sense.

This is because when capturing, named placeholders are matched strictly left-to-right with the arguments, while numeric placeholders in .NET format strings allow out-of-order capturing and repeated substitutions of the same index.
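A quick illustration of the difference (the rendered output assumes Serilog’s format-string-compatible handling of all-numeric templates):

// Numeric placeholders behave like .NET format strings: out of order, repeatable
Log.Information("{1}, then {0}, then {0} again", 10, 20);
// -> 20, then 10, then 10 again

// Named placeholders pair with the arguments strictly left-to-right
Log.Information("{First} then {Second}", 10, 20);
// -> 10 then 20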

Because of their relationship to standard .NET format strings, width and format specifiers are also supported in message templates.

Log.Information("Capturing {First,8} and {Second:x8}", 1, 2);
// -> Capturing        1 and 00000002

(To fully support deferred rendering, both the raw property value and its formatted representation need to be captured up front in this scenario.)

Likewise, braces in the message itself can be escaped by doubling them: {{ and }}.
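For instance, a doubled brace renders as a literal brace:

Log.Information("Literal {{braces}} around {Value}", 42);
// -> Literal {braces} around 42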

Validating message template usage

SerilogAnalyzer is a Roslyn-based Visual Studio plug-in by Robin Sue that validates message template syntax and checks binding to parameters.

Usage issues result in a red squiggly, and an error or warning in the Visual Studio error list.

[Screenshot: SerilogAnalyzer flagging a message template issue in Visual Studio]

You can download and install the Serilog Analyzer VSIX package from the GitHub project page.

Capturing structure

The other extension that message templates add to typical format strings is control for how parameters are captured.

By default, a parameter logged alongside a message template will be converted into a simple scalar value like string or int. This matches our intuition for how format strings are supposed to work: if you provide a complex object, say, a BlogPost instance, it’ll be logged via its ToString() representation:

var post = new BlogPost("Structured Logging Concepts")
{
    Tags = { "logging", "diagnostics", "analytics" }
};
Log.Information("Publishing {Post}", post);
// -> Publishing Structured Logging Concepts

This is very limiting: later, when searching the log, it would be useful to be able to find, say, all published posts that are tagged “diagnostics”.

By prefixing the property name with the @ sigil, the post object will be captured in a structured representation instead.

Log.Information("Publishing [email protected]}", post);
// -> Publishing BlogPost { Name: "Structured Logging Concepts", Tags: "logging", … }

If the destination format is JSON, Post will be serialized as an object with its Name and Tags properties. (Serilog, as in serializing logger — so now you know 😉)

Up next…

We’ll see how complex data is captured in more detail in the upcoming post on serialization. Up next we’ll consider event types, the final detail necessary before we can call the discussion of message templates complete.

Events and levels – structured logging concepts in .NET (2)

Introductions are tough, I’m glad that’s out of the way. This post is going to dig straight into events and levels. If concepts like events, timestamps and logging levels are old news for you, this might be one post in this series to skip. Saying a few words about these things up front will make it easier to dig into the nitty-gritty later though, with everyone on the same page.

We instrument complex applications with logging so that when things go wrong (or when we need to verify that things are going right!) there’s a trail of events that have taken place in the system for us to inspect.

Thinking about system events as, well, events is fairly natural: something happened in the system at a particular time and needs to be recorded. But, if you’re coming from a purely text-based logging mechanism (perhaps just writing logs to System.Console), then now is a good time to pause and consider that events are truly a first-class concept in structured logging.

At a minimum, an event is a timestamp and some data, usually comprising a list of named properties.

[Diagram: a series of events along a timeline]
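As a rough sketch in code (a conceptual illustration only, not Serilog’s actual types), an event boils down to something like this:

using System;
using System.Collections.Generic;

// A conceptual sketch only - not Serilog's actual LogEvent type
class LogEvent
{
    public DateTimeOffset Timestamp { get; set; }
    public IDictionary<string, object> Properties { get; } =
        new Dictionary<string, object>();
}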

The timestamp is recorded automatically when a log event is captured, so despite being one of the most important elements of an event, it doesn’t show up much in logging code.

To make real-world logging systems convenient to work with there are other special properties, common to all events, that are given additional meaning by the system.

[Diagram: special properties attached to each event]

Among these special properties you’ll usually encounter a message of some kind, but how exactly that works in Serilog and similar libraries is the topic of the next post, so we’ll park that here and just consider messages to be simple strings.

Besides the timestamp and message, the level (or ‘severity’) of an event is the next thing you’ll find. This is so familiar and ubiquitous that it’s worth discussing what levels are, and are not, in structured logging systems.

Application logging is a balancing act, constrained by the performance cost of producing events, the infrastructure cost of transmitting and storing them, and the attention cost of someone reading or being interrupted by them.

In traditional logging, where the level is one of the few first-class concepts apart from the message, it’s the basis for controlling most of these cost trade-offs. When events are recorded, they’re given a level like debug, warning, or error.

Log.Debug("Checking for new mail");
Log.Information("Setting new pricing");
Log.Warning("Incoming projectile detected");
Log.Error("Could not connect to configuration server");

If you want to follow along and experiment in Visual Studio, now’s a great time to follow the Getting Started instructions on the Serilog wiki to create a basic console application with Serilog configured.

Levels are arranged in a hierarchy from least to most important, and the application is configured to generate events at a certain level or higher. Logging statements below the minimum level are no-ops.

This role – maintaining application performance by reducing log volume under normal operation – is an important function of the level even in apps that use structured logging.

In Serilog, the default level is Information, so the first event above won’t be emitted. The MinimumLevel option can be used to override this:

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .WriteTo.LiterateConsole()
    .CreateLogger();

Besides using levels like debug to control verbosity, the error level may flag something that should be fixed and the warning level may flag something to investigate, but operationally, levels take a back seat once structured data comes into the picture. Levels might be considered for log retention, and also when generating alerts and notifications, but other application-specific properties often play as much or more of a role in these scenarios.

My rule of thumb, when in doubt, is that if an event is worth recording in production, it’s Information. The Serilog wiki has a little more information on the levels it provides.

Custom levels, a common feature of older libraries like the log4* family, exist because without rich structured data to examine, the level field has to take on a lot of responsibility. Instead of having a less-important-error level for exceptions in a non-critical subsystem, and a more-important-error level for another code path, properties of a structured log event describe the context of an error and can be used to make decisions about what to do with it.

With the basic terminology covered, I’m looking forward to discussing message templates and properties in the next post, where modern structured logging starts to diverge from what most of us are familiar with.

Structured logging concepts in .NET Series (1)

Between now and the Serilog 2.0 release that’s due sometime in July, I’ve decided to go back to basics and write some posts on the fundamentals of modern structured logging in .NET.

In the complex, distributed, asynchronous systems being built today, answering questions about behavior – what caused this issue? – is a horribly tedious thing to have to do without the right instrumentation. I hope this series will be useful for anyone wanting to build more manageable .NET apps with the new crop of frameworks and tools.

The examples use Serilog, but most of the same concepts appear in other libraries, so where possible I’ll call out the alternative APIs.

A tentative plan of attack, following this post:

  • Events and levels
  • Message templates and properties
  • Event types
  • Context and correlation
  • Serialized data
  • Collection and storage

But now, let’s get the ball rolling!

What is structured logging?

Approaches to application diagnostic logging can be roughly divided into text-based logging and structured, or event-based, logging.

Text logging, sometimes also called ‘printf debugging’ after the C printf() family of functions, is the first one most programmers encounter. As your code executes, it prints lines of text out to the terminal that describe what’s going on.

12:23:10 [INF] Service "loyalty-card-printer" starting up on host 8fd342hkg22u
12:23:11 [INF] Listening on http://8fd342hkg22u.example.com:1234
12:23:20 [INF] Card replacement request received for customer-109972
12:40:55 [INF] Card replacement request received for customer-98048

Because text logs are written as a narrative in natural language, they’re nice and easy to read.

Text is a very expressive format, and tools including full-text indexing and regular expressions can do a lot with it, but answering anything non-trivial with text log data can still take some painstaking work.

Structured logging is an umbrella term for alternatives focusing on easier searching and analysis at scale. A structured log event, rather than being a line of text, is conceptually a timestamped set of name/value properties.

Structured logging systems separate the way the event is captured (the logging API) from the way that it is transported (perhaps JSON) and how it is rendered in a user interface or file.

A rough translation of the sequence above into structured events might look conceptually like this:

time=12:23:10, level=INF, service=loyalty-card-printer, host=8fd342hkg22u, activity=started
time=12:23:11, level=INF, endpoint=http://8fd342hkg22u.example.com:1234, activity=listening
time=12:23:20, level=INF, customer=customer-109972, activity=replacement
time=12:40:55, level=INF, customer=customer-98048, activity=replacement

Structured logs are easier to navigate because the properties in the event are first-class: a log server can return results for a query like activity == "replacement" && customer == "customer-98048" without any configuration to describe how the events should be parsed.

The line between structured and text logs is blurry:

  • Text logs can provide some structure by encoding well-known fields like the timestamp, logging level and thread id at the start of each line, and by including name=value formatted fields in the message itself
  • Structured logs sometimes look a lot like text logs when formatted out to a flat file or the terminal, and often include a message field that is itself free text

But, these similarities are skin deep. Structured logging and text logging require different mindsets in order to use each well.

Historical barriers to adoption

Structured logging isn’t always 100% rainbows-and-unicorns-awesome. My guess is that two warts have kept adoption of structured logging in second place.

First, structured logging implementations can be more labor-intensive at development time than text-based alternatives.

In .NET, the EventSource API illustrates this well. Writing a structured log event with EventSource requires an event definition in the form of a C# method in a separate class, along with the line of code to actually record the event:

[Event(123, Message = "Card replacement request received for: {0}")]
internal void CardReplacementRequested(string customerId)
{
  if (this.IsEnabled()) this.WriteEvent(123, customerId);
}

// Elsewhere:
// eventSource.CardReplacementRequested(customerId);

Switching between files and creating an annotated method just to record a single log event is a tough cost to bear, in comparison with the single line of code to record a typical text log event.

Second, many structured logging implementations produce output that is harder to visually scan. This phenomenon is exaggerated in the earlier example – there are nicer ways to format the same structured events – but it’s still very hard to compete with natural language for readability.

The topic of this series is the most recent structured logging approach for .NET, specifically designed to avoid these issues.

Logging with message templates

Since the technique was introduced by Serilog a few years ago, many .NET applications have adopted a style of logging that captures a structured event, as well as a human-friendly text representation of it, in the same line of code.

// var customerId = "customer-109972";
Log.Information("Card replacement request received for {CustomerId}", customerId);

The description of the event, called the message template, uses named ‘holes’ instead of the anonymous {0} and {1} numeric placeholders found in .NET format strings.

Under the hood, instead of immediately substituting the value of customerId into the template to create a text event, the logging library captures the template and the argument values individually. By matching the named holes in the template with the positional arguments, the customerId value is associated with a property called CustomerId on the event.

You can imagine the resulting event in JSON format might look like:

{
    "time": "2016-06-01T02:23:10Z",
    "level": "INF",
    "template": " Card replacement request received for {CustomerId}",
    "properties": {
        "customerId": " customer-109972"
    }
}

The event can be rendered as text using the template and properties, or stored and searched as structured data, achieving the best of both worlds: events that are optimally human-friendly and optimally machine-readable.

At the time of writing, you can use message templates in .NET with Serilog, Microsoft.Extensions.Logging (ASP.NET Core) and Logary.

(I actually heard the term “modern” used to describe this style of structured logging via Jason Roberts’ Pluralsight course on Serilog and Seq, which is a couple of versions behind but still well worth checking out if you have a subscription.)

It sounds like I’ve told the whole story of modern structured logging already with this post! One line of code is enough to convey the basic idea, but there really is more to know if you want to use it effectively. I’m aiming to publish two installments of this series each week, so if all goes well you won’t have to wait too long for the next post on events and levels…

Serilog 2.0 Release Plans

.NET Core and ASP.NET Core are expected to release-to-web by the end of June 2016. There have been a few related announcements recently, surrounding the split of the runtime release (end of June) from the tooling release and return to MSBuild (tied to a later Visual Studio release).

This post is just a quick note to clarify what’s happening with Serilog.

What’s in 2.0?

As many of you noticed, we made an early leap into the DNX tooling and related libraries. Version 1.5 of Serilog, the most recent master release, has been sitting stable for quite a while now, while our long-lived dev branch has been tracking .NET Core for the last year, collecting changes and improvements in readiness for the new .NET.

Here’s what we’ve been focusing on in Serilog 2.0.

  • Packaging — the core Serilog package is now a lightweight, low-dependency package with core types like Log, ILogger and LoggerConfiguration; sinks, enrichers, and facilities like <appSettings> support have moved out.
  • Portability — much more of Serilog now works across the board; for example, we can support LogContext on all target platforms; there’s no more Serilog.FullNetFx.dll.
  • Usability — a few rough spots in the API, especially concerning logger shut-down & disposal, have been cleaned up.
  • Performance — for example Serilog.Sinks.PeriodicBatching now accepts and enqueues writes without lock contention.
  • Integration — Serilog.Extensions.Logging is a high-fidelity logging provider for ASP.NET Core that gets structured log data from your apps and the platform out to a huge variety of mature and supported sinks of all kinds – databases, log servers, the cloud, rolling files, …

There are plenty more goodies in there. (One of my favourites is this PR which landed support for fixed-length level names. In 2.0 you can use {Level,3:u} in output templates to get output like DBG, INF, WRN and so-on.) When we flick the switch on the final version we’ll publish full release notes.

What do the recent announcements mean for Serilog?

By far the biggest change has been the packaging work. Serilog used to include its own file, console and trace sinks, for example. These now live in their own repositories and packages: Serilog.Sinks.RollingFile, for example, lives here on GitHub. This doesn’t just reduce dependency bloat – it’s also an important milestone in scaling out the Serilog project to support growth and community contributions (like this excellent alternative rolling file package which can now stand on an equal footing with the original).

Re-packaging Serilog has meant a lot of grappling with the in-progress tooling. DNX + DNU (soon dotnet) and project.json have brought game-changing improvements for us – especially the package building capabilities that mean a single build can compile (and then test!) for multiple target platforms. The earlier build system we used to emulate this in Serilog 1.5 was miserable. I’m sad to see project.json go, but happy that the major benefits are being carried through to the final version (even if we’re stuck with XML). I’m not thrilled about going through another round of churn, but I trust the .NET and ASP.NET engineers in their commitment to deliver a fully-baked, high-quality tooling release so however we have to get there – we’ll get there!

Roadmap to 2.0

Anyway, I digress 🙂 . Here’s how things will go for Serilog:

  1. We’re going all-in on RC2. It’s going to be a close, if not identical, reflection of the RTM runtime, so once it drops the Serilog core and as many sinks as possible will be ported. This also means nailing down the target framework monikers that the packages support, and paring back our declared dependencies to the bare minimum. The Serilog package will move to 2.0.0-rc-* versioning to reflect this.
  2. Serilog 2.0 will RTW as soon as .NET Core does, in June. The Serilog package will drop first (ASAP) followed by the supported sinks in roughly the order of usage.
  3. We’ll continue building/shipping Serilog with the RC2 version and later previews of the project tooling until the Visual Studio release when we can move forwards with it. From RC2, we’ll treat the tooling as if it’s RTM, and make sure Serilog and its sink/enricher/configuration packages work flawlessly with it, as well as the released Visual Studio 2013 and 2015 tools.

The Serilog contributors and community have put in a lot of work to get us to where we are today. Serilog’s the most mature and full-featured logging solution you’ll find on .NET Core, thanks to a huge effort tracking the platform changes to date. When .NET Core hits RTM, Serilog is going to be there, as we’ve always planned! I can’t wait for June.

When should I upgrade?

RC2. We’ve treated RC1 as a “best-effort” target so far, and supporting Serilog 2.0.0-beta-* has focused mostly on unblocking early adopters of the new runtime. From RC2 I’ll personally consider .NET Core a supported platform and pursue any Serilog issues on it as if they’re in an RTM. I think the promise of a .NET Core/ASP.NET Core RTW shortly afterwards will mean a lot of Serilog users and contributors will be moving forwards, so Serilog 2.0.0-rc-* is going with us.

How can I help?

If your favorite Serilog sink isn’t ported to .NET Core yet, or isn’t ready at RTW, please consider how you can chip some effort in – most porting work is a straightforward project file change and recompile. There are a lot of people in the Serilog community ready to help out; the best way to get started is to jump onto Gitter and mention you’re interested in sending a PR.

Serilog Timings

TL;DR: I’ve just kicked off development of Serilog Timings, a tiny package that adds streamlined operation timings on top of Serilog.

Though there are a million ways to collect timing data from applications, it’s often very useful to include it in regular logs:

  • Logs are visible all the time, even in development, while there isn’t always a metrics server handy
  • Logs make it easy to trace individual timings back to the conditions and processes that caused them
  • Structured log events already carry the contextual properties around which the timings can be sliced and diced

It will come as no surprise that there are already a few options for recording timing information through Serilog. At its most basic, System.Diagnostics.Stopwatch and a handwritten log event can suffice. I’ve used SerilogMetrics, which includes timings along with counters and gauges, and I’ve also written several hand-rolled implementations for various apps.

As I’ve done this, the picture of what I want from operation timings has become a lot clearer:

Limited noise – when timings are recorded, it’s easy to end up with a crowd of log events: one declaring the operation started and assigning it an identifier, one describing the operation with a bunch of app-specific data, and finally the event completing the operation with the elapsed time. Serilog Timings was built with the goal of compressing all of this into a single event.

Natural messages – related to the first goal, I don’t really want generic “operation started” and “operation ended” messages in the log. Logs need to be optimized for human pattern recognition – the message printed by one operation should be visually distinct from any other so that they’re easy to pick out from a large body of text.

Stable event types – furthermore, the event type (essentially a hash of the message template) should be stable for a particular operation, regardless of whether the operation succeeded, failed, or completed with a custom result. Stable (and unique) event types make it much easier to find all instances of an operation, even though “successful” ones may vastly outnumber and hence obscure the unsuccessful ones.

A compact syntax, .NET Core support, LogContext integration, custom results, … okay, I’m particular in the extreme … you kinda get the picture 🙂

I won’t promise it’s the last attempt I’ll ever make to cover all of this, but Serilog Timings ticks all of the boxes right now. Here’s how it’s used.

Once you’ve configured Serilog and installed the NuGet Package, wrap the code you want to time with a using block and describe the Operation:

using (Operation.Time("Submitting payment for {ClaimId}", claim.Id))
{
    // Timed code goes here...
}

When the using block is completed, an event will be written to the log that looks like this one:

11:20:01 [INF] Submitting payment for claim-41546 completed in 11.2 ms

The description of the operation is a regular Serilog message template, which Serilog Timings extends with " {Outcome} in {Elapsed:0.0} ms". The properties from the description, plus the Outcome and Elapsed properties, are all included on the event if it’s sent to a structured log collector.

The completion event and any other events written within the block will be tagged with an OperationId property as well, if the Serilog LogContext class is enabled.
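Enabling the LogContext is a one-line addition to the logger configuration; a minimal sketch (the Seq sink and server URL here are just placeholders) might look like:

Log.Logger = new LoggerConfiguration()
    .Enrich.FromLogContext() // required for the OperationId property to be attached
    .WriteTo.Seq("http://localhost:5341")
    .CreateLogger();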

Operation.Time() assumes the operation always succeeds, so the message will be identical regardless of whether or not an exception is thrown.

Operations that need to distinguish between timings for successful completion vs failures use Operation.Begin() and Complete():

using (var op = Operation.Begin("Saving invoice {Filename} to blob storage", filename))
{
    // Timed code goes here...
   
    op.Complete();
}

Successfully completing the operation will produce an event exactly like the one in the first example. If Complete() is not called on an operation started with Begin(), perhaps because an exception was thrown, then it will be logged as abandoned:

11:29:37 [WRN] Saving invoice INF_CX7890.pdf to blob storage abandoned in 30978.8 ms

Notice that although the Outcome property is "abandoned" instead of "completed", and the level is warning, the message template (and thus the event type) is identical to the one from the success case. This means that while completed and abandoned timings can still be differentiated, given one of them it’s always simple to find the other.

To bail out of an operation altogether and suppress the warning message, an operation can be Cancel()ed.

Complete() can optionally accept a result property name and value to be added to the event: if you’re timing a network request, for example, you can tag the event with BytesReceived; if you’re running a SQL query, you might tag it with Rows, and so on. More detailed instructions and documentation are in the project README.
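For example, a result-carrying completion might look like this (a sketch; the table name and LoadRows helper are hypothetical stand-ins for your own code):

using (var op = Operation.Begin("Loading rows from {Table}", table))
{
    var rows = LoadRows(table);      // hypothetical data-access call
    op.Complete("Rows", rows.Count); // adds a Rows property to the completion event
}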

Events generated by Serilog Timings display nicely in log files or the console. If you collect logs using SQL Server or another database you might elevate the Elapsed property to column-level for some interesting analysis opportunities (the Serilog sink for SQL Server recently acquired property-to-column mapping capability).

This screenshot from Seq shows a timing event from my test app with all of its attached information:

[Screenshot: a timing event and its attached properties in Seq]

The stable event type (B7B24E3B shown as Type in the screenshot above) lets us drill down to just the timings of the operation we’re interested in:

[Screenshot: drilling down to a single event type in Seq]

And once we’ve pushed this over into a signal we can run more interesting queries to follow the Elapsed value over time:

[Screenshot: charting Elapsed over time in Seq]

(Seq 3.1 just added the timeseries charting shown here, so my current obsession is trawling through historical log data to see what kinds of trends I can detect 😉 .)

What do you think? I’d love to hear from you if you give Serilog Timings a try, and issue reports or PRs to the GitHub project are welcome. Have fun!

Exploring Rust

For the past few years I’ve followed the evolution of Rust with interest. Rust is a new systems programming language (like C or C++) with greater emphasis on safety. I’ll spare you my reasons and opinions on its relevance since the Internet is replete with those 🙂

Instead, I thought I’d write this brief post to share my perspective on learning Rust as a developer who’s otherwise mostly writing C#. The post assumes you’ll jump over and check out the Rust documentation for topics that are necessarily only summarised here.

No, this isn’t one of those “I’m leaving .NET! …for Rust!” kinds of posts. Rust and C# are different enough to be complementary, in much the same way C# and JavaScript are. Learning Rust also involves digesting concepts that have (so far) escaped C#’s all-consuming multi-paradigm expansion. Perhaps there are some ideas to bring back and apply in .NET, too?

So, you’re comfortable with C# – how does the knowledge you already have map onto this new thing? Here are my notes after spending a few days writing some Rust.

Projects, builds and compilation

The first thing you’ll probably write in Rust is not in the language itself, but at the command-line:

> cargo new --bin hello
...
> cd hello
> cargo run
...
Hello, world!

Cargo is Rust’s package manager, and combined with the repository at crates.io, it plays the role of NuGet in the Rust ecosystem.

Unlike C# and .NET, Rust came of age in the post-NPM world. The Rust standard library is smaller than the original .NET one, and feels a lot more like the upcoming .NET Core. HTTP clients, web frameworks, async I/O and JSON serialization live out in the crates.io ecosystem where they can evolve and improve through Darwinian pressures.

You’ll also probably note that NuGet.exe doesn’t have a run command like the one above. This is another parallel between Rust and .NET Core: the cargo command-line (like dotnet) provides a standardised interface to the whole toolchain including dependency management, builds, a test runner and so-on.

Testing

In C#, adding unit tests to a project is a decision point. In Rust, the decision is made for you, and every (Cargo) project gets testing built in. Tests sit right alongside the code being tested, often in the same source file.

Here’s an example:

fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[test]
fn two_plus_two_makes_five() {
    let result = add(2, 2);
    assert_eq!(result, 5);
}

Side-by-side tests are not just a feel-good novelty: including tests in the same compilation unit as the code-under-test means there are no visibility issues to worry about. No need for internal and [InternalsVisibleTo] just to drive unit tests!

Code and its tests also always move together – no more parallel folder hierarchies in test projects to keep in sync. I like this enough to miss it in C# projects already. Conditional compilation in Rust means there’s no issue with code or dependency bloat, something that might be trickier to handle cleanly in the .NET world.

The baked-in assertion macros are very basic – I’m sure there are alternatives but in my limited exploration so far I haven’t felt the need for anything more.

Syntax

Rust has an expression-oriented curly-braces syntax. If you stare at C# (or C/C++, or Java, or JavaScript) all day, Rust isn’t that surprising. It’s not a language you’d immediately recognize by its syntax, like Python, Lisp or Smalltalk.

Coming at Rust from C#, you might find the name: Type declaration order, the -> Type return type declaration, and the implicit “return” unusual, but the tiny add() function above uses all of these and the meaning is pretty clear.

Approaching the Rust syntax is very much like approaching the syntax of any new language – it takes some time. So far nothing has left me scratching my head for too long.

Move semantics, lifetimes and borrowing

Here’s where Rust begins to diverge from “nice, generic, modern language” to strike out on its own. Two of Rust’s big selling points – memory safety without garbage collection, and freedom from data races – rely on statically determining the “owner” of an object as well as possible references to it (aliases).

let c1 = vec![1, 2, 3];
let c2 = c1;

// Fails to compile - the value has been moved out of c1 into c2
println!("Count: {}", c1.len());

You get the general idea in the example above: assignments in Rust are moves by default. This takes a bit of getting used to: in C#, assignments are always copies, either of a simple value or of a reference to an object, so after an assignment in C# both the source and destination variables are still usable.

For memory safety, the advantage of move semantics is that the compiler knows exactly when a value goes “out of scope” and can insert code to deallocate it at precisely the right moment, without needing a GC to scan for other references.

Now, this is obviously a pretty limited model, so Rust has a clever system of lifetimes and borrowing to make more complicated patterns expressible. A proper treatment of the idea is well beyond what I could attempt in this post (already starting to spill beyond the initial goal of “brief”), but with the risk of inadvertently bungling the analogy I’d like to try to translate the concept of a “lifetime” into C# terms, since I haven’t seen this attempted elsewhere.

Ready? Read the ownership, borrowing, and lifetimes chapters of the Rust book? Ok, here it is.

Dispose-safety in C#

Let’s imagine a typical use of the using statement in C#:

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    using (var index = archive.OpenFile("index.json"))
    {
        var contents = ReadBackup(archive, index);
        // Read files from contents
    }
}

It’s a bit contrived, but you can imagine the object returned from ReadBackup(archive, index) may have a reference to either archive or index.

That’s fine in the example above, since we know at the time we // Read files from contents that nothing has been disposed yet, so all is well.

How about this code?

using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;
    using (var index = archive.OpenFile("index.json"))
    {
        contents = ReadBackup(archive, index);
    }
    // Read files from contents
}

Here, we might have a problem. The lifetime represented by the second using statement has finished, and index has been disposed. If the BackupReader holds a reference to index then we’ll have trouble, probably in the form of an ObjectDisposedException when we try to use it.

How can we tell if this code is safe? In C#, we can’t – statically – without examining the code of ReadBackup() and potentially any code it calls.

The analogy is not precise, since in Rust it’s deallocating the memory rather than calling Dispose() that we’re concerned with, but otherwise, this is the safety issue Rust lifetimes solve. The Rust compiler must know, when examining this code, whether the second usage of contents is valid, because allowing it otherwise could lead to reading from memory that’s already been freed.

Before translating the example into Rust, let’s invent a hypothetical dispose-safety checker for C#, where we can represent these kinds of things in method signatures using attributes. Here’s the original:

static BackupReader ReadBackup(Archive archive, Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

Ah – so the index argument is only used to determine a filename, and the result only depends on archive. Here’s the “dispose-safe C#” that expresses this:

[return: In("a")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var manifestFilename = index.GetManifestFilename();
    return new BackupReader(archive, manifestFilename);
}

We’ve added some labels like [In("a")] to the arguments. These represent which using block the value came from. “Dispose-safe C#” requires these whenever a method accepts an IDisposable argument.

Annotating the return value with [In("a")] means that the result is valid in the same scope that the archive parameter is valid in – both are tagged with a.

The return value doesn’t have any relationship at all to index, so the b tag doesn’t appear anywhere else.

Back at the call-site, the checker does the mapping of labels to using blocks for us implicitly:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;

    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that "a" is block 1 and "b" is block 2
        contents = ReadBackup(archive, index);
    }

    // Thus, contents depends only on block 1, so this is fine
    // Read files from contents
}

Now, what would have happened if ReadBackup() had a different implementation?

[return: In("b")]
static BackupReader ReadBackup([In("a")] Archive archive, [In("b")] Index index)
{
    var files = archive.LoadAllFiles();
    return new BackupReader(files, index);
}

In this version of the method, the returned BackupReader is declared to depend on b, the lifetime of the index argument. The mapping of the using statements is the same, but the code is no longer valid and the checker will reject it:

// 1
using (var archive = OpenZipArchive(@"C:\backup.zip"))
{
    BackupReader contents;

    // 2
    using (var index = archive.OpenIndex())
    {
        // Infers that a is block 1 and b is block 2
        contents = ReadBackup(archive, index);
    }

    // Error: contents depends on block 2, which has ended
    // Read files from contents
}

By giving the checker a bit more information about the intentions of the code, it’s helpfully saved us from an ObjectDisposedException at runtime — awesome!

Lifetime variables in Rust

I hope this little “dispose-safe C#” thought experiment sheds some light now on the corresponding (memory-safe) Rust:

fn read_backup<'a,'b>(archive: &'a Archive, index: &'b Index) -> BackupReader<'a> {
    let manifest_filename = index.manifest_filename();
    BackupReader::new(archive, manifest_filename)
}

Don’t let the reuse of the angle-brackets “generic” syntax throw you. The little 'a and 'b annotations in there are lifetime variables, exactly the same as the [In("a")] and [In("b")] examples in C#.

At the call-site, there’s no special using-like statement to delineate lifetimes in Rust, which are implied by the natural boundaries of blocks:

// 1
let archive = open_zip_archive("C:\\backup.zip");
let contents: BackupReader;

// 2
{
    let index = archive.open_index();
    contents = read_backup(&archive, &index);
}

// Read files from contents

When the compatible implementation of read_backup() is used, this compiles fine; when the second implementation of read_backup() is used here, the Rust compiler rejects the code with error: index does not live long enough.

I hope my analogy hasn’t stretched too far and led you astray, as my own understanding of Rust is far from perfect. My hope is that it makes the whole lifetime concept a bit more approachable. Ownership, borrowing and lifetimes are interesting concepts that are worth spending some time on.

The stack and the heap

There’s one more place I’ve spotted where C# and Rust differ fundamentally, and that’s how memory is allocated between the stack and the heap.

C# has two families of data structures that determine allocation behaviour: structs and classes. If a type is a struct, then values of that type will be allocated on the stack:

// Here, now is the actual DateTime value
var now = new DateTime(2016, 03, 23);

If a type is a class, then values of that type are allocated on the heap, and we work with it through references:

// Here, user is a pointer to some memory holding a User
var user = new User("nblumhardt");

C# also has an “escape hatch” for moving structs to the heap as well, in the form of ‘boxing’:

// Here, now is a pointer to memory holding the DateTime
var now = (object)new DateTime(2016, 03, 23);

Boxed structs in C# however are a bit of a second-class citizen – there’s no static type to represent them, so they end up as objects.
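Concretely, once a value is boxed it is only known statically as object, and it has to be cast back before it can be used as its real type again (a small illustration):

object boxed = new DateTime(2016, 03, 23);    // boxed copy lives on the heap
// boxed.AddDays(1);                          // won't compile: boxed is statically just an object
var tomorrow = ((DateTime)boxed).AddDays(1);  // unbox (cast) to use it as a DateTime again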

After the last section, you might feel some relief to hear that Rust is simpler on this point. In Rust, all values are essentially C# structs:

// Here, user is the actual User value
let user = User::new("nblumhardt");

To move a value onto the heap, as in C#, the value is boxed:

// Here, user is a Box, pointing to memory on the heap
let user = Box::new(User::new("nblumhardt"));

Boxed values are first-class in Rust, so in practice they’re more like C#’s classes in usage, enjoying the benefits of static typing for example.

The declaration and construction syntax for structs in Rust is quite a bit different from the C# equivalent, but I’ve found that getting clear about the differences in the allocation model has made the biggest difference to my understanding of Rust code.

Language machinery

Rust is a modern language. Its designers have drawn widely from existing languages and experience to exclude the worst footguns (nullable references, mutability-by-default) and include the best available tools for abstraction (closures, generics, traits, pattern matching, algebraic data types and more).

One of the refreshing aspects of Rust is that, like C#, it’s a multi-paradigm language with functional features. Many of the new languages I’ve encountered in recent years are presented as functional languages, which is nice enough, but there’s a place in the future for variety.

I’d love to continue poking at different corners of Rust with this post, but writing it’s already pushing at the limits of my attention, so before I lose your interest too I’d better wrap up.

In closing

I’m having a great time exploring Rust. If you’re working in one of the currently popular industrial languages like C#, and want to learn something that’s different in more than just syntax, Rust could be what you’re looking for. Have fun!

Reading logger configuration from appsettings.json

ASP.NET Core has a completely new configuration subsystem that adds JSON, environment variables and other sources in addition to the classic XML format we’ve had since .NET 1.0. The appeal of Microsoft.Extensions.Configuration for me is that it’s easier to work with than the System.Configuration APIs.

Having the flexible JSON format is also a big plus, and it makes Serilog configuration on .NET Core quite a bit more natural than the XML <appSettings> support that ships in 1.5.

This post is about Serilog 2.0, currently in beta. Unless you’re targeting ASP.NET Core, you probably need the Serilog 1.5 package. But, all of this stuff will land later this year, so read on!

Here’s the Serilog configuration for colored console logging:

Log.Logger = new LoggerConfiguration()
  .WriteTo.ColoredConsole()
  .CreateLogger();

With the new Serilog.Settings.Configuration package, ReadFrom.Configuration() lets us pull the same information from appsettings.json.

var configuration = new ConfigurationBuilder()
  .AddJsonFile("appsettings.json")
  .Build();

Log.Logger = new LoggerConfiguration()
  .ReadFrom.Configuration(configuration)
  .CreateLogger();

Log.Information("Hello, world!");

In the JSON file itself, the relevant section is called "Serilog":

{
  "Serilog": {
    "WriteTo": ["ColoredConsole"]
  }
}

Simple, hey? As long as the sink package, in this case Serilog.Sinks.ColoredConsole, is installed, the right assemblies will be found. The names of sinks (and the arguments they might accept) are the same as the method names in the C# API, because behind the scenes the JSON configuration is translated into exactly the same method calls as you’d make directly in C#.

Let’s look at a more complete example:

{
  "Serilog": {
    "MinimumLevel": "Debug",
    "WriteTo": [
      { "Name": "ColoredConsole" },
      { "Name": "Seq", "Args": { "serverUrl": "http://localhost:5341" } }
    ],
    "Enrich": ["FromLogContext", "WithMachineName", "WithThreadId"],
    "Properties": {
        "Application": "Sample",
        "Environment": "Test"
    }
  }
}

This one specifies a minimum level, and shows how "WriteTo" works to specify more complex sink configuration. Instead of just the name of the sink some arguments are included too. Seq is one of the network-based sinks for Serilog, and WriteTo.Seq() accepts a serverUrl argument in the C# API, so that’s passed under "Args" in the JSON version.

"Enrich" works the same way as "WriteTo": the options here are the same ones that you’ll see on the Enrich configuration API in C#, and arguments can be passed in the same way if they’re needed.

As a syntactic shortcut, Enrich.WithProperty("Application", "Sample") in C# is specified using the JSON "Properties" collection. (This has the side-effect of meshing nicely with environment variable configuration support, which we get “for free” by supporting the IConfiguration interface. More on that another day…)
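Putting it all together, the JSON example above corresponds roughly to this C# configuration (a sketch, assuming the relevant sink and enricher packages are installed):

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .WriteTo.ColoredConsole()
    .WriteTo.Seq("http://localhost:5341")
    .Enrich.FromLogContext()
    .Enrich.WithMachineName()
    .Enrich.WithThreadId()
    .Enrich.WithProperty("Application", "Sample")
    .Enrich.WithProperty("Environment", "Test")
    .CreateLogger();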

What do you think? Let me know in the comments here, or join the Gitter chat we’ve just opened up for the Serilog project.

Remote Level Control in Serilog using Seq

Logging is indispensable when it’s needed, but too much logging can be a drain on performance and resources when everything is going smoothly and you don’t need really fine-grained log data.

Rather than adding more machines and buying fatter pipes, the largest apps I’ve worked on have all implemented some form of dynamic level control. In this system, logging usually defaults to something sensible (my preference is Information) and can be turned up or down from a central control panel when bugs need to be shaken out.

A service bus is usually the tool of choice for implementing it: publish a message like new LoggingLevelChange { App = "Ordering", Level = "Debug" } and the participating apps will notice, say “Hey, I do ordering!” and update their logging level accordingly.

There are always a few challenges to overcome before it all works smoothly. Ideally, every app would get this capability, but it’s not always high enough on the backlog to get done early on.

Along with the next Serilog version comes a small change in the Seq sink that I’m really excited about. By sharing a LoggingLevelSwitch between the logging pipeline and the Seq sink, it’s now possible to control the logging levels of connected apps easily and efficiently from the Seq server.

var levelSwitch = new LoggingLevelSwitch();

Log.Logger = new LoggerConfiguration()
    .MinimumLevel.ControlledBy(levelSwitch)
    .WriteTo.LiterateConsole()
    .WriteTo.Seq("http://localhost:5341",
                 apiKey: "yeEZyL3SMcxEKUijBjN",
                 controlLevelSwitch: levelSwitch)
    .CreateLogger();

In this configuration, levelSwitch is passed to the MinimumLevel.ControlledBy() method and the WriteTo.Seq() statement.

Now, whenever a batch of events is sent to Seq, the logging level associated with the corresponding API key will be returned by Seq (or at two minute intervals, when things are quiet). The Seq sink will apply this to controlLevelSwitch, so the whole logging pipeline will be adjusted up or down.
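The same switch can also be adjusted in code at any time; conceptually, that’s all the Seq sink does when the server hands back a new level (a sketch, not the sink’s actual implementation):

// Turn the whole pipeline down at runtime: everything below Warning becomes a no-op
levelSwitch.MinimumLevel = LogEventLevel.Warning;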

[Screenshot: dynamic level control for an API key in Seq]

Why is this so much better than what was possible before? Seq has had per-API key level control for a while, but this was only used as a filter: log events had to be generated at the highest required level and then discarded for the scheme to work effectively, using more CPU and memory in the client application than was strictly required. Now that the Seq sink can control the level of the logging pipeline, events that aren’t required are not generated in the first place.

Pre-release versions of the Seq sink including this feature work today on .NET 4.5+, but a pre-release Serilog 2.0 version is required as well. Depending on which other Serilog packages you use, this may or may not be possible to try out in your application. Some details of Serilog 2.0 status can be found in this post.

If you’re able to check it out I’d love to hear from you.

Serilog Tip – Don’t Serialize Arbitrary Objects

Serilog is built around the idea of structured event data, so complex structures are given first-class support.

var identity = new { Name = "Alice", Email = "alice@example.com" };
Log.Information("Started a new session for {@Identity}", identity);

If you’ve spent some time with Serilog you will have encountered the @ ‘destructuring’ operator. This tells Serilog that instead of calling ToString() on identity, the properties of the object should be serialized and stored in structured form.

{"Identity": {"Name": "Alice", "Email": "[email protected]"}}

You might also have wondered – since Serilog is built around structured data, why isn’t serialization the default – or, why is @ required at all?

If you consider the process of serialization in general, this starts to make sense. Have you ever tried to convert an entity from Entity Framework to JSON with a general-purpose serializer? A System.Type? Give it a try! If you’re lucky enough to get a result, you’ll probably find it’s a very big blob of text indeed. Some serializers will bail out when circular references are detected; others will chew up RAM indefinitely. Most objects aren’t designed for serialization, so they’re interconnected with many others that themselves link to yet more objects, and so on.

Serializers – Serilog included – are not made for serializing arbitrarily-connected object graphs. Serilog has some safeguards to make sure your application survives a “mass serialization” event, but the effects on the health of your logging pipeline won’t be pretty.

In a logging library there’s a delicate balance to strike between safety and runtime cost, so the best that most Serilog sinks can do is drop or reject events that are too large before trying to ship them elsewhere (the v2 version of the Seq sink now defaults to a 256 KB cap on event bodies).

What’s the TL;DR? When you serialize data as part of a Serilog event, make sure you know exactly what will be included. The @ operator is a powerful but sharp tool that needs to be used with precision.
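One practical habit is to project just the fields you intend to keep before applying @; a quick sketch (assuming a user object with Name and Email properties):

// Capture only the fields you need, rather than handing Serilog the whole entity
Log.Information("Started a new session for {@Identity}",
    new { user.Name, user.Email });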

Serilog 2.0 Progress Update

TL;DR: There are now 2.0.0-rc2-final-x Serilog packages on NuGet. These are primarily for the adventurers building apps for .NET Core and ASP.NET Core (previously ASP.NET 5). If you’re not pushing forwards into the next version of .NET just yet, then Serilog 1.5.x is the version you should use.

Blazing a trail, or curious to hear what we’ve been up to? Great! Read on…

Why Serilog v2?

[Image: Serilog packages on NuGet]

.NET is changing – that’s possibly an understatement. The days of the big, bulky, slow-moving, preinstalled framework are drawing to a close, and in its place we’ll soon be building on a pay-for-play framework that is light enough to ship alongside your app. Targeting .NET Core means your app only has to carry with it the packages that it needs.

This spells trouble for the “batteries-included” packaging approach we took in Serilog 1.0. Including a bunch of sinks, enrichers and extensions in the core Serilog package means that the transitive closure of dependencies is large. Most apps only use one or two logging sinks, so shipping every package required by every core sink is a waste. In a year’s time, no one will want to use a logging framework that pulls in multiple megabytes of unnecessary junk.

So, fundamentally, v2 of Serilog is about packaging. Instead of the Serilog package carrying the rolling file and console sinks for example, those will now live in Serilog.Sinks.RollingFile and Serilog.Sinks.Console, respectively. Without going too crazy, other bits and pieces like <appSettings> support will also be moving out.

Since a good package factoring will mean a few breaking changes, we’ll capitalize on the opportunity to make a number of smaller improvements that couldn’t otherwise be included in 1.5.

Who is working on this?

There are dozens of active contributors to the various Serilog sub-projects. I’d love to list them here This Week in Rust-style, but I’m not organized enough to get that right. Everything that’s great about Serilog at this point is the product of a dedicated bunch of contributors, and all of that will be brought across into Serilog v2. We’ve also landed many PRs in the v2 branch that will ship along with the v2 release – thanks everyone who has sent one or more of these!

On v2 specifically though, I have to mention the work done by @khellang on the platform mechanics. .NET Core has been a moving target, and over the course of almost a year we’ve had to adjust to the twists and turns that the new platform has taken while keeping an eye on the eventual end goal. Kristian has helped the project navigate all of this and I don’t think we’d even have a working build right now if not for his guidance and hard work.

The flip-side – maintaining a usable API and intelligible packaging story, has been in large part the work of @MatthewErbs, who’s landed some of the broadest and most detailed PRs we’ve yet attempted in the last few months. Likewise, v2 would be nowhere without Matthew’s efforts – and bulletproof patience! 🙂

What’s done, what’s remaining?

At the time of writing, the v2 Serilog package as well as the previously-built-in sinks work on .NET Core identically to the full .NET Framework. On the packaging side, there’s still some work to go, but things are coming together quickly.

Beyond that set of basic sinks, plus a few extras like the Seq sink and my favourite, Literate Console, practically nothing works yet. If you are working on a .NET Core app and the sinks you prefer aren’t available, helping to update them is a great way to support the project.

We’ve also shipped Serilog.Extensions.Logging, a provider for the logging abstractions used throughout ASP.NET Core. You can read more about it and see some example output in this article on the Seq blog.

Remaining is a large amount of cleanup and polish. One task in the midst of this is removal of .NET 4.0 support from the codebase. 4.0 is a respectable platform, but it is a static one. New features and improvements going into Serilog are mostly in support of new platform features and don’t carry as much value for .NET 4 apps. Couple this with the observation that .NET 4 apps aren’t likely to adopt new technology (including Serilog) in significant numbers, and it is hard to justify carrying forward the #if spaghetti to the new version.

This doesn’t mean Serilog is abandoning .NET 4 – only that Serilog 1.5 will remain the recommended version for this platform. Platform-specific dependency support in NuGet also means that projects wishing to actively support .NET 4 with Serilog can still do so by selecting different required Serilog versions per target framework version in project.json.

When will Serilog 2.0 ship?

The core Serilog package will ship in line with .NET Core/ASP.NET Core. There’s no solid date on this yet, but the ASP.NET Roadmap is the location to watch.

I would guess this post raises as many questions as it answers – please ask away! If you’d like to help us cross the last mile (or two) then the best way to get involved is to grab an issue by leaving a comment, or raising a new issue for work you might want to complete such as porting one of the many remaining sinks.