Structured logging concepts in .NET Series (1)

Between now and the Serilog 2.0 release that’s due sometime in July, I’ve decided to go back to basics and write some posts on the fundamentals of modern structured logging in .NET.

In the complex, distributed, asynchronous systems being built today, answering questions about behavior – what caused this issue? – is a horribly tedious thing to have to do without the right instrumentation. I hope this series will be useful for anyone wanting to build more manageable .NET apps with the new crop of frameworks and tools.

The examples use Serilog, but most of the same concepts appear in other libraries, so where possible I’ll call out the alternative APIs.

A tentative plan of attack, following this post:

  • Events and levels
  • Message templates and properties
  • Event types
  • Context and correlation
  • Serialized data
  • Collection and storage

But now, let’s get the ball rolling!

What is structured logging?

Approaches to application diagnostic logging can be roughly divided into text-based logging and structured, or event-based, logging.

Text logging, sometimes also called ‘printf debugging’ after the C printf() family of functions, is the first one most programmers encounter. As your code executes, it prints lines of text out to the terminal that describe what’s going on.

12:23:10 [INF] Service "loyalty-card-printer" starting up on host 8fd342hkg22u
12:23:11 [INF] Listening on http://8fd342hkg22u.example.com:1234
12:23:20 [INF] Card replacement request received for customer-109972
12:40:55 [INF] Card replacement request received for customer-98048

Because text logs are written as a narrative in natural language, they’re nice and easy to read.

Text is a very expressive format, and tools including full-text indexing and regular expressions can do a lot with it, but answering anything non-trivial with text log data can still take some painstaking work.

Structured logging is an umbrella term for alternatives focusing on easier searching and analysis at scale. A structured log event, rather than being a line of text, is conceptually a timestamped set of name/value properties.

Structured logging systems separate the way the event is captured (the logging API) from the way that it is transported (perhaps JSON) and how it is rendered in a user interface or file.

A rough translation of the sequence above into structured events might look like this:

time=12:23:10, level=INF, service=loyalty-card-printer, host=8fd342hkg22u, activity=started
time=12:23:11, level=INF, endpoint=http://8fd342hkg22u.example.com:1234, activity=listening
time=12:23:20, level=INF, customer=customer-109972, activity=replacement
time=12:40:55, level=INF, customer=customer-98048, activity=replacement

Structured logs are easier to navigate because the properties in the event are first-class: a log server can return results for a query like activity == "replacement" && customer == "customer-98048" without any configuration to describe how the events should be parsed.

The line between structured and text logs is blurry:

  • Text logs can provide some structure by encoding well-known fields like the timestamp, logging level and thread id at the start of each line, and by including name=value formatted fields in the message itself
  • Structured logs sometimes look a lot like text logs when formatted out to a flat file or the terminal, and often include a message field that is itself free text

But these similarities are only skin deep. Structured logging and text logging require different mindsets to use well.

Historical barriers to adoption

Structured logging isn’t always 100% rainbows-and-unicorns-awesome. My guess is that two warts have kept adoption of structured logging in second place.

First, structured logging implementations can be more labor-intensive at development time than text-based alternatives.

In .NET, the EventSource API illustrates this well. Writing a structured log event with EventSource requires an event definition in the form of a C# method in a separate class, along with the line of code to actually record the event:

// The event definition lives on an EventSource-derived class (names illustrative)
[EventSource(Name = "LoyaltyCardPrinter")]
internal sealed class LoyaltyEventSource : EventSource
{
    [Event(123, Message = "Card replacement request received for: {0}")]
    internal void CardReplacementRequested(string customerId)
    {
        if (this.IsEnabled()) this.WriteEvent(123, customerId);
    }
}

// Elsewhere:
// eventSource.CardReplacementRequested(customerId);

Switching between files and creating an annotated method just to record a single log event is a tough cost to bear, in comparison with the single line of code to record a typical text log event.

Second, many structured logging implementations produce output that is harder to visually scan. This phenomenon is exaggerated in the earlier example – there are nicer ways to format the same structured events – but it’s still very hard to compete with natural language for readability.

The topic of this series is the most recent structured logging approach for .NET, specifically designed to avoid these issues.

Logging with message templates

Since the technique was introduced by Serilog a few years ago, many .NET applications have adopted a style of logging that captures a structured event, as well as a human-friendly text representation of it, in the same line of code.

// var customerId = "customer-109972";
Log.Information("Card replacement request received for {CustomerId}", customerId);

The description of the event, called the message template, uses named ‘holes’ instead of the anonymous {0} and {1} numeric placeholders found in .NET format strings.

Under the hood, instead of immediately substituting the value of customerId into the template to create a text event, the logging library captures the template and the argument values individually. By matching the named holes in the template with the positional arguments, the customerId value is associated with a property called CustomerId on the event.

You can imagine the resulting event in JSON format might look like:

{
    "time": "2016-06-01T02:23:10Z",
    "level": "INF",
    "template": " Card replacement request received for {CustomerId}",
    "properties": {
        "customerId": " customer-109972"
    }
}

The event can be rendered as text using the template and properties, or stored and searched as structured data, achieving the best of both worlds: events that are optimally human-friendly and optimally machine-readable.
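Rendered back to text for a console or flat file, the same event might appear as (note that Serilog quotes string values by default):

12:23:10 [INF] Card replacement request received for "customer-109972"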

At the time of writing, you can use message templates in .NET with Serilog, Microsoft.Extensions.Logging (ASP.NET Core) and Logary.
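The API shape is similar across these. With Microsoft.Extensions.Logging, for example, the same event might be written like this (a sketch, assuming an injected ILogger instance):

// Microsoft.Extensions.Logging accepts the same message template syntax
logger.LogInformation("Card replacement request received for {CustomerId}", customerId);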

(I actually heard the term “modern” used to describe this style of structured logging via Jason Roberts’ Pluralsight course on Serilog and Seq, which is a couple of versions behind but still well worth checking out if you have a subscription.)

It sounds like I’ve told the whole story of modern structured logging already with this post! One line of code is enough to convey the basic idea, but there really is more to know if you want to use it effectively. I’m aiming to publish two installments of this series each week, so if all goes well you won’t have to wait too long for the next post on events and levels…

Serilog Timings

TL;DR: I’ve just kicked off development of Serilog Timings, a tiny package that adds streamlined operation timings on top of Serilog.

Though there are a million ways to collect timing data from applications, it’s often very useful to include it in regular logs:

  • Logs are visible all the time, even in development, while there isn’t always a metrics server handy
  • Logs make it easy to trace individual timings back to the conditions and processes that caused them
  • Structured log events already carry the contextual properties around which the timings can be sliced and diced

It will come as no surprise that there are already a few options for recording timing information through Serilog. At its most basic, System.Diagnostics.Stopwatch and a handwritten log event can suffice. I’ve used SerilogMetrics, which includes timings along with counters and gauges, and I’ve also written several hand-rolled implementations for various apps.
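For reference, the hand-rolled version is only a few lines (a sketch – claim here is an assumed domain object):

// Hand-rolled timing: measure with Stopwatch, then write a regular log event
var sw = Stopwatch.StartNew();

// Timed code goes here...

sw.Stop();
Log.Information("Submitted payment for {ClaimId} in {Elapsed:0.0} ms",
    claim.Id, sw.Elapsed.TotalMilliseconds);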

As I’ve done this, the picture of what I want from operation timings has become a lot clearer:

Limited noise – when timings are recorded, it’s easy to end up with a crowd of log events: one declaring the operation started and assigning it an identifier, one describing the operation with a bunch of app-specific data, and finally one completing the operation with the elapsed time. Serilog Timings was built with the goal of compressing all of this into a single event.

Natural messages – related to the first goal, I don’t really want generic “operation started” and “operation ended” messages in the log. Logs need to be optimized for human pattern recognition – the message printed by one operation should be visually distinct from any other so that they’re easy to pick out from a large body of text.

Stable event types – furthermore, the event type (essentially a hash of the message template) should be stable for a particular operation, regardless of whether the operation succeeded, failed, or completed with a custom result. Stable (and unique) event types make it much easier to find all instances of an operation, even though “successful” ones may vastly outnumber and hence obscure the unsuccessful ones.

A compact syntax, .NET Core support, LogContext integration, custom results, … okay, I’m particular in the extreme … you kinda get the picture 🙂

I won’t promise it’s the last attempt I’ll ever make to cover all of this, but Serilog Timings ticks all of the boxes right now. Here’s how it’s used.

Once you’ve configured Serilog and installed the NuGet package, wrap the code you want to time in a using block and describe the Operation:

using (Operation.Time("Submitting payment for {ClaimId}", claim.Id))
{
    // Timed code goes here...
}

When the using block is completed, an event will be written to the log that looks like this one:

11:20:01 [INF] Submitting payment for claim-41546 completed in 11.2 ms

The description of the operation is a regular Serilog message template, which Serilog Timings extends with " {Outcome} in {Elapsed:0.0} ms". The properties from the description, plus the Outcome and Elapsed properties, are all included on the event if it’s sent to a structured log collector.

The completion event and any other events written within the block will be tagged with an OperationId property as well, if the Serilog LogContext class is enabled.
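Turning that on is part of the logger configuration – a minimal sketch:

Log.Logger = new LoggerConfiguration()
    .Enrich.FromLogContext()   // attaches properties pushed onto the LogContext
    .WriteTo.LiterateConsole()
    .CreateLogger();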

Operation.Time() assumes the operation always succeeds, so the message will be identical regardless of whether or not an exception is thrown.

Operations that need to distinguish between timings for successful completion vs failures use Operation.Begin() and Complete():

using (var op = Operation.Begin("Saving invoice {Filename} to blob storage", filename))
{
    // Timed code goes here...
   
    op.Complete();
}

Successfully completing the operation will produce an event exactly like the one in the first example. If Complete() is not called on an operation started with Begin(), perhaps because an exception was thrown, then it will be logged as abandoned:

11:29:37 [WRN] Saving invoice INF_CX7890.pdf to blob storage abandoned in 30978.8 ms

Notice that although the Outcome property is "abandoned" instead of "completed", and the level is warning, the message template (and thus the event type) is identical to the one from the success case. This means that while completed and abandoned timings can still be differentiated, given one of them it’s always simple to find the other.

To bail out of an operation altogether and suppress the warning message, an operation can be Cancel()ed.
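For example (a sketch – nothingToDo stands in for app-specific logic):

using (var op = Operation.Begin("Saving invoice {Filename} to blob storage", filename))
{
    if (nothingToDo)
    {
        // After Cancel(), neither a completion nor an abandonment event is written
        op.Cancel();
        return;
    }

    // Timed code goes here...

    op.Complete();
}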

Complete() can optionally accept a result property name and value to be added to the event: if you’re timing a network request you might tag the event with BytesReceived, while a SQL query might be tagged with Rows, and so on. More detailed instructions and documentation are in the project README.
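For example, a sketch of the SQL case, with a hypothetical ExecuteQuery() helper:

using (var op = Operation.Begin("Running query {Query}", query))
{
    var rowCount = ExecuteQuery(query); // hypothetical helper

    // Attaches a Rows property to the completion event
    op.Complete("Rows", rowCount);
}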

Events generated by Serilog Timings display nicely in log files or the console. If you collect logs using SQL Server or another database you might elevate the Elapsed property to column-level for some interesting analysis opportunities (the Serilog sink for SQL Server recently acquired property-to-column mapping capability).
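That mapping might be configured along these lines (a sketch, assuming the Serilog.Sinks.MSSqlServer package’s ColumnOptions API; DataColumn comes from System.Data, Collection from System.Collections.ObjectModel):

// Promote Elapsed from the property payload to a dedicated SQL column
var columnOptions = new ColumnOptions
{
    AdditionalDataColumns = new Collection<DataColumn>
    {
        new DataColumn("Elapsed", typeof(double))
    }
};

Log.Logger = new LoggerConfiguration()
    .WriteTo.MSSqlServer(connectionString, "Logs", columnOptions: columnOptions)
    .CreateLogger();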

This screenshot from Seq shows a timing event from my test app with all of its attached information:

[Screenshot: a timing event in Seq with its attached properties]

The stable event type (B7B24E3B shown as Type in the screenshot above) lets us drill down to just the timings of the operation we’re interested in:

[Screenshot: filtering to a single event type in Seq]

And once we’ve pushed this over into a signal we can run more interesting queries to follow the Elapsed value over time:

[Screenshot: charting Elapsed over time in Seq]

(Seq 3.1 just added the timeseries charting shown here, so my current obsession is trawling through historical log data to see what kinds of trends I can detect 😉 .)

What do you think? I’d love to hear from you if you give Serilog Timings a try, and issue reports or PRs to the GitHub project are welcome. Have fun!

2015 in Review

Another year is almost over! I’ve looked back through the posts on this blog from the past year and made some notes in case you’re looking for some holiday reading.

This was a huge year for Serilog. January kicked off with a few tens of thousands on the NuGet package download counter, but closing out the year it’s over 250k and accelerating. With Serilog-style message template support set to land in ASP.NET 5, I think it’s safe to say 2016 is the year we’ll see structured logging go mainstream in .NET.

Seq has also seen huge (triple-digit) growth this year, especially since v2 shipped in June. Keeping up has been a challenge! Along with a new major version in the first quarter next year, there’s a lot coming for us in 2016 – stay tuned for some updates after the break.

2015

  • Give your instrumentation some love in 2015! — I started this year aware that the vast majority of .NET teams are still writing plain-text logs, collecting them with Remote Desktop and reading them in Notepad. It feels like this is improving but there’s still a long way to go before we’re all using the current generation of tools effectively.
  • Using Serilog with F# Discriminated Unions — Serilog gained some better F# support this year. (Also on the F# front, Adam Chester’s implementation of Message Templates in F# has opened up some possibilities with that language. Logary 4 also promises some Serilog-style structured goodness for F# users sometime in the coming year.)
  • Tagging log events for effective correlation — Some tips for tracing related paths of execution through your application logs.
  • Diagnostic logging in DNX/ASP.NET 5 — The ASP.NET 5/CoreCLR platform has changed significantly since this first tentative post describing Serilog support went out in May, but the fundamentals are still pretty well summed up here. ASP.NET 5 and CoreCLR are the big focus areas for Serilog’s upcoming 2.0 release, which has been ticking away on GitHub for a few months now. The platform reset going on in .NET right now is going to take some getting used to, but in a few years we’ll be able to thank the current ASP.NET and CoreFX teams, as well as the mass of community contributors, for the continued relevance and growth of .NET. 2016’s going to be a year for us all to rally and show some support for this work.
  • Seq/2 Update, Seq/2 Beta and Announcing Seq 2 RTW — It’s hard to believe Seq 2 has only been out since June. These few posts track the release of Seq 2, which was a complete UI rewrite and major overhaul of Seq v1. (2.1 followed, as did 2.2 and 2.3. Seq is now at version 2.4).
  • Filtering with Signals in Seq 2 — Explains the new filtering system in Seq 2.
  • Contender for .NET’s Prettiest Console? — If you’re not using Serilog’s Literate Console sink, you need to check out this post.
  • Contextual logger injection for Autofac — If you prefer to inject ILogger using your IoC container, this post is for you.
  • Assigning event types to Serilog events — Seq’s “event type” system can be implemented in simple flat-file logs too, for improved searching/filtering.
  • Aggregate Queries in Seq Part 1: Goals — The first of a series of posts documenting a spike through Seq v3’s SQL Query interface. (Parts 2, 3, 4, 5 and 6.)
  • How to notify Slack using logs from your .NET app — Seq + Slack = <3.

Thanks for visiting and commenting this year. Have a happy and safe holiday season, and see you in 2016!

Set the asterisk in project.json version numbers

I have a feeling I’ve bothered the friendly people on Jabbr twice now about how to set a value for the * (‘wildcard’) placeholder in DNX’s project.json files, so here it is for next time… 🙂

DNX project.json files use a version syntax that makes it easy to set a tag (e.g. the branch name) in the JSON file itself, while adding a unique numeric suffix at build time.

Here’s one in the first property below:

{
    "version": "1.0.0-beta-*",
    "description": "Serilog provider for Microsoft.Framework.Logging",

(I’m setting up CI for a Serilog provider for Microsoft.Framework.Logging that the ASP.NET team put together and contributed to the project.)

These work both in the package’s own version field and in dependency versions, so multiple projects being built at the same time can depend on each other this way.
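For example, a sibling project built in the same run might float on the same wildcard version (a sketch):

{
    "dependencies": {
        "Serilog.Framework.Logging": "1.0.0-beta-*"
    }
}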

However, if you build a project versioned this way in Visual Studio, or at the command line with dnu build, you’ll get a version number like 1.0.0-beta. That’s not what you’re after.

The key is setting the DNX_BUILD_VERSION environment variable, e.g. to a counter supplied by your build server:

set DNX_BUILD_VERSION=%APPVEYOR_BUILD_NUMBER%

With this done you’ll get nice unique package versions like 1.0.0-beta-234.

Thanks David Fowler and Alex Koeplinger for the pointer.

Seq update – monthly subscriptions, updated tiers, full-featured single-user license

TL;DR: Seq 1.4 is out, and with it we’re fine tuning the licensing options.

Building a software product is a continual exercise in listening, learning and adapting. Since releasing v1.0 of Seq nearly five months ago, we’ve watched keenly as customers put it to work and to be frank, we now have a much better understanding of how Seq fits in to the tooling landscape.

Part of this learning has shown us how customers fit Seq into their own environments, including how those environments dictate licensing requirements. Feedback has highlighted some discrepancies between how we license Seq and how it’s used, so for the past month we’ve been working on a revision to our model.

First, two important things:

Changes to our free tier

The challenge with licensing is to make a fair cost/benefit proposal: Seq should be less expensive to use when the benefit is less and when the effort in supporting it is less. When the benefit is greater, or when the support requirements are greater, asking more helps us to provide the level of service required.

Unfortunately, the unlimited-usage “Developer Edition” we launched with thwarted this requirement – how much Seq cost to use ended up coming down to whether the production environment was fully-secured or not. In a production environment secured by other means (e.g. a VPN or other network-level security) the price to use Seq might be zero (via the “Developer Edition”), while a small difference in network topology might require the top-level “Enterprise Edition” for the exact same use of Seq – that is, the exact same benefit.

To fix this, we’ve bitten the bullet and decided to license Seq purely on a per-user basis. The old Developer Edition has been replaced with a new, fully-securable Single-User license. Many, if not most, customers using the old Developer Edition will be able to switch to the new Single-User license and take advantage of authentication and SSL to put Seq in more places.

Register to receive your single-user license key

No model is perfect, but we think setting the price on the basis of the number of users is a much better measure of both how much value Seq will provide, and how much support/development we will need to sustain to make the best experience possible. Basing this on whether or not authentication is required was a mistake on our part.

Some teams will be using the old Developer Edition with more than one user, so won’t be able to take advantage of the new Single-User license. If you’re in this position, and can’t switch to one of our subscriptions or per-user tiers, get in touch – supporting you is important to us and we’ll help you across so that you can continue to benefit from new Seq releases.

New per-user tiers

Previously we supplied Seq in five, fifteen and twenty-five user tiers. There turns out to be a big difference between a six-user team and a fourteen-user one, so the new model is more fine-grained, providing five-user increments from 5 all the way to 25 users.

See the new pricing tiers

Three server installations with all tiers

Perhaps more importantly, Seq previously included a single server installation with each of the five- and fifteen-user tiers. It turns out that this penalised users with segregated environments: if “Dev” and “Test” are on-premises but “Prod” is in the cloud, requiring two licenses seems fairly arbitrary – where an environment is situated shouldn’t have such a big impact on pricing.

In the new model, each tier includes three Seq servers. This means a single license should cover most teams regardless of how their environments are organised.

New unlimited-user Enterprise license

In Enterprise environments, counting seats can be annoying and lead to more time spent in the purchasing process as teams change size and shift focus. For Enterprise customers we’re now offering an unlimited-user license, installable on up to ten servers, to make life simpler.

We’re also looking at ways to make an Enterprise SLA available alongside this option – please get in touch via [email protected] if this is needed in your organization.

Subscription pricing

Along with our up-front pricing, we’re now offering Seq on a pay-as-you-go basis for those who’d rather pay by the month. We’re making this available primarily as a convenience – the product you get is exactly the same, i.e. you will still host Seq on your own servers.

Subscriptions come with full email support and access to all upgrades during the subscription period.

Subscribe now!

We’re counting on your feedback

We count on your feedback to make Seq a better product and to make it accessible no matter your needs; if you have suggestions, queries or concerns, we’d love it if you’d email us.

The Must-Have Tablet for 2010

[Photo: Quartet Cube]

If you’re a fan of index cards or Post-it notes for work item tracking, here’s another simple helper you might enjoy.

The Quartet Cube is a tiny, portable metallic whiteboard.

On our recent project we’ve kept three in the project room and found them perfect for:

  • Taking notes during meetings
  • Small focused to-do lists (we often use a pink one to track things we’d like to refactor)
  • Sketching ideas
  • Communicating designs (as demonstrated in the photo by the remarkable Steven Nagy)

The added portability makes it easy to take one to your desk – or to somebody else’s.

One blog to rule them all…

Welcome to my new blog, perhaps a little more permanent than my last two. If you’re looking for something to read while I work on new content for this site, why don’t you take a walk down memory lane with some of my earlier articles?

Implementing the Specification Pattern with Linq – a great data access pattern for domain-driven .NET applications. Follow up with some really interesting additions from Luke Marshall, Rinat Abdullin, Steven Burman and Fredrik Kalseth.

State Machines in Domain Models – a primer on the Stateless hierarchical state machine framework. The title is a bit vague – Stateless is useful wherever a problem can be modelled in terms of states and transitions. Follow up with a look at the new features in version 2.

Implementing Optional Exports with MEF Stable Composition – if you’re working with the Managed Extensibility Framework, you need to understand Stable Composition (and how to debug it!)

Where does the Container Belong? – in answer to what is becoming an age-old question of IoC container users. Leads in to a discussion on context dependencies, and the foundation for declarative context adapters like Autofac’s generated factories and MEF’s PartCreator.

And I hear you ask… Where is Autofac in all of this? Much of my recent hiatus from writing owes to work on version 2. You can follow progress on the mailing list, or watch a summary of changes evolve. For those not familiar with the project, the introductory article on CodeProject still applies.

Ciao!