About Nicholas Blumhardt

Bio:
I'm a software engineer living in Brisbane, Australia. In my spare time I take photos, surf like mad, and run the Autofac and Stateless open source projects, which I founded.
Website:
http://nblumhardt.com
Jabber / Google Talk:
admin

Posts by Nicholas Blumhardt

Serilog Tip – Don’t Serialize Arbitrary Objects

Serilog is built around the idea of structured event data, so complex structures are given first-class support.

var identity = new { Name = "Alice", Email = "alice@example.com" };
Log.Information("Started a new session for {@Identity}", identity);

If you’ve spent some time with Serilog you will have encountered the @ ‘destructuring’ operator. This tells Serilog that instead of calling ToString() on identity, the properties of the object should be serialized and stored in structured form.

{"Identity": {"Name": "Alice", "Email": "alice@example.com"}}

You might also have wondered – since Serilog is built around structured data, why isn’t serialization the default – or, why is @ required at all?

If you consider the process of serialization in general, this starts to make sense. Have you ever tried to convert an entity from EntityFramework to JSON with a general-purpose serializer? A System.Type? Give it a try! If you’re lucky enough to get a result, you’ll probably find it’s a very big blob of text indeed. Some serializers will bail out when circular references are detected, others will chew up RAM indefinitely. Most objects aren’t designed for serilalization, so they’re interconnected with many others that themselves link to yet more objects, and so-on.

Serializers – Serilog included – are not made for serializing arbitrarily-connected object graphs. Serilog has some safeguards to make sure your application survives a “mass serialization” event, but the effects on the health of your logging pipeline won’t be pretty.

In a logging library there’s a delicate balance to make between safety and runtime cost, so the best that most Serilog sinks can do is drop or reject events that are too large before trying to ship them elsewhere (the v2 version of the Seq sink now defaults to a 256 KB cap on event bodies).

What’s the TL;DR? When you serialize data as part of a Serilog event, make sure you know exactly what will be included. The @ operator is a powerful but sharp tool that needs to be used with precision.

Serilog 2.0 Progress Update

TL;DR: There are now 2.0.0-rc2-final-x Serilog packages on NuGet. These are primarily for the adventurers building apps for .NET Core and ASP.NET Core (previously ASP.NET 5). If you’re not pushing forwards into the next version of .NET just yet, then Serilog 1.5.x is the version you should use.

Blazing a trail, or curious to hear what we’ve been up to? Great! Read on…

Why Serilog v2?

serilog-nuget

.NET is changing – that’s possibly an understatement. The days of the big, bulky, slow-moving, preinstalled framework are drawing to a close, and in its place we’ll soon be building on a pay-for-play framework that is light enough to ship alongside your app. Targeting .NET Core means your app only has to carry with it the packages that it needs.

This spells trouble for the “batteries-included” packaging approach we took in Serilog 1.0. Including a bunch of sinks, enrichers and extensions in the core Serilog package means that the transitive closure of dependencies is large. Most apps only use one or two logging sinks, so shipping every package required by every core sink is a waste. In a year’s time, no one will want to use a logging framework that pulls in multiple megabytes of unnecessary junk.

So, fundamentally, v2 of Seriog is about packaging. Instead of the Serilog package carrying the rolling file and console sinks for example, those will now live in Serilog.Sinks.RollingFile and Serilog.Sinks.Console, respectively. Without going too crazy, other bits and pieces like <appSettings> support will also be moving out.

Since a good package factoring will mean a few breaking changes, we’ll capitalize on the opportunity to make a number of smaller improvements that couldn’t otherwise be included in 1.5.

Who is working on this?

There are dozens of active contributors to the various Serilog sub-projects. I’d love to list them here This Week in Rust-style, but I’m not organized enough to get that right. Everything that’s great about Serilog at this point is the product of a dedicated bunch of contributors, and all of that will be brought across into Serilog v2. We’ve also landed many PRs in the v2 branch that will ship along with the v2 release – thanks everyone who has sent one or more of these!

On v2 specifically though, I have to mention the work done by @khellang on the platform mechanics. .NET Core has been a moving target, and over the course of almost a year we’ve had to adjust to the twists and turns that the new platform has taken while keeping an eye on the eventual end goal. Kristian has helped the project navigate all of this and I don’t think we’d even have a working build right now if not for his guidance and hard work.

The flip-side – maintaining a usable API and intelligible packaging story, has been in large part the work of @MatthewErbs, who’s landed some of the broadest and most detailed PRs we’ve yet attempted in the last few months. Likewise, v2 would be nowhere without Matthew’s efforts – and bulletproof patience! :-)

What’s done, what’s remaining?

At the time of writing, the v2 Serilog package as well as the previously-built-in sinks work on .NET Core identically to the full .NET Framework. On the packaging side, there’s still some work to go, but things are coming together quickly.

Beyond that set of basic sinks, along with a few extras like the Seq sink and my favourite Literate Console, practically nothing works. If you are working on a .NET Core app and the sinks you prefer aren’t available, helping to update them is a great way to support the project.

We’ve also shipped Serilog.Extensions.Logging, a provider for the logging abstractions used throughout ASP.NET Core. You can read more about it and see some example output in this article on the Seq blog.

Remaining is a large amount of cleanup and polish. One task in the midst of this is removal of .NET 4.0 support from the codebase. 4.0 is a respectable platform, but it is a static one. New features and improvements going into Serilog are mostly in support of new platform features and don’t carry as much value for .NET 4 apps. Couple this with the observation that .NET apps aren’t likely to adopt new technology (including Serilog) in significant numbers, and it is hard to justify carrying forward the #if spaghetti to the new version.

This doesn’t mean Serilog is abandoning .NET 4 – only that Serilog 1.5 will remain the recommended version for this platform. Platform-specific dependency support in NuGet also means that projects wishing to actively support .NET 4 with Serilog can still do so by selecting different required Serilog versions per target framework version in project.json.

When will Serilog 2.0 ship?

The core Serilog package will ship in line with .NET Core/ASP.NET Core. There’s no solid date on this yet, but the ASP.NET Roadmap is the location to watch.

I would guess this post raises as many questions as it answers – please ask away! If you’d like to help us cross the last mile (or two) then the best way to get involved is to grab an issue by leaving a comment, or raising a new issue for work you might want to complete such as porting one of the many remaining sinks.

2015 in Review

Another year is almost over! I’ve looked back through the posts on this blog from the past year and made some notes in case you’re looking for some holiday reading.

This was a huge year for Serilog. January kicked of with a few tens of thousands on the NuGet package download counter, but closing up the year it’s over 250k and accelerating. With Serilog-style message template support set to land in ASP.NET 5, I think it is safe to say 2016 is the year we’ll see structured logging hit mainstream in .NET.

Seq has also seen huge (triple-digit) growth this year, especially since v2 shipped in June. Keeping up has been a challenge! Along with a new major version in the first quarter next year, there’s a lot coming for us in 2016 – stay tuned for some updates after the break.

2015

  • Give your instrumentation some love in 2015! — I started this year aware that the vast majority of .NET teams are still writing plain-text logs, collecting them with Remote Desktop and reading them in Notepad. It feels like this is improving but there’s still a long way to go before we’re all using the current generation of tools effectively.
  • Using Serilog with F# Discriminated Unions — Serilog gained some better F# support this year. (Also on the F# front, Adam Chester’s implementation of Message Templates in F# has opened up some possibilities with that language. Logary 4 also promises some Serilog-style structured goodness for F# users sometime in the coming year.)
  • Tagging log events for effective correlation — Some tips for tracing related paths of execution through your application logs.
  • Diagnostic logging in DNX/ASP.NET 5 — The ASP.NET 5/CoreCLR platform has changed significantly since this first tentative post describing Serilog support went out in May, but the fundamentals are still pretty well summed-up here. ASP.NET 5 and CoreCLR are the bit focus areas for Serilog’s upcoming 2.0 release, which has been ticking away on GitHub for a few months now. The platform reset going on in .NET right now is going to take some getting used to, but in a few years we’ll be able to thank the current ASP.NET and CoreFX teams, as well as the mass of community contributors, for the continued relevance and growth of .NET. 2016’s going to be a year for us all to rally and show some support for this work.
  • Seq/2 Update, Seq/2 Beta and Announcing Seq 2 RTW — It’s hard to believe Seq 2 has only been out since June. These few posts track the release of Seq 2, which was a complete UI rewrite and major overhaul of Seq v1. (2.1 followed, as did 2.2 and 2.3. Seq is now at version 2.4).
  • Filtering with Signals in Seq 2 — Explains the new filtering system in Seq 2.
  • Contender for .NET’s Prettiest Console? — If you’re not using Serilog’s Literate Console sink, you need to check out this post.
  • Contextual logger injection for Autofac — If you prefer to inject ILogger using your IoC container, this post is for you.
  • Assigning event types to Serilog events — Seq’s “event type” system can be implemented in simple flat-file logs too, for improved searching/filtering.
  • Aggregate Queries in Seq Part 1: Goals — The first of a series of posts documenting a spike through Seq v3’s SQL Query interface. (Parts 2, 3, 4, 5 and 6.)
  • How to notify Slack using logs from your .NET app — Seq + Slack = <3.

Thanks for visiting and commenting this year. Have a happy and safe holiday season, and see you in 2016!

How to notify Slack using logs from your .NET app

If your team uses Slack, it’s a great place to centralize notifications that might otherwise end up cluttering your email inbox. Commits, build results, deployments, incoming bug reports – Slack keeps your team informed without making everyone feel overloaded with information, which I why I think I like it so much.

The next step down the road to notification heaven, after setting up integrations for the third party apps and services you use, is to integrate your own apps into Slack.

Doing this directly – posting to the Slack API when interesting things happen – is a bit too much coupling, but if your app writes log events these can be used to trigger notifications in Slack with very little effort.

EventInSlack

So that Slack isn’t flooded with irrelevant events, we’ll forward them to Seq first so that interesting events can be selected from the stream.

1. Write and send the log events

In the demo app below, Serilog is configured to send events both to Seq and the local console.

First up install some prerequisite packages:

Install-Package Serilog
Install-Package Serilog.Sinks.Seq
Install-Package Serilog.Sinks.Literate

We’re using Serilog’s Literate Console sink because it reveals more about the structure of the log events (using colour) than the standard console can.

Here’s the code:

class Program
{
    static void Main()
    {
        Log.Logger = new LoggerConfiguration()
            .WriteTo.Seq("http://localhost:5341")
            .WriteTo.LiterateConsole()
            .CreateLogger();

        Log.Information("Starting up");

        var rng = new Random();
        while (true)
        {
            var amount = rng.Next() % 100 + 1;

            Log.Information("Received an order totalling ${Amount:0.00}", amount);

            Thread.Sleep(1000);
        }
    }
}

This program generates a random integer between 1 and 100 each second, and writes this to the log as a property called Amount. Let’s imagine we’re processing sales on a website, or reading measurements from a sensor – the same approach covers a lot of scenarios.

Console

2. Install Seq

If you don’t have Seq running on your development machine, install it from the Seq downloads page – just click through the installer dialog and choose “Browse Seq” at the end to view these log events.

Seq

3. Choose events to notify on

Here’s a twist; so that we’re not overwhelmed with notifications we’ll only raise one if the value of the “sale” is above $90. To find these events in Seq, we first write the filter Amount >= 90. Once we’re confident the right events are selected, the filter can be saved as a signal.

CreatingSignal

The name of the signal is important since we’ll use it to configure the Slack integration in a moment.

4. Add the Slack integration for Seq

The Slack integration for Seq is developed and is maintained on GitHub by David Pfeffer. Seq plug-in apps are published on NuGet – this one has the package id Seq.App.Slack.

To install it into your Seq instance go to Settings, then Apps, and choose Install from NuGet. Enter the package name Seq.App.Slack and install.

Installing

A few seconds later the app should appear in your app list and be ready to configure. To the right of the app name, choose Start new instance….

Installed

5. Configure the WebHook

Give the instance a name like “Big Sales Incoming!” and un-check Only trigger the app manually. The name of the signal we created earlier should now be there in a drop down to select.

AppSetup

The last thing to set is the WebHook URL setting. The easiest way to get one of these is to open the channel you’re posting to in Slack and choose + Add an app or custom integration. This will take you to a Slack page which at the time of writing has just gone through a major overhaul. The current path through the site to add a webhook is:

  1. Choose Build your Own in the top right-hand corner of the page
  2. Under Something just for my team choose Make a Custom Integration
  3. Here’s where you can choose Incoming WebHooks and follow a couple of prompts to get the URL

It’s a bit of an obscure way to do something that’s fairly common – I’m hopeful this will be improved once the redesign settles in :-)

Back to Seq, paste in the URL, Save Changes and you’re good to go! Events will now start appearing in Slack whenever the Amount is $90 or more.

EventInSlack

Happy logging!

Aggregate Queries in Seq Part 6: Serving Data

Let’s get it out of the way up front – I didn’t manage to fit this series into November. Hrmmmmm… sorry! In the end, I prioritised getting an early preview release out ahead of finishing the blog series documenting the process. The upside is – you can grab it now! The basics of aggregate queries, as they’ll appear in Seq 3, are in preview form on the Seq downloads page.

In what will be the final post in this series for now, I want to show you how the results of running an aggregate query are surfaced in Seq’s HTTP API, and also give a shout-out to some handy tools and techniques it uses along the way.

A syntactic digression…

Before we dive into how Seq’s API is structured, there’s one small tweak to the “SQL-like” query language that I should mention. To avoid endless confusion about what that “-like” really means, the new syntax added in Seq 3 will, as much as possible, be a dialect of SQL. Introducing fewer new things means everyone has less to hold in their head while using Seq, which fits Seq’s goal of getting out of the way while you navigate logs.

The most noticeable change here is that all queries (except trivial scalar ones such as select 41 + 1 as Answer) now have a from clause of the form from stream. The “stream” is Seq’s way of describing “whichever stream of events you’re currently looking at”. Someday other resources wll be exposed through the query interface, and the from clause will enable that.

Other changes you’ll spot are support for SQL operators such as and, or, not and even like. Single-quoted strings are supported, as are SQL comparisons = and <> (though a bug in the published build prevents the use of the latter).

By aligning better with SQL I hope Seq’s querying facilities can remain easy to learn while evolving to support more sophisticated uses.

Resources and Links

Back to the API. Seq is developed API-first, a strategy I picked up from Paul Stovell while working on Octopus Deploy, and something that’s made a huge impact on the way I’ve approached web application development ever since.

Seq also employs some of the ideas associated with hypermedia APIs, notably links and URI templates. (In my experiences so far these are the techniques, along with semantic use of HTTP verbs, that have delivered the most value in application development. Some of the more sophisticated hypermedia concepts like custom media types for standard interchange formats are very interesting but I’m not seeing much use day-to-day. Having said that, wurl is one tool I’ve seen that made me wish we all did use HAL or JSON-API.)

Linking means that the entire Seq API can be navigated from a single root document. At https://your-seq-server/api (here if you’re on localhost) there’s a JSON document with links into all the various resources that the server provides.

API Root

The green box (shout out to Greenshot) shows the resource endpoint where queries are going to be exposed as data sets.

My test Seq instance is configured to listen under the /prd URL prefix, but unless you’ve set this up explicitly you won’t see this part of the path on your own instance.

Data Resources

I’ll admit that there isn’t much of hypermedia angle in this particular use case – rowsets aren’t obviously entities the way signals, users and event events are in Seq’s world-view – but using the same machinery for this as the remainder of the API keeps everything working harmoniously.

The awesome thing from a consumer standpoint though, is the self-documenting nature of the whole thing. Note the "Query" link in the image above. This is a URI template describing how calls to the query endpoint should be formatted.

GET http://your-seq-server/api/data?q=select%20count(*)%20from%20stream&rangeStartUtc=2015-12-08T22:20:22.000Z HTTP/1.1

The signalId segment of the URI is optional, so it doesn’t appear in this request.

In the JavaScript client the call looks like:

api.data.query({ q: 'select count(*) from stream', rangeStartUtc: start }).then(rowset => {
  // rowset is the query result
});

Here, if the client did specify a signalId, the URI template ensures it would be formatted into a URL segment rather than being specified as a query string parameter like the other parameters are, but the client code doesn’t have to be aware of the difference. This makes it nice and easy to refactor and improve the URL structure of the API (even during development) without endlessly poking around in JavaScript string concatenation code.

On the client side, URI templates are handled with the Franz Antesberger’s JavaScript implementation. For the most part this means a hash of parameters like the argument passed to query above can be substituted directly into the URI template, and validated for correct naming and so-on along the way.

Serving it up with Nancy

On the server side, NancyFX reigns. Like Octopus Deploy, Seq took a bet on Nancy in its pre-1.0 days and hasn’t looked back. The team behind Nancy is talented, passionate, but above all, highly considerate of its users to the extent that since adopting Nancy in the zero-point-somethings there’s barely been any breakage or churn between versions, let alone bugs. I can’t recommend Nancy highly enough, and consider it the gold standard for building web apps in .NET these days. It looks like I’m not alone in this opinion.

I find Nancy wonderful for building APIs because while it exposes a slightly quirky API of its own (a noble thing!), it isn’t at all opinonated about how you should structure yours. This also means that when dealing with Nancy there are sensible defaults everywhere, but very little deeply built-in policy – so minimal time is spent grappling with the framework itself.

Here’s the route that handles queries:

public class DataModule : SignalContentModule
{
    readonly Lazy<IEventStore> _events;

    public DataModule(Lazy<IEventStore> events)
        : base("data")
    {
        _events = events;

        Post[""] = p => Query(SignalsModule.MapToNewDocument(ReadBody<SignalEntity>()));
        Get["/{signalId}"] = p => Query(LoadAndCheck(p.signalId));
    }

This is a Nancy module that shares some functionality with the events module (though implementation inheritance does feel like a bit of an ugly hack, here as elsewhere). The two different instantiations of the Query URI template we viewed before need two routes in Nancy; the POST version accepts an unsaved signal in the body of the request, which is necessary because signals may be edited in the UI and queries run against them without saving the changes.

There’s one snippet responsible for declaring how the module works:

protected override ResourceGroup DescribeResourceGroup()
{
    var resource = base.DescribeResourceGroup();
    resource.Links.Add("Query", Qualify("{/signalId}{?q,intersectIds,rangeStartUtc,rangeEndUtc}"));
    return resource;
}

(I’ve often thought it would be nice to unify the description with the routing – there’s some duplication between this code and the route configuration above.)

I’ll include for you here the whole implementation of Query() in all its SRP-violating glory (I think the un-refactored version is nicer to read sequentially in a blog post, but as this goes from feature spike to a fully-fledged implementation I see some Ctrl+R Ctrl+M in the very near future):

Response Query(Signal signal = null)
{
    var query = (string)Request.Query.q;
    if (string.IsNullOrWhiteSpace(query))
        return BadRequest("A query parameter 'q' must be supplied.");

    DateTime? rangeStartUtc = TryReadDateTime(Request.Query.rangeStartUtc);
    DateTime rangeEndUtc = TryReadDateTime(Request.Query.rangeEndUtc) ?? DateTime.UtcNow;
    if (rangeStartUtc == null)
        return BadRequest("A from-date parameter 'rangeStartUtc' must be supplied.");

    if (rangeStartUtc.Value >= rangeEndUtc)
        return BadRequest("The queried time span must be of nonzero duration.");

    var filter = GetIntersectedSignalsFilter(signal);

    var result = _events.Value.Query(
        query,
        rangeStartUtc.Value,
        rangeEndUtc,
        filter: filter);

    if (result.HasErrors)
    {
        var response = Json(new QueryResultPart
        {
            Error = "The query could not be executed.",
            Reasons = result.Errors
        });
        response.StatusCode = HttpStatusCode.BadRequest;
        return response;
    }

    var data = new QueryResultPart
    {
        Columns = result.Columns,
        Rows = result.Rowset,
        Slices = result.Slices?.Select(s => new TimeSlicePart
        {
            Time = s.SliceStartUtc.ToIsoString(),
            Rows = s.Rowset                
        }).ToArray(),
        Statistics = new QueryExecutionStatisticsPart
        {
            ElapsedMilliseconds = result.Statistics.Elapsed.TotalMilliseconds,
            MatchingEventCount = result.Statistics.MatchingEventCount,
            ScannedEventCount = result.Statistics.ScannedEventCount,
            UncachedSegmentsScanned = result.Statistics.UncachedSegmentsScanned
        }
    };

    return Json(data);
}

I have come to wonder if anyone out there uses optional = null parameters as a semi-self-documenting “nullable” annotation for parameters like this:

Response Query(Signal signal = null)
{

I picked it up as a habit after Daniel Cazzulino (if my memory serves me correctly) suggested it as a way of marking optional dependencies in Autofac. Using a default value for nullable arguments expresses the intent that ? would, had nullability been first-class in the early days of C#.

The block of code all the way through to the initialization of filter slurps up parameters from the query string. Nancy has a binding system that might knock a line or two of code out here.

The real action is in:

var result = _events.Value.Query(
    query,
    rangeStartUtc.Value,
    rangeEndUtc,
    filter: filter);

IEventStore.Query() is a wrapper method where the parsing, planning and execution steps take place.

Finally, the result is mapped back onto some POCO types to send to the caller. No magic here – but then again, I think the theme of these few blog posts has been to cut through the whole implementation in an anti-magical way. The types like QueryResultPart will eventually make their way to the Seq API client.

And that, my friends, is a wrap! It’s been fun sharing the process of spiking this feature. I’d like to continue this series into the Seq UI, but there’s a lot to do in the next few months as Seq 3 comes together into a shippable product. I’m excited about how much aggregate queries will enable as part of that. In the meantime though I need to report on the progress that’s been happening in Serilog in preparation for cross-platform CoreCLR support and ASP.NET 5 integration – look out for an update on that here soon.

I’ll leave you with a snapshot of an aggregate query in action.

Screenshot

Download the 3.x preview installer here and check out the documentation here.

Aggregate Queries in Seq Part 5: Execution

Part 5 was very nearly the stalling point in this blog series. I’ve got enough of the implementation done that I can see the finish line, and I’m eager to get that build out, but to really finish the story I need to fill in this installment. If this post is a little brief, please read it as a “status report” this time around :-)

I’ve also had a bit of time now to revisit decisions made in the earlier stages of building this feature. I had some honest and valuable feedback from Michael Chandler on Twitter regarding the “SQL-like” nature of the syntax:

Upon reflection I think it will be easier to explain how to use aggregate queries if the language simply is SQL, or a dialect thereof, anyway. So, I’ve been back to rework some of the parser and now in addition to the C#-style expression syntax, typical SQL operators such as =, and, or, like, and not as well as single-quoted strings are available:

select count(*)
where Environment = 'production' and not has(@Exception)
group by time(7d), Application

There are still some questions to answer around how much this flows back the other way into typical filter expressions. On the one hand, it’d be nice if the filter syntax and the where clause syntax were identical so that translating between queries and filters is trivial. On the other hand, keeping the languages a bit tighter seems wise. For now, the syntaxes are the same; I’m going to spend some time using the SQL syntax in filters and see how it goes in practice.

Anyway, back to the topic at hand. Now we’re getting somewhere! The aggregate query parser handles the syntax, the planner can produce a query plan, and we need to turn that into a result set.

This post considers three questions:

  • What inputs are fed into the executor?
  • What does the result set look like?
  • How is the result computed?

The first turns out to be predictably easy, given the efforts expended so far to generate a plan, and the existing event storage infrastructure. Keeping things as simple as possible:

static class QueryExecutor
{
    public static QueryResult Execute(
        QueryPlan plan,
        IEventStore store,
        DateTime rangeStartUtc,
        DateTime rangeEndUtc)

Here plan is the output of the last step, store is a high-level interface to Seq’s time-ordered disk/RAM event storage, and the two range parameters the time slice to search. (The implementation as it stands includes a little more detail, but nothing significant.)

A QueryResult lists the column names produced, and either a list of rows, or a list of time slices that each carry a list of rows:

Results

I decided for now to keep the concept of “time slice” or sample separate (time could simply have been another column in the rowset) because it makes for a friendlier API. I’m not sure if this decision will stick, since tabular result sets are ubiquitously popular, but when “series” are added as a first-class concept it is likely they’ll have their own more optimal representation too.

In between these two things – an input query plan and an output result – magic happens. No, just kidding actually. It’s funny how, once you start implementing something, the magic is stripped away and things that initially seem impenetrably complex are made up of simple components.

The core of the query executor inspects events one by one and feeds the matching ones into a data structure carrying the state of the computation:

AggregationState

First, the group that the event belongs to is determined by calculating each grouping expression and creating a group key. Against this, state is stored for each aggregate column being computed. The subclasses of Aggregation are themselves quite simple, like count():

class CountAggregation : Aggregation
{
    long _count;

    public override void Update(object value)
    {
        if (value == null)
            return;

        ++_count;
    }

    public override object Calculate()
    {
        return (decimal)_count;
    }
}

The value to be aggregated is passed to the Update() method (in the case of count(*) the * evaluates to a non-null constant) and the aggregation adds this to the internal state.

Once all of the events have been processed for a time range, Calculate() is used to determine the final value of the column. It’s not hard to map count(), sum(), min(), max() and so-on to this model.

Things are a little trickier for aggregates that themselves produce a rowset, like distinct(), but the basic approach is the same. (Considering all aggregate operators as producing a rowset would also work and produce a more general internal API, but the number of object[] allocations gets a little out of hand.)

Once results have been computed for a time slice, it’s possible to iterate over the groups that were created and output rows in the shape of the QueryResult structure shown earlier.

There’s obviously a lot of room for optimisation, but the goals of a feature spike are to “make it work” ahead of making it work fast, so this is where things will sit while I move on towards the UI.

One more thing is nagging at me here. How do we prevent an over-eager query from swamping the server? Eventually I’d like to be able to reject “silly” queries at the planning stage, but for now it’s got to be the job of the query executor to provide timeouts and cancellation. These are sitting in a Trello card waiting for attention…

In the next post (Part 6!) we’ll look more closely at Seq’s API and finally see some queries in action. Until then, happy logging!

Read Part 6: Serving Data

Aggregate Queries in Seq Part 4: Planning

Seq is a log server designed to collect structured log events from .NET apps. This month I’m working on adding support for aggregate queries and blogging my progress as a diary here. This is the fourth installment – you can catch up on Goals, Syntax, and Parsing in the first three posts.

So, this post and the next are about “planning” and “execution”. We left off a week ago having turned a query like:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Into an expression tree:

QueryAST

Pretty as it looks, it’s not obvious how to take a tree structure like this, run it over the event stream, and output a rowset. We’ll break the problem into two parts – creating an execution plan, then running it. The first task is the topic of this post.

Planning

In a relational database, “planning” is the process of taking a query and converting it into internal data structures that describe a series of executable operations to perform against the tables and indexes, eventually producing a result. The hard parts, if you were to implement one, seem mostly to revolve around choosing an optimal plan given heuristics that estimate the cost of each step.

Things are much simpler in Seq’s data model, where there’s just a single stream of events indexed by (timestamp, arrival order), and given the absence of joins in our query language so far. The goal of planning our aggregate queries is pretty much the same, but the target data structures (the “plans”) only need to describe a small set of pre-baked execution strategies. Here they are.

Simple projections

Let’s take just about the simplest query the engine will support:

select MachineName, ThreadId

This query isn’t an aggregation at all: it doesn’t have any aggregate operators in the list of columns, so the whole rowset can be computed by running along the event stream and plucking out the two values from each event. We’ll refer to this as the “simple projection” plan.

A simple projection plan is little more than a filter (in the case that there’s a where clause present) and a list of (expression, label) pairs representing the columns. In Seq this looks much like:

class SimpleProjectionPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public ProjectedColumn[] Columns { get; }

    public SimpleProjectionPlan(
        ProjectedColumn[] columns,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        Columns = columns;
        Filter = filter;
    }
}

We won’t concern ourselves much with FilterExecutionPlan right now; it’s shared with Seq’s current filter-based queries and holds things like the range in the event stream to search, a predicate expression, and some information allowing events to be efficiently skipped if the filter specifies any required or excluded event types.

Within the plan, expressions can be stored in their compiled forms. Compilation can’t be done any earlier because of the ambiguity posed by a construct like max(Items): syntactically this could be either an aggregate operator or a scalar function call (like length(Items) would be). Once the planner has decided what the call represents, it can be converted into the right representation. Expression compilation is another piece of the existing Seq filtering infrastructure that can be conveniently reused.

Aggregations

Stepping up the level of complexity one notch:

select distinct(MachineName) group by Environment

Now we’re firmly into aggregation territory. There are two parts to an aggregate query – the aggregates to compute, like distinct(MachineName), and the groupings over which the aggregates are computed, like Environment. If there’s no grouping specified, then a single group containing all events is implied.

class AggregationPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public AggregatedColumn[] Columns { get; }
    public GroupingInstruction[] Groupings { get; set; }

    public AggregationPlan(
        AggregatedColumn[] columns,
        GroupingInstruction[] groupings,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        if (groupings == null) throw new ArgumentNullException(nameof(groupings));
        Filter = filter;
        Columns = columns;
        Groupings = groupings;
    }
}

This kind of plan can be implemented (naiively perhaps, but that’s fine for a first-cut implementation) by using the groupings to create “buckets” for each group, and in each bucket keeping the intermediate state for the required aggregates until a final result can be produced.

Aggregated columns, in addition to the expression and a label, carry what’s effectively the constructor parameters for creating the bucket to compute the aggregate. This isn’t immediately obvious based on the example of distinct, but given another example the purpose of this becomes clearer:

percentile(Elapsed, 95)

This expression is an aggregation producing the 95th percentile for the Elapsed property. An AggregatedColumn representing this computation has to carry the name of the aggregate ("percentile") and the argument 95.

Time slicing

Finally, the example we began with:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Planning this thing out reveals a subtlety around time slices in queries. You’ll note that the time(1d) group is in the first (dominant) position among the grouped columns. It turns out the kind of plan we need is completely different depending on the position of the time grouping.

In the time-dominant example here, the query first breaks the stream up into time slices, then computes an aggregate on each group. Let’s refer to this as the “time slicing plan”.

class TimeSlicingPlan : QueryPlan
{
    public TimeSpan Interval { get; }
    public AggregationPlan Aggregation { get; }

    public TimeSlicingPlan(
        TimeSpan interval,
        AggregationPlan aggregation)
    {
        if (aggregation == null) throw new ArgumentNullException(nameof(aggregation));
        Interval = interval;
        Aggregation = aggregation;
    }
}

The plan is straightforward – there’s an interval over which the time groupings will be created, and an aggregation plan to run on the result.

The output from this query will be a single time series at one-day resolution, where each element in the series is a rowset containing (exception type, count) pairs for that day.

The alternative formulation, where time is specified last, would produce a completely different result.

select count(*)
where ApplicationName == "Admissions"
group by ExceptionType, time(1d)

The output of this query would be a result set where each element contains an exception type and a timeseries with counts of that exception type each day. We’ll refer to this as the “timeseries plan”.

Both data sets contain the same information, but the first form is more efficient when exploring sparse data, while the second is more efficient for retrieving a limited set of timeseries for graphing or analytics.

To keep things simple (this month!) I’m not going to tackle the timeseries formulation of this query, instead working on the time slicing one because I think this is closer to the way aggregations on time will be used in the initial log exploration scenarios that the feature is targeting.

Putting it all together

So, to recap – what’s the planning component? For our purposes, planning will take the syntax tree of a query and figure out which of the three plans above – simple projection, aggregation, or time slicing – should be used to execute it.

The planner itself is a few hundred lines of fairly uninteresting code; I’ll leave you with one of the tests for it which, like many of the tests for Seq, is heavily data-driven.

[Test]
[TestCase("select MachineName", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed)", typeof(AggregationPlan))]
[TestCase("select MachineName where Elapsed > 10", typeof(SimpleProjectionPlan))]
[TestCase("select StartsWith(MachineName, \"m\")", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed) group by MachineName", typeof(AggregationPlan))]
[TestCase("select percentile(Elapsed, 90) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by MachineName, Environment", typeof(AggregationPlan))]
[TestCase("select distinct(ProcessName) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by time(1s)", typeof(TimeSlicingPlan))]
[TestCase("select max(Elapsed) group by time(1s), MachineName", typeof(TimeSlicingPlan))]
[TestCase("select count(*)", typeof(AggregationPlan))]
public void QueryPlansAreDetermined(string query, Type planType)
{
    var tree = QueryParser.ParseExact(query);

    QueryPlan plan;
    string[] errors;
    Assert.IsTrue(QueryPlanner.Plan(tree, out plan, out errors));
    Assert.IsInstanceOf(planType, plan);
}

Part five will look at what has to be done to turn the plan into a rowset – the last thing to do before the API can be hooked up!

Read Part 6: Execution

Aggregate Queries in Seq Part 3: An Opportunistic Parser

It turns out the parser wasn’t a huge departure from Seq’s existing filter parser. Seq already uses Sprache to parse filter expressions, and Sprache parsers compose very nicely.

After making the current FilterExpressionParser “root” expression public, and defining some new AST nodes like Projection and so-on, things just get bolted together:

static readonly Parser<ExpressionValue> ExpressionValue =
    FilterExpressionParser.Expr.Token().Select(e => new ExpressionValue(e));

static readonly Parser<Projection> Projection =
    from v in AggregateValue.Or(ExpressionValue)
    from l in Label.Optional()
    select new Projection(v, l.GetOrDefault());

Here you can see the way a projection like the count(*) as Total column is constructed from a parser for values, and a parser for optional ‘as’ labels. I had to define a separate parser for some aggregations, like count(*) that aren’t otherwise valid Seq filter syntax, but any existing expression that FilterExpressionParser supports can be used as the value of a projected column.

Heading further up towards the root of the grammar, we get something like:

static readonly Parser<Query> Query =
    from @select in Select
    from @where in Where.XOptional()
    from groupBy in GroupBy.XOptional()
    select new Query(@select, @where.GetOrDefault(), groupBy.GetOrDefault());

The resulting Query parser can take some text input and give back a tree of objects representing the parts of the query. Success!

There was one subtle problem here that you can spot by way of the oddly-named XOptional combinator. Sprache on Github provides Optional, which works as advertised, but upon failing a match will backtrack and return success regardless of whether a partial parse was possible or not.

This leads to error messages without a lot of information, for example:

select distinct(ExceptionType) group ApplicationName

is missing the ‘by’ required next to ‘group’. Using Optional the parser reports:

Syntax error (col 31): unexpected 'g'.

Hmmm. Not so good – there’s nothing at all wrong with that ‘g’! The problem is that upon failing to parse the ‘by’, Sprache’s Optional returned a zero-length successful parse, so parsing picks back up at that position and fails because there are no more tokens to match.

The ‘X’ in XOptional is for eXclusive, meaning that the token is optional, but, only if it parses no input whatsoever. As soon as ‘group’ is parsed, the optional branch is considred “taken”, and failures will propagate up. (Sprache ships ‘X’ versions of several parsers already, such as an exclusive Many called XMany.)

Here it is:

public static Parser<IOption<T>> XOptional<T>(this Parser<T> parser)
{
    if (parser == null) throw new ArgumentNullException(nameof(parser));
    return i =>
    {
        var result = parser(i);
        if (result.WasSuccessful)
            return Result.Success(new Some<T>(result.Value), result.Remainder);

        if (result.Remainder.Equals(i))
            return Result.Success(new None<T>(), i);

        return Result.Failure<IOption<T>>(result.Remainder, result.Message, result.Expectations);
    };
}

The divergence from the built-in optional is only succeeding with a zero-length parse if (result.Remainder.Equals(i)).

Using XOptional:

Syntax error (col 38): unexpected 'A', expected keyword 'by'.

Better!

If you haven’t used parser combinators before this whole thing might be a bit surprising – where’s the EBNF? The esoteric command line tools with animal names? It turns out that combinators make parsing into a regular (somewhat imperative) programming task without a lot of mystery surrounding it.

There are some limitations in Sprache’s implementation I’d like to address someday – for example, the error reported on ‘A’ above rather than ‘ApplicationName’ is the result of parsing the raw character stream instead of a tokenised one – but these are minor inconveniences that can be worked around if need be.

If you haven’t looked into combinator-based parsing, there are some great tutorials and examples linked from Sprache’s README. It’s a technique worth adding to your tool belt regardless of the kind of programming you usually do. Little languages are everywhere, waiting to be cracked open!

The most enjoyable and challenging part of any language processing task for me is not so much the parsing though, but taking a tree of syntactic nodes like we have here, and turning it into something executable. That’s coming up next :-)

Read Part 4: Planning

Aggregate Queries in Seq Part 2: Defining a Syntax

So, before we go any farther we’re going to need to pin down a bit more tightly what form aggregate queries will take. There are options, options, options – but hopefully a lot will fall out of how queries are expressed in Seq today.

The Seq filter box accepts predicates – expressions that evaluate to a Boolean in the context of an event.

Environment == "Production"

To express something like a sum, writing:

sum(ItemsOrdered)

…in the filter box seems reasonable, except when predicates get involved again. Combining the two expressions above into something that says “the number of items ordered in the production environment” is not obvious.

To introduce aggregates the filter syntax is going to have to stretch a bit, so that Seq can tell the difference between a simple predicate and a full-fledged query. Here goes:

select sum(ItemsOrdered) where Environment == "Production"

The plan is to reappropriate select and where from SQL to mark out the clauses. SQL-like queries have the strong advantage being familiar, and loosely align with the rest of the “C#-like” syntax by their analogy to LINQ. Starting queries with the keyword select gives the UI a chance to intelligently determine the type of query being written – staying with just a single input box is an explicit goal.

So what else can a SQL-like syntax offer? Grouping the number of items ordered by the item itself:

select sum(ItemsOrdered)
where Environment == "Production"
group by ItemId

Groupings are bread-and-butter for aggregate queries, so it’s handy that they carry over fairly naturally.

What about from? I don’t think I’m going to go after from at this point. The context of a query in Seq will initially be the events viewed in the UI, filtered down to whatever signals are active. There’s lots of room to extend this down the track a bit, but I think filtering and grouping is enough to bite off for a start.

There’s one last core concept the query syntax needs. Log events are time-dimensioned, and dealing with time requires some up-front attention.

Poking around, time groupings seem to have been grafted onto SQL in a few different ways, in traditional databases, event processing systems and timeseries databases. Approaching this the obvious way by making use of the built-in @Timestamp property attached to Seq events could look like:

select sum(ItemsOrdered)
where Environment == "Production"
group by hour(@Timestamp), ItemId

The awkwardness of this approach isn’t apparent until more exotic requirements show up – grouping by 20 hour blocks, or offsetting queries into another (non-UTC) timezone. I’m also not sure I want to type @Timestamp dozens of times a day.

Instead, I’m exploring the idea of a “time expression” syntax like the one used by InfluxDB, where the size of the interval is specified as a literal like time(10s):

select sum(ItemsOrdered)
where Environment == "Production"
group by time(1h), ItemId

Melding this to the existing expression parser is going to be fun!

Read Part 3: An Opportunistic Parser

Aggregate Queries in Seq Part 1: Goals

To add a bit of variety to the format of this blog, I’ve decided to try diarising a month of programming – November 2015 to be exact, if you’re reading this in the future!

This month I’ve got some steep goals to face: I want to ship a preview of Seq’s next major feature – aggregate queries – by the end of the month. I’m not starting from scratch, but pulling together the current progress into a complete feature is still a lot of work and there are many design decisions yet to make. I don’t intend to post an update every day (I’d have no time for actually writing the code ;-)) but hopefully every few days I can get an installment up here.

So, the first of these diary entries: why am I even working on aggregate queries, and what are they, anyway?

Constraint is a wonderful aid to creation, since without the months-end deadline breathing down my neck I’d no doubt have more to say about this here, but in the interest of making progress, it’s quicker and easier to explain by example: the aggregates we’re talking about are count(), distinct(), sum(), min(), max(), mean(), percentile() and some of their lesser-known friends.

Log data is great for answering ad hoc questions about how and app behaves and is used. A big enhancement to Seq’s analytical capabilities today (which otherwise fall back on exporting tabular data to Excel) would be to ask it questions like:

  • Which exception types have occurred today, and how many of each type?
  • Are average transaction processing times improving or degrading?
  • How many items on average do customers check out?

Aggregate queries enable this, and up all kinds of ways to learn more from the data that’s already collected.

Some of these capabilities overlap with what dedicated metrics can also provide. I am a huge believer in the benefits of measuring and dashboarding anything that moves. Metrics and logs aren’t the same thing though, and the scenarios and usage patterns for each can be startlingly different, from collection right through to storage and processing. Seq can be (and already is) used for very light metrics duties, but in the interest of doing one thing well the immediate goal for aggregation in Seq is to answer ad hoc questions from log data rather than perform heavy-duty timeseries crunching.

Implementing aggregates in Seq means implementing from the ground, up. There’s no SQL database behind the scenes to do the heavy lifting – everything from parsing to planning and executing the queries needs to be done by hand in C#. I’m expecting to learn a lot along the way. It should make for an interesting month, wish me luck! :-)

Read Part 2: Defining a Syntax