2015 in Review

Another year is almost over! I’ve looked back through the posts on this blog from the past year and made some notes in case you’re looking for some holiday reading.

This was a huge year for Serilog. January kicked off with a few tens of thousands on the NuGet package download counter, but closing up the year it’s over 250k and accelerating. With Serilog-style message template support set to land in ASP.NET 5, I think it is safe to say 2016 is the year we’ll see structured logging hit mainstream in .NET.

Seq has also seen huge (triple-digit) growth this year, especially since v2 shipped in June. Keeping up has been a challenge! Along with a new major version in the first quarter next year, there’s a lot coming for us in 2016 – stay tuned for some updates after the break.

2015

  • Give your instrumentation some love in 2015! — I started this year aware that the vast majority of .NET teams are still writing plain-text logs, collecting them with Remote Desktop and reading them in Notepad. It feels like this is improving but there’s still a long way to go before we’re all using the current generation of tools effectively.
  • Using Serilog with F# Discriminated Unions — Serilog gained some better F# support this year. (Also on the F# front, Adam Chester’s implementation of Message Templates in F# has opened up some possibilities with that language. Logary 4 also promises some Serilog-style structured goodness for F# users sometime in the coming year.)
  • Tagging log events for effective correlation — Some tips for tracing related paths of execution through your application logs.
  • Diagnostic logging in DNX/ASP.NET 5 — The ASP.NET 5/CoreCLR platform has changed significantly since this first tentative post describing Serilog support went out in May, but the fundamentals are still pretty well summed-up here. ASP.NET 5 and CoreCLR are the big focus areas for Serilog’s upcoming 2.0 release, which has been ticking away on GitHub for a few months now. The platform reset going on in .NET right now is going to take some getting used to, but in a few years we’ll be able to thank the current ASP.NET and CoreFX teams, as well as the mass of community contributors, for the continued relevance and growth of .NET. 2016’s going to be a year for us all to rally and show some support for this work.
  • Seq/2 Update, Seq/2 Beta and Announcing Seq 2 RTW — It’s hard to believe Seq 2 has only been out since June. These few posts track the release of Seq 2, which was a complete UI rewrite and major overhaul of Seq v1. (2.1 followed, as did 2.2 and 2.3. Seq is now at version 2.4).
  • Filtering with Signals in Seq 2 — Explains the new filtering system in Seq 2.
  • Contender for .NET’s Prettiest Console? — If you’re not using Serilog’s Literate Console sink, you need to check out this post.
  • Contextual logger injection for Autofac — If you prefer to inject ILogger using your IoC container, this post is for you.
  • Assigning event types to Serilog events — Seq’s “event type” system can be implemented in simple flat-file logs too, for improved searching/filtering.
  • Aggregate Queries in Seq Part 1: Goals — The first of a series of posts documenting a spike through Seq v3’s SQL Query interface. (Parts 2, 3, 4, 5 and 6.)
  • How to notify Slack using logs from your .NET app — Seq + Slack = <3.

Thanks for visiting and commenting this year. Have a happy and safe holiday season, and see you in 2016!

How to notify Slack using logs from your .NET app

If your team uses Slack, it’s a great place to centralize notifications that might otherwise end up cluttering your email inbox. Commits, build results, deployments, incoming bug reports – Slack keeps your team informed without making everyone feel overloaded with information, which is why I think I like it so much.

The next step down the road to notification heaven, after setting up integrations for the third party apps and services you use, is to integrate your own apps into Slack.

Doing this directly – posting to the Slack API when interesting things happen – is a bit too much coupling, but if your app writes log events these can be used to trigger notifications in Slack with very little effort.

EventInSlack

So that Slack isn’t flooded with irrelevant events, we’ll forward them to Seq first so that interesting events can be selected from the stream.

1. Write and send the log events

In the demo app below, Serilog is configured to send events both to Seq and the local console.

First up install some prerequisite packages:

Install-Package Serilog
Install-Package Serilog.Sinks.Seq
Install-Package Serilog.Sinks.Literate

We’re using Serilog’s Literate Console sink because it reveals more about the structure of the log events (using colour) than the standard console can.

Here’s the code:

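using System;
using System.Threading;
using Serilog;

class Program
{
    static void Main()
    {
        // Send every event both to the local Seq server and to the console
        Log.Logger = new LoggerConfiguration()
            .WriteTo.Seq("http://localhost:5341")
            .WriteTo.LiterateConsole()
            .CreateLogger();

        Log.Information("Starting up");

        var rng = new Random();
        while (true)
        {
            // A random whole number from 1 to 100
            var amount = rng.Next(1, 101);

            // Amount is captured as a structured property, not just as message text
            Log.Information("Received an order totalling ${Amount:0.00}", amount);

            Thread.Sleep(1000);
        }
    }
}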

This program generates a random integer between 1 and 100 each second, and writes this to the log as a property called Amount. Let’s imagine we’re processing sales on a website, or reading measurements from a sensor – the same approach covers a lot of scenarios.

Console

2. Install Seq

If you don’t have Seq running on your development machine, install it from the Seq downloads page – just click through the installer dialog and choose “Browse Seq” at the end to view these log events.

Seq

3. Choose events to notify on

Here’s a twist: so that we’re not overwhelmed with notifications, we’ll only raise one if the value of the “sale” is $90 or more. To find these events in Seq, we first write the filter Amount >= 90. Once we’re confident the right events are selected, the filter can be saved as a signal.

CreatingSignal

The name of the signal is important since we’ll use it to configure the Slack integration in a moment.

4. Add the Slack integration for Seq

The Slack integration for Seq is developed and maintained on GitHub by David Pfeffer. Seq plug-in apps are published on NuGet – this one has the package id Seq.App.Slack.

To install it into your Seq instance go to Settings, then Apps, and choose Install from NuGet. Enter the package name Seq.App.Slack and install.

Installing

A few seconds later the app should appear in your app list and be ready to configure. To the right of the app name, choose Start new instance….

Installed

5. Configure the WebHook

Give the instance a name like “Big Sales Incoming!” and un-check Only trigger the app manually. The name of the signal we created earlier should now be there in a drop down to select.

AppSetup

The last thing to set is the WebHook URL setting. The easiest way to get one of these is to open the channel you’re posting to in Slack and choose + Add an app or custom integration. This will take you to a Slack page which at the time of writing has just gone through a major overhaul. The current path through the site to add a webhook is:

  1. Choose Build your Own in the top right-hand corner of the page
  2. Under Something just for my team choose Make a Custom Integration
  3. Here’s where you can choose Incoming WebHooks and follow a couple of prompts to get the URL

It’s a bit of an obscure way to do something that’s fairly common – I’m hopeful this will be improved once the redesign settles in 🙂

Back to Seq, paste in the URL, Save Changes and you’re good to go! Events will now start appearing in Slack whenever the Amount is $90 or more.

EventInSlack

Happy logging!

Aggregate Queries in Seq Part 6: Serving Data

Let’s get it out of the way up front – I didn’t manage to fit this series into November. Hrmmmmm… sorry! In the end, I prioritised getting an early preview release out ahead of finishing the blog series documenting the process. The upside is – you can grab it now! The basics of aggregate queries, as they’ll appear in Seq 3, are in preview form on the Seq downloads page.

In what will be the final post in this series for now, I want to show you how the results of running an aggregate query are surfaced in Seq’s HTTP API, and also give a shout-out to some handy tools and techniques it uses along the way.

A syntactic digression…

Before we dive into how Seq’s API is structured, there’s one small tweak to the “SQL-like” query language that I should mention. To avoid endless confusion about what that “-like” really means, the new syntax added in Seq 3 will, as much as possible, be a dialect of SQL. Introducing fewer new things means everyone has less to hold in their head while using Seq, which fits Seq’s goal of getting out of the way while you navigate logs.

The most noticeable change here is that all queries (except trivial scalar ones such as select 41 + 1 as Answer) now have a from clause of the form from stream. The “stream” is Seq’s way of describing “whichever stream of events you’re currently looking at”. Someday other resources will be exposed through the query interface, and the from clause will enable that.

Other changes you’ll spot are support for SQL operators such as and, or, not and even like. Single-quoted strings are supported, as are SQL comparisons = and <> (though a bug in the published build prevents the use of the latter).

By aligning better with SQL I hope Seq’s querying facilities can remain easy to learn while evolving to support more sophisticated uses.

Resources and Links

Back to the API. Seq is developed API-first, a strategy I picked up from Paul Stovell while working on Octopus Deploy, and something that’s made a huge impact on the way I’ve approached web application development ever since.

Seq also employs some of the ideas associated with hypermedia APIs, notably links and URI templates. (In my experience so far these are the techniques, along with semantic use of HTTP verbs, that have delivered the most value in application development. Some of the more sophisticated hypermedia concepts like custom media types for standard interchange formats are very interesting but I’m not seeing much use day-to-day. Having said that, wurl is one tool I’ve seen that made me wish we all did use HAL or JSON-API.)

Linking means that the entire Seq API can be navigated from a single root document. At https://your-seq-server/api (here if you’re on localhost) there’s a JSON document with links into all the various resources that the server provides.

API Root

The green box (shout out to Greenshot) shows the resource endpoint where queries are going to be exposed as data sets.

My test Seq instance is configured to listen under the /prd URL prefix, but unless you’ve set this up explicitly you won’t see this part of the path on your own instance.

Data Resources

I’ll admit that there isn’t much of a hypermedia angle in this particular use case – rowsets aren’t obviously entities the way signals, users and even events are in Seq’s world-view – but using the same machinery for this as the remainder of the API keeps everything working harmoniously.

The awesome thing from a consumer standpoint though, is the self-documenting nature of the whole thing. Note the "Query" link in the image above. This is a URI template describing how calls to the query endpoint should be formatted.

GET http://your-seq-server/api/data?q=select%20count(*)%20from%20stream&rangeStartUtc=2015-12-08T22:20:22.000Z HTTP/1.1

The signalId segment of the URI is optional, so it doesn’t appear in this request.

In the JavaScript client the call looks like:

api.data.query({ q: 'select count(*) from stream', rangeStartUtc: start }).then(rowset => {
  // rowset is the query result
});

Here, if the client did specify a signalId, the URI template ensures it would be formatted into a URL segment rather than being specified as a query string parameter like the other parameters are, but the client code doesn’t have to be aware of the difference. This makes it nice and easy to refactor and improve the URL structure of the API (even during development) without endlessly poking around in JavaScript string concatenation code.
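To make the segment-versus-query-string distinction concrete, here is a minimal C# sketch of expanding the two URI template operators that the Query link uses, {/name} and {?a,b,c}. It is an illustration only: UriTemplateSketch and Expand are hypothetical names, it covers just these two forms rather than the whole of RFC 6570, and it is not how Seq or the JavaScript client does it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Illustrative only: handles just the "{/name}" and "{?a,b,c}" forms,
// not the full RFC 6570 URI template syntax.
static class UriTemplateSketch
{
    public static string Expand(string template, IDictionary<string, string> values)
    {
        return Regex.Replace(template, @"\{([/?])([^}]+)\}", match =>
        {
            var op = match.Groups[1].Value;

            // Keep only the variables the caller actually supplied.
            var names = match.Groups[2].Value.Split(',')
                .Where(n => values.ContainsKey(n) && values[n] != null)
                .ToArray();

            if (names.Length == 0)
                return "";

            // "{/name}" becomes a path segment...
            if (op == "/")
                return string.Concat(names.Select(n => "/" + Uri.EscapeDataString(values[n])));

            // ...while "{?a,b}" becomes query string parameters.
            return "?" + string.Join("&", names.Select(n => n + "=" + Uri.EscapeDataString(values[n])));
        });
    }
}
```

Calling Expand with only q and rangeStartUtc yields a URL with no extra path segment; adding a signalId entry to the same dictionary moves that one value into the path without the caller changing anything else.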

On the client side, URI templates are handled with Franz Antesberger’s JavaScript implementation. For the most part this means a hash of parameters like the argument passed to query above can be substituted directly into the URI template, and validated for correct naming and so-on along the way.

Serving it up with Nancy

On the server side, NancyFX reigns. Like Octopus Deploy, Seq took a bet on Nancy in its pre-1.0 days and hasn’t looked back. The team behind Nancy is talented, passionate, but above all, highly considerate of its users to the extent that since adopting Nancy in the zero-point-somethings there’s barely been any breakage or churn between versions, let alone bugs. I can’t recommend Nancy highly enough, and consider it the gold standard for building web apps in .NET these days. It looks like I’m not alone in this opinion.

I find Nancy wonderful for building APIs because while it exposes a slightly quirky API of its own (a noble thing!), it isn’t at all opinionated about how you should structure yours. This also means that when dealing with Nancy there are sensible defaults everywhere, but very little deeply built-in policy – so minimal time is spent grappling with the framework itself.

Here’s the route that handles queries:

public class DataModule : SignalContentModule
{
    readonly Lazy<IEventStore> _events;

    public DataModule(Lazy<IEventStore> events)
        : base("data")
    {
        _events = events;

        Post[""] = p => Query(SignalsModule.MapToNewDocument(ReadBody<SignalEntity>()));
        Get["/{signalId}"] = p => Query(LoadAndCheck(p.signalId));
    }

This is a Nancy module that shares some functionality with the events module (though implementation inheritance does feel like a bit of an ugly hack, here as elsewhere). The two different instantiations of the Query URI template we viewed before need two routes in Nancy; the POST version accepts an unsaved signal in the body of the request, which is necessary because signals may be edited in the UI and queries run against them without saving the changes.

There’s one snippet responsible for declaring how the module works:

protected override ResourceGroup DescribeResourceGroup()
{
    var resource = base.DescribeResourceGroup();
    resource.Links.Add("Query", Qualify("{/signalId}{?q,intersectIds,rangeStartUtc,rangeEndUtc}"));
    return resource;
}

(I’ve often thought it would be nice to unify the description with the routing – there’s some duplication between this code and the route configuration above.)

I’ll include for you here the whole implementation of Query() in all its SRP-violating glory (I think the un-refactored version is nicer to read sequentially in a blog post, but as this goes from feature spike to a fully-fledged implementation I see some Ctrl+R Ctrl+M in the very near future):

Response Query(Signal signal = null)
{
    var query = (string)Request.Query.q;
    if (string.IsNullOrWhiteSpace(query))
        return BadRequest("A query parameter 'q' must be supplied.");

    DateTime? rangeStartUtc = TryReadDateTime(Request.Query.rangeStartUtc);
    DateTime rangeEndUtc = TryReadDateTime(Request.Query.rangeEndUtc) ?? DateTime.UtcNow;
    if (rangeStartUtc == null)
        return BadRequest("A from-date parameter 'rangeStartUtc' must be supplied.");

    if (rangeStartUtc.Value >= rangeEndUtc)
        return BadRequest("The queried time span must be of nonzero duration.");

    var filter = GetIntersectedSignalsFilter(signal);

    var result = _events.Value.Query(
        query,
        rangeStartUtc.Value,
        rangeEndUtc,
        filter: filter);

    if (result.HasErrors)
    {
        var response = Json(new QueryResultPart
        {
            Error = "The query could not be executed.",
            Reasons = result.Errors
        });
        response.StatusCode = HttpStatusCode.BadRequest;
        return response;
    }

    var data = new QueryResultPart
    {
        Columns = result.Columns,
        Rows = result.Rowset,
        Slices = result.Slices?.Select(s => new TimeSlicePart
        {
            Time = s.SliceStartUtc.ToIsoString(),
            Rows = s.Rowset                
        }).ToArray(),
        Statistics = new QueryExecutionStatisticsPart
        {
            ElapsedMilliseconds = result.Statistics.Elapsed.TotalMilliseconds,
            MatchingEventCount = result.Statistics.MatchingEventCount,
            ScannedEventCount = result.Statistics.ScannedEventCount,
            UncachedSegmentsScanned = result.Statistics.UncachedSegmentsScanned
        }
    };

    return Json(data);
}

I have come to wonder if anyone out there uses optional = null parameters as a semi-self-documenting “nullable” annotation for parameters like this:

Response Query(Signal signal = null)
{

I picked it up as a habit after Daniel Cazzulino (if my memory serves me correctly) suggested it as a way of marking optional dependencies in Autofac. Using a default value for nullable arguments expresses the intent that ? would, had nullability been first-class in the early days of C#.

The block of code all the way through to the initialization of filter slurps up parameters from the query string. Nancy has a binding system that might knock a line or two of code out here.
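The TryReadDateTime helper isn’t shown in this post. As a rough idea of what such a helper might do, here is a sketch that takes a plain string (the real one receives a value from Nancy’s dynamic query object) and treats the input as UTC. QueryStringSketch is a hypothetical name and the parsing rules are my assumption for illustration:

```csharp
using System;
using System.Globalization;

static class QueryStringSketch
{
    // Hypothetical stand-in for the TryReadDateTime helper used by Query().
    // Returns null when the parameter is missing or isn't a valid date.
    public static DateTime? TryReadDateTime(string raw)
    {
        if (string.IsNullOrWhiteSpace(raw))
            return null;

        DateTime parsed;
        var ok = DateTime.TryParse(
            raw,
            CultureInfo.InvariantCulture,
            // Treat zone-less input as UTC and always hand back a UTC value.
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal,
            out parsed);

        return ok ? parsed : (DateTime?)null;
    }
}
```

Returning null rather than throwing lets the caller decide which missing parameters are errors (rangeStartUtc) and which have a default (rangeEndUtc).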

The real action is in:

var result = _events.Value.Query(
    query,
    rangeStartUtc.Value,
    rangeEndUtc,
    filter: filter);

IEventStore.Query() is a wrapper method where the parsing, planning and execution steps take place.

Finally, the result is mapped back onto some POCO types to send to the caller. No magic here – but then again, I think the theme of these few blog posts has been to cut through the whole implementation in an anti-magical way. The types like QueryResultPart will eventually make their way to the Seq API client.

And that, my friends, is a wrap! It’s been fun sharing the process of spiking this feature. I’d like to continue this series into the Seq UI, but there’s a lot to do in the next few months as Seq 3 comes together into a shippable product. I’m excited about how much aggregate queries will enable as part of that. In the meantime though I need to report on the progress that’s been happening in Serilog in preparation for cross-platform CoreCLR support and ASP.NET 5 integration – look out for an update on that here soon.

I’ll leave you with a snapshot of an aggregate query in action.

Screenshot

Download the 3.x preview installer here and check out the documentation here.

Aggregate Queries in Seq Part 5: Execution

Part 5 was very nearly the stalling point in this blog series. I’ve got enough of the implementation done that I can see the finish line, and I’m eager to get that build out, but to really finish the story I need to fill in this installment. If this post is a little brief, please read it as a “status report” this time around 🙂

I’ve also had a bit of time now to revisit decisions made in the earlier stages of building this feature. I had some honest and valuable feedback from Michael Chandler on Twitter regarding the “SQL-like” nature of the syntax:

Upon reflection I think it will be easier to explain how to use aggregate queries if the language simply is SQL, or a dialect thereof, anyway. So, I’ve been back to rework some of the parser and now in addition to the C#-style expression syntax, typical SQL operators such as =, and, or, like, and not as well as single-quoted strings are available:

select count(*)
where Environment = 'production' and not has(@Exception)
group by time(7d), Application

There are still some questions to answer around how much this flows back the other way into typical filter expressions. On the one hand, it’d be nice if the filter syntax and the where clause syntax were identical so that translating between queries and filters is trivial. On the other hand, keeping the languages a bit tighter seems wise. For now, the syntaxes are the same; I’m going to spend some time using the SQL syntax in filters and see how it goes in practice.

Anyway, back to the topic at hand. Now we’re getting somewhere! The aggregate query parser handles the syntax, the planner can produce a query plan, and we need to turn that into a result set.

This post considers three questions:

  • What inputs are fed into the executor?
  • What does the result set look like?
  • How is the result computed?

The first turns out to be predictably easy, given the efforts expended so far to generate a plan, and the existing event storage infrastructure. Keeping things as simple as possible:

static class QueryExecutor
{
    public static QueryResult Execute(
        QueryPlan plan,
        IEventStore store,
        DateTime rangeStartUtc,
        DateTime rangeEndUtc)

Here plan is the output of the last step, store is a high-level interface to Seq’s time-ordered disk/RAM event storage, and the two range parameters specify the time slice to search. (The implementation as it stands includes a little more detail, but nothing significant.)

A QueryResult lists the column names produced, and either a list of rows, or a list of time slices that each carry a list of rows:

Results

I decided for now to keep the concept of “time slice” or sample separate (time could simply have been another column in the rowset) because it makes for a friendlier API. I’m not sure if this decision will stick, since tabular result sets are ubiquitously popular, but when “series” are added as a first-class concept it is likely they’ll have their own more optimal representation too.
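For readers without the diagram, the shape described above can be sketched roughly as follows. The property names are borrowed from the ones the Query() route in Part 6 reads (Columns, Rowset, Slices, SliceStartUtc); the class names, the element types and the IsTimeSliced helper are assumptions made for illustration.

```csharp
using System;

// Hypothetical shapes only; the real types carry more detail.
class TimeSliceSketch
{
    // Start of the slice, e.g. midnight UTC for a time(1d) grouping.
    public DateTime SliceStartUtc { get; set; }

    // The rows computed for this slice alone.
    public object[][] Rowset { get; set; }
}

class QueryResultSketch
{
    // Column names, in the same order as the values in each row.
    public string[] Columns { get; set; }

    // Set for queries without a time grouping; null otherwise.
    public object[][] Rowset { get; set; }

    // Set for time-sliced queries; null otherwise.
    public TimeSliceSketch[] Slices { get; set; }

    public bool IsTimeSliced
    {
        get { return Slices != null; }
    }
}
```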

In between these two things – an input query plan and an output result – magic happens. No, just kidding actually. It’s funny how, once you start implementing something, the magic is stripped away and things that initially seem impenetrably complex are made up of simple components.

The core of the query executor inspects events one by one and feeds the matching ones into a data structure carrying the state of the computation:

AggregationState

First, the group that the event belongs to is determined by calculating each grouping expression and creating a group key. Against this, state is stored for each aggregate column being computed. The subclasses of Aggregation are themselves quite simple, like count():

class CountAggregation : Aggregation
{
    long _count;

    public override void Update(object value)
    {
        if (value == null)
            return;

        ++_count;
    }

    public override object Calculate()
    {
        return (decimal)_count;
    }
}

The value to be aggregated is passed to the Update() method (in the case of count(*) the * evaluates to a non-null constant) and the aggregation adds this to the internal state.

Once all of the events have been processed for a time range, Calculate() is used to determine the final value of the column. It’s not hard to map count(), sum(), min(), max() and so-on to this model.

Things are a little trickier for aggregates that themselves produce a rowset, like distinct(), but the basic approach is the same. (Considering all aggregate operators as producing a rowset would also work and produce a more general internal API, but the number of object[] allocations gets a little out of hand.)

Once results have been computed for a time slice, it’s possible to iterate over the groups that were created and output rows in the shape of the QueryResult structure shown earlier.
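Putting the pieces above together, the group-then-aggregate loop might look something like this minimal sketch. It handles a single aggregate column and uses a joined string as the group key, both simplifications of my own; AggregationStateSketch and its members are hypothetical names, not Seq’s actual types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the Aggregation base class and count() shown above.
abstract class AggregationSketch
{
    public abstract void Update(object value);
    public abstract object Calculate();
}

class CountSketch : AggregationSketch
{
    long _count;
    public override void Update(object value) { if (value != null) ++_count; }
    public override object Calculate() { return (decimal)_count; }
}

// Hypothetical: one aggregate column, grouped by any number of expressions.
class AggregationStateSketch<TEvent>
{
    class Group
    {
        public object[] Key;
        public AggregationSketch Aggregation;
    }

    readonly Func<TEvent, object>[] _groupings;
    readonly Func<TEvent, object> _aggregated;
    readonly Func<AggregationSketch> _createAggregation;
    readonly Dictionary<string, Group> _groups = new Dictionary<string, Group>();

    public AggregationStateSketch(
        Func<TEvent, object>[] groupings,
        Func<TEvent, object> aggregated,
        Func<AggregationSketch> createAggregation)
    {
        _groupings = groupings;
        _aggregated = aggregated;
        _createAggregation = createAggregation;
    }

    public void Update(TEvent evt)
    {
        // Evaluate each grouping expression, then build a lookup key from the values.
        var key = _groupings.Select(g => g(evt)).ToArray();
        var lookup = string.Join("\u001f", key.Select(v => v == null ? "" : v.ToString()));

        Group group;
        if (!_groups.TryGetValue(lookup, out group))
        {
            group = new Group { Key = key, Aggregation = _createAggregation() };
            _groups.Add(lookup, group);
        }

        // Feed the value being aggregated into this group's state.
        group.Aggregation.Update(_aggregated(evt));
    }

    // One row per group: the grouping values followed by the aggregate.
    public IEnumerable<object[]> Rows()
    {
        return _groups.Values.Select(g =>
            g.Key.Concat(new[] { g.Aggregation.Calculate() }).ToArray());
    }
}
```

With no grouping expressions at all, every event lands under the same (empty) key, which gives the single implied group.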

There’s obviously a lot of room for optimisation, but the goals of a feature spike are to “make it work” ahead of making it work fast, so this is where things will sit while I move on towards the UI.

One more thing is nagging at me here. How do we prevent an over-eager query from swamping the server? Eventually I’d like to be able to reject “silly” queries at the planning stage, but for now it’s got to be the job of the query executor to provide timeouts and cancellation. These are sitting in a Trello card waiting for attention…

In the next post (Part 6!) we’ll look more closely at Seq’s API and finally see some queries in action. Until then, happy logging!

Read Part 6: Serving Data

Aggregate Queries in Seq Part 4: Planning

Seq is a log server designed to collect structured log events from .NET apps. This month I’m working on adding support for aggregate queries and blogging my progress as a diary here. This is the fourth installment – you can catch up on Goals, Syntax, and Parsing in the first three posts.

So, this post and the next are about “planning” and “execution”. We left off a week ago having turned a query like:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Into an expression tree:

QueryAST

Pretty as it looks, it’s not obvious how to take a tree structure like this, run it over the event stream, and output a rowset. We’ll break the problem into two parts – creating an execution plan, then running it. The first task is the topic of this post.

Planning

In a relational database, “planning” is the process of taking a query and converting it into internal data structures that describe a series of executable operations to perform against the tables and indexes, eventually producing a result. The hard parts, if you were to implement one, seem mostly to revolve around choosing an optimal plan given heuristics that estimate the cost of each step.

Things are much simpler in Seq’s data model, where there’s just a single stream of events indexed by (timestamp, arrival order), and given the absence of joins in our query language so far. The goal of planning our aggregate queries is pretty much the same, but the target data structures (the “plans”) only need to describe a small set of pre-baked execution strategies. Here they are.

Simple projections

Let’s take just about the simplest query the engine will support:

select MachineName, ThreadId

This query isn’t an aggregation at all: it doesn’t have any aggregate operators in the list of columns, so the whole rowset can be computed by running along the event stream and plucking out the two values from each event. We’ll refer to this as the “simple projection” plan.

A simple projection plan is little more than a filter (in the case that there’s a where clause present) and a list of (expression, label) pairs representing the columns. In Seq this looks much like:

class SimpleProjectionPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public ProjectedColumn[] Columns { get; }

    public SimpleProjectionPlan(
        ProjectedColumn[] columns,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        Columns = columns;
        Filter = filter;
    }
}

We won’t concern ourselves much with FilterExecutionPlan right now; it’s shared with Seq’s current filter-based queries and holds things like the range in the event stream to search, a predicate expression, and some information allowing events to be efficiently skipped if the filter specifies any required or excluded event types.

Within the plan, expressions can be stored in their compiled forms. Compilation can’t be done any earlier because of the ambiguity posed by a construct like max(Items): syntactically this could be either an aggregate operator or a scalar function call (like length(Items) would be). Once the planner has decided what the call represents, it can be converted into the right representation. Expression compilation is another piece of the existing Seq filtering infrastructure that can be conveniently reused.

Aggregations

Stepping up the level of complexity one notch:

select distinct(MachineName) group by Environment

Now we’re firmly into aggregation territory. There are two parts to an aggregate query – the aggregates to compute, like distinct(MachineName), and the groupings over which the aggregates are computed, like Environment. If there’s no grouping specified, then a single group containing all events is implied.

class AggregationPlan : QueryPlan
{
    public FilterExecutionPlan Filter { get; }
    public AggregatedColumn[] Columns { get; }
    public GroupingInstruction[] Groupings { get; }

    public AggregationPlan(
        AggregatedColumn[] columns,
        GroupingInstruction[] groupings,
        FilterExecutionPlan filter = null)
    {
        if (columns == null) throw new ArgumentNullException(nameof(columns));
        if (groupings == null) throw new ArgumentNullException(nameof(groupings));
        Filter = filter;
        Columns = columns;
        Groupings = groupings;
    }
}

This kind of plan can be implemented (naively perhaps, but that’s fine for a first-cut implementation) by using the groupings to create “buckets” for each group, and in each bucket keeping the intermediate state for the required aggregates until a final result can be produced.

Aggregated columns, in addition to the expression and a label, carry what are effectively the constructor parameters for creating the bucket to compute the aggregate. This isn’t immediately obvious based on the example of distinct, but given another example the purpose of this becomes clearer:

percentile(Elapsed, 95)

This expression is an aggregation producing the 95th percentile for the Elapsed property. An AggregatedColumn representing this computation has to carry the name of the aggregate ("percentile") and the argument 95.
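As a sketch of how those “constructor parameters” might be used, here is a hypothetical column type that creates a percentile bucket from the name and argument it carries. The class names are made up for illustration, and the nearest-rank percentile calculation is my assumption; the post doesn’t say which method Seq uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class AggregateSketch
{
    public abstract void Update(object value);
    public abstract object Calculate();
}

class PercentileSketch : AggregateSketch
{
    readonly double _percentile;
    readonly List<double> _values = new List<double>();

    public PercentileSketch(double percentile)
    {
        if (percentile <= 0 || percentile > 100)
            throw new ArgumentOutOfRangeException(nameof(percentile));
        _percentile = percentile;
    }

    public override void Update(object value)
    {
        if (value != null)
            _values.Add(Convert.ToDouble(value));
    }

    // Nearest-rank method (an assumption): the smallest value with at
    // least the requested percentage of values at or below it.
    public override object Calculate()
    {
        if (_values.Count == 0)
            return null;

        var sorted = _values.OrderBy(v => v).ToArray();
        var rank = (int)Math.Ceiling(_percentile * sorted.Length / 100.0);
        return sorted[rank - 1];
    }
}

class AggregatedColumnSketch
{
    public string Label { get; }
    public string AggregateName { get; }
    public object[] Arguments { get; }

    public AggregatedColumnSketch(string label, string aggregateName, params object[] arguments)
    {
        Label = label;
        AggregateName = aggregateName;
        Arguments = arguments;
    }

    // The name picks the aggregate; the arguments are passed to its constructor.
    public AggregateSketch CreateBucket()
    {
        switch (AggregateName)
        {
            case "percentile":
                return new PercentileSketch(Convert.ToDouble(Arguments[0]));
            default:
                throw new NotSupportedException("Unknown aggregate: " + AggregateName);
        }
    }
}
```

Keeping the percentile on the column rather than on the bucket means one column can create a fresh, identically configured bucket for every group.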

Time slicing

Finally, the example we began with:

select count(*)
where ApplicationName == "Admissions"
group by time(1d), ExceptionType

Planning this thing out reveals a subtlety around time slices in queries. You’ll note that the time(1d) group is in the first (dominant) position among the grouped columns. It turns out the kind of plan we need is completely different depending on the position of the time grouping.

In the time-dominant example here, the query first breaks the stream up into time slices, then computes an aggregate on each group. Let’s refer to this as the “time slicing plan”.

class TimeSlicingPlan : QueryPlan
{
    public TimeSpan Interval { get; }
    public AggregationPlan Aggregation { get; }

    public TimeSlicingPlan(
        TimeSpan interval,
        AggregationPlan aggregation)
    {
        if (aggregation == null) throw new ArgumentNullException(nameof(aggregation));
        Interval = interval;
        Aggregation = aggregation;
    }
}

The plan is straightforward – there’s an interval over which the time groupings will be created, and an aggregation plan to run on the result.

The output from this query will be a single time series at one-day resolution, where each element in the series is a rowset containing (exception type, count) pairs for that day.
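To illustrate the time-dominant shape, here is a small sketch that floors each event’s timestamp to the start of its slice and keeps a separate set of counts per slice. Measuring slices from the start of the queried range is an assumption on my part, and TimeSlicingSketch is a hypothetical name; events are reduced to (timestamp, exception type) pairs to keep things short.

```csharp
using System;
using System.Collections.Generic;

static class TimeSlicingSketch
{
    // Floors a timestamp to the start of its slice, counting from rangeStartUtc.
    public static DateTime SliceStart(DateTime timestampUtc, DateTime rangeStartUtc, TimeSpan interval)
    {
        if (interval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(interval));
        if (timestampUtc < rangeStartUtc)
            throw new ArgumentOutOfRangeException(nameof(timestampUtc));

        var sliceIndex = (timestampUtc - rangeStartUtc).Ticks / interval.Ticks;
        return rangeStartUtc.AddTicks(sliceIndex * interval.Ticks);
    }

    // For each slice, a rowset of (exception type, count) pairs.
    public static SortedDictionary<DateTime, Dictionary<string, long>> CountBySlice(
        IEnumerable<KeyValuePair<DateTime, string>> events,
        DateTime rangeStartUtc,
        TimeSpan interval)
    {
        var slices = new SortedDictionary<DateTime, Dictionary<string, long>>();
        foreach (var evt in events)
        {
            var sliceStart = SliceStart(evt.Key, rangeStartUtc, interval);

            Dictionary<string, long> counts;
            if (!slices.TryGetValue(sliceStart, out counts))
            {
                counts = new Dictionary<string, long>();
                slices.Add(sliceStart, counts);
            }

            // TryGetValue leaves current at zero for a type not seen in this slice yet.
            long current;
            counts.TryGetValue(evt.Value, out current);
            counts[evt.Value] = current + 1;
        }
        return slices;
    }
}
```

Slices in which no events matched simply don’t appear in the output, which is part of why this form suits sparse data.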

The alternative formulation, where time is specified last, would produce a completely different result.

select count(*)
where ApplicationName == "Admissions"
group by ExceptionType, time(1d)

The output of this query would be a result set where each element contains an exception type and a timeseries with counts of that exception type each day. We’ll refer to this as the “timeseries plan”.

Both data sets contain the same information, but the first form is more efficient when exploring sparse data, while the second is more efficient for retrieving a limited set of timeseries for graphing or analytics.

To keep things simple (this month!) I’m not going to tackle the timeseries formulation of this query, instead working on the time slicing one because I think this is closer to the way aggregations on time will be used in the initial log exploration scenarios that the feature is targeting.

Putting it all together

So, to recap – what’s the planning component? For our purposes, planning will take the syntax tree of a query and figure out which of the three plans above – simple projection, aggregation, or time slicing – should be used to execute it.

The planner itself is a few hundred lines of fairly uninteresting code; I’ll leave you with one of the tests for it which, like many of the tests for Seq, is heavily data-driven.

[Test]
[TestCase("select MachineName", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed)", typeof(AggregationPlan))]
[TestCase("select MachineName where Elapsed > 10", typeof(SimpleProjectionPlan))]
[TestCase("select StartsWith(MachineName, \"m\")", typeof(SimpleProjectionPlan))]
[TestCase("select max(Elapsed) group by MachineName", typeof(AggregationPlan))]
[TestCase("select percentile(Elapsed, 90) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by MachineName, Environment", typeof(AggregationPlan))]
[TestCase("select distinct(ProcessName) group by MachineName", typeof(AggregationPlan))]
[TestCase("select max(Elapsed) group by time(1s)", typeof(TimeSlicingPlan))]
[TestCase("select max(Elapsed) group by time(1s), MachineName", typeof(TimeSlicingPlan))]
[TestCase("select count(*)", typeof(AggregationPlan))]
public void QueryPlansAreDetermined(string query, Type planType)
{
    var tree = QueryParser.ParseExact(query);

    QueryPlan plan;
    string[] errors;
    Assert.IsTrue(QueryPlanner.Plan(tree, out plan, out errors));
    Assert.IsInstanceOf(planType, plan);
}
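The cases in that test suggest a decision rule along the following lines. This is only a guess at the shape of the logic, not the real planner: QuerySummary, PlanKind and PlanChooser are hypothetical stand-ins for the syntax tree and plan types, and rejecting a time grouping in a non-leading position is my assumption based on the timeseries formulation being out of scope for now.

```csharp
using System;
using System.Collections.Generic;

public enum PlanKind { SimpleProjection, Aggregation, TimeSlicing }

// Hypothetical, much-simplified view of a parsed query.
public sealed class QuerySummary
{
    // True when any selected column is an aggregate such as count(*) or max(Elapsed).
    public bool HasAggregateColumns { get; set; }

    // Grouping expressions in source order, e.g. "time(1d)", "ExceptionType".
    public IReadOnlyList<string> Groupings { get; set; } = new string[0];
}

public static class PlanChooser
{
    static bool IsTimeGrouping(string grouping) =>
        grouping.StartsWith("time(", StringComparison.Ordinal);

    public static bool TryChoose(QuerySummary query, out PlanKind kind, out string error)
    {
        if (query == null) throw new ArgumentNullException(nameof(query));

        kind = PlanKind.SimpleProjection;
        error = null;

        // Find the position of the time grouping, if there is one.
        var timeIndex = -1;
        for (var i = 0; i < query.Groupings.Count; ++i)
        {
            if (IsTimeGrouping(query.Groupings[i]))
            {
                timeIndex = i;
                break;
            }
        }

        // Assumption: the timeseries formulation (time not first) isn't supported yet.
        if (timeIndex > 0)
        {
            error = "A time grouping must be the first grouping in the query.";
            return false;
        }

        if (timeIndex == 0)
            kind = PlanKind.TimeSlicing;
        else if (query.HasAggregateColumns || query.Groupings.Count > 0)
            kind = PlanKind.Aggregation;

        return true;
    }
}
```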

Part five will look at what has to be done to turn the plan into a rowset – the last thing to do before the API can be hooked up!

Read Part 6: Execution

Aggregate Queries in Seq Part 3: An Opportunistic Parser

It turns out the parser wasn’t a huge departure from Seq’s existing filter parser. Seq already uses Sprache to parse filter expressions, and Sprache parsers compose very nicely.

After making the current FilterExpressionParser “root” expression public, and defining some new AST nodes like Projection and so on, things just get bolted together:

static readonly Parser ExpressionValue =
    FilterExpressionParser.Expr.Token().Select(e => new ExpressionValue(e));

static readonly Parser Projection =
    from v in AggregateValue.Or(ExpressionValue)
    from l in Label.Optional()
    select new Projection(v, l.GetOrDefault());

Here you can see the way a projection like the count(*) as Total column is constructed from a parser for values, and a parser for optional ‘as’ labels. I had to define a separate parser for some aggregations, like count(*), that aren’t otherwise valid Seq filter syntax, but any existing expression that FilterExpressionParser supports can be used as the value of a projected column.

Heading further up towards the root of the grammar, we get something like:

static readonly Parser Query =
    from @select in Select
    from @where in Where.XOptional()
    from groupBy in GroupBy.XOptional()
    select new Query(@select, @where.GetOrDefault(), groupBy.GetOrDefault());

The resulting Query parser can take some text input and give back a tree of objects representing the parts of the query. Success!

There was one subtle problem here that you can spot by way of the oddly-named XOptional combinator. Sprache on GitHub provides Optional, which works as advertised, but upon failing a match will backtrack and return success regardless of whether a partial parse was possible or not.

This leads to error messages without a lot of information, for example:

select distinct(ExceptionType) group ApplicationName

is missing the ‘by’ required next to ‘group’. Using Optional the parser reports:

Syntax error (col 31): unexpected 'g'.

Hmmm. Not so good – there’s nothing at all wrong with that ‘g’! The problem is that upon failing to parse the ‘by’, Sprache’s Optional returned a zero-length successful parse, so parsing picks back up at that position and fails because there are no more tokens to match.

The ‘X’ in XOptional is for eXclusive, meaning that the token is optional, but only if it parses no input whatsoever. As soon as ‘group’ is parsed, the optional branch is considered “taken”, and failures will propagate up. (Sprache ships ‘X’ versions of several parsers already, such as an exclusive Many called XMany.)

Here it is:

public static Parser<IOption<T>> XOptional<T>(this Parser<T> parser)
{
    if (parser == null) throw new ArgumentNullException(nameof(parser));

    return i =>
    {
        var result = parser(i);
        if (result.WasSuccessful)
            return Result.Success(new Some<T>(result.Value), result.Remainder);

        // Nothing was consumed, so the optional element is simply absent.
        if (result.Remainder.Equals(i))
            return Result.Success(new None<T>(), i);

        // Input was partially consumed: report the failure where it happened.
        return Result.Failure<IOption<T>>(result.Remainder, result.Message, result.Expectations);
    };
}

The only divergence from the built-in Optional is that the zero-length parse succeeds only if (result.Remainder.Equals(i)), that is, only if the failed parser consumed no input at all.

Using XOptional:

Syntax error (col 38): unexpected 'A', expected keyword 'by'.

Better!

If you haven’t used parser combinators before this whole thing might be a bit surprising – where’s the EBNF? The esoteric command line tools with animal names? It turns out that combinators make parsing into a regular (somewhat imperative) programming task without a lot of mystery surrounding it.

There are some limitations in Sprache’s implementation I’d like to address someday – for example, the error reported on ‘A’ above rather than ‘ApplicationName’ is the result of parsing the raw character stream instead of a tokenised one – but these are minor inconveniences that can be worked around if need be.

If you haven’t looked into combinator-based parsing, there are some great tutorials and examples linked from Sprache’s README. It’s a technique worth adding to your tool belt regardless of the kind of programming you usually do. Little languages are everywhere, waiting to be cracked open!

The most enjoyable and challenging part of any language processing task for me is not so much the parsing though, but taking a tree of syntactic nodes like we have here, and turning it into something executable. That’s coming up next 🙂

Read Part 4: Planning

Aggregate Queries in Seq Part 2: Defining a Syntax

So, before we go any farther we’re going to need to pin down a bit more tightly what form aggregate queries will take. There are options, options, options – but hopefully a lot will fall out of how queries are expressed in Seq today.

The Seq filter box accepts predicates – expressions that evaluate to a Boolean in the context of an event.

Environment == "Production"

To express something like a sum, writing:

sum(ItemsOrdered)

…in the filter box seems reasonable, except when predicates get involved again. Combining the two expressions above into something that says “the number of items ordered in the production environment” is not obvious.

To introduce aggregates the filter syntax is going to have to stretch a bit, so that Seq can tell the difference between a simple predicate and a full-fledged query. Here goes:

select sum(ItemsOrdered) where Environment == "Production"

The plan is to reappropriate select and where from SQL to mark out the clauses. SQL-like queries have the strong advantage of being familiar, and loosely align with the rest of the “C#-like” syntax by their analogy to LINQ. Starting queries with the keyword select gives the UI a chance to intelligently determine the type of query being written – staying with just a single input box is an explicit goal.

So what else can a SQL-like syntax offer? Grouping the number of items ordered by the item itself:

select sum(ItemsOrdered)
where Environment == "Production"
group by ItemId

Groupings are bread-and-butter for aggregate queries, so it’s handy that they carry over fairly naturally.

What about from? I don’t think I’m going to go after from at this point. The context of a query in Seq will initially be the events viewed in the UI, filtered down to whatever signals are active. There’s lots of room to extend this down the track a bit, but I think filtering and grouping is enough to bite off for a start.

There’s one last core concept the query syntax needs. Log events are time-dimensioned, and dealing with time requires some up-front attention.

Poking around, time groupings seem to have been grafted onto SQL in a few different ways, in traditional databases, event processing systems and timeseries databases. Approaching this the obvious way by making use of the built-in @Timestamp property attached to Seq events could look like:

select sum(ItemsOrdered)
where Environment == "Production"
group by hour(@Timestamp), ItemId

The awkwardness of this approach isn’t apparent until more exotic requirements show up – grouping by 20-hour blocks, or offsetting queries into another (non-UTC) timezone. I’m also not sure I want to type @Timestamp dozens of times a day.

Instead, I’m exploring the idea of a “time expression” syntax like the one used by InfluxDB, where the size of the interval is specified as a literal like time(10s):

select sum(ItemsOrdered)
where Environment == "Production"
group by time(1h), ItemId

Melding this to the existing expression parser is going to be fun!
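For a feel of what a literal interval buys, here is a small sketch of how a timestamp might be assigned to a time(...) bucket, including odd sizes like 20 hours and a fixed timezone offset. The TimeBuckets class and its parameters are hypothetical, not anything in Seq, and the sketch assumes buckets are anchored at tick zero of the (offset) calendar rather than at the start of the query range.

```csharp
using System;

public static class TimeBuckets
{
    // Returns the UTC start of the bucket containing the given UTC timestamp.
    // zoneOffset shifts bucket boundaries so that, for example, day-sized
    // buckets begin at local midnight rather than UTC midnight.
    public static DateTime BucketStartUtc(DateTime utc, TimeSpan interval, TimeSpan zoneOffset)
    {
        if (interval <= TimeSpan.Zero)
            throw new ArgumentOutOfRangeException(nameof(interval));

        // Move into "local" ticks, round down to a multiple of the interval,
        // then move back to UTC.
        var localTicks = utc.Ticks + zoneOffset.Ticks;
        var flooredLocal = localTicks - localTicks % interval.Ticks;
        return new DateTime(flooredLocal - zoneOffset.Ticks, DateTimeKind.Utc);
    }
}
```

With this, time(1h) and time(20h) are the same code path with a different interval, which is much harder to say of hour(@Timestamp).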

Read Part 3: An Opportunistic Parser

Aggregate Queries in Seq Part 1: Goals

To add a bit of variety to the format of this blog, I’ve decided to try diarising a month of programming – November 2015 to be exact, if you’re reading this in the future!

This month I’ve got some steep goals to face: I want to ship a preview of Seq’s next major feature – aggregate queries – by the end of the month. I’m not starting from scratch, but pulling together the current progress into a complete feature is still a lot of work and there are many design decisions yet to make. I don’t intend to post an update every day (I’d have no time for actually writing the code ;-)) but hopefully every few days I can get an installment up here.

So, the first of these diary entries: why am I even working on aggregate queries, and what are they, anyway?

Constraint is a wonderful aid to creation: without the month-end deadline breathing down my neck I’d no doubt have more to say about this here. In the interest of making progress, though, it’s quicker and easier to explain by example: the aggregates we’re talking about are count(), distinct(), sum(), min(), max(), mean(), percentile() and some of their lesser-known friends.

Log data is great for answering ad hoc questions about how an app behaves and is used. A big enhancement to Seq’s analytical capabilities today (which otherwise fall back on exporting tabular data to Excel) would be to ask it questions like:

  • Which exception types have occurred today, and how many of each type?
  • Are average transaction processing times improving or degrading?
  • How many items on average do customers check out?

Aggregate queries enable this, and open up all kinds of ways to learn more from the data that’s already collected.

Some of these capabilities overlap with what dedicated metrics can also provide. I am a huge believer in the benefits of measuring and dashboarding anything that moves. Metrics and logs aren’t the same thing though, and the scenarios and usage patterns for each can be startlingly different, from collection right through to storage and processing. Seq can be (and already is) used for very light metrics duties, but in the interest of doing one thing well the immediate goal for aggregation in Seq is to answer ad hoc questions from log data rather than perform heavy-duty timeseries crunching.

Implementing aggregates in Seq means implementing from the ground up. There’s no SQL database behind the scenes to do the heavy lifting – everything from parsing to planning and executing the queries needs to be done by hand in C#. I’m expecting to learn a lot along the way. It should make for an interesting month, wish me luck! 🙂

Read Part 2: Defining a Syntax

Seq 2.4

Hot off the press and ready to download from the Seq site.

Seq 2.4 is largely a maintenance release with a host of bug fixes, but amongst those are some substantial improvements.

Filtering performance improvements

Seq implements its own query engine (soon to get some interesting new capabilities ;-)) optimized for the kind of messy, ad-hoc filtering that we do over log data. Based on the kinds of queries generated through real-world use of the 2.0 “signals” feature, two great little optimizations have landed in 2.4.

First, an overdue implementation of short-circuiting && (AND) and || (OR). This just means that the expression @Level == "Error" && Environment == "UAT" won’t bother evaluating the second comparison if the first one is false. Seq has always done some short-circuiting in queries, but only in certain cases. 2.4 extends this to all logical operations.

Second, and closely related, is term reordering based on expected execution cost. Some predicates are extremely cheap and others costly, for example an event type comparison ($FEE7C01D) can be evaluated several thousand times faster than a full-document text search (Contains(@Document, "penguin")). This means that, given short-circuiting operations, $FEE7C01D && Contains(@Document, "penguin") is much more efficient than the reverse Contains(@Document, "penguin") && $FEE7C01D in the common case of mostly-negative results. Seq 2.4 uses heuristics to weigh up the relative execution cost of each branch of an expression to make sure the fastest comparisons are performed first.

Both of these changes add up to substantial gains when using signals with a large number of applied filters.
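The two optimizations combine naturally, as this small sketch shows. It is not Seq’s query engine: CostedPredicate and Conjunction are hypothetical names, and the cost figures are numbers the caller supplies rather than the heuristics Seq actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// A predicate paired with a rough, caller-supplied estimate of how expensive it is.
public sealed class CostedPredicate<T>
{
    public Func<T, bool> Test { get; }
    public double EstimatedCost { get; }

    public CostedPredicate(Func<T, bool> test, double estimatedCost)
    {
        if (test == null) throw new ArgumentNullException(nameof(test));
        Test = test;
        EstimatedCost = estimatedCost;
    }
}

public static class Conjunction
{
    // Builds an AND over the terms, cheapest first, stopping at the first false result.
    public static Func<T, bool> Build<T>(IEnumerable<CostedPredicate<T>> terms)
    {
        if (terms == null) throw new ArgumentNullException(nameof(terms));

        // OrderBy is a stable sort, so equal-cost terms keep their original order.
        var ordered = terms.OrderBy(t => t.EstimatedCost).Select(t => t.Test).ToArray();

        return item =>
        {
            foreach (var test in ordered)
            {
                if (!test(item))
                    return false; // short-circuit: later (costlier) terms never run
            }
            return true;
        };
    }
}
```

When most events fail the cheap test, the expensive text search is skipped for nearly all of them, which is where the gain comes from.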

Restricted signals

Since Seq uses the signal mechanism for retention processing, it’s possible that an accidental change to a signal used in retention processing could lead to data loss. For this reason Seq 2.4 introduces lockable signals, requiring administrative privileges to modify.

[Image: Seq-2.4]

Online compaction

Seq uses the ESENT storage engine to manage files on disk. It’s an amazing piece of technology and very mature, but until recently it was unable to support compaction of data files during operation. Although retention policies would remove events from the file, Seq would periodically need to take a slice of the event stream offline to free the empty space in the file, and this process copies the old file into a new one. Mostly this background operation would be quick and transparent, but on heavily loaded servers the disk space and I/O required would sometimes significantly impact performance.

The new version 2.4, when running on a capable operating system (Windows 8.1+ or Server 2012 R2+), now takes advantage of ESENT’s sparse file support to perform compaction in real time, spreading out the load and avoiding additional disk usage and I/O spikes.

You can download the new version here on the Seq website.

Happy logging! 🙂

Assigning event types to Serilog events

One of the most powerful benefits of structured logging is the ability to treat log events as though “typed”, so that events generated by the same logging statement can be easily (and mechanically) identified in the log stream.

Given a logging statement parameterized by some data:

var total = 1;
for(var i = 0; i < 3; ++i)
{
   total += i;
   Log.Information("Computed iteration {Counter}, total is {Total}", i, total);
}

The text representation of each event (“Computed iteration 2, total is 4”) will be different.

A traditional text-based logging system necessitates the use of regular expressions to identify and parse messages created in the loop. This is a bigger problem than it sounds: once the event is interspersed through a large stream of unrelated messages this style of processing is both slow and error-prone, as well as inconvenient.

By contrast, a structured logger like Serilog or SLAB/ETW assigns ids or types to events, so that all events generated by the statement will carry a distinct type as well as the structured fields for Counter and Total. Queries using event types can find all of the events generated by a particular logging statement, even though their text representations may differ.

Enriching events with types

Serilog treats the message template itself as the event type. By attaching "Computed iteration {Counter}, total is {Total}" to each event, all events generated from the same template can be identified.

Long strings like this can be inconvenient to record and type, so it’s often useful to take a hash of this value instead, and record that as the event type alongside other data comprising the event. Seq does this automatically by assigning a type to Serilog events on the server-side. Using Elasticsearch to store events, you might achieve the same thing with a transform.

If you just want to have the convenience of searching by event type in regular flat log files, or if you’re using a log collector without this option, it’s easy to add support for it using a Serilog enricher:

Log.Logger = new LoggerConfiguration()
   .Enrich.With<EventTypeEnricher>()
   .WriteTo.LiterateConsole(
      outputTemplate: "{Timestamp:HH:mm:ss} [{EventType:x8} {Level}] {Message}{NewLine}{Exception}")
   .CreateLogger();

Line 2 adds the EventTypeEnricher that we’ll see below to the logging pipeline.

Line 3 shows the modified outputTemplate that includes the EventType value, in this case a 32-bit value formatted in hexadecimal. (This example writes to the Literate Console sink, which is a great way to visualize the structure of Serilog events even when they’re formatted into text.)

The enricher

In this example, event types are generated using a 32-bit Murmur3 hash. The relative merits of different hash algorithms and sizes for this purpose are a post in themselves – we’ll just use a readily-available one. Anything from string.GetHashCode() to SHA-1 might work here, depending on your needs.

The algorithm is from this package, which you’ll need to install first from NuGet.

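using System;
using System.Text;
using Serilog.Core;
using Serilog.Events;
// Also import the namespace of the Murmur3 package that provides MurmurHash.

class EventTypeEnricher : ILogEventEnricher
{
   public void Enrich(LogEvent logEvent, ILogEventPropertyFactory propertyFactory)
   {
      // Hash the raw template text rather than the rendered message, so that
      // every event from the same logging statement gets the same type.
      var murmur = MurmurHash.Create32();
      var bytes = Encoding.UTF8.GetBytes(logEvent.MessageTemplate.Text);
      var hash = murmur.ComputeHash(bytes);
      var numericHash = BitConverter.ToUInt32(hash, 0);

      // Attach the hash as EventType unless the event already carries one.
      var eventId = propertyFactory.CreateProperty("EventType", numericHash);
      logEvent.AddPropertyIfAbsent(eventId);
   }
}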

The enricher retrieves the original message template text, computes its hash, and attaches it to the event as the EventType property.
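If you’d rather not take a dependency on an extra package, the same idea works with a hash from the base class library. Here is a minimal sketch that truncates a SHA-1 hash to 32 bits; the EventTypes class is a hypothetical helper, not part of Serilog, and the values it produces will differ from the Murmur3 ones.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EventTypes
{
    // Derives a 32-bit event type from a message template by taking
    // the first four bytes of its SHA-1 hash.
    public static uint FromTemplate(string messageTemplate)
    {
        if (messageTemplate == null)
            throw new ArgumentNullException(nameof(messageTemplate));

        using (var sha1 = SHA1.Create())
        {
            var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(messageTemplate));
            return BitConverter.ToUInt32(hash, 0);
        }
    }
}
```

Calling EventTypes.FromTemplate(logEvent.MessageTemplate.Text) inside an enricher would slot in where the Murmur3 hash is computed, and formatting the result with "x8" gives the same eight-character hexadecimal form.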

The output

Taking our example again, including a few different events:

Log.Information("Starting up");

var total = 1;
for (var i = 0; i < 3; ++i)
{
    total += i;
    Log.Information("Computed iteration {Counter}, total is {Total}", i, total);
}

Log.Information("All done");

The output is:

[Image: EventTypesLiterateConsole]

Each iteration of the loop, despite carrying different values for Counter and Total, is tagged with the same event type f20ba6e0. The other two events carry their own distinct event types that identify them.

Summing up

Structured logging is a necessary response to the difficulty of sifting through log events from ever-larger, more distributed, more sophisticated applications. Alongside named property values, event types are a big part of the structured logging value proposition. You can use either raw message templates, or hashes of them, as types when working with Serilog.

If you’re using a solution that supports event types in your log data, it’d be awesome to hear how it has worked for you!