Awesome

Terse Logback Showcase

This is an example project to show how to integrate logging in an application using terse-logback and Blacklite.

It is a Play application written using Play's Java API.

There is an example application running at https://terse-logback-showcase.fly.dev/ which has pictures of cats.

Running Locally

You will need sbt installed to run.

$ sbt

This gets you into the Play console.

To start the play application, type run, and then click on http://localhost:9000/ in a browser when prompted.

[terse-logback-showcase] $ run

--- (Running the application, auto-reloading is enabled) ---

[info] p.c.s.AkkaHttpServer - Listening for HTTP on /0:0:0:0:0:0:0:0:9000

(Server started, use Enter to stop and go back to the console...)

FfRhBqPfTYg6O0Qbm7EAAA 20:41:50.040 [INFO ] p.a.d.DefaultDBApi -  Database [logging] initialized at jdbc:h2:mem:terse-logback;DB_CLOSE_DELAY=-1
FfRhBqPfTZU6O0Qbm7EAAA 20:41:50.054 [INFO ] p.a.d.HikariCPConnectionPool -  Creating Pool for datasource 'logging'
FfRhBqPfTaE6O0Qbm7EAAA 20:41:50.065 [INFO ] c.z.h.HikariDataSource -  HikariPool-1 - Starting...
FfRhBqPfTaE6O0Qbm7EAAB 20:41:50.065 [INFO ] c.z.h.HikariDataSource -  jdbc-appender-pool-1582000909357 - Starting...
FfRhBqPfTbE6O0Qbm7EAAA 20:41:50.081 [INFO ] c.z.h.HikariDataSource -  HikariPool-1 - Start completed.
FfRhBqPfTbg6O0Qbm7EAAA 20:41:50.089 [INFO ] c.z.h.HikariDataSource -  jdbc-appender-pool-1582000909357 - Start completed.
FfRhBqPfTf46O0Qbm7EAAA 20:41:50.159 [INFO ] a.e.s.Slf4jLogger -  Slf4jLogger started
FfRhBqPfTkI6O0Qbm7EAAA 20:41:50.227 [INFO ] p.a.h.EnabledFilters -  Enabled Filters (see <https://www.playframework.com/documentation/latest/Filters>):

    play.filters.csrf.CSRFFilter
    play.filters.headers.SecurityHeadersFilter
    play.filters.hosts.AllowedHostsFilter

FfRhBqPfTmg6O0Qbm7EAAA 20:41:50.264 [INFO ] play.api.Play -  Application started (Dev) (no global state)

You'll be able to see the web page and see the logs page as well.

Logging Configuration

There are four logging appenders configured.

Text: this writes out line oriented output to logs/application.log.
JSON: this writes out JSON output to logs/application.json.
Console: this writes ANSI (colored) line output to stdout.
Blacklite: This writes out JSON to SQLite through the Blacklite appender.

Backtracing with SQLite

The root logging level is set to TRACE, and the Blacklite appender stores diagnostic logging (DEBUG & TRACE) events, offloading using a background thread and configuring SQLite with memory-mapped I/O and a write-ahead-log for better write performance.

This means that you get the complete logs for a request on an error. I am calling this feature "backtracing", because you can go back and see the TRACE and DEBUG messages for a request when necessary.

Because SQLite has built in JSON support, you can query logs by correlation id, and the appender will automatically reap older log entries past the default size.

Flake ID Support

All logging events have a flake id generated and rendered in the appropriate format across different channels. This allows you to get a general overview from console and text files, and be able to drill down to a full view with JSON data.

For example, in application.log, you'll see a single line that starts with FfwJtsNHYSw6O0Qbm7EAAA:

FfwJtsNHYSw6O0Qbm7EAAA 2020-03-14T05:30:14.965+0000 [INFO ] play.api.db.HikariCPConnectionPool in play-dev-mode-akka.actor.default-dispatcher-7 - Creating Pool for datasource 'logging'

You can search for this string in application.json and see more detail on the log record:

{"id":"FfwJtsNHYSw6O0Qbm7EAAA","relative_ns":20921024,"tse_ms":1584163814965,"start_ms":null,"@timestamp":"2020-03-14T05:30:14.965Z","@version":"1","message":"Creating Pool for datasource 'logging'","logger_name":"play.api.db.HikariCPConnectionPool","thread_name":"play-dev-mode-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000}

Flake IDs are also k-ordered, meaning that they are "roughly time-ordered when sorted lexicographically."

Relative Nanoseconds

All logging events also contain a nanosecond increment, using System.nanoTime as the base, relative to the JVM start time. This value may be negative to begin with, but always increments.

LoggingEvent already has a timestamp associated with it, but that timestamp is generated by System.currentTimeMillis. For most appenders, this would be fine, but appending to Blacklite is so fast that several hundred events can be appended in the same millisecond, exceeding the resolution and making direct time comparison difficult. Adding a relative_ns field provides resolution down to the nanosecond, sort of.

Technically, sorting on the flake id would also sort correctly, but would not solve the problem of determining how much after the second record was, and so the best thing to do here is add an extra field.

For example, here's two different records with the same millisecond.

{"id":"FfwJtsNLLWo6O0Qbm7EAAA","relative_ns":11808036,"tse_ms":1584163603315,"start_ms":null,"@timestamp":"2020-03-14T05:26:43.315Z","@version":"1","message":"HikariPool-2 - Start completed.","logger_name":"com.zaxxer.hikari.HikariDataSource","thread_name":"play-dev-mode-akka.actor.default-dispatcher-7","level":"INFO","level_value":20000}
{"id":"FfwJtsNLLWo6O0Qbm7EAAB","relative_ns":11981656,"tse_ms":1584163603315,"start_ms":null,"@timestamp":"2020-03-14T05:26:43.315Z","@version":"1","message":"jdbc-appender-pool-1584163602961 - Start completed.","logger_name":"com.zaxxer.hikari.HikariDataSource","thread_name":"logback-appender-ASYNC_JDBC-2","level":"INFO","level_value":20000}

Note that the timestamp is 2020-03-14T05:26:43.315Z and the time since epoch is 1584163603315. The flake ids distinguish between log entries by using a counter when millisecond precision is exceeded, so the first record is FfwJtsNLLWo6O0Qbm7EAAA ending in A and the second record is FfwJtsNLLWo6O0Qbm7EAAB ending in B. The relative time tells us exactly how much time has elapsed between the two events: 11981656 - 11808036 is 0.17362 milliseconds.

Logging Instrumentation with Byte Buddy

One of the more fun things you can do with Terse Logback is to instrument jar files to add logging entry/exit statements at run time. The full documentation is here.

Note that will need to run Play in production mode to load the byte-buddy agent. The easiest way to do this is to run sbt stage and then run the script:

export PLAY_APP_SECRET=some-long-secret-to-appease-the-entropy-gods
target/universal/stage/bin/terse-logback-showcase -Dconfig.resource=application.prod.conf

See the Procfile for reference.

So, as an example, let's say that we suspect that there's a bug in the assets builder code base. We instrument the controllers.AssetsBuilder class by adding the following to the logback.conf file:

logback.bytebuddy {
  service-name = "terse-logback-showcase"

  tracing {
    # This class doesn't have any tracing built into it, but we can add it using instrumentation
    "controllers.AssetsBuilder" = ["*"]
  }
}

This means that AssetsBuilder will generate log records at TRACE level with messages like:

entering: controllers.AssetsBuilder.play$api$mvc$Results$_setter_$Continue_$eq(play.api.mvc.Result) with arguments=[Result(100, Map())] from source Assets.scala:715
exiting: controllers.AssetsBuilder.play$api$mvc$Results$_setter_$Continue_$eq(play.api.mvc.Result) with arguments=[Result(100, Map())] => (return_type=void return_value=null) from source Assets.scala:715

As with the other trace messages, these will all be written out to the blacklite appender. Note that these traces will not have a correlation id identifying them as part of the request, as the correlation id is added in application code.

Error Reporting with Backtraces

Error Reporting is done through the error handlers. There are two integrations, Sentry and Honeycomb.

Error reporting makes use of diagnostic log entries tied to the request id, which is treated as the correlation id.

Sentry Integration

The SentryHandler uses the out of the box sentry client to send error information to Sentry.

Backtraces are sent by querying the LogEntryFinder for log entries matching the request id, and mapping them into breadcrumbs. The Breadcrumb API doesn't have a "TRACE" level, so all breadcrumbs must be logged at "DEBUG".

Honeycomb Integration

The HoneycombHandler uses the event API through terse-logback honeycomb client.

All errors are sent in as traces to Honeycomb using manual tracing, with the error as the root span. Backtraces are sent by sending span events, since log entries do not have a duration in themselves.

Deploying to fly.io

Make sure you have flyctl installed.

The fly.toml file is already set up, will need to change HONEYCOMB_DATASET and the app name to your instance. Fly will handle the HTTPS automatically.

Set the secrets in fly.io appropriately. You can get some random output by running head -c 32 /dev/urandom | base64.

fly secrets set PLAY_APP_SECRET=<secret>
fly secrets set SENTRY_DSN=$SENTRY_DSN 
fly secrets set HONEYCOMB_API_KEY=$HONEYCOMB_API_KEY

To deploy or launch on fly.io:

sbt clean stage docker:publishLocal
cd target/docker/stage
# fly launch with copied fly.toml if not already running...
fly deploy --app terse-logback-showcase

Starting and stopping the machine is a bit odd, because the id is only available from the "Allocations" section on the monitoring page, and you still need to include the --app flag.

fly machine stop 651b3f43 --app terse-logback-showcase