2017-12-26

Neugram is a scripting language that sticks very close to Go. Go statements are Neugram statements, you can import Go packages, scripts can be compiled to Go programs, and types look just like the equivalent Go types at run time (which means packages built on reflection, like fmt, work as expected). These requirements put a lot of restrictions on the design of Neugram. This post is about one such restriction on methods that I did not discover until I tried to use it without thinking.

Background: Go without declarations

When designing a language for use in a REPL (a read-eval-print-loop like your shell), you want to be able to dive right in and have code executed as quickly as possible. That is, a scripting language should be able to say "Hello, World!" in one reasonable line.

Popular scripting languages like Perl and Python use statements as the topmost grammatical construction. A simple statement can consist of a single expression, like the command to print a string.

Go is different. The topmost grammatical construction in Go is a declaration. Declarations consist of package-wide constants, variables, functions, types and methods. Inside declarations are statements. The statement is in charge of program control flow and contains some number of expressions. An expression is an actual computation, where we do the work of programming.

The concept of having a layer of declarations above statements is common in programming languages. Both C and Java have declarations. Declarations are useful. The order of top-level declarations in Go does not affect the order of execution of the program. This makes it possible to depend on names defined later in the file (or in an entirely different file in the package) without developing a system of forward declarations or header files.

One of the key changes that makes Neugram a different language from Go is that we do not have top-level declarations. Neugram starts with statements. We lose the advantages of declarations in exchange for executing statements quickly.

Without declarations, packages are restricted to a single file (to avoid thinking about order of file execution) and referring to names not yet defined is tricky, but the feel of many programs stays the same because in Go most declarations also work as statements. For example:

var V = 4
type T int

Method grammar

The one top-level declaration that we miss in Neugram is the method declaration. In Go you declare a method by writing:

func (t T) String() string {
	return fmt.Sprintf("%d", int(t))
}

Critically, this declaration does not stand on its own. You need another declaration somewhere in your package defining the type T. While type declarations can be made as statements, method declarations cannot. There are several possible arguments for why not, but given the current syntax, one is that it would introduce the notion of incomplete types to the run-time phase of Go programs. Imagine:

func main() {
	type T int
	var t interface{} = T(0)

	_, isReader := t.(io.Reader)
	fmt.Println(isReader) // prints false

	if rand {
		func (t T) Read([]byte) (int, error) {
			return 0, io.EOF
		}
	}

	_, isReader = t.(io.Reader)
	fmt.Println(isReader) // prints ... what?
}

Method declarations in Go break the complete definition of a type out over many top-level declarations. This works in Go because there is no concept of time for declarations: they all happen simultaneously, before the program is run. It won’t work in Neugram, where all declarations have to be made inside statements that happen during program execution.

Methodik

To resolve this, Neugram introduces a new keyword, methodik, to define a type together with all of its methods in a single statement.

methodik T int {
	func (t) Read([]byte) (int, error) {
		return 0, io.EOF
	}
}

This statement is evaluated in one step. The type T does not exist beforehand, and after the statement is evaluated it exists with all of its methods.
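For illustration, once the statement above has been evaluated, a type assertion sees the new Read method (a sketch mirroring the earlier example):

t := T(0)
var i interface{} = t
_, isReader := i.(io.Reader)
fmt.Println(isReader) // prints true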

So far so good.

Method closures: You can’t do that in Go

While testing out method declarations, I attempted to reimplement io.LimitReader. The version I came up with didn’t work:

func limit(r io.Reader, n int) io.Reader {
	methodik lr struct{} {
		func (*l) Read(p []byte) (int, error) {
			if n <= 0 {
				return 0, io.EOF
			}
			if len(p) > n {
				p = p[:n]
			}
			rn, err := r.Read(p)
			n -= rn
			return rn, err
		}
	}
	return &lr{}
}

Why not? Using the values r and n in a closure is normal Go programming, but this is something unusual: I am trying to construct a method closure.

An implication of methods being definable only by top-level declaration in Go is that there is no equivalent closure form. There is also no way (presently; issue #16522 may make it possible) to create a method using reflection, which would allow closing over variables.

This is not a particularly problematic limitation: we can move the free variables of the closure explicitly into the type being defined to get the same effect:

func limit(r io.Reader, n int) io.Reader {
	methodik lr struct{
		R io.Reader
		N int
	} {
		func (*l) Read(p []byte) (int, error) {
			if l.N <= 0 {
				return 0, io.EOF
			}
		if len(p) > l.N {
				p = p[:l.N]
			}
			rn, err := l.R.Read(p)
			l.N -= rn
			return rn, err
		}
	}
	return &lr{r, n}
}

Avoiding method closures also avoids some reflection surprises: two different lr types, defined as closing over different values, would probably have to be different types. That means run time creation of new types without the use of the reflect package, which is a category of possibilities I’m glad I don’t have to imagine.

The restriction itself, however, could be confusing for someone new to Neugram who doesn’t know about the limits of the Go underlying it. In particular, consider the interaction with global variables. It is fine for a method defined in Go to refer to globals, and so too in Neugram:

var x = "hello" // a global

methodik obj struct{} {
	func (o) String() string {
		return x // this is fine
	}
}

However, if we take this code and indent it into a block, the type checker will now have to produce an error, because x is no longer a global variable. This is unfortunate. In Go there is a clear distinction between global variables (defined by top-level declarations) and non-global variables (defined by statements). In Neugram they look similar, so this is one more thing the programmer has to track.
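For illustration, here is a sketch of that indented form; the reference to x must now be rejected, because it would require a method closure:

func f() {
	x := "hello" // now a local variable

	methodik obj struct{} {
		func (o) String() string {
			return x // type error: a method cannot capture a local
		}
	}
}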

Surprising expressivity

Accidentally introducing syntax for method closures is a good example of the kind of problem I have spent a lot of time trying to avoid in Neugram. Even the smallest changes to Go result in unexpected ways to write programs. I did not find this particular problem until months after creating the methodik syntax.

2017-12-25

Many Go projects can be built using only Go’s wonderful built-in tooling. However, for many others, these commands may not be sufficient. Maybe you want to use ldflags during the build to embed the commit hash in the binary. Maybe you want to embed some files into the binary. Maybe you want to generate some code. Maybe you want to run a half dozen different linters. That’s where a build tool comes into play.

I used to propose using a normal Go file to be run with go run. However, then you’re stuck building out a lot of the CLI handling yourself, which is busy work… no one wants to write another CLI parser for their project, plus error handling, plus output handling, etc.

You might consider make, which handles the CLI definition for you, but then you’re stuck with writing Bash. A few months ago, I decided that neither of these build tool options was sufficient, and decided to make a third way. Thus, Mage was born. Mage is a build tool similar to make or rake, but instead of writing bash or ruby, Mage lets you write the logic in Go.

There are many reasons to choose Mage over make. The most important is the language. By definition, the contributors to your Go project already know Go. For many of them, it may be the language they’re most comfortable with. Your build system is just as important as the thing it’s building, so why not make it just as easy to contribute to? Why have a second language in your repo if you can easily avoid it? Not only is bash an esoteric language to start with, make piles even more arcane syntax on top of bash. Now you’re maintaining effectively three different languages in your repo.

One thing I love about Go is how easy it is to make cross platform applications. This is where Mage really shines. Although make is installed by default on Linux and OSX, it is not installed by default on Windows (which, as Stack Overflow notes, is the most prevalent development OS). Even if you install make on Windows, now you have to get bash running, which is non-trivial (yes, you can install the Windows Subsystem for Linux, but now you’re up to a pretty big ask just to build your Go project).

Mage, on the other hand, is a plain old Go application. If you have Go installed (and I presume you do) you can simply go get github.com/magefile/mage. Mage has no dependencies outside the standard library, so you don’t even have to worry about a dependency manager for it. You can also download prebuilt binaries from github, if that’s preferable.

Once Mage is installed, you use Mage much like make, in that you write one or more scripts (in this case, normal Go files that we call magefiles) which mage then builds and runs for you. A magefile, instead of having a magic name (like Makefile), uses the go build tag //+build mage to indicate that mage should read it. Other than that, there’s nothing special about magefiles and you can name them whatever you like.

Mage includes all files that have this tag and only files that have this tag in its builds. This has several nice benefits - you can have the code for your build spread across any number of files, and those files will be ignored by the rest of your build commands. In addition, if you have platform-specific build code, you can use go’s build tags to ensure those are included or excluded as per usual. All your existing Go editor integrations, linters, and command line tools work with magefiles just like normal go files, because they are normal go files. Anything you can do with Go, any libraries you want to use, you can use with Mage.
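For instance, a magefile constrained to a single platform just stacks build tags (a sketch; the target name is illustrative):

//+build mage
//+build linux

package main

// InstallService is only offered as a target when mage runs on Linux.
func InstallService() error {
	return nil
}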

Just like make, Mage uses build targets as CLI commands. For Mage, these targets are simply exported functions that may optionally take a context.Context and may optionally return an error. Any such function is exposed to Mage as a build target. Targets in a magefile are run just like in make:

//+build mage

package main

// Creates the binary in the current directory.  It will overwrite any existing
// binary.
func Build() {
    print("building!")
}

// Sends the binary to the server.
func Deploy() error {
    return nil
}

Running mage in the directory with the above file will list the targets:

$ mage
Targets:
  build    Creates the binary in the current directory.
  deploy   Sends the binary to the server.

Mage handles errors returned from targets just like you’d hope, printing errors to stderr and exiting with a non-zero exit code. Dependent targets, just like in make, will be run exactly once, starting at the leaves and moving upward through a dynamically generated dependency tree.
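For example, dependencies are declared with the mg helper package (a sketch; Build is the target defined above):

//+build mage

package main

import "github.com/magefile/mage/mg"

// Install builds the binary first, then installs it.
func Install() error {
	mg.Deps(Build) // Build runs exactly once per mage invocation
	// ... copy the binary into place ...
	return nil
}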

Mage has a ton of features - running multiple targets, default targets, target aliases, file targets and sources, shell helpers, and more. However, for this blog post I want to dive more into some of the magic behind how Mage works, not just what it does.

How it Works

When you run mage, the first thing it has to do is figure out what files it should read. It uses the normal go build heuristics (build tags, _platform in filenames, etc.) with one little tweak… normally when you build, go grabs all files in a directory without tags. If you specify a tag in the build command, it adds any files with that build tag… but it never takes away the files with no build tags. This won’t work for mage, since I wanted it to only include files that had a specific tag. This required some hacking. I ended up copying the entire go/build package into Mage’s repo and inserting some custom code to add the idea of a required tag… which then excludes any files that don’t explicitly specify that tag.

Once that step is done, we have a list of files with the correct build tags. Now, what to do with them? Well, we need to be able to execute the functions inside them. To do that, we need to generate some glue code to call the functions, and build the whole thing into a binary. Since this process can be time-consuming the first time it’s run (on the order of 0.3 seconds on my 2017 MBP), we cache the created binary on disk whenever it’s built. Thus, after the first run, mage starts instantly for a project, like any normal Go binary (on my machine, about 0.01s to print out help, for example). To ensure the cached binary exactly matches the code from the magefiles, we hash the input files and some data from the mage binary itself. If a cached version matches the hash (we just use the hash as the filename), we run that, since we know it must have been built using the exact same code.

If there’s no matching binary in the cache, we need to actually do some work. We parse the magefiles using go/types to figure out what our targets are and to look for a few other features (like whether there’s a default target and whether there are any aliases). Parsing produces a struct of metadata about the binary, which is then fed into a normal go template which generates the func main() and all the code that produces the help output, the code that determines what target(s) to call, and the error handling.

This generated code is written to a file in the current directory, and then it and the magefiles are run through a normal execution of go build to produce the binary; the temp file is then cleaned up.

Now that the glue code and magefiles have been compiled, it’s just a matter of running the binary and passing through the arguments sent to mage (this is the only thing that happens when the binary is cached).

From there, it’s just your go code running, same as always. No surprises, no wacky syntax. The Go we all know and love, working for you and the people on your team.

If you want some examples of magefiles, you can check out ones used by Gnorm, Hugo, and hopefully soon, dep.

Hop on the #mage channel on gopher slack to get all your questions answered, and feel free to take a look at our current issue list and pick up something to hack on.

2017-12-24

I stumbled over this idiom for managing goroutine lifecycles when I was writing OK Log. Since then I’ve found uses for it in nearly every program I’ve written. I thought it’d be nice to share it.

Motivation

My programs tend to have the same structure: they’re built as a set of inter-dependent, concurrent components, each responsible for a distinct bit of behavior. All of these components tend to be modeled in the same way, more or less: whether implemented as structs with methods or free functions, they’re all things that are running: doing stuff, responding to events, changing state, talking to other things, and so on. And when I write programs, in the style of a large func main with explicit dependencies, I generally construct all of the dependencies from the leaves of the dependency tree, gradually working my way up to the higher-order components, and then eventually go the specific things that I want to run.

In this example, I have a state machine, an HTTP server serving an API, some stream processor feeding input to the state machine, and a ctrl-C signal handler.

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

sm := newStateMachine()
go sm.Run(ctx)

api := newAPI(sm)
go http.ListenAndServe(":8080", api)

r := getStreamReader()
go processStream(r, sm)

signalHandler() // maybe we wait for this one to return

We have this setup phase in our func main, where we set up a context. We set up all of the common dependencies, like the logger. And then, we make all of the components, in the order dictated by their dependency relationships. Our state machine has an explicit Run method, which we go to get it started. Our HTTP API needs to have its handler served in its own goroutine. Our stream processor is modeled as a function, taking the input stream and state machine as dependencies, which we also go in the background. And our ctrl-C handler also needs to be running, waiting for its signal.

I think this is the best way to structure object graphs and dependencies in Go programs, and I’ve written at length about it before. But there’s some trickiness in the details here. We know that we must never start a goroutine without knowing how it will stop. But how do we actually do this, in a way that’s both intuitive enough for new maintainers to easily grok and extend, and flexible enough to handle the nontrivial use cases we have here?

To me, the complication hinges not on how to start the goroutines, or handle communication between them, but on how to deterministically tear them down. Returning to our example, let’s consider how each of the components might be stopped.

The state machine is clear: since it takes a context.Context, presumably it will return when the context is canceled.

sm := newStateMachine()
go sm.Run(ctx) // stopped via cancel()

But the HTTP server presents a problem: as written, there’s no way to interrupt it. So we need to change it slightly. It turns out http.ListenAndServe is just a small helper function, which combines two things: first binding the listener, and then attaching and running the server. If we do those two steps explicitly ourselves, we get access to the net.Listener, which has a Close method that, when invoked, will trigger the server to return. Or, better still: we can leverage the graceful shutdown functionality added to http.Server in 1.8.

api := newAPI(sm)
server := http.Server{Handler: api}
ln, _ := net.Listen("tcp", ":8080")
server.Serve(ln) // shutdown via server.Shutdown()

In this demonstrative example, we’ll say that the stream processor returns when its stream io.Reader is exhausted. But as written, there’s no way to trigger, e.g., an io.EOF on a plain io.Reader. Instead, we’d need to wrap it into an io.ReadCloser, and provide a way to close the stream pre-emptively. Or, perhaps better, the concrete type that implements io.Reader, for example a net.Conn, may also have a Close method that could work.

// r := getStreamReader()
rc := getStreamReadCloser()
go streamProcessor(rc, sm) // stopped via rc.Close()

Finally, the ctrl-C handler also has no way to be interrupted as written. But since it’s our own code, we’re presumably free to modify it to add an interrupt mechanism. I like using a cancel chan for this kind of basic stuff: less surface area than context.Context.

stop := make(chan struct{})
signalHandler(stop) // returns via close(stop) (or ctrl-C)
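For reference, a minimal sketch of such a handler (assuming this shape; it uses the os/signal package):

func signalHandler(stop chan struct{}) error {
	c := make(chan os.Signal, 1)
	signal.Notify(c, os.Interrupt) // ctrl-C
	select {
	case sig := <-c:
		return fmt.Errorf("received signal %s", sig)
	case <-stop:
		return errors.New("stopped")
	}
}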

Look at all the different ways we have to terminate goroutines. I think the only commonality between them is that they’re expressions, or blocks of Go code. And I think anything that manages goroutine lifecycles needs to accommodate this heterogeneity. If we embrace that constraint, and try to design an API around it, what falls out?

run.Group

My guess at an answer is package run, and the run.Group. From the package documentation:

Package run implements an actor-runner with deterministic teardown. It is somewhat similar to package errgroup, except it does not require actor goroutines to understand context semantics. This makes it suitable for use in more circumstances; for example, goroutines which are handling connections from net.Listeners, or scanning input from a closable io.Reader.

With package run and the run.Group, we model each running goroutine as a pair of functions, defined inline. The first function, called the execute function, is launched as a new goroutine. The second function, called the interrupt function, must interrupt the execute function and cause it to return.

Here’s the documentation:

func (g *Group) Add(execute func() error, interrupt func(error))
    Add an actor (function) to the group. Each actor must be pre-emptable by
    an interrupt function. That is, if interrupt is invoked, execute should
    return. Also, it must be safe to call interrupt even after execute has
    returned.

    The first actor (function) to return interrupts all running actors. The
    error is passed to the interrupt functions, and is returned by Run.

func (g *Group) Run() error
    Run all actors (functions) concurrently. When the first actor returns,
    all others are interrupted. Run only returns when all actors have
    exited. Run returns the error returned by the first exiting actor.

And here’s how it looks when we apply it to our example.

var g run.Group // the zero value is useful

sm := newStateMachine()
g.Add(func() error { return sm.Run(ctx) }, func(error) { cancel() })

api := newAPI(sm)
server := http.Server{Handler: api}
ln, _ := net.Listen("tcp", ":8080")
g.Add(func() error { return server.Serve(ln) }, func(error) { server.Shutdown(ctx) })

rc := getStreamReadCloser()
g.Add(func() error { return streamProcessor(rc, sm) }, func(error) { rc.Close() })

stop := make(chan struct{})
g.Add(func() error { return signalHandler(stop) }, func(error) { close(stop) })

log.Print(g.Run())

g.Run blocks until all the actors return. In the normal case, that’ll be when someone hits ctrl-C, triggering the signal handler. If something breaks, say the stream processor, its error will be propagated through. In all cases, the first returned error triggers the interrupt function for all actors. And in this way, we can reliably and coherently ensure that every goroutine that’s Added to the group is stopped, when Run returns.

I designed run.Group to help orchestrate goroutines in func main, but I’ve found several other uses since then. For example, it makes a great alternative to a sync.WaitGroup if you’d otherwise have to construct a bunch of scaffolding. Maybe you’ll find some uses, too.
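As a sketch of the sync.WaitGroup alternative mentioned above, waiting on a batch of workers needs no extra scaffolding (work is a hypothetical function):

var g run.Group
for i := 0; i < 3; i++ {
	i := i // capture the loop variable
	g.Add(func() error { return work(i) }, func(error) {})
}
log.Print(g.Run()) // returns only after every worker has exited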

2017-12-23

Introduction

Ever wondered how your goroutines are being scheduled by the go runtime? Ever tried to understand why adding concurrency to your program has not given it better performance? The go execution tracer can help answer these and other questions, and help you diagnose performance issues, e.g., latency, contention and poor parallelization.

The tool has been available since go 1.5 and works by instrumenting the go runtime for specific events, such as:

  1. Creation, start and end of goroutines
  2. Events that block/unblock goroutines (syscalls, channels, locks)
  3. Network I/O related events
  4. Syscalls
  5. Garbage collection

All this data is collected by the tracer without any kind of aggregation or sampling. In busy applications this may result in a large file, which can be analyzed afterwards with the go tool trace command.

Go already had the pprof memory and CPU profilers before the introduction of the execution tracer, so why was it added to the official toolchain? While the CPU profiler does a nice job of telling you what function is spending most CPU time, it does not help you figure out what is preventing a goroutine from running or how goroutines are being scheduled on the available OS threads. That’s precisely where the tracer really shines. The tracer design doc does a pretty good job of explaining the motivations behind the tracer and how it was designed to work.

A tour of Trace

Let’s start with a simple “Hello, world” example for tracing. In this sample, we use the runtime/trace package to start and stop writing trace data to the standard error output.

package main

import (
	"os"
	"runtime/trace"
)

func main() {
	trace.Start(os.Stderr)
	defer trace.Stop()
	// create new channel of type int
	ch := make(chan int)

	// start new anonymous goroutine
	go func() {
		// send 42 to channel
		ch <- 42
	}()
	// read from channel
	<-ch
}

This example creates an unbuffered channel and starts a goroutine that will send the number 42 over this channel. The main goroutine blocks until the other goroutine sends a value over the channel.

Running this code with go run main.go 2> trace.out sends the tracing output to the file trace.out, which can then be read with: go tool trace trace.out.

Before go 1.8, one needed both the executable binary and the trace data to be able to analyze the trace; for programs compiled with go 1.8 onwards, the trace data contains all the information needed by the go tool trace command.

After running the command, a browser window opens with some options. Each of those opens a different view of the tracer, containing different information about the program’s execution.

Trace

  1. View trace

    The most complex, powerful and interactive visualization, showing a timeline of the entire program execution. This view displays, for example, what was running on each of the virtual processors and what was blocked waiting to run. We will dive deeper into this visualization later in this post. It only works in Chrome.

  2. Goroutine analysis

    Shows how many goroutines of each kind were created during the entire execution. After selecting a kind, it is possible to see information about each goroutine of that kind: for example, how long each goroutine was blocked while trying to acquire a lock on a mutex, reading from the network, running, and so on.

  3. Network/Sync/Syscall blocking profile

    These contain graphs that display how long goroutines spent blocked on each of these resources. They are pretty close to the ones available for memory/CPU profiles in pprof. This is the perfect place to investigate lock contention, for example.

  4. Scheduler latency profiler

    Provides scheduler-level timing information, showing where time is mostly spent scheduling.

View Trace

Clicking on the “View trace” link, one is presented with a screen full of information about the whole program execution.

Press “?” to get a list of available shortcuts to help navigate the trace.

The following image highlights the most important parts and each section is described below:

View trace

  1. Timeline

    Shows the time during the execution; the units of time may change depending on the navigation. One can navigate the timeline by using keyboard shortcuts (WASD keys, just like video games).

  2. Heap

    Shows memory allocations during the execution; this can be really useful for finding memory leaks and for checking how much memory the garbage collector is able to free at each run.

  3. Goroutines

    Shows how many goroutines are running and how many are runnable (waiting to be scheduled) at each point in time. A high number of runnable goroutines may indicate scheduling contention, e.g., when the program creates too many goroutines and causes the scheduler to work too hard.

  4. OS Threads

    Shows how many OS threads are being used and how many are blocked by syscalls.

  5. Virtual Processors

    Shows a line for each virtual processor. The number of virtual processors is controlled by the GOMAXPROCS environment variable (defaulting to the number of cores).

  6. Goroutines and events

    Displays where/what goroutine is running on each virtual processor. Lines connecting goroutines represent events. In the example image, we can see that the goroutine “G1 runtime.main” spawned two different goroutines: G6 and G5 (the former is the goroutine responsible for collecting the trace data and the latter is the one we started using the “go” keyword).

    A second row per processor may show additional events, such as syscalls and runtime events. This also includes some work that the goroutine does on behalf of the runtime (e.g., assisting the garbage collector).

The image below shows information obtained when selecting a particular goroutine.

View goroutine

This information includes:

  • Its “name” (Title)
  • When it started (Start)
  • Its duration (Wall Duration)
  • The stack trace when it started
  • The stack trace when it finished
  • Events generated by this goroutine

We can see that this goroutine generated two events: creating the tracer goroutine and creating the goroutine that sends the number 42 on the channel.

View event

By clicking on a particular event (a line in the graph or by selecting the event after clicking on the goroutine), we can see:

  • The stack trace when the event started
  • The duration of the event
  • Goroutines involved in the event

One may click on these goroutines to navigate to their trace data.

Blocking profiles

Another view available from a trace is the set of network/synchronization/syscall blocking profiles. Blocking profiles show a graph view similar to those available for memory/CPU profiles in pprof. The difference is that instead of showing how much memory each function allocated, these profiles show how long each goroutine spent blocked on a particular resource.

The image below shows the “Synchronization blocking profile” for our sample code.

View trace

This shows us that our main goroutine spent 12.08 microseconds blocked receiving from a channel. This kind of graph is a great way to find lock contentions, when too many goroutines are competing to obtain a lock on a resource.

Collecting Traces

There are three ways to collect tracing information:

  1. Using the runtime/trace pkg

This involves calling trace.Start and trace.Stop, and was covered in our “Hello, Tracing” example.

  2. Using the -trace=<file> test flag

This is useful to collect trace information about code being tested and the test itself (see the example after this list).

  3. Using the debug/pprof/trace handler

This is the best method to collect tracing from a running web application.
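As an example of the second method, the flag is passed straight to go test, and the resulting file is opened as before:

go test -trace=trace.out
go tool trace trace.out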

Tracing a web application

To be able to collect traces from a running web application written in go, one needs to add the /debug/pprof/trace handler. The following code sample shows how this can be done for http.DefaultServeMux: simply import the net/http/pprof package.

package main

import (
	"net/http"
	_ "net/http/pprof"
)

func main() {
	http.Handle("/hello", http.HandlerFunc(helloHandler))

	http.ListenAndServe("localhost:8181", http.DefaultServeMux)
}

func helloHandler(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("hello world!"))
}

To collect the traces, we need to issue a request to the endpoint, e.g., curl localhost:8181/debug/pprof/trace?seconds=10 > trace.out. This request will block for 10 seconds and the trace data will be written to the file trace.out. A trace generated like this can be viewed the same way as before: go tool trace trace.out.

Security note: beware that exposing pprof handlers to the Internet is not advisable. The recommendation is to expose these endpoints on a different http.Server that is only bound to the loopback interface. This blog post discusses the risks and has code samples on how to properly expose pprof handlers.

Before gathering the trace, let’s start by generating some load on our service using wrk:

$ wrk -c 100 -t 10 -d 60s http://localhost:8181/hello

This will use 100 connections across 10 threads to make requests for 60 seconds. While wrk is running, we can collect 5s of trace data using curl localhost:8181/debug/pprof/trace?seconds=5 > trace.out. This generated a 5MB file (which can quickly grow if we generate more load) on my 4-CPU machine.

Once again, opening the trace is done by the go tool trace command: go tool trace trace.out. As the tool parses the entire content of the file, this will take longer than our previous example. When it completes, the page looks slightly different:

View trace (0s-2.546634537s)
View trace (2.546634537s-5.00392737s)

Goroutine analysis
Network blocking profile
Synchronization blocking profile
Syscall blocking profile
Scheduler latency profile

To guarantee that the browser will be able to render everything, the tool has divided the trace into two contiguous parts. Busier applications or longer traces may require the tool to split the trace into even more parts.

Clicking on “View trace (2.546634537s-5.00392737s)” we can see that there is a lot going on:

View trace web

This particular screenshot shows a GC run that starts between 1169ms and 1170ms and ends right after 1174ms. During this time, an OS thread (PROC 1) ran a goroutine dedicated to the GC while other goroutines assisted in some GC phases (these are displayed on lines below the goroutine and are labeled MARK ASSIST). By the end of the screenshot, we can see that most of the allocated memory was freed by the GC.

Another particularly useful piece of information is the number of goroutines in the “Runnable” state (13 at the selected time): if this number becomes large over time, it can indicate that we need more CPUs to handle the load.

Conclusions

The tracer is a powerful tool for debugging concurrency issues, e.g., contention and logical races. But it does not solve all problems: it is not the best tool available to track down what piece of code is spending the most CPU time or doing the most allocation. The go tool pprof is better suited for those use cases.

The tool really shines when you want to understand the behavior of a program over time and to know what each goroutine is doing when NOT running. Collecting traces may have some overhead and can generate a large amount of data to be inspected.

Unfortunately, official documentation is lacking, so some experimentation is needed to understand what the tracer is showing. This is also an opportunity for contributions to the official documentation and to the community in general (e.g., blog posts).

André is a Sr. Software Engineer at Globo.com, working on Tsuru. @andresantostc on twitter, https://andrestc.com on the web

Reference

  1. Go execution tracer (design doc)
  2. Using the go tracer to speed fractal rendering
  3. Go tool trace
  4. Your pprof is showing

2017-12-22

This article is about how we at Mendelics changed our report system from Python to Go using the gofpdf library, why we made this change, how we planned it, and some insights we got along the way.

Some Context

Before I dive into some technical aspects, let me introduce what Mendelics does. Mendelics is a Brazilian laboratory which processes DNA analyses in order to find genetic diseases. We use a technique called NGS (Next Generation Sequencing) to process blood samples, and at the end of some steps we input all the DNA information into a Go application in a human-readable way.

Our physicians will analyse this data and generate a report, basically a PDF file, which will be sent to the patient after a few days.

Our Architecture

Regarding the reports our application was split in two parts:

  • Python API which holds patients’ data, exam information and other business logic;
  • Go application used by physicians to analyse medical information and “create the report”;

I added quotes around “create the report” because under the hood the Go application just sends a POST to the Python API to generate it. At this point the Go application doesn’t know how to create a report at all.

Below is an image which explains it in a better way.

The Python API will use the metadata sent before to create the PDF when the endpoint /report/XPTO is called. In this particular case XPTO is the exam identifier.

Our Problem

The reports were built using Report Lab, a great Python library used by players like NASA and Wikipedia, but the way we used it made changes to the reports’ structure a nightmare.

Our reports are using background/foreground structure, which maps to a predefined PDF template (background) and the data you wish to add on it (foreground).

Above you can see an example of our background template. It worked for a while, but our requirements changed and our reports got more and more complicated. Every time we need to add or remove a field, for instance, we need to redesign all those kinds of reports, because the foreground data needs to be re-aligned over and over again.

Can you imagine other problems that we had using this approach? Let me list a few:

  • Page limit;
  • Text limit;
  • Tables with predefined row lengths;
  • No custom layout;

Our Solution

We want to rewrite it in Go.

Since our main application is written in Go and it’s the only application which needs to know about reports, why not move the report logic to that end? It makes sense, but is there any good Go library out there which is easy to use and handles our known problems well?

The answer is gofpdf.

A great library ported from PHP which has good support for everything we needed at the time. So let’s plan this change.

Proof of Concept

We created a new repo to check if we could reproduce the hardest report using gofpdf and to understand the trade-offs of this new approach.

To start, we need to get it like any Go library, using go get github.com/jung-kurt/gofpdf.

With the library downloaded, the code to create our POC is very simple.

We created a function NewReport, and inside of it we configured what this document will look like (e.g. font family, font color, page size, header and footer data). Let’s look into it:

func NewReport() *Report {
	pdf := gofpdf.New("P", "mm", "A4", "./assets/fonts")
	html := pdf.HTMLBasicNew()
	encodingFunc := pdf.UnicodeTranslatorFromDescriptor("")

	pdf.AddFont("ProximaNova", "", "ProximaNova-Reg-webfont.json")
	pdf.AddFont("ProximaNova", "B", "ProximaNova-Bold-webfont.json")
	pdf.AddFont("ProximaNova-Light", "", "ProximaNova-Light-webfont.json")

	report := &Report{
		htmlContent:  html,
		encodingFunc: encodingFunc,
	}

	pdf.SetFont("ProximaNova", "", fontPtSize)
	pdf.SetTextColor(75, 75, 80)
	pdf.AliasNbPages("")
	pdf.SetHeaderFunc(report.headerFunc)
	pdf.SetFooterFunc(report.footerFunc)

	report.document = pdf
	return report
}

Our struct Report just holds the configured document, as you can see in the last line before actually returning it.

The idea here is just to show how simple it is to create your documents using this library. You can check the docs to understand what each function does. Let’s move one step further.

The main code lives in a HandleFunc, which was created just to be able to view this PDF in a browser instead of generating it on the file system.

http.HandleFunc("/report", func(w http.ResponseWriter, r *http.Request) {
    report := NewReport()
    report.PatientHeader()
    report.Diagnostic()
    report.GeneList()
    report.TechnicalResponsible()
    report.Method()
    report.QualityFlags()
    report.VUS()
    report.Comments()


    if err := report.document.Output(w); err != nil {
        log.Print(err)
    }
})

Each function in the report object (created by our NewReport as we saw above) is a section builder for that given name. We build the patient header, with information like name, age, etc.; the diagnostic section, which is where we explain whether the patient has a positive or negative result for that specific exam; and so on.

Let’s look at the Method function to see how it works under the hood.

func (r *Report) Method() {
	fs := 10.0
	r.document.SetFontSize(fs)

	content := "Captura de exons com Nextera Exome Capture seguida por sequenciamento de nova " +
		"geração com Illumina HiSeq. Alinhamento e identificação de variantes utilizando protocolos " +
		"de bioinformática, tendo como referência a versão GRCh37 do genoma humano. Análise médica " +
		"orientada pelas informações que motivaram a realização deste exame."

	r.drawLine(fs, "<b>Método</b>", 10)

	// MultiCell(width, height, content, border, align, fill)
	r.document.MultiCell(0, 5, r.encodingFunc(content), "", "", false)
	r.lineStroke()
}

We just set the font size for that section, build the text for it and add it to the document using the MultiCell function. Easy! No x/y alignment, no text limit, etc. The library handles all of it based on our previous configuration in NewReport.

Going back to the main function, at the end we write the final document back to the ResponseWriter and we’re done!

Below you can see the result of this POC: a report 99% similar to what we had in Python, in just 351 LOC:

From Python to Go

As we saw previously, we needed to change how Python and Go communicate with each other. Instead of serving an application/pdf as before, now the Python API needs to serve all the data needed to construct the report on the Go side.

We created a new endpoint to serve this new information as we can see below:

With the API ready, we changed how our Go application calls it. Instead of calling the API to build the report, now we call the model endpoint in order to get the metadata necessary to construct the report on our end.

model, err := api.GetModel(code, user)

if err != nil {
    msg := "unable to get report model"
    logrus.WithFields(logrus.Fields{
        "code":  code,
        "user":  user,
        "error": err,
    }).Error(msg)
    http.Error(w, msg, http.StatusInternalServerError)
    return // don't continue with an empty model
}

After that we pass this data to our report object:

if err := report.New(model).Output(w); err != nil {
    msg := "unable to generate report"
    logrus.WithFields(logrus.Fields{
        "code":  code,
        "error": err,
    }).Error(msg)
    http.Error(w, msg, http.StatusInternalServerError)
}

Did you see how similar this code is to the one from our POC? We basically copied and pasted the code from the POC into our application and removed the hardcoded parts related to how we build each section. The Output function is pretty much the same:

func (r *Report) Output(w io.Writer) error {
	if err := r.Error(); err != nil {
		r.document.SetError(err)
		return r.document.Output(w)
	}

	r.patientHeader()
	r.diagnostic()
	r.geneList()
	r.method()
	r.qualityFlags()
	r.vusSection()
	r.comments()
	r.additionalInformation()

	return r.document.Output(w)
}

Bonus: Unit Test For PDF

Your methods to build each PDF section are fully tested, but what if you change something like a font color, an image position or another visual detail? How can you get feedback about it?

In our case, let’s say we accidentally change the color of an arbitrary element. The right document was:

and now it looks like this:

The Mistake

We changed the background color of an element in the right corner.

Without any unit tests, we would need to check all documents by eye to guarantee we didn’t break anything, but the library provides an awesome way to test our PDF files! Check the unit tests inside the library to get the whole idea.
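As a minimal sketch of such a test (assuming the document’s creation date is pinned so the output bytes are reproducible; testdata/golden.pdf is a hypothetical known-good reference file):

func TestReportLayout(t *testing.T) {
	// In the same package as Report, so we can reach report.document.
	report := NewReport()
	report.PatientHeader()
	report.Method()
	// ... build the remaining sections ...

	var buf bytes.Buffer
	if err := report.document.Output(&buf); err != nil {
		t.Fatal(err)
	}

	want, err := ioutil.ReadFile("testdata/golden.pdf")
	if err != nil {
		t.Fatal(err)
	}
	if !bytes.Equal(buf.Bytes(), want) {
		t.Error("generated report differs from testdata/golden.pdf")
	}
}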

On our end, our tests will raise an alarm if we make a mistake like the one above:

Conclusion

I strongly recommend gofpdf if you need to build PDFs in Go.

Also, today we can create any kind of project using Go, not limited to CLIs and APIs, and even code that doesn’t need to rely on channels and goroutines to get the job done.

I’m very happy with the Go ecosystem so far and am looking forward to seeing what the community can build with it.

2017-12-21

What if you could use SQL to query any aspect of your infrastructure? Osquery, an open source instrumentation tool released by the Facebook security team, allows you to do just that.

For example, SELECT network_name, last_connected, captive_portal FROM wifi_networks WHERE captive_portal=1; will show all captive portal WiFi networks that a laptop has connected to. And SELECT * FROM processes WHERE on_disk = 0; will show any running process whose binary has been deleted from disk. When the root password vulnerability became known a few weeks ago, the osquery community quickly crafted a query which would identify vulnerable Macs in a fleet of devices. With almost 200 tables available by default and support for macOS, Linux and Windows hosts, osquery is the tool of choice for many security and system administration teams.

Osquery is a powerful tool, but it’s written in C++, so why are we talking about it in a GopherAcademy post? Osquery uses Thrift (a project similar to gRPC) to allow developers to extend osquery through a series of plugin types. Earlier this year our team at Kolide released a set of Go packages with idiomatic interfaces that allow anyone to use the full power of Go to extend osquery. In this blog post, it’s my goal to show you how you can get started with osquery development using the osquery-go SDK.

Writing a custom logger plugin

When a scheduled query like SELECT name, version FROM deb_packages is executed, the osqueryd daemon will create a JSON log event with the results of the query. By default, a filesystem plugin is used, which logs the results to a local file. Commonly, osquery users use aggregation tools like filebeat to send the result logs to a centralized log platform. Other plugins exist too. The tls plugin sends all logs to a remote TLS server like Fleet. The kinesis plugin sends log results to AWS, allowing advanced monitoring with applications like StreamAlert. But what if you already have a well-established logging pipeline with the systemd journal, Splunk, fluentd or any number of proprietary logging systems? With the Thrift bindings to osquery, you can write your own logger. Go, having support for most APIs these days, is an ideal language for implementing a logger.

For the purpose of this tutorial, we’ll implement a systemd journal logger. The go-systemd library from CoreOS has a convenient package we can use to write to journald.

The github.com/kolide/osquery-go/plugin/logger package exposes the following API which we need to implement.

type Plugin struct {}

type LogFunc func(ctx context.Context, typ LogType, log string) error

func NewPlugin(name string, fn LogFunc) *Plugin

To create our own logger, we have to implement a function that satisfies the signature of LogFunc.

For journald the function looks like this:

func JournalLog(_ context.Context, logType logger.LogType, logText string) error {
        return journal.Send(
                logText,
                journal.PriInfo,
                map[string]string{"OSQUERY_LOG_TYPE": logType.String()},
        )
}

Now we can call logger.NewPlugin("journal", JournalLog) to get back a functioning osquery plugin we can register with the Thrift extension server.

Configuring osquery to use our custom extension

We have implemented a logger plugin, but we still have to link it to osqueryd. Osquery has a few specific requirements for registering plugins. Plugins must be packaged as executables, called extensions. A single extension can bundle one or more plugins. We’ll use a package main to create an extension.

Osquery will call our extension with 4 possible CLI flags, the most important of which is the path to the unix socket we’ll use to communicate back to the osquery process.

        var (
                flSocketPath = flag.String("socket", "", "")
                flTimeout    = flag.Int("timeout", 0, "")
                _            = flag.Int("interval", 0, "")
                _            = flag.Bool("verbose", false, "")
        )
        flag.Parse()

We’ll ignore the interval and verbose flags in this extension, but they still have to be parsed to avoid an error.

Next, we’ll add time.Sleep(2 * time.Second) to wait for the unix socket to become available. In production code we would add a retry with a backoff.
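A minimal sketch of what that retry might look like (a hypothetical helper, not part of the SDK):

func waitForSocket(path string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for backoff := 100 * time.Millisecond; ; backoff *= 2 {
		if _, err := os.Stat(path); err == nil {
			return nil // the socket exists
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("socket %s not available after %s", path, timeout)
		}
		time.Sleep(backoff)
	}
}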

Once the socket is available, we can bind to it by creating an ExtensionManagerServer. The extension will use the socket path provided to us by the osquery process.

        server, err := osquery.NewExtensionManagerServer(
                "go_extension_tutorial",
                *flSocketPath,
                osquery.ServerTimeout(time.Duration(*flTimeout)*time.Second),
        )
        if err != nil {
                log.Fatalf("Error creating extension: %s\n", err)
        }

Next, we can create our logger and register it with the server.

       journal := logger.NewPlugin("journal", JournalLog)
       server.RegisterPlugin(journal)

Finally, we can run the extension. The server.Run() method will block until an error is returned.

      log.Fatal(server.Run())

Now that we’ve created our package main, we can build the binary and start osqueryd with the custom logger. Osquery has a few requirements for executables that we have to follow:

  • The executable must have a .ext file extension.
  • The executable path should be added to an extensions.load file which can be passed to the osqueryd --extensions_autoload CLI flag.
  • The extension must be owned by the same user that is running osquery, and the permissions must be read+exec only. This is a precaution against an attacker replacing the extension executable when the osqueryd process runs as root. For development, you can use the --allow_unsafe flag, but we won’t need it here since we’ll be running the osquery process as our current user account.

Putting it all together we get:

echo "$(pwd)/build/tutorial-extension.ext" > /tmp/extensions.load
go build -i -o build/tutorial-extension.ext
osqueryd \
  --extensions_autoload=/tmp/extensions.load \
  --pidfile=/tmp/osquery.pid \
  --database_path=/tmp/osquery.db \
  --extensions_socket=/tmp/osquery.sock \
  --logger_plugin=journal

Immediately, we can see our logger working with journalctl:

sudo journalctl OSQUERY_LOG_TYPE=status -o export -f |awk -F'MESSAGE=' '/MESSAGE/ {print $2}'

{"s":0,"f":"events.cpp","i":825,"m":"Event publisher not enabled: audit: Publisher disabled via configuration","h":"dev","c":"Mon Dec 18 03:34:31 2017 UTC","u":1513568071}
{"s":0,"f":"events.cpp","i":825,"m":"Event publisher not enabled: syslog: Publisher disabled via configuration","h":"dev","c":"Mon Dec 18 03:34:31 2017 UTC","u":1513568071}
{"s":0,"f":"scheduler.cpp","i":75,"m":"Executing scheduled query foobar: SELECT 1","h":"dev","c":"Mon Dec 18 03:34:38 2017 UTC","u":1513568078}

Adding tables to osquery

Loggers are great, but what if we need to implement a custom table? Let’s stick with the go-systemd package and prototype a systemd table which will list the systemd units and their state.

The github.com/kolide/osquery-go/plugin/table package has a similar API to that of the logger plugin.

type Plugin struct {}

type GenerateFunc func(ctx context.Context, queryContext QueryContext) ([]map[string]string, error)

type ColumnDefinition struct {
    Name string
    Type ColumnType
}

func NewPlugin(name string, columns []ColumnDefinition, gen GenerateFunc) *Plugin

The ColumnType defines four SQL column types: TEXT, INTEGER, BIGINT and DOUBLE. To create the table, we’ll have to implement the GenerateFunc, which returns the table contents as a []map[string]string.

We’ll implement the required Generate function using the dbus package, which has a helpful ListUnits() method.

Note: I’m using package globals and ignoring errors to keep the example code short. The full implementation is linked at the end of this post.

var conn *dbus.Conn

func generateSystemdUnitStatus(_ context.Context, _ table.QueryContext) ([]map[string]string, error) {
        units, _ := conn.ListUnits()
        var results []map[string]string
        for _, unit := range units {
                // get the pid value
                var pid int
                p, _ := conn.GetServiceProperty(unit.Name, "MainPID")
                pid = int(p.Value.Value().(uint32))

                // get the stdout path of the service unit
                var stdoutPath string
                p, _ := conn.GetServiceProperty(unit.Name, "StandardOutput")
                stdoutPath = p.Value.String()

                //... a few more getters like this
                // then populate the table rows
                results = append(results, map[string]string{
                        "name":         unit.Name,
                        "load_state":   unit.LoadState,
                        "active_state": unit.ActiveState,
                        "exec_start":   execStart,
                        "pid":          strconv.Itoa(pid),
                        "stdout_path":  stdoutPath,
                        "stderr_path":  stderrPath,
                })
        }
    return results, nil
}

Now we can create the osquery-go *table.Plugin:

func SystemdTable() *table.Plugin {
        columns := []table.ColumnDefinition{
                table.TextColumn("name"),
                table.IntegerColumn("pid"),
                table.TextColumn("load_state"),
                table.TextColumn("active_state"),
                table.TextColumn("exec_start"),
                table.TextColumn("stdout_path"),
                table.TextColumn("stderr_path"),
        }
        return table.NewPlugin("systemd", columns, generateSystemdUnitStatus)
}

Back in our func main, we can register this plugin with the server, similar to how we registered the logger plugin.

systemd := SystemdTable()
server.RegisterPlugin(systemd)

We can now use the systemd table in our queries.

osquery> SELECT process.start_time, systemd.name AS service, process.name, listening.address, listening.port, process.pid FROM processes AS process JOIN listening_ports AS listening ON (process.pid = listening.pid) JOIN systemd ON systemd.pid = process.pid and listening.port = 443;
+------------+------------------+----------+---------+------+-------+
| start_time | service          | name     | address | port | pid   |
+------------+------------------+----------+---------+------+-------+
| 6308708    | nginx.service    | nginx    | ::      | 443  | 25859 |
+------------+------------------+----------+---------+------+-------+

By configuring the query to run on a schedule, and using the logger plugin to aggregate the results centrally, we can begin to instrument our systems and create alerts.
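For example, a schedule entry for the new table might look like this in the osquery config (the query name and interval are illustrative):

{
  "schedule": {
    "systemd_units": {
      "query": "SELECT name, active_state, pid FROM systemd;",
      "interval": 60
    }
  }
}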

Speaking of configuration, how are you configuring the osquery process? The recommended way is a configuration management tool like Chef, or a dedicated TLS server like Fleet, but maybe you’ve got custom requirements?

Config plugins for osquery

Just like you can log results with a custom logger, you can load configuration through a custom plugin. We’ll implement a plugin which configures the osquery process and schedules a list of queries to run. To keep things simple, we’ll load the configuration from a GitHub gist.

By now, you can probably guess what the API of the github.com/kolide/osquery-go/plugin/config looks like.

type Plugin struct {}

type GenerateConfigsFunc func(ctx context.Context) (map[string]string, error)

func NewPlugin(name string, fn GenerateConfigsFunc) *Plugin

Here, we implement the GenerateConfigs function to return one or more config sources as a map, where each value represents the full config JSON file as a string.

var client *github.Client

func GenerateConfigs(ctx context.Context) (map[string]string, error) {
        gistID := os.Getenv("OSQUERY_CONFIG_GIST")

        gist, _, err := client.Gists.Get(ctx, gistID)
        if err != nil {
                return nil, errors.Wrap(err, "get config gist")
        }
        var config string
        if file, ok := gist.Files["osquery.conf"]; ok {
                config = file.GetContent()
        } else {
                return nil, fmt.Errorf("no osquery.conf file in gist %s", p.gistID)
        }
        return map[string]string{"gist": config}, nil
}

One thing I want to highlight here is that our plugin needs its own configuration.

gistID := os.Getenv("OSQUERY_CONFIG_GIST")

You might need to provide configuration like API keys to your plugin, and environment variables provide a convenient way of doing that.

Now that we’ve created the plugin, one thing left to do is register it inside func main and restart osqueryd.

gistConfig := config.NewPlugin("gist", GenerateConfigs)
server.RegisterPlugin(gistConfig)

Restart the osqueryd daemon with two new flags: a refresh interval (in seconds) and the config plugin to use instead of the default filesystem one.

--config_refresh=60 \
--config_plugin=gist

Conclusion

In this article I’ve given an overview of osquery and how to use the Go plugin SDK to write your own custom extensions. Besides creating the plugins, we also have to think about packaging, distribution and the platforms we’re running the osquery daemon on. For example, the journal and systemd APIs are not available on macOS or Windows, so we have to compile our custom extensions differently for each platform. Once again, Go makes this process easy by allowing us to use build tags when writing platform-specific plugins.

At Kolide, we’ve been writing our own open source osqueryd extension called Launcher. Launcher implements config, logger and other plugins for osquery, using gRPC and the Go kit toolkit to effectively manage osqueryd at scale in various environments. If you’ve found this article interesting, I encourage you to check out the Launcher source. The osquery project has a vibrant community of users and developers, most of whom hang out on Slack. In addition to the Go SDK, a similar one is available for Python.

I’ve described three plugin types (logger, table and config), but there’s a fourth plugin type the osquery-go SDK allows you to write: a distributed plugin. What makes the distributed plugin interesting is that you can schedule queries and get query results from your whole fleet of endpoints in real time. While writing this blog post, I got the idea of implementing the distributed plugin as a Twitter bot. If you tweet a valid query with the #osqueryquery hashtag, you’ll get back a response with the results. Although I’ve left the implementation of this final plugin out of the article, it has a very similar API to the plugins I’ve described above.

You can check out the source of all the plugins above, and a few more examples, in the GitHub repo that I’ve created for this post.

2017-12-20

In April 2017, I thought it would be fun to try setting up a system to track the star counts of the top 1000 Go repositories on GitHub. This article describes how I collected this data and some simple analysis of the dataset.

I want to be clear that this was for fun only, and I’m not advocating that the number of stars a repository has is the be-all and end-all of its success. There are many, many repositories I am not mentioning here that are high quality, useful code. There’s also tons of code that lives in private repositories outside GitHub, or on Bitbucket and GitLab. In the context of this article, the stars on GitHub are a set of data used to explore the patterns people have when starring repositories.

Collecting the data

The collection of the stars could be an article all its own (and may become one elsewhere) so I will be brief.

The collection is done by a Lambda function on Amazon Web Services using a timer that triggers an execution every 15 minutes. The Lambda function collects the star counts of the top 1000 Go repositories, which is the maximum the GitHub search API will return. This is a post about Go, so the Lambda function is written in… Python, of course! The total resource usage is well below the free tier limits, so running this function is free indefinitely.

The data that the lambda function collects is stored in a DynamoDB table that just holds the repo name, an integer timestamp (epoch time in seconds), and the number of stars at that time. By the time this article was started, I had several hundred megabytes of data in DynamoDB. As with the Lambda function, the usage here is below the free tier limit, so the storage is free as well. For reference, the allocated read and write capacity is 20 units each.

To get the data out of DynamoDB, I used the standard AWS Data Pipeline template. The AWS Documentation covers the process well. From there, I had to download it all locally using the AWS CLI and then write a program (this time in Go) to convert the many separate chunk files into one CSV file. This left me with a 726 MB CSV file.

The Go program to do the CSV conversion is incredibly over-engineered, but it was fun to optimize while the next step completed.

I decided that a real database was going to be helpful in doing the analysis, so I set up Postgres locally and imported the giant CSV file into a table. The following is the schema and command to copy the CSV data into Postgres. This would be run at the psql command line.

drop table if exists github_stars;

create unlogged table github_stars (
    repo  varchar(255) not null,
    ts    integer      not null,
    stars integer      not null,
    constraint repo_ts primary key(repo,ts)
);

create index repo_idx  on github_stars(repo);
create index ts_idx    on github_stars(ts);
create index stars_idx on github_stars(stars);

\copy github_stars from 'github-stars.csv' csv

Let’s dig in

At the end of this (lengthy) import I was left with a relatively simple table that has 23,279,479 rows of data about 1412 separate repositories. Note that the row count doesn’t divide evenly by 1000; the error handling in the Lambda function was not extremely robust, since it’s designed for long-term trends, not second-by-second updates. The number of tracked repositories is higher than 1000 because some repositories stayed at the same star count and slipped out of view as others increased, while yet others climbed rapidly from 0 to a relatively high position in the middle of the tracking period.

Total stars per repo

I started by looking at the distribution of total stars per repository. Getting the number of stars for each repository at the end of the sample period (for that repository) was fairly straightforward (but also fairly slow). The graph is the ordered list from rank 1 on down.

select gs.repo, gs.stars
from github_stars as gs
inner join (
    select repo, max(ts) as ts
    from github_stars
    group by repo
) as maxts
on gs.repo = maxts.repo and gs.ts = maxts.ts
order by gs.stars desc;

[figure: total stars per repository, ordered by rank]

Total star gain

All of the repositories in the tracked set gained a total of 533,614 stars over the tracking period. This was done by finding the min and max timestamp for each repository, getting the star counts at those times, finding the difference, and then summing all those differences.

select sum(maxs.stars-mins.stars) as total_increase
from (
    select gs.repo, gs.stars
    from github_stars as gs
    inner join (
        select repo, max(ts) as ts
        from github_stars
        group by repo
    ) as maxts
    on gs.repo = maxts.repo and gs.ts = maxts.ts
) as maxs
inner join (
    select gs.repo, gs.stars
    from github_stars as gs
    inner join (
        select repo, min(ts) as ts
        from github_stars
        group by repo
    ) as mints
    on gs.repo = mints.repo and gs.ts = mints.ts
) as mins
on maxs.repo = mins.repo;

Rate of star count increase

Through some SQL-fu I produced the sorted list of repositories by the number of stars they collected per day during the time they were tracked. Hold on to your hat for this SQL statement:

select rises.repo, cast(rise as float)/cast(run as float)*(24*60*60) as stars_per_day
from (
    select maxs.repo, maxs.stars-mins.stars as rise
    from (
        select gs.repo, gs.stars
        from github_stars as gs
        inner join (
            select repo, max(ts) as ts
            from github_stars
            group by repo
        ) as maxts
        on gs.repo = maxts.repo and gs.ts = maxts.ts
    ) as maxs
    inner join (
        select gs.repo, gs.stars
        from github_stars as gs
        inner join (
            select repo, min(ts) as ts
            from github_stars
            group by repo
        ) as mints
        on gs.repo = mints.repo and gs.ts = mints.ts
    ) as mins
    on maxs.repo = mins.repo
) as rises
inner join
(
    select repo, max(ts)-min(ts) as run
    from github_stars
    group by repo
) as runs
on rises.repo = runs.repo
where runs.run > 0
order by stars_per_day desc;

This produced a distribution much like others you see with this kind of data: a few repositories in the fat section of the graph account for the majority of the rate increase, followed by a long tail of repos that are slowly increasing. There is one unfortunate repository out at the end with a rate of -24 stars per day.

[figure: star gain per day, per repository]

Do bigger repositories grow faster?

I wanted to figure out if the larger repositories grow faster than smaller ones, so I grabbed the number of stars at the end of the tracking period for each repository and charted that against the number of stars increase per day. The SQL this time is a small tweak to the last one:

select rises.repo, rises.stars as stars,
cast(rise as float)/cast(run as float)*(24*60*60) as stars_per_day
from (
    select maxs.repo, maxs.stars, maxs.stars-mins.stars as rise
    from (
        select gs.repo, gs.stars
        from github_stars as gs
        inner join (
            select repo, max(ts) as ts
            from github_stars
            group by repo
        ) as maxts
        on gs.repo = maxts.repo and gs.ts = maxts.ts
    ) as maxs
    inner join (
        select gs.repo, gs.stars
        from github_stars as gs
        inner join (
            select repo, min(ts) as ts
            from github_stars
            group by repo
        ) as mints
        on gs.repo = mints.repo and gs.ts = mints.ts
    ) as mins
    on maxs.repo = mins.repo
) as rises
inner join
(
    select repo, max(ts)-min(ts) as run
    from github_stars
    group by repo
) as runs
on rises.repo = runs.repo
where runs.run > 0
order by stars desc;

[figure: stars per day vs. total stars]

Yes! They do, generally. Some smaller repositories have a higher rate, but the larger ones (for which there’s less data) definitely trend higher. To make the data a little easier to see, a log scale can be used on both axes. There’s a definite split between the top repositories and the bottom ones.

[figure: stars per day vs. total stars, log-log scale]

Single Repository Graphs

One of the first things I did early on was to create a Python script that pulls the data for any number of repositories and graphs it using matplotlib. The details can be found in the Python code, but the graphs are fun to look at:

[figures: star-count graphs for docker/moby, dominikh/go-tools, and Netflix/rend]

External Effects

One of the more interesting things is seeing how greatly external events can affect individual repositories in the short term. Most projects have short-term jumps in star counts based on blog posts or other external events, like a talk at GopherCon or inclusion in one of the newsletters. That’s mostly speculation, though; I didn’t spend the time to do the correlation.

Case in point: The first graph clearly shows a sharp rise in star count for moby/moby just after the name change from docker/docker and then a pretty linear rise after that.

Conclusions

  1. Bigger repositories do grow faster. There seems to be an inflection point around 5000 stars.
  2. Smaller repositories can also grow fast. The fastest were in the middle of the pack below the inflection point.
  3. Every project is different. Some will be stable and get less press and therefore grow less. Others are constantly being talked about.

Data

The final collected data (as of this writing) is 23,279,479 rows and over 750 MB raw and 100 MB gzipped. Unfortunately, the data is too big to just toss up on a website for download. I have provided all of my source code in links above, though, so anyone could easily set this up.

What next?

I haven’t turned it off, so in the time it took to read this article it’s likely collected another measurement. I plan on leaving it on for as long as I can. I also will likely adjust the lambda function to collect more kinds of data about the people starring repositories. I think looking at the groupings of repositories by people who starred them can uncover similarities between them, or at least show what kinds of things a given user is interested in.

If you’re interested in doing this kind of tracking (or just like playing with data), grab all the code linked above and start it running in your own AWS account. It’s (almost) free, and you can have fun playing with the output. If anyone wants the source data or has any other questions, you can contact me on Twitter or by email. Let me know what you thought, and if you have ideas on what to do with all this data, I’d love to hear them.

Twitter: @sgmansfield

GitHub: ScottMansfield

Email: sgmansf@gmail.com

2017-12-19

While speaking with coworkers recently, one of them posed a question:

How can we discover the capacity of each memory DIMM in a machine?

Some veteran Linux users may be familiar with the dmidecode utility, which can access SMBIOS/DMI information exposed by hardware. This utility can expose a huge amount of information about the hardware and BIOS software on a machine. But how does it work under the hood?

This post will explain how to access this information, and demonstrate the open source go-smbios package that can be used to retrieve and leverage this information in Go.

Introduction to SMBIOS/DMI

System Management BIOS, or SMBIOS, is a specification that defines data structures that can be used to access information exposed by hardware and the system BIOS in a standardized way.

SMBIOS is often confused with Desktop Management Interface, or DMI, but it is essentially an evolution of the original DMI specification. This is where the Linux dmidecode utility’s name originates.

What kind of information is exposed by this interface? We can use the dmidecode utility to take a peek at some of this information.

$ sudo dmidecode | head -n 12
# dmidecode 3.0
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
69 structures occupying 3435 bytes.
Table at 0x000E0FC0.

Handle 0x0000, DMI type 4, 42 bytes
Processor Information
        Socket Designation: CPU Socket - U3E1
        Type: Central Processor
        Family: Core i7
        Manufacturer: Intel(R) Corporation

The utility exposes quite a lot of information, but even from this small sample, we can note several important features:

  • the version of SMBIOS present on the machine
  • how many SMBIOS structures are available, and how many bytes they occupy
  • the memory address of the SMBIOS structures table
  • a structure with a type, length, and handle field, and some information

SMBIOS has dozens of different structures, and each can encode a variety of data.

Retrieving SMBIOS information with Go

SMBIOS information consists of two crucial pieces: an “entry point” structure, and a table of data structures which carry SMBIOS information.

On modern Linux machines, the entry point structure and table can be found using two special files in sysfs:

$ ls /sys/firmware/dmi/tables/
DMI  smbios_entry_point
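Reading them from Go is then just a pair of file reads; a minimal sketch:

// Read the entry point and structure table straight out of sysfs.
entry, err := ioutil.ReadFile("/sys/firmware/dmi/tables/smbios_entry_point")
if err != nil {
	log.Fatal(err)
}
table, err := ioutil.ReadFile("/sys/firmware/dmi/tables/DMI")
if err != nil {
	log.Fatal(err)
}
fmt.Printf("entry point: %d bytes, table: %d bytes\n", len(entry), len(table))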

While this is certainly convenient, the standard approach on other UNIX-like operating systems is to directly scan system memory for a magic string, using /dev/mem.

The basic algorithm is:

  • start scanning for the magic prefix “_SM” at memory address 0x000f0000
  • iterate one “paragraph” (16 bytes) of memory at a time until we either find an entry point or reach the memory address 0x000fffff
  • determine if the entry point is the 32-bit or 64-bit variety, and decode it
  • use information from the entry point to find the address and size of the structures table

Discovering and decoding SMBIOS entry points in Go

In simplified Go code (please always check your errors), discovering the entry point looks something like:

// Open /dev/mem and seek to the starting memory address.
const start, end = 0x000f0000, 0x000fffff

mem, _ := os.Open("/dev/mem")
_, _ = mem.Seek(start, io.SeekStart)

// Iterate one "paragraph" of memory at a time until we either find the entry point
// or reach the end bound.
const paragraph = 16
b := make([]byte, paragraph)

var addr int
for addr = start; addr < end; addr += paragraph {
	_, _ = io.ReadFull(mem, b)

	// Both the 32-bit and 64-bit entry point have a similar prefix.
	if bytes.HasPrefix(b, []byte("_SM")) {
		return addr, nil
	}
}

Now that we’ve discovered the location of the entry point in memory, we can begin decoding the entry point structure. Depending on your machine, you may encounter a 32-bit or 64-bit SMBIOS entry point.

// Prevent unbounded reads since this structure should be small.
b, _ := ioutil.ReadAll(io.LimitReader(mem, 64))
if l := len(b); l < 4 {
	return nil, fmt.Errorf("too few bytes for SMBIOS entry point magic: %d", l)
}

// Did we find a 32-bit entry point, or 64-bit entry point?
switch {
case bytes.HasPrefix(b, []byte("_SM_")):
	return parse32(b)
case bytes.HasPrefix(b, []byte("_SM3_")):
	return parse64(b)
}

I’ll spare you the details of each entry point structure, but they contain some key information exposed by the dmidecode utility, as discussed previously:

  • the version of SMBIOS present on the machine
  • how many SMBIOS structures are available, and how many bytes they occupy
  • the memory address of the SMBIOS structures table

With this information, we can finally begin decoding the structures table.

Decoding the SMBIOS structure table in Go

Each SMBIOS structure contains a header that indicates:

  • the type of the structure (BIOS information, memory information, etc.)
  • the length of the structure in bytes
  • a “handle” that can be used to point to related information in another structure

Following the header, each structure contains a “formatted” section that carries arbitrary bytes, and optionally, zero or more “strings” that the formatted section can point to. This data must be decoded in a manner specific to each structure type, as laid out in the SMBIOS specification.
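To make that layout concrete, here is roughly how one of these structures can be modeled in Go (a sketch; the actual go-smbios types may differ in detail):

// Header is the fixed 4-byte header that begins every SMBIOS structure.
type Header struct {
	Type   uint8  // structure type: BIOS, memory device, etc.
	Length uint8  // length of the header plus the formatted section
	Handle uint16 // ID used to cross-reference related structures
}

// Structure is one complete SMBIOS structure.
type Structure struct {
	Header    Header
	Formatted []byte   // type-specific formatted section
	Strings   []string // optional trailing strings
}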

To start: let’s jump to the location of the table and begin decoding it. The structures stream continues until a special “end of table” structure is reached.

// Seek to the address of the structures table, and set up a decoder.
_, _ = mem.Seek(addr, io.SeekStart)
dec := newDecoder(mem)

var ss []*Structure
for {
	s, _ := dec.next()
	ss = append(ss, s)

	// End-of-table structure indicates end of stream.
	if s.Header.Type == typeEndOfTable {
		break
	}
}

Within our decoder.next method, we must deal with each structure’s header, formatted section, and zero or more strings:

// Decode the header structure.
h, _ := dec.parseHeader()

// Length of formatted section is length specified by header, minus
// the length of the header itself.
l := int(h.Length) - headerLen
fb, _ := dec.parseFormatted(l)

// Strings may or may not be present; only advance the decoder
// if they are.
ss, _ := dec.parseStrings()

return &Structure{
	Header:    *h,
	Formatted: fb,
	Strings:   ss,
}

This process continues until the end of table structure is reached, or an EOF is returned by the stream.
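The string-parsing step deserves a closer look, because the format is unusual: each string is NUL-terminated, and an empty string (a double NUL) ends the whole set. A sketch, assuming the decoder wraps the stream in a *bufio.Reader:

func parseStrings(r *bufio.Reader) ([]string, error) {
	var ss []string
	for {
		s, err := r.ReadString(0x00)
		if err != nil {
			return nil, err
		}
		// Drop the NUL terminator itself.
		s = strings.TrimRight(s, "\x00")

		// An empty string is the end-of-strings marker.
		// (A complete decoder must also peek ahead to handle
		// structures that carry no strings at all.)
		if s == "" {
			return ss, nil
		}
		ss = append(ss, s)
	}
}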

Decoding memory DIMM information from SMBIOS structures in Go

As previously mentioned, the formatted section and strings in an SMBIOS structure can be used to retrieve information stored in a specific format. The SMBIOS specification can be used as a reference for the format of individual structures.

With that said, let’s take a look back at our problem statement:

How can we discover the capacity of each memory DIMM in a machine?

The structure we want to decode is the “Memory Device (Type 17)” structure. One of these structures appears in the SMBIOS stream for each DIMM slot on the motherboard.

// Only look at memory devices.
if s.Header.Type != 17 {
	continue
}

// Formatted section contains a variety of data, but only parse the DIMM size.
size := int(binary.LittleEndian.Uint16(s.Formatted[8:10]))
// String 0 is the DIMM slot's identifier.
name := s.Strings[0]

// If 0, no DIMM present in this slot.
if size == 0 {
	fmt.Printf("[% 3s] empty\n", name)
	continue
}

// An extended uint32 DIMM size field appears if 0x7fff is present in size.
if size == 0x7fff {
	size = int(binary.LittleEndian.Uint32(s.Formatted[24:28]))
}

// Size units depend on MSB.  Little endian MSB for uint16 is in second byte.
// 0 means megabytes, 1 means kilobytes.
unit := "KB"
if s.Formatted[9]&0x80 == 0 {
	unit = "MB"
}

fmt.Printf("[% 3s] DIMM: %d %s\n", name, size, unit)

Now that we’ve put this all together, we can see the results from two of my Linux machines at home:

desktop $ sudo ./lsdimms
SMBIOS 2.7.0
[ChannelA-DIMM0] DIMM: 4096 MB
[ChannelA-DIMM1] DIMM: 4096 MB
[ChannelB-DIMM0] DIMM: 4096 MB
[ChannelB-DIMM1] DIMM: 4096 MB
server $ sudo ./lsdimms
SMBIOS 3.0.0
[DIMM 0] empty
[DIMM 1] DIMM: 16384 MB
[DIMM 0] empty
[DIMM 1] DIMM: 16384 MB

There are dozens of other structure types available, but with this information, we can now see the exact configuration and capacity of the memory DIMMs in my machines.

Summary

As you can see, a great deal of useful information about your machine can be exposed using SMBIOS. Check out the dmidecode utility to see what kind of information is available! If you’d like to incorporate this data in your Go programs, I recommend that you check out the go-smbios package.

This package handles the nitty-gritty details of exposing SMBIOS data from a variety of operating systems (most UNIX-like systems, but I’d love to add macOS and Windows support!). At this time, it doesn’t contain any code for decoding specific structure types, but this is something I’d love to incorporate in a higher-level package in the future! If you’d like to collaborate, please reach out!

Finally, if you ever find yourself working on a text or binary format parser, I highly encourage trying out go-fuzz to discover any potential crashes in your parser. go-fuzz is an invaluable tool, and liberal use of it will save you many headaches down the road. For a great introduction to go-fuzz, check out Damian Gryski’s detailed walkthrough.

If you have any questions, feel free to contact me: “mdlayher” on Gophers Slack! You can also find me on both GitHub and Twitter with the same username.

2017-12-18

When I’m writing web applications, I don’t tend to use the latest JavaScript front-end hotness; instead I prefer to reload the entire page on a request, or at most, a small section of it. Today, however, many developers love to write JSON back ends and build their front-end logic in JavaScript.

In this article, we’re going to do just that. We’re going to create a small Buffalo application that speaks JSON, and we’ll create a small front end to talk to it using Vue.js. All I ask is your understanding that I’m not a front-end developer, so I’m sure there are plenty of improvements that could be made to my Vue code. Be gentle.

NOTE: To follow along with this article you will need Buffalo installed along with Node/NPM for the front end assets.

The Go Side

The first step on our journey to front-end greatness starts with creating a new Buffalo application. Since Buffalo is Go code, we have to create our project inside our GOPATH.

$ cd $GOPATH/src/github.com/<username>
$ buffalo new vuer
$ cd vuer
$ buffalo db create -a

When generating the new Buffalo application we could have used the --api flag, which sets the application up to be a JSON API and doesn’t install any of the front-end pieces such as templates, JavaScript, CSS, etc. However, since we need those for this particular application, we decided to generate a “full” application instead of a bare-bones API application.

We also, using the buffalo db create -a command, set up all of the databases for this application as defined in the application’s database.yml.

NOTE: If you are following along at home, you might need to change the settings inside of the database.yml to suit your needs.

Generating Resources

Next, let us create some resources for the application to talk to. This application will hold information about bands and their members.

$ buffalo generate resource band name bio:text --type=json
$ buffalo generate resource member name instrument band_id:uuid --type=json
$ buffalo db migrate

By default, resources generated in Buffalo are HTML resources, but since this application wants to speak JSON we can use the --type=json flag to tell the resource generator to use JSON responses and not create any HTML templates.

Inside of actions/app.go we should now have two lines that look like this:

// actions/app.go

app.Resource("/bands", BandsResource{&buffalo.BaseResource{}})
app.Resource("/members", MembersResource{&buffalo.BaseResource{}})

Let’s create a new api group and hang these resources off of it: the BandsResource hangs directly off the group, and the MembersResource is nested under the band resource.

// actions/app.go

api := app.Group("/api")
band := api.Resource("/bands", BandsResource{&buffalo.BaseResource{}})
band.Resource("/members", MembersResource{&buffalo.BaseResource{}})

With these changes in place, if we were to print our route table with buffalo task routes, it would look something like this:

METHOD | PATH
------ | ----
get    | /api/bands
post   | /api/bands
get    | /api/bands/new
get    | /api/bands/{band_id}
put    | /api/bands/{band_id}
delete | /api/bands/{band_id}
get    | /api/bands/{band_id}/edit
get    | /api/bands/{band_id}/members
post   | /api/bands/{band_id}/members
get    | /api/bands/{band_id}/members/new
get    | /api/bands/{band_id}/members/{member_id}
put    | /api/bands/{band_id}/members/{member_id}
delete | /api/bands/{band_id}/members/{member_id}
get    | /api/bands/{band_id}/members/{member_id}/edit

Now that we have generated and mapped all of our resources, we need to tweak the MembersResource so that it is scoped to the requested band. We don’t want to show the members of the Rolling Stones if someone requests the members of the Beatles.

For example, in MembersResource#List we would change the call that finds all of the members to scope it to the band_id on the request.

// actions/members.go

// before
if err := q.All(members); err != nil {
  return errors.WithStack(err)
}

// after
if err := q.Where("band_id = ?", c.Param("band_id")).All(members); err != nil {
  return errors.WithStack(err)
}
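The Create action needs similar treatment, so that a member posted to /api/bands/{band_id}/members lands in the correct band. A hedged sketch (uuid here stands in for whichever UUID package the generated models use):

// actions/members.go (inside MembersResource#Create, sketch)

member := &models.Member{}
if err := c.Bind(member); err != nil {
  return errors.WithStack(err)
}

// Take the band ID from the URL, ignoring any band_id in the request body.
bandID, err := uuid.FromString(c.Param("band_id"))
if err != nil {
  return errors.WithStack(err)
}
member.BandID = bandID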

After making these changes in the MembersResource#List, MembersResource#Create, MembersResource#Update, and MembersResource#Destroy actions we are almost finished with the Go side of the application.

The final step before we can move to the JavaScript side is to set up a catch-all route. A catch-all route allows the application to accept any URL we haven’t already defined and lets the Vue router handle those requests instead.

In the actions/app.go file we can add this catch-all route right before the route mapping / to the HomeHandler.

// actions/app.go

app.GET("/{path:.+}", HomeHandler)

With that we are finished with the Go side of the application. Now we can turn our attention to hooking up Vue.js.

The JavaScript Side

To get started on the JavaScript side, we first need to install four Node modules; to do this we will use Yarn. These modules give us access to Vue, a router for Vue, and a few other pieces, like the ability to compile Vue templates.

$ yarn add vue vue-loader vue-router vue-template-compiler

With the proper modules installed we need to tell Webpack how to work with these modules. To do that we need to add the following entry to the webpack.config.js file that Buffalo generates.

// webpack.config.js

// ...
module.exports = {
  resolve: {
    alias: {
      vue$: `${__dirname}/node_modules/vue/dist/vue.esm.js`,
      router$: `${__dirname}/node_modules/vue-router/dist/vue-router.esm.js`
    }
  },
  // ...
}
// ...

With all the glue in place, and the proper modules installed, we can write our Vue application. Since this article isn’t about learning Vue, and since I’m not a Vue expert, I’m going to simply show you the code I wrote to make my simple application work.

// assets/js/application.js

require("expose-loader?$!expose-loader?jQuery!jquery");
require("bootstrap-sass/assets/javascripts/bootstrap.js");

import Vue from "vue";
import VueRouter from "router";
Vue.use(VueRouter);

import BandComponent from "./components/band.vue";
import MembersComponent from "./components/members.vue";

const routes = [
  {path: "/band/:id", component: MembersComponent, name: "showBand"},
  {path: "/", component: BandComponent}
];

const router = new VueRouter({
  mode: "history",
  routes
});

const app = new Vue({
  router
}).$mount("#app");
// assets/js/components/band.vue

<template>
<div>
  <h1 class="page-header">Bands</h1>

  <ul class="list-unstyled">
    <li v-for="band in bands">
      <router-link :to='{name: "showBand", params: {id: band.id}}'>
        <h2>
          {{ band.name }}
        </h2>
      </router-link>
    </li>
  </ul>
</div>
</template>

<script charset="utf-8">
export default {
  data() {
    return {
      bands: []
    };
  },

  created() {
    this.fetchData();
  },

  watch: {
    $route: "fetchData"
  },

  methods: {
    fetchData: function() {
      let req = $.getJSON("/api/bands");
      req.done(data => {
        this.bands = data;
      });
    }
  }
};
</script>
// assets/js/components/members.vue

<template>
<div>
  <h1 class="page-header">{{band.name}}</h1>

  <blockquote>
    {{band.bio}}
  </blockquote>

  <ul class="list-unstyled">
    <li v-for="member in members">
      <h2>
        {{member.name}} - {{member.instrument}}
      </h2>
    </li>
  </ul>

</div>
</template>

<script charset="utf-8">
export default {
  data() {
    return {
      band: {},
      members: {}
    };
  },

  created() {
    this.fetchData();
  },

  watch: {
    $route: "fetchData"
  },

  methods: {
    fetchData: function() {
      let id = this.$route.params.id;

      let req = $.getJSON(`/api/bands/${id}`);
      req.done(data => {
        this.band = data;
      });

      req = $.getJSON(`/api/bands/${id}/members`);
      req.done(data => {
        this.members = data;
      });
    }
  }
};
</script>

In order to get *.vue files to work with Webpack, we need to add a rule that tells Webpack to use the vue-loader plugin to process those files for us. We can update the webpack.config.js file and add a rule to that effect.

// webpack.config.js

// ...
module.exports = {
  // ...
  module: {
    rules: [
      // ...
      {
test: /\.vue$/,
        loader: "vue-loader"
      },
      // ...
    ]
    // ...
  }
  // ...
}

Putting It All Together

With all of that in place we are almost ready to start our application and try it out. We just need one more bit of glue code, and a script to seed the database with a few bands to start with.

Let’s start with the glue code. In order to start the Vue application we need to give it an HTML element to bind to. To do this we can replace the contents of templates/index.html with the following.

// templates/index.html

<div id="app">
  <router-link to="/">Home</router-link>
  <router-view></router-view>
</div>

The above code will not only allow Vue to bind to the page, but it also provides an element for the Vue router to attach to, replacing its contents as we navigate between pages.

Finally, let’s add a script to seed the database with a few bands. When we generated the application, Buffalo created a new file, grifts/db.go. This file contains a db:seed task, whose purpose is to let us write a seed script for our database.

We can replace the placeholder of this script with the following:

var _ = grift.Namespace("db", func() {

  grift.Desc("seed", "Seeds a database")
  grift.Add("seed", func(c *grift.Context) error {
    if err := models.DB.TruncateAll(); err != nil {
      return errors.WithStack(err)
    }

    band := &models.Band{
      Name: "The Beatles",
      Bio:  "4 fun loving lads from Liverpool.",
    }
    if err := models.DB.Create(band); err != nil {
      return errors.WithStack(err)
    }
    members := models.Members{
      {Name: "John Lennon", Instrument: "Guitar"},
      {Name: "Paul McCartney", Instrument: "Bass"},
      {Name: "George Harrison", Instrument: "Guitar"},
      {Name: "Ringo Starr", Instrument: "Drums"},
    }
    for _, m := range members {
      m.BandID = band.ID
      if err := models.DB.Create(&m); err != nil {
        return errors.WithStack(err)
      }
    }

    band = &models.Band{
      Name: "The Monkees",
      Bio:  "4 fun loving lads assembled by a TV studio",
    }
    if err := models.DB.Create(band); err != nil {
      return errors.WithStack(err)
    }
    members = models.Members{
      {Name: "Mike Nesmith", Instrument: "Guitar"},
      {Name: "Davy Jones", Instrument: "Voice"},
      {Name: "Peter Tork", Instrument: "Guitar"},
      {Name: "Mikey Dolenz", Instrument: "Drums"},
    }
    for _, m := range members {
      m.BandID = band.ID
      if err := models.DB.Create(&m); err != nil {
        return errors.WithStack(err)
      }
    }

    return nil
  })

})

This script can be run like so:

$ buffalo task db:seed

Buffalo uses the grift task runner for these types of simple, repeatable scripts.

With seed data in place we can launch the application in development mode.

$ buffalo dev

The buffalo dev command will not only start the application at http://localhost:3000, it also watches your Go files and assets for changes. If there are changes, Buffalo will recompile your application and/or assets and restart the app.

Conclusion

In this article we built a brand new, database backed, Buffalo application that speaks JSON. We also quickly built a single page application on top of the Buffalo application, thanks to Buffalo’s Webpack asset pipeline.

Hopefully this article has inspired you to try your favorite framework on top of Buffalo. Perhaps using GopherJS. :)

Full source code can be found at https://github.com/gobuffalo/vuerecipe.


About the Author

Mark is the co-founder of PaperCall.io, a platform for connecting technical events with high quality content and speakers. Mark is also a partner at Gopher Guides, the industry leader for Go training and conferences. In his spare time Mark leads development of the Go web framework Buffalo.

2017-12-17

Today we will see how we can interact with WebAssembly from Go: how to execute WebAssembly bytecode from Go, and how to generate WebAssembly bytecode with Go. But first of all: what is WebAssembly?

WebAssembly

According to webassembly.org, WebAssembly (wasm for short) is a new portable, size- and load-time-efficient format suitable for compilation to the web. In a way, wasm is the next evolution of asm.js and PNaCl: it’s a new way to run code, on the web. But faster.

WebAssembly is currently being designed as an open standard by a W3C Community Group that includes representatives from all major browsers. As of late 2017, WebAssembly is supported by all major web browsers (Chrome, Edge, Firefox and Safari.)

With wasm, developers get:

  • a compact, binary format to send over the wire and incorporate into their project,
  • a near-native performance boost,
  • performant and safe code executed inside the browser sand box,
  • another language (besides JavaScript) to develop their projects: one can target wasm from C/C++, Rust, … (and, eventually, Go.)

wasm format

WebAssembly is formally defined by a set of specifications. The wasm binary format is described here.

A .wasm file is the result of compiling C/C++, Rust or Go code with an appropriate toolchain. It contains bytecode instructions to be executed by the browser or any other program that can decode and interpret that binary format.

A .wasm file contains a wasm module. Every wasm module starts with a magic number \0asm (i.e. []byte{0x00, 0x61, 0x73, 0x6d}) and then a version number (0x1 at the moment.) After that come the different sections of a module:

  • the types section: function signature declarations,
  • the imports section: imports declarations,
  • the functions section: function declarations,
  • the tables section: indirect function table,
  • the memories section,
  • the globals section,
  • the exports section,
  • the start function section: the func main() equivalent,
  • the code segments section: function bodies, and
  • the data segments section.

The full description with all the details is available here. A more gentle and less dry introduction to wasm can be found at https://rsms.me/wasm-intro.
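Those first eight bytes (the \0asm magic and the little-endian uint32 version) are easy to check by hand. A minimal Go sketch, which you can verify against the hexdump further below:

func checkWasmHeader(r io.Reader) error {
	var hdr struct {
		Magic   [4]byte
		Version uint32
	}
	if err := binary.Read(r, binary.LittleEndian, &hdr); err != nil {
		return err
	}
	if hdr.Magic != [4]byte{0x00, 0x61, 0x73, 0x6d} {
		return fmt.Errorf("not a wasm module: bad magic %v", hdr.Magic)
	}
	if hdr.Version != 0x1 {
		return fmt.Errorf("unsupported wasm version: %d", hdr.Version)
	}
	return nil
}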

A toolchain like emscripten will thus take a set of C/C++ source code and generate a .wasm file containing type definitions, function definitions, function bodies with the corresponding wasm instructions (e.g.: i32.store to store a signed 32b integer to memory, if and return control instructions, etc…)

We will see how to generate .wasm files in a bit but let us first play with it. Consider this basic.wasm:

$> curl -O -L https://github.com/go-interpreter/wagon/raw/master/exec/testdata/basic.wasm
$> ls -lh ./basic.wasm
-rw-r--r-- 1 binet binet 38 Dec 12 17:01 basic.wasm

$> file ./basic.wasm 
./basic.wasm: WebAssembly (wasm) binary module version 0x1 (MVP)

$> hexdump -C ./basic.wasm 
00000000  00 61 73 6d 01 00 00 00  01 05 01 60 00 01 7f 03  |.asm.......`....|
00000010  02 01 00 07 08 01 04 6d  61 69 6e 00 00 0a 07 01  |.......main.....|
00000020  05 00 41 2a 0f 0b                                 |..A*..|
00000026

So all the tools at our disposal agree: it is indeed a wasm file. But couldn’t we do something to extract some more information about that file?

As with object files, there is an objdump-like command that lets us inspect the contents of a binary wasm file. The wabt project provides a wasm-objdump command, written in C/C++. But as this is a Go advent post, we will instead use the one provided by wagon:

$> go get github.com/go-interpreter/wagon/cmd/wasm-dump
$> wasm-dump -help
Usage: wasm-dump [options] file1.wasm [file2.wasm [...]]

ex:
 $> wasm-dump -h ./file1.wasm

options:
  -d	disassemble function bodies
  -h	print headers
  -s	print raw section contents
  -v	enable/disable verbose mode
  -x	show section details

Running it with basic.wasm as argument gives:

$> wasm-dump -h ./basic.wasm
./basic.wasm: module version: 0x1

sections:

     type start=0x0000000a end=0x0000000f (size=0x00000005) count: 1
 function start=0x00000011 end=0x00000013 (size=0x00000002) count: 1
   export start=0x00000015 end=0x0000001d (size=0x00000008) count: 1
     code start=0x0000001f end=0x00000026 (size=0x00000007) count: 1

This basic.wasm file has 4 sections.

Let’s dig deeper:

$> wasm-dump -x ./basic.wasm 
./basic.wasm: module version: 0x1

section details:

type:
 - type[0] <func [] -> [i32]>
function:
 - func[0] sig=0
export:
 - function[0] -> "main"

This wasm module exports a function "main", which takes no argument and returns an int32.

The content of these sections is the following:

$> wasm-dump -s ./basic.wasm 
./basic.wasm: module version: 0x1

contents of section type:
0000000a  01 60 00 01 7f                                    |.`...|

contents of section function:
00000011  01 00                                             |..|

contents of section export:
00000015  01 04 6d 61 69 6e 00 00                           |..main..|

contents of section code:
0000001f  01 05 00 41 2a 0f 0b                              |...A*..|

If you read wasm speak fluently you won’t be surprised by the content of the next snippet, showing the disassembly of the func[0]:

$> wasm-dump -d ./basic.wasm
./basic.wasm: module version: 0x1

code disassembly:

func[0]: <func [] -> [i32]>
 000000: 41 2a 00 00 00             | i32.const 42
 000006: 0f                         | return
 000008: 0b                         | end

It puts the int32 constant 42 on the stack and returns it to the caller. It’s just the wasm equivalent of:

func f0() int32 {
	return 42
}
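Or, equivalently, expressed in WebAssembly’s text format (wat), it reads something like:

(module
  (func (export "main") (result i32)
    i32.const 42
    return))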

Can we test this?

Executing wasm with wagon

wagon actually exposes a very limited non-interactive (yet!) interpreter of wasm: wasm-run.

$> go get github.com/go-interpreter/wagon/cmd/wasm-run
$> wasm-run -h
Usage of wasm-run:
  -v	enable/disable verbose mode
  -verify-module
    	run module verification

Let’s try it on our basic.wasm file:

$> wasm-run ./basic.wasm
main() i32 => 42 (uint32)

Victory!

wasm-run is a rather simple and limited (yet!) wasm embedder:

  • it reads the provided wasm file,
  • it (optionally) verifies the wasm module,
  • it creates a VM with go-interpreter/wagon/exec; that VM will execute the start section of the module (if any), and
  • it runs all the exported functions that take no input parameters

and voila!
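The same machinery is available as a library, so you can embed the interpreter in your own programs. Here is a hedged sketch against wagon’s wasm and exec packages (check the package documentation for the exact signatures):

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/go-interpreter/wagon/exec"
	"github.com/go-interpreter/wagon/wasm"
)

func main() {
	f, err := os.Open("basic.wasm")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Decode the wasm module; no import resolver is needed here.
	m, err := wasm.ReadModule(f, nil)
	if err != nil {
		log.Fatal(err)
	}

	vm, err := exec.NewVM(m)
	if err != nil {
		log.Fatal(err)
	}

	// basic.wasm exports "main" at function index 0.
	out, err := vm.ExecCode(0)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out) // 42
}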

Ok, but wasn’t wasm designed for the web and its browsers?

Executing wasm in the browser

Switching gears, let us write a little web server that will serve a simple wasm module:

package main

import (
	"encoding/hex"
	"flag"
	"fmt"
	"html"
	"io"
	"log"
	"net/http"
)

func main() {
	addr := flag.String("addr", ":5555", "server address:port")
	flag.Parse()
	http.HandleFunc("/", rootHandle)
	http.HandleFunc("/wasm", wasmHandle)

	log.Printf("listening on %q...", *addr)
	log.Fatal(http.ListenAndServe(*addr, nil))
}

func rootHandle(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, page, html.EscapeString(hex.Dump(wasmAdd)))
}

func wasmHandle(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/wasm")
	n, err := w.Write(wasmAdd)
	if err != nil {
		http.Error(w, err.Error(), http.StatusServiceUnavailable)
		return
	}
	if n != len(wasmAdd) {
		http.Error(w, io.ErrShortWrite.Error(), http.StatusServiceUnavailable)
	}
}

var wasmAdd = []byte{
	0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
	0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01,
	0x7f, 0x03, 0x02, 0x01, 0x00, 0x07, 0x07, 0x01,
	0x03, 0x61, 0x64, 0x64, 0x00, 0x00, 0x0a, 0x09,
	0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a,
	0x0b,
}

const page = `
<html>
	<head>
		<title>Testing WebAssembly</title>
		<script type="text/javascript">

		function fetchAndInstantiate(url, importObject) {
			return fetch(url).then(response =>
				response.arrayBuffer()
			).then(bytes =>
				WebAssembly.instantiate(bytes, importObject)
			).then(results =>
			    results.instance
			);
		}

		var mod = fetchAndInstantiate("/wasm", {});

		window.onload = function() {
			mod.then(function(instance) {
				var div = document.getElementById("wasm-result");
				div.innerHTML = "<code>add(1, 2)= " + instance.exports.add(1, 2) + "</code>";
			});
		};

		</script>
	</head>

	<body>
		<h2>WebAssembly content</h2>
		<div id="wasm-content">
			<pre>%s</pre>
		</div>

		<h2>WebAssembly</h2>
		<div id="wasm-result"><code>add(1, 2)= N/A</code></div>
	</body>
</html>
`

Running this in a terminal:

$> go run ./main.go 
2017/12/14 12:45:21 listening on ":5555"...

and then navigating to that location, you should be presented with:

WebAssembly content

00000000  00 61 73 6d 01 00 00 00  01 07 01 60 02 7f 7f 01  |.asm.......`....|
00000010  7f 03 02 01 00 07 07 01  03 61 64 64 00 00 0a 09  |.........add....|
00000020  01 07 00 20 00 20 01 6a  0b                       |... . .j.|

WebAssembly
add(1, 2)= 3

Victory! Again!

For more information about the JavaScript API that deals with WebAssembly, the MDN WebAssembly documentation is a useful reference.

But, up to now, we have only been able to inspect and execute already existing wasm files. How do we create these files?

Generating wasm

We briefly mentioned at the beginning of this post that wasm files can be generated from C/C++ (using emscripten) or from Rust (using cargo or rustup); instructions for those toolchains are available in their respective documentation.

Compiling Go code to wasm is also doable, but support for this backend hasn’t yet been integrated into gc. An issue is tracking the progress of this feature: https://github.com/golang/go/issues/18892. As that discussion is quite long, here is the executive summary: a development branch with preliminary support for wasm has been created by @neelance (Richard Musiol) (yeah!).

Here are the instructions to compile a gc toolchain with a GOOS=js GOARCH=wasm environment:

$> cd somewhere
$> git clone https://go.googlesource.com/go
$> cd go
$> git remote add neelance https://github.com/neelance/go
$> git fetch --all
$> git checkout wasm-wip
$> cd src
$> ./make.bash
$> cd ../misc/wasm

The misc/wasm directory contains all the files (save the actual wasm module) needed to execute a wasm module with nodejs.

Let us compile the following main.go file:

package main

func main() {
	println("Hello World, from wasm+Go")
}

with our new wasm-capable go binary:

$> GOARCH=wasm GOOS=js go build -o test.wasm main.go
$> ll
total 4.0K
-rw-r--r-- 1 binet binet   68 Dec 14 14:30 main.go
-rwxr-xr-x 1 binet binet 947K Dec 14 14:30 test.wasm

Copy over the misc/wasm files under this directory, and then finally, run the following server.go file:

package main

import (
	"flag"
	"log"
	"net/http"
)

func main() {
	addr := flag.String("addr", ":5555", "server address:port")
	flag.Parse()
	srv := http.FileServer(http.Dir("."))
	log.Printf("listening on %q...", *addr)
	log.Fatal(http.ListenAndServe(*addr, srv))
}

Like so:

$> ll
total 968K
-rw-r--r-- 1 binet binet   68 Dec 14 14:30 main.go
-rw-r--r-- 1 binet binet  268 Dec 14 14:38 server.go
-rwxr-xr-x 1 binet binet 947K Dec 14 14:30 test.wasm
-rw-r--r-- 1 binet binet  482 Dec 14 14:32 wasm_exec.html
-rwxr-xr-x 1 binet binet 7.9K Dec 14 14:32 wasm_exec.js

$> go run ./server.go
2017/12/14 14:39:18 listening on ":5555"...

Navigating to localhost:5555/wasm_exec.html will present you with a [Run] button that, when clicked, should display "Hello World, from wasm+Go" in the console.

We’ve just had our browser run a wasm module, generated with our favorite compilation toolchain!

Conclusions

In this blog post, we have:

  • learned about some of the internals of the wasm binary format,
  • inspected wasm files,
  • interpreted a wasm file,
  • served a wasm file via net/http, and
  • compiled a wasm module with a modified go toolchain.

I hope this has inspired some of you to try this at home.

WebAssembly is poised to take the Web by storm, bringing “native” performance to the Web platform and allowing developers to use other languages besides JavaScript to build Web applications.

But even if WebAssembly has been designed for the Web, nothing prevents it from being used outside of the Web. Indeed, the wasm binary format and the bytecode it contains could very well become a popular (and effective) intermediate representation format, like LLVM’s IR. For example, the go-interpreter/wagon project could build (help wanted!) a complete interpreter in Go, generating wasm bytecode from Go source code and then executing that bytecode. “Building an interpreter in Go, for Go”… what’s not to love!? This could even be used as a backend for Neugram, but that’s another story…