Go Learnings and Gotchas

In the second half of 2019, I transferred onto a project that was written in Go. This was the highest priority project in the company, and I found myself having to learn the language as fast as possible in order to remove myself from the critical path.

Luckily, the language itself is actually fairly simple and well documented and the official tutorials are well-structured. I wholeheartedly recommend A Tour of Go followed by Effective Go — reading through these should take about a day or so and give you a sense of what the language can do for you.

But not everything I needed to do was trivial, and I was taken aback more than once after spending hours trying to get something working when I felt it should have taken minutes. Other times, I was actually thrilled to see lesser-known capabilities of the language shine and allow me to be as expressive as I needed to be. This post is an ongoing collection of those times, and what I’ve learned from them.

Build Constraints

Build constraints are directives to the Go tooling to either include or exclude a given file. There exists a set of default tags but you may also define your own custom tags:

// +build foo

Tags may be OR’d, AND’d and NOT’d:

// +build foo,!bar baz
// +build qux

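This evaluates to ((foo AND NOT bar) OR baz) AND qux. In other words, if someone runs any Go tool with -tags=bar,qux, the file in the example above will not be considered. Note that since Go 1.17 the preferred spelling is a single //go:build line with a boolean expression, here //go:build ((foo && !bar) || baz) && qux; the older // +build lines shown in this post are still understood by the tooling.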
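If you want to double-check how a set of constraint lines evaluates, here is a minimal sketch that uses the standard go/build/constraint package (available since Go 1.16). The included helper and the list of tag combinations are my own illustration, and it only looks at the tags you pass in, not at the default OS/architecture tags the real tooling also sets.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// included reports whether a file carrying the given constraint lines would be
// picked up when exactly the given tags are active. As with real build
// constraints, separate lines are AND'd together.
func included(lines []string, tags ...string) (bool, error) {
	active := make(map[string]bool, len(tags))
	for _, tag := range tags {
		active[tag] = true
	}
	for _, line := range lines {
		expr, err := constraint.Parse(line)
		if err != nil {
			return false, err
		}
		if !expr.Eval(func(tag string) bool { return active[tag] }) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	lines := []string{"// +build foo,!bar baz", "// +build qux"}
	for _, tags := range [][]string{{"bar", "qux"}, {"foo", "qux"}, {"baz", "qux"}, {"foo"}} {
		ok, err := included(lines, tags...)
		if err != nil {
			fmt.Println("parse error:", err)
			return
		}
		fmt.Printf("-tags=%v included=%v\n", tags, ok)
	}
}
```

With bar,qux the file is excluded, while foo,qux and baz,qux both include it; foo alone fails the second line.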

Tags apply to most of the Go toolchain: the compiler (build), but also the testing and the lint tooling. Files will be included by the tooling if their tag directive(s) match the tags currently active in the build. By default, Go has a set of tags that let you match the host OS, the architecture of the CPU, the current version of Go, etc… Any tool that iterates over Go files will have a way to specify the tags you would like to include.

Tags are useful to define categories of files that you want to include conditionally. For instance, you may want to have an integration tag (// +build integration) for integration tests, since they are generally slower to run and might require a specific setup. As an example, let’s say you have code accessing S3 that you need to write a test for, in clients/s3/s3_test.go:

// +build integration

package s3

import (
    "testing"
)

func TestS3(t *testing.T) {
    // Actual testing code here
}

Simply running go test ./... would not include the integration test by default, since the constraint is not set. To run the integration test, you would need to run:

go test -tags=integration ./clients/s3/

Field Tags

When you define a struct type, the fields declared in it may be annotated with tags, which are declared as a string literal that follows the field declaration:

type Foo struct {
    bar int "this is a tag"
    baz string `tag declared with backticks in order to use a literal "`
} 

Those tags become available at runtime through the reflect package, where they are represented by the reflect.StructTag type. By convention, a tag can specify multiple values, each keyed by a name and enclosed in double quotes: `foo:"bar" baz:"qux"` is a tag with two keys, foo and baz.
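To make the runtime side concrete, here is a small sketch that reads those two keys back with reflect. The Foo struct mirrors the tag quoted above; the tagValue helper is a name I made up for the example, and it assumes it is handed a struct value.

```go
package main

import (
	"fmt"
	"reflect"
)

// Foo is a hypothetical struct carrying the conventional key:"value" tag.
type Foo struct {
	Bar int `foo:"bar" baz:"qux"`
}

// tagValue returns the value stored under key in the tag of the named field.
// v must be a struct value; the boolean is false if the field or key is absent.
func tagValue(v interface{}, field, key string) (string, bool) {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return "", false
	}
	return f.Tag.Lookup(key)
}

func main() {
	for _, key := range []string{"foo", "baz", "missing"} {
		value, ok := tagValue(Foo{}, "Bar", key)
		fmt.Printf("%s: %q (present: %v)\n", key, value, ok)
	}
}
```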

Tagging fields is relevant in the context where a struct is a model that may be serialized in and out of the Go runtime. In such cases, the tag for a given field can contain a pseudo configuration for how that field should be written or read. The list of well-known struct tags is informative, and links to popular serialization formats such as JSON and XML. But certain ORMs and cloud providers have also opted for an approach that uses tags to map Go fields to column names.

Tags allow programmers to define the external shape of a type at once and across a range of formats. The directives present in each tag's value can be as complex as the serialization scheme allows. The DynamoDB client is such an example:

type EmailThread struct {
    Labels []string `dynamodbav:"labels,omitemptyelem,stringset"`
    Title string `dynamodbav:"title"`
    LastEmailSentAt time.Time `dynamodbav:"lastEmailSentAt,unixtime"`
} 
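The same mechanism is easy to try out with encoding/json from the standard library. The sketch below uses a made-up Note type (not part of any real API) to show three common directives: renaming a field, omitting it when empty, and skipping it entirely.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Note is a hypothetical model used only to illustrate json field tags.
type Note struct {
	Title  string   `json:"title"`            // rename the key
	Labels []string `json:"labels,omitempty"` // drop the key when the slice is empty
	Draft  bool     `json:"-"`                // never serialize this field
}

// encode serializes a Note and returns the JSON document as a string.
func encode(n Note) (string, error) {
	b, err := json.Marshal(n)
	return string(b), err
}

func main() {
	out, err := encode(Note{Title: "hello", Draft: true})
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(out) // {"title":"hello"}
}
```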

Enumerations

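This is a head scratcher: there are no strongly-typed enumerations in Go. Enum-like types are conventionally declared as new types whose underlying type is int, with their values declared as consts whose first value is iota. iota is a special predeclared identifier in the Go language: within a const block, it stands for successive untyped integer constants, starting at 0.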

For a language seeking to reduce confusion and surprises, this is immensely confusing and surprising. Practically, you define an enum in the following way:

type HTTPMethod int

const (
    Get HTTPMethod = iota
    Post
    Put
    Delete
)

Since the Go type system doesn’t provide first-class support for enumerated types, the names of the constants are lost in the compilation. In other words, you must write custom code to get the name of an enumerated value at runtime. This is commonly done by defining a String() method:

func (m HTTPMethod) String() string {
    return []string{"GET", "POST", "PUT", "DELETE"}[m]
}

Note that the strings in the slice must appear in the same order as the constant declarations, and there is no compile-time guarantee that you will not go out of bounds, which would happen if you defined a new value in the enumeration but forgot to add it to the slice of names.
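One way to soften both problems, sketched below under my own naming (methodNames is not from the original code): index the names by their constant so that ordering no longer matters, and guard the lookup so that an unknown value yields a placeholder instead of a panic.

```go
package main

import "fmt"

type HTTPMethod int

const (
	Get HTTPMethod = iota
	Post
	Put
	Delete
)

// Keying each name by its constant ties the name to the value rather than to
// its position in the literal.
var methodNames = [...]string{
	Get:    "GET",
	Post:   "POST",
	Put:    "PUT",
	Delete: "DELETE",
}

// String returns the method name, or a placeholder for values that have no
// entry in methodNames (instead of panicking on an out-of-range index).
func (m HTTPMethod) String() string {
	if m < 0 || int(m) >= len(methodNames) {
		return fmt.Sprintf("HTTPMethod(%d)", int(m))
	}
	return methodNames[m]
}

func main() {
	fmt.Println(Put, HTTPMethod(42))
}
```

This still relies on you remembering to name new values; the stringer tool from golang.org/x/tools can generate the String() method for you if you'd rather not maintain it by hand.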

JSON

The encoding/json package encapsulates the logic for serializing structs into JSON and deserializing JSON into structs. It’s a reflection-based engine that will do a decent job by default and that may be customized using field tags. In case you need full control of how a struct is serialized, you need to implement custom interfaces.

A good example of such a need is enumerations: if they are defined using iota values, a value will by default be encoded as an integer. If you prefer to have a string, you can make the type implement the Marshaler interface, which has a single MarshalJSON method:

type HTTPMethod int

const (
    Get HTTPMethod = iota
    Post
    Put
    Delete
)

func (m HTTPMethod) String() string {
    return methods[m]
}

var methods = []string{"GET", "POST", "PUT", "DELETE"}
var methodsByName = map[string]HTTPMethod{
    "GET": Get,
    "POST": Post,
    "PUT": Put,
    "DELETE": Delete,
}

func (m HTTPMethod) MarshalJSON() ([]byte, error) {
    return json.Marshal(m.String())
}

Conversely, you would need to implement the Unmarshaler interface to convert the string back into the enumerated value. You might want to refactor the previously defined String() method to share the string constants between the marshalling and unmarshalling functions:

func (m *HTTPMethod) UnmarshalJSON(b []byte) error {
    var s string
    if err := json.Unmarshal(b, &s); err != nil {
        return err
    }
    value, ok := methodsByName[s]
    if !ok {
        return fmt.Errorf("unable to unmarshal HTTPMethod value %q", s)
    }
    *m = value
    return nil
}

Time formatting

This is just a complete head scratcher and there’s no way around it except biting the bullet: Go’s built-in time formatting doesn’t use traditional format strings such as YYYY-MM-DD. Instead, it relies on an unambiguous reference point in time that you write out in your desired format. That point in time is January 2nd 2006, 3:04pm and 5 seconds, in the GMT-7 timezone. Written another way, it is: 01/02 03:04:05PM '06 -0700, i.e. 1, 2, 3, 4, 5, 6, 7.

So, date formats are actually written as a plain date, e.g. an ISO 8601 datetime (without a zone offset) is: 2006-01-02T15:04:05.

Decorating an interface implementation

Go’s type system allows for declaring structs that wrap an interface implementation and decorate specific methods. There’s no special syntax for it — it works just like interface composition. Let’s say you have a naive in-memory database that stores models that have an identifier and a name:

type Model struct {
    Id uint64
    Name string
}

type Database interface {
    Create(name string) *Model
    Save(model *Model)
    Get(id uint64) *Model
}

type database struct {
    models map[uint64]*Model
}

func (db *database) Create(name string) *Model {
    id := rand.Uint64()
    db.models[id] = &Model{Id: id, Name: name}
    return db.models[id]
}

func (db *database) Save(model *Model) {
    db.models[model.Id] = model
}

func (db *database) Get(id uint64) *Model {
    return db.models[id]
}

func NewDatabase() Database {
    return &database{models: make(map[uint64]*Model)}
}

The Database interface is public, and as such may be relied-upon by external code. Now let’s say you want to track the “popularity” of a certain model, say, by the number of times it is accessed by Get. Your options are:

  1. Modify the database implementation to add a counter - but any other implementation of the Database interface will not inherit that behavior.
  2. Implement your own Database, which is not great because there are two methods out of three you don’t care about at all.

The third option is to decorate the Database interface:

type trackingDecorator struct {
    Database // wrap an arbitrary implementation of Database 
    getCounter map[string]int32
}

// Override the method you care about
func (td *trackingDecorator) Get(id uint64) *Model {
    model := td.Database.Get(id)
    if model != nil {
        td.getCounter[model.Name] += 1
    }
    return model
}

func NewTrackingDecorator(wrapped Database) Database {
    return &trackingDecorator{
        Database:   wrapped,
        getCounter: make(map[string]int32),
    }
}

Declaring the Database interface as an anonymous (embedded) field of the trackingDecorator struct will automatically make the latter implement the interface. Upon construction, you will need to supply the actual implementation you want wrapped.

Most importantly: the methods of the wrapped instance can still be called from the overriding code, e.g. td.Database.Get(id) in the example above. This lets you decorate select methods of arbitrary interfaces, even if you can’t access the underlying implementation.
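For reference, here is a condensed, runnable version of the whole pattern. It is a sketch with one simplification of mine: Create hands out sequential ids from a nextID counter instead of calling rand.Uint64(), so that the example needs no extra import.

```go
package main

import "fmt"

type Model struct {
	Id   uint64
	Name string
}

type Database interface {
	Create(name string) *Model
	Save(model *Model)
	Get(id uint64) *Model
}

// database is the naive in-memory implementation (sequential ids are an
// assumption made for this sketch).
type database struct {
	nextID uint64
	models map[uint64]*Model
}

func (db *database) Create(name string) *Model {
	db.nextID++
	m := &Model{Id: db.nextID, Name: name}
	db.models[m.Id] = m
	return m
}

func (db *database) Save(model *Model)    { db.models[model.Id] = model }
func (db *database) Get(id uint64) *Model { return db.models[id] }

// trackingDecorator embeds a Database and only overrides Get.
type trackingDecorator struct {
	Database
	getCounter map[string]int32
}

func (td *trackingDecorator) Get(id uint64) *Model {
	model := td.Database.Get(id) // delegate to the wrapped implementation
	if model != nil {
		td.getCounter[model.Name]++
	}
	return model
}

func main() {
	td := &trackingDecorator{
		Database:   &database{models: make(map[uint64]*Model)},
		getCounter: make(map[string]int32),
	}

	// The assignment compiles because Create and Save are promoted from the
	// embedded field, so the decorator satisfies Database on its own.
	var db Database = td

	m := db.Create("widget")
	db.Get(m.Id)
	db.Get(m.Id)
	db.Get(999) // unknown id: nothing is counted

	fmt.Println(td.getCounter["widget"]) // 2
}
```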

Testing

GoMock .Do()

When you use GoMock, you can use custom matchers in order to assert on the correctness of the arguments received by your mocked functions. Matchers are nice but they need to be reusable to prove their worth. There are certain cases when you don’t know (or don’t care) about the full argument, e.g. you just want to make sure that a dictionary containing headers has the one key-value pair you care about, or that a URL contains just the one param you need.

GoMock’s .Do is designed for cases like this: it lets you capture the parameters passed to the mocked function and run arbitrary assertions on them. Say you have a raw client for a “TaskQueue” and a struct wrapping that client to provide convenience functions to enqueue messages of a certain type to certain routes:

// This is the mocked interface - not going into the details of what a Task contains but assume
// that it is a complex object with lots of fields and implementation details.
type TaskScheduler interface {
    ScheduleTask(ctx context.Context, task *Task) (*ScheduleResponse, error)
}

func TestEnqueueMyTask(t *testing.T) {
    t.Parallel()
    ctrl := gomock.NewController(t)
    defer ctrl.Finish()

    ts := taskScheduler.NewMockTaskScheduler(ctrl)
    // Pass `gomock.Any()` in lieu of asserting on a fully built Task
    ts.EXPECT().ScheduleTask(gomock.Any(), gomock.Any()).
        Return(&tq.ScheduleResponse{}, nil).
        // The signature of the function passed to `.Do` should be the same as `ScheduleTask`
        // but we don't care about the context
        Do(func(_ context.Context, task *Task) {
            // You can run assertions on just the parts of Task that you care about
            assert.Equal(t, tq.Method_PUT, task.Payload.Method)
            // etc…
        })

    // Inject the mock into the struct under test
    controller := MyController{scheduler: ts}
    err := controller.MethodThatSchedulesATask(context.Background())
    assert.NoError(t, err)
}

Skipping tests

It’s useful every now and then to disable a test without wanting to delete the code. You can use Skip or Skipf to do that:

func TestProductionRequestEndToEnd(t *testing.T) {
    t.Skipf("Disabled because of the refactoring in FOO-539")
    ctx := context.Background()
    tc := getTestConfig()
    
}

This is better than renaming the test or commenting the code out because the test runner will print a message every time the test is run, reminding you that it should be re-enabled at some later point in time.

Running a specific test

The go test command has a -run flag which may be used to select a specific test function to run, but the actual syntax is poorly documented. You need to pass in the directory (package) in which the test lives, not just the test name:

go test -run=TestFoo ./path/to/your/test/file/ -v -count=1

-count=1 will disable test caching, which would otherwise occur when your source files do not require rebuilding. As per the doc: “The idiomatic way to disable test caching explicitly is to use -count=1” 🤷.

Compiling tests without running them

The Go test tooling does not decouple the compilation of tests from their execution. In other words, the philosophy is that you either want to run your tests or… run your tests – and there’s no way to just check that your tests compile, which is the first step before they can be run.

It turns out the -run flag discussed above doesn’t actually just take the name of a test, but a regular expression that matches the names of the test functions to run. If you set that expression to something that cannot match any function, the testing framework will stop just after the compilation step. The most common trick is to use a line beginning (^) immediately followed by a line ending ($):

go test -run=^$ ./...

Here be dragons: the testing framework will not execute any tests but it will execute code that belongs in package initializers: should this code have any side-effects, this command would trigger them.

Subtests

Go’s standard test framework notably lacks the concepts of setup and teardown phases, which in other languages allow for, well, setting up the context of multiple tests and disposing of those resources once the tests have run.

Not all hope is lost though: t.Run() is a method intended to isolate the behavior to test in a dedicated block. Because you can nest them, you can progressively specialize the scenario under test, and the structure of the code can resemble Ruby’s RSpec:

func TestBowling(t *testing.T) {
    bowling := Bowling{}

    t.Run("When there are no strikes or spare", func(t *testing.T) {
        bowling.Frame(3, 2)
        assert.Equal(t, 5, bowling.Score())
    })

    t.Run("When there is a spare", func(t *testing.T) {
        bowling.Frame(7, 3)
        assert.Equal(t, 10, bowling.Score())

        t.Run("Followed by a regular frame", func(t *testing.T) {
            bowling.Frame(5, 4)
            assert.Equal(t, 24, bowling.Score())
        })
    })
}

As you can sort of guess, the problem with this approach is that all tests will share the same instance under test: if that object is stateful, you will end up with tests whose sequencing cannot change. This is a pretty big anti-pattern in general, but especially in Go where parallelism is built into the test framework.

t.Run() can still prove useful to break down long test methods into chunks that are easier to digest and help with making common scenarios clearer for the next maintainer.

Precommit

When I get into a new language or stack, I try to write a pre-commit script with the goal of catching problems and correcting them before the code is pushed to a remote branch (I know about pre-commit but personally find it cumbersome). This is the Ruby script I have for my Go projects and here’s what it does:

#!/usr/bin/env ruby

class String
  # colorization
  def colorize(color_code)
    "\e[#{color_code}m#{self}\e[0m"
  end

  def red
    colorize(31)
  end

  def green
    colorize(32)
  end

  def yellow
    colorize(33)
  end
end

require 'pathname'
require 'set'

def generated_file?(file_path)
  # File.read closes the file handle once the content has been read
  File.read(file_path).match?(/^\/\/ Code generated .* DO NOT EDIT\.$/)
end

def needs_generation?(file_path)
  File.read(file_path).match?(/^\/\/go:generate /)
end

def test_file?(file_path)
  !file_path.match(/_test\.go$/).nil?
end

def abort_unless(command, failure_message='Failure', success_message='Success')
  # `print` returns nil, so chaining with `and` would never reach the flush
  print command
  STDOUT.flush
  output = `#{command} 2>&1`
  if !$?.success?
    puts "  \u{1F44A}  #{failure_message}".red
    STDERR.puts output.red
    abort("\u{1F44A}  run: #{command}")
  else
    puts "  \u{1F44C}  #{success_message}".green
  end
end

class CommandResult
  def initialize(command, output, failure, failure_message='Failure', success_message='Success')
    @command = command
    @output = output
    @failure = failure
    @failure_message = failure_message
    @success_message = success_message
  end

  def result
    # `print` returns nil, so chaining with `and` would never reach the flush
    print @command
    STDOUT.flush
    if @failure
      puts "  \u{1F44A}  #{@failure_message}".red
      STDERR.puts @output.red
      abort("\u{1F44A}  run: #{@command}")
    else
      puts "  \u{1F44C}  #{@success_message}".green
    end
  end
end

def abort_unless_async(command, failure_message='Failure', success_message='Success')
  output = `#{command} 2>&1`
  CommandResult.new(command, output, !$?.success?, failure_message, success_message)
end

def run_go_fmt(files)
  puts "Running go fmt…"
  files.each do |file|
    abort_unless("go fmt #{file}")
  end
  # Eagerly re-add all files, as they may have been modified
  `git add #{files.join(' ')}` unless files.empty?
end

def run_go_imports(files)
  puts "Running goimports…"
  abort_unless("goimports -w #{files.join(' ')}")
  # Eagerly re-add all files, as they may have been modified
  `git add #{files.join(' ')}` unless files.empty?
end

def run_lint(files)
  puts "Running lint"
  abort_unless("golangci-lint run --fix")
  # Eagerly re-add all files, as they may have been modified
  `git add #{files.join(' ')}` unless files.empty?
end

def run_go_generate(go_files)
  puts "Running go generate…"
  # threads = go_files.map { |go_file| Thread.new { abort_unless_async("go generate #{go_file}") } }
  # threads.each do |thread|
  #   thread.value.result
  # end
  go_files.each do |go_file|
    abort_unless("go generate #{go_file}")
  end
  generated_files = `git diff --name-only --diff-filter=AM | grep '\.go$'`.split.select { |file| generated_file?(file) }
  # Re-add all generated files
  `git add #{generated_files.join(' ')}` unless generated_files.empty?
end

def run_go_vet(go_files)
  puts "Running go vet…"
  go_files.each do |file|
    abort_unless("go vet #{file}")
  end
end

def check_tests_for_new_files(new_go_application_files)
  puts "Checking for tests…"
  new_go_application_files.each do |new_go_application_file|
    target_test_file = "#{new_go_application_file.gsub(/\.go$/, '_test.go')}"
    STDERR.puts "#{new_go_application_file} was added in this change but is missing a matching test (#{target_test_file})".yellow unless Pathname.new(target_test_file).exist?
  end
end

def check_fmt_print(go_files)
  puts "Checking for fmt.Print…"
  abort_unless("! grep -Hn 'fmt\.Print' #{go_files.join(' ')}")
end

def run_tests()
  puts 'Running tests…'
  abort_unless("go test ./...")
end

def build_integration_tests()
  puts 'Building integration tests'
  abort_unless("go test -run=^$ -tags 'integration endtoend' ./...")
end

new_go_files = `git diff --cached --name-only --diff-filter=A | grep '\.go$'`.split
new_go_application_files, new_go_test_files = new_go_files.partition { |file| not test_file?(file) }
new_go_application_files, new_go_generated_files = new_go_application_files.partition { |file| not generated_file?(file) }
go_files = `git diff --cached --name-only --diff-filter=ACM | grep '\.go$'`.split
go_application_files, go_test_files = go_files.partition { |file| not test_file?(file) }
go_application_files, go_generated_files = go_application_files.partition { |file| not generated_file?(file) }

def all_go_files()
  # Recursively list all go files in current package and emit lines in the format
  # /absolute/path/to/module/directory:[file.go otherfile.go]
  modules_and_files = `go list -find -f '{{.Dir}}:{{.GoFiles}}' ./...`.split("\n")
  git_root = `git rev-parse --show-toplevel`.strip + '/'
  all_go_files = []
  modules_and_files.each do |module_and_files|
    # Split module path and files
    module_dir, files = module_and_files.split(':')
    # Remove git root directory from absolute path of module
    module_dir = module_dir[git_root.length..-1]
    module_dir = (module_dir.nil? or module_dir.empty?) ? "" : "#{module_dir}/"
    files[1..-2].split.each do |file|
      all_go_files << "#{module_dir}#{file}"
    end
  end
  all_go_files
end

unless go_files.empty?
  run_go_fmt(go_application_files + go_test_files)
  run_go_imports(go_application_files + go_test_files)
  check_fmt_print(go_application_files + go_test_files)
  run_go_vet(go_application_files + go_test_files)
  run_lint(go_application_files + go_test_files)
  check_tests_for_new_files(new_go_application_files)
  run_go_generate(all_go_files().select { |file| needs_generation?(file) })
  run_tests()
  build_integration_tests()
  puts "\u{1F389}  All good!".green
  exit(0)
end