Much of the Go software I write follows a common pattern: an HTTP JSON API fronting some business logic, backed by a data store of some sort. When an error occurs, I typically want to present a context-aware HTTP status code and a JSON payload containing an error message. I want to avoid generic 400 Bad Request and 500 Internal Server Error responses whenever possible, and I also don’t want to expose internal implementation details or inadvertently leak information to API consumers.
I’d like to share the pattern I’ve settled on for this type of application.
First, I define a new interface that will be used throughout the application for exposing “safe” errors through the API:
package app

type APIError interface {
	// APIError returns an HTTP status code and an API-safe error message.
	APIError() (int, string)
}
In practice, most of the time there are a limited set of errors that I want to return through the API: things like a 401 Unauthorized for a missing or invalid API token, or a 404 Not Found when referring to a resource that doesn’t exist in the data store. For these I create a private struct that implements APIError:
type sentinelAPIError struct {
	status int
	msg    string
}

func (e sentinelAPIError) Error() string {
	return e.msg
}

func (e sentinelAPIError) APIError() (int, string) {
	return e.status, e.msg
}
And then I publicly define common sentinel errors:
var (
	ErrAuth      = &sentinelAPIError{status: http.StatusUnauthorized, msg: "invalid token"}
	ErrNotFound  = &sentinelAPIError{status: http.StatusNotFound, msg: "not found"}
	ErrDuplicate = &sentinelAPIError{status: http.StatusBadRequest, msg: "duplicate"}
)
The sentinel errors provide a good foundation for reporting basic information through the API, but how can I associate real errors with them? ErrNoRows from the database/sql package is never going to implement my APIError interface, but I can leverage the error wrapping functionality introduced in Go 1.13.
One of the lesser-known features of error wrapping is the ability to write a custom Is method on your own types. This is perhaps because the implementation is hidden within the errors package, and the package documentation doesn’t give much information about why you’d want to use it. But it’s a perfect fit for these sentinel errors.
First, I define a sentinel-wrapped error type:
type sentinelWrappedError struct {
	error
	sentinel *sentinelAPIError
}

func (e sentinelWrappedError) Is(err error) bool {
	return e.sentinel == err
}

func (e sentinelWrappedError) APIError() (int, string) {
	return e.sentinel.APIError()
}
This associates an error from elsewhere in the application with one of my predefined sentinel errors. A key thing to note here is that sentinelWrappedError embeds the original error, so its Error method returns the original error’s message, while its APIError method returns the sentinel’s API-safe message. The Is method allows these wrapping errors to be compared against the sentinel errors using errors.Is.
Then I need a public function to do the wrapping:
func WrapError(err error, sentinel *sentinelAPIError) error {
	return sentinelWrappedError{error: err, sentinel: sentinel}
}
(If you wanted to include additional context in the APIError, such as a resource name, this would be a good place to add it.)
When other parts of the application encounter an error, they wrap it with one of the sentinel errors. For example, the database layer might have its own wrapError function that looks something like this:
package db

import (
	"database/sql"
	"errors"

	"example.com/app"
)

// isMySQLError and codeDuplicate are driver-specific helpers, elided here.
func wrapError(err error) error {
	switch {
	case errors.Is(err, sql.ErrNoRows):
		return app.WrapError(err, app.ErrNotFound)
	case isMySQLError(err, codeDuplicate):
		return app.WrapError(err, app.ErrDuplicate)
	default:
		return err
	}
}
Because the wrapper implements Is against the sentinel, you can compare errors to sentinels regardless of what the original error is:
err := db.DoAThing()
switch {
case errors.Is(err, ErrNotFound):
	// do something specific for Not Found errors
case errors.Is(err, ErrDuplicate):
	// do something specific for Duplicate errors
}
The final task is to handle these errors and send them safely back through the API. In my api package, I define a helper function that takes an error and serializes it to JSON:
package api

import (
	"errors"
	"net/http"

	"example.com/app"
)

func JSONHandleError(w http.ResponseWriter, err error) {
	var apiErr app.APIError
	if errors.As(err, &apiErr) {
		status, msg := apiErr.APIError()
		JSONError(w, status, msg)
	} else {
		JSONError(w, http.StatusInternalServerError, "internal error")
	}
}
(The elided JSONError function is the one responsible for setting the HTTP status code and serializing the JSON.)
Note that this function can take any error. If it’s not an APIError, it falls back to returning a 500 Internal Server Error. This makes it safe to pass unwrapped and unexpected errors through without additional care.
Because sentinelWrappedError embeds the original error, you can also log any error you encounter and get the original error message, which can aid debugging.
Here’s an example HTTP handler function that generates an error, logs it, and returns it to a caller.
package api
func exampleHandler(w http.ResponseWriter, r *http.Request) {
	// A contrived example that always returns an error. Imagine this
	// is actually a function that calls into a data store.
	err := app.WrapError(fmt.Errorf("user ID %q not found", "archer"), app.ErrNotFound)
	if err != nil {
		log.Printf("exampleHandler: error fetching user: %v", err)
		JSONHandleError(w, err)
		return
	}

	// Happy path elided...
}
Hitting this endpoint will give you this HTTP response:
HTTP/1.1 404 Not Found
Content-Type: application/json
{"error": "not found"}
And send to your logs:
exampleHandler: error fetching user: user ID "archer" not found
If I had forgotten to call app.WrapError, the response instead would have been:
HTTP/1.1 500 Internal Server Error
Content-Type: application/json
{"error": "internal error"}
But the message to the logs would have been the same.
Adopting this pattern for error handling has reduced the number of error types and scaffolding in my code – the same problems that Nate experienced before adopting his error flags scheme. It’s centralized the errors I expose to the user, reduced the work to expose appropriate and consistent error codes and messages to API consumers, and has an always-on safe fallback for unexpected errors or programming mistakes. I hope you can take inspiration to improve the error handling in your own code.
When Go 1.12 was released, I was very excited to test out the new opt-in support for TLS 1.3. TLS 1.3 is a major improvement to the main security protocol of the web.
I was eager to try it out in a tool I had written for work which allowed me to scan what TLS parameters were supported by a server. In TLS, the client presents a set of cipher suites to the server that it supports, and the server chooses the best one to use, where “best” is typically a reasonable trade-off of security and performance.
In order to enumerate what cipher suites a server supports, a client must make individual connections, each offering a single cipher suite at a time. If the server rejects the handshake, you know the cipher suite is not supported.
For TLS 1.2 and below, this is pretty straightforward:
func supportedTLS12Ciphers(hostname string) []uint16 {
	// Taken from https://golang.org/pkg/crypto/tls/#pkg-constants
	var allCiphers = []uint16{
		tls.TLS_RSA_WITH_RC4_128_SHA,
		tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA,
		tls.TLS_RSA_WITH_AES_128_CBC_SHA,
		tls.TLS_RSA_WITH_AES_256_CBC_SHA,
		tls.TLS_RSA_WITH_AES_128_CBC_SHA256,
		tls.TLS_RSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_ECDSA_WITH_RC4_128_SHA,
		tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
		tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
		tls.TLS_ECDHE_RSA_WITH_RC4_128_SHA,
		tls.TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA,
		tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
		tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
		tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,
		tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,
		tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
		tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
	}

	var supportedCiphers []uint16
	for _, c := range allCiphers {
		cfg := &tls.Config{
			ServerName:   hostname,
			CipherSuites: []uint16{c},
			MinVersion:   tls.VersionTLS12,
			MaxVersion:   tls.VersionTLS12,
		}

		conn, err := net.Dial("tcp", hostname+":443")
		if err != nil {
			panic(err)
		}
		client := tls.Client(conn, cfg)
		client.Handshake()
		client.Close()
		if client.ConnectionState().CipherSuite == c {
			supportedCiphers = append(supportedCiphers, c)
		}
	}
	return supportedCiphers
}
After writing the barebones code to support TLS 1.3 in the tool, I discovered something unfortunate: Go does not allow you to select what TLS 1.3 cipher suites are sent to the server. The rationale makes sense: TLS 1.3 greatly simplified both what is contained within a cipher suite and how many are supported. Unless and until there is a weakness in a TLS 1.3 cipher suite, there’s nothing to be gained in allowing them to be customized.
Still, this tool was one of the rare situations where it makes sense, and I wanted to see if I could hack it in. Enter go:linkname, buried deep in Go’s compiler documentation:
//go:linkname localname importpath.name
The //go:linkname directive instructs the compiler to use “importpath.name” as the object file symbol name for the variable or function declared as “localname” in the source code. Because this directive can subvert the type system and package modularity, it is only enabled in files that have imported “unsafe”.
Well hello! This looks promising. If there is a function or variable in Go’s standard library that specifies the list of TLS 1.3 ciphers, we can override it in our tool by instructing the Go compiler to use our local implementation instead of the one in the standard library.
Let’s dig into the standard library’s TLS 1.3 implementation. In crypto/tls/handshake_client.go, we have:
if hello.supportedVersions[0] == VersionTLS13 {
	hello.cipherSuites = append(hello.cipherSuites, defaultCipherSuitesTLS13()...)
	// ...
}
Great! Let’s just override this defaultCipherSuitesTLS13() function. In crypto/tls/common.go:
func defaultCipherSuitesTLS13() []uint16 {
	once.Do(initDefaultCipherSuites)
	return varDefaultCipherSuitesTLS13
}
This complicates things a bit: defaultCipherSuitesTLS13 lazily calls an initialization function on first use, and that function manipulates a bunch of internal default lists beyond just the TLS 1.3 cipher suites list. We don’t want to mess with any of that. But in that initDefaultCipherSuites function, we have this:
varDefaultCipherSuitesTLS13 = []uint16{
	TLS_AES_128_GCM_SHA256,
	TLS_CHACHA20_POLY1305_SHA256,
	TLS_AES_256_GCM_SHA384,
}
Ah ha! A package global variable is assigned the cipher suite values. And because this initialization function is only ever called once, we can initialize the list and then take control of it in our code.
// Using go:linkname requires us to import unsafe
import (
	"crypto/tls"
	_ "unsafe"
)

// We bring the real defaultCipherSuitesTLS13 function from the
// crypto/tls package into our own package. This lets us perform
// that lazy initialization of the cipher list when we want.
//go:linkname defaultCipherSuitesTLS13 crypto/tls.defaultCipherSuitesTLS13
func defaultCipherSuitesTLS13() []uint16

// Next we bring the varDefaultCipherSuitesTLS13 slice into our
// package. This is what we manipulate to get the cipher suites.
//go:linkname varDefaultCipherSuitesTLS13 crypto/tls.varDefaultCipherSuitesTLS13
var varDefaultCipherSuitesTLS13 []uint16

// Also keep a variable around for the real default set, so we
// can reset it once we're finished.
var realDefaultCipherSuitesTLS13 []uint16

func init() {
	// Initialize the TLS 1.3 cipher suite set; this populates
	// varDefaultCipherSuitesTLS13 under the covers.
	realDefaultCipherSuitesTLS13 = defaultCipherSuitesTLS13()
}
func supportedTLS13Ciphers(hostname string) []uint16 {
	var supportedCiphers []uint16
	for _, c := range realDefaultCipherSuitesTLS13 {
		cfg := &tls.Config{
			ServerName: hostname,
			MinVersion: tls.VersionTLS13,
		}

		// Override the internal slice!
		varDefaultCipherSuitesTLS13 = []uint16{c}

		conn, err := net.Dial("tcp", hostname+":443")
		if err != nil {
			panic(err)
		}
		client := tls.Client(conn, cfg)
		client.Handshake()
		client.Close()
		if client.ConnectionState().CipherSuite == c {
			supportedCiphers = append(supportedCiphers, c)
		}
	}

	// Reset the internal slice back to the full set
	varDefaultCipherSuitesTLS13 = realDefaultCipherSuitesTLS13

	return supportedCiphers
}
As you can see, we used go:linkname to subvert package modularity for both a function and a variable. We use a package init function to populate the default cipher suites list, and then we override it as we iterate and attempt connections with only a single supported cipher suite. Finally, we make sure to clean things up and set the default list back to the full set for any future uses.
Lastly, let’s glue things together:
func main() {
	hostname := os.Args[1]

	fmt.Println("Supported TLS 1.2 ciphers")
	for _, c := range supportedTLS12Ciphers(hostname) {
		fmt.Printf("  %s\n", tls.CipherSuiteName(c))
	}

	fmt.Println()

	fmt.Println("Supported TLS 1.3 ciphers")
	for _, c := range supportedTLS13Ciphers(hostname) {
		fmt.Printf("  %s\n", tls.CipherSuiteName(c))
	}
}
$ go run cipherlist.go joeshaw.org
Supported TLS 1.2 ciphers
  TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
  TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384

Supported TLS 1.3 ciphers
  TLS_AES_128_GCM_SHA256
  TLS_CHACHA20_POLY1305_SHA256
  TLS_AES_256_GCM_SHA384
There you have it.
go:linkname should be used very sparingly. Consider carefully whether you must use it, or whether you can solve your problem another way. For me, the alternative was to maintain my own copy of crypto/tls with some minor edits, which would have frozen me into a point-in-time snapshot of the Go TLS stack and put the burden of upgrading onto me. While I know that there are no compatibility guarantees for Go’s crypto/tls internals, using go:linkname lets me use the TLS stack provided by current and future versions of Go as long as the particular pieces I am using don’t change. I can live with that.
The full code for this test program lives in this GitHub repository.
My code has a bug. 😭
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x751ba4]

goroutine 58 [running]:
github.com/joeshaw/example.UpdateResponse(0xad3c60, 0xc420257300, 0xc4201f4200, 0x16, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/joeshaw/example/resp.go:108 +0x144
github.com/joeshaw/example.PrefetchLoop(0xacfd60, 0xc420395480, 0x13a52453c000, 0xad3c60, 0xc420257300)
	/go/src/github.com/joeshaw/example/resp.go:82 +0xc00
created by main.runServer
	/go/src/github.com/joeshaw/example/cmd/server/server.go:100 +0x7e0
This panic is caused by dereferencing a nil pointer, as indicated by the first line of the output. These types of errors are much less common in Go than in other languages like C or Java thanks to Go’s idioms around error handling.
If a function can fail, it must return an error as its last return value. The caller should immediately check for errors from that function:
// val is a pointer, err is an error interface value
val, err := somethingThatCouldFail()
if err != nil {
	// Deal with the error, probably pushing it up the call stack
	return err
}

// By convention, nearly all the time, val is guaranteed to not be
// nil here.
However, there must be a bug somewhere that is violating this implicit API contract.
Before I go any further, a caveat: this is architecture- and operating system-dependent stuff, and I am only running this on amd64 Linux and macOS systems. Other systems can and will do things differently.
Line two of the panic output gives information about the UNIX signal that triggered the panic:
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x751ba4]
A segmentation fault (SIGSEGV) occurred because of the nil pointer dereference. The code field maps to the UNIX siginfo.si_code field, and a value of 0x1 is SEGV_MAPERR (“address not mapped to object”) in Linux’s siginfo.h file.
addr maps to siginfo.si_addr and is 0x30, which isn’t a valid memory address.
pc is the program counter, and we could use it to figure out where the program crashed, but we conveniently don’t need to because a goroutine trace follows.
goroutine 58 [running]:
github.com/joeshaw/example.UpdateResponse(0xad3c60, 0xc420257300, 0xc4201f4200, 0x16, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/joeshaw/example/resp.go:108 +0x144
github.com/joeshaw/example.PrefetchLoop(0xacfd60, 0xc420395480, 0x13a52453c000, 0xad3c60, 0xc420257300)
	/go/src/github.com/joeshaw/example/resp.go:82 +0xc00
created by main.runServer
	/go/src/github.com/joeshaw/example/cmd/server/server.go:100 +0x7e0
The deepest stack frame, the one where the panic happened, is listed first; in this case, resp.go line 108.
The thing that catches my eye in this goroutine backtrace is the arguments to the UpdateResponse and PrefetchLoop functions, because their number doesn’t match up with the function signatures:
func UpdateResponse(c Client, id string, version int, resp *Response, data []byte) error
func PrefetchLoop(ctx context.Context, interval time.Duration, c Client)
UpdateResponse takes 5 arguments, but the panic shows more than 10. PrefetchLoop takes 3, but the panic shows 5. What’s going on?
To understand the argument values, we have to understand a little bit about the data structures underlying Go types. Russ Cox has two great blog posts on this, one on basic types, structs and pointers, strings, and slices and another on interfaces which describe how these are laid out in memory. Both posts are essential reading for Go programmers, but to summarize:
When a panic happens, the arguments we see in the output include the “exploded” values of strings, slices, and interfaces. In addition, the return values of a function are added onto the end of the argument list.
To go back to our UpdateResponse function: the Client type is an interface, which is 2 values. id is a string, which is 2 values (4 total). version is an int, 1 value (5). resp is a pointer, 1 value (6). data is a slice, 3 values (9). The error return value is an interface, so add 2 more for a total of 11. The panic output limits the number of printed values to 10, so the last value is truncated from the output.
Here is an annotated UpdateResponse stack frame:
github.com/joeshaw/example.UpdateResponse(
	0xad3c60,     // c Client interface, type pointer
	0xc420257300, // c Client interface, value pointer
	0xc4201f4200, // id string, data pointer
	0x16,         // id string, length (0x16 = 22)
	0x1,          // version int (1)
	0x0,          // resp pointer (nil!)
	0x0,          // data slice, backing array pointer (nil)
	0x0,          // data slice, length (0)
	0x0,          // data slice, capacity (0)
	0x0,          // error interface (return value), type pointer
	...           // truncated; would have been error interface value pointer
)
This helps confirm what the source suggested: resp was nil and was being dereferenced.
Moving up one stack frame to PrefetchLoop: ctx context.Context is an interface value, interval is a time.Duration (which is just an int64), and Client again is an interface. Here is PrefetchLoop annotated:
github.com/joeshaw/example.PrefetchLoop(
	0xacfd60,       // ctx context.Context interface, type pointer
	0xc420395480,   // ctx context.Context interface, value pointer
	0x13a52453c000, // interval time.Duration (6h0m)
	0xad3c60,       // c Client interface, type pointer
	0xc420257300,   // c Client interface, value pointer
)
As I mentioned earlier, it should not have been possible for resp to be nil, because that should only happen when the returned error is not nil. The culprit was code that erroneously used the github.com/pkg/errors Wrapf() function instead of Errorf():
// Function returns (*Response, []byte, error)
if resp.StatusCode != http.StatusOK {
	return nil, nil, errors.Wrapf(err, "got status code %d fetching response %s", resp.StatusCode, url)
}
Wrapf() returns nil if the error passed into it is nil. This function erroneously returned nil, nil, nil when the HTTP status code was not http.StatusOK, because a non-200 status code is not an error and thus err was nil. Replacing the errors.Wrapf() call with errors.Errorf() fixed the bug.
Understanding and contextualizing panic output can make tracking down errors much easier! Hopefully this information will come in handy for you in the future.
Thanks to Peter Teichman, Damian Gryski, and Travis Bischel who all helped me decode the panic output argument lists.
From the Go 1.17 release notes:
The format of stack traces from the runtime (printed when an uncaught panic occurs, or when runtime.Stack is called) is improved. Previously, the function arguments were printed as hexadecimal words based on the memory layout. Now each argument in the source code is printed separately, separated by commas. Aggregate-typed (struct, array, string, slice, interface, and complex) arguments are delimited by curly braces. A caveat is that the value of an argument that only lives in a register and is not stored to memory may be inaccurate. Function return values (which were usually inaccurate) are no longer printed.
And from the 1.18 release notes:
Go 1.17 generally improved the formatting of arguments in stack traces, but could print inaccurate values for arguments passed in registers. This is improved in Go 1.18 by printing a question mark (?) after each value that may be inaccurate.
A colleague recently had a crash similar to our example above. The relevant methods looked like this:
func (s *Service) GetCount(repo string) (count int64, errors []error)
func (s *Service) request(method string, url string, body []byte) (status int, response []byte, errors []error)
where s.GetCount(...) calls s.request(...).
The stack trace looked like this:
github.com/example/service.(*Service).request(0x0, {0x118368d?, 0xc000cd9b20?}, {0xc000588180?, 0x1?}, {0x0, 0x0, 0x0})
	/go/src/github.com/example/service/service.go:38 +0xdc
github.com/example/service.(*Service).GetCount(0xc000896700?, {0xc00084bed0?, 0x1ba03c0?})
	/go/src/github.com/example/service/service.go:69 +0xdc
You can see right away that the new output is a big improvement. The aggregated types (strings and slices in this example) are grouped together. Return values are omitted entirely.
Here it is again with my annotations:
github.com/example/service.(*Service).request(
	0x0,                         // *Service receiver, nil pointer (!)
	{0x118368d?, 0xc000cd9b20?}, // method string: pointer and length
	{0xc000588180?, 0x1?},       // url string: pointer and length
	{0x0, 0x0, 0x0},             // body []byte: pointer, length, capacity
)
	/go/src/github.com/example/service/service.go:38 +0xdc
github.com/example/service.(*Service).GetCount(
	0xc000896700?,               // *Service receiver, pointer
	{0xc00084bed0?, 0x1ba03c0?}, // repo string: pointer and length
)
	/go/src/github.com/example/service/service.go:69 +0xdc
Pretty clearly you can see that the nil *Service receiver (0x0) in the call to request is the problem. Something on line 38 is trying to dereference it, causing the crash.
But wait: GetCount calls request, and its own receiver (0xc000896700?) is not nil. What’s going on?
The release notes above say that the stack trace could include “inaccurate values for arguments passed in registers,” signified by a question mark after each such value.
GetCount does nothing with the receiver value other than pass it along to the request method. Because the receiver arrives in a register and never needs to be stored to memory, we get a potentially inaccurate value in the stack trace.
request does do something with the value – dereferences it – requiring it to be loaded into memory. That’s why the value is accurate in the request stack frame.
If you look at the test code for the os/exec package, you’ll find a neat trick for how it tests execution:
func helperCommandContext(t *testing.T, ctx context.Context, s ...string) (cmd *exec.Cmd) {
	testenv.MustHaveExec(t)

	cs := []string{"-test.run=TestHelperProcess", "--"}
	cs = append(cs, s...)
	if ctx != nil {
		cmd = exec.CommandContext(ctx, os.Args[0], cs...)
	} else {
		cmd = exec.Command(os.Args[0], cs...)
	}
	cmd.Env = []string{"GO_WANT_HELPER_PROCESS=1"}
	return cmd
}

// TestHelperProcess isn't a real test.
//
// Some details elided for this blog post.
func TestHelperProcess(*testing.T) {
	if os.Getenv("GO_WANT_HELPER_PROCESS") != "1" {
		return
	}
	defer os.Exit(0)

	args := os.Args
	for len(args) > 0 {
		if args[0] == "--" {
			args = args[1:]
			break
		}
		args = args[1:]
	}
	if len(args) == 0 {
		fmt.Fprintf(os.Stderr, "No command\n")
		os.Exit(2)
	}

	cmd, args := args[0], args[1:]
	switch cmd {
	case "echo":
		iargs := []interface{}{}
		for _, s := range args {
			iargs = append(iargs, s)
		}
		fmt.Println(iargs...)
	// etc...
	}
}
When you run go test, under the covers the toolchain compiles your test code into a temporary binary and runs it. (As an aside, passing -x to the go tool is a great way to learn what the toolchain is actually doing.)
This helper function in exec_test.go sets a GO_WANT_HELPER_PROCESS environment variable and calls the test binary itself with a parameter directing it to run a specific test, named TestHelperProcess.
Nate Finch wrote an excellent blog post in 2015 on this pattern in greater detail, and Mitchell Hashimoto’s 2017 GopherCon talk also makes mention of this trick.
I think this can be improved upon somewhat with the TestMain mechanism that was added in Go 1.4, however.
Here it is in action:
package myexec

import (
	"fmt"
	"os"
	"os/exec"
	"testing"
)

func TestMain(m *testing.M) {
	switch os.Getenv("GO_TEST_MODE") {
	case "":
		// Normal test mode
		os.Exit(m.Run())

	case "echo":
		iargs := []interface{}{}
		for _, s := range os.Args[1:] {
			iargs = append(iargs, s)
		}
		fmt.Println(iargs...)
	}
}

func TestEcho(t *testing.T) {
	cmd := exec.Command(os.Args[0], "hello", "world")
	cmd.Env = []string{"GO_TEST_MODE=echo"}

	output, err := cmd.Output()
	if err != nil {
		t.Errorf("echo: %v", err)
	}
	if g, e := string(output), "hello world\n"; g != e {
		t.Errorf("echo: want %q, got %q", e, g)
	}
}
We still set an environment variable and self-execute, but by moving the dispatching to TestMain we avoid the somewhat hacky special test that only ran when a certain environment variable was set, and that needed to do extra command-line argument handling.
Update: Chris Hines wrote about this and other useful things you can do with TestMain in a post from 2015 that I did not know about!
(Update: a discussion of fsync() has been added to the end of the post.)
It’s an idiom that quickly becomes rote to Go programmers: whenever you conjure up a value that implements the io.Closer interface, after checking for errors you immediately defer its Close() method. You see this most often when making HTTP requests:
resp, err := http.Get("https://joeshaw.org")
if err != nil {
	return err
}
defer resp.Body.Close()
or opening files:
f, err := os.Open("/home/joeshaw/notes.txt")
if err != nil {
	return err
}
defer f.Close()
But this idiom is actually harmful for writable files, because deferring a function call ignores its return value, and the Close() method can return errors. For writable files, Go programmers should avoid the defer idiom or very infrequent, maddening bugs will occur.
Why would you get an error from Close() but not an earlier Write() call? To answer that, we need to take a brief, high-level detour into computer architecture.
Generally speaking, as you move outside and away from your CPU, actions get orders of magnitude slower. Writing to a CPU register is very fast. Accessing system RAM is quite slow in comparison. Doing I/O on disks or networks is an eternity.
If every Write() call committed data to the disk synchronously, the performance of our systems would be unusably slow. While synchronous writes are very important for certain types of software (like databases), most of the time they are overkill.
The pathological case is writing to a file one byte at a time. Hard drives – brutish, mechanical devices – need to physically move a magnetic head to the position on the platter and possibly wait for a full platter revolution before the data could be persisted. SSDs, which store data in blocks and have a finite number of write cycles for each block, would quickly burn out as blocks are repeatedly written and overwritten.
Fortunately this doesn’t happen, because multiple layers of hardware and software implement caching and write buffering. When you call Write(), your data is not immediately written to the media. The operating system, storage controllers, and the media itself all buffer the data in order to batch smaller writes together, organize the data optimally for storage on the medium, and decide when best to commit it. This turns our writes from slow, blocking synchronous operations into quick, asynchronous operations that don’t directly touch the much slower I/O device. Writing a byte at a time is never the most efficient thing to do, but at least we are not wearing out our hardware if we do it.
Of course, the bytes do have to be committed to disk at some point. The operating system knows that when we close a file, we are finished with it and no subsequent write operations are going to happen. It also knows that closing the file is its last chance to tell us something went wrong.
On POSIX systems like Linux and macOS, closing a file is handled by the close system call. The BSD man page for close(2) describes the errors it can return:
ERRORS
     The close() system call will fail if:

     [EBADF]   fildes is not a valid, active file descriptor.

     [EINTR]   Its execution was interrupted by a signal.

     [EIO]     A previously-uncommitted write(2) encountered an
               input/output error.
EIO is exactly the error we are worried about: it means we’ve lost data while trying to save it to disk, and our Go programs should absolutely not return a nil error in that case.
The simplest way to solve this is simply not to use defer when writing files:
func helloNotes() error {
	f, err := os.Create("/home/joeshaw/notes.txt")
	if err != nil {
		return err
	}

	if err = io.WriteString(f, "hello world"); err != nil {
		f.Close()
		return err
	}

	return f.Close()
}
This does mean some additional bookkeeping of the file in the error cases: we must explicitly close it when io.WriteString() fails (and ignore the error from that Close(), because the write error takes precedence). But it’s clear, straightforward, and properly checks the error from the f.Close() call.
There is a way to handle this case with defer, using named return values and a closure:
func helloNotes() (err error) {
	var f *os.File
	f, err = os.Create("/home/joeshaw/notes.txt")
	if err != nil {
		return
	}
	defer func() {
		cerr := f.Close()
		if err == nil {
			err = cerr
		}
	}()

	err = io.WriteString(f, "hello world")
	return
}
The main benefit of this pattern is that it’s impossible to forget to close the file, because the deferred closure always executes. In longer functions with more if err != nil conditional branches, this pattern can also result in fewer lines of code and less repetition.
Still, I find this pattern to be a little too magical. I dislike using named return values, and modifying the return value after the core function finishes is not intuitively clear even to experienced Go programmers.
I am willing to accept the tradeoff of more readable and easily understandable code for needing to obsessively review code to ensure that the file is closed in all cases, and that’s the approach I recommend in code reviews I give to others.
On Twitter, Ben Johnson suggested that Close() may be safe to run multiple times on files, like so:
func doSomething() error {
	f, err := os.Create("foo")
	if err != nil {
		return err
	}
	defer f.Close()

	if _, err := f.Write([]byte("bar")); err != nil {
		return err
	}

	if err := f.Close(); err != nil {
		return err
	}

	return nil
}
The Go docs on io.Closer
explicitly say that
at an interface level behavior after the first call is unspecified,
but specific implementations may document their own behavior.
The docs for *os.File
unfortunately aren’t clear
on its behavior, saying only, “Close closes the File, rendering it
unusable for I/O. It returns an error, if any.” The implementation as
of Go 1.8, however, shows:
func (f *File) Close() error {
if f == nil {
return ErrInvalid
}
return f.file.close()
}
func (file *file) close() error {
if file == nil || file.fd == badFd {
return syscall.EINVAL
}
var err error
if e := syscall.Close(file.fd); e != nil {
err = &PathError{"close", file.name, e}
}
file.fd = -1 // so it can't be closed again
// no need for a finalizer anymore
runtime.SetFinalizer(file, nil)
return err
}
For clarity, badFd
is defined as -1, so subsequent attempts to close
an *os.File
will do nothing and return syscall.EINVAL
. But since
we are ignoring the error from the defer
, this doesn’t matter. It’s
not idempotent, exactly, but as Ben put later in the Twitter thread,
it “won’t blow shit up if you call it
twice."
The implementation is a good, common-sense one and it seems unlikely to change in the future and cause problems. But the lack of documentation about this outcome makes me a little nervous. Maybe a doc update to codify this behavior would be a good task for Go 1.10.
Closing the file is the last chance the OS has to tell us about problems, but the buffers are not necessarily going to be flushed when you close the file. It’s entirely possible that flushing the write buffer to disk will happen after you close the file, and a failure there cannot be caught. If this happens, it usually means you have something seriously wrong, like a failing disk.
However, you can force the write to disk with the Sync()
method on
*os.File
, which calls the fsync
system call. You should check for
errors from that call, but then I think it’s safe to ignore an error
from Close(). Calling fsync has serious performance
implications: it flushes write buffers out to slow disks. But if
you really, really want the data on disk, the best pattern to follow
is probably:
func helloNotes() error {
f, err := os.Create("/home/joeshaw/notes.txt")
if err != nil {
return err
}
defer f.Close()
if err = io.WriteString(f, "hello world"); err != nil {
return err
}
return f.Sync()
}
Go 1.7 moved the context package into the Go standard
library. Previously it lived in the golang.org/x/net/context package.
With the move, other packages within the standard library can now use
it. The net
package’s
Dialer and os/exec
package’s Command can
now utilize contexts for easy cancelation. More on this can be found
in the Go 1.7 release notes.
Go 1.7 also brings contexts to the net/http
package’s Request
type for both HTTP
clients and servers. Last year I wrote a
post about using context.Context
with http.Handler
when it lived outside the standard library, but Go
1.7 makes things much simpler and thankfully renders all of the
approaches from that post obsolete.
I suggest reading my original post
for more background, but one of the main uses of context.Context
is
to pass around request-scoped data. Things like request IDs,
authenticated user information, and other data useful for handlers and
middleware to examine in the scope of a single HTTP request.
In that post I examined three different approaches for incorporating
context into requests. Since contexts are now attached to
http.Request
values, this is no longer necessary. As long as you’re
willing to require at least Go 1.7, it’s now possible to use the
standard http.Handler
interface and common middleware patterns with
context.Context
!
Recall that the http.Handler
interface is defined as:
type Handler interface {
ServeHTTP(ResponseWriter, *Request)
}
Go 1.7 adds new context-related methods on the *http.Request
type.
func (r *Request) Context() context.Context
func (r *Request) WithContext(ctx context.Context) *Request
The Context
method returns the
current context associated with the request. The WithContext
method creates
a new Request
value with the provided context.
Suppose we want each request to have an associated ID, pulling it from
the X-Request-ID
HTTP header if present, and generating it if not.
We might implement the context functions like this:
type key int
const requestIDKey key = 0
func newContextWithRequestID(ctx context.Context, req *http.Request) context.Context {
reqID := req.Header.Get("X-Request-ID")
if reqID == "" {
reqID = generateRandomID()
}
return context.WithValue(ctx, requestIDKey, reqID)
}
func requestIDFromContext(ctx context.Context) string {
return ctx.Value(requestIDKey).(string)
}
We can implement middleware that derives a new context with a request ID, creates a new Request value from it, and passes it on to the next handler in the chain.
func middleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
ctx := newContextWithRequestID(req.Context(), req)
next.ServeHTTP(rw, req.WithContext(ctx))
})
}
The final handler and any middleware lower in the chain have access to all the request-scoped data previously set by middleware above it.
func handler(rw http.ResponseWriter, req *http.Request) {
reqID := requestIDFromContext(req.Context())
fmt.Fprintf(rw, "Hello request ID %v\n", reqID)
}
And that’s it! It’s no longer necessary to implement custom context
handlers, adapters to standard http.Handler
implementations, or
hackily wrap http.ResponseWriter
. Everything you need is in the
standard library, and right there on the *http.Request
type.
At litl we use Docker images to package and deploy our Room for More services, using our Galaxy deployment platform. This week I spent some time looking into how we might reduce the size of our images and speed up container deployments.
Most of our services are in Go, and thanks to the fact that compiled Go binaries are mostly-statically linked by default, it’s possible to create containers with very few files within. It’s surely possible to use these techniques to create tighter containers for other languages that need more runtime support, but for this post I’m only focusing on Go apps.
We built images in a very traditional way, using a base image built on top of Ubuntu with Go 1.4.2 installed. For my examples I’ll use something similar.
Here’s a Dockerfile
:
FROM golang:1.4.2
EXPOSE 1717
RUN go get github.com/joeshaw/qotd
# Don't run network servers as root in Docker
USER nobody
CMD qotd
The golang:1.4.2
base image is built on top of Debian Jessie. Let’s
build this bad boy and see how big it is.
$ docker build -t qotd .
...
Successfully built ae761b93e656
$ docker images qotd
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
qotd latest ae761b93e656 3 minutes ago 520.3 MB
Yikes. Half a gigabyte. Ok, what leads us to a container this size?
$ docker history qotd
IMAGE CREATED BY SIZE
ae761b93e656 /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "qotd"] 0 B
b77d0ca3c501 /bin/sh -c #(nop) USER [nobody] 0 B
a4b2a01d3e42 /bin/sh -c go get github.com/joeshaw/qotd 3.021 MB
c24802660bfa /bin/sh -c #(nop) EXPOSE 1717/tcp 0 B
124e2127157f /bin/sh -c #(nop) COPY file:56695ddefe9b0bd83 2.481 kB
69c177f0c117 /bin/sh -c #(nop) WORKDIR /go 0 B
141b650c3281 /bin/sh -c #(nop) ENV PATH=/go/bin:/usr/src/g 0 B
8fb45e60e014 /bin/sh -c #(nop) ENV GOPATH=/go 0 B
63e9d2557cd7 /bin/sh -c mkdir -p /go/src /go/bin && chmod 0 B
b279b4aae826 /bin/sh -c #(nop) ENV PATH=/usr/src/go/bin:/u 0 B
d86979befb72 /bin/sh -c cd /usr/src/go/src && ./make.bash 97.4 MB
8ddc08289e1a /bin/sh -c curl -sSL https://golang.org/dl/go 39.69 MB
8d38711ccc0d /bin/sh -c #(nop) ENV GOLANG_VERSION=1.4.2 0 B
0f5121dd42a6 /bin/sh -c apt-get update && apt-get install 88.32 MB
607e965985c1 /bin/sh -c apt-get update && apt-get install 122.3 MB
1ff9f26f09fb /bin/sh -c apt-get update && apt-get install 44.36 MB
9a61b6b1315e /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
902b87aaaec9 /bin/sh -c #(nop) ADD file:e1dd18493a216ecd0c 125.2 MB
This is not a very lean container, with a lot of intermediate layers. To reduce the size of our containers, we did two additional steps:
(1) Every repo has a clean.sh
script that is run inside the
container after it is initially built. Here’s part of a script for
one of our Ubuntu-based Go images:
apt-get purge -y software-properties-common byobu curl git htop man unzip vim \
python-dev python-pip python-virtualenv python-dev python-pip python-virtualenv \
python2.7 python2.7 libpython2.7-stdlib:amd64 libpython2.7-minimal:amd64 \
libgcc-4.8-dev:amd64 cpp-4.8 libruby1.9.1 perl-modules vim-runtime \
vim-common vim-tiny libpython3.4-stdlib:amd64 python3.4-minimal xkb-data \
xml-core libx11-data fonts-dejavu-core groff-base eject python3 locales \
python-software-properties supervisor git-core make wget cmake gcc bzr mercurial \
libglib2.0-0:amd64 libxml2:amd64
apt-get clean autoclean
apt-get autoremove -y
rm -rf /usr/local/go
rm -rf /usr/local/go1.*.linux-amd64.tar.gz
rm -rf /var/lib/{apt,dpkg,cache,log}/
rm -rf /var/{cache,log}
(2) We run Jason Wilder’s excellent
docker-squash
tool. It is especially helpful when combined with the clean.sh
script above.
These steps are time intensive. Cleaning and squashing take minutes and dominate the overall build and deploy time.
In the end, we have built a mostly-statically linked Go binary sitting alongside an entire Debian or Ubuntu operating system. We can do better.
There have been a handful of good blog posts about how to do this in the past, including one by Atlassian this week. Here’s another one from Xebia, and another from Codeship.
However, all these posts focus on building a completely static Go
binary. This means you eschew cgo
by setting CGO_ENABLED=0
and
the benefits that go along with it. On OS X, you lose access to the
system’s SSL root CA certificates. On Linux, user.Current()
from
the os/user
package no longer works. And in both cases you must use
the Go DNS resolver rather than the one provided by the operating
system. If you are not testing your application with CGO_ENABLED=0
prior to building a Docker container with it then you are not testing
the code you ship.
We can use a few purpose-built base Docker images and the tricks from Jamie McCrindle’s Dockerception to build two separate Docker containers: one larger container to build our software and another smaller one to run it.
We create a Dockerfile.build
, which is responsible for initializing
the build environment and building the software:
FROM golang:1.4.2
RUN go get github.com/joeshaw/qotd
COPY Dockerfile.run /
# This command outputs a tarball which can be piped into
# `docker build -f Dockerfile.run -`
CMD tar -cf - -C / Dockerfile.run -C $GOPATH/bin qotd
This container, when run, will output a tarball to standard out,
containing only our qotd
binary and Dockerfile.run
, used to build
the runner.
Notice that we did not set CGO_ENABLED=0
here, so our binary is
still dynamically linked against GNU libc
:
$ ldd $GOPATH/bin/qotd
linux-vdso.so.1 (0x00007ffea6b8a000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6e76e50000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6e76aa7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6e7706d000)
We need to run this binary in an environment that has glibc
available to us. That means we cannot use stock BusyBox (which uses
uClibc
) or Alpine (which uses musl
). However, the BusyBox
distribution that ships with Ubuntu is linked against glibc
, and
that’ll be the foundation for our running container.
The busybox:ubuntu-14.04
image only has a root user, but you should
never run network-facing servers as root, even in a container. Use
my joeshaw/busybox-nonroot
image
— which adds a nobody
user with UID 1 — instead.
Now we create a Dockerfile.run
, which is responsible for creating
the environment in which to run our app:
FROM joeshaw/busybox-nonroot
EXPOSE 1717
COPY qotd /bin/qotd
USER nobody
CMD qotd
First, create the builder image:
docker build -t qotd-builder -f Dockerfile.build .
Next, run the builder container, piping its output into the creation of the runner container:
docker run --rm qotd-builder | docker build -t qotd -f Dockerfile.run -
Now we have a qotd
container which has the basic BusyBox
environment, plus our qotd
binary. The size?
$ docker images qotd
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
qotd latest 92e7def8f105 3 minutes ago 8.611 MB
Under 9 MB. Much improved. Better still, it doesn’t require squashing, which saves us a lot of time.
In this example, we were able to go from a 500 MB image built from
golang:1.4.2
and containing a whole Debian installation down to a 9
MB image of just BusyBox and our binary. That’s a 98% reduction in
size.
For one of our real services at litl, we reduced the image size from
300 MB (squashed) to 25 MB and the time to build and deploy the
container from 8 minutes to 2. That time is now dominated by building
the container and software, and not by cleaning and squashing the
resulting image. We didn’t have to give up on using cgo
and
glibc
, as some of its features are essential to us. If you’re using
Docker to deploy services written in Go, this approach can save you a
lot of time and disk space. Good luck!
Update: Go 1.7 moved the context package to the standard
library and uses it in the net/http *http.Request type. The
background info here may still be helpful, but I wrote a follow-up
post that revisits things for Go 1.7 and beyond.
The golang.org/x/net/context package (hereafter referred to as
net/context, although it’s not yet in the standard library) is a
wonderful tool for the Go programmer’s toolkit. The blog post that
introduced it shows how useful it is
when dealing with external services and the need to cancel requests,
set deadlines, and send along request-scoped key/value data.
The request-scoped key/value data also makes it very appealing as a
means of passing data around through middleware and handlers in Go web
servers. Most Go web frameworks have their own concept of context,
although none yet use net/context
directly.
Questions about using net/context
for this kind of server-side
context keep popping up on the /r/golang
subreddit
and the Gopher’s Slack
community.
Having recently ported a fairly large API surface from Martini to
http.ServeMux
and net/context
, I hope this post can answer those
questions.
http.Handler
The basic unit in Go’s HTTP server is its http.Handler
interface, which is defined as:
type Handler interface {
ServeHTTP(ResponseWriter, *Request)
}
http.ResponseWriter
is another simple
interface and
http.Request
is a struct
that contains data corresponding to the
HTTP request, things like
URL, headers, body if any, etc.
Notably, there’s no way to pass anything like a context.Context
here.
context.Context
Much more detail about contexts can be found in the introductory blog post, but the main aspect I want to call attention to in this post is that contexts are derived from other contexts. Context values become arranged as a tree, and you only have access to values set on your context or one of its ancestor nodes.
For example, let’s take context.Background()
as the root of the
tree, and derive a new context by attaching the content of the
X-Request-ID
HTTP header.
type key int
const requestIDKey key = 0
func newContextWithRequestID(ctx context.Context, req *http.Request) context.Context {
return context.WithValue(ctx, requestIDKey, req.Header.Get("X-Request-ID"))
}
func requestIDFromContext(ctx context.Context) string {
return ctx.Value(requestIDKey).(string)
}
ctx := context.Background()
ctx = newContextWithRequestID(ctx, req)
This derived context is the one we would then pass to the next layer of the system. Perhaps that would create its own contexts with values, deadlines, or timeouts, or it could extract values we previously stored.
So, without direct support for net/context
in the standard library,
we have to find another way to get a context.Context
into our
handlers.
There are three basic approaches:
1. A global map of requests to contexts
2. An http.ResponseWriter wrapper struct
3. A custom context-aware handler type
Let’s examine each.
In this approach we create a global map of requests to contexts, and
wrap our handlers in a middleware that handles the lifetime of the
context associated with a request. This is the approach taken by
Gorilla’s context
package, although with its
own context type rather than net/context
.
Because every HTTP request is processed in its own goroutine, and Go’s
maps are not safe for concurrent access (a deliberate performance
tradeoff), it is crucial that we protect all map accesses with a
sync.Mutex. This
also introduces lock contention among concurrently processed requests.
Depending on your application and workload, this could become a
bottleneck.
This approach works well for Gorilla’s context because its context value is simply a map of key/value pairs. A net/context context, however, is arranged as a tree, and it’s important that the map always hold a reference to the leaf context. This places a burden on the programmer to manually update the pointer’s value as new contexts are derived.
An example usage might look like this:
var cmap = map[*http.Request]*context.Context{}
var cmapLock sync.Mutex
// Note that we are returning a pointer to the context, not the
// context itself.
func contextFromRequest(req *http.Request) *context.Context {
cmapLock.Lock()
defer cmapLock.Unlock()
return cmap[req]
}
// Necessary wrapper around all handlers. Must be the first middleware.
func contextHandler(ctx context.Context, h http.Handler) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
ctx2 := ctx // make a copy of the root context reference
cmapLock.Lock()
cmap[req] = &ctx2
cmapLock.Unlock()
h.ServeHTTP(rw, req)
cmapLock.Lock()
delete(cmap, req)
cmapLock.Unlock()
})
}
func middleware(h http.Handler) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
ctxp := contextFromRequest(req)
*ctxp = newContextWithRequestID(*ctxp, req)
h.ServeHTTP(rw, req)
})
}
func handler(rw http.ResponseWriter, req *http.Request) {
ctxp := contextFromRequest(req)
reqID := requestIDFromContext(*ctxp)
fmt.Fprintf(rw, "Hello request ID %s\n", reqID)
}
func main() {
h := contextHandler(context.Background(), middleware(http.HandlerFunc(handler)))
http.ListenAndServe(":8080", h)
}
Dereferencing the context pointer and updating it by hand is ugly, tedious and error-prone, which is why I don’t recommend this approach.
(Note that we store a pointer to the context.Context here. It’s not
necessary, but if you don’t use a pointer you must modify the
underlying map any time you derive a new context. Doing so greatly
increases the lock contention problem, because you must now lock
around the map any time you update the context for a request.)
http.ResponseWriter wrapper struct
In this approach we create a new struct type that embeds an existing
http.ResponseWriter
and attaches additional functionality to it.
This approach is often used by Go web frameworks to do things like
capturing the status code for the purpose of logging it later. Like
the first approach, you’ll need to wrap handlers in a middleware that
wraps the http.ResponseWriter
and passes it into subsequent
middleware and your handler.
type contextResponseWriter struct {
http.ResponseWriter
ctx context.Context
}
func contextHandler(ctx context.Context, h http.Handler) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
crw := &contextResponseWriter{rw, ctx}
h.ServeHTTP(crw, req)
})
}
func middleware(h http.Handler) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, req *http.Request) {
crw := rw.(*contextResponseWriter)
crw.ctx = newContextWithRequestID(crw.ctx, req)
h.ServeHTTP(rw, req)
})
}
func handler(rw http.ResponseWriter, req *http.Request) {
crw := rw.(*contextResponseWriter)
reqID := requestIDFromContext(crw.ctx)
fmt.Fprintf(rw, "Hello request ID %s\n", reqID)
}
func main() {
h := contextHandler(context.Background(), middleware(http.HandlerFunc(handler)))
http.ListenAndServe(":8080", h)
}
This approach just feels dirty to me. The context is associated with
the request, not the response, so sticking it on http.ResponseWriter
feels out of place. The ResponseWriter
’s purpose is simply to give
handlers a way to write data to the output socket.
Piggybacking on http.ResponseWriter
requires a type assertion to
your wrapper struct type before you can access the context. The
details of this can be hidden away in a safe helper function, but it
doesn’t hide the fact that the runtime assertion is necessary.
There is also another hidden downside. There is a concrete value
(with a type internal to package net/http
) underlying the
http.ResponseWriter
that is passed into your handler. That value
also implements additional interfaces from the net/http
package. If
you simply wrap http.ResponseWriter
, your wrapper will not be
implementing these additional interfaces.
You must implement these interfaces with wrapper functions if you hope
to match the base http.ResponseWriter
’s functionality. In some
cases, like http.Flusher
, this is easy with a simple conditional
type assertion:
func (crw *contextResponseWriter) Flush() {
if f, ok := crw.ResponseWriter.(http.Flusher); ok {
f.Flush()
}
}
However, http.CloseNotifier
is quite a bit harder. Its
definition contains a
method that returns a <-chan bool
. That channel has certain
semantics that existing code likely depends
upon.[1] We have a couple of different options
here, none of them good:
Ignore the interface and don’t implement it, making the
functionality unavailable even if the underlying
http.ResponseWriter
supports it.
Implement the interface and wrap to the underlying implementation.
But what if the underlying http.ResponseWriter
does not support
this interface? We can’t guarantee the proper semantics of the API.
These are just two interfaces that the standard library implements
today. This approach is not future-proof, because additional
interfaces may be added to the standard library and implemented
internally within net/http
.
I don’t recommend this approach because of the interface issue, but if you’re ok with ignoring them, this is probably the simplest to implement.
In this approach, we eschew http.Handler
for a new type of our own
creation. This has obvious downsides: you cannot use existing de
facto middleware or handlers without wrappers. Ultimately, though, I
think this is the cleanest way to pass a context.Context
around.
Let’s create a new ContextHandler
type, following in the model of
http.Handler
. We’ll also create an analog to http.HandlerFunc
.
type ContextHandler interface {
ServeHTTPContext(context.Context, http.ResponseWriter, *http.Request)
}
type ContextHandlerFunc func(context.Context, http.ResponseWriter, *http.Request)
func (h ContextHandlerFunc) ServeHTTPContext(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
h(ctx, rw, req)
}
Middleware can now derive new contexts from the one passed to the handler, and pass them onto the next middleware or handler in the chain.
func middleware(h ContextHandler) ContextHandler {
return ContextHandlerFunc(func(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
ctx = newContextWithRequestID(ctx, req)
h.ServeHTTPContext(ctx, rw, req)
})
}
The final context handler has access to all of the request data set by middleware above it.
func handler(ctx context.Context, rw http.ResponseWriter, req *http.Request) {
reqID := requestIDFromContext(ctx)
fmt.Fprintf(rw, "Hello request ID %s\n", reqID)
}
The last trick is converting our ContextHandler
into something that
is http.Handler
compatible, so we can use it anywhere standard
handlers are used.
type ContextAdapter struct{
ctx context.Context
handler ContextHandler
}
func (ca *ContextAdapter) ServeHTTP(rw http.ResponseWriter, req *http.Request) {
ca.handler.ServeHTTPContext(ca.ctx, rw, req)
}
func main() {
h := &ContextAdapter{
ctx: context.Background(),
handler: middleware(ContextHandlerFunc(handler)),
}
http.ListenAndServe(":8080", h)
}
The ContextAdapter
type also allows us to use existing
http.Handler
middleware, as long as they run before it does.
Existing logging, panic recovery, and form validation middleware
should all continue to work great with our context handlers plus our
adapter.
This is my preferred method for integrating net/context
with my
server. I recently converted an approximately 30-route server from
Martini to this method, and things are working great. The code is
much cleaner, easier to follow, and performs better. This API service
does both HTTP basic and OAuth authentication, passing along client
and user information via contexts. It extracts request IDs that are
passed across to other services via contexts. Context-aware
middleware handles setting CORS headers, handling OPTIONS
requests,
recovering from panics by returning JSON-encoded errors, logging
request and response info, and recording statsd metrics.
Give it a try and let me know on Twitter how it goes. Maybe it’ll become the foundation for the next Go web framework. 😀
[1] “CloseNotify returns a channel that receives a single value when the client connection has gone away.”
Using GitHub means you’ll need to use Git, and that means using the
command-line. This post gives a gentle introduction using the git
command-line tool and a companion tool for GitHub called
hub
.
The basic workflow for contributing to a project on GitHub is:
$ hub clone pydata/pandas
(Equivalent to git clone https://github.com/pydata/pandas.git
)
This clones the project from the server onto your local machine. When
working in git you make changes to your local copy of the repository.
Git has a concept of remotes which are, well, remote copies of the
repository. When you clone a new project, a remote called origin
is
automatically created that points to the repository you provide in the
command line above. In this case, pydata/pandas
on GitHub.
To upload your changes back to the main repository, you push to the remote. Between when you cloned and now, changes may have been made to the upstream remote repository. To get those changes, you pull from the remote.
At this point you will have a pandas
directory on your machine. All
of the remaining steps take place inside it, so change into it now:
$ cd pandas
The easiest way to do this is with hub
.
$ hub fork
This does a couple of things. It creates a fork of pandas in your
GitHub account. It establishes a new remote in your local repository
named after your GitHub username. In my case I now have two
remotes: origin
, which points to the main upstream repository; and
joeshaw
, which points to my forked repository. We’ll be pushing to
my fork.
This creates a place to do your work in that is separate from the main code.
$ git checkout -b doc-work
doc-work
is what I’m choosing to name this branch. You can name it
whatever you like. Hyphens are idiomatic.
Now make whatever changes you want for this project.
If you are creating new files, you will need to explicitly add them to the to-be-committed list (also called the index, or staging area):
$ git add file1.md file2.md etc
If you are just editing existing files, you can add them all in one batch:
$ git add -u
Next you need to commit the changes.
$ git commit
This will bring up an editor where you type in your commit message. The convention is usually to type a short summary in the first line (50-60 characters max), then a blank line, then additional details if necessary.
Ok, remember that your fork is a remote named after your GitHub
username. In my case, joeshaw
.
$ git push joeshaw doc-work
This pushes to the joeshaw
remote only the doc-work
branch. Now
your work is publicly visible to anyone on your fork.
You can do this either on the web site or using the hub
tool.
$ hub pull-request
This will open your editor again. If you only had one commit on your branch, the message for the pull request will be the same as the commit. This might be good enough, but you might want to elaborate on the purpose of the pull request. Like commits, the first line is a summary of the pull request and the other lines are the body of the PR.
In general you will be requesting a pull from your current branch (in
this case doc-work
) into the master
branch of the origin
remote.
If your pull request is accepted as-is, the maintainer will merge it into the official upstream sources. Congratulations! You’ve just made your first open source contribution on GitHub.
(This was adapted from a post I made to the Central Ohio Python User Group mailing list.)
I recently started using Vagrant to test our auto-provisioning of servers with Puppet. Having a simple-yet-configurable system for starting up and accessing headless virtual machines really makes this a much simpler solution than VMware Fusion. (Although I wish Vagrant had a way to take and rollback VM snapshots.)
Unfortunately, as soon as I tried to really do anything in the VM my
Mac would completely bog down. Eventually the entire UI would stop
updating. In Activity Monitor, the dreaded kernel_task was taking
100% of one CPU, and VBoxHeadless taking most of another. Things
would eventually free up whenever the task in the VM (usually apt-get install
or puppet apply
) would crash with a segmentation fault.
Digging into this, I found an ominous message in the VirtualBox logs:
AIOMgr: Host limits number of active IO requests to 16. Expect a performance impact.
Yeah, no kidding. I tracked this message down to the “Use host I/O cache” setting being off on the SATA Controller in the box. (This is a per-VM setting, and I am using the stock Vagrant “lucid64” box, so the exact setting may be somewhere else for you. It’s probably a good idea to turn this setting on for all storage controllers.)
When it comes to Vagrant VMs, this setting in the VirtualBox UI is not very helpful, though, because Vagrant brings up new VMs automatically and without any UI. To get this to work with the Vagrant workflow, you have to do the following hacky steps:
1. vagrant up a new VM
2. vagrant halt the VM
3. Turn on “Use host I/O cache” for the VM’s storage controller in the VirtualBox UI
4. vagrant up again
This is not going to work if you have to bring up new VMs often.
Fortunately this setting is easy to tweak in the base box. Open up
~/.vagrant.d/boxes/base/box.ovf
and find the StorageController
node.
You’ll see an attribute HostIOCache="false"
. Change that value to
true
.
Lastly, you’ll have to update the SHA1 hash of the .ovf
file in
~/.vagrant.d/boxes/base/box.mf
. Get the new hash by running
openssl dgst -sha1 ~/.vagrant.d/boxes/base/box.ovf
and replace the
old value in box.mf
with it.
That’s it. All subsequent VMs you create with vagrant up
will now
have the right setting.
Thanks to this comment on a Vagrant bug report you can enable the host cache more simply from the command-line for an existing VM:
VBoxManage storagectl <vm> --name <controllername> --hostiocache on
Where <vm>
is your vagrant VM name, which you can get from:
VBoxManage list vms
and <controllername>
is probably "SATA Controller"
.
The VM must be halted for this to work.
You can add a section to your Vagrantfile
to do this when new VMs are created:
config.vm.provider "virtualbox" do |v|
v.customize [
"storagectl", :id,
"--name", "SATA Controller",
"--hostiocache", "on"
]
end
And for further reading, here is the relevant section in the Virtualbox manual that goes into more detail about the pros and cons of host IO caching.
In the Linux kernel’s input system, there are two pieces: the device
driver and the event driver. The device driver talks to the
hardware, obviously. Today, for most USB devices this is handled by
the usbhid driver. The event drivers handle how to expose the
events generated by the device driver to userspace. Today this is
primarily done through evdev, which creates character devices
(typically named /dev/input/eventN
) and communicates with them
through struct input_event messages. See
include/linux/input.h
for its definition.
A great tool to use for getting information about evdev devices and events is evtest.
A somewhat outdated but still relevant description of the kernel input
system can be found in the kernel’s Documentation/input/input.txt file.
When a device is connected, the kernel creates an entry in sysfs for
it and generates a hotplug event. That hotplug event is processed by
udev, which applies some policy, attaches additional properties to the
device, and ultimately creates a device node for you somewhere in /dev.
For input devices, the rules in
/lib/udev/rules.d/60-persistent-input.rules
are executed. Among other things, they run the /lib/udev/input_id tool, which queries the capabilities of the device from its sysfs node and sets environment variables like ID_INPUT_KEYBOARD, ID_INPUT_TOUCHPAD, etc. in the udev database.
For more information on input_id, see the original announcement email to the hotplug list.
X has a udev config backend which queries udev for the various input
devices. It does this at startup and also watches for hotplugged
devices. X looks at the different ID_INPUT_* properties to determine whether it’s a keyboard, a mouse, a touchpad, a joystick, or some other device. This information can be used in /usr/lib/X11/xorg.conf.d files in the form of MatchIsPointer, MatchIsTouchpad, MatchIsJoystick, etc. in InputClass sections to decide whether to apply configuration to a given device.
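For example, a hypothetical InputClass snippet might match touchpads only; the option shown here is the synaptics driver’s two-finger-scrolling toggle, used purely as an illustration:

```
Section "InputClass"
    Identifier "touchpad settings"
    MatchIsTouchpad "on"
    Driver "synaptics"
    Option "VertTwoFingerScroll" "1"
EndSection
```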
Xorg has a handful of its own drivers to handle input devices, including evdev, synaptics, and joystick. And here is where things start to get confusing.
Linux has this great generic event interface in evdev, which means that very few drivers are needed to interact with hardware, since they’re not speaking device-specific protocols. Of the few needed on Linux nearly all of them speak evdev, including the three I listed above.
The evdev driver provides basic keyboard and mouse functionality, speaking – obviously – evdev through the /dev/input/eventN devices. It also handles things like the lid and power switches. This is the basic, generic input driver for Xorg on Linux.
The synaptics driver is the most confusing of all. It also speaks evdev to the kernel. On Linux it does not talk to the hardware directly, and is in no way Synaptics(tm) hardware-specific. The synaptics driver is simply a separate driver from evdev which adds a lot of features expected of touchpad hardware, for example two-finger scrolling. It should probably be renamed the “touchpad” module, except that on non-Linux OSes it can still speak the Synaptics protocol.
The joystick driver similarly handles joysticky things, but speaks evdev to the kernel rather than some device-specific protocol.
X only has concepts of keyboards and pointers, the latter of which includes mice, touchpads, joysticks, wacom tablets, etc. X also has the concept of the core keyboard and pointer, which is how events are most often delivered to applications. By default all devices send core events, but certain setups might want to make devices non-core.
If you want to receive events for non-core devices, you need to use
the XInput or XInput2 extensions for that. XInput exposes core-like
events (like DeviceMotionNotify and DeviceButtonPress), so it is not a major difficulty to use, although its setup is annoyingly different from most other X extensions. I have not used XInput2.
Peter Hutterer’s blog is an excellent resource for all things input related in X.
I have a Canon HF200 HD video camera, which records to AVCHD format. AVCHD is H.264 encoded video and AC-3 encoded audio in an MPEG-2 Transport Stream (m2ts, mts) container. This format is not supported by Aperture 3, which I use to store my video.
With Blizzard’s help, I figured out an ffmpeg command-line to convert to H.264 encoded video and AAC encoded audio in an MPEG-4 (mp4) container. This is supported by Aperture 3 and other Quicktime apps.
$ ffmpeg -sameq -ab 256k -i input-file.m2ts -s hd1080 output-file.mp4 -acodec aac
Command-line order is important, which is infuriating. If you move the -s or -ab arguments, they may not work. Add -deinterlace if the source videos are interlaced, which mine were originally until I turned it off. The only downside to this is that it generates huge output files, on the order of 4-5x greater than the input file.
Update, 28 April 2010: Alexander Wauck emailed me to say that re-encoding the video isn’t necessary, and that the existing H.264 video could be moved from the m2ts container to the mp4 container with a command-line like this:
$ ffmpeg -i input-file.m2ts -ab 256k -vcodec copy -acodec aac output-file.mp4
And he’s right… as long as you don’t need to deinterlace the video.
With the whatever-random-ffmpeg-trunk checkout I have, adding -deinterlace to the command-line segfaults. I actually had tried -vcodec copy early in my experiments but abandoned it after I found that it didn’t deinterlace. I had forgotten to try it again after I moved past my older interlaced videos. Thanks Alex!
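If you have a pile of clips to convert, the remux variant is easy to wrap in a script. This is just a sketch of my own — build_remux_cmd is a made-up helper, and it assumes ffmpeg is on your PATH and accepts the flags exactly as shown above:

```python
import subprocess
import sys
from pathlib import Path

def build_remux_cmd(src):
    """Build the ffmpeg argv for the copy-the-video remux described above."""
    src = Path(src)
    dst = src.with_suffix(".mp4")
    return ["ffmpeg", "-i", str(src), "-ab", "256k",
            "-vcodec", "copy", "-acodec", "aac", str(dst)]

if __name__ == "__main__":
    # Remux every .m2ts file named on the command line.
    for name in sys.argv[1:]:
        subprocess.check_call(build_remux_cmd(name))
```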
I decided to hack together a mashup of this data with Google Maps, to see how easy it would be. In the end it took me a few hours on Saturday to get the site up and running, and a couple more on Sunday adding features like the drawing of routes on the map, colorizing markers for inbound vs. outbound buses, and adding reverse geocoding of the buses themselves.
To do this I used three technologies (Google App Engine, JQuery, Google Maps) and two data sources (the real-time XML feed and the MBTA Google Transit Feed Specification files).
App Engine is so perfectly suited for smaller, playtime hacks like this that it’s hard to imagine how anyone got anything done before it existed. The tedious, up-front bootstrapping that is required in so many programming projects has been enough to completely turn me off to small, spare-time hacking projects on occasion in the past. The brilliance behind a hosted software environment is obvious, but the amount of work to build a safe, hosted system with a fairly comprehensive set of APIs seems to be such a mountain of work that in many ways I find it surprising that anyone – even, perhaps especially, Google – built it at all.
I chose the Python SDK and the programming was straightforward and easy. It takes some elements from Django, with which I am familiar from work.
JQuery was a no-brainer: hands down the best JavaScript toolkit available. Making the AJAX calls to get route and vehicle location information was a breeze, and the transparent handling of the XML data of the real-time feed prevents me from losing the will to live – a common feeling when dealing with XML.
My only complaint is with the documentation. While the API reference is good for any given piece of the API, the examples are a little light and there is absolutely zero cross-referencing to other parts, especially ones not a part of JQuery itself. It was not obvious, for example, how to deal with the XML document returned by the AJAX call. It sounds like the docs are getting some work, though, so this will hopefully improve.
This was my first endeavor with the Maps API, and it’s good. It’s not the best API in the world, but it’s hardly the worst either. Adding markers of different colors is annoying, but not so onerous as to make it tedious. The breadth of functionality provided is impressive, but then again it has been around for a few years at this point. Markers are easy to add, drawing the route map is absolutely trivial with a KML file, and even the reverse geocoding – which gives you a street address given a latitude/longitude pair – is straightforward.
The docs suck, though. There’s no indication that a size or anchor position is required when creating an icon for a custom marker – required for colors other than red – and due to the minified JS files tracking down that error took longer than any other task in the project. Reverse geocoding mentions that a Placemark object will be returned, but that class doesn’t appear anywhere in the reference documentation.
The real-time feed itself has lots to like. Straightforward, easy to parse. It’d be nice if I didn’t have to do the reverse geocoding to figure out what the street address is, but it’s not a dealbreaker. Main downside is that it’s XML as opposed to JSON. And of course, it’s only 5 bus routes and zero subway and commuter rail routes.
The MBTA Google Transit Feed Specification files are a comprehensive set of data describing every route and every stop in the MBTA system. An impressive set of data encoded in a format designed for Google Transit. There is a set of example tools to view and manipulate this data, and one of those translates this data into a KML file for use with Google Maps. I should have tweaked the tools to output only the KML for the routes I cared about, but I did this by hand instead… not a big deal for only 5 bus lines. These KML files are fed into the Google Maps API to display the route as a blue line on the map when selected.
This is what a lot of programming is like now, for better and for worse.
On the one hand it is the perfect example of high-level component-oriented programming. Data is formatted in easily parseable interchange formats and plugged into well-defined interfaces. These interfaces plug into other interfaces. The result is a zoomable, pannable map with real-time bus location information that updates every 15 seconds. The lines-of-code count is around 100 including both Python and JavaScript. With a few hours work, I built something modestly useful out of nothing. I stand on the shoulders of giants.
On the other hand I didn’t really build anything. This is just assembly line programming. It was not a particularly creative endeavor, and it wasn’t challenging intellectually. Anybody could have done it. It’s cool, but there is little sense of accomplishment in the end product. It feels a little hollow.
Which is not to say that I didn’t enjoy it, or that it wasn’t worth the effort. I learned new technology, I played with software and data that I hadn’t had the opportunity to before. I broadened my horizons, however slightly. And it got me to write this blog post.
The other day at work we encountered an
unusual exception in our nightly pounder test run after landing some
new code to expose some internal state via a monitoring API. The
problem occurred on shutdown. The new monitoring code was trying to
log some information, but was encountering an exception. Our logging
code was built on top of Python’s logging module, and
we thought perhaps that something was shutting down the logging system
without us knowing. We ourselves never explicitly shut it down, since
we wanted it to live until the process exited.
The monitoring was done inside a daemon thread. The Python docs say only:
A thread can be flagged as a “daemon thread”. The significance of this flag is that the entire Python program exits when only daemon threads are left.
Which sounds pretty good, right? This thread is just occasionally grabbing some data, and we don’t need to do anything special when the program shuts down. Yeah, I remember when I used to believe in things too.
Despite a global interpreter lock that prevents Python from being
truly concurrent anyway, there is a very real possibility that the
daemon threads can still execute after the Python runtime has started
its own tear-down process. One step of this process appears to be to set the values inside globals() to None, meaning that any module resolution results in an AttributeError from attempting to dereference NoneType. Other variations on this cause a TypeError to be thrown.
The code which triggered this looked something like this, although with more abstraction layers which made hunting it down a little harder:
try:
    log.info("Some thread started!")
    try:
        do_something_every_so_often_in_a_loop_and_sleep()
    except somemodule.SomeException:
        pass
    else:
        pass
finally:
    log.info("Some thread exiting!")
The exception we were seeing was an AttributeError on the last line, the log.info() call. But that wasn’t even the original exception. It was actually another AttributeError caused by the somemodule.SomeException dereference. Because all the modules had been reset, somemodule was None too.
Unfortunately the docs are completely devoid of this information, at least in the threading sections which you would actually reference. The best information I was able to find was this email to python-list a few years back, and a few other emails which don’t really put the issue front and center.
In the end the solution for us was simply to make them non-daemon threads, notice when the app is being shut down, and join them to the main thread. Another possibility for us was to catch AttributeError in our thread wrapper class – which is what the author of the aforementioned email does – but that seems like papering over a real bug and a real error. Because of this misbehavior, daemon threads lose almost all of their appeal, but oddly I can’t find people really publicly saying “don’t use them” except in scattered emails. It seems like it’s underground information known only to the Python cabal. (There is no cabal.)
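The non-daemon approach can be sketched in a few lines: a worker loop driven by a threading.Event, signalled and joined at shutdown so it can never outlive the interpreter’s teardown. The names here are mine, not from our codebase:

```python
import threading
import time

stop = threading.Event()
results = []

def monitor():
    """Poll until asked to stop; safe because we join before interpreter teardown."""
    while not stop.is_set():
        results.append("sample")   # stand-in for grabbing monitoring data
        stop.wait(0.01)            # sleep, but wake immediately on shutdown

t = threading.Thread(target=monitor)  # daemon=False is the default
t.start()
time.sleep(0.05)

# At shutdown: signal the worker and join it to the main thread, so it
# cannot run after the runtime starts tearing modules down.
stop.set()
t.join()
```

Using stop.wait() for the sleep means shutdown doesn’t have to wait out a full polling interval.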
So, I am going to say it. When I went searching there weren’t any helpful hints in a Google search of “python daemon threads considered harmful”. So, I am staking claim to that phrase. People of The Future: You’re welcome.