Abandoning `done`

If you’ve been writing Go for awhile, you might find that writing a handy func for checking context’s done-ness. I definitely thought this was clever and it started showing up in a few of my projects.

func done(ctx context.Context) bool {
	select {
	case <-ctx.Done():
		return true
	default:
		return false
	}
}

What’s nice about it, is that it does a non-blocking check on the normally blocking read from the ctx.Done() channel. I used it primarily in loops that would otherwise be doing non-selectable actions.

for !done(ctx) {
	doWork()
}

Without the helper function this loop would look like this:

for {
    select {
	case <-ctx.Done():
		break
	default:
        doWork()
	}
}

Instead of wrapping everything in a select that relied on the default to become non-blocking, I made the loop check the context before proceeding. In my mind, this satisfied all the logic I was concerned with: it exited when the context was canceled, and it was checked every iteration of the loop.

The refactor factor

It’s easy to imagine a scenario when this pattern becomes a foot-gun: the moment you want a second selectable item to be checked - e.g. any channel operations. If and when you need to start working with a channel, and because context is in the mix that likelihood is already high, you now need to remember to refactor. If you forget to refactor it looks like this:

for !done(ctx) {
	msg := <-msgs
	process(msg)
}

This works until ctx is canceled right after the done(ctx) check but before <-msgs unblocks. Now your function is stuck waiting forever. The proper refactor should be:

for {
	select {
	case <-ctx.Done():
		return
	case msg := <-msgs:
		process(msg)
	}
}

So that’s the obvious code-smell problem, diligence doesn’t scale and you’re leaving potential foot-guns around in your code, for future you and your collaborators.

Test early and often

I’ll admit, this problem wasn’t something I appreciated until I spent some more time with code-bases that are well tested (read: not mine). In a test of our !done wielding loop, how do you verify that the loop exited because of context cancellation and not EOF, socket close, or an internal error? You can’t. The cancellation check is hidden. Your test just sees that the function returned.

func readLoop(ctx context.Context, r io.Reader) error {
	buf := make([]byte, 1024)
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
			_, err := r.Read(buf)
			if err != nil {
				return err
			}
		}
	}
}

This updated loop now returns the fact-of context cancellation and our test code will be able to compare the returned err and context.Cancled with errors.Is.

func TestReadLoopReturnsContextError(t *testing.T) {
	pr, _ := io.Pipe()
	ctx, cancel := context.WithCancel(context.Background())

	errCh := make(chan error)
	go func() {
		errCh <- readLoop(ctx, pr)
	}()

	time.Sleep(100 * time.Millisecond)
	cancel()

	select {
	case err := <-errCh:
		if !errors.Is(err, context.Canceled) {
			t.Fatalf("expected context.Canceled, got: %v", err)
		}
	case <-time.After(1 * time.Second):
		t.Fatal("readLoop did not return after context cancel")
	}
}

This test doesn’t just check if the function returned it checks why. That’s the difference. Returning ctx.Err() makes cancellation a real, testable outcome. With done(ctx), you’d never know if it exited due to context or something else.

The lead, unburied

I have to admit that all of this was a meandering path to convince myself that I should stop using this pattern, but some seasoned gophers have been shouting at the screen “ctx.Err() is already a non-blocking done-ness check!” So let’s write the loop in the simple case, where no other selectable items need evaluated and we only need a non-blocking done-ness check.

func run(ctx context.Context) {
	for {
		if ctx.Err() != nil {
			break
		}
		doWork()
	}
}

The done(ctx) pattern may feel like a clever shortcut and in the simplest loops, it kind of is. But once you care about testability, composability, or shared understanding across a team, it starts to work against you. The real tool I was looking for is already in the standard library: ctx.Err() It’s a fast, non-blocking, and idiomatic way to check cancellation. It’s safe to use outside a select, and it won’t block.

Abandoning done

The refactor factor

Test early and often

The lead, unburied

Abandoning `done`