Fixing Unreadable QUIC-go Panic Messages

by SLV Team

Hey guys! Ever stumble upon a cryptic error message that leaves you scratching your head? I recently ran into a doozy while working with quic-go, and I'm sharing the experience to hopefully save you some debugging time. Specifically, I hit an unreadable panic message when I passed a value too large to encode as a varint. Let's dive in and see how to make messages like this more helpful.

The Problem: An Obscure Panic

I was tweaking some error codes in my application when BAM! I got hit with this: panic: (struct { message string; num uint64 }) 0x140002a4150. Not exactly the most informative message, right? To make it worse, the panic was intermittent: it only showed up when my application actually raised the error in question, so it felt like a ghost in the machine. After some digging through the stack trace, I traced it back to a panic in the quicvarint package of quic-go, in varint.go.

The heart of the problem lies in how quic-go handles variable-length integers (varints). Varints are a clever way to encode numbers, using fewer bytes for smaller values and more bytes for larger ones. But there's a limit: a QUIC varint is 1, 2, 4, or 8 bytes long and spends two bits on the length, so the value itself gets at most 62 bits. If you try to jam in anything bigger, things go south. In this case, the panic was supposed to tell me that my value didn't fit within the 62-bit limit, but instead, I got a type name followed by a useless hex address. This is the pain of unreadable panic messages!
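To make the size classes concrete, here is a minimal sketch of the length rule from RFC 9000, section 16. The names maxVarint and varintLen are mine, chosen for illustration; this is not quic-go's code, just the same arithmetic written out.

```go
package main

import "fmt"

// maxVarint is the largest value a QUIC varint can carry: 2^62 - 1.
const maxVarint = 1<<62 - 1

// varintLen (a hypothetical helper) returns how many bytes a QUIC varint
// needs for v, and false if v does not fit into 62 bits.
func varintLen(v uint64) (int, bool) {
	switch {
	case v <= 1<<6-1: // 6 usable bits in a 1-byte varint
		return 1, true
	case v <= 1<<14-1: // 14 usable bits in a 2-byte varint
		return 2, true
	case v <= 1<<30-1: // 30 usable bits in a 4-byte varint
		return 4, true
	case v <= maxVarint: // 62 usable bits in an 8-byte varint
		return 8, true
	default:
		return 0, false
	}
}

func main() {
	for _, v := range []uint64{37, 15293, 494878333, maxVarint, 1 << 62, 1 << 63} {
		n, ok := varintLen(v)
		fmt.Printf("%#x: bytes=%d fits=%t\n", v, n, ok)
	}
}
```

The last two inputs are the interesting ones: 1<<62 is the first value that no longer fits, and 1<<63 is the value used in the reproducer further down.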

The Root Cause: Exceeding the Varint Limit

My application's error codes were the culprits. I was assigning random 64-bit values to my error constants, so each one had a 3-in-4 chance of having at least one of its top two bits (bit 63 or 64, counting from 1) set. Any such value is larger than 2^62 - 1, the maximum a varint can encode. quic-go uses varints to encode certain values, including error codes, and when a value exceeds the 62-bit limit, the library panics. The trouble is that the panic message isn't helpful: the panic value is a struct, and Go prints only the struct's type and a memory address instead of its fields. The real intent was to convey the message "value doesn't fit into 62 bits". Imagine how much easier my debugging would have been if I'd seen that message right away!

To make things even clearer, let's look at the minimal reproducer code I created:

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/madflojo/testcerts"
	"github.com/quic-go/quic-go"
)

func main() {
	ca := testcerts.NewCA()
	hostCerts, err := ca.NewKeyPair("localhost")
	if err != nil {
		panic(err)
	}

	hostTLSConfig, err := hostCerts.ConfigureTLSConfig(ca.GenerateTLSConfig())
	if err != nil {
		panic(err)
	}

	ln, err := quic.ListenAddr("localhost:8888", hostTLSConfig, nil)
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	clientCerts, err := ca.NewKeyPair("localhost")
	if err != nil {
		panic(err)
	}
	clientTLSConfig, err := clientCerts.ConfigureTLSConfig(ca.GenerateTLSConfig())
	if err != nil {
		panic(err)
	}

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	outboundConn, err := quic.DialAddr(ctx, "localhost:8888", clientTLSConfig, nil)
	if err != nil {
		panic(err)
	}

	// Accept the connection on the host.
	if _, err := ln.Accept(ctx); err != nil {
		panic(err)
	}

	s, err := outboundConn.OpenStream()
	if err != nil {
		panic(err)
	}

	if _, err := s.Write([]byte("test")); err != nil {
		panic(err)
	}

	fmt.Println("About to cancel write stream")
	// This value exceeds the varint maximum of 1<<62 - 1,
	// and the resulting panic is unreadable:
	/*
		panic: (struct { message string; num uint64 }) 0x14000223050
	*/
	s.CancelWrite(1 << 63)

	// Ensure the process stays alive long enough for the cancel to be processed.
	time.Sleep(time.Second)
}

This code sets up a basic quic-go connection and then intentionally tries to cancel a stream with a value that's too large. When you run this, you'll see the unhelpful panic message.

The Solution: Keeping Error Codes Within Bounds

The fix is simple: make sure your error codes (or any other values that get encoded as varints) stay within the 62-bit limit, meaning no larger than 2^62 - 1. I adjusted my application's error codes to fit this constraint, which prevented the panic from occurring in the first place. You can do this by declaring in-range constants, masking off the top two bits of generated values, or validating values before handing them to quic-go.
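As a rough sketch of the bitwise approach, the snippet below clears the top two bits of a random 64-bit value and checks the range. The helpers fitsVarint and maskToVarint are hypothetical names of mine, not part of quic-go. Note that masking changes the value, so it only makes sense when you are generating fresh codes, not when an existing value must be preserved.

```go
package main

import (
	"fmt"
	"math/rand"
)

// maxVarint is the largest value that fits in a QUIC varint: 2^62 - 1.
const maxVarint uint64 = 1<<62 - 1

// fitsVarint (hypothetical helper) reports whether code can be encoded as a varint.
func fitsVarint(code uint64) bool {
	return code <= maxVarint
}

// maskToVarint (hypothetical helper) clears the top two bits,
// so any 64-bit input lands in the valid range.
func maskToVarint(code uint64) uint64 {
	return code & maxVarint
}

func main() {
	raw := rand.Uint64() // what I was effectively doing for my error constants
	code := maskToVarint(raw)
	fmt.Printf("raw    = %#x fits=%t\n", raw, fitsVarint(raw))
	fmt.Printf("masked = %#x fits=%t\n", code, fitsVarint(code))
}
```

If your codes come from somewhere you don't control, checking with something like fitsVarint and returning an ordinary error is probably the safer design than silently masking.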

Improving Error Messages: A General Tip

Here's a general tip that can save you a lot of headaches: if the value passed to panic has a String() string method, the Go runtime prints that string instead of the raw type and address. (The same applies to error values, whose Error() method is used.) I did not fix this in the quic-go library, but it's a nice trick to know: define a String() method on your own panic value types and your panic messages become far more informative.
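Here's a small sketch of that trick. The varintPanic type and the encode function are stand-ins I made up, modeled on the anonymous struct in the panic message above; this is not quic-go's actual code.

```go
package main

import "fmt"

// varintPanic is a hypothetical named version of the anonymous
// struct { message string; num uint64 } seen in the panic output.
type varintPanic struct {
	message string
	num     uint64
}

// String gives the panic value a readable form.
func (p varintPanic) String() string {
	return fmt.Sprintf("%s: %#x", p.message, p.num)
}

// encode is a stand-in for a varint encoder; it only performs the range check.
func encode(v uint64) {
	if v > 1<<62-1 {
		panic(varintPanic{"value doesn't fit into 62 bits", v})
	}
}

func main() {
	// The recover is here only so the demo exits cleanly. Without it,
	// the runtime calls String() itself when it prints the panic.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	encode(1 << 63)
}
```

Remove the deferred recover to see the runtime's own output, which should now contain the readable message rather than a type name and an address.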

Conclusion: Making Debugging Easier

Debugging can be a real pain, especially when you're faced with cryptic error messages. By understanding the limits of varints in quic-go and ensuring that your values stay within those limits, you can avoid this particular issue. Also, remember that a well-crafted error message can save you a ton of time and frustration. Let's make our code and the tools we use more user-friendly, guys! I hope this helps you out. Happy coding!