Sudarsan's Blog

A little code generation in Go in 2020.

Perl gives me headaches. But I’m fond of one of Larry Wall’s three virtues: Laziness. This is a story of how sometimes laziness lets you do fun things.

Distributed systems and microservices are as ubiquitous as the coronavirus in 2020. This results in a requirement for a lot of integration. And a nice way to maintain the glue code is to use SDKs rather than have multiple services write the transport glue themselves. Like so…

[Image: microservices]

It was all nice and fun until we had to start writing an SDK for each new microservice.

And each new endpoint.

And each new parameter addition.

Generating these parts would make new additions trivial, keep the SDK codebases homogeneous and eliminate manual errors.

So the SDK has two components. The component that is manually built and the parts that can be generated. Here is a layout of the project.

---
 - client
 - api.go
 - methods.go
 - go.mod
 - go.sum

client is the http.Client embedded with some retry logic etc. Two methods matter here: NewRequest, which creates a wrapped HTTP request, and Send(), which is roughly the equivalent of client.Do().

api.go is a collection of all the methods attached to the interface that the SDK exposes.

methods.go is what we want to generate. It will be a collection of helper implementations that call the API, parse the response and return readable Go structures. Like so:

https://awesomewebsite/v1/getdetails

func (c *client) GetDetails() (*Details, error) {
	op := client.Operation{
		HTTPMethod: http.MethodGet,
		HTTPPath:   endpoint + "/v1/getdetails",
	}

	var response *Details
	req, err := c.cl.NewRequest(op, &response, nil)
	if err != nil {
		// Propagate the error instead of swallowing it.
		return nil, err
	}
	if err := req.Send(); err != nil {
		return nil, err
	}

	return response, nil
}

If you notice, most of the code is tedious housekeeping that gets repeated. So getrandomstuff becomes

https://awesomewebsite/v1/getrandomstuff

func (c *client) GetRandomStuff() (*RandomStuff, error) {
	op := client.Operation{
		HTTPMethod: http.MethodGet,
		HTTPPath:   endpoint + "/v1/getrandomstuff",
	}

	var response *RandomStuff
	req, err := c.cl.NewRequest(op, &response, nil)
	if err != nil {
		// Propagate the error instead of swallowing it.
		return nil, err
	}
	if err := req.Send(); err != nil {
		return nil, err
	}

	return response, nil
}

Step in code generation. What I wanted was a specification: the user adds a piece of configuration, and the generator captures it and translates it into something like the snippets above. Let's define a set of rules to outline this.

type Generator interface {
	Generate(Reader, Writer) error
}

// Reader is a general wrapper on io.Reader to read an input to obtain
// the apis.
type Reader interface {
	ReadAPI(io.Reader) ([]API, error)
}

// Writer is a general wrapper on io.Writer to write the processed file.
type Writer interface {
	WriteAPI(io.Writer, []API) error
}

So, we need something to Read from and somewhere to Write.
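To make the contract concrete, here is a minimal sketch of how the three interfaces could be wired together. The `fileGenerator`, `wordReader` and `commentWriter` types are hypothetical stand-ins that do not appear in the real SDK, and `API` is trimmed to a single field for brevity.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// API is a trimmed stand-in for the data model used by the generator.
type API struct {
	InterfaceName string
}

type Reader interface {
	ReadAPI(io.Reader) ([]API, error)
}

type Writer interface {
	WriteAPI(io.Writer, []API) error
}

// fileGenerator is a hypothetical Generator that owns the input and
// output streams and delegates the real work to a Reader and a Writer.
type fileGenerator struct {
	in  io.Reader
	out io.Writer
}

func (g fileGenerator) Generate(r Reader, w Writer) error {
	apis, err := r.ReadAPI(g.in)
	if err != nil {
		return fmt.Errorf("read api: %w", err)
	}
	if err := w.WriteAPI(g.out, apis); err != nil {
		return fmt.Errorf("write api: %w", err)
	}
	return nil
}

// wordReader is a toy Reader: every whitespace-separated word in the
// input is treated as an interface name.
type wordReader struct{}

func (wordReader) ReadAPI(r io.Reader) ([]API, error) {
	b, err := io.ReadAll(r)
	if err != nil {
		return nil, err
	}
	var apis []API
	for _, name := range strings.Fields(string(b)) {
		apis = append(apis, API{InterfaceName: name})
	}
	return apis, nil
}

// commentWriter is a toy Writer: it emits one comment line per API.
type commentWriter struct{}

func (commentWriter) WriteAPI(w io.Writer, apis []API) error {
	for _, a := range apis {
		if _, err := fmt.Fprintf(w, "// generated for %s\n", a.InterfaceName); err != nil {
			return err
		}
	}
	return nil
}

// run pushes the input through the toy pipeline and returns the output.
func run(input string) (string, error) {
	var out bytes.Buffer
	g := fileGenerator{in: strings.NewReader(input), out: &out}
	err := g.Generate(wordReader{}, commentWriter{})
	return out.String(), err
}

func main() {
	out, err := run("Client")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Keeping the Reader and Writer separate means the input format (Swagger, JSON, a Go interface) can change without touching the code that emits methods.go.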

Reader

Multiple options presented themselves.

It could be an API specification: but nearly everyone hates Swagger. Making developers write YAML with all of Swagger's quirks seemed like a punishment.

Writing Swagger YAML is painful, so there is no example.

It could be a JSON/XML based configuration system: this still requires the developer to work in a different "language/syntax" from the rest of the codebase.

{
    "methods": [
        {
            "name": "GetDetails",
            "endpoint": "v1/getdetails",
            "verb": "Get",
            "inputParams": [],
            "outputParams": [
                {"Details": {}}
            ]
        },
        {
            "name": "GetRandomStuff",
            "endpoint": "v1/getrandomstuff",
            "verb": "Get",
            "inputParams": [],
            "outputParams": [
                {"RandomStuff": {}}
            ]
        }
    ]
}
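For comparison, reading such a configuration in Go is not much work. The sketch below is an assumption about how it could be decoded, not something the SDK actually does; the `sdkConfig` and `methodConfig` types are hypothetical names that mirror the keys in the JSON above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// methodConfig mirrors one entry of the "methods" array. Parameters are
// kept as raw JSON because their shape is not pinned down here.
type methodConfig struct {
	Name         string                       `json:"name"`
	Endpoint     string                       `json:"endpoint"`
	Verb         string                       `json:"verb"`
	InputParams  []map[string]json.RawMessage `json:"inputParams"`
	OutputParams []map[string]json.RawMessage `json:"outputParams"`
}

// sdkConfig is the hypothetical top-level configuration document.
type sdkConfig struct {
	Methods []methodConfig `json:"methods"`
}

// parseConfig decodes the JSON configuration into Go structures.
func parseConfig(data []byte) (sdkConfig, error) {
	var cfg sdkConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return sdkConfig{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

const sample = `{
    "methods": [
        {
            "name": "GetDetails",
            "endpoint": "v1/getdetails",
            "verb": "Get",
            "inputParams": [],
            "outputParams": [{"Details": {}}]
        }
    ]
}`

func main() {
	cfg, err := parseConfig([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, m := range cfg.Methods {
		fmt.Printf("%s %s -> %s\n", m.Verb, m.Endpoint, m.Name)
	}
}
```

The decoding is easy; the cost is that the types named in the configuration live in a second file that the Go compiler never checks.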

It could be an interface also written in Go: the user adds a method to an interface and defines the structures. The downside is that the HTTP endpoint and verb have to be annotated in comments.

type Client interface{
    // Get v1/getdetails
    GetDetails() (*Details, error) 

    // Get v1/getrandomstuff
    GetRandomStuff() (*RandomStuff, error) 
}

type Details struct{}

type RandomStuff struct{}

Out of these three options, the one I preferred was adding to the interface. It seemed intuitive and gave control to a Go developer. It also presented the biggest challenge of the three: reading an AST. We could obviously just read the file as plain text, but that seems very hacky and tightly coupled to a preconceived notion of how we expect the signatures to be written. Granted, we still expect annotations to be written a certain way, but it's much more malleable.

AST - Quick primer

ASTs are very common in compiled languages. An AST is an intermediate representation of code that the compiler can read and optimize. If you think about it, what we are doing is what a compiler would do. The only difference is that we are translating Go code into more Go code. This brings us to…

Compilers - Very very quick primer

Most compilers break down into three primary stages: Parsing, Transformation and Code Generation.

Parsing: Parsing is generally broken down into lexical analysis and syntactic analysis. Lexical analysis splits the source text into tokens, and a lexer is generally responsible for this tokenization process. Tokens are tiny little objects that describe an isolated piece of the syntax. They could be numbers, operators or symbols, amongst other things. Syntactic analysis is what we are interested in. It takes the tokens and builds an intermediate representation called the Abstract Syntax Tree. Luckily for us, Go exposes its own tooling for this as packages: go/ast, go/parser, go/token.
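To see what tokens look like in practice, here is a small sketch that runs the standard library's go/scanner over a one-line snippet. The generator itself never needs to call the scanner directly, since go/parser does that internally; the `tokenize` helper is purely illustrative.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenize returns a printable form of every token in src. Tokens that
// carry a literal (identifiers, numbers, ...) are rendered as KIND:literal.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		if lit != "" {
			out = append(out, tok.String()+":"+lit)
		} else {
			out = append(out, tok.String())
		}
	}
	return out
}

func main() {
	for _, t := range tokenize("x := 1 + 2") {
		fmt.Printf("%q\n", t)
	}
}
```

The first tokens come out as an identifier `x`, the `:=` operator and the integer literal `1`. The parser consumes a stream like this and assembles it into the tree.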

I’m not the biggest fan of the internal package in Go, but code generators seem to be the perfect candidate to put in there.

I created two packages internal/generator/ and internal/cmd to add the generator code.

The parsing part of the AST was already done by Go. All I had to do was to call the already existing functions like so…

	fs := token.NewFileSet()
	p, err := parser.ParseFile(fs, "", r, parser.ParseComments)
	if err != nil {
		return nil, err
	}

Transformation: Transformation is manipulating the AST to make changes to it to enable the final process of…

I wanted to transform the AST into a data model that would make code generation easy. Since I was dealing with APIs and their methods, I laid out a structure like this.

// API is a set of indicatives that we will loop through
// to generate code.
type API struct {
	InterfaceName string
	Methods       []Method
}

// Method represents the metadata-level details that belong
// to an api method.
type Method struct {
	Endpoint     string
	MethodName   string
	Path         string
	Verb         string
	Body         string
	InputParams  []Parameter
	OutputParams []Parameter
}

Now that I had the AST parsed thanks to ParseFile, I had to loop through it to obtain:

  • All the interface declarations.
  • All the methods attached to the interface declarations.
  • The input and output parameters of these methods.
  • The code annotation talking about the http verb and endpoint.

Decls holds all the top-level declarations. We loop through them and type-assert our way down to the interfaces.

...
	// loop through all the top level declarations of the AST.
	for _, decl := range p.Decls {
		// We don't care about function declarations, so let's pick
		// only the generic ones.
		decl, ok := decl.(*ast.GenDecl)
		// Of the generic declarations, we only need to process token.TYPE.
		if !ok || decl.Tok != token.TYPE {
			continue
		}

		var api API
		for _, spec := range decl.Specs {
			//We only want to read through type declarations.
			typeSpec, ok := spec.(*ast.TypeSpec)
			if !ok {
				continue
			}

			api.InterfaceName = typeSpec.Name.String()

			// Again, we are only interested in Interface definitions.
			interfaceType, ok := typeSpec.Type.(*ast.InterfaceType)
			if !ok {
				continue
			}
...

Now that we have the interfaces, we extract the remaining three items from the list above: the methods, their parameters and the annotations.

    func getMethods(fields []*ast.Field) []Method {
    	var methods = make([]Method, 0)
    	for _, method := range fields {
    		// Embedded interfaces have no names; skip them.
    		if len(method.Names) == 0 {
    			continue
    		}
    		// Without a doc comment there is no verb/endpoint annotation.
    		if method.Doc == nil {
    			continue
    		}
    
    		var ignore bool
    		var matches [][]string
    		for _, d := range method.Doc.List {
    			comment := extractComment(d.Text)
    			if comment == "Ignore" {
    				ignore = true
    			} else {
    				matches = verbEndpointRegex.FindAllStringSubmatch(comment, -1)
    			}
    		}
    		// The annotation must yield both a verb and a path.
    		if ignore || len(matches) < 2 {
    			continue
    		}
    
    		fn, ok := method.Type.(*ast.FuncType)
    		if !ok {
    			continue
    		}
    
    		var methodData Method
    		methodData.Verb = matches[0][0]
    		methodData.Path = matches[1][0]
    		methodData.MethodName = method.Names[0].String()
    		methodData.InputParams = getFields(fn.Params.List)
    		// Results is nil for methods that return nothing.
    		if fn.Results != nil {
    			methodData.OutputParams = getFields(fn.Results.List)
    		}
    
    		methods = append(methods, methodData)
    	}
    	return methods
    }
    
    // extractComment strips the leading "//" and any surrounding whitespace.
    func extractComment(comment string) string {
    	return strings.TrimSpace(strings.TrimPrefix(comment, "//"))
    }
    
    func getFields(fields []*ast.Field) []Parameter {
    	params := make([]Parameter, 0)
    	for _, field := range fields {
    		var name string
    		if len(field.Names) > 0 {
    			name = field.Names[0].Name
    		}
    		switch t := field.Type.(type) {
    		case *ast.SelectorExpr:
    			p := Parameter{
    				Name: name,
    				Type: selectorToString(t),
    			}
    			params = append(params, p)
    		case *ast.StarExpr:
    			p := Parameter{
    				Name: name,
    				Type: starToString(t),
    			}
    			params = append(params, p)
    		case *ast.Ident:
    			p := Parameter{
    				Name: name,
    				Type: t.Name,
    			}
    			params = append(params, p)
    		}
    
    	}
    	return params
    }
    
    func selectorToString(t *ast.SelectorExpr) string {
    	var param bytes.Buffer
    	if ident, ok := t.X.(*ast.Ident); ok {
    		param.WriteString(ident.Name)
    	}
    	param.WriteString(".")
    	param.WriteString(t.Sel.Name)
    	return param.String()
    }
    
    func starToString(t *ast.StarExpr) string {
    	var param bytes.Buffer
    	param.WriteString("*")
    	if ident, ok := t.X.(*ast.Ident); ok {
    		param.WriteString(ident.Name)
    	}
    	return param.String()
    }
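The definition of verbEndpointRegex is not shown above. Judging by how getMethods indexes the result (first match is the verb, second match is the path), one definition that would fit is a pattern matching each whitespace-separated word. The sketch below assumes exactly that; both the pattern and the `parseAnnotation` helper are guesses for illustration, not the original code.

```go
package main

import (
	"fmt"
	"regexp"
)

// verbEndpointRegex is an ASSUMED definition: every run of non-space
// characters is one match, so "Get v1/getdetails" yields two matches.
var verbEndpointRegex = regexp.MustCompile(`\S+`)

// parseAnnotation is a hypothetical helper that splits an annotation
// such as "Get v1/getdetails" into its verb and path.
func parseAnnotation(comment string) (verb, path string, ok bool) {
	matches := verbEndpointRegex.FindAllStringSubmatch(comment, -1)
	if len(matches) < 2 {
		return "", "", false
	}
	// Index 0 of each match is the full matched text.
	return matches[0][0], matches[1][0], true
}

func main() {
	verb, path, ok := parseAnnotation("Get v1/getdetails")
	fmt.Println(verb, path, ok)
}
```

A stricter pattern with explicit capture groups for the verb and the path would reject malformed annotations earlier, at the cost of listing the allowed verbs in the regex.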

Code Generation: This is basically generating runnable, optimized code in the format the compiler wants, and for that we have the…

Writer

The writer reads the result of the transformation we applied (the API struct) and uses Go’s interesting template library to inject the values.

The template to achieve our example looks like so:

const fnTmpl = `
		{{$i := separator ", "}}
		{{$o := separator ", "}}
		{{$length := len .OutputParams}}
		func (c *client) {{.MethodName}} ({{range .InputParams}}{{call $i}}{{.Name}} {{.Type}}{{end}}) ( 
		{{range .OutputParams}} {{call $o}} {{.Name}} {{.Type}} {{end}}){
			op := client.Operation{
				HTTPMethod: http.Method{{.Verb}},
				HTTPPath:   endpoint + "/{{.Path}}",
			}

			{{.Body}}

			var response {{(index .OutputParams 0).Type}}
			req, err := c.cl.NewRequest(op, &response, nil)
			if err != nil {
				{{if eq $length 1}} return err {{ else }} return response, err {{ end }} 
			}
			if err := req.Send(); err != nil {
				{{if eq $length 1}} return err {{ else }} return response, err {{ end }} 
			}

			{{if eq $length 1}} return nil {{ else }} return response, nil {{ end }} 
		}
`

func separator(s string) func() string {
	i := -1
	return func() string {
		i++
		if i == 0 {
			return ""
		}
		return s
	}
}
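The separator trick is easier to follow in isolation. Here is a small runnable sketch that renders only a parameter list; the `paramsTmpl` template and `renderParams` helper are illustrative, and the Parameter struct is assumed to carry just Name and Type, as the getFields code suggests.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// Parameter is assumed to have just these two fields.
type Parameter struct {
	Name string
	Type string
}

// separator returns a closure that yields "" on its first call and s on
// every call after that, which puts s between items but not before the first.
func separator(s string) func() string {
	i := -1
	return func() string {
		i++
		if i == 0 {
			return ""
		}
		return s
	}
}

// paramsTmpl is a cut-down template that renders only the parameter list.
const paramsTmpl = `{{$i := separator ", "}}({{range .}}{{call $i}}{{.Name}} {{.Type}}{{end}})`

func renderParams(params []Parameter) (string, error) {
	t, err := template.New("params").
		Funcs(template.FuncMap{"separator": separator}).
		Parse(paramsTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, params); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderParams([]Parameter{{"id", "string"}, {"limit", "int"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "(id string, limit int)"
}
```

Because the template calls separator afresh on every execution, each rendered method gets its own counter and the first item is never prefixed with a comma.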

The injection looks like this.

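// Parse the template once; it is the same for every method.
t := template.Must(template.New("func").
	Funcs(template.FuncMap{"separator": separator}).
	Parse(fnTmpl))

for _, api := range apis {
	for _, method := range api.Methods {
		// Surface template failures instead of silently dropping them.
		if err := t.Execute(&buf, method); err != nil {
			return err
		}
	}
}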
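Template output tends to have ragged indentation and stray spaces. One optional step, not part of the code above, is to pass the rendered bytes through the standard library's go/format package before writing the file. A minimal sketch, with `formatGenerated` as a hypothetical helper name:

```go
package main

import (
	"fmt"
	"go/format"
)

// formatGenerated runs gofmt-style formatting over generated source.
// It also acts as a cheap sanity check: source that does not parse
// produces an error instead of a broken file on disk.
func formatGenerated(src []byte) ([]byte, error) {
	out, err := format.Source(src)
	if err != nil {
		return nil, fmt.Errorf("format generated code: %w", err)
	}
	return out, nil
}

func main() {
	raw := []byte("package sdk\n\nfunc  Hello ( ) string {\nreturn \"hi\"\n}\n")
	out, err := formatGenerated(raw)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```

With this in place the template can stay readable without worrying about the exact whitespace it emits.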

And there you have it: the generation parts. To get this to work with go generate, all you have to do is add this to the root file.

//go:generate go run internal/cmd/genmethods/generator.go