Protocol Buffers in Go and JavaScript

Aug 31 2015

As a sequel to my last post on protocol buffers, I’ll now experiment with them by sharing data between JavaScript and Go to simulate a client-server transport of binary data. While Go's built-in marshaling and unmarshaling of JSON and XML gives us some type safety, JavaScript certainly has nothing comparable.

We'll be modeling a Cat message, which is pretty straightforward. In my cat.proto file I have:

package demo;

message Cat {
    string name = 1;
    int32 age = 2;

    message Parent {
        string name = 1;
        string email = 2;
    }

    repeated Parent parents = 4;
}

In Go

Let’s get this working in Go first.

We’ll first need to install the protobuf compiler (protoc); how you do that will vary depending on your environment. To get it to work for me on Ubuntu 14.04 I had to compile from source and set the LD_LIBRARY_PATH env var to /usr/local/lib (thx).

Then to install the Go protobuf library we can just:

go get -u github.com/golang/protobuf/{proto,protoc-gen-go}

and we’ve got it.

We’re now ready to compile our Cat definition. We’ll just do:

protoc --go_out=./demo cat.proto

Because I installed the 3.0 alpha version of the compiler, it complained when I first ran this. The changes that I had to make for it to work against version 3.0 of the protobuf spec were:

  • Set the syntax version as the first non-empty, non-comment line of the file: either syntax = "proto3"; or syntax = "proto2";. Note: you still set your package after this declaration.
  • The concept of required fields is done away with in 3.0. This means that you also don’t need to designate fields as optional, since every field is effectively optional now. This appears to be a design decision to support better backwards compatibility: if you initially set a field as required but later decide that it no longer has to be sent, you’ll break all of your existing clients. This issue is hinted at in the official docs: “Some engineers at Google have come to the conclusion that using required does more harm than good; they prefer to use only optional and repeated.” The change is referenced in the release notes.

I set the output directory to match the .proto file’s package name so that my Go code, or another Go project, can easily import it. Here’s what the generated file, cat.pb.go, looks like:

// Code generated by protoc-gen-go.
// source: cat.proto
// DO NOT EDIT!

/*
Package demo is a generated protocol buffer package.

It is generated from these files:
  cat.proto
  dog.proto

It has these top-level messages:
  Cat
*/
package demo

import proto "github.com/golang/protobuf/proto"
import fmt "fmt"
import math "math"

// Reference imports to suppress errors if they are not otherwise used.
var _ = proto.Marshal
var _ = fmt.Errorf
var _ = math.Inf

type Cat struct {
  Name    string        `protobuf:"bytes,1,opt,name=name" json:"name,omitempty"`
  Age     int32         `protobuf:"varint,2,opt,name=age" json:"age,omitempty"`
  Parents []*Cat_Parent `protobuf:"bytes,4,rep,name=parents" json:"parents,omitempty"`
}

func (m *Cat) Reset()         { *m = Cat{} }
func (m *Cat) String() string { return proto.CompactTextString(m) }
func (*Cat) ProtoMessage()    {}

func (m *Cat) GetParents() []*Cat_Parent {
  if m != nil {
    return m.Parents
  }
  return nil
}

type Cat_Parent struct {
  Name  string `protobuf:"bytes,1,opt,name=name" json:"name,omitempty"`
  Email string `protobuf:"bytes,2,opt,name=email" json:"email,omitempty"`
}

func (m *Cat_Parent) Reset()         { *m = Cat_Parent{} }
func (m *Cat_Parent) String() string { return proto.CompactTextString(m) }
func (*Cat_Parent) ProtoMessage()    {}

Now let’s try using it:

package main

import (
  "encoding/json" // used by the JSON marshaling snippet further down
  "fmt"
  "github.com/davewalk/protobuf-demo-go/demo"
  "github.com/golang/protobuf/proto"
  "log"
)

// getcat builds a Cat message and returns it in the protobuf wire format.
func getcat() ([]byte, error) {
  // proto3 scalar fields are plain values, so no pointer helpers are needed.
  dave := &demo.Cat_Parent{
    Name:  "Dave",
    Email: "doesntmatter@gmail.com",
  }
  cat := &demo.Cat{
    Name:    "Rocky",
    Age:     11,
    Parents: []*demo.Cat_Parent{dave},
  }

  return proto.Marshal(cat)
}

func main() {
  data, err := getcat()
  if err != nil {
    log.Fatal("problem marshaling: ", err)
  }

  // Decode the bytes back into a fresh Cat message.
  cat := &demo.Cat{}
  if err := proto.Unmarshal(data, cat); err != nil {
    log.Fatal("problem unmarshaling: ", err)
  }

  fmt.Println(cat.String())
}

The Go proto library provides a nifty String() method on a protobuf message so that I can print it for human reading. When I run this little app I get:

name:"Rocky" age:11 parents:<name:"Dave" email:"doesntmatter@gmail.com" >

Perhaps you noticed the JSON tags in the Cat struct of the generated file above. That means we can easily marshal our Cat protobuf message to JSON as well:

var jsondata []byte
jsondata, err = json.Marshal(cat)
if err != nil {
  log.Fatal("problem marshaling to JSON: ", err)
}
fmt.Println(string(jsondata))
// Prints: {"name":"Rocky","age":11,"parents":[{"name":"Dave","email":"doesntmatter@gmail.com"}]}

That is pretty sweet and means that it’s easy to support both protobuf and JSON with very little additional code.
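
Nothing protobuf-specific happens in that JSON step: encoding/json only reads the json struct tags. Here is a minimal sketch of that idea, using hand-written structs that copy the tags from cat.pb.go. The Parent and Cat types and the toJSON helper are stand-ins of mine, not the generated code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// Parent and Cat are hand-written stand-ins that copy the json tags from the
// generated Cat_Parent and Cat structs. They are not the generated types.
type Parent struct {
	Name  string `json:"name,omitempty"`
	Email string `json:"email,omitempty"`
}

type Cat struct {
	Name    string    `json:"name,omitempty"`
	Age     int32     `json:"age,omitempty"`
	Parents []*Parent `json:"parents,omitempty"`
}

// toJSON is a hypothetical helper that marshals a Cat using only its struct tags.
func toJSON(c *Cat) (string, error) {
	b, err := json.Marshal(c)
	return string(b), err
}

func main() {
	full, err := toJSON(&Cat{Name: "Rocky", Age: 11, Parents: []*Parent{{Name: "Dave"}}})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(full) // {"name":"Rocky","age":11,"parents":[{"name":"Dave"}]}

	// omitempty drops zero values, so an unset age and an empty parents
	// list do not appear in the output at all.
	sparse, err := toJSON(&Cat{Name: "Rocky"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(sparse) // {"name":"Rocky"}
}
```

One thing to keep in mind with the omitempty tags: a consumer of the JSON can't tell an age of 0 apart from an age that was never set.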

Now let’s test the safety that protobuf gives us. Let’s say I add a Dog message in a dog.proto file that only cares if the dog can bark or not and how many years old it is (this metaphor is starting to fall apart):

syntax = "proto3";

package demo;

message Dog {
    bool barks = 1;
    int32 yearsold = 2;

    message Parent {
        string name = 1;
        string email = 2;
    }

    repeated Parent parents = 4;
}

In my Go app I can try unmarshaling the Dog bytes into a Cat protobuf message. The Go compiler won’t complain because my getcat() function, now nefarious, still just returns a slice of bytes. But when I run the app an error is caught and returned:

proto: bad wiretype for field demo.Cat.Name: got wiretype 0, want 2

Cool!
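
The error message makes more sense once you look at how fields are tagged on the wire: every field starts with a key equal to (field_number << 3) | wire_type. Below is a minimal sketch using only the standard library. The decodeKey helper and the hand-encoded bytes are mine for illustration, and I'm assuming the Dog was sent with barks set to true.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// decodeKey splits the first field key in buf into its field number and
// wire type. Hypothetical helper, not part of the protobuf library.
func decodeKey(buf []byte) (field, wireType uint64) {
	key, _ := binary.Uvarint(buf)
	return key >> 3, key & 0x7
}

func main() {
	// Dog{barks: true}: field 1 encoded as a varint (wire type 0).
	dogBytes := []byte{0x08, 0x01}
	// Cat{name: "Rocky"}: field 1 encoded as length-delimited (wire type 2).
	catBytes := []byte{0x0A, 0x05, 'R', 'o', 'c', 'k', 'y'}

	f, w := decodeKey(dogBytes)
	fmt.Printf("dog: field %d, wire type %d\n", f, w) // dog: field 1, wire type 0

	f, w = decodeKey(catBytes)
	fmt.Printf("cat: field %d, wire type %d\n", f, w) // cat: field 1, wire type 2
}
```

Both messages use field number 1, but the Dog's bool arrives as a varint while the Cat's name must be length-delimited, which is the "got wiretype 0, want 2" mismatch.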

In JavaScript

Let’s shift to good old JavaScript now. The protobuf third-party page shows five libraries for JavaScript with ProtoBuf.js looking to be the most maintained/starred.

ProtoBuf.js doesn’t require the proto-to-language compilation step that most other languages do; however, there is a command-line tool, pbjs, included that can be used to convert your declarations between JavaScript “classes”, .proto files and JSON files.

My example code for encoding and decoding protocol buffers is similar to my Go code:

var pb = require('protobufjs'),
  builder = pb.loadProtoFile("./cat.proto"),
  Demo = builder.build('demo'),
  Cat = Demo.Cat;

// Build a Cat message and encode it to the protobuf wire format.
var getCat = function() {
  var dave = new Cat.Parent({"name": "Dave", "email": "nope@google.com"});
  var newCat = new Cat({
    "name": "Sonny",
    "age": 10,
    "parents": [dave]
  });

  return newCat.encode();
};

// Decode the encoded bytes back into a Cat and print it.
var run = function() {
  var cat = getCat();
  console.log(Cat.decode(cat));
  // { name: 'Sonny', age: 10, parents: [ { name: 'Dave', email: 'nope@google.com' } ] }
};

run();

The object returned by getCat() here is a ByteBuffer instance, ready for transport. There is also an encodeToJSON() method for returning a JSON string:

return newCat.encodeToJSON();
// {"name":"Sonny","age":10,"parents":[{"name":"Dave","email":"nope@google.com"}]}

The ProtoBuf.js library's documentation includes a warning about using protocol buffers in the browser:

When reading/writing binary data in the browser or under node.js, it is mandatory to understand that just reading it (as a string) is not enough. When doing this, the data will probably become corrupted when it is converted between character sets and ProtoBuf.js will not be able to decode it or will return random stuff. What's actually required here is to properly handle the data as binary.

Since jQuery is often used to make Ajax requests in the browser, I looked for ways to support binary data in that library. It appears to be a long-standing issue with jQuery; however, there are custom solutions.

Although I’m not going to try it now, you should be able to see how we can produce protocol buffer messages on a server and send them as binary to a client, which can even be a web browser. We share the message declaration in .proto files across two different languages (and potentially many more) and enjoy some peace of mind in knowing that producers and consumers will throw errors if the data doesn't conform to the structure that we have defined.

Key Takeaways

  • If you’re just getting started with protocol buffers now, use version 3.0 of the spec
  • It’s trivial to also encode your data as JSON once it’s modeled as a protocol buffer message

I've pushed my JavaScript and Go code to GitHub as reference.

Dave Walk is a software developer, basketball nerd and wannabe runner living in Philadelphia. He enjoys constantly learning and creating solutions with Go, JavaScript and Python. This is his website.