Protocol Buffers

Matt Cutts of Google announced on his blog that Google had open sourced Protocol Buffers. They encode structured data in a binary format for transmission. You write a description of the protocol you want, and the Google compiler then generates classes for working with that protocol. It supports the C++, Java, and Python programming languages. Google itself uses more than 10,000 protocols.

People have said Protocol Buffers look similar to Facebook Thrift. Thrift supports even more languages and formats, including Perl, PHP, XSD, C#, Ruby, Objective-C, Smalltalk, Erlang, OCaml, and Haskell. Matt Cutts has gone on the record as stating that Protocol Buffers predate Thrift. Both Thrift and Protocol Buffers build on older ideas such as CORBA and its IDL. Some have commented that it would be nice if you could map Protocol Buffers automatically to XML.

Protocol Buffers can also generate stubs for your RPCs. People have commented that Protocol Buffers look a lot like JSON. Sometimes Google refers to Protocol Buffers as pbuffers. Google uses them exclusively for communication between servers. Google itself uses C++ for programs that run on production machines, in order to get the best performance.

I consulted the Google developer’s guide on Protocol Buffers. It describes them as language neutral, platform neutral, and extensible. Their intended purpose is to serialize structured data. The guide claims that Protocol Buffers are smaller, faster, and simpler than XML. You define message types in “.proto” files. Each message contains name-value pairs, and the fields in each message type are numbered. When you add new fields, the result is still backward compatible, so older programs can keep reading the messages.
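
To make the numbered fields concrete, here is a minimal Python sketch of how I understand the binary encoding to work, going by Google's published encoding documentation. It is hand-rolled and is not the Google library: the function names and the two-field message (an integer in field 1, a string in field 2) are made up for illustration, and it only handles non-negative integers and strings.

```python
# Hand-rolled sketch of the Protocol Buffers wire format for two field types.
# This is NOT the Google library; every function name here is hypothetical.

WIRE_VARINT = 0  # wire type for variable-length integers
WIRE_LEN = 2     # wire type for length-delimited data such as strings


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_key(field_number: int, wire_type: int) -> bytes:
    """Each field starts with a key that packs its number and wire type."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    return encode_key(field_number, WIRE_VARINT) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    return encode_key(field_number, WIRE_LEN) + encode_varint(len(data)) + data


# Assumed message layout: field 1 is an integer id, field 2 is a name.
message = encode_int_field(1, 150) + encode_string_field(2, "testing")
print(message.hex())
```

Notice that the field names never appear in the output, only the field numbers. That is presumably a large part of why the result is smaller than the equivalent XML.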

Mark Pilgrim, another Google employee, likens a .proto file to a schema: it describes the structure but does not contain data. He says that Protocol Buffers are designed to minimize network traffic and maximize performance. Messages can be nested, and they are both backward and forward compatible. He stated that they will not replace JSON.
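
The forward compatibility presumably comes from the field numbers: a reader can skip any number it does not recognize. Below is a rough Python sketch of that idea, again hand-rolled and not the Google library. The decoder, the field names, and the sample bytes (an assumed newer message that carries an extra field 3) are all hypothetical, and only two wire types are handled.

```python
# Hand-rolled sketch of skipping unknown fields when decoding.
# This is NOT the Google library; names and sample data are hypothetical.


def decode_varint(buf: bytes, pos: int):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_known_fields(buf: bytes, known: dict) -> dict:
    """Decode buf, keeping only field numbers listed in known.

    known maps a field number to a name. Fields with other numbers are
    parsed just far enough to step over them.
    """
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in known:
            fields[known[field_number]] = value
    return fields


# A newer writer sends field 1 = 150, field 2 = "testing", and a new field 3 = 1.
newer_message = bytes.fromhex("089601120774657374696e671801")

# An older reader only knows about fields 1 and 2 and ignores field 3.
old_reader = {1: "id", 2: "name"}
print(decode_known_fields(newer_message, old_reader))

# A newer reader handed an older message simply finds field 3 absent.
new_reader = {1: "id", 2: "name", 3: "flag"}
print(decode_known_fields(bytes.fromhex("089601"), new_reader))
```

In both directions nothing breaks: the old reader drops what it does not understand, and the new reader treats a missing field as unset.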

I have not used Protocol Buffers myself. However, if Google uses them that much, there must be some really good benefits to them. Unfortunately, my own project at work seems to be going in the direction of XML. I think we are officially prototyping it in production next year.