THE CURIOUS CASE OF PROTOBUFS…
De-mystifying Google’s hottest binary protocol
Prasanna Kanagasabai, Jovin Lobo
About us:
Prasanna Kanagasabai: Security Engineer @ ThoughtWorks. Member of null - The Open Security Community. Author of IronSAP, a module over IronWASP. Speaker @ nullcon-Delhi, Clubhack, IIT Guwahati and various null meetups.
Jovin Lobo: Associate Consultant @ Aujas Networks. Member of null - The Open Security Community. Author of GameOver, a Linux distro for learning web security. Has spoken at nullCon and GNUnify.
Agenda
Introduction.
Anatomy of Protobufs.
Defining message formats in .proto files.
Protobuf compiler.
Python API to read and write messages.
Encoding scheme.
Problem statement.
Decoding like-a-pro with IronWasp ‘Protobuf Decoder’.
Introduction:
Protocol Buffers, a.k.a. Protobufs:
Protobufs are Google's own way of serializing structured data.
Extensible, language-neutral and platform-neutral.
Smaller, faster and simpler to implement than text formats such as XML.
Supported languages: Java, C++ and Python.
Defining a .proto file.
#> less Example.proto
message Conference {
    required string conf_name = 1;
    required int32 no_of_days = 2;
    optional string email = 3;
}
// * 1, 2, 3 are unique tags. They identify the fields in the binary encoding.
// * For optimization, use tags 1-15 for frequently used fields: tags 16 and above need at least one more byte to encode.
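The note about tags 1-15 can be checked with a few lines of Python. This is a minimal sketch, not part of the Protobuf API: `encode_varint` and `key_size` are hypothetical helper names, and the key is assumed to be the varint encoding of (tag << 3) | wire type, as the decoding steps later in this deck suggest.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # MSB set: more bytes follow
        else:
            out.append(low7)         # MSB clear: last byte
            return bytes(out)


def key_size(tag, wire_type=0):
    """Number of bytes taken by the key (tag + wire type) of a field."""
    return len(encode_varint((tag << 3) | wire_type))


for tag in (1, 15, 16, 2047, 2048):
    print(tag, key_size(tag))
```

Tags 1 and 15 fit in a single key byte, while 16 needs two, which is why the low tags are worth reserving for fields that appear often.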
Compiling
Syntax: protoc -I=$_input_Dir --python_out=$_out_Dir $_Path_ProtoFile
Eg: protoc -I=. --python_out=. Example.proto
This will generate an Example_pb2.py file in the specified destination directory.
$ProtoFile_pb2.py
The Protobuf compiler generates special descriptors for all your messages, enums, and fields.
It also generates empty classes, one for each message type:
Eg:
Reading and writing messages using the Protobuf binary format:
SerializeToString() serializes the message and returns it as a string (a bytes object in Python 3).
ParseFromString(data) parses a message from the given string.
Encoding.
example2.proto:
message Ex1 {
    required int32 num = 1;  // field tag
}
Code snippet:
import example2_pb2  # generated by protoc from example2.proto
obj = example2_pb2.Ex1()
obj.num = 290  # field value
obj.SerializeToString()
Output:
08 A2 02  #hex
00001000 10100010 00000010  #binary
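To see where those three bytes come from without the generated module, the encoding can be reproduced by hand. The sketch below is an illustration only: `encode_varint` and `encode_int_field` are hypothetical helpers, not the library's implementation, and they handle only non-negative values.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F          # lowest 7 bits go out first
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # MSB set: more bytes follow
        else:
            out.append(low7)         # MSB clear: last byte
            return bytes(out)


def encode_int_field(tag, value):
    """Encode one integer field: key byte(s), then the value as a varint."""
    key = (tag << 3) | 0             # wire type 0 = Varint
    return encode_varint(key) + encode_varint(value)


raw = encode_int_field(1, 290)
print(" ".join(f"{b:02X}" for b in raw))   # hex view
print(" ".join(f"{b:08b}" for b in raw))   # binary view
```

For tag 1 and value 290 this yields the same bytes as the serialized Ex1 message, 08 A2 02.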
Let's decode it.
Step 1: Find the wire type.
Step 2: Find the field tag.
Step 3: Find the field value.
Step 1: Finding the wire type.
0000 1000 1010 0010 0000 0010
To find the wire type, take the first byte: 0000 1000
Drop the MSB from the first byte: [0]000 1000 -> 000 1000
The last 3 bits give the wire type: 0001 [000]
The wire type is 000; type 0 is Varint.
Step 2: Finding the field tag.
What we already have is 0001000.
Now we right-shift the value by 3 bits, and the remaining bits give us the field tag:
0001000 >> 3 = 0001, i.e. 1
So we get the field tag = 1.
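Steps 1 and 2 amount to a mask and a shift on the key byte. A minimal sketch follows, assuming the key fits in one byte (true for tags 1-15); `split_key` and `WIRE_TYPES` are hypothetical names, and the table lists only the commonly seen wire types.

```python
# Commonly seen wire types (3 and 4, the deprecated group markers, are omitted).
WIRE_TYPES = {
    0: "Varint",
    1: "64-bit",
    2: "Length-delimited",
    5: "32-bit",
}


def split_key(key):
    """Split a key into (field tag, wire type)."""
    wire_type = key & 0x07   # last 3 bits
    tag = key >> 3           # remaining bits after the right shift
    return tag, wire_type


tag, wire_type = split_key(0x08)   # first byte of 08 A2 02
print(tag, wire_type, WIRE_TYPES[wire_type])
```

The key is itself a varint, so for tags above 15 it would have to be decoded as a varint before splitting.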
Step 3: Finding the field value.
0000 1000 1010 0010 0000 0010
We drop the 1st byte: 1010 0010 0000 0010
Drop the MSBs from each of these bytes: [1]010 0010 [0]000 0010 -> 010 0010 000 0010
Reverse the order of these 7-bit groups to obtain the field value (a varint stores its least significant group first):
000 0010 010 0010 -> 00000100100010, i.e. 256 + 32 + 2 = 290
So we finally get the value of the field = 290.
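Step 3 can be written as a short loop: strip the MSB of each byte and place each 7-bit group at an increasing bit offset, which has the same effect as reversing the groups. This is a sketch under those assumptions, with `decode_varint` as a hypothetical helper name.

```python
def decode_varint(data, pos=0):
    """Decode one varint from data starting at pos.

    Returns (value, position of the next unread byte).
    """
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # drop the MSB, place the 7 bits
        if not byte & 0x80:                # MSB clear: this was the last byte
            return result, pos
        shift += 7                         # next group is 7 bits more significant


value, next_pos = decode_varint(bytes([0xA2, 0x02]))
print(value)
```

The MSB of each byte acts as a continuation flag, so the loop also tells us where the value ends and the next key begins.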
So we successfully decoded the example2.proto message:
message Ex1 { required int32 num = 1; }
Output: 08 A2 02  #hex
Field tag: 1, wire type: 0 (Varint), decoded value: 290.
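The three manual steps can be chained into a small decoder that walks a message without its .proto file. The sketch below only illustrates the idea and is not how the IronWasp Protobuf Decoder is implemented; `decode_varint` and `decode_message` are hypothetical names, and group wire types (3 and 4) are deliberately rejected.

```python
def decode_varint(data, pos):
    """Decode one varint; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_message(data):
    """Return a list of (field tag, wire type, value) tuples."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        tag, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # Varint
            value, pos = decode_varint(data, pos)
        elif wire_type == 1:                    # fixed 64-bit, kept as raw bytes
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:                    # length-delimited (strings, bytes, ...)
            length, pos = decode_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:                    # fixed 32-bit, kept as raw bytes
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((tag, wire_type, value))
    return fields


print(decode_message(bytes.fromhex("08A202")))
# A hand-built Conference-style message: conf_name = "null", no_of_days = 2
print(decode_message(bytes.fromhex("0A046E756C6C1002")))
```

Without the .proto file only tags, wire types and raw values can be recovered; field names and the exact types (for example int32 versus bool for a varint) are not present in the binary data.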
Automating all this with IronWasp Protobuf Decoder:
About IronWasp:
IronWasp is an open-source web security scanner.
It is designed to be customizable to the extent where users can create their own custom security scanners using it.
Author – Lavakumar Kuppan (@lavakumark)
Website : www.ironwasp.org
01101000001111010000010110111001111001001000000101000101110101011001010111001101110100011010010110111101101110011100110010000000111111
Hmmm … Decoding ……