preon under the hood


Upload: wilfred-springer

Post on 10-Apr-2015


An overview on what's going on behind the scenes when you are using Preon.


Under The Hood

What's going on behind the scenes?

Wilfred Springer

Table of Contents

1. Codec
2. CodecFactory
    2.1. CompoundCodecFactory
    2.2. WholeNumberCodecFactory
    2.3. BooleanCodecFactory
    2.4. ObjectCodecFactory
    2.5. ListCodecFactory
3. Binding
4. CodecDecorator
    4.1. LazyLoadingCodecDecorator
    4.2. SlicingCodecDecorator

The previous chapter introduced a couple of simple introductory cases, showing some of the tricks the framework has up its sleeve when you do not feel the need to customize anything. While reading that chapter, you might already have thought: "hmmm, that's sweet, but unfortunately it doesn't work in my corner case". If that's what you thought, then this is the chapter you need.

In this chapter, I will explain what is actually going on under the hood, so that you understand how to extend the framework yourself. The focus is on four major abstractions: the Codec interface itself, the CodecFactory interface, the Binding interface and the CodecDecorator interface. In addition to discussing the interfaces themselves, this chapter will also discuss some of the implementations, to help you understand how everything fits together.

1 Codec

Let's first revisit the Codec interface. The previous chapter already covered the way you use this interface. In fact, it said that you are better off using the Codecs convenience class in all cases. If you do that, then there is actually very little to know about the Codec interface itself. It just magically works.

However, if you want to extend the framework yourself, then this is the first interface you really need to understand, since this is probably the interface you need to implement.

Figure 1. Codec Interface

    public interface Codec<T> {

        T decode(BitBuffer buffer, Resolver resolver, Builder builder)
            throws DecodingException;

        int getSize(Resolver resolver);

        Expression<Integer, Resolver> getSize();

        CodecDescriptor getCodecDescriptor();

        Class<?>[] getTypes();

        Class<?> getType();

    }

The decode(BitBuffer, Resolver, Builder) operation decodes data from the BitBuffer into an instance of T.

The Resolver allows the Codec to resolve references used in Preon annotations.

The Codec interface is the interface implemented by objects that are able to decode data from a BitBuffer and to encode data into a BitBuffer [1]. In addition to that, the Codec needs to be able to predict the number of bits occupied by the encoded data. And last but not least, the Codec needs to be able to return a CodecDescriptor, used for rendering documents with a description of the Codec.

When I just said that Codecs are capable of decoding data from a BitBuffer, you could have actually read 'decoding an object from a BitBuffer'. Each Codec is associated with a single type, and is expected to return a single instance of that type from the BitBuffer.

In many cases, your objects will hold references to many other objects. Does that mean that a single Codec needs to be able to recursively traverse the object graph and understand how to decode each of the individual members of the objects that it encounters?

The answer to that question is both yes and no. "Yes", since whenever you invoke decode on the Codec, it needs to be able to reproduce all objects that are referenced by the object that is going to be returned. "No", since the Codec does not have to understand all of that itself. It can simply delegate to other Codecs, one for every type of attribute it encounters.
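To make the delegation idea a bit more tangible, here is a minimal sketch. None of these types come from Preon: BitCursor, SketchCodec, Rgb and Pixel are hypothetical, cut-down stand-ins that I made up for illustration, and the bit order (most significant bit first) is an assumption as well. The point is only that the Pixel decoder knows its own layout and leaves everything else to the decoders it references.

```java
/** Hypothetical, simplified stand-in for a bit-level buffer; not Preon's BitBuffer. */
final class BitCursor {
    private final byte[] data;
    private int bitPos; // index of the next bit to read, most significant bit first

    BitCursor(byte[] data) {
        this.data = data;
    }

    /** Reads nrBits (1 to 32) and returns them as an unsigned value. */
    long readBits(int nrBits) {
        long value = 0;
        for (int i = 0; i < nrBits; i++) {
            int bit = (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
            value = (value << 1) | bit;
            bitPos++;
        }
        return value;
    }
}

/** Hypothetical, cut-down decoder; the real Codec interface has more operations. */
interface SketchCodec<T> {
    T decode(BitCursor cursor);
}

final class Rgb {
    final int red, green, blue;

    Rgb(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }
}

final class Pixel {
    final Rgb color;
    final int alpha;

    Pixel(Rgb color, int alpha) {
        this.color = color;
        this.alpha = alpha;
    }
}

final class DelegationSketch {
    // Leaf decoder: knows how to decode exactly one kind of value.
    static final SketchCodec<Integer> UNSIGNED_BYTE = cursor -> (int) cursor.readBits(8);

    // Knows the layout of Rgb, but leaves the individual values to UNSIGNED_BYTE.
    static final SketchCodec<Rgb> RGB = cursor -> new Rgb(
            UNSIGNED_BYTE.decode(cursor),
            UNSIGNED_BYTE.decode(cursor),
            UNSIGNED_BYTE.decode(cursor));

    // Knows the layout of Pixel, but nothing about how an Rgb is decoded.
    static final SketchCodec<Pixel> PIXEL = cursor -> new Pixel(
            RGB.decode(cursor),
            UNSIGNED_BYTE.decode(cursor));

    public static void main(String[] args) {
        Pixel pixel = PIXEL.decode(new BitCursor(new byte[] {(byte) 0xFF, 0x10, 0x00, 0x7F}));
        System.out.println(pixel.color.red + " " + pixel.color.green + " "
                + pixel.color.blue + " " + pixel.alpha);
    }
}
```

From the caller's perspective there is just one decoder (PIXEL); the chain behind it stays hidden.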

If you are decoding an object, you can use the Codec interface directly. Codecs created through the Codecs class just magically know what to do. We just learned that Codecs constructed like this will most likely delegate their work to other Codecs, which in turn will most likely delegate their work to other Codecs, and so on. This way, a Codec not only hides a chain of responsibility, but is also able to act as a link inside a chain of responsibility.

2 CodecFactory

The previous section ended by stating that a single Codec most likely delegates to other Codecs, which in turn delegate to other Codecs, etc. Obviously, each Codec has to be constructed before it can be used. All of these instances are created as a result of calling the create() method on Codecs. But how does the Codecs class know which ones to create?

As you might already have guessed from its name, this is the job of the CodecFactory. The CodecFactory has a single operation, which is expected either to return a Codec for the context passed in, or to return null.

[1] The latter is not supported yet, but it's not unlikely that it's going to be implemented in the future.


Figure 2. CodecFactory Interface

    public interface CodecFactory {

        <T> Codec<T> create(AnnotatedElement metadata, Class<T> type,
                ResolverContext context);

    }

Almost every type of Codec you will ever use will be created by a CodecFactory. If you search the Preon codebase for the different types of Codecs it supports, then chances are you will stumble across CodecFactories only. And if you want to extend the framework with your own Codecs, the thing you actually need to pass to the Codecs class is a CodecFactory, and not a Codec.

The CodecFactory needs to be able to create a Codec from the three parameters passed in: the type of object expected, a so-called ResolverContext, and metadata. If the Codec is used to decode data to be injected in a field, then the metadata provides access to the annotations defined on that field [2].

The third parameter (ResolverContext) is a Limbo ReferenceContext. This is the object that supports your CodecFactory in creating references to the context of the field for which it is currently trying to create a Codec.

Example 1. Expression sample

    class Stuff {
        @Bound int nrOfThings;
        @BoundList(size="nrOfThings") Thing[] things;
    }

Let's take the class above as an example, and assume that your CodecFactory needs to see if it is able to construct a Codec for the "things" field. The annotation contains an expression defining the number of "things" in the array. However, the size annotation attribute is just a String.

In order to be able to use the expression "nrOfThings", we need to turn it into something we can actually evaluate - in this case, a Limbo Expression object. And if there is a problem with this expression, we obviously want to find out early.

[2] In general, the CodecFactory should not rely on the assumption that the metadata passed in is based on a field. It should just treat it as a number of hints suggesting how to decode data.


Figure 3. Expression applied

There are basically two things that could be wrong with the expression: either the expression cannot be interpreted, or the expression contains references to variables not available in the context in which the expression will be evaluated. The ResolverContext supports detecting the latter case.

This is the way it works: whenever your CodecFactory gets a chance to create a Codec based on the metadata, type information and ResolverContext passed in, it will have to use the ResolverContext to create Expression objects. The Expressions class used to create Expression instances accepts a ReferenceContext as a parameter, and ResolverContext is nothing but a special ReferenceContext (one specific to Preon). Normally, you would pass the ResolverContext directly, but there are cases in which you could consider replacing or wrapping the ResolverContext [3].
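The early validation idea can be illustrated without Limbo at all. The sketch below is not the Limbo or Preon API: FieldNameContext and its selectAttribute method are hypothetical names, and it only understands bare attribute names, not full expressions. It just shows the principle that a context which knows the available attributes can reject a bad reference while the decoder is being constructed, long before any data is read.

```java
import java.lang.reflect.Field;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for a reference context; not Limbo's ReferenceContext. */
final class FieldNameContext {
    private final Set<String> names = new TreeSet<>();

    FieldNameContext(Class<?> type) {
        for (Field field : type.getDeclaredFields()) {
            names.add(field.getName());
        }
    }

    /** Resolves a bare attribute name, failing right away if it is unknown. */
    String selectAttribute(String name) {
        if (!names.contains(name)) {
            throw new IllegalArgumentException(
                    "No attribute '" + name + "' available, only " + names);
        }
        return name;
    }
}

/** Mirrors the Stuff example, without annotations to stay self-contained. */
final class StuffSketch {
    int nrOfThings;
    Object[] things;
}

final class EarlyValidationSketch {
    public static void main(String[] args) {
        FieldNameContext context = new FieldNameContext(StuffSketch.class);
        System.out.println(context.selectAttribute("nrOfThings")); // resolves
        try {
            context.selectAttribute("nrThings"); // misspelled reference
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected before decoding: " + e.getMessage());
        }
    }
}
```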

So, the first and most important purpose of the ResolverContext is early validation of your expression. The second, also important, purpose of the ResolverContext is to facilitate generating documentation from your Codec.

If your CodecFactory constructs a Codec based on an expression, then the documentation generated by that Codec probably needs to take that into account.

In case of the example of Figure 3, “Expression applied”, you would expect a description similar to this: "First you read a 32-bit integer. Then you read a list of items with the size corresponding to the 32-bit integer you just read before."

Any realistic documentation of this file format needs to include this dependency. You want the documentation to clearly state that the size of the list depends on the 32 bits you just read before. That last bit, "the 32 bits you just read before", is encapsulated in a Reference. And the ResolverContext allows the Expressions class to obtain these references without having to analyze the entire class again.

So, the Expression "nrOfElements" will be parsed into a Reference to the nrOfElements attribute defined before. It's the responsibility of the Reference to render itself in a useful way. The Codec or CodecFactory does not even try to make sense out of it. It just relies on the ResolverContext to return a proper reference.

[3] There are actually cases in which you might want to replace that ResolverContext with another one, typically when you want to introduce new variables, or when you are basically 'popping' or 'pushing' the stack (when your current context changes into the context of the attribute's type).

    Expression expr = Expressions.create("nrOfElements", resolverContext);
    // Potentially results in
    // "the number of elements defined before"
    expr.document(....);

Now, this may be one of the areas in which the flexibility gets a little in the way of understanding what's going on. However, just to comfort you a little: it's just object orientation. That's all it is. The Reference itself decides how it needs to be represented. And the type of ResolverContext passed to the Codec decides how these References are constructed. With that, the sky is the limit.

2.1 CompoundCodecFactory

The CompoundCodecFactory must be one of the only implementations of CodecFactory that doesn't actually create any Codec itself. The main purpose of the CompoundCodecFactory is to hide the complexity of choosing a particular type of CodecFactory behind an interface. The CompoundCodecFactory references a list of other CodecFactories. When asked to construct a Codec, it will simply try each of the factories it references in turn, until one of them constructs a Codec. If all of them fail, it will return null, just as you would expect from a CodecFactory.
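The try-each-factory behaviour is easy to sketch. Note that the types below are hypothetical and much simpler than the real ones: TinyCodecFactory takes only the type (no metadata, no ResolverContext), and TinyCodec decodes from a plain byte array. What it does preserve is the contract described above: a factory returns null if it cannot help, and the compound factory returns the first non-null result.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, minimal decoder: decodes a value from the start of a byte array. */
interface TinyCodec<T> {
    T decode(byte[] data);
}

/** Hypothetical, minimal factory: returns null when it cannot handle the type. */
interface TinyCodecFactory {
    <T> TinyCodec<T> create(Class<T> type);
}

final class TinyBooleanFactory implements TinyCodecFactory {
    @Override
    @SuppressWarnings("unchecked")
    public <T> TinyCodec<T> create(Class<T> type) {
        if (type != Boolean.class) {
            return null;
        }
        TinyCodec<Boolean> codec = data -> (data[0] & 0x80) != 0;
        return (TinyCodec<T>) (TinyCodec<?>) codec; // safe: T is Boolean here
    }
}

final class TinyByteFactory implements TinyCodecFactory {
    @Override
    @SuppressWarnings("unchecked")
    public <T> TinyCodec<T> create(Class<T> type) {
        if (type != Byte.class) {
            return null;
        }
        TinyCodec<Byte> codec = data -> data[0];
        return (TinyCodec<T>) (TinyCodec<?>) codec; // safe: T is Byte here
    }
}

/** Creates nothing itself; asks each delegate in order and returns the first result. */
final class TinyCompoundFactory implements TinyCodecFactory {
    private final List<TinyCodecFactory> delegates;

    TinyCompoundFactory(List<TinyCodecFactory> delegates) {
        this.delegates = delegates;
    }

    @Override
    public <T> TinyCodec<T> create(Class<T> type) {
        for (TinyCodecFactory delegate : delegates) {
            TinyCodec<T> codec = delegate.create(type);
            if (codec != null) {
                return codec;
            }
        }
        return null; // nobody could help
    }
}

final class CompoundFactorySketch {
    public static void main(String[] args) {
        TinyCodecFactory factory = new TinyCompoundFactory(
                Arrays.<TinyCodecFactory>asList(new TinyBooleanFactory(), new TinyByteFactory()));
        System.out.println(factory.create(Boolean.class).decode(new byte[] {(byte) 0x80}));
        System.out.println(factory.create(Byte.class).decode(new byte[] {42}));
        System.out.println(factory.create(String.class));
    }
}
```

Because the compound factory is itself a factory, callers never need to know how many factories are behind it.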

2.2 WholeNumberCodecFactory

There are many CodecFactories that create Codecs themselves. (There are also a few other CodecFactories that don't create Codecs themselves, apart from the CompoundCodecFactory, but that's a subject that can wait for a while.) This section starts with one of the simplest: the WholeNumberCodecFactory.

The WholeNumberCodecFactory creates Codecs capable of decoding - well - whole numbers. I didn't want to say integers, since that might lead you to believe that it's capable of decoding java.lang.Integer and its primitive counterpart only, which is not the case. WholeNumberCodecFactory supports decoding byte, short, int, long and all object representations of those types.

The WholeNumberCodecFactory is the first example in which a CodecFactory uses the metadata passed in with annotations to decode the data in the proper way. By default, it will return a Codec for byte, short, int or long fields whenever:

• null is passed in as metadata;
• An @Bound instance is passed in as metadata;
• An @BoundNumber instance is passed in as metadata.

The null case is not important for now. The @Bound annotation supports the default case. The CodecFactory will take this annotation as a signal that it actually needs to create a Codec, using the defaults for the type passed in. The @BoundNumber annotation supports the case in which you do want a Codec to be created, but you don't like the defaults.

Here are some examples in which you would want to use the @BoundNumber annotation instead of the @Bound annotation:


• You want to decode 4 bits into a byte.
• You want to force little endian when decoding a 32-bit integer.
• You want to decode a 32-bit unsigned integer as a long.
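If you wonder what these three cases amount to at the byte level, here is a plain Java sketch that does not use Preon at all. It relies only on java.nio.ByteBuffer and Integer.toUnsignedLong (Java 8 or later); the class and method names are made up for the occasion. It is meant to show the kind of work a customized whole number Codec takes off your hands, not how Preon implements it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class WholeNumberSketch {

    /** Decodes the upper 4 bits of a byte into a byte value between 0 and 15. */
    static byte highNibble(byte source) {
        return (byte) ((source >> 4) & 0x0F);
    }

    /** Decodes 4 bytes as a little endian 32-bit signed integer. */
    static int littleEndianInt(byte[] fourBytes) {
        return ByteBuffer.wrap(fourBytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Decodes 4 bytes (big endian) as an unsigned 32-bit integer, widened to a long. */
    static long unsignedInt(byte[] fourBytes) {
        int raw = ByteBuffer.wrap(fourBytes).getInt(); // ByteBuffer is big endian by default
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        System.out.println(highNibble((byte) 0xA7));
        System.out.println(littleEndianInt(new byte[] {0x78, 0x56, 0x34, 0x12}) == 0x12345678);
        System.out.println(unsignedInt(new byte[] {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF}));
    }
}
```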

The WholeNumberCodecFactory is a good example of the pattern that you will find implemented in almost all other CodecFactories:

• Attempt to return a default Codec whenever null or @Bound is passed in.

• ...unless there is some other piece of metadata passed in, telling the CodecFactory how to customize the Codec it creates.

2.3 BooleanCodecFactory

The BooleanCodecFactory is by far the simplest example of a CodecFactory (and associated Codec) that you could ever come up with. If the CodecFactory is challenged with a boolean type (either the primitive type or the object type) and the presence of an @Bound annotation in the metadata passed in, then it will create a new Codec. Whenever this Codec is asked to decode a value from the BitBuffer, it will read a single bit and return true if the bit is 1, and false otherwise.

2.4 ObjectCodecFactory

In theory, it would have been possible to construct Codecs by hand, using a constructor. However, that's not something you want to do yourself. The Codec created would have to closely resemble the data structure you need to decode. And since every Codec is capable of decoding one value, and there is a fair chance that the object you are trying to decode consists of many other objects, it would even be quite hard to do this yourself.

The good news is: you don't have to do all of that yourself. The ObjectCodecFactory basically does it all for you.

The ObjectCodecFactory works on basically all existing objects. If all other CodecFactories fail, then the ObjectCodecFactory might still be able to return a Codec, even though the Codec created might basically be a no-op. (It won't decode anything.) That's why every CompoundCodecFactory instance is normally advised to try the ObjectCodecFactory as a last attempt.

The ObjectCodecFactory is probably much simpler than you would expect. Suppose that you want to create a decoder for instances of type A:

1. Get the list of all fields declared on type A.

2. For each field declared on type A, create a Codec for the type of value accepted by that field, and wrap both the reference to the field and the Codec for its values in a new Binding instance.

3. Create a new Codec that - on an invocation of decode - will always create an instance of type A, and populate its data by giving every Binding the opportunity to load its data from the BitBuffer passed in. (The Binding will simply use the Codec it references to do the actual decoding of the field's value.)
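The three steps above can be sketched with plain reflection. This is not Preon's implementation: FieldBinding, ReflectiveDecoder and HeaderSketch are hypothetical names, a java.nio.ByteBuffer stands in for the BitBuffer, and only int, short and byte fields are supported. It also assumes that getDeclaredFields() returns fields in declaration order, which common JVMs do but the Java specification does not promise.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical binding: ties one field to the logic that decodes its value. */
final class FieldBinding {
    private final Field field;

    FieldBinding(Field field) {
        this.field = field;
        field.setAccessible(true);
    }

    void load(Object target, ByteBuffer buffer) throws IllegalAccessException {
        Class<?> type = field.getType();
        if (type == int.class) {
            field.setInt(target, buffer.getInt());
        } else if (type == short.class) {
            field.setShort(target, buffer.getShort());
        } else if (type == byte.class) {
            field.setByte(target, buffer.get());
        } else {
            throw new IllegalStateException("No decoder for " + type);
        }
    }
}

/** Hypothetical object decoder following the three steps described in the text. */
final class ReflectiveDecoder<T> {
    private final Class<T> type;
    private final List<FieldBinding> bindings = new ArrayList<>();

    ReflectiveDecoder(Class<T> type) {
        this.type = type;
        // Step 1: get the declared fields (assumed to arrive in declaration order).
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            // Step 2: wrap each field in a binding.
            bindings.add(new FieldBinding(field));
        }
    }

    // Step 3: create an instance and let every binding load its own value.
    T decode(ByteBuffer buffer) {
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            for (FieldBinding binding : bindings) {
                binding.load(instance, buffer);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot decode " + type, e);
        }
    }
}

final class HeaderSketch {
    int version;
    short flags;
    byte kind;
}

final class ObjectDecodingSketch {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {0, 0, 1, (byte) 0x90, 0, 3, 7});
        HeaderSketch header = new ReflectiveDecoder<HeaderSketch>(HeaderSketch.class).decode(buffer);
        System.out.println(header.version + " " + header.flags + " " + header.kind);
    }
}
```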

You may have wondered how the ObjectCodecFactory creates a Codec corresponding to the field's type in Step 2. The answer is simple: it uses a CodecFactory. It may be a different CodecFactory than the ObjectCodecFactory itself, but it's highly likely that it references the ObjectCodecFactory itself somewhere under its covers.

We just said that the ObjectCodecFactory uses Binding objects under the covers. It's important that you know about them, since the ObjectCodecFactory actually allows you to override the way they work, by plugging in your own BindingFactory. We are not going to discuss the typical use case yet, but for now it's important at least to know they exist. (Check out Section 3, “Binding” for a typical example of how to leverage this plug point.)

2.5 ListCodecFactory

With the three CodecFactories listed above, we would already be able to decode nested objects in which each of the object's fields references either a numeric value or another object. But that's not enough. (In fact, we don't even know when it would be enough; that's why Preon is an extensible framework. We leave it up to you to decide when you consider it to be enough.)

One of the most important missing cases is support for lists. Many binary encoding standards have some repeating sequences inside. This is where the ListCodecFactory comes in.

The ListCodecFactory allows you to decode lists of objects. At this stage, it's limited to lists of a certain size, but it is expected that there will be other implementations that have other ways to demarcate the end of the list.

The ListCodecFactory creates a Codec whenever the @BoundList annotation is passed in. It will use the attributes of this annotation to figure out how and what to decode. In the simple case, that could be as little as stating the type and the size of the list.

If you create a Codec simply by passing the type and the size of the list, then the ListCodecFactory can be expected to return a Codec implementation that will - on calling decode() - return a List implementation that allows you to visit all elements in the List. Don't expect one of the default implementations of List though. By default, the Codec created by the ListCodecFactory will return its own implementation of List, one that decodes objects on the fly, on demand.
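To give you an idea of what such an on-demand List could look like, here is a minimal sketch. It is not the List implementation Preon returns: OnDemandIntList is a hypothetical class, it reads from a java.nio.ByteBuffer, and it assumes fixed-size elements (32-bit integers) so that the position of element i can be computed directly. The decodeCount field only exists to make the laziness visible.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;

/** Hypothetical read-only List that decodes an element only when it is requested. */
final class OnDemandIntList extends AbstractList<Integer> {
    private final ByteBuffer buffer;
    private final int offset; // byte position of the first element
    private final int size;   // number of elements
    int decodeCount;          // exposed only to make the laziness observable

    OnDemandIntList(ByteBuffer buffer, int offset, int size) {
        this.buffer = buffer;
        this.offset = offset;
        this.size = size;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", size: " + size);
        }
        decodeCount++;
        // Absolute read: does not depend on, or change, the buffer's position.
        return buffer.getInt(offset + index * Integer.BYTES);
    }

    @Override
    public int size() {
        return size;
    }
}

final class OnDemandListSketch {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(3 * Integer.BYTES);
        buffer.putInt(10).putInt(20).putInt(30);
        OnDemandIntList list = new OnDemandIntList(buffer, 0, 3);
        System.out.println(list.decodeCount); // nothing decoded yet
        System.out.println(list.get(2));
        System.out.println(list.decodeCount);
    }
}
```

Constructing the list costs nothing; the work is only done for the elements you actually visit.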

It is important to emphasize the difference between the Codec that returns the List instance, and the Codec used to decode elements of the List. These are two completely different Codecs. The Codec decoding the List is seeded with a Codec capable of reading the elements of the List. Figure 4, “Decoding a List” shows how these elements collaborate in a real world scenario.

Figure 4. Decoding a List

3 Binding

A couple of sections back, I talked about Bindings, and how the Codec created by an ObjectCodecFactory uses these bindings to load and store values from and in a Field. I also said that you can customize the way these Bindings behave. This section is going to give you an example.

Many binary formats have some conditional built into the specification. Here are some examples:

• If the version of the file is bigger than 400, then read this data structure first, otherwise continue with the next data structure.


• If the first bit read is 1, then read 7 bits and decode them as a byte; otherwise continue to read 7 + 8 bits and decode them as a short.

All of these cases could have been solved by breaking out of the ordinary framework and implementing the entire encoding/decoding logic yourself. That would not only be really hard, but also result in code that is not self-documenting in the way Preon normally is.

Preon takes another approach, and allows you to declaratively specify the conditions under which the framework should try to load data into bound fields, using the @If annotation. The @If annotation takes a single argument, which is the condition stated in Limbo.

The reason Preon is capable of dealing with these expressions is that it will create a Binding that only loads data from the BitBuffer if the condition holds. This is why it is sometimes convenient to have the ability to create your own custom Binding implementation instead of the default one.
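A conditional binding is essentially a wrapper around another binding. The sketch below shows that shape with hypothetical types: LoadStep plays the role of a Binding, ConditionalStep is the wrapper, the condition is a plain java.util.function.Predicate instead of a Limbo expression, and a ByteBuffer stands in for the BitBuffer. It mirrors the "version bigger than 400" example from the list above.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical binding: loads one value from the buffer into the target. */
interface LoadStep<T> {
    void load(T target, ByteBuffer buffer);
}

/** Wraps another step and only runs it if the condition holds for the target. */
final class ConditionalStep<T> implements LoadStep<T> {
    private final Predicate<T> condition;
    private final LoadStep<T> delegate;

    ConditionalStep(Predicate<T> condition, LoadStep<T> delegate) {
        this.condition = condition;
        this.delegate = delegate;
    }

    @Override
    public void load(T target, ByteBuffer buffer) {
        if (condition.test(target)) {
            delegate.load(target, buffer);
        }
        // Otherwise: leave the field alone and consume nothing from the buffer.
    }
}

final class VersionedFile {
    int version;
    int extension; // only present when version > 400
    int payload;
}

final class ConditionalBindingSketch {
    private static final List<LoadStep<VersionedFile>> STEPS =
            Arrays.<LoadStep<VersionedFile>>asList(
                    (file, buffer) -> file.version = buffer.getInt(),
                    new ConditionalStep<VersionedFile>(
                            file -> file.version > 400,
                            (file, buffer) -> file.extension = buffer.getInt()),
                    (file, buffer) -> file.payload = buffer.getInt());

    static VersionedFile decode(ByteBuffer buffer) {
        VersionedFile file = new VersionedFile();
        for (LoadStep<VersionedFile> step : STEPS) {
            step.load(file, buffer);
        }
        return file;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.putInt(400).putInt(9);
        buffer.flip();
        VersionedFile file = decode(buffer);
        System.out.println(file.version + " " + file.extension + " " + file.payload);
    }
}
```

The decoding loop itself contains no conditional logic at all; the condition lives entirely inside the binding.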

4 CodecDecorator

Up until now, we have only seen examples of CodecFactories creating Codecs. Codecs are, however, not always constructed by CodecFactories. If you want the framework to deal with your own Codec, then there is another way to make it create your own Codec, which is by implementing the CodecDecorator interface.

As you may have guessed, the CodecDecorator has something in common with the Decorator Pattern (see GoF). The CodecDecorator allows you to transparently add additional behaviour to a Codec, by applying some 'decoration'. So, where the CodecFactory accepts the type and metadata indicating the specific Codec you want to construct, the CodecDecorator also accepts the Codec that should be wrapped.

4.1 LazyLoadingCodecDecorator

When you ask the LazyLoadingCodecDecorator to decorate an existing Codec, you get a new Codec that - on receiving a call to decode - will not pass the call on to the Codec it decorates; instead, it will return a proxy that acts as the object that needs to be created by the decorated Codec. The actual object will only be loaded from the BitBuffer right after you call an operation on that proxy for the first time.

The LazyLoadingCodecDecorator is triggered by the presence of the LazyLoading annotation on the type you want to have loaded lazily.
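The proxy idea can be sketched with java.lang.reflect.Proxy from the JDK. Be aware that this is a swapped-in technique chosen for illustration: I am not claiming it is the mechanism the LazyLoadingCodecDecorator uses, and a JDK proxy only works for interfaces. LazyProxySketch and its lazy method are hypothetical names; the Supplier stands in for "call decode on the decorated Codec".

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

final class LazyProxySketch {

    /**
     * Returns a proxy implementing iface. The loader is not called until the
     * first method call on the proxy; after that, the loaded object is reused.
     */
    static <T> T lazy(Class<T> iface, Supplier<? extends T> loader) {
        Object[] holder = new Object[1]; // holds the real object once loaded
        Object proxy = Proxy.newProxyInstance(
                LazyProxySketch.class.getClassLoader(),
                new Class<?>[] {iface},
                (self, method, args) -> {
                    synchronized (holder) {
                        if (holder[0] == null) {
                            holder[0] = loader.get(); // the actual "decoding"
                        }
                    }
                    try {
                        return method.invoke(holder[0], args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause(); // rethrow what the real object threw
                    }
                });
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        CharSequence text = lazy(CharSequence.class, () -> {
            System.out.println("loading...");
            return "decoded";
        });
        System.out.println("proxy created");
        System.out.println(text.length()); // first call triggers the load
        System.out.println(text.charAt(0)); // reuses the loaded object
    }
}
```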

4.2 SlicingCodecDecorator

The SlicingCodecDecorator decorates existing Codecs by returning a new Codec that - on receiving a decode call - passes the request on to the decorated Codec, but replaces the BitBuffer passed in with a slice of the original.

The main reason for having a SlicingCodecDecorator is to support type-length-value (TLV) records. The main idea here is that Codecs reading the record's data should not be required to know the amount of data to be expected. As you might have seen before, the Codec interface doesn't support passing in the maximum number of bytes to be read, or anything like it. And having to implement that logic for all TLV record readers seemed to be a waste.

Instead of having every TLV record Codec implement support for detecting the end of the record, the decision was made to externalize that behaviour and put it into a separate Codec, decorating the Codec that reads the data from the TLV record.
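The same idea can be demonstrated with java.nio.ByteBuffer, which has a slice() operation of its own. The sketch below is not Preon code: TlvSliceSketch is a hypothetical class, and the record layout (a 1-byte type, a 1-byte length, then the value) is an assumption made for the example. The value reader simply consumes everything it is given; only the slicing code knows where a record ends.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class TlvSliceSketch {

    /**
     * Reads the type and length of the next record, and returns a slice that
     * covers exactly the value. The original buffer is moved past the record.
     */
    static ByteBuffer nextValueSlice(ByteBuffer buffer) {
        buffer.get();                      // type, ignored in this sketch
        int length = buffer.get() & 0xFF;  // length of the value in bytes
        ByteBuffer slice = buffer.slice(); // shares content, starts at the current position
        slice.limit(length);               // readers of the slice cannot pass the record's end
        buffer.position(buffer.position() + length); // continue at the next record
        return slice;
    }

    /** A value reader that knows nothing about record boundaries. */
    static String readText(ByteBuffer value) {
        byte[] bytes = new byte[value.remaining()];
        value.get(bytes);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.wrap(
                new byte[] {1, 2, 'h', 'i', 2, 3, 'a', 'b', 'c'});
        while (buffer.hasRemaining()) {
            System.out.println(readText(nextValueSlice(buffer)));
        }
    }
}
```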