The Scalable Apparat

I am a big fan of Scala and used it for a couple of internal tools here at Hobnox. What I really like about Scala is the fun factor and the programming speed. That is the reason why I decided to port the whole Apparat framework from Java to Scala.

This was not an easy decision since Apparat is a big project. However I have decided to rewrite Apparat because of the following reasons.

  1. Some parts of the framework are not well thought.
  2. TAAS would have been hard to optimize.
  3. This is my (n+1)-th iteration of implementing the ABC format. I think a lot of things can be simplified again.
  4. Scala is an ideal language for Apparat’s tasks.

The Scala source code is already available on Google Code.

Commits will happen early and often. The current revision is able to parse Swf and Swc files. I have also implemented most of the SWF tags which have been part of the Java framework.

I could compare the Scala and Java source every day. Everything is so concise now.

Polyglott Programming On The AVM2

Take some time and think about this tweet for a moment. It took me a while to realize that Joel Hooks is right. I was embaressed of myself. How could I forget about that? But I had also a big smile on my face at the same same time. Let me explain why.

The Java to SWF compiler does not compile Java sourcecode but JVM bytecode to ActionScript bytecode. This means I do not have to teach my program the Java language. It only understands JVM bytecode. This seems like an akward decision on the one hand since working on the bytecode level implies lots of problems. But it turns out that this was a really cool decision on the other hand. “Java to SWF compiler” is maybe the wrong description. “any language that compiles to JVM bytecode to SWF compiler” is maybe better.

So what does any language mean? Here is a list of JVM languages. Now you feel maybe like I did after reading that tweet. And I am really looking forward to get Scala up and running.

Some problems still exist. Threading is one issue and I will basically have to do what Scott Peterson did for Alchemy. But reflections, annotations and method overloading have to be solved as well. Some glitches may exist even after figuring everything out. Stacktraces will look pretty weird. However I think this is a really cool project.

Update: I forgot to mention something important. Java supports native code. This means you can build a library that works with OpenGL for instance. Those native methods can not be converted. There are also some other things that do not work. File access is just one of them.

Getting Rid Of null

A couple of weeks ago I started learning Scala. I can highly recommend it. The language has a lot of great approaches to multithreading and scalability. The reason why I like Scala is because it is so simple yet powerful.

Of course doing something at home influences work. I decided to write a build tool for the Hobnox AudioTool which is about 200 lines of Scala code. Cool thing is that this tool replaces a manually maintained Ant build and all project dependencies are always correct. Plus analyzing the dependencies in a more powerful way allows me to spawn the compiler in parallel. Building the AudioTool is now twice as fast and much more comfortable.

When learning a new programming language you also learn about new concepts. Functional languages in general have a different approach to nullable types. I know Scala is not the only one but let me introduce the concept in terms of ActionScript.

When you have a method that returns either a result or nothing: what do you do? Imagine you have some kind of service and a Dictionary of users. Requesting a user works by his unique id. The Dictionary is private to the class since you want to keep it read only.

function getUser(id: String): User {
  return hasUser(id) ? users[id] : null;
}

If I would now simply ask the service for an unknown user and do something like if(getUser('xyz').isLoggedIn) { trace('Hooray'); } I could probably and up with a null-reference error. No one checks for me if the user exists. So what else could we do? Write a lot of boilerplate code and prepend a check if the user is null or not each time we request one from the service. A much better approach in my opinion is to throw an error as early as possible. In this case we would rewrite the method to something like this:

function getUser(id: String): User {
  if(hasUser(id)) return users[id];
  throw new NoSuchUserError(id);
}

In this scenario we get informed about the error as early as possible. But we are stuck again. First of all ActionScript does not enforce you to catch possible exceptions. This means if you do not read the documentation of a method carefully or look into the source code of a method before calling it somewhere you will never know that it throws an exception. And what if we are actually in a scenario where we do not expect errors for non-existing objects? Think of the Dictionary object throwing each time an error when you access it and the result is null. How could I even check if an object exists in a Dictionary?

try {
  dictionary[key];
  return true;
}
catch(noSuchElementError: NoSuchElementError) {
  return false;
}

I guess you see that this can not be the solution to our problem. In a real world example you may deal with your own collection of objects instead of a Dictionary of course. So we have to get rid of exceptions and null for the optimal solution. Scala’s approach to this problem is the Option type. We always abuse null as a placeholder when we want to express that an element does not exist. The Option means that either Some or None result exists. Rewriting our getUser function using this approach would yield the following ActionScript code.

function getUser(id: String): Option {
  return hasUser(id) ? new Some(user[id]) : new None();
}

Why is this much better than the old approach? When calling the method you will always know that the method has only an optional result value. We get rid of the exception and null values. Our only problem at the moment is ActionScript. The result is now untyped. In an ideal world this method would be written as:

function getUser(id: String): Option.<User> {
  return hasUser(id) ? new Some.<User>(user[id]) : new None.<User>();
}

However we can still tackle this issue by implementing null-representations of our objects. Imagine the User class. You could rewrite the code to something like this.

function getUser(id: String): IUser {
  return hasUser(id) ? user[id] : new NullUser();
}

final class NullUser implements IUser {
  public function get isLoggedIn(): Boolean { return false; }
  public function get name(): String { return 'null'; }
}

And even if you are interested in null-reference errors you could rewrite your code to something like this:

final class NullUser implements IUser {
  public function get isLoggedIn(): Boolean {
    CONFIG::ThrowNullReferenceErrors { throw new NullReferenceError(); }
    return false;
  }
  public function get name(): String {
    CONFIG:: ThrowNullReferenceErrors { throw new NullReferenceError(); }
    return 'null';
  }
}

It is definitely a very different approach. A functional language like Scala allows you to deal much better with Options. But it makes sense to diferentiate between an uninitialized variable which is null and an optional result of a function. Unfortunately this is at the moment very painful with the lack of generics in ActionScript.

TAAS As A Decompiler

The TAAS compiler is different from the ActionScript compiler since its input is not ActionScript source code but already compiled SWF or SWC files. Just like the haXe compiler can output AS3 instead of a SWF the TAAS compiler can do the same.

Now if you add one and one together you see that the TAAS compiler can be used as a very strong decompiler. My own tests have shown that it will work flawlessly where other commercial decompilers output rubbish. Since the compiler behaves like the Flash Player it will “execute” the bytecode in order to parse it which means it has a very highlevel understanding of the structure inside the SWF.

The only question is now what to do with the source code. I wrote the decompiler for my session at FOTB to show much easier how the optimizations behave. It is also a great tool to debug errors. But should it be opened or not?

To take it one step further one might also be able to write an obfuscator using the TAAS compiler. In my opinion it would be cool to have a strong decompiler and obfuscator, both being open source. We might also add an option to protect SWFs from the decompiler by adding something to the SWF metadata for instance. Of course this is just a simple rule which could be removed by someone once the code is open. What do you think?

FOTB Recordings

The people from Powerflasher were kind enough to upload their recording of my FOTB session. I am sorry that the microphone stopped working at the end. But you might still be able to understand what I was talking about if you turn up the volume of your speakers.
Continue reading ‘FOTB Recordings’

Compiling Java and C# to SWF

At the end of my Leaving the Sandbox session I showed two projects I have been working on without telling anyone. The first was a compiler that could compile C# to SWF. The second was another compiling being able to compile Java to SWF. First of all I have to say that both tools are far from being finished. They are a so called proof-of-concept. For me it is enough to know that this is possible. I am also not sure if I will finish one of those tools.

So how does this work? Because I am now working on the AudioTool backend for our highly anticipated launch I had to switch to fulltime Java development. In order to understand the JVM better I started to search for the optimizations that the HotSpot VM performs and stumbled upon a framework called Soot. Voilà. Reading about Soot made me question why nobody is working on such a framework for ActionScript. Probably because universities see Flash just as a toy. For us developers being dependent on the Flash Platform it is of course more than that.

I will quickly explain what Soot does. It reads compiled Java class files and converts the machine code they contain into a three address code representaiton. This code is then heavily optimized. TAAS is basically doing the same. I also want to note that Soot is by far more complex and complete than TAAS right now. It would be megalomania saying that TAAS is Soot for ActionScript. But you get the idea. The one framework converts Java to three address code and the other framework converts ActionScript to three address code. Maybe it is possible to connect them.

This is what I did for the Java compiler. The whole pipeline looks like this:

  1. Compile Java sourcecode with the Java compiler of your choise
  2. Use Soot to get the three address code representation of a compiled binary
  3. Create TAAS expressions for the expressions that Soot uses
  4. Connect the TAAS graph the same way the Soot graph is connected
  5. Compile the TAAS graph to a SWF

This is about it. There is some glue-code involved here and there and I did not implement all Java expressions yet. I did also not bother about threads. Basically one could do what Scott has done for Alchemy. The cool thing is that we get highly optimized Java code because of the Java compiler and Soot. Then the TAAS compiler can run a second time after linking is done and perform platform specific optimizations like inlining and loop invariant code motion for certain expressions.

I already wrote a couple of classes and packed them in a SWC. Those mimic the behaviour of the Java classes like java.lang.System or java.util.LinkedList. On the Java side I implemented the Flash classes like flash.display.Sprite and flash.events.EventDispatcher. The interesting part is that all the methods those classes contain are marked native which means they have no implementation since they are native to the Flash Player.
This is also a great advantage. It is slow in Alchemy to call Flash methods and to communicate with Flash classes. The Java approach does not have this tradeoff. I also do not have to trigger the ActionScript compiler. The conversion form class to swf is entirely done using Soot and TAAS.

So far for the Java compiler. But in order to show a little bit more of the potential I wanted to have a C# compiler working as well. The great thing about advanced languages and a large community is that people have already built a lot of tools. It did not take too long to find an advanced framework that converts C# to Java. Actually this is not be the best approach. But I could not find the “Soot for C#” the evening before my FOTB session.

So I am using net2java to convert .NET code like C# or VisualBasic to Java. That code is then compiled using the Java compiler and I am using Soot again to convert the code to TAAS without much stress.

For me the most important part is to know that it is possible to compile .NET source code and Java binaries to SWF. To complete those tools one just has to implement all the missing expressions and the standard Java library in Flash. Everything could also work the other way around. Compiling from TAAS to .NET, Silverlight and Java is another option.

After The Sandbox

In 2006 I got invited to the FlashForum Conference in Düsseldorf, Germany. It was my very first Flash conference and my topic was AS3 code optimization. I have talked a lot about bitwise operators, lookup tables and other techniques to speed your code up.
Now about three years later I wanted to prepare a very special talk for FOTB and make a step forward by getting rid of old habits and idioms.

Before FOTB I was not sure if I can present Apparat at all. For me as an autodidact it is not always easy to solve certain issues. Mario Klingemann talked about Magic Moments in Brighton. I guess I had one after a long time of thinking and failing. I got really into the flow and everything became so clear. Stay tuned for more detailed posts on the technique behind the different compilers.

Continue reading ‘After The Sandbox’

TAAS progress

I have developed some other optimizations during the past couple of days. Including strength reduction and optimizing tail recursive calls.

Strength reduction can already handle about 55 different cases at the moment. For instance it will convert code like if( x - 1 == 0 ) into if( x == 1 ). But it will also remove expressions that other optimizations could introduce. x + 0 is something that might occur during inline expansion. Or x + Number.NaN for example.

But I really like the tail recursive optimizations. TAAS can detect if a method needs to call itself or not. Take this code for example:

private function sum( i: int, value: int ): int
{
	if( i == 0 )
	{
		return value;
	}

	return sum( i - 1, value + i );
}

You see that the last statement sum( i - 1, value + i ) is tail recursive. This means we can replace the recursive call to sum() with a jump to the beginning of the method after changing the method parameters. The sum() method becomes converted to something like a simple while loop. This means also that one could have written code like this in the first place since this optimization applies only to methods that do not have to be recursive after all.

Knock yourself out with some examples:

Euro Invasion At MAX 2009

Since the news has spread pretty quickly I want to shed some light on this as well. Influxis has invited a lot of European Flash developers to come to LA and attend Adobe MAX 2009. Those include Seb Lee Delisle, Ralph Hauwert, Mario Klingemann, Andre Michelle and me. I was absolutely overwhelmed when I heard about that idea first. I really want to thank the whole Influxis team for making this possible. You guys definitly rock! But I would also like to thank Adobe for sponsoring MAX passes in order to get everyone invited.

Hopefully I will meet a lot of interesting people and Flash Player engineers. It will be my first MAX so I am really curious.

First results of TAAS

TAAS in action

Finally I am able to present the first results of TAAS. It was a long way to get here. I was actually not sure at all if I will manage to show something at FOTB 09. Let me explain what happens here.

In the image is the original ActionScript code on the left. The ActionScript bytecode produced by the ASC is in the middle and the compiled TAAS version is on the right. You see that I am running a loop and call a method inside that loop. It would be nice to have small helper methods like this one inlined automatically. TAAS can do this for you now.

So how does this work? First of all an SWF file is parsed and the bytecode is extracted. This bytecode is transformed into a control flow graph. This control flow graph is converted into a graph of TAAS expressions which a lot of benifits. The conversion step works like the Flash Player and kind of executes your code. The result is that all methods and local variables are typed if that is possible.

It is much easier to perform a lot of optimizations now. Like inline expansion. In order to do that I can simply compile the method that is called to TAAS and connect the vertices together in the graph after some adjustments have been made. The inline expansion step will automatically inline methods when it thinks it is useful and possible. You can only inline methods that are private or final and make no use of reflections.

Inlined methods create a lot of overhead. Usually they contribute additional local variables and the code gets bigger. So one step after the inline expansion is to clean everything up again. In this example I can remove those registers thanks to copy propagation and dead code elimination.
And it gets even better. If you take a closer look at the example, the calc method expects two paramters that are of type Number. Therefore the original method uses an Add instruction. TAAS knows about the types and sees that it will pass two integers into this method. Since it makes no sense to convert from int to Number to int it stays with integer in this case.

So after inlining the method ends up with even less local variables. The original local variable t1 is considered useless since it is only used one time. TAAS will put the code getTimer() at the position where t1 has been used and we end up with a heavily optimized method. What I like the most is that I do not have to change the way I write my ActionScript code. All happens behind the scenes automatically.
Some other optimizations are implicit. TAAS will use AddInt instead of Add if both operands are typed int. A FlowOptimizer is also involved and will invert the if expression for the loop. This can reduce the number of required jumps in a method by 50%.

I know that the output is not perfect. But this is all a work in progress and really the first result that I can share. The SWFs speak for them self.