Archive for the 'experiments' Category

A Simple Method And Taas

I just want to share with you how Taas can actually optimize code and what happens behind the scenes. In my last post I talked about stackless code and the optimizations it makes possible, but how does it work at all?
Imagine you have this ActionScript code:

var x: Number = 1.0;
var y: Number;

if(true)
{
	y = 3.0;
}
else
{
	y = 2.0;
}

x += y;

return x;

For the sake of simplicity I will not use local variables in the bytecode. That makes it easier to figure out what happens in the bytecode.

      PushDouble           1.0
      PushTrue
      IfTrue               L0

      PushDouble           2.0
      Jump                 L1

L0:   PushDouble           3.0
L1:   Add
      ReturnValue

This bytecode is nearly the same as the method but without local variables. This should be fairly simple to understand.
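To make the stack behaviour concrete, here is a minimal sketch of an interpreter for exactly this listing. It is written in TypeScript so that it runs standalone. The opcode names mirror the listing above; the `Instr` and `Line` types and the `run` function are assumptions made up for illustration and are not how the Flash Player or Taas work internally.

```typescript
// Hypothetical mini stack machine; only the opcode names come from the listing.
type Instr =
  | { op: "PushDouble"; value: number }
  | { op: "PushTrue" }
  | { op: "IfTrue"; target: string }
  | { op: "Jump"; target: string }
  | { op: "Add" }
  | { op: "ReturnValue" };

interface Line { label?: string; instr: Instr }

const program: Line[] = [
  { instr: { op: "PushDouble", value: 1.0 } },
  { instr: { op: "PushTrue" } },
  { instr: { op: "IfTrue", target: "L0" } },
  { instr: { op: "PushDouble", value: 2.0 } },
  { instr: { op: "Jump", target: "L1" } },
  { label: "L0", instr: { op: "PushDouble", value: 3.0 } },
  { label: "L1", instr: { op: "Add" } },
  { instr: { op: "ReturnValue" } },
];

function run(code: Line[]): number {
  // Resolve label names to instruction indices once, up front.
  const labels: { [name: string]: number } = {};
  code.forEach((line, index) => {
    if (line.label !== undefined) labels[line.label] = index;
  });
  const jumpTo = (target: string): number => {
    const index = labels[target];
    if (index === undefined) throw new Error("unknown label " + target);
    return index;
  };

  const stack: Array<number | boolean> = [];
  const popNumber = (): number => {
    const value = stack.pop();
    if (typeof value !== "number") throw new Error("expected a number on the stack");
    return value;
  };

  let pc = 0;
  while (pc < code.length) {
    const line = code[pc];
    if (line === undefined) break;
    const instr = line.instr;
    pc++;
    switch (instr.op) {
      case "PushDouble": stack.push(instr.value); break;
      case "PushTrue": stack.push(true); break;
      // IfTrue pops the condition and only jumps when it is true.
      case "IfTrue": if (stack.pop() === true) pc = jumpTo(instr.target); break;
      case "Jump": pc = jumpTo(instr.target); break;
      case "Add": {
        const right = popNumber();
        const left = popNumber();
        stack.push(left + right);
        break;
      }
      case "ReturnValue": return popNumber();
    }
  }
  throw new Error("program ended without ReturnValue");
}

const result = run(program); // 1.0 + 3.0, because the IfTrue branch is taken
```

Note that the operands of Add are whatever happens to be on the stack when execution reaches L1. Nothing in the instruction itself says where they came from.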

When converting bytecode to Taas, its control flow graph is converted into a control flow graph of Taas expressions. I do not want to go into much detail about how the stack-based code is converted, but here is the graph before and after the transformation from bytecode to Taas. The two look quite similar, but there is a major difference: the constant values in the Taas graph are not values pushed onto a stack but assignments to virtual variables that exist only in theory. You can also see that the Add from the bytecode has known operands and a known type in the Taas version. Since it is not yet known whether 2.0 or 3.0 has been used, there is a Φ function that says “This value is either 2.0 or 3.0 depending on the path taken at runtime”.

As I said, this code can be optimized much better than stack-based code. The graph is very redundant. There are three utilities to compact and simplify the current graph of Taas expressions. The algorithm was developed for Java bytecode but works very well with ActionScript too. The concept is quite simple: perform copy propagation, constant folding and dead code elimination until the graph stops changing. Applying these techniques to the Taas graph yields the following results.

Copy propagation:

Most of the theoretical variables have been eliminated using copy propagation.

Constant folding:

Known constants have been replaced. Even the if condition is no longer needed and the false branch has been removed. The dead code elimination will clean up this code afterwards.

Dead code elimination:

Dead code elimination has removed the dead Jump statement and also the value 2.0 from the Φ expression. Afterwards constant folding replaced the Φ expression with its one remaining constant value. Then another iteration of constant folding replaced the Add expression with the constant value 4.0. Finally copy propagation put the result into the Return statement, and voilà.
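The loop of copy propagation, constant folding and dead code elimination can be sketched on a toy graph. The following TypeScript is only an illustration under stated assumptions: the `Expr`, `Term` and `Block` types, the variable names `v0` to `v5` and the block names are all hypothetical, and the passes are far simpler than whatever Taas really does (for example, empty blocks are never merged).

```typescript
// Hypothetical mini IR in SSA style; none of these names come from Taas.
type Expr = { kind: "const"; value: number | boolean } | { kind: "ref"; name: string }
  | { kind: "add"; left: Expr; right: Expr } | { kind: "phi"; inputs: { from: string; value: Expr }[] };
type Term = { kind: "jump"; target: string } | { kind: "return"; value: Expr }
  | { kind: "branch"; cond: Expr; ifTrue: string; ifFalse: string };
interface Block { name: string; assigns: { target: string; expr: Expr }[]; term: Term }

// Rebuild the program, passing every expression (children first) through f.
function mapExprs(blocks: Block[], f: (e: Expr, block: string) => Expr): Block[] {
  return blocks.map((b): Block => {
    const walk = (e: Expr): Expr => {
      if (e.kind === "add") return f({ kind: "add", left: walk(e.left), right: walk(e.right) }, b.name);
      if (e.kind === "phi")
        return f({ kind: "phi", inputs: e.inputs.map(i => ({ from: i.from, value: walk(i.value) })) }, b.name);
      return f(e, b.name);
    };
    const t = b.term;
    const term: Term = t.kind === "branch" ? { ...t, cond: walk(t.cond) }
      : t.kind === "return" ? { kind: "return", value: walk(t.value) } : t;
    return { name: b.name, assigns: b.assigns.map(a => ({ target: a.target, expr: walk(a.expr) })), term };
  });
}

// Copy propagation: replace a reference by its definition if that is a constant or a reference.
function copyPropagation(blocks: Block[]): Block[] {
  const defs: { [name: string]: Expr | undefined } = Object.create(null);
  for (const b of blocks) for (const a of b.assigns) defs[a.target] = a.expr;
  return mapExprs(blocks, e => {
    const def = e.kind === "ref" ? defs[e.name] : undefined;
    return def !== undefined && (def.kind === "const" || def.kind === "ref") ? def : e;
  });
}

// Constant folding: const + const, one-input phis, branches on a constant condition.
function constantFolding(blocks: Block[]): Block[] {
  const folded = mapExprs(blocks, e => {
    if (e.kind === "add") {
      const l = e.left, r = e.right;
      if (l.kind === "const" && r.kind === "const" && typeof l.value === "number" && typeof r.value === "number")
        return { kind: "const", value: l.value + r.value };
    }
    const only = e.kind === "phi" && e.inputs.length === 1 ? e.inputs[0] : undefined;
    return only !== undefined ? only.value : e;
  });
  return folded.map((b): Block => {
    const t = b.term;
    if (t.kind !== "branch" || t.cond.kind !== "const") return b;
    return { ...b, term: { kind: "jump", target: t.cond.value ? t.ifTrue : t.ifFalse } };
  });
}

// Dead code elimination: unreachable blocks, phi inputs on vanished edges, unread variables.
function deadCodeElimination(blocks: Block[]): Block[] {
  const reachable: string[] = [];
  const edges: string[] = []; // "from->to" for every edge reachable from the entry block
  const visit = (name: string): void => {
    if (reachable.indexOf(name) >= 0) return;
    reachable.push(name);
    for (const b of blocks.filter(x => x.name === name)) {
      const t = b.term;
      const targets = t.kind === "jump" ? [t.target] : t.kind === "branch" ? [t.ifTrue, t.ifFalse] : [];
      for (const target of targets) { edges.push(name + "->" + target); visit(target); }
    }
  };
  const entry = blocks[0];
  if (entry !== undefined) visit(entry.name);
  const live = blocks.filter(b => reachable.indexOf(b.name) >= 0);
  const pruned = mapExprs(live, (e, block) => e.kind !== "phi" ? e
    : { kind: "phi", inputs: e.inputs.filter(i => edges.indexOf(i.from + "->" + block) >= 0) });
  const used: string[] = [];
  mapExprs(pruned, e => { if (e.kind === "ref") used.push(e.name); return e; });
  return pruned.map((b): Block => ({ ...b, assigns: b.assigns.filter(a => used.indexOf(a.target) >= 0) }));
}

// Run the three passes until the graph stops changing.
function optimize(blocks: Block[]): Block[] {
  for (let current = blocks; ; ) {
    const next = deadCodeElimination(constantFolding(copyPropagation(current)));
    if (JSON.stringify(next) === JSON.stringify(current)) return next;
    current = next;
  }
}

const ref = (name: string): Expr => ({ kind: "ref", name });
const val = (value: number | boolean): Expr => ({ kind: "const", value });
const example: Block[] = [
  { name: "entry", assigns: [{ target: "v0", expr: val(1.0) }, { target: "v1", expr: val(true) }],
    term: { kind: "branch", cond: ref("v1"), ifTrue: "isTrue", ifFalse: "isFalse" } },
  { name: "isFalse", assigns: [{ target: "v2", expr: val(2.0) }], term: { kind: "jump", target: "join" } },
  { name: "isTrue", assigns: [{ target: "v3", expr: val(3.0) }], term: { kind: "jump", target: "join" } },
  { name: "join", assigns: [
      { target: "v4", expr: { kind: "phi", inputs: [{ from: "isFalse", value: ref("v2") }, { from: "isTrue", value: ref("v3") }] } },
      { target: "v5", expr: { kind: "add", left: ref("v0"), right: ref("v4") } }],
    term: { kind: "return", value: ref("v5") } },
];
const optimized = optimize(example);
```

For this input the loop settles after a handful of rounds: the `isFalse` block is gone, no assignments are left, and the `join` block simply returns the constant 4. Detecting the fixed point by comparing JSON snapshots is a shortcut that keeps the sketch short; a real implementation would more likely track whether a pass changed anything.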

This result may have seemed quite obvious from the beginning and you may ask who writes such code. Probably nobody. But once you start inlining methods this makes a lot of sense. A lot of preconditions are known in that case and unnecessary branches can be removed. Since inlining methods bloats up the code it is very important to compact it afterwards as much as possible. I hope you see now how interesting all of this might be in the future.

Massive amounts of 3D particles without Alchemy and PixelBender

As a response to Ralph Hauwert’s article I created a little example of what can be achieved using plain ActionScript 3 syntax. Ralph has put up a great example of how you can wire things like Alchemy, ActionScript and PixelBender together to achieve an astonishing result.

However I asked myself if it is possible to achieve the same result without making use of Alchemy, PixelBender or bytecode manipulation. I asked the guys sitting with me in the office to compare the results and they were unfortunately very different on various machines. Sometimes my version is faster, sometimes the version from Ralph is faster and sometimes they are about the same.

Now there are some very important things to note here, and I am surprised that I got so close. Since Ralph is making use of PixelBender, the number crunching is done on multiple cores. That is not possible with the ActionScript version, where the number crunching is the real bottleneck. And there is another big difference: Ralph’s calculations are done in 32-bit precision while I am using 64-bit precision. Therefore I am happy with the result, and it shows that using pure ActionScript is still a good choice.

In order to optimize the code I used a linked list for the particles and minimized the comparisons between different data types. Here is the result.
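The linked-list idea can be sketched as follows. This is TypeScript rather than the ActionScript of the demo, and the `Particle` fields and function names are assumptions for illustration, not taken from the actual sources. The point is that each particle carries a typed `next` reference, so the update loop walks the list without array indexing or lookups.

```typescript
// Hypothetical particle node; field names are assumptions, not from the demo.
class Particle {
  x = 0; y = 0; z = 0;
  vx = 0; vy = 0; vz = 0;
  next: Particle | null = null;
}

// Build a singly linked list of `count` particles and return its head.
function createParticles(count: number): Particle | null {
  let head: Particle | null = null;
  for (let i = 0; i < count; i++) {
    const p = new Particle();
    p.next = head; // prepend, so no tail pointer is needed
    head = p;
  }
  return head;
}

// Advance every particle by `dt` and return how many were visited.
function updateParticles(head: Particle | null, dt: number): number {
  let visited = 0;
  for (let p = head; p !== null; p = p.next) {
    p.x += p.vx * dt;
    p.y += p.vy * dt;
    p.z += p.vz * dt;
    visited++;
  }
  return visited;
}
```

The second optimization, avoiding comparisons between different data types, is specific to ActionScript (for instance comparing an int loop counter against a Number) and has no direct equivalent in this sketch.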

Sources:

Sneak peek: PixelBender Development Tool

PBDT

Because the PixelBender Toolkit editor is very annoying, I started writing my own some weeks ago. All I want is a more comfortable way to write shaders in PixelBender, and I think I have achieved that already. But once you start doing something, you want to implement even more features until you are really satisfied.

I am really thankful that Arne from FDT was answering a lot of my questions regarding the Eclipse framework. That is also why I could implement so many features already in a short amount of time.

The current features include:

  • Syntax highlighting
  • Matching bracket highlighting that also handles < and > for metadata correctly
  • Smart auto-indent strategy
  • Context sensitive auto-completion
  • Auto-insert of closing brace etc.
  • Documentation for built-in methods on hover

There are two really important issues that I would still like to address. One is the ability to highlight code semantically. Defined constants, for instance, should look different in code. The other is very basic live error highlighting. An outline would also be nice, as would some content-assist features like “Create parameter …” etc.
The only downside currently is that you cannot have a live preview of your shader in Eclipse. Maybe there will be a way of doing this; I am not sure yet. Please tell me what you think and leave feature requests in the comments. I cannot guarantee anything, but I will try to implement as much as possible in a reasonable timeframe.

Rasselbock loves the Lemur

This is what you can get when you combine a Lemur with the AudioTool and map some controls to the Rasselbock. I really hope that we can release the MIDI feature to the public soon because it is just so much fun!

AudioTool On A TouchScreen

Coming into the office and seeing a video like this makes me really happy and proud.

Tween engine comparison

I wrote a while ago about our tween engine at Hobnox but I did not post any performance demos.

Here are the results for 1000 DisplayObjects with manipulation on their scaleX, scaleY, alpha, x, y and rotation properties. It is also very important that you take a look at the memory behaviour. A click starts the tweening.

As you can see, it makes a lot of sense to stay type-safe and to manage the memory you are using on your own.
I will talk about those concepts in my next session, AudioTool’s Private Parts, in Brighton and Boston. Since optimization on a code level is trivial, the main performance boost is achieved by rethinking algorithms, structures and concepts. I guess one of the most interesting topics will be the optimization of our cable solver, which was running in O(n^2) (really!), then O(n(n+1)/2), and could be reduced to a rare O(n(n-4)/2) worst case.

PixelBender Runtime Compilation

Tinic posted today his PixelBender assembler and disassembler. This makes me happy, because now I can post an experiment I could not show for a while.

If you know PixelBender, then you know that you cannot create loops. What you could do is unroll all constant loops with a fixed length. If you know simple convolution filters like a blur, you know that you need an xy-loop, and you know it should not be possible with PixelBender at all. Let me prove you wrong: have a look at dynamic loop unrolling with PixelBender (be careful with high values!).

I built a library to assemble and disassemble PixelBender kernels at runtime. I also wrapped it in a high-level API, so basically you can create a new kernel by doing var kernel: Kernel = new Kernel();. Then, when you need your shader as a ByteArray, you simply call kernel.compile(). There are still some glitches here and there, but I hope that I can release the source code pretty soon.
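To illustrate what unrolling an xy-loop at runtime means, here is a small TypeScript sketch for a box blur. It does not use the library described above, whose API beyond `Kernel` and `compile()` is not shown in this post. The `Tap` type and both function names are hypothetical, and the emitted lines are kernel-language-style text for illustration only, not the bytecode the assembler works with.

```typescript
// One unrolled sample: offset relative to the current pixel and its weight.
interface Tap { dx: number; dy: number; weight: number }

// Flatten the xy-loop of a box blur with the given radius into a list of taps.
function unrollBoxBlur(radius: number): Tap[] {
  const size = 2 * radius + 1;
  const weight = 1 / (size * size); // every sample contributes equally
  const taps: Tap[] = [];
  for (let dy = -radius; dy <= radius; dy++) {
    for (let dx = -radius; dx <= radius; dx++) {
      taps.push({ dx, dy, weight });
    }
  }
  return taps;
}

// Emit one straight-line statement per tap (illustrative text, not assembler input).
function emitKernelBody(taps: Tap[]): string[] {
  return taps.map(t =>
    `dst += sampleNearest(src, outCoord() + float2(${t.dx}.0, ${t.dy}.0)) * ${t.weight};`);
}

const body = emitKernelBody(unrollBoxBlur(1)); // 9 statements for a 3x3 blur
```

The number of generated statements is (2r + 1)^2 for radius r, so it grows quadratically. That is one plausible reason to be careful with high values.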

AS3C — take a look inside

I started working on AS3C at the end of last year. After a quick prototype the development stagnated, and I only added several fixes and tests to the code. Basically I started AS3C as a complete C# newcomer, and because of that the code is very ugly.

Due to the fact that I do not have much free time to continue developing AS3C I think it is the right time to release the source-code on the one hand and to let people experiment with it on the other hand.

You can either download the sources and build AS3C manually (you will need zlib.net) or download a binary from trunk/bin/.

When using AS3C you will need the ActionScript from the SVN. Remember that you write real ActionScript code, which gets translated by AS3C. There is also one undocumented and very experimental feature. If you run as3c.exe -optimize main.swf you could get some speed improvements if you have heavy loops using the Math class. But it could also destroy the SWF, so do not forget to make a backup :o)

ActionScript 3 Vorbis Encoder

Vorbis Encoding Times

As André already mentioned, we have finally got an Ogg Vorbis encoder in ActionScript 3. André has already written a lot about the benefits.

Now what I really like about the encoder is the way we could reduce the encoding time by ~50%! I know two implementations of Ogg Vorbis. One is written in C and the other in Java. What we did first was just to get it working, and it looked pretty much like the C/Java version. Then we started optimizing the code, always comparing the encoding results to reference files. I am quite happy because on my machine we reduced the encoding time from 32 seconds to 16.5 seconds with simple optimizations and tricks.

Where to go from here? Of course we think the best we can do is to open-source the encoder. But there are a couple of other things on my mind. I started modifying Tamarin and added two functions to the Math class. They convert a Number to its binary IEEE 754 single-precision (32-bit) representation and vice versa. With those two functions we could get huge speed improvements for performance-hungry tasks like this one. I hope Adobe will not forget about this as well. There currently seems to be a general problem in trying to support the simple Flash user on the one hand and the “Flash explorer” on the other hand.

I know you may ask what this feature could be good for, but doing floating point math using bitwise operations is a killer when it comes to performance. We are currently not allowed to do this and there is no reason for that.
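For readers wondering what such a pair of functions does, here is a minimal TypeScript sketch of the same idea using a `DataView`. The names `floatToBits`, `bitsToFloat` and `absViaBits` are assumptions; the post does not say what the two functions added to Tamarin's Math class are called.

```typescript
// Shared 4-byte scratch buffer used to reinterpret the same bits as float or integer.
const scratch = new DataView(new ArrayBuffer(4));

// Number -> bit pattern of its IEEE 754 single-precision representation.
function floatToBits(value: number): number {
  scratch.setFloat32(0, value); // rounds the 64-bit Number to 32-bit precision
  return scratch.getUint32(0);
}

// Bit pattern -> Number holding the corresponding single-precision value.
function bitsToFloat(bits: number): number {
  scratch.setUint32(0, bits);
  return scratch.getFloat32(0);
}

// Example of floating point math done with a bitwise operation:
// clearing the sign bit (bit 31) yields the absolute value.
function absViaBits(value: number): number {
  return bitsToFloat(floatToBits(value) & 0x7fffffff);
}

const oneBits = floatToBits(1.0); // 0x3f800000
```

With access to the raw bits, operations like taking an absolute value, testing the sign or extracting the exponent become single integer instructions.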

Sound.extract() rocks!

When Tinic Uro was implementing the dynamic audio features he was also giving us a very nice present called Sound.extract(). And it is simply awesome!

I wrote a simple experiment last night using the popforge library. Yes, you can still use it with the new features, and it is a great help. I just had to convert the sample data into a format for popforge.

The experiment is a simple flanger applied to an MP3. André wrote the flanger about a year ago but it is still fun listening.

The source code will also show you how to loop sounds using Sound.extract(). It is a very safe way, so that you always get as much data as you need.
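The looping idea can be sketched independently of Flash. The TypeScript below is not the code from the experiment: `SampleSource` is a hypothetical stand-in modeled on the shape of Sound.extract(target, length, startPosition), which returns how many samples were actually written, and `ArraySource` is an in-memory fake used only to make the sketch runnable.

```typescript
// Hypothetical stand-in for the one method needed, modeled on Sound.extract().
interface SampleSource {
  // Appends up to `length` samples starting at `startPosition` to `target`
  // and returns how many samples were actually written.
  extract(target: number[], length: number, startPosition: number): number;
}

// In-memory fake source so the sketch runs without Flash.
class ArraySource implements SampleSource {
  private samples: number[];
  constructor(samples: number[]) { this.samples = samples; }
  extract(target: number[], length: number, startPosition: number): number {
    const chunk = this.samples.slice(startPosition, startPosition + length);
    for (const sample of chunk) target.push(sample);
    return chunk.length;
  }
}

// Fill `target` with exactly `length` samples, wrapping to the start of the
// source whenever it runs out. Returns the position to continue from.
function extractLooped(source: SampleSource, target: number[], length: number, position: number): number {
  let remaining = length;
  let pos = position;
  while (remaining > 0) {
    const read = source.extract(target, remaining, pos);
    remaining -= read;
    pos += read;
    if (remaining > 0) {
      // Fewer samples than requested means the end was reached.
      if (read === 0 && pos === 0) throw new Error("source contains no samples");
      pos = 0; // start over from the beginning of the sound
    }
  }
  return pos;
}
```

Because the loop keeps asking for whatever is still missing, the caller always receives the full amount of data it requested, no matter where the loop point falls.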

There is only one problem right now. I wanted to use the new FileReference features so that you can load your own MP3s into the flanger. But unfortunately there is currently no easy way to do this. You would have to load the bytes, then create a SWF at runtime and inject the MP3. Then you could load that SWF with Loader.loadBytes() and once that is complete you can extract the sound. I thought doing that would be a little bit too much for a small demo :)