Everyone likes examples. So here are three examples using TDSI. The archive includes a ready-to-go FDT project with post-compile ANT tasks configured.
This is the old code of the already optimized attractor using the Memory API instead of a Vector.<uint>.
In this case there exists no Particle class at all and no linked list. The particle information is stored inside the memory as well. Particles are extended to a fourth value so indexing a particle can be done with a simple bitshift which is very fast.
The last example uses float instead of double values for the particles. The framerate stays the same which is really cool because the memory usage drops. Before a particle consisted of four doubles which is a total of 4 * 8b = 32b. In this example each particle takes up only 16b. There the memory difference is 0x4B0000b which is about 4.7mb in total.
And also the first version needs about 20mb on my machine which means about 12mb of RAM are not wasted. Pretty cool when thinking about devices with less memory.
By the way I just stumpled across a bug when using [Embed]. Hopefully it will be easy to fix.
Everyone likes examples. So here are three examples using TDSI. The archive includes a ready-to-go FDT project with post-compile ANT tasks configured.
Example01
This is the old code of the already optimized attractor using the Memory API instead of a Vector.<uint>.
Example02
In this case there exists no Particle class at all and no linked list. The particle information is stored inside the memory as well. Particles are extended to a fourth value so indexing a particle can be done with a simple bitshift which is very fast.
Example03
The last example uses float instead of double values for the particles. The framerate stays the same which is really cool because the memory usage drops. Before a particle consisted of four doubles which is a total of 4 * 8b = 32b. In this example each particle takes up only 16b. There the memory difference is 0x4B0000b which is about 4.7mb in total.
And also the first version needs about 20mb on my machine which means about 12mb of RAM are not wasted. Pretty c
As a response to Ralph Hauwert’s article I created a little example of what can be achieved using plain ActionScript 3 syntax. Ralph has put up a great example of how you can wire things like Alchemy, ActionScript and PixelBender together to achieve an astonishing result.
However I asked myself if it is possible to achieve the same result without making use of Alchemy, PixelBender or bytecode manipulation. I asked the guys sitting with me in the office to compare the results and they were unfortunately very different on various machines. Sometimes my version is faster, sometimes the version from Ralph is faster and sometimes they are about the same.
Now there are some very important things to note here and I am surprised that I got so close. Since Ralph is making use of PixelBender the number crunching is done on multiple cores. Something that is not possible with the ActionScript version which is the real bottleneck. And there is another big difference. Ralph’s calculations are done in 32bit while I am using 64bit precision. Therefore I am happy with the result and it shows that using pure ActionScript is still a good choice.
In order to optimize the code I used a linked list for the particles and minimized the comparisons between different data types. Here is the result.
Sources:
As a response to Ralph Hauwert's article I created a little example of what can be achieved using plain ActionScript 3 syntax. Ralph has put up a great example of how you can wire things like Alchemy, ActionScript and PixelBender together to achieve an astonishing result.
However I asked myself if it is possible to achieve the same result without making use of Alchemy, PixelBender or bytecode manipulation. I asked the guys sitting with me in the office to compare the results and they were unfortunately very different on various machines. Sometimes my version is faster, sometimes the version from Ralph is faster and sometimes they are about the same.
Now there are some very important things to note here and I am surprised that I got so close. Since Ralph is making use of PixelBender the number crunching is done on multiple cores. Something that is not possible with the ActionScript version which is the real bottleneck. And there is another big difference. Ralph's calculations are done in 32bit while I am