Massive amounts of 3D particles without Alchemy and PixelBender

In response to Ralph Hauwert’s article, I created a small example of what can be achieved using plain ActionScript 3. Ralph has put up a great example of how you can wire Alchemy, ActionScript and PixelBender together to achieve an astonishing result.

However, I asked myself whether the same result is possible without Alchemy, PixelBender or bytecode manipulation. I asked the guys sitting with me in the office to compare the two versions, and unfortunately the results varied a lot between machines. Sometimes my version is faster, sometimes Ralph’s is faster, and sometimes they are about the same.

There are some very important things to note here, and I am surprised that I got so close. Since Ralph is using PixelBender, his number crunching runs on multiple cores. That is not possible in the ActionScript version, where the number crunching is the real bottleneck. There is another big difference: Ralph’s calculations are done in 32-bit precision, while I am using 64-bit precision. I am therefore happy with the result, and it shows that pure ActionScript is still a good choice.
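To give a feeling for what the 32-bit versus 64-bit difference means, here is a minimal sketch. It is written in TypeScript rather than ActionScript 3 so it can be run directly, and it is not taken from either demo. It assumes the effect is a Lorenz attractor (as the trackbacks describe it) and uses a simple Euler step with hypothetical constants and a hypothetical step size. `Math.fround` is applied after every operation to emulate 32-bit floats, while the untouched path behaves like an AS3 `Number`.

```typescript
// A rounding function applied after every arithmetic operation.
type Round = (v: number) => number;
type Vec3 = { x: number; y: number; z: number };

// One explicit Euler step of the Lorenz system (assumed constants 10, 28, 8/3).
function lorenzStep(p: Vec3, dt: number, round: Round): Vec3 {
  const sigma = 10;
  const rho = 28;
  const beta = round(8 / 3);
  const dx = round(sigma * round(p.y - p.x));
  const dy = round(round(p.x * round(rho - p.z)) - p.y);
  const dz = round(round(p.x * p.y) - round(beta * p.z));
  return {
    x: round(p.x + round(dx * dt)),
    y: round(p.y + round(dy * dt)),
    z: round(p.z + round(dz * dt)),
  };
}

// Advance one point for `steps` iterations at the given precision.
function simulate(steps: number, round: Round): Vec3 {
  let p: Vec3 = { x: round(0.1), y: 0, z: 0 };
  const dt = round(0.01); // hypothetical step size
  for (let i = 0; i < steps; i++) {
    p = lorenzStep(p, dt, round);
  }
  return p;
}

const asDouble: Round = (v) => v;   // 64-bit, like an AS3 Number
const asFloat: Round = Math.fround; // 32-bit, like a shader float

const d = simulate(5000, asDouble);
const f = simulate(5000, asFloat);
console.log("64-bit:", d);
console.log("32-bit:", f);
```

Because the system is chaotic, the two runs drift apart over time. For a visual effect that hardly matters, but it does mean the two versions are not computing exactly the same thing.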

In order to optimize the code I used a linked list for the particles and minimized the comparisons between different data types. Here is the result.
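For readers who have not seen the linked list approach before, the following is a rough sketch of the idea, written in TypeScript instead of ActionScript 3 so it can be run as-is. It is not the code from the sources. The `Particle` class, the projection constants and the `Uint32Array` (standing in for a `Vector.<uint>` pixel buffer) are all assumptions made for illustration.

```typescript
// Hypothetical particle node: each particle points to its successor,
// so the render loop never needs an index or a length lookup.
class Particle {
  next: Particle | null = null;
  constructor(public x: number, public y: number, public z: number) {}
}

// Build a singly linked list of random particles and return its head.
function buildList(count: number): Particle | null {
  let head: Particle | null = null;
  for (let i = 0; i < count; i++) {
    const p = new Particle(
      Math.random() * 2 - 1,
      Math.random() * 2 - 1,
      Math.random() * 2 - 1
    );
    p.next = head;
    head = p;
  }
  return head;
}

// Project every particle and plot it as a single pixel.
// Returns the number of pixels that landed inside the buffer.
function render(
  head: Particle | null,
  buffer: Uint32Array,
  width: number,
  height: number
): number {
  const cx = width >> 1;
  const cy = height >> 1;
  const focal = 300; // assumed focal length for the perspective divide
  let plotted = 0;
  let p: Particle | null = head;
  while (p !== null) {
    const scale = focal / (focal + p.z * 100);
    const sx = (cx + p.x * 100 * scale) | 0;
    const sy = (cy + p.y * 100 * scale) | 0;
    if (sx >= 0 && sx < width && sy >= 0 && sy < height) {
      buffer[sy * width + sx] = 0xffffffff;
      plotted++;
    }
    p = p.next;
  }
  return plotted;
}

const pixels = new Uint32Array(550 * 400);
const count = render(buildList(300000), pixels, 550, 400);
console.log("plotted", count, "particles");
```

All coordinates stay plain numbers and the buffer holds only unsigned integers, which is the spirit of keeping comparisons between different data types to a minimum.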

Sources:

43 Comments

  1. Posted Apr 3, 2009 at 4:37 pm | Permalink

    Very impressive! How exactly did you manage this without the new ByteArray opcodes? I thought the performance cost of 300k writeFloat() was too much as demonstrated in Ralph’s code?

    I’m working on a haXe version to get around the fiddly Alchemy bit, and hope to have something to show and tell soon.

  2. Posted Apr 3, 2009 at 4:40 pm | Permalink

    Sorry make that a writeUnsignedInt()

  3. Posted Apr 3, 2009 at 4:41 pm | Permalink

    Sorry, I forgot to add the sources which are now available. I am not using a ByteArray at all. Since there is no PixelBender and Alchemy involved I used a Vector.<uint> (which sucks) and a linked list of particles.

  4. Posted Apr 3, 2009 at 4:41 pm | Permalink

    Great example. I hoped my previous post on Alchemy (the heresy one) already made it clear enough how I feel about this. It’s merely a tool that makes it easier to reach that speed on the same VM. I wish the native AS3 compiler was just better.

  5. Posted Apr 3, 2009 at 4:47 pm | Permalink

    And… how are you doing it? What’s the difference between Ralph’s AS3 version of the effect and yours?

  6. Willem
    Posted Apr 3, 2009 at 4:59 pm | Permalink

    Very fast on my P4 here (Northwood 2.4GHz). Ralph’s example freezes Firefox, while yours is very smooth and ‘light’ on the cpu. How odd?

  7. Posted Apr 3, 2009 at 5:02 pm | Permalink

    Woh, lots of comments while I posted mine. Ignore mine :)

  8. Posted Apr 3, 2009 at 5:11 pm | Permalink

    On a macmini core2duo 2ghz + linux yours is 2fps faster.

  9. Posted Apr 3, 2009 at 5:13 pm | Permalink

    Yes it is faster for me too — and I am using Linux as well :)

  10. Posted Apr 3, 2009 at 5:39 pm | Permalink

    At first I was amazed by the beauty of the pattern. Then I thought, ‘It’s easy to display a bunch of particles. The trick is moving them.’ Then I went to close the tab and dragged my mouse across the SWF. Twirl…

    Now I’m impressed. The particles spin with zero lag. Nice work and thanks for the code.

  11. Posted Apr 3, 2009 at 6:34 pm | Permalink

    Very interesting dude, thanks ;)

  12. Posted Apr 3, 2009 at 6:43 pm | Permalink

    Cool! The fps did not change when switching from low to high quality; I got 22/32 here ;D

  13. vitaLee
    Posted Apr 3, 2009 at 7:29 pm | Permalink

    on my machine core 2 duo 1.7Ghz(Win XP, Firefox 3) your version runs at 23fps, while Ralph’s alchemy/pixelbender version runs at 13fps.
    great experiment dude!!!

  14. Posted Apr 3, 2009 at 7:46 pm | Permalink

    On my machine, both versions run with the same FPS. Great job!

  15. Posted Apr 4, 2009 at 9:15 am | Permalink

    20-21 fps running on battery on macbook with 2.1 GHz core 2 duo. Damn!

  16. gravitron
    Posted Apr 4, 2009 at 10:04 am | Permalink

    Perfect performance on my MacBook Pro 2.2 GHz Intel Core 2Duo, OSX 10.5

  17. Posted Apr 4, 2009 at 5:08 pm | Permalink

    I’m getting 30/32 FPS on my LG Intel Core 2 Duo 2.53 GHz – sweet!

  18. Posted Apr 4, 2009 at 9:10 pm | Permalink

    Yours is 14/15 FPS running on an AMD Sempron 1.8GHz.

  19. Posted Apr 4, 2009 at 9:21 pm | Permalink

    Hmm I am getting nothing less than solid 30/32 fps in chrome on a T61p with the centrino cPro. Maybe it’s chrome?

    I’m going to check out these sources, did you use papervision at all for the rendering? I’ll probably post an entry on my blog on this one, keep up the great work!

    DjacK

  20. Posted Apr 5, 2009 at 12:29 pm | Permalink

    There is no use of PV3D in this example. Most of the techniques used can be found at http://wiki.joa-ebert.com/

  21. Posted Apr 5, 2009 at 5:05 pm | Permalink

    ~24/32fps … and it’s kinda cold-freezing my browser when running for a while… (Intel Core 2 CPU 2.00 GHz)

  22. Posted Apr 6, 2009 at 5:00 pm | Permalink

    I do get better framerate results with the pure AS3 version. This one runs very stable at 20/30 while the PB+alchemy one nearly crashes my browser.

    Is PB causing the trouble then? Maybe it also behaves differently when PB runs asynchronously instead of synchronously?

  23. Posted Apr 6, 2009 at 6:11 pm | Permalink

    The main benefit of the AS3 version is that data does not have to be pushed back and forth between the AVM and PixelBender all the time.

  24. oos
    Posted Apr 6, 2009 at 7:06 pm | Permalink

    Yeah this one is much faster and smoother on my machine. Nice example!

  25. Posted Apr 7, 2009 at 7:27 am | Permalink

    I have 6 to 10 fps here, which is far faster than Ralph’s.

  26. Matt Rix
    Posted Apr 7, 2009 at 7:37 pm | Permalink

    Yours is definitely faster(25 fps vs 17fps) and it just feels nicer/smoother too. I’m using Opera on Windows using Flash Player 10,0,12,36

  27. Posted Apr 8, 2009 at 5:46 am | Permalink

    Very interesting results … on a Core2Duo @3GHz …

    Ralph’s version: 22-25/60 fps, CPU usage 60-70% (because he uses multi core).

    Your version: 32/32 fps, CPU usage: 30-50%.

    Overall, your version runs more smoothly here.

  28. Posted Apr 8, 2009 at 9:47 am | Permalink

    Great job, Joa.

    Ralph’s: 14fps
    Yours: 22fps

    on an AMD 2.0 GHz Dual Core.

  29. Roland
    Posted Apr 21, 2009 at 6:21 pm | Permalink

    How can it be that your version runs smoothly (21 fps), but when I compile exactly your sources (Flex 4) I only reach 3 fps? How did you compile it?

  30. Posted Apr 25, 2009 at 12:20 am | Permalink

    500.000 Particle experiment:
    http://www.yagizgurgul.com/blog/2009/04/24/500000-particle/

  31. Posted May 6, 2009 at 8:06 pm | Permalink

    Very nice. I really like the way you loop through particles.

    It makes me want to run benchmark speed tests against standard for(var i:int = 0; i < particles.length; ++i){} loops.

  32. Posted May 8, 2009 at 8:57 am | Permalink

    Very nice!

    Pure AS3 is running faster for me. The Alchemy version crashed FireFox the first time it loaded.

  33. Hudson Ansley
    Posted May 28, 2009 at 12:18 am | Permalink

    This was a fascinating read, thanks!
    I thought this might make a good example of how much faster linked lists are than Arrays, but I did a version that uses an array instead of the list and it was only about 1 fps slower (31 instead of 32 fps on my machine) and I’m not even typing the Array values…
    Does that sound odd to you? I know haXe has a fastList which uses a linked list structure because it is supposed to be much faster… kind of puts a crimp in a talk I was going to give on optimization… :-/
    I’d like to hear any insights you might have about this.

  34. Posted Jun 22, 2009 at 10:34 pm | Permalink

    here it is in haxe for those interested :)

    http://webr3.org/blog/haxe/flash-10-massive-amounts-of-3d-particles-with-haxe/

  35. Nightrider
    Posted Jul 8, 2009 at 11:52 am | Permalink

    I tested both movies on an old PC. The AS3 version was much faster, at around 17fps, while Ralph’s was struggling to reach 9fps. PC specs: AMD Athlon 64 2400MHz, 1536 MB (DDR SDRAM)

  36. Patrick
    Posted Sep 6, 2009 at 12:24 pm | Permalink

    If you replace

    while( --n > -1 ) buffer[ n ] = 0x000000;

    with a simple BitmapData.fillRect and another BitmapData.getVector() it’s actually faster. What’s even better is that performance won’t depend that much on the size of the BitmapData anymore but (almost) only on the number of particles.

  37. Posted Sep 7, 2009 at 10:37 am | Permalink

    Patrick: In that case you will create a new Vector the size of the BitmapData each frame. This means that every so often the GC will kick in and have to clean up a lot …

  38. Patrick
    Posted Sep 8, 2009 at 10:08 am | Permalink

    That’s what I thought too. And to my surprise, the framerate really is about 2-3fps higher and stays very stable. The GC doesn’t have that much to do in the rest of my stuff so maybe that’s why it’s okay?

  39. Posted Sep 18, 2009 at 7:32 am | Permalink

    I found that using two vectors shaves off a few milliseconds:

    private const _empty: Vector.<uint> = new Vector.<uint>( 550 * 400, true );

    I fill this with the 0xFFFFFF when the class is instantiated. Then in the enter frame method I clone the vector:

    var buffer: Vector.<uint> = _empty.slice();

    I’ve left it running for some time, and don’t seem to see any issue with GC.

  40. Posted Sep 18, 2009 at 7:34 am | Permalink

    Sorry, that should have been “0x000000” not “0xFFFFFF”.

  41. Cardin
    Posted Jan 31, 2010 at 3:34 pm | Permalink

    13/32 on your version, very smooth I like it.

    10/60 on Ralph’s. A bit of a lag. Nearly froze my Firefox – everything but that tab was not working. Thankfully I had Fire Gestures to close the tab, phew!

    This is a real bummer for Pixel Bender, I’d really thought we could play with shaders in Flash. =(

  42. RiaanWest
    Posted Jun 13, 2012 at 5:35 pm | Permalink

    More than 3 years later, and I’m using your method to do a Kinect-driven particle cloud app :)

    I have one question, and hopefully you’ll see this:

    Is there any way to increase the particle size when they come closer to the “viewer”?

    I would really appreciate your help!

    Thanks

  43. Jon Deo
    Posted Oct 31, 2012 at 1:47 am | Permalink

    @Riaan
    Possibly too late …
    Short answer: No

    This would heavily impact performance, since at the moment each particle is drawn as a pixel.
    Changing the size of a particle depending on its z-value would mean drawing some kind of shape at the position of the particle.
    That would be hard to do with the use of a vector, so you’d have to draw the shape onto the bitmap, making everything that speeds up this version obsolete.
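The buffer clearing strategies discussed in comments 36 to 40 can be put side by side in a small sketch. It is written in TypeScript with a `Uint32Array` standing in for the `Vector.<uint>` buffer, so it says nothing about how fast each variant is in the Flash Player. The function names are made up for illustration, and the third variant (copying into an existing buffer) is an extra option that nobody in the thread proposed.

```typescript
const WIDTH = 550;
const HEIGHT = 400;

// Strategy 1: the reverse loop quoted in comment 36.
function clearWithLoop(buffer: Uint32Array): void {
  let n = buffer.length;
  while (--n > -1) buffer[n] = 0x000000;
}

// Strategy 2: keep an empty template around and clone it each frame,
// as described in comment 39. Every call allocates a new buffer,
// which is the garbage collection concern raised in comment 37.
const EMPTY = new Uint32Array(WIDTH * HEIGHT); // zero-filled on creation
function freshFromTemplate(): Uint32Array {
  return EMPTY.slice();
}

// Strategy 3 (not from the thread): copy the template into the
// existing buffer, so nothing is allocated per frame.
function clearByCopy(buffer: Uint32Array): void {
  buffer.set(EMPTY);
}

// Usage: start each frame from a clean buffer, whichever way is chosen.
const frame = new Uint32Array(WIDTH * HEIGHT);
frame.fill(0xffffff);      // pretend the previous frame drew something
clearWithLoop(frame);      // or: clearByCopy(frame)
const cloned = freshFromTemplate();
console.log(frame.length, cloned.length);
```

Which of these wins depends on the runtime and on how much garbage the rest of the application produces, so it is worth measuring on the target player rather than assuming.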

15 Trackbacks

  1. [...] Joa Ebert – Massive amounts of 3D particles without Alchemy and Pixelbender [...]

  2. [...] Joa Ebert – Massive amounts of 3D particles without Alchemy and Pixelbender [...]

  3. [...] Joa Ebert also talks about it here with a superb example. He said that he also used it in his tween engine, which was, thanks to all this, faster than Tweenlite or Tweener. [...]

  4. [...] but I have not implemented all instructions of the AVM+ yet. Although I can already transform the 3D lorenz attractor to TAAS for instance and all types are solved [...]

  5. By Miream Explained | Andy Li's Blog on Jul 5, 2009 at 5:21 am

    [...] you may noticed that the program is very similar to the Alchemy particle pusher. In fact, I used Joa Eberts AS3 only version as a base and altered it as I need to overcome the limit of [...]

  6. [...] the inverse square root might be inlined and optimized much better. I did a simple test using the Lorenz attractor from before and replacing the Vector.<uint> buffer with a ByteArray gave a performance boost [...]

  7. By Notes: Loop Optimizations 2 | Course Vector on Sep 8, 2009 at 6:23 pm

    [...] of a difference in speed. The discovery I made after looking into BetweenAS3 and the efforts of Joa Ebert on his AS3 particles I noticed they were using LinkedLists to manage large numbers of objects. [...]

  8. [...] is also a link to Joa Eberts blog where he show an implementation of the strange attractor also in As3 but highly optimized… – it [...]

  9. By Seeing eye god - Mote god - Eye of god on Sep 8, 2010 at 12:16 am

    [...] Massive amounts of 3D particles without Alchemy and PixelBender at 3 Apr 2009. Ralph's calculations are done in 32bit while I am using 64bit precision.. on my machine core 2 duo 1.7Ghz(Win XP, Firefox 3) your version runs at.. Pingback on Apr 6th, 2009 at 11:11 am. 2 More play with Alchemy.blog.joa-ebert.com/…/massive-amounts-of-3d-particles-without-alchemy-and -pixelbender/ – Massive amounts of 3D particles without Alchemy and PixelBender at [...]

  11. By Performance verbessern - Flashforum on Jun 28, 2011 at 5:34 pm

    [...] [...]

  12. [...] PixelBender+Alchemy option (by Ralph Hauwert) http://www.unitzeroone.com/blog/2009/03/18/flash-10-massive-amounts-of-3d-particles-with-alchemy-source-included/ Pure AS3 option (by Joa Ebert) http://blog.joa-ebert.com/2009/04/03/massive-amounts-of-3d-particles-without-alchemy-and-pixelbender… [...]

  13. [...] [...]

  14. By Alchemy 와 속도 | Blog on Nov 19, 2011 at 9:19 pm

    [...] the blog post “of 3D particles with Alchemy” sparked a debate about the speed of Alchemy, but the blog post “Massive amounts of 3D particles without Alchemy and PixelBender” shows that if you use Vector and optimize the code, Actionscript alone can also [...]

  15. [...] and File Streams I’ve read quite a few posts and comments on the internet debating whether or not Alchemy is faster than straight ActionScript, since it all runs within the [...]