- 3D Stuff (9)
- BumpTop (5)
- C/C++ (16)
- Open Source (6)
- Radiant (8)
- Seneca (4)
- Stuff (11)
- Uncategorized (14)
- win32 (11)
- November 10, 2008: Radiant Update
- October 28, 2008: Primary Export: Pain
- September 7, 2008: Results
- September 5, 2008: Multicore Processing And Game Engines
- July 31, 2008: Work++; // Again!
- June 8, 2008: Patterns And Such.
- June 4, 2008: Work++;
- May 20, 2008: SIMD And Randomness
- April 30, 2008: Coder Burn-Out
- March 26, 2008: Some GameDev Math Resources.
Arrays, Post Haste!
Almost any chance I get, I like to dabble in crazy things ranging from code optimizations to patterns and data structures. Well the last few weekends I had a some of time to kill and with my ever growing disgust of the STL Vector class, I decided to write my own (Growable Array class), just for the hell of it. Out of the box, Vectors are slow, especially when being run in debug mode. There are a slew of iterator checks, data verifications (during memory swaps, etc) and so on which completely turned me off of using the Vector class in time-critical applications such as games.
In that case, lets bust out some numbers. After I wrote my Array class (and optimized it, heavily) , I decided to test it against the Vector class in various different compilation modes. First was a regular debug build (without compiler optimizations), the second test was with full compiler optimizations, and lastly was a test using full compiler optimizations and that neat trick I mentioned in a previous post about disabling iterator checks (#define _SECURE_SCL 0). I was surprised to see the numbers once they came in, check them out:
| No Optimizations: | ||
| Vector | Array | |
| Push Back: | 6542ms | 619ms |
| Operator []: | 1156ms | 444ms |
| Operator =: (50) | 1710ms | 1639ms |
| Operator ==: (20) | 62742ms | 2471ms |
| Operator !=: (20) | 62622ms | 2452ms |
| Get Size: | 380ms | 365ms |
| Get Back/Front: | 28400ms | 830ms |
| PopBack: | 351ms | 40ms |
| Erase/Remove: (1000) | 25755ms | 23777ms |
| Full Optimizations: | ||
| Vector | Array | |
| Push Back: | 167ms | 84ms |
| Operator []: | 31ms | 0ms |
| Operator =: (1000) | 15291ms | 15932ms |
| Operator ==: (500) | 0ms | 0ms |
| Operator !=: (500) | 0ms | 0ms |
| Get Size: | 0ms | 0ms |
| Get Back/Front: | 78ms | 0ms |
| PopBack: | 4ms | 2ms |
| Erase/Remove: (1000) | 22584ms | 22373ms |
| Optimizations and _SECURE_SCL Trick: | ||
| Vector | Array | |
| Push Back: | 168ms | 81ms |
| Operator []: | 0ms | 0ms |
| Operator =: (1000) | 15250ms | 15832ms |
| Operator ==: (500) | 0ms | 0ms |
| Operator !=: (500) | 0ms | 0ms |
| Get Size: | 0ms | 0ms |
| Get Back/Front: | 78ms | 0ms |
| PopBack: | 2ms | 1ms |
| Erase/Remove:(1000) | 22611ms | 22419m |
Lots of Numbers! Let me explain what I did to get these numbers. The tests were conducted on 1 million integers within a loop, with the exception of the equals operator (operator=), comparison operator (==), the not equals operator (!=), and the Erase/Remove functions which were reduced considerably since they took so long. The numbers in brackets are the number of iterations that were done to get the result. Due to the debugging information that is added into the Vector class, the performance of the class will slow down to about one-seventh of the speed it should run at. But once all the debugging information is removed, you can really see the class fly. But I guess that was not enough for me. Once the Debugging information was removed and the compiler fully optimized the code, the Vector class was raised to the level of my Array class. It was on par with the majority of the tests, with the exception of push_back(), back(), and front().
One advantage that my Array class has over the Vector class is that it tries to use realloc() in order to grow, rather then trying to allocate new space and copying over the old info. Secondly, I scoured the web for some assembly tricks and stumbled upon an optimization paper written by AMD that mentioned the use of a buffer when copying information from one memory address to another. This is all done on the CPU, using registers. This function out preforms the stock memcpy() or memmove() functions that are included with Visual Studio. I also read that tapping into the MMX or SSE technology would give potential gain, but currently I don’t know enough about that technology.
Either way, I’m releasing the source for everyone to check out under the zlib/libpng license. If anyone has questions/comments/improvements to this class, please feel free to email me. Enjoy :).
One Response to “Arrays, Post Haste!”
Leave a Reply
You must be logged in to post a comment.
September 17, 2007 at 11:45 pm
dude, did you write that assembly? i never knew you had it in you :)
you may want to call it something other than array so you won’t get name conflicts
nice job!