You are currently browsing the CodeBot @ Work weblog archives for the day September 17, 2007.
- 3D Stuff (12)
- AI (1)
- BumpTop (5)
- C/C++ (22)
- Open Source (8)
- Radiant (11)
- Seneca (6)
- Stuff (17)
- The Mortal Realm (2)
- Uncategorized (14)
- win32 (12)
- July 9, 2010: On Visual Studio 2010...
- December 22, 2009: Efficient Rendering, A La Mark.
- December 3, 2009: A Simple Opponent
- December 3, 2009: Blog++
- September 7, 2009: Food Budgeting
- July 3, 2009: Too Busy...
- March 26, 2009: How Do Patents Apply To Me?
- February 27, 2009: U.S. And Human Rights
- February 7, 2009: I've Been Busy...
- November 10, 2008: Radiant Update
Archive for September 17, 2007
Arrays, Post Haste!
September 17, 2007 by Mark.
Almost any chance I get, I like to dabble in crazy things ranging from code optimizations to patterns and data structures. Well the last few weekends I had a some of time to kill and with my ever growing disgust of the STL Vector class, I decided to write my own (Growable Array class), just for the hell of it. Out of the box, Vectors are slow, especially when being run in debug mode. There are a slew of iterator checks, data verifications (during memory swaps, etc) and so on which completely turned me off of using the Vector class in time-critical applications such as games.
In that case, lets bust out some numbers. After I wrote my Array class (and optimized it, heavily) , I decided to test it against the Vector class in various different compilation modes. First was a regular debug build (without compiler optimizations), the second test was with full compiler optimizations, and lastly was a test using full compiler optimizations and that neat trick I mentioned in a previous post about disabling iterator checks (#define _SECURE_SCL 0). I was surprised to see the numbers once they came in, check them out:
| No Optimizations: | ||
| Vector | Array | |
| Push Back: | 6542ms | 619ms |
| Operator []: | 1156ms | 444ms |
| Operator =: (50) | 1710ms | 1639ms |
| Operator ==: (20) | 62742ms | 2471ms |
| Operator !=: (20) | 62622ms | 2452ms |
| Get Size: | 380ms | 365ms |
| Get Back/Front: | 28400ms | 830ms |
| PopBack: | 351ms | 40ms |
| Erase/Remove: (1000) | 25755ms | 23777ms |
| Full Optimizations: | ||
| Vector | Array | |
| Push Back: | 167ms | 84ms |
| Operator []: | 31ms | 0ms |
| Operator =: (1000) | 15291ms | 15932ms |
| Operator ==: (500) | 0ms | 0ms |
| Operator !=: (500) | 0ms | 0ms |
| Get Size: | 0ms | 0ms |
| Get Back/Front: | 78ms | 0ms |
| PopBack: | 4ms | 2ms |
| Erase/Remove: (1000) | 22584ms | 22373ms |
| Optimizations and _SECURE_SCL Trick: | ||
| Vector | Array | |
| Push Back: | 168ms | 81ms |
| Operator []: | 0ms | 0ms |
| Operator =: (1000) | 15250ms | 15832ms |
| Operator ==: (500) | 0ms | 0ms |
| Operator !=: (500) | 0ms | 0ms |
| Get Size: | 0ms | 0ms |
| Get Back/Front: | 78ms | 0ms |
| PopBack: | 2ms | 1ms |
| Erase/Remove:(1000) | 22611ms | 22419m |
Lots of Numbers! Let me explain what I did to get these numbers. The tests were conducted on 1 million integers within a loop, with the exception of the equals operator (operator=), comparison operator (==), the not equals operator (!=), and the Erase/Remove functions which were reduced considerably since they took so long. The numbers in brackets are the number of iterations that were done to get the result. Due to the debugging information that is added into the Vector class, the performance of the class will slow down to about one-seventh of the speed it should run at. But once all the debugging information is removed, you can really see the class fly. But I guess that was not enough for me. Once the Debugging information was removed and the compiler fully optimized the code, the Vector class was raised to the level of my Array class. It was on par with the majority of the tests, with the exception of push_back(), back(), and front().
One advantage that my Array class has over the Vector class is that it tries to use realloc() in order to grow, rather then trying to allocate new space and copying over the old info. Secondly, I scoured the web for some assembly tricks and stumbled upon an optimization paper written by AMD that mentioned the use of a buffer when copying information from one memory address to another. This is all done on the CPU, using registers. This function out preforms the stock memcpy() or memmove() functions that are included with Visual Studio. I also read that tapping into the MMX or SSE technology would give potential gain, but currently I don’t know enough about that technology.
Either way, I’m releasing the source for everyone to check out under the zlib/libpng license. If anyone has questions/comments/improvements to this class, please feel free to email me. Enjoy :).
Posted in C/C++, win32 | 1 Comment »