What do the Tests Mean?
The GfxBench2D benchmark tool performs a series of individual tests. In order to make sense of the results, it is important to understand what operations the tests are performing. The benchmark tool will execute an operation (or operations) many times, and measure how long it takes to complete. Details of each test are presented below.
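The measurement approach described above (repeat an operation many times and time it) can be sketched roughly as follows. This is an illustrative sketch only, not GfxBench2D's actual code: the function name, the iteration count, and the stand-in operation are all hypothetical.

```python
import time

def measure_mpixels_per_second(operation, pixels_per_call, iterations=1000):
    """Run `operation` repeatedly and return its throughput in MPixels/s.

    `operation` is any zero-argument callable (hypothetical stand-in for a
    real graphics call); `pixels_per_call` is how many pixels it touches.
    """
    start = time.perf_counter()
    for _ in range(iterations):
        operation()
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        raise ValueError("elapsed time too small to measure; raise iterations")
    total_pixels = pixels_per_call * iterations
    return total_pixels / elapsed / 1_000_000

# Stand-in "fill" operation on a 64x64, 1-byte-per-pixel buffer in main memory.
buffer = bytearray(64 * 64)

def fill_buffer():
    buffer[:] = b"\xff" * len(buffer)

rate = measure_mpixels_per_second(fill_buffer, 64 * 64, iterations=200)
print(f"{rate:.1f} MPixels/s")
```

Repeating the operation many times averages out timer resolution and per-call jitter, which is presumably why the benchmark does not time a single call.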
NOTES:
- Be very careful when comparing data from different operating systems (OSes). Different OSes may perform slightly different tests. For example, Windows can only do compositing with a premultiplied alpha channel, so those results are not directly comparable to results from OSes that can do full compositing. Likewise, some OSes (or graphics systems) may not support particular operations, in which case those results are zero.
- Do not compare scores between OSes. Given the previous point, consider the scores to be a relative ranking on a per-OS (or per-graphics system) basis. Directly comparing scores between OSes without first checking that the tests are identical will inevitably lead to wrong conclusions.
- Where tests have multiple sub-tests, the final score is equal to the average MPixels/s of all sub-tests combined, multiplied by the bits-per-pixel of the screen mode and divided by 32 (except for the MemCopy test score, which is handled as described below). Weighting the scores by the number of bits-per-pixel attempts to compensate for the fact that an operation on a 16-bit bitmap requires half the bandwidth of the same operation on a 32-bit bitmap. Without this weighting, a false impression could be created that one graphics card was faster than a more powerful card simply because one was tested in a 16-bit screen mode, while the other was tested in a 32-bit mode.
- Despite the previous point, tests on a 16-bit screen should not be compared directly to tests on a 32-bit screen.
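The score weighting described in the notes above can be written out as a small calculation. This is a minimal sketch based only on the formula stated here; the function name and the sample figures are made up for illustration.

```python
def weighted_score(subtest_mpixels_per_s, bits_per_pixel):
    """Average the sub-test results, then weight by screen depth.

    score = average(MPixels/s over all sub-tests) * bits_per_pixel / 32
    """
    average = sum(subtest_mpixels_per_s) / len(subtest_mpixels_per_s)
    return average * bits_per_pixel / 32

# Hypothetical sub-test results in MPixels/s (not real measurements).
results = [100.0, 200.0, 300.0]

print(weighted_score(results, 32))  # 200.0
print(weighted_score(results, 16))  # 100.0
```

With identical raw throughput, the 16-bit result scores half as much as the 32-bit one, reflecting that it moved half as much data.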
MemCopy
This is a series of tests to measure the speed at which data can be copied between main memory (RAM) and graphics card RAM (VRAM). A lot of graphics are not generated by the graphics card itself, but are images or videos loaded from disk or downloaded from the internet. With videos in particular, the transfer speed of data to the graphics card is an important factor.
The following memory copy tests are performed:
- Copy to VRAM - A block of data is copied from RAM to VRAM. In this test, the copy is performed by the CPU with no DMA.
- WritePixelArray - A block of data is copied from RAM to VRAM using the operating system's graphics library. The graphics library may or may not use DMA to accelerate the copy operation.
- Copy from VRAM - A block of data is copied from VRAM to RAM. In this test, the copy is performed by the CPU with no DMA.
- ReadPixelArray - A block of data is copied from VRAM to RAM using the operating system's graphics library. The graphics library may or may not use DMA to accelerate the copy operation.
The MemCopy score is the weighted sum of the tests listed above. The two CPU-based tests (Copy to VRAM and Copy from VRAM) are each multiplied by 0.25, while WritePixelArray and ReadPixelArray are each multiplied by 0.75. In other words, more importance is placed on the graphics library's copy functions.
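As a rough illustration of that weighted sum, the sketch below combines four hypothetical sub-test results. The function name and the sample values are assumptions, and the sketch deliberately applies no bits-per-pixel weighting, since the text does not say whether MemCopy uses one.

```python
CPU_WEIGHT = 0.25      # Copy to VRAM, Copy from VRAM
LIBRARY_WEIGHT = 0.75  # WritePixelArray, ReadPixelArray

def memcopy_score(copy_to_vram, write_pixel_array, copy_from_vram, read_pixel_array):
    """Weighted sum of the four MemCopy sub-test results."""
    cpu_part = CPU_WEIGHT * (copy_to_vram + copy_from_vram)
    library_part = LIBRARY_WEIGHT * (write_pixel_array + read_pixel_array)
    return cpu_part + library_part

# Hypothetical sub-test results (not real measurements).
print(memcopy_score(100.0, 200.0, 100.0, 200.0))  # 350.0
```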
FillRect
This test measures the speed at which the graphics card can render rectangles with a solid colour. Multiple tests are performed with different rectangle sizes. Different rectangle sizes are tested since the performance may be limited by a combination of both the graphics card's rendering speed, and the speed at which the CPU can submit render commands across the bus to the graphics card.
BlitRect
This test measures the speed at which the graphics card can copy (a.k.a. blit) a rectangular area from one bitmap to another. This is one of the most fundamental 2D graphics operations. Multiple tests are performed with different rectangle sizes. Different rectangle sizes are tested since the performance may be limited by a combination of both the graphics card's rendering speed, and the speed at which the CPU can submit render commands across the bus to the graphics card.
OverlappedBlitRect
This test measures the speed at which the graphics card can copy (a.k.a., blit) a rectangular region from within a single bitmap to another region in the same bitmap that overlaps the first. This is essentially a scroll or move operation (e.g., this could occur when you move a window on the screen by dragging it with the mouse). This is tested separately from BlitRect since overlapped blitting requires copying the pixels in a specific order so that source data is not overwritten before it is copied.
Multiple tests are performed with different rectangle sizes. Different rectangle sizes are tested since the performance may be limited by a combination of both the graphics card's rendering speed, and the speed at which the CPU can submit render commands across the bus to the graphics card.
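To see why copy order matters for overlapping regions, consider the one-dimensional sketch below. It is only an illustration of the ordering problem, not how any particular graphics card or driver implements it; a real 2D blit must also choose the order in which rows are processed. The function names are hypothetical.

```python
def naive_copy(pixels, src, dst, count):
    """Always copy front to back; corrupts data when dst overlaps after src."""
    for i in range(count):
        pixels[dst + i] = pixels[src + i]

def overlapped_copy(pixels, src, dst, count):
    """Copy `count` items from `src` to `dst` within the same list,
    choosing a direction that never overwrites unread source data."""
    if dst <= src:
        indices = range(count)              # front to back
    else:
        indices = range(count - 1, -1, -1)  # back to front
    for i in indices:
        pixels[dst + i] = pixels[src + i]

row = list(range(8))
naive_copy(row, 0, 2, 4)
print(row)  # [0, 1, 0, 1, 0, 1, 6, 7]  (source overwritten before being read)

row = list(range(8))
overlapped_copy(row, 0, 2, 4)
print(row)  # [0, 1, 0, 1, 2, 3, 6, 7]  (correct shifted copy)
```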
Composite
This test measures the speed at which the graphics card can perform a basic source-over-destination Porter-Duff composite operation (a.k.a. alpha blending). In this operation, the source bitmap has an alpha channel in addition to the Red, Green, and Blue (RGB) channels, and the alpha channel determines the transparency of each pixel. This technique allows the rendering of transparency effects, and can "composite" anti-aliased bitmaps on top of one another to create much better looking results than if blitting were performed with a simple mask.
Multiple tests are performed with different rectangle sizes. Different rectangle sizes are tested since the performance may be limited by a combination of both the graphics card's rendering speed, and the speed at which the CPU can submit render commands across the bus to the graphics card.
Premultiplied alpha: On some OSes (e.g., Windows), compositing with straight (non-premultiplied) alpha is not possible, and premultiplied alpha is used instead. In this case, the RGB channels are multiplied by the alpha channel beforehand, so the composite operation itself has less work to do. Results obtained using premultiplied alpha will be marked as such, and should not be directly compared with results obtained using full compositing.
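The difference between the two modes can be sketched per pixel as follows. This uses the standard source-over blend for colour channels and assumes an opaque destination and channel values in the range 0.0 to 1.0; the function names are hypothetical and the code is not taken from the benchmark.

```python
def over_straight(src_rgb, alpha, dst_rgb):
    """Source-over with straight alpha: one extra multiply per channel."""
    return tuple(s * alpha + d * (1.0 - alpha) for s, d in zip(src_rgb, dst_rgb))

def premultiply(src_rgb, alpha):
    """Done once, ahead of time, when the bitmap is prepared."""
    return tuple(s * alpha for s in src_rgb)

def over_premultiplied(src_rgb_premultiplied, alpha, dst_rgb):
    """Source-over with premultiplied alpha: the source multiply is already done."""
    return tuple(s + d * (1.0 - alpha) for s, d in zip(src_rgb_premultiplied, dst_rgb))

red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

print(over_straight(red, 0.5, blue))                         # (0.5, 0.0, 0.5)
print(over_premultiplied(premultiply(red, 0.5), 0.5, blue))  # (0.5, 0.0, 0.5)
```

Both paths give the same pixel, but the premultiplied path does less arithmetic at blend time, which is why the two sets of benchmark results are not directly comparable.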
CompositeSrcMask
The CompositeSrcMask test performs the same source-over-destination test as in the Composite test. However, the source bitmap's alpha channel is replaced with the alpha channel from an 8-bit source mask bitmap.
Multiple tests are performed with different rectangle sizes. Different rectangle sizes are tested since the performance may be limited by a combination of both the graphics card's rendering speed, and the speed at which the CPU can submit render commands across the bus to the graphics card.
Combined alpha mode: On some systems (e.g., Windows) the mask is multiplied by the source bitmap's alpha channel instead of replacing it. Additionally, the source bitmap could be premultiplied or have a straight alpha value. Results obtained using this mode will be marked as such. Since these differences affect how much work the graphics card must do, results in the different modes are not directly comparable.
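The two mask modes differ only in which alpha value feeds the blend. A minimal sketch, assuming alpha values normalised to the range 0.0 to 1.0 (the real mask is 8-bit) and using a hypothetical function name:

```python
def effective_alpha(source_alpha, mask_alpha, combined=False):
    """Alpha used for the source-over blend in CompositeSrcMask.

    combined=False: the mask replaces the source bitmap's alpha.
    combined=True:  the mask is multiplied by the source bitmap's alpha
                    (the "combined alpha mode" described above).
    """
    if combined:
        return source_alpha * mask_alpha
    return mask_alpha

print(effective_alpha(0.5, 0.5))                 # 0.5
print(effective_alpha(0.5, 0.5, combined=True))  # 0.25
```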
Random
This test executes all of the operations from the previous tests in random order with random rectangle sizes. Under normal conditions, a graphics card rarely executes the same operation with the same size over and over again. The purpose of this test is to obtain results under more realistic, or even worst-case, conditions.
NOTES:
- From the MemCopy test, only WritePixelArray and ReadPixelArray are used in this test
- The WritePixelArray and ReadPixelArray operations are executed much less often than the other operations. Simply put, typical graphics rendering usually involves far more of the other operations than copying data to/from the card. WritePixelArray and ReadPixelArray are each executed 1 out of every 102 operations on average, whereas the other test operations each occur on average 20 times out of every 102 operations.
- For consistency, the random test will attempt to use compositing on all cards, even those that do not support it. While these operations will fail on cards without compositing support (no software fallback is used), they are still counted as executed operations (there is always a little overhead involved, so we cannot treat them as non-operations). However, they do not affect the total number of megapixels processed in the test, because no pixels are processed in a failed operation.
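The operation mix and the accounting rule for failed composites can be sketched as below. The weights come from the notes above (2 x 1 + 5 x 20 = 102); the rectangle size range, the function names, and the use of Python's random module are assumptions made for illustration, and no real rendering takes place.

```python
import random

# Relative frequencies out of 102, as described in the notes above.
OPERATION_WEIGHTS = {
    "WritePixelArray": 1,
    "ReadPixelArray": 1,
    "FillRect": 20,
    "BlitRect": 20,
    "OverlappedBlitRect": 20,
    "Composite": 20,
    "CompositeSrcMask": 20,
}

def pick_operation(rng):
    """Pick one operation name according to the weights above."""
    names = list(OPERATION_WEIGHTS)
    weights = list(OPERATION_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]

def run_random_test(rng, operation_count, supports_compositing):
    """Return (operations executed, megapixels processed)."""
    executed = 0
    megapixels = 0.0
    for _ in range(operation_count):
        name = pick_operation(rng)
        width = rng.randint(1, 512)   # hypothetical size range
        height = rng.randint(1, 512)
        executed += 1                 # failed operations still count
        failed = name.startswith("Composite") and not supports_compositing
        if not failed:
            megapixels += width * height / 1_000_000
    return executed, megapixels

ops, mpix = run_random_test(random.Random(1), 1000, supports_compositing=False)
print(ops, round(mpix, 2))
```

On a card without compositing support, the operation count stays the same while the megapixel total drops, so the failed operations lower the resulting MPixels/s figure rather than being ignored.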
Overall Score
The overall score is simply the sum of the scores for each of the tests above. It gives an overall figure for 2D graphics performance that can be used to compare different cards/systems.
IMPORTANT: The overall score may not be comparable between operating systems (OSes). This is because some OSes might not support certain operations (e.g., compositing), or may use operations in a different way (e.g., premultiplied alpha vs. straight alpha).
Finally, please note that this is NOT intended to be an exhaustive test of the graphics card. As with any benchmark tool, it tests a specific set of operations under a specific set of conditions. Under different conditions (e.g., those within a particular game), cards that performed well in this benchmark may still perform worse than lower-scoring cards, and vice versa. Thus, while you can compare scores between graphics cards and reach general conclusions about the performance of one card relative to another, the benchmark cannot provide a 100% definitive ranking of graphics cards. Graphics cards are too complex for a single "this one is the best" metric to exist.