


The State of 3D


October, 2004

by Josh Walrath



            Last year at this time I released my Sordid State of 3D article, and it received quite a bit of attention.  Some of the attention was good, some of it was bad, but all of it was enlightening.  I discovered that I had strayed far away from my 3D roots, and I did not properly understand the nuances of the major architectures of the time from NVIDIA (NV3x) and ATI (R3x0).  In this article I will try to make amends for that previous State of 3D and provide more accurate information, as well as take a look at how the current industry is faring and what we can expect in the near future. 

Refuting the Sordid State

            There were so many errors in that previous article that I am not sure where to start.  The biggest flaw was my misrepresentation of the two architectures that I covered.  In some aspects I was correct, but in the majority of them I was sadly mistaken.  To get a better impression of these products, it is worth going over each in a bit more detail.

            The GeForce FX architecture was not all that NVIDIA and its fan base were hoping it would be.  The original NV30 was a very complex chip that lacked floating point power when it came to PS 2.0 rendering.  From 10,000 feet up, it appears as though DX 8.1 was the main thrust of this architecture, and that DX 9.0 functionality was an added bonus.  The NV30 had the misfortune of being released after the Radeon 9700 Pro, which was a very focused DX 9.0 part.  If ATI had not released a product such as the 9700 Pro, the review and enthusiast community would probably have sung the praises of the NV30 (well, except perhaps the cooling solution used) for how well it ran current games while still being forward looking in regards to DX 9 functionality.  Unfortunately for NVIDIA and its bottom line, ATI released a product that not only matched the DX 8.1 performance of the NV30, but also trumped so many other aspects of the architecture that the NV30 was soon relegated to the scrap heap.

            Standing as far away as I do from the NVIDIA engineering department, it appears as if overall PS 2.0 speed was not a priority for NVIDIA.  Again, the NV30 was great with DX 8.1 content, as well as OpenGL games, but PS 2.0 rendering was a huge weakness.  Why did NVIDIA design such a part?  Was it due to NVIDIA not participating (by its own decision) in the initial development of the DX 9.0 specification?  Was it due to NVIDIA underestimating the very competent R300 chip from ATI?  Was it due to NVIDIA believing that no real DX 9.0 content would hit the streets until 1.5 years after the release of the NV30?  Few people outside of NVIDIA truly know the answer, but we certainly can speculate!

            NVIDIA has always been known for raising the bar on graphics technology.  This occurred with the TNT, GeForce, GeForce 2, GeForce 3, and GeForce 4 products.  Each time a new product was released, it added features and performance far greater than what the previous generation offered.  Perhaps the push to raise the bar over what NVIDIA considered the standard DX 9 part doomed the NV30 in the first place?  It is very hard to say, but we can see that the initial specifications for the NV30 were laid down before Microsoft finalized DX 9.0.

            There were rumors that NVIDIA tried to force Microsoft’s hand when it came to finalizing the DX 9.0 specification so that it more adequately reflected NVIDIA’s architecture.  Microsoft of course would not bow down before NVIDIA, no matter how much of the graphics market NVIDIA controlled.  Instead Microsoft worked with ATI, Matrox, and all of the other graphics companies to develop the DX 9.0 specification that we all know.  This worked out in ATI’s favor, as the finalized DX 9.0 specification nearly matched their R300 part.  There were of course workarounds that helped out NVIDIA’s architecture as well (such as the decision to use partial precision).  Overall Microsoft did a good job in balancing out the industry when it came to the overall specifications for DX 9.0, PS 2.0, and VS 2.0.

            NVIDIA was left in a pickle at this point, as they had truly overdesigned their product with regard to the finalized DX 9.0 specification.  The flexibility and programmability that NVIDIA designed into their product was a significant step up from the PS 2.0 specification, but that flexibility came at the expense of speed.  Also, the NV30 seemed to be aimed more at the DX 8.1 specification, as it had a lot of integer computational ability.  It was sorely lacking in floating point power, and it was also short on register space.  Using 32-bit floating point precision quickly filled up these registers, and performance suffered.  No matter how flexible this architecture was, it just didn’t have the horsepower and balanced design to be a fast product.  The only saving grace for the NV30 was its outstanding DX 8.1 ability.
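
            One way to picture the register pressure described above is with some simple arithmetic.  The sketch below is only an illustration: the register file size and the number of temporaries per pixel are made-up values, not published NV30 figures.  The only fact it relies on is that a four-component FP32 register needs twice the storage of an FP16 one, so a fixed-size register file holds half as many of them.

```python
# Illustrative sketch only. The register-file size and temporaries-per-pixel
# values used below are hypothetical, not published NV30 specifications.

BYTES_PER_COMPONENT = {"fp16": 2, "fp32": 4}
COMPONENTS_PER_REGISTER = 4  # one four-component (RGBA / XYZW) vector register


def register_bytes(precision: str) -> int:
    """Storage needed by one four-component temporary register."""
    return BYTES_PER_COMPONENT[precision] * COMPONENTS_PER_REGISTER


def registers_that_fit(budget_bytes: int, precision: str) -> int:
    """How many temporaries fit in a fixed per-pixel register budget."""
    return budget_bytes // register_bytes(precision)


def pixels_in_flight(file_bytes: int, temps_per_pixel: int, precision: str) -> int:
    """Pixels a fixed-size register file can hold at once for a given shader."""
    return file_bytes // (temps_per_pixel * register_bytes(precision))


if __name__ == "__main__":
    HYPOTHETICAL_FILE_BYTES = 4096  # assumed size, chosen for round numbers
    TEMPS_PER_PIXEL = 4             # assumed shader register usage
    for precision in ("fp16", "fp32"):
        n = pixels_in_flight(HYPOTHETICAL_FILE_BYTES, TEMPS_PER_PIXEL, precision)
        print(f"{precision}: {n} pixels in flight with {TEMPS_PER_PIXEL} temporaries each")
```

            With any fixed budget, moving from FP16 to FP32 halves the number of pixels the register file can hold at once, which is one plausible reading of why full precision rendering hurt this architecture so much.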

            NVIDIA tried to rectify the situation with the release of the NV35 product, and its subsequent derivatives.  This product essentially took most of the integer units of the NV30 architecture out, and replaced them with floating point units.  This caused a slight performance decrease in DX 8.1 applications as compared to the 500 MHz GeForce FX 5800 Ultra, but the ability to render PS 2.0 content was greatly enhanced.  It still suffered many of the same problems that the NV30 did, such as register space limitations and the drop in performance with full precision rendering.  Using mixed precision did result in better performance for the NV35, but it still couldn’t match ATI’s R3x0 series of cards.

            NVIDIA took a different approach than ATI did in overall pipeline design, as it chose a narrow but long functional unit.  The NV35 had 4 true pixel pipelines, each with two texturing units attached.  These 4 pixel pipelines were very complex, and were able to put out a decent amount of work each clock cycle.  Due to the programmability of the NV35 pixel pipeline, it could do some very complex operations to each pixel, but it again was hampered by its lack of speed and register space.  For professional use this card could do full FP32 precision rendering with nearly unlimited instructions that could be applied to each pixel.  Enthusiasts using the NV35 architecture for gaming, however, were left somewhat disappointed by the results.
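
            The trade-off in a "narrow" layout like the 4 pipeline, 2 texture unit design described above can be sketched with the usual theoretical fill rate arithmetic (pipelines times clock for pixels, pipelines times texture units times clock for texels).  This is a rough sketch under stated assumptions: the 500 MHz clock is simply the GeForce FX 5800 Ultra figure quoted earlier, used as an example input, and the 8 pipeline, 1 texture unit "wide" layout is a hypothetical comparison point rather than a claim about any specific chip.  The `PipelineConfig` class and its method names are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical description of a pixel pipeline layout (illustration only)."""
    name: str
    pixel_pipelines: int
    texture_units_per_pipeline: int
    core_clock_mhz: float

    def pixel_fill_rate_mpixels(self) -> float:
        """Theoretical peak pixels per second, in millions."""
        return self.pixel_pipelines * self.core_clock_mhz

    def texel_fill_rate_mtexels(self) -> float:
        """Theoretical peak texels per second, in millions."""
        return (self.pixel_pipelines
                * self.texture_units_per_pipeline
                * self.core_clock_mhz)


if __name__ == "__main__":
    # Example clock only; the same clock is used for both layouts so that
    # the comparison isolates the pipeline arrangement.
    EXAMPLE_CLOCK_MHZ = 500.0
    layouts = [
        PipelineConfig("narrow 4x2", 4, 2, EXAMPLE_CLOCK_MHZ),
        PipelineConfig("wide 8x1 (hypothetical)", 8, 1, EXAMPLE_CLOCK_MHZ),
    ]
    for cfg in layouts:
        print(f"{cfg.name}: "
              f"{cfg.pixel_fill_rate_mpixels():.0f} Mpixels/s, "
              f"{cfg.texel_fill_rate_mtexels():.0f} Mtexels/s")
```

            At the same clock, the two layouts have the same peak texel rate, but the narrow one produces half as many pixels per clock.  These are peak theoretical numbers only and say nothing about shader throughput, which is where the real differences discussed here showed up.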






Copyright 1999-2004 PenStar Systems, LLC.