The Definitive Multi-GPU Roundup


Introduction Summer 2006


by Josh Walrath


The Tech Behind the GPUs

            NVIDIA’s offerings for this series of articles revolve around the 7600 GS, the 7950 GX2, and the 7900 GTX.  The 7600 GS is based on the G73 chip, while the 7950 GX2 and 7900 GTX are both based on the G71 chip.  The 7900 GTX is a standard G71, while the 7950 GX2 appears to use a mobile stepping of the G71, which is clocked lower but consumes less power and produces less heat.

            NVIDIA’s G7x series of chips was a fairly significant jump from the NV4x series that preceded it.  The two big advantages of G7x over NV4x are transparency anti-aliasing (allowing the AA of alpha textures) and the enhanced MADD capabilities of the two ALUs located in each shader pipeline.  Theoretically this doubles the number of MADD operations each pipeline can do as compared to the older NV4x generation of designs.  Although other tweaks went into the design of the G7x series, it still shares many of the same basic features of the NV4x series.  G7x is still an SM 3.0 enabled architecture with support for OpenEXR High Dynamic Range rendering.  It features angle-dependent anisotropic filtering with plenty of optimizations that improve performance but slightly degrade image quality as compared to a true angle-independent filtering algorithm.  It also performs “brilinear” filtering, a combination of trilinear and bilinear filtering that again improves performance while incurring a slight degradation of image quality.

            The anti-aliasing capabilities of the G7x series have not undergone any massive transformation over the past several years, but they have been tweaked and worked on.  The unit can do up to two samples per pass, with a maximum of two passes per pixel.  This gives a maximum of 4 sample AA.  With the NV4x generation NVIDIA improved the coverage by using a rotated grid sampling pattern for 4X AA (previous generations featured 4X ordered grid AA).  With the G7x they added support for multi-sampling and super-sampling transparency AA.  NVIDIA also throws in several mixed multi-sampling and super-sampling modes to improve both texture filtering and anti-aliasing, but these are generally used in more processor bound titles, as they take a hefty performance hit in more GPU bound applications.

            The 90 nm G7x products caused quite a stir in the industry when they were released due to their significant cut-down in transistor count and die size.  At every previous node change (e.g. 180 nm to 150 nm) we have seen die sizes stay consistently large while transistor counts increased dramatically.  NVIDIA bucked this trend by releasing smaller chips at each level.  So while the G71 did not add any functional units or features over G70, it is a much smaller chip that theoretically costs NVIDIA less to produce while achieving better yields and speed bins on the 90 nm Low-K process.

            The G73 based 7600 GS features 8 ROPs, 12 pixel pipelines, and 5 vertex shader units.  It runs at a core clock of 400 MHz, while the 256 MB of GDDR2 memory sits on a 128 bit bus and runs at 400 MHz, giving it a total bandwidth of 12.8 GB/sec.  The G73 is made up of about 178 million transistors and has a die size of around 130 square mm.  Active cooling is often not needed for this SKU, but it is an option depending on the manufacturer.

            The G71 based 7950 GX2 features 16 ROPs, 24 pixel pipelines, and 8 vertex shader units per chip.  Each core on the 7950 GX2 runs at 500 MHz, with 512 MB of GDDR3 on a 256 bit bus running at 600 MHz, giving each chip up to 38.4 GB/sec of bandwidth.  Since there are two PCBs, the card has a total of 76.8 GB/sec of available bandwidth, but because much of the texture and geometry data needs to be duplicated on each board, the effective bandwidth is much lower than that figure.  The G71 has around 278 million transistors, which is down 25 million from the original G70.  The die is around 199 square mm, which is much smaller than the 334 square mm that the G70 weighed in at.  Cooling is provided to each chip by its own heatsink and fan.  The fans are temperature controlled, so they are rarely heard.  Once the card does warm up, the fans spin up and can be audible over other system fan noise.
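            To put the G70 to G71 shrink in perspective, the short Python sketch below turns the transistor and die figures quoted above into an area saving and a transistor density.  The G70 transistor count is not stated directly; it is derived here from the "down 25 million" remark, and the variable and function names are illustrative only.

```python
# Figures as quoted in the article (approximate).
G71_TRANSISTORS_M = 278          # millions of transistors
G71_DIE_MM2 = 199                # square mm
# Assumption: G70 count derived from "down 25 million from the original G70".
G70_TRANSISTORS_M = G71_TRANSISTORS_M + 25
G70_DIE_MM2 = 334                # square mm


def density(transistors_m, die_mm2):
    """Return millions of transistors per square mm."""
    return transistors_m / die_mm2


# Fraction of die area saved by moving from G70 to G71.
area_saving = 1 - G71_DIE_MM2 / G70_DIE_MM2

g70_density = density(G70_TRANSISTORS_M, G70_DIE_MM2)
g71_density = density(G71_TRANSISTORS_M, G71_DIE_MM2)
density_ratio = g71_density / g70_density

print(f"Die area saved: {area_saving:.1%}")
print(f"G70 density: {g70_density:.2f} M transistors per square mm")
print(f"G71 density: {g71_density:.2f} M transistors per square mm")
print(f"Density ratio (G71 / G70): {density_ratio:.2f}")
```

            By these rough numbers the G71 gives up only about 8% of the transistors while shedding roughly 40% of the die area.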

            The 7900 GTX uses the basic G71 chip, so it has the same features as the 7950 GX2.  It is clocked significantly higher though.  The core on the 7900 GTX runs at 650 MHz, while the 512 MB of memory on a 256 bit bus runs at 800 MHz, giving it a peak bandwidth of 51.2 GB/sec.  This flagship part also features a very innovative cooler which is whisper quiet, yet still cools very effectively.  Rarely, if ever, is this cooling system heard.
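            The bandwidth figures quoted for all three cards follow from the bus width and the memory clock.  The sketch below reproduces them in Python, assuming double data rate memory (two transfers per clock) and decimal gigabytes, which is consistent with the numbers in the text; the function and variable names are illustrative only.

```python
def peak_bandwidth_gb_s(bus_width_bits, memory_clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/sec (1 GB = 1000 MB).

    transfers_per_clock=2 is an assumption: double data rate memory.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * memory_clock_mhz * transfers_per_clock / 1000


# (bus width in bits, memory clock in MHz), as quoted in the article.
cards = {
    "7600 GS": (128, 400),
    "7950 GX2 (per GPU)": (256, 600),
    "7900 GTX": (256, 800),
}

bandwidths = {name: peak_bandwidth_gb_s(*spec) for name, spec in cards.items()}

for name, gb_s in bandwidths.items():
    print(f"{name}: {gb_s:.1f} GB/sec")

# Aggregate figure for the two boards of the 7950 GX2. This is a simple sum;
# data duplicated on both boards means the usable figure is lower.
gx2_total = 2 * bandwidths["7950 GX2 (per GPU)"]
print(f"7950 GX2 (both GPUs): {gx2_total:.1f} GB/sec")
```

            These are theoretical peaks only; real-world throughput depends on access patterns and, for the 7950 GX2, on how much data has to be mirrored across the two boards.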






Copyright 1999-2006 PenStar Systems, LLC.