



Sapphire Radeon X1800 XT


Late… but not too little


by Josh Walrath


            It is unfortunate that the X1800 series of cards will be remembered more for its delays than its performance, but that may very well be its legacy.  ATI taped out the initial R520 silicon in late 2004, but consumers did not get to see the chip until late November of 2005.  A whole year between initial tapeout and first delivery is pretty much unheard of, but through a series of issues that were not necessarily ATI’s fault the R520 was just not ready to come to market.  In the middle of last summer ATI finally found the issue that was holding this silicon back and went full speed into production.  Misfortune plagued ATI again with this part, as it took a full three months to fabricate the chip and package it.  By the time ATI finally had product ready for delivery, NVIDIA had its GeForce 7800 GTX and 7800 GT products out in force and in good supply.  To further harass ATI, NVIDIA released the very fast (and rare) GeForce 7800 GTX 512.

            These are all very unfortunate events that have plagued what is generally a great chip.  ATI has produced a very solid and fast part that integrates a whole host of advanced features.  Because the R520 arrived so late, we are already coming upon ATI’s next refresh of the series, the R580.  The first question that should pop into anyone’s mind is, “Why would I want to buy an X1800 XT when the X1900 series is coming out?”  This is a good question, and one any good consumer should ask.  The answer is actually pretty cut and dried, but we need to dig into the results first to see why.

The R520

            ATI had not significantly changed the basic core design from the R300 through the R480 chips, and while each generation was better than the last, they did not offer many new features or a drastically new architecture.  The R520 was going to change that.  ATI would design this chip from the ground up and introduce it on TSMC’s brand new 90 nm process.  As evidenced above, things did not go according to plan.  A simple, but elusive, design flaw kept the design from going above 500 MHz, which was much lower than the design specification.  ATI initially designed this chip to hit around 700 MHz in the top end version.  Once this issue was found and solved, the R520 was able to clock much closer to the expected range.

This unobtrusive box holds one of the most powerful graphics cards to date.

            ATI took a look at yields and speed bins, as well as the time on their hands, and decided to clock the X1800 XT to a core speed of 625 MHz.  This is lower than 700 MHz, but with the clock ticking ATI didn’t have time to sit around and let yields and speed bins mature.  So in the end the X1800 XT was clocked at 625 MHz core and 750 MHz GDDR-3 (1500 MHz effective).  This still made it a fast card.

            The R520 is a clean slate design, and as such some very interesting decisions were made.  The first is that it has a very programmable memory controller and a unique “Ringbus” memory architecture.  The memory controller can handle GDDR-3 memory at high speeds, and is broken up into 8 x 32 bit units which give it a total of 256 bits to main memory.  With the memory running at 1500 MHz effective, the card has 48 GB/sec of bandwidth.  That is some serious bandwidth.  The 512 bit Ringbus not only interacts with main memory, but is also used to shuffle data around the chip to where it is needed.  It is a general purpose, bi-directional pathway that offers a lot of internal bandwidth for the different functional units.  It is also programmable, so it can change its behavior to match what the application is requesting.  This has been shown to give very good results in applications like Doom 3, where ATI’s products had previously lagged behind NVIDIA’s.
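
            The 48 GB/sec figure follows directly from the bus width and the effective memory clock quoted above.  As a rough check, here is a minimal Python sketch of that arithmetic.  The function name and its parameters are illustrative only, and the result is a theoretical peak (one transfer across the full bus per effective clock), not a measured number.

```python
def peak_memory_bandwidth_gb_s(channels, bits_per_channel, effective_mhz):
    """Theoretical peak bandwidth in GB/sec (1 GB = 10^9 bytes)."""
    bus_bits = channels * bits_per_channel      # 8 x 32 = 256 bits
    bytes_per_transfer = bus_bits / 8           # 256 bits = 32 bytes
    # One full-width transfer per effective clock tick (assumption).
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9


# Figures from the article: 8 x 32 bit units, 1500 MHz effective.
bandwidth = peak_memory_bandwidth_gb_s(8, 32, 1500)
print(f"{bandwidth:.1f} GB/sec")
```

            Real applications will see less than this, since the peak ignores refresh, bus turnaround, and other overhead.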

            The R520 is ATI’s first high end SM 3.0 part, and as such they have put a lot of effort into making it a very good one.  While it does not natively support vertex texture fetch in hardware (it emulates it in software), it can be viewed as a more robust SM 3.0 unit than NVIDIA’s.  Dynamic branching and flow control are much faster than on NVIDIA products, but since we are still in the early stages of SM 3.0 application development, these features have not been widely utilized.  With the increase in pixel shader complexity, ATI has also paid very special attention to thread performance.  In the R520 there can be a maximum of 512 threads in flight, and the hardware has a very complex scheduler to handle those threads.  This is supposed to make the internal operations more efficient and streamlined, and thereby give the chip greater performance than other designs already in the field.

            There are a total of 16 pixel shader units, 16 ROPs, 16 Z-Compare units, and 8 vertex shader units.  In an interesting move, ATI has decoupled the 16 texture units from the pixel pipelines.  This means that the pixel units now consist only of ALUs that handle pixel shader functions.  It allows a great deal of flexibility in the chip, as the decoupled texture units can almost act like an “internal texturing computer” without affecting the functionality of the pixel shader units.  Something else that is quite interesting with this architecture is that when anti-aliasing is enabled it can do double the Z-compares that it can without AA.  This helps in stencil shadow performance, and makes the units perform like the NVIDIA based products (two Z compares per pipeline per pass).  ATI also offers the only card on the market that can do anti-aliasing while doing FP16 HDR effects.  NVIDIA cannot do that on any of their hardware to date.  This is a major visual improvement for any application which features FP16 HDR.
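
            To put those unit counts in perspective, the sketch below multiplies them by the 625 MHz core clock to get theoretical peak rates.  It assumes one pixel per ROP per clock and one Z-compare per unit per clock (two with AA enabled, as described above).  Those per-clock rates and the helper name are assumptions for illustration, not figures published by ATI.

```python
CORE_MHZ = 625   # X1800 XT core clock from the article
ROPS = 16        # ROP count from the article
Z_UNITS = 16     # Z-Compare unit count from the article


def peak_rate_giga(units, ops_per_unit_per_clock, core_mhz):
    """Theoretical peak in billions of operations per second."""
    return units * ops_per_unit_per_clock * core_mhz / 1000


# Assumed: each ROP writes one pixel per clock.
pixel_fill = peak_rate_giga(ROPS, 1, CORE_MHZ)

# Assumed: one Z-compare per unit per clock without AA, two with AA.
z_rate_no_aa = peak_rate_giga(Z_UNITS, 1, CORE_MHZ)
z_rate_aa = peak_rate_giga(Z_UNITS, 2, CORE_MHZ)

print(f"Pixel fill:       {pixel_fill:.1f} Gpixels/sec")
print(f"Z-compare, no AA: {z_rate_no_aa:.1f} G/sec")
print(f"Z-compare, AA:    {z_rate_aa:.1f} G/sec")
```

            These are ceilings only; actual throughput depends on memory bandwidth, shader load, and the application.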






Copyright 1999-2005 PenStar Systems, LLC.