Exuberant Optimism?

 

Intel and AMD after Conroe

 

by Josh Walrath

 

The Intel Side

            While Conroe may be a next generation product with its four issue design, a lot of negatives come into play when we consider that this chip will still be sitting on the GTL+ front side bus.  The first iteration of this bus was found on the Pentium Pro, and it has been improved steadily over the years.  Unfortunately, while bandwidth and latency are better than they once were, some of the underlying weaknesses are still there.

            There are several issues with the FSB that Intel has to work around to get the Conroe generation of products to perform well.  For example, while the highest clock speed on this bus is 266 MHz, only the data bus is quad pumped (giving the 1066 MHz number often bandied about).  The address and command buses are still single pumped, so they essentially run at 266 MHz.  The bus is also bidirectional, but not full duplex, which means that during any clock the data/address/command streams are traveling in only one direction.  Once a transfer is finished, the bus has to switch over before anything can travel the other way.  Data does not simply stream in and out through the FSB.  It is a one way street at any given time, and the direction of that street is reversed whenever needed.
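To put rough numbers on the paragraph above, here is a minimal sketch of the FSB arithmetic.  The function names are made up for illustration, and the 64 bit data bus width is an assumption of the sketch rather than a figure quoted in this article.  The half duplex split is also idealized, since it ignores whatever time the bus loses when it turns around.

```python
def fsb_rates(base_clock_mhz, data_pump=4, bus_width_bits=64):
    """Return transfer rates for a GTL+ style front side bus.

    bus_width_bits=64 is an assumption of this sketch.
    """
    data_mts = base_clock_mhz * data_pump        # quad pumped data bus
    addr_mts = base_clock_mhz                    # address/command are single pumped
    peak_mb_s = data_mts * bus_width_bits // 8   # MB/s, one direction at a time
    return {"data_mts": data_mts, "addr_mts": addr_mts, "peak_mb_s": peak_mb_s}


def half_duplex_share(peak_mb_s, read_fraction):
    """Split the single shared peak between reads and writes.

    Idealized: no turnaround penalty is modeled.
    """
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    read_mb_s = peak_mb_s * read_fraction
    return read_mb_s, peak_mb_s - read_mb_s


if __name__ == "__main__":
    # 266 MHz is the rounded clock used in the text; the marketed 1066
    # figure comes from the unrounded 266.67 MHz clock.
    rates = fsb_rates(266)
    print(rates)
    print(half_duplex_share(rates["peak_mb_s"], 0.75))
```

The point of the second function is that reads and writes draw from the same pool.  On a full duplex link each direction would get its own peak instead of a share of one.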

            While this worked fine for many generations of Pentiums, it is now looking far too creaky and slow to truly feed the new generation of Intel processors.  The memory controller, which is located on the chipset, runs at chipset speed, not CPU speed.  So for an 800 MHz FSB processor the memory controller is running at 200 MHz, and for a 1066 MHz FSB product the controller is running at 266 MHz.  This is a far cry from AMD’s memory controllers, which run anywhere from 1.8 GHz to 2.8 GHz.
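The size of that gap is easy to work out.  The sketch below uses only the clock figures quoted above; the function names are hypothetical, and the integer division simply reproduces the rounded 200 MHz and 266 MHz numbers used in the text.

```python
def chipset_controller_mhz(fsb_rating):
    """Controller clock for a quad pumped FSB rating, rounded down as in the text."""
    return fsb_rating // 4


def clock_ratio(amd_controller_mhz, fsb_rating):
    """How many times faster the on-die controller is clocked."""
    return amd_controller_mhz / chipset_controller_mhz(fsb_rating)


if __name__ == "__main__":
    for fsb in (800, 1066):
        for amd in (1800, 2800):
            print(f"{amd} MHz on-die vs {fsb} FSB: {clock_ratio(amd, fsb):.1f}x")
```

Clock ratio alone is not a latency figure, but it gives a feel for how differently the two designs are clocked.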

            The FSB is the main weakness behind this next generation of Intel parts.  So how exactly is Intel getting around this?  The easiest way is to include large caches on the processor.  This is exactly what Intel has done.  The Conroe series of chips will come with either 2 MB or 4 MB of cache per core.  This means that in the standard desktop dual core Conroe chip, there will be 4 MB of L2 cache.  On the possible Extreme Edition chip we expect to see a total of 8 MB of L2 cache.  That is a lot of cache, and it takes up far more space than the actual core itself.  On the AMD side L2 cache sizes go from 256 KB L2 on the Semprons to 1 MB L2 per core (2 MB L2 total) on the high end X2 chips.

            Intel does add another wrinkle to improve overall cache efficiency in dual core chips by connecting the caches.  This way each core is connected to a single, shared L2.  It appears as though Intel will have to adopt the MOESI protocol to get this to work.  The shared cache should improve efficiency in dual core/multi-threaded applications.  Intel has also done a lot of work on all levels of cache to squeeze as much efficiency as possible out of the architecture, so that it can better utilize the four issue capabilities of these new processors.

            Caches are not enough though, especially as we are heading to 64 bit computing.  64 bit instructions will be larger (though not twice as large as one would first assume), and so will the data being produced.  Intel will still have to rely on main memory, which is woefully distant compared to the Athlon 64 product.  Even with the 1066 MHz bus, address and command latency are still very bad compared to AMD’s integrated memory controller (which by nature is a full duplex unit connected directly to the CPU die and running at full core speed).

            In multi chip servers we will see some of the issues with Intel’s FSB addressed by using two full GTL+ buses (Bensley Platform) between the different CPUs.  This will really help Intel in the two chip and four chip spaces, but it still won’t be enough to address the bottlenecking that occurs in eight chip applications.  Even then, the four chip space will still have to share buses, and these products will not scale nearly as well as AMD’s current products.

            So, keeping the Conroe series of chips running efficiently will be quite a task.  There are just too many negatives in continuing with the outdated GTL+ FSB.  These negatives will hold Intel back from ever getting close to achieving anywhere from 0.90 to 1 issues per clock with its four issue design.  Now, Intel has probably not revealed all of its optimizations to help efficiency, but the fact of the matter is that all data eventually has to come from or be written to main memory.  200 to 300 CPU cycles can easily go by before an address command is executed by the memory controller, and that kind of inefficiency can be costly.
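To see why a few hundred cycles of memory latency matters so much, here is a rough sketch using the standard stall model, where cycles per instruction equal the base figure plus misses per instruction times the miss penalty.  Only the 200 to 300 cycle penalty comes from the paragraph above.  The base IPC and the miss rate are hypothetical values chosen purely for illustration, not measurements of any real chip.

```python
def effective_ipc(base_ipc, misses_per_instruction, miss_penalty_cycles):
    """Instructions per clock once main memory stalls are counted.

    Assumes every miss stalls the core for the full penalty, which
    ignores any overlap from prefetching or out of order execution.
    """
    base_cpi = 1.0 / base_ipc
    stall_cpi = misses_per_instruction * miss_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


if __name__ == "__main__":
    # Hypothetical core: sustains 2 instructions per clock when fed from
    # cache, and goes to main memory twice per 1000 instructions.
    for penalty in (200, 250, 300):
        print(penalty, round(effective_ipc(2.0, 0.002, penalty), 2))
```

Under these made up inputs, a trip to memory only twice per thousand instructions is enough to cut throughput roughly in half, which is why large caches are the obvious workaround.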

            Overall the Conroe series of chips will be a step above current Intel offerings, but it is not the panacea that many think.  The use of 65 nm process tech will help to make sure that this is a cool running design, and even with 2 MB of L2 cache it will not have an overly large die size.  This will be a much easier chip for Intel to produce, and OEMs and end users will rejoice because this chip will not require the massive cooling that the high end Pentium 4s did.

            So, while the Conroe series will be a very good chip from Intel, people are overlooking one large aspect of the situation: AMD is not standing still.

 

Next: Quietly Moving Forward

 


 

Copyright 1999-2005 PenStar Systems, LLC.