Linux 2.6 and Hyper-Threading

The last time we got together, we took a brief look into the technology
behind hyper-threading and its performance in a variety of Windows benchmarks.
At the end of that article I vowed to bring you the Linux side of
the story at a later date. Well, friends, that time has arrived.

Since the last installment, my testbed has been strengthened considerably with
the addition of 2x 3.2GHz Xeons with 1MB of L3 cache and a 3.2E Pentium
4 'Prescott'. The latter features an upgraded implementation of hyper-threading
technology, and Intel was kind enough to let us take a look at the
impact it may have on Linux performance.

Anyone
who's ever set out to perform Linux benchmarks quickly realizes the
difficulties involved in such an undertaking, not only with the availability
of quality benchmarks (or lack thereof), but also in the way the test
system(s) are configured. Most of the Linux benchmarks that I see
on hardware review sites are simple things like kernel compiles or
povray... maybe a game benchmark or two. Those certainly have merit,
but I wanted to try to do things differently for this article. I wanted
to get more involved with server-oriented benchmarks to really see
what hyper-threading brings to that market. Don't get me wrong, we're
still going to take a look at compiling performance and even media-encoding
performance, but those won't be the most interesting results you'll
see here today.

One
of the other problems that I see in articles featuring Linux benchmarks
is a lack of details when it comes to system configuration, compile
options, versions of software and libraries, and so forth. Consider
this article an act of full disclosure. I'm going to give you as much
background information as humanly possible so you really get a sense
of how my test systems were configured. Curious how I compiled apache?
I'm going to tell you. What version of GCC was I using? You'll know.
How did I configure my kernel? You can peek at the .config file for
yourself.

Prescott's Improved Hyper-Threading

While the general consensus on Prescott's performance in its current state
(namely clock speed) has not been glowing, we should take a moment
to go over the architectural changes meant to improve hyper-threading.
The simplest change Intel made to Prescott was increasing the size
of the L1 and L2 caches on the processor. The L1 data cache was
bumped from 8KB to 16KB and the L2 from 512KB to 1024KB. As I
discussed previously, when you have two logical processors competing
for shared resources, more is better.

I
don't consider myself to be a processor architecture guru, so whenever
a new processor rolls along I generally refer to Ace's
Hardware. They were kind enough to break down the other more subtle
changes which should result in improved hyper-threading performance.
From their Prescott review:

Additionally, two new instructions were added to Prescott: MONITOR and MWAIT. These should improve performance by increasing efficiency while also decreasing power consumption. Based on their musings, it appears as if we'll just have to wait for an OS patch before these improvements can be realized. From what I gather, the decreased power consumption will certainly be welcome.

The first thing we have to do is break down the configuration of my test systems. We'll not only delve into the hardware used, but also peer into my OS and software configuration with a magnifying glass.
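
In that spirit, here is a minimal sketch of how one might check what a Linux box reports about the features discussed above. It is not part of the test setup described in this article. It assumes an x86 system whose /proc/cpuinfo has a "flags" line, where "ht" marks hyper-threading capability and "monitor" marks the MONITOR/MWAIT instructions. The function names are made up for illustration, and a flag only means the processor advertises the feature, not that it is enabled in the BIOS or used by the kernel.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first "flags" line is read; on a multi-processor system
    every logical CPU normally reports the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text):
    """Summarize the two features discussed in this section."""
    flags = cpu_flags(cpuinfo_text)
    return {
        "hyper_threading": "ht" in flags,     # processor advertises HT
        "monitor_mwait": "monitor" in flags,  # MONITOR/MWAIT available
    }


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(report(f.read()))
    except OSError:
        print("no readable /proc/cpuinfo on this system")
```

Run as a script, it prints a small dictionary for the current machine. Feeding `report()` the saved contents of /proc/cpuinfo from another box works just as well, which is handy when comparing test systems.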