Intel 7500 Series "Nehalem-EX" Xeons
Posted on: 03/30/2010 10:57 PM

Matrix Multiply v3.5

This benchmark computes the product of two square matrices of the same dimension. Matrix multiplication can achieve near-theoretical peak performance, even on modern processors, and it is a key operation for realizing high performance in many standard scientific computing tasks. These include solving large dense systems of linear equations (as in the Linpack benchmark) and QR factorizations (used to solve least-squares problems). Solving large dense linear systems is a key component of the electromagnetic simulations used in stealth design applications. The computation is done by the dgemm subroutine in the Intel Math Kernel Library (MKL).

The default workloads are square matrices of size 5000 to 15000 in steps of 5000. By default each multiplication is run twice and the average value is reported.

For the purposes of this comparison, I used only the GFLOPS measurement from the size-15000 matrix.
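To make the GFLOPS figure concrete, here is a minimal sketch of how such a number can be derived. It is not the benchmark's code: the real workload calls MKL's dgemm, while this stand-in uses a naive pure-Python multiply. The function names are hypothetical, the 2 * n^3 operation count is the conventional one for dense matrix multiplication, and averaging the per-run GFLOPS (rather than the run times) is my assumption about what "average value" means.

```python
import random
import time


def matmul(a, b):
    """Naive product of two square matrices stored as lists of rows."""
    # Transpose b once so the inner loop pairs a row with a column.
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def gflops(n, seconds):
    """Conventional dense matmul flop count (2 * n^3) divided by elapsed time."""
    return 2.0 * n ** 3 / seconds / 1e9


def benchmark(n, runs=2):
    """Time `runs` multiplications of random n x n matrices; return mean GFLOPS."""
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[random.random() for _ in range(n)] for _ in range(n)]
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        matmul(a, b)
        elapsed = time.perf_counter() - start
        rates.append(gflops(n, elapsed))
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Tiny sizes so pure Python finishes quickly; the benchmark uses 5000 to 15000.
    for n in (40, 80, 120):
        print(f"n={n}: {benchmark(n):.4f} GFLOPS")
```

By the same formula, a size-15000 multiply that took exactly one second would count as 6750 GFLOPS, which gives a sense of how much arithmetic the largest workload involves.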

Nehalem-EX Matrix Multiply

The Nehalem-EX box turns in another great performance in Matrix Multiply.


MyriMatch is a tool designed to take experimental data from shotgun proteomics experiments and compare those spectra against sequences in a known database of proteins. Whether the program is run on a single computer or across an entire cluster of processing nodes, it divides work much more efficiently than many other database search programs. This is because it generates candidate sequences from the known database only once for the entire set of spectra instead of once for every spectrum. Each candidate sequence generated is then compared against every spectrum, and each spectrum keeps a (user-defined) number of the highest-scoring candidate sequences.
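The "generate once, compare against every spectrum" idea can be sketched roughly as follows. This is an illustration only, not MyriMatch's code: the shared-peak score, the data layout, and the names search, shared_peaks and max_results are all hypothetical stand-ins.

```python
import heapq


def shared_peaks(candidate_peaks, spectrum_peaks):
    """Toy score: number of peaks a candidate shares with a spectrum."""
    return len(set(candidate_peaks) & set(spectrum_peaks))


def search(candidates, spectra, max_results=2):
    """Generate each candidate once and compare it against every spectrum.

    candidates: iterable of (sequence, predicted_peaks) pairs
    spectra: dict mapping spectrum id -> observed peaks
    Returns a dict mapping spectrum id -> [(score, sequence), ...], best first.
    """
    best = {sid: [] for sid in spectra}  # one min-heap per spectrum
    for sequence, peaks in candidates:   # single pass over the database
        for sid, observed in spectra.items():
            entry = (shared_peaks(peaks, observed), sequence)
            heap = best[sid]
            if len(heap) < max_results:
                heapq.heappush(heap, entry)
            elif entry > heap[0]:
                heapq.heapreplace(heap, entry)  # evict the current worst hit
    return {sid: sorted(heap, reverse=True) for sid, heap in best.items()}


if __name__ == "__main__":
    candidates = [("PEPTIDE", [1, 2, 3]), ("PROTEIN", [2, 3, 4]), ("SEQ", [9])]
    spectra = {"s1": [1, 2, 3, 4], "s2": [9, 10]}
    print(search(candidates, spectra))
```

The outer loop runs over candidates and the inner loop over spectra, so the candidate list is walked once no matter how many spectra are searched. The per-spectrum heap plays the role of the user-defined limit on retained matches.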

MyriMatch is designed to take advantage of symmetric multiprocessor (SMP) systems by multithreading the database search. A search process on an SMP system spawns one worker thread for each processing unit (where a processing unit can be either a core on a multi-core CPU or a separate CPU entirely). The main thread then generates a list of worker numbers whose length equals the number of worker threads multiplied by a configurable multiplier parameter. Each worker thread then takes a worker number from the list and uses that number to iterate through the protein list.
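A rough sketch of that worker-number scheme, under stated assumptions: the function name parallel_search and the multiplier argument are hypothetical, and the strided assignment (worker number k handles proteins k, k + N, k + 2N, ...) is my guess at how a worker number could map onto the protein list, not something the description confirms.

```python
import os
import queue
import threading


def parallel_search(proteins, score, num_threads=None, multiplier=10):
    """Spread work over threads using a shared list of worker numbers.

    `multiplier` stands in for the parameter mentioned above; the strided
    protein assignment is an assumption made for this sketch.
    """
    num_threads = num_threads or os.cpu_count() or 1
    num_workers = num_threads * multiplier

    worker_numbers = queue.Queue()
    for number in range(num_workers):
        worker_numbers.put(number)

    results = {}
    lock = threading.Lock()

    def run():
        while True:
            try:
                number = worker_numbers.get_nowait()
            except queue.Empty:
                return  # no worker numbers left
            # Worker number k handles proteins k, k + num_workers, ...
            local = {i: score(proteins[i])
                     for i in range(number, len(proteins), num_workers)}
            with lock:
                results.update(local)

    threads = [threading.Thread(target=run) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(proteins))]


if __name__ == "__main__":
    proteins = ["MKV", "AAGT", "P", "LLLLL"]
    print(parallel_search(proteins, len, num_threads=2, multiplier=3))
```

Handing out more worker numbers than there are threads means a thread that finishes early can pick up another slice instead of sitting idle, which is presumably the point of the multiplier.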

Nehalem-EX MyriMatch

More of the same Xeon goodness in MyriMatch.

POV-Ray 3.70 beta 36

The Persistence of Vision Ray-Tracer creates three-dimensional, photo-realistic images using a rendering technique called ray-tracing. It reads in a text file containing information describing the objects and lighting in a scene and generates an image of that scene from the viewpoint of a camera also described in the text file. Ray-tracing is not a fast process by any means, but it produces very high quality images with realistic reflections, shading, perspective and other effects.
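For readers who have never seen the technique, here is a toy sketch of the core step: casting one ray per pixel from a camera and testing it against an object. It has nothing to do with POV-Ray's actual code or scene language; the single hard-coded sphere, the camera placement, and the function names are all assumptions made for illustration, and there is no lighting, reflection or shading.

```python
import math


def hit_sphere(origin, direction, center, radius):
    """Distance along the ray to the first hit in front of the origin, or None.

    Assumes the ray origin lies outside the sphere.
    """
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None  # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t > 0 else None


def render(width=24, height=12):
    """Cast one ray per character cell from a camera at the origin looking down -z."""
    center, radius = (0.0, 0.0, -3.0), 1.0
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the cell to a point on an image plane at z = -1.
            x = (i + 0.5) / width * 2.0 - 1.0
            y = 1.0 - (j + 0.5) / height * 2.0
            hit = hit_sphere((0.0, 0.0, 0.0), (x, y, -1.0), center, radius)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows


if __name__ == "__main__":
    print("\n".join(render()))
```

Every pixel needs its own intersection test, and a real renderer follows further rays for reflections and shadows, which is why the workload scales so well across many cores.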

Nehalem-EX POV-Ray

The POV-Ray results are pretty much neck-and-neck between the two systems, with the Opterons edging out the win. As a side note, the results from the X7560 Xeons (16 cores at 2.26 GHz, 32 threads) are almost identical to the numbers previously posted by a pair of W5590 Xeons (8 cores at 2.93 GHz, 16 threads) in this test.

SunGard Adaptiv Analytics v4.0

The SunGard Adaptiv Credit Risk application is a component of SunGard's comprehensive suite of risk management products. This workload is a scaled-down version of the full application. At its core, the application uses a proprietary Monte Carlo financial engine to determine the future value of a fictitious portfolio.

This package consists of a Microsoft Windows based .NET application and two data files, sample market data and a sample portfolio, which provide input to the financial engine.
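SunGard's engine is proprietary, so the following is only a generic textbook sketch of what a Monte Carlo valuation of a future portfolio value can look like. It assumes the whole portfolio follows a single geometric Brownian motion; the function name, drift, volatility and starting value are made-up illustrative inputs, not anything taken from the benchmark's data files.

```python
import math
import random
import statistics


def simulate_future_values(value, drift, volatility, years, paths=10_000, seed=42):
    """Sample future portfolio values under geometric Brownian motion.

    All parameters are illustrative assumptions, not SunGard inputs.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs match
    values = []
    for _ in range(paths):
        z = rng.gauss(0.0, 1.0)  # one standard normal shock per path
        growth = ((drift - 0.5 * volatility ** 2) * years
                  + volatility * math.sqrt(years) * z)
        values.append(value * math.exp(growth))
    return values


if __name__ == "__main__":
    values = sorted(simulate_future_values(1_000_000, 0.05, 0.20, 1.0))
    print(f"mean future value: {statistics.mean(values):,.0f}")
    print(f"5th percentile:    {values[len(values) // 20]:,.0f}")
```

Each simulated path is independent of the others, so this style of workload parallelizes naturally across cores.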

Nehalem-EX SunGard

Way back when I originally started using SunGard as a test workload on Opteron and Xeon platforms, a small group of people cried foul about the inherent optimizations of the .NET platform for Intel processors. It's been a while since I've run this test on Opterons, but here we are, four revisions of the benchmark and five revisions of the .NET platform later, and the Xeons still handily win this test.
