Thursday, November 17, 2011

Sandy Bridge-E: Core i7-3960X Is Fast, But Is It Any More Efficient?

Ironically, when it comes to performance, Intel’s Core i7-3960X is the real Bulldozer. Since its power consumption is lower than that of the Gulftown-based Core i7, it should also deliver amazing performance per watt. Is that really the case?
Intel's Sandy Bridge-E design takes the company's 32 nm Sandy Bridge architecture to the next level. As you likely saw in Chris Angelini’s full review on Sandy Bridge-E And X79 Express, the new high-end processor family offers more of almost everything: more cores, more cache, more memory channels, and more PCI Express connectivity, resulting in better benchmark scores in almost every discipline.
While the new processor design, now available as the Core i7-3960X and Core i7-3930K (with the Core i7-3820 to follow some time next year), delivers more performance, we've already seen the first review machines based on X79 Express lowering power consumption versus the Gulftown/X58 combination thanks to the dual-chip platform layout. AMD might not want to learn in detail what this could mean in terms of performance per watt, since the six-core Core i7-990X was already faster than its flagship FX-8150.

The Numbers Game

The secret sauce of Sandy Bridge-E turns out to be a relatively simple recipe: do more of the same. This is made possible by the solid performance per core of Sandy Bridge, and the parallelism of a six-core implementation. In other words, it appears that Sandy Bridge scales very well, so it makes sense that Intel would introduce it as a six-core desktop offering and, later, an eight-core server-oriented Xeon processor.
In short, Sandy Bridge-E facilitates up to six cores (rather than the four you max out with on LGA 1155), includes four 64-bit memory channels (rather than LGA 1366's maximum of three), boasts official memory data rates as high as 1600 MT/s, and features 40 PCI Express 3.0-capable lanes. The 2.27 billion-transistor processor occupies 434 mm² of die space.
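To put the memory figures above into perspective, here is a minimal sketch of the theoretical peak bandwidth arithmetic. The function name is our own, and the three-channel comparison assumes the same 1600 MT/s data rate purely to isolate the effect of the fourth channel; it is not a statement about LGA 1366's official memory speeds.

```python
def peak_bandwidth_gbs(channels: int, bus_width_bits: int, rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    bytes_per_transfer = bus_width_bits / 8          # a 64-bit channel moves 8 bytes per transfer
    return channels * bytes_per_transfer * rate_mts / 1000

# Figures from the article: four 64-bit channels at up to 1600 MT/s.
quad_channel = peak_bandwidth_gbs(4, 64, 1600)

# Hypothetical comparison: three channels at the SAME data rate (an assumption).
triple_channel = peak_bandwidth_gbs(3, 64, 1600)

print(f"Quad-channel peak:   {quad_channel:.1f} GB/s")    # 51.2 GB/s
print(f"Triple-channel peak: {triple_channel:.1f} GB/s")  # 38.4 GB/s
```

These are upper bounds only; real applications see less, and most of the benchmarks in this article are unlikely to be limited by memory bandwidth.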

Getting Rid Of Dead Weight

But Sandy Bridge-E also sheds certain elements that might otherwise contribute to its overall power consumption. As on Sandy Bridge, power gating allows unused parts of the processor to be almost completely shut down, minimizing power consumption. Add that to the single-chip X79 chipset, which replaces X58's two-chip layout, and you have the foundation for new lows in idle and peak power usage compared to any other six-core CPU in the lab.
The promise, then, is one of new efficiency records, particularly in applications able to leverage Sandy Bridge-E's parallelism. Just recently, we looked at the performance per watt of AMD’s FX processor in the article AMD FX: Energy Efficiency Compared To Eight Other CPUs. In today’s article, we’re performing the very same experiment.
So, if you’re in search of information on power efficiency, have a look at the aforementioned story. Or, if it's architectural details you're after, make sure you've already read our Sandy Bridge-E launch article for more of the story about design and performance.
Sporting six cores, 32 nm lithography, 15 MB of shared L3 cache, and clock rates between 3.3 and 3.9 GHz, depending on workload, is the Core i7-3960X a good foundation on which to enable great power efficiency? It seems like it could be, as the idle power consumption of 87 W measured in our launch coverage represents a record low for a high-end desktop PC.
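For readers who want to follow the efficiency arithmetic, here is a minimal sketch of how energy use can be derived from average system power and run time. The function name and the two sample systems are hypothetical placeholders, not measurements from this article; only the 87 W idle figure comes from the text above.

```python
def energy_wh(avg_power_w: float, runtime_s: float) -> float:
    """Energy in watt-hours consumed at a given average power over a run time."""
    return avg_power_w * runtime_s / 3600.0

# The 87 W idle figure quoted above, sustained for one hour.
idle_hour = energy_wh(87.0, 3600.0)

# Placeholder numbers (NOT measured): a faster system that draws more power
# can still need less energy to finish the same workload.
fast_system = energy_wh(avg_power_w=150.0, runtime_s=600.0)
slow_system = energy_wh(avg_power_w=120.0, runtime_s=900.0)

print(f"Idle for one hour: {idle_hour:.1f} Wh")    # 87.0 Wh
print(f"Fast system:       {fast_system:.1f} Wh")  # 25.0 Wh
print(f"Slow system:       {slow_system:.1f} Wh")  # 30.0 Wh
```

The point of the placeholder comparison is that peak power alone says little about efficiency; what counts is the energy needed to complete a fixed amount of work.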


The test sample Intel provided to our German crew is the same as the one that landed in the U.S.: the six-core Sandy Bridge-E chip branded as Core i7-3960X. It is specified to run at a base frequency of 3.3 GHz with all six cores active, but Turbo Boost 2.0 can take it up to 3.9 GHz when one or two cores are active.
In order to ensure maximum thermal headroom, we also received a closed-loop liquid cooling kit designed by Asetek, very similar to the products offered by Antec, Corsair, Cooler Master, and CoolIT. Interestingly, the solution provided here is not as large or as powerful as the one AMD delivered with its FX sample. We'll use the same liquid cooler when we analyze efficiency at overclocked settings in the very near future. Intel plans to make it available for somewhere between $85 and $100, and it should fit all current Intel platforms.
There aren't many folks who'd spring for a $1000 CPU and then handicap it with insufficient cooling. So, we're using the closed-loop system for our efficiency exploration at stock clocks today. Ample cooling means better heat dissipation, which lets Turbo Boost hold its elevated clock rates for longer stretches without running into thermal bottlenecks.
Intel’s DX79SI is the company's highest-end motherboard, competing against products from ASRock, Asus, ECS, Evga, Gigabyte, and MSI, amongst others. If history is any indication, Intel's own retail platforms tend to be more conservative than the flagships from its third-party board partners. As such, they're generally not the first choice of hardcore enthusiasts. Now, that doesn't mean Intel's team isn't capable of designing a great motherboard; in fact, the DX79SI is perhaps its best effort to date. Notably, eight DIMM slots and three PCI Express slots represent inclusions that any power user is going to demand.
Thus, the DX79SI represents a good, stable platform for us to test on. It's not loaded with the number of features you'd expect to find on some of the more extravagant boards that Thomas is in the process of rounding up. However, it facilitates all of Sandy Bridge-E's performance, it enables its salient features, and it's stable.
The DX79SI comes armed with lots of USB 2.0 connectivity, a couple of USB 3.0 ports, gigabit Ethernet, and six SATA ports (two of which operate at data rates as high as 6 Gb/s), as well as software-based RAID support. The X79 Platform Controller Hub facilitates eight lanes of second-gen PCIe connectivity, while the processor contributes 40 lanes of PCI Express 3.0, which drive the board's three 16-lane slots.
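A quick lane-budget check helps read the slot description above: three physical 16-lane slots would need more lanes than the processor provides if all ran at full width at once. The sketch below only does that arithmetic; the 16/16/8 split is an illustrative assumption, not a configuration confirmed in this article.

```python
CPU_PCIE_LANES = 40  # PCI Express 3.0 lanes contributed by the processor (from the article)

def fits_lane_budget(slot_widths, budget=CPU_PCIE_LANES):
    """Return True if the requested electrical slot widths fit within the lane budget."""
    return sum(slot_widths) <= budget

all_x16 = [16, 16, 16]   # every slot at full width
mixed = [16, 16, 8]      # illustrative split (assumption)

print(sum(all_x16), fits_lane_budget(all_x16))  # 48 False
print(sum(mixed), fits_lane_budget(mixed))      # 40 True
```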
Intel’s highest-end thermal solution is manufactured by Asetek, but it’s not a true high-end part as far as liquid cooling is concerned.


LGA 2011 Platform
LGA 2011 Platform: Intel DX79SI, Chipset: Intel X79 Express
LGA 2011 Processors: Intel Core i7-3960X Extreme Edition (32 nm, Sandy Bridge-E), 6C/12T, 3.3 GHz, 6 x 256 KB L2 Cache, 15 MB Shared L3 Cache, 130 W TDP, 3.9 GHz max. Turbo Boost
Socket AM3+ Platform
Socket AM3+ Platform: Asus Crosshair V Formula (Rev. 1.0), Chipset: AMD 990FX, BIOS: 9905 (2011-10-03)
AM3 Processors:
  AMD Phenom II X4 980 (45 nm, Deneb, C3), 4C/4T, 3.7 GHz, 4 x 512 KB L2 Cache, 6 MB Shared L3 Cache, 125 W TDP
  AMD Phenom II X6 1100T (45 nm, Thuban, E0), 6C/6T, 3.3 GHz, 6 x 512 KB L2 Cache, 6 MB Shared L3 Cache, 125 W TDP, 3.7 GHz max. Turbo Core
AM3+ Processors: AMD FX-8150 (32 nm, Zambezi), 8C/8T, 3.6 GHz, 8 MB L2 Cache, 8 MB Shared L3 Cache, 125 W TDP, 3.9 GHz Turbo Core, 4.2 GHz max. Turbo Core
LGA 1156 Platform
LGA 1156 Platform: Gigabyte P55A-UD7, Chipset: Intel P55 Express, BIOS: F8b
LGA 1156 Processors:
  Intel Core i7-870 (45 nm, Lynnfield, B1), 4C/8T, 2.93 GHz, 4 x 256 KB L2 Cache, 8 MB Shared L3 Cache, 95 W TDP, 3.6 GHz max. Turbo Boost
  Intel Core i5-750 (45 nm, Lynnfield, B1), 4C/4T, 2.66 GHz, 4 x 256 KB L2 Cache, 8 MB Shared L3 Cache, 95 W TDP, 3.2 GHz max. Turbo Boost
Socket LGA 1155 Platform
LGA 1155 Platform: Intel DP67BG, Chipset: Intel P67 Express, BIOS: 2040
LGA 1155 Processors:
  Intel Core i7-2600K (32 nm, Sandy Bridge, D2), 4C/8T, 3.4 GHz, 4 x 256 KB L2 Cache, 8 MB Shared L3 Cache, w/ HD Graphics 3000, 95 W TDP, 3.8 GHz max. Turbo Boost
  Intel Core i5-2500K (32 nm, Sandy Bridge, D2), 4C/4T, 3.3 GHz, 4 x 256 KB L2 Cache, 6 MB Shared L3 Cache, w/ HD Graphics 3000, 95 W TDP, 3.7 GHz max. Turbo Boost
LGA 1366 Platform
LGA 1366 Platform: MSI BigBang-Xpower, Chipset: Intel X58 Express, BIOS: 1.2
LGA 1366 Processors:
  Intel Core i7-975 Extreme Edition (45 nm, Bloomfield, D0), 4C/8T, 3.33 GHz, 4 x 256 KB L2 Cache, 8 MB Shared L3 Cache, 130 W TDP, 3.6 GHz max. Turbo Boost
  Intel Core i7-980X Extreme Edition (32 nm, Gulftown, B1), 6C/12T, 3.33 GHz, 6 x 256 KB L2 Cache, 12 MB Shared L3 Cache, 130 W TDP, 3.6 GHz max. Turbo Boost
Common Platform Components
Dual DDR3 Memory: 2 x 4 GB DDR3-1333, Kingston KHX1600C9D3K2/8GX
Discrete Graphics: AMD Radeon HD 6850, GPU: Barts (775 MHz), Graphics RAM: 1024 MB GDDR5 (2000 MHz), Stream Processors: 960
System Drive: Samsung PM810, 256 GB, SATA 3 Gb/s
Power Supply: Seasonic X-760, SS-760KM Active PFC F3
System Software & Drivers
Operating System: Windows 7 Ultimate x64 SP1
Drivers and Settings
ATI Radeon Drivers: AMD Catalyst 11.8 Suite for Windows 7
Intel Chipset Drivers: Chipset Installation Utility Ver. 9.2.3.1022
Intel Rapid Storage: Ver. 10.6.0.1002

Benchmarks and Settings
Audio Benchmarks and Settings
Benchmark: Details
iTunes: Version 10.4.1.10
  Audio CD ("Terminator II" SE), 53 min.
  Convert to AAC audio format
Lame MP3: Version 3.98.3
  Audio CD ("Terminator II" SE), 53 min.
  Convert WAV to MP3 audio format
  Command: -b 160 --nores (160 Kb/s)
Video Benchmarks and Settings
Benchmark: Details
HandBrake CLI: Version 0.95
  Video: Big Buck Bunny (720x480, 23.972 frames) 5 Minutes
  Audio: Dolby Digital, 48000 Hz, Six-Channel, English to
  Video: AVC1 Audio1: AC3 Audio2: AAC (High Profile)
MainConcept Reference v2.2: Version 2.2.0.5440
  MPEG2 to H.264
  MainConcept H.264/AVC Codec
  28 sec HDTV 1920x1080 (MPEG2)
  Audio: MPEG2 (44.1 kHz, 2-Channel, 16 Bit, 224 Kb/s)
  Codec: H.264 Pro
  Mode: PAL 50i (25 FPS)
  Profile: H.264 BD HDMV
Application Benchmarks and Settings
Benchmark: Details
7-Zip: Version 9.22 beta
  LZMA2
  Syntax "a -t7z -r -m0=LZMA2 -mx=5"
  Benchmark: 2010-THG-Workload
WinRAR: Version 4.01
  RAR
  Syntax "winrar a -r -m3"
  Benchmark: 2010-THG-Workload
WinZip 15.5 Pro: Version 14.0 Pro (8652)
  WinZip Command Line Version 3
  ZIPX
  Syntax "-a -ez -p -r"
  Benchmark: 2010-THG-Workload
Autodesk 3d Studio Max 2012: Version 10 x64
  Rendering Space Flyby Mentalray (SPECapc_3dsmax9)
  Frame: 248
  Resolution: 1440 x 1080
Adobe After Effects CS5.5: Create video which includes 3 streams
  Frames: 210
  Render Multiple Frames Simultaneously: on
Adobe Photoshop CS 5.1 (64-Bit): Version 11
  Filtering a 16 MB TIF (15000x7266)
  Filters:
  Radial Blur (Amount: 10; Method: zoom; Quality: good)
  Shape Blur (Radius: 46 px; custom shape: Trademark symbol)
  Median (Radius: 1 px)
  Polar Coordinates (Rectangular to Polar)
Adobe Acrobat X Professional: Version 10.0.0
  == Printing Preferences Menu ==
  Default Settings: Standard
  == Adobe PDF Security - Edit Menu ==
  Encrypt all documents (128 bit RC4)
  Open Password: 123
  Permissions Password: 321
Microsoft PowerPoint 2010: Version 2007 SP2
  PPT to PDF
  PowerPoint Document (115 Pages)
  Adobe PDF-Printer
Blender: Version 2.59 beta
  Syntax blender -b thg.blend -f 1
  Resolution: 1920x1080
  Anti-Aliasing: 8x
  Render: THG.blend frame 1
Matlab: R2011a
  Internal Benchmark: 10 runs

We also ran the efficiency test's applications in the following order:
 
Single-Threaded:
Adobe Acrobat
WinZip
iTunes
Lame
Multi-Threaded:
3ds Max
Blender
HandBrake
MainConcept
After Effects
Photoshop
Premiere
Matlab
7-Zip
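
A fixed run order like the one above can be scripted. The following is only a rough sketch of how such a sequence might be automated and timed; it is not the harness used for this article. The job names and commands are harmless placeholders, and power draw would still have to be logged separately with an external meter.

```python
import subprocess
import sys
import time

def run_in_order(jobs):
    """Run (name, argv) jobs one after another and return (name, seconds) pairs."""
    results = []
    for name, argv in jobs:
        start = time.perf_counter()
        subprocess.run(argv, check=True)   # raises CalledProcessError if a job fails
        results.append((name, time.perf_counter() - start))
    return results

# Placeholder jobs (assumption): each just starts a Python interpreter and exits.
# In practice these would be the benchmark command lines from the tables above.
jobs = [
    ("single-threaded placeholder", [sys.executable, "-c", "pass"]),
    ("multi-threaded placeholder", [sys.executable, "-c", "pass"]),
]

timings = run_in_order(jobs)
for name, seconds in timings:
    print(f"{name}: {seconds:.2f} s")
```

Multiplying each job's run time by the average power measured during that job gives the energy consumed per workload.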

