We regularly publish articles on the latest processors from Intel, which for many years has maintained its position as the market leader in server solutions. However, the situation has been changing as of late, and other players have actively been making a name for themselves. This past March, AMD released its EPYC processor family, which has received some interesting and generally positive reviews (like this article on Anandtech). Still, seeing and trying something for yourself often proves more valuable than reading articles online, even hundreds of them.
That being said, there aren’t many articles out there worthy of our attention. Moreover, since releasing these processors, AMD has hardly published any technical or marketing materials: at the moment, they’re limited to the article AMD EPYC SoC Sets 4 World Records on SPEC CPU Benchmarks, which is more marketing in nature than technical.
We were lucky enough to get the chance to try it out for ourselves: our colleagues from ASUS recently lent us a server built on the AMD EPYC 7351 processor to test. We decided to compare it with Intel Skylake SP processors and test its overall performance. The results and an analysis of these tests are presented below.
Some Notes about Our Testing Methods
When choosing a method, we primarily follow the principle of practical application. Here, our take matches that of the author of Computer System Performance Testing Methods, 2011 (version 5.0, in Russian): while testing, tasks should resemble real-world usage as closely as possible.
Use of synthetic tests should be reduced to a minimum: we perform these exclusively to get a general representation of the processors’ capabilities (which may be corrected while performing further tests) and to compare our results with those published online. We’re much more interested in looking at tasks that we and our clients perform daily, such as processing large amounts of data, compiling complex software, working with database management systems under heavy loads, and more.
As part of the experiment described below, we ran the following tests:
- basic synthetic tests (Geekbench)
- compiling Boost libraries
- memory bandwidth testing (STREAM benchmark)
- NAMD benchmarks (evaluation of floating point computing performance)
General Technical Specifications
We measured the performance of three servers during our tests:
- CPU AMD EPYC 7351/516 GB RAM/2×800 GB SSD
- CPU Intel Xeon Gold 6140/384 GB RAM/2×800 GB SSD
- CPU Intel Xeon Silver 4114/384 GB RAM/2×800 GB SSD
All three servers ran Ubuntu 16.04.
The table below contains detailed specs of all the processors:
|Characteristic||Intel Xeon Silver 4114||Intel Xeon Gold 6140||AMD EPYC 7351|
|Lithography||14 nm||14 nm||14 nm|
|Number of cores||10||18||16|
|Number of threads||20||36||32|
|Base frequency||2.20 GHz||2.30 GHz||2.40 GHz|
|Maximum Turbo frequency||3.00 GHz||3.70 GHz||2.90 GHz|
|L3 cache||13.75 MB||24.75 MB||64 MB|
|TDP (thermal design power)||85 W||140 W||155/170 W|
Zen Microarchitecture: A Brief Overview
AMD EPYC processors are built on the Zen microarchitecture, which was first launched on March 2, 2017. It’s used not only in server solutions, but in desktop solutions as well (AMD Ryzen processors). Like the Ryzen processors, the EPYC uses 8-core silicon dies that are made up of 2 CCX (Core Complex) modules. Each of these modules is made up of four processor cores and an L3 cache.
As we can see from the table above, the AMD EPYC 7351 has 16 cores. This is technically implemented with two 8-core silicon dies, which are connected by an Infinity Fabric bus. They share a memory controller and PCI Express hub.
We won’t give a detailed description of all the Zen microarchitecture features, especially considering the number of thorough descriptions already available online (for those interested, we recommend AMD Zen Microarchitecture: Dual Schedulers, Micro-Op Cache and Memory Hierarchy Revealed). We’ll just highlight the improvements worth mentioning:
- two threads per core (Simultaneous Multi-Threading technology, or SMT, which can be seen as a hyperthreading equivalent)
- decoded micro-op cache
- new instruction set support (AVX, AVX2, BMI1 and BMI2, AES, SHA1/SHA256, RDSEED, SMAP, and many more, including those specific to AMD)
- large unified L2 cache (512 KB per core)
Zen does not support the AVX512, SGX (Software Guard Extensions), and MPX (Memory Protection Extensions) instruction sets, which is a key difference from Xeon processors. We can’t really consider this a drawback in the strictest sense of the word; the developers at AMD have simply decided to take another route (more detailed information about this can be found in the somewhat outdated, yet still interesting article EPYC Offers x86 Compatibility).
For a long time, energy efficiency was considered a weak spot for AMD compared with Intel. When making the EPYC line, a lot of work went into removing this drawback. To improve energy efficiency and lower power consumption, new technologies were introduced into the AMD EPYC, including dynamic frequency and voltage scaling per core based on temperature and load.
Using its energy efficiency algorithms, the processor can determine whether the current load is sensitive to delays and, where appropriate, reduce the core’s frequency to optimize performance per watt. Per-core linear power regulators were also implemented in the EPYC processors. Each core can work at its own frequency and voltage if the current workload and other factors dictate.
General Features: Basic Synthetic Tests
Now that we’ve sorted through the theoretical side of things, it’s time to start testing and to analyze the results. For starters, we decided to see what results the AMD EPYC 7351 would yield for some of the most commonly used synthetic tests. We’d like to repeat that we use synthetic tests strictly as a point of reference for coming up with and supporting hypotheses, nothing else.
We decided to use Geekbench, which is a set of synthetic tests that awards points based on the results and then uses them to generate a detailed diagram. Users can upload their own results and compare them with other users’ results.
The full list of tests can be found in the official documentation. Even though Geekbench has the reputation of being first and foremost for desktops, it includes a good number of standard server benchmarks.
To start with, we launched Geekbench on three servers: one with the AMD EPYC, one with the Intel Xeon Gold 6140, and one with the Intel Xeon Silver.
We got the following results:
- browser.geekbench.com/v4/cpu/4807485 — Intel Xeon Gold (4399 points for single-core performance tests and 74097 for multi-core performance)
- browser.geekbench.com/v4/cpu/4859969 — Intel Xeon Silver (3410 points for single-core, 43971 for multi-core)
- browser.geekbench.com/v4/cpu/4807276 — AMD (3737 points for single-core, 61235 for multi-core)
Essentially, AMD EPYC performed better than Silver, but worse than Gold. However, knowing how limited the value of synthetic tests is, we won’t be picking apart these numbers.
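To put the gap in proportion, the scores listed above can be expressed relative to the Xeon Gold. This is only a small sketch of that arithmetic; the function and variable names are ours and are not part of Geekbench.

```python
# Geekbench 4 scores as reported above: (single-core, multi-core).
scores = {
    "Intel Xeon Gold": (4399, 74097),
    "Intel Xeon Silver": (3410, 43971),
    "AMD EPYC": (3737, 61235),
}

def relative_to(baseline, scores):
    """Return each CPU's scores as a fraction of the baseline CPU's scores."""
    base_single, base_multi = scores[baseline]
    return {
        cpu: (single / base_single, multi / base_multi)
        for cpu, (single, multi) in scores.items()
    }

ratios = relative_to("Intel Xeon Gold", scores)
for cpu, (single, multi) in ratios.items():
    print(f"{cpu}: single-core {single:.0%}, multi-core {multi:.0%} of Gold")
```

By this measure the AMD EPYC reaches roughly 85% of the Gold's single-core score and 83% of its multi-core score, while the Silver reaches roughly 78% and 59%.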
Memory Throughput: STREAM Benchmark
Intel and AMD processor microarchitectures differ significantly. In light of this, we thought it’d be interesting to see how well the memory subsystems of our processors perform. This was done with the well-known STREAM benchmark.
This is a synthetic test that measures sustained memory throughput when working with large data arrays. For an in-depth description of this benchmark, we recommend reviewing this article by John McCalpin. To put it briefly, STREAM is a fairly simple program written in C, which executes vector operations like a(i) = b(i) + q*c(i), where the type of data is double (64 bit) and q is a constant. It’s used in tests for assessing supercomputer performance (like the HPC Challenge Benchmark).
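The vector operation above can be sketched in a few lines. This is an illustration of the kernel and of how a throughput figure is derived from it, not the STREAM source code: the function names and the array size are our own choices, and because Python is interpreted, the number it prints reflects interpreter overhead rather than the memory subsystem.

```python
import time
from array import array

def triad(a, b, c, q):
    """Triad kernel: a[i] = b[i] + q * c[i] for every element."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]

def triad_bandwidth_gbs(n, q=3.0, repeats=3):
    """Run the kernel over n doubles; return (best rate in GB/s, result array)."""
    a = array("d", [0.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [1.0]) * n
    # Two arrays are read and one is written, at 8 bytes per double.
    bytes_moved = 3 * 8 * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, q)
        best = min(best, time.perf_counter() - start)
    return bytes_moved / best / 1e9, a

rate, a = triad_bandwidth_gbs(200_000)
print(f"Triad: {rate:.3f} GB/s (interpreter-bound, not a hardware figure)")
```

The point of the sketch is the accounting: each iteration touches 24 bytes, so the reported figure is simply 24 × N divided by the time of the fastest pass.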
In our case, there was only one complication: the servers weren’t exactly equal. The AMD server had more memory channels (8) than those built on Intel processors (6 on each one).
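The effect of the channel count can be estimated with a back-of-the-envelope calculation: theoretical peak bandwidth per socket is the number of channels times the transfer rate times 8 bytes (a DDR4 channel has a 64-bit data bus). The memory speed below is an assumption made purely for illustration; the speed of the modules in our test servers is not stated here.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical per-socket peak: channels x transfer rate x bytes per transfer."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9

ASSUMED_MTS = 2666  # hypothetical DDR4 speed, not taken from the test servers

amd = peak_bandwidth_gbs(8, ASSUMED_MTS)
intel = peak_bandwidth_gbs(6, ASSUMED_MTS)
print(f"8 channels: {amd:.1f} GB/s, 6 channels: {intel:.1f} GB/s, ratio {amd / intel:.2f}")
```

Whatever the actual memory speed, at equal speeds eight channels give a third more theoretical bandwidth than six, which is worth keeping in mind when reading the STREAM figures.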
Nevertheless, we ran the test and the results were somewhat curious. In general, they matched the results that the Anandtech author got. We ran our experiment a bit differently though: to compile the benchmark from source code, we used the standard gcc compiler (without any additional flags) instead of the Intel compiler.
The final results are shown below (GB/s; the bigger, the better):
As we can see, the AMD processor has a strong lead over the competition (the manufacturer wrote about this in their recently published marketing materials, like AMD EPYC SoC Delivers Exceptional Results on the STREAM Benchmark on 2P Servers).
However, we won’t jump to any conclusions: high results on synthetic tests are not necessarily evidence of high performance. We’ll look at how our processors handled tasks in more-or-less real-life scenarios.
To evaluate performance, it’s useful to run a complex, resource-intensive compilation on the server. We compiled the C++ Boost library: we downloaded an archive of the latest version (ver. 1.65.1) of the source code from the official site, extracted the files, and started the build (everything was performed strictly per the instructions, with no changes made to the configuration and no extra flags passed to the compiler).
Our test yielded the following results:
- on a server built on the Intel Xeon Gold, compiling took 9 minutes 12 seconds
- on the AMD EPYC 7351 — 10 minutes 15 seconds
- on the Intel Xeon Silver — 12 minutes
As we can see, the results were predictable: AMD was better than Silver, but worse than Gold.
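The build times above are easier to compare once they are converted to seconds. The snippet below is a small sketch of that conversion; the helper name to_seconds is our own.

```python
import re

def to_seconds(text):
    """Convert a duration such as '9 minutes 12 seconds' to seconds."""
    minutes = re.search(r"(\d+)\s*minutes?", text)
    seconds = re.search(r"(\d+)\s*seconds?", text)
    total = 60 * int(minutes.group(1)) if minutes else 0
    total += int(seconds.group(1)) if seconds else 0
    return total

# Build times as reported above.
build_times = {
    "Intel Xeon Gold": to_seconds("9 minutes 12 seconds"),
    "AMD EPYC 7351": to_seconds("10 minutes 15 seconds"),
    "Intel Xeon Silver": to_seconds("12 minutes"),
}

gold = build_times["Intel Xeon Gold"]
for cpu, seconds in build_times.items():
    print(f"{cpu}: {seconds} s, {seconds / gold - 1:+.0%} vs Gold")
```

In these terms, the build took about 11% longer on the AMD EPYC 7351 than on the Gold, and about 30% longer on the Silver.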
NAMD (Nanoscale Molecular Dynamics) is a molecular dynamics program that is used not only for scientific computations, but also as a benchmark for evaluating floating-point performance. The NAMD benchmarks are good because, firstly, they are based on near real-world computational tasks and, secondly, they create a decent workload for the processor.
Two standard tests were conducted: STMV and APOA1. Since the processors used for the tests have different numbers of cores, we limited the number of threads to 40 (threads on the core).
In addition to the three aforementioned servers, we added another server to the test running an Intel Xeon E5 2630v4 processor.
The first test we ran is called STMV (Satellite Tobacco Mosaic Virus). We won’t go into the details of the computations (anyone interested can find that information at the link above). We’ll just mention that to model the dynamics of the actual virus, the program has to perform complicated computations on a fairly hefty set of data. Processing a lot of data is a typical use for modern server processors, which is why the results of the NAMD benchmark are of particular interest.
When evaluating and analyzing the results, we first looked at the test runtime. Our results are presented in the following diagram:
As one would expect, the leader was the Intel Xeon Gold. Second place went to the AMD EPYC (224.000992 s). Next was the Intel Xeon Silver (250.966705 s) and then the Intel Xeon E5 2630v4 (262.287109 s).
The next test was the APOA1 (Apolipoprotein A1), a standard NAMD benchmark. Here, the results were as follows:
- Intel Xeon Gold — 19.105089
- AMD EPYC — 22.09503
- Intel Xeon Silver — 25.303406
- Intel Xeon E5 2630v4 — 23.258205
These results are visualized in the diagram below:
The AMD EPYC again performed as predicted: ahead of the Intel Xeon Silver but still behind the Intel Xeon Gold.
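The APOA1 figures can be ranked and compared with the fastest result in a few lines. This sketch assumes the values are runtimes in seconds, as in the STMV test; the variable names are ours.

```python
# APOA1 results as reported above (assumed to be runtimes in seconds).
apoa1_seconds = {
    "Intel Xeon Gold": 19.105089,
    "AMD EPYC": 22.09503,
    "Intel Xeon Silver": 25.303406,
    "Intel Xeon E5 2630v4": 23.258205,
}

# Sort from fastest to slowest and express each result relative to the fastest.
ranked = sorted(apoa1_seconds.items(), key=lambda item: item[1])
fastest = ranked[0][1]
for cpu, seconds in ranked:
    print(f"{cpu}: {seconds:.2f} s, {seconds / fastest - 1:+.1%} vs fastest")
```

Sorted this way, the AMD EPYC is about 16% behind the Gold. The ranking also shows that in APOA1 the Xeon E5 2630v4 finished ahead of the Xeon Silver, the reverse of their order in STMV.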
Based on the test results, we can conclude that the AMD EPYC 7351 processor demonstrates overall good performance and from what we’ve seen, falls between Intel Xeon Silver and Intel Xeon Gold. This isn’t the first time AMD has tried to find its own niche in the market, but only time will tell how successful this attempt is.
What can we say about the new Intel and AMD processors in terms of price vs. quality?
The recommended price for the AMD EPYC 7351 is 1,100 USD (from the article In the EPYC center: More Zen server CPU specs, prices sneak out of AMD), which is much cheaper than the bulk of Intel Xeon Gold processors (recommended prices). This price is close to that of the higher-end Xeon Silver models (such as the Xeon Silver 4116, which has a recommended sales price of 1,000 USD).
Compared to the Intel Silver models, AMD EPYC seems fairly competitive: results like ours and from third-party benchmarks (like the Intel Xeon Silver 4116 Linux Benchmarks and Review of the Top-End Xeon Silver and Dissecting Intel’s EPYC Benchmarks: Performance Through the Lens of Competitive Analysis) show us that AMD tops the competition in a wide array of processor tests.
We agree with the aforementioned Anandtech article: for many uses (such as a web server or Java application server), we fully recommend servers built on AMD EPYC processors.
At the same time, for more specialized tasks (like high-performance computations and virtualization), Intel processors are preferred (read a discussion on this in the article Dissecting Intel’s EPYC Benchmarks: Performance Through the Lens of Competitive Analysis).
We’ll be keeping a close eye on the processor market. We hope we’ll soon have the opportunity to try out other new AMD processors and test them for uses that are of more interest to us. If everything works out, we’ll be sure to write about it in a future publication.
We’d like to thank ASUS for the server.