Friday, 26 July 2019

Mini PCs Linux Performance Comparison

Recent vulnerabilities in Intel x86 microprocessors (Meltdown, Spectre, Foreshadow, RIDL, Fallout, ZombieLoad etc.) are now addressed by mitigation patches in the Linux kernel, although these have resulted in some performance degradation. As a consequence, my previous comparison benchmarks could be somewhat misleading when compared with new results, given the different software versions in use when each was run.

So I've rerun my standard Phoronix Test Suite benchmarks on several of the latest mini PCs, each running fully updated Ubuntu 18.04.2 software with the same Ubuntu 4.15.0-54 kernel.

Specifically the mini PCs I've used are as follows:

On each mini PC I've also run: sbc-bench, a small set of CPU performance tests focusing on server performance; glmark2 from the standard repositories, a benchmark for OpenGL (ES) 2.0 that only uses the subset of the OpenGL 2.0 API that is compatible with OpenGL ES 2.0; some real-world timing tests for compiling, zipping and unzipping the Linux mainline v5.2 kernel; iozone, also from the standard repositories, a filesystem benchmark tool; and finally Octane 2, a JavaScript benchmark, which was run in Chrome.

Where possible I've used the same pair of 4GB SAMSUNG M471A5244BB0-CRC 2400 MT/s DDR4 RAM modules and an Intel SSDSCKJF180A5 SATA 180GB 2280 M.2 SSD, either natively attached or housed via an adaptor in a 2.5" SSD enclosure, so that the SoCs may be compared more equally.



Phoronix Test Suite

Phoronix Test Suite is a testing and benchmarking platform written in PHP. Whilst many tests are available, I've only considered tests that focus on server performance, to evaluate CPU, RAM and I/O performance:
  • CacheBench – Memory and cache bandwidth performance benchmark.
  • CLOMP – C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading.
  • 7-Zip compression – Uses p7zip integrated benchmark feature.
  • dcraw – This test measures the time it takes to convert several high-resolution RAW NEF image files to PPM image format using dcraw.
  • LAME MP3 encoding – This test measures the time required to encode a WAV file to MP3 format.
  • FFmpeg – Audio/video encoding performance benchmark.
  • OpenSSL – Measures RSA 4096-bit performance of OpenSSL.
  • PHPBench – Benchmark suite for PHP.
  • PyBench – Python benchmark suite.
  • SQLite – This test measures the time to perform a pre-defined number of insertions on an indexed database.
  • Stream – This benchmark tests the system memory (RAM) performance.
  • TSCP – Performance benchmark built into Tom Kerrigan’s Simple Chess Program.
  • Unpacking the Linux kernel – This test measures the time it takes to extract the .tar.bz2 Linux kernel package.
  • GMPbench – Test of the GMP 5.0.3 math library.
  • IOzone – Tests the hard disk drive / file-system performance.
The full results might be a little confusing because for some tests, higher is better, whereas for others, lower is better:


so the following bar chart may be easier to understand:
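As an aside on the mixed "higher is better" and "lower is better" results: one common way to put both kinds of test on a single scale is to express every device as a percentage of the best device for that test. The sketch below is only an illustration of that idea, not necessarily how the chart in this post was produced. The function name, device names and numbers are all hypothetical placeholders.

```python
def relative_scores(results, higher_is_better):
    """Scale one test's results so the best device scores 100.

    results: dict mapping a device name to its raw result for a single test.
    higher_is_better: True for throughput-style tests, False for timed tests.
    """
    values = list(results.values())
    if higher_is_better:
        best = max(values)
        return {device: 100.0 * value / best for device, value in results.items()}
    # For timed tests the smallest value is best, so the ratio is inverted.
    best = min(values)
    return {device: 100.0 * best / value for device, value in results.items()}


# Placeholder numbers for illustration only (not measurements from this post).
timed_test = {"device_a": 50.0, "device_b": 100.0}
print(relative_scores(timed_test, higher_is_better=False))
# {'device_a': 100.0, 'device_b': 50.0}
```

With every test scaled this way, a longer bar always means a better result regardless of the test's original unit.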


The results clearly show some anomalies. The Beelink X45 and X55 devices show much lower read performance in the IOzone test, and this impacts their performance in the SQLite tests. Interestingly, the DDR3 memory of the L55 device shows the best performance for the CacheBench and CLOMP tests. The NUC7CJYSAL's dual-core CPU without hyper-threading impacts its performance for 7-Zip compression, LAME MP3 encoding and FFmpeg encoding. CPU throttling occurred on the X55, impacting some of the results, as evidenced by the following 'dmesg' output:

[ 6667.805628] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805630] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805631] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805633] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805637] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805638] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805639] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805647] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.806608] CPU1: Core temperature/speed normal
[ 6667.806609] CPU0: Core temperature/speed normal
[ 6667.806612] CPU0: Package temperature/speed normal
[ 6667.806612] CPU1: Package temperature/speed normal
[ 6667.806630] CPU3: Core temperature/speed normal
[ 6667.806632] CPU2: Core temperature/speed normal
[ 6667.806634] CPU2: Package temperature/speed normal
[ 6667.806635] CPU3: Package temperature/speed normal
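Kernel messages like those above can be tallied automatically when checking a long benchmark run. The following is a minimal sketch, assuming the message wording matches the log shown here (it may differ on other kernel versions). The function name and regular expression are my own, and the sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches the throttling lines shown above; the wording is assumed to be stable.
THROTTLE_RE = re.compile(
    r"CPU(\d+): (Core|Package) temperature above threshold, cpu clock throttled"
)


def count_throttle_events(dmesg_text):
    """Count throttling messages per (cpu number, 'Core' or 'Package')."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = THROTTLE_RE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts


sample = """\
[ 6667.805628] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805637] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.806608] CPU1: Core temperature/speed normal
"""
counts = count_throttle_events(sample)
print(counts[(3, "Core")], counts[(2, "Package")], sum(counts.values()))
# 1 1 2
```

The text passed in could come from saving the output of 'dmesg' to a file after each benchmark.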

sbc-bench

Similarly, this benchmark focuses on server performance and uses the following tools:
  • tinymembench  –  Checks for both memory bandwidth and latency in a lot of variations. 
  • cpuminer  –  Checks for appropriate heat dissipation and instabilities under load. 
  • 7-zip  –  Represents 'server workloads in general'.
  • OpenSSL  –  Solely focuses on AES performance.
The benchmark was run with the command 'sudo /bin/bash ./sbc-bench.sh -c'.

glmark2

This OpenGL (ES) 2.0 benchmark suite was run at 1920×1080 resolution.

Compile Linux

The following packages were first installed: git build-essential flex bison kernel-package fakeroot libncurses5-dev libssl-dev kernel-wedge libelf-dev devscripts rsync gawk. The kernel source code was then downloaded with the command 'git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git -b v5.2'. Then, after making the default config, the kernel compilation was timed with the command 'time LOCALVERSION= fakeroot make-kpkg -j$(nproc --all) --initrd kernel_image kernel_headers'.

zip Linux

Zipping the kernel source code (downloaded with 'git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git') was timed with the command 'time zip -r linux.zip linux'.

unzip Linux

Unzipping the previously zipped kernel source was timed with the command 'time unzip linux.zip'.
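The compile, zip and unzip tests all follow the same pattern of timing a single command. If the timings need to be collected by a script rather than read from the shell's 'time' output, a minimal sketch might look like the following. The helper name is hypothetical, and it only measures wall-clock time, whereas 'time' also reports user and system CPU time.

```python
import subprocess
import sys
import time


def time_command(argv, cwd=None):
    """Run a command to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # check=True raises CalledProcessError if the command exits with an error.
    subprocess.run(argv, cwd=cwd, check=True)
    return time.perf_counter() - start


# Harmless demonstration: time a Python interpreter that does nothing.
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} s")
```

For the zip test, for example, the call would be time_command(["zip", "-r", "linux.zip", "linux"]) from the directory containing the kernel source.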

iozone

iozone was run with the command 'sudo iozone -e -I -a -s 1G -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2' and the read and write results for the 16384k record size were recorded in the table below.
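Picking the 16384k row out of the iozone report by hand is easy to get wrong, so here is a minimal sketch of doing it in code. It assumes the usual iozone report layout, where each data row starts with the file size in kB and the record length in kB, followed by the write, rewrite, read and reread columns in that order. The function name is hypothetical and the sample row contains placeholder values, not measurements.

```python
def iozone_read_write(report_text, file_kb=1048576, record_kb=16384):
    """Return write and read throughput (kB/s) for one file/record size row.

    Assumes columns: kB, reclen, write, rewrite, read, reread, ...
    The default file_kb of 1048576 corresponds to '-s 1G'.
    """
    for line in report_text.splitlines():
        fields = line.split()
        # Data rows consist only of integers; headers and notes are skipped.
        if len(fields) < 5 or not all(field.isdigit() for field in fields):
            continue
        if int(fields[0]) == file_kb and int(fields[1]) == record_kb:
            return {"write_kBps": int(fields[2]), "read_kBps": int(fields[4])}
    return None


# Placeholder rows for illustration only (not real results).
sample_report = """\
              kB  reclen    write  rewrite    read    reread
         1048576    1024     1000     2000     3000     4000
         1048576   16384     5000     6000     7000     8000
"""
print(iozone_read_write(sample_report))
# {'write_kBps': 5000, 'read_kBps': 7000}
```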

octane

Chrome was installed and Octane 2 was run from 'https://chromium.github.io/octane/'.

A summary of the results from each of the above benchmark tests is as follows:


Each benchmark was run when the average load was less than 0.1. For each of the sbc-bench runs, the full results uploaded to http://ix.io were checked for throttling, and any throttling was noted in the table. Slight throttling occurred on the X45 and NUC7PJYH, whereas the X55 CPU got the hottest and suffered the most throttling as a consequence. The I/O read speeds for the Beelink X45 and X55 devices were nearly half those of the other devices, which was also highlighted by the Phoronix IOzone test. The memory benchmark results were lower for the X45 as a result of it only having 4GB of RAM, and the consequence of this can be seen reflected in other test results.
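Waiting for the load to settle before each benchmark can also be scripted. The sketch below is one possible approach, assuming the 1-minute load average is the figure of interest (the text above does not say which of the three averages was used). The function name, polling interval and injectable parameters are my own choices.

```python
import os
import time


def wait_until_idle(threshold=0.1, poll_seconds=5.0,
                    read_load=os.getloadavg, sleep=time.sleep):
    """Block until the 1-minute load average drops below the threshold.

    read_load and sleep are parameters so the loop can be tested without
    waiting on a real system; os.getloadavg is only available on Unix.
    """
    while True:
        one_minute_load = read_load()[0]
        if one_minute_load < threshold:
            return one_minute_load
        sleep(poll_seconds)


# Typical use before starting a benchmark (may block for some time):
#     wait_until_idle()
```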

Overall, the results show that a greater number of CPU cores and higher CPU clock speeds do improve performance. However, the gains are somewhat marginal and vary between the tasks being performed, so the cost associated with 'better' CPUs should equally be considered when choosing a mini PC. The effectiveness of thermal management should also be noted, especially if the CPU load is expected to be high, as it can significantly affect performance whilst executing such tasks.