Friday, 26 July 2019

Mini PCs Linux Performance Comparison

Recent vulnerabilities in Intel x86 microprocessors (Meltdown, Spectre, Foreshadow, RIDL, Fallout, ZombieLoad etc.) are now addressed by mitigation patches in the Linux kernel, although these have resulted in some performance degradation. As a consequence, my previous comparison benchmarks could be somewhat misleading when compared with new results, given the different versions of software in use at the time of execution.

So I've rerun my standard Phoronix Test Suite benchmarks on several of the latest mini PCs, each running fully updated Ubuntu 18.04.2 software with the same Ubuntu 4.15.0-54 kernel.

Specifically the mini PCs I've used are as follows:
On each mini PC I've also run sbc-bench, which is a small set of different CPU performance tests focusing on server performance; glmark2 from the standard repositories, which is a benchmark for OpenGL (ES) 2.0 that only uses the subset of the OpenGL 2.0 API that is compatible with OpenGL ES 2.0; some real-world timing tests for the compilation, zipping and unzipping of the Linux mainline v5.2 kernel; iozone, also from the standard repositories, which is a filesystem benchmark tool; and finally Octane 2, which is a JavaScript benchmark and was run in Chrome.

Where possible I've used the same pair of 4GB SAMSUNG M471A5244BB0-CRC 2400 MT/s DDR4 RAM modules and an Intel SSDSCKJF180A5 SATA 180GB 2280 M.2, either natively attached or housed through an adaptor in a 2.5" SSD enclosure, so that the SoCs may be more equally compared.



Phoronix Test Suite

Phoronix Test Suite is a testing and benchmarking platform written in PHP 5. Whilst many tests are available, I've only considered tests that focus on server performance to evaluate CPU, RAM and I/O performance:
  • CacheBench – Memory and cache bandwidth performance benchmark.
  • CLOMP – C version of the Livermore OpenMP benchmark developed to measure OpenMP overheads and other performance impacts due to threading.
  • 7-Zip compression – Uses p7zip integrated benchmark feature.
  • dcraw – This test measures the time it takes to convert several high-resolution RAW NEF image files to PPM image format using dcraw.
  • LAME MP3 encoding – This test measures the time required to encode a WAV file to MP3 format.
  • FFmpeg – Audio/video encoding performance benchmark.
  • OpenSSL – Measures RSA 4096-bit performance of OpenSSL.
  • PHPBench – Benchmark suite for PHP.
  • PyBench – Python benchmark suite.
  • SQLite – This test measures the time to perform a pre-defined number of insertions on an indexed database.
  • Stream – This benchmark tests the system memory (RAM) performance.
  • TSCP – Performance benchmark built into Tom Kerrigan’s Simple Chess Program.
  • Unpacking the Linux kernel – This test measures the time it takes to extract the .tar.bz2 Linux kernel package.
  • GMPbench – Test of the GMP 5.0.3 math library.
  • IOzone – Tests the hard disk drive / file-system performance.
The full results might be a little confusing because for some tests, higher is better, whereas for others, lower is better:


so the following bar chart may be easier to understand:


The results clearly show some anomalies. The Beelink X45 and X55 devices show much lower read performance in the IOzone test and this impacts the performance in the SQLite tests. Interestingly the DDR3 memory of the L55 device shows the best performance for the CacheBench and CLOMP tests. The NUC7CJYSAL's dual-core CPU without hyper-threading impacts the performance for 7-Zip compression, LAME MP3 encoding and FFmpeg encoding. CPU throttling occurred on the X55, as evidenced by the 'dmesg' output, impacting some of the results:

[ 6667.805628] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805630] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805631] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805633] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805637] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805638] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805639] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.805647] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 6667.806608] CPU1: Core temperature/speed normal
[ 6667.806609] CPU0: Core temperature/speed normal
[ 6667.806612] CPU0: Package temperature/speed normal
[ 6667.806612] CPU1: Package temperature/speed normal
[ 6667.806630] CPU3: Core temperature/speed normal
[ 6667.806632] CPU2: Core temperature/speed normal
[ 6667.806634] CPU2: Package temperature/speed normal
[ 6667.806635] CPU3: Package temperature/speed normal
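
Rather than reading through the kernel log by eye, the throttling messages can be counted. The following is only a minimal sketch: the function name 'count_throttle_events' is my own, and it assumes the kernel words the message exactly as in the log above ('cpu clock throttled'), which may differ between kernel versions.

```shell
# Count kernel log lines that report thermal clock throttling.
# Reads log text on stdin and prints the number of matching lines.
# Assumption: the message contains the phrase 'cpu clock throttled',
# as in the dmesg excerpt above.
count_throttle_events() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so mask the exit status to keep the function usable under 'set -e'.
    grep -c 'cpu clock throttled' || true
}

# Typical use on the machine under test (not run here):
#   dmesg | count_throttle_events
```

Comparing the count before and after a benchmark run gives a rough indication of whether that run was affected.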

sbc-bench

Similarly this benchmark also focuses on server performance and uses the following tools:
  • tinymembench  –  Checks for both memory bandwidth and latency in a lot of variations. 
  • cpuminer  –  Checks for appropriate heat dissipation and instabilities under load. 
  • 7-zip  –  Represents 'server workloads in general'.
  • OpenSSL  –  Solely focuses on AES performance.
and the benchmark was run with the command 'sudo /bin/bash ./sbc-bench.sh -c'.

glmark2

This OpenGL (ES) 2.0 benchmark suite was run using 1920×1080 resolution.
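
For recording the result, the final score can be pulled out of the glmark2 output. This is a sketch only: 'glmark2_score' is a name I've made up, and it assumes the summary line has the form 'glmark2 Score: N' and that the resolution is selected with the '-s' option.

```shell
# Extract the final score from glmark2 output supplied on stdin.
# Assumption: the summary line looks like 'glmark2 Score: 1234'.
glmark2_score() {
    sed -n 's/.*glmark2 Score: *\([0-9][0-9]*\).*/\1/p'
}

# Typical use (not run here; needs a display and a GPU driver):
#   glmark2 -s 1920x1080 | glmark2_score
```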

Compile Linux

The following packages were first installed: git build-essential flex bison kernel-package fakeroot libncurses5-dev libssl-dev kernel-wedge libelf-dev devscripts rsync gawk. The kernel source code was then downloaded with the command 'git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git -b v5.2'. Then after making the default config the kernel was timed compiling with the command 'time LOCALVERSION= fakeroot make-kpkg -j$(nproc --all) --initrd kernel_image kernel_headers'.

zip Linux

The kernel source code (downloaded with 'git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git') was timed for zipping with the command 'time zip -r linux.zip linux'.

unzip Linux

The previously zipped kernel source was timed unzipping with the command 'time unzip linux.zip'.
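
When collecting the timings for a table it can be handy to capture just the elapsed seconds rather than the full output of 'time'. The helper below is a hypothetical sketch ('time_cmd' is not part of any tool mentioned here) and only has one-second resolution, which is adequate for runs that take minutes.

```shell
# Run a command and print its wall-clock duration in whole seconds.
# The command's own output is discarded so only the duration is printed.
# The body runs in a subshell so 'start' and 'end' do not leak out.
time_cmd() (
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
)

# Typical use (not run here):
#   time_cmd zip -r linux.zip linux
#   time_cmd unzip linux.zip
```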

iozone

iozone was run with the command 'sudo iozone -e -I -a -s 1G -r 4k -r 16k -r 512k -r 1024k -r 16384k -i 0 -i 1 -i 2' and the read and write results for the 16384k record size were recorded in the table below.
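
The 16384k figures can be picked out of the iozone report with a little awk. This is a sketch under an assumption about the report layout: that each data row starts with the file size in kB and the record length, followed by the write, rewrite, read and reread columns in that order. The function name 'iozone_rw' is my own.

```shell
# Print the write and read throughput for one record length from an
# iozone report supplied on stdin.
# $1 = record length in kB (defaults to 16384).
# Assumption: data rows are 'kB reclen write rewrite read reread ...'.
iozone_rw() {
    awk -v r="${1:-16384}" '
        # Only all-numeric rows are data; header and banner lines are skipped.
        $1 ~ /^[0-9]+$/ && $2 == r { print "write=" $3, "read=" $5 }
    '
}

# Typical use (not run here):
#   sudo iozone -e -I -a -s 1G -r 16384k -i 0 -i 1 -i 2 | iozone_rw 16384
```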

octane

Chrome was installed and Octane 2 was run from 'https://chromium.github.io/octane/'.

A summary of the results from each of the above benchmark tests is as follows:


Each benchmark was run when the average load was less than 0.1. For each of the sbc-bench runs the full results uploaded to http://ix.io were checked for throttling and noted in the table. Slight throttling occurred on the X45 and NUC7PJYH whereas the X55 CPU got the hottest and suffered the most throttling as a consequence. The I/O read speeds for the Beelink X45 and X55 devices were nearly half that of the other devices which was also highlighted by the Phoronix IOzone test. The memory benchmark results were lower for the X45 as a result of only having 4GB of RAM and the consequence of this can be seen reflected in other test results.
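
Waiting for the load to settle can be scripted. The sketch below assumes the figure of interest is the 1-minute load average, i.e. the first field of /proc/loadavg; the function name 'load_below' is hypothetical.

```shell
# Succeed if the first field of a loadavg-style line on stdin is
# below the threshold given as $1.
# Assumption: the 1-minute average is the figure being checked.
load_below() {
    awk -v t="$1" 'NR == 1 { ok = ($1 + 0 < t + 0) } END { exit !ok }'
}

# Typical use before starting a benchmark (not run here):
#   until load_below 0.1 < /proc/loadavg; do sleep 10; done
```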

Overall the results show that a greater number of CPU cores and higher CPU clock speeds do improve performance. However the gains are somewhat marginal and vary between the tasks being performed so the cost associated with 'better' CPUs should equally be considered when choosing a mini PC. The effectiveness of thermal management should also be noted especially if the CPU load is expected to be high as it can significantly affect the performance whilst executing such tasks.



Friday, 14 June 2019

Containing your Gaming



Ever felt like playing a quick game of CS:GO while that compilation is finishing without having to install software that could impact how your current development environment is set up? The simple answer is to game inside a container.

The following are my configuration notes based upon the excellent articles by Stéphane Graber and Simos Xenitellis which allow you to firstly install (and subsequently remove) the containerisation software and then within that how to install and configure Steam to support local and streaming gaming.

# change the following as required
  # Ubuntu user: linuxium
  # Container: steam
  # Container user: linuxiumcomau
# update the current environment to the latest software
sudo apt update
sudo apt upgrade
# add LXD group required for containers and make yourself (linuxium) a member of this group
sudo groupadd --system lxd
sudo usermod -aG lxd linuxium
# reboot to ensure the group membership has updated
sudo reboot
# check you are a member of LXD group
id
# install the LXD container software
sudo snap install lxd
# initialise LXD making sure the 'size in GB' is adequate for gaming e.g. 15 is required for a local CS:GO
lxd init
# check with an LXD command just to make sure everything is working
lxc list
# create login alias for user linuxiumcomau (alternative to lxc exec steam -- /bin/login -p -f linuxiumcomau)
lxc alias add login 'exec @ARGS@ --mode interactive -- /bin/sh -ac $@linuxiumcomau - exec /bin/login -p -f '
# create an LXD gui profile
cat > lxdguiprofile.txt <<+
config:
  environment.DISPLAY: :0
  raw.idmap: both 1000 1000
  user.user-data: |
    #cloud-config
    runcmd:
      - 'sed -i "s/; enable-shm = yes/enable-shm = no/g" /etc/pulse/client.conf'
      - 'echo export PULSE_SERVER=unix:/tmp/.pulse-native | tee --append /home/linuxiumcomau/.profile'
    packages:
      - x11-apps
      - mesa-utils
      - pulseaudio
description: GUI LXD profile
devices:
  PASocket:
    path: /tmp/.pulse-native
    source: /run/user/1000/pulse/native
    type: disk
  X0:
    path: /tmp/.X11-unix/X0
    source: /tmp/.X11-unix/X0
    type: disk
  hostgpu:
    type: gpu
name: gui
used_by:
+
lxc profile create gui
cat lxdguiprofile.txt | lxc profile edit gui
# set up a bridge if using Steam streaming to get the container on the same network as Ubuntu
  nm-connection-editor
    Click on "+" button at the bottom.
    Choose "Bridge" and click "Create".
      The default bridge interface will be named bridge0.
    Click on "Add" button.
    Choose "Ethernet" and click "Create".
    In "Device" field, choose the interface to enslave to the bridge, e.g. eth0.
    Click on "General" tab, and check both "Automatically connect to this network when it is available" and "All users may connect to this network".
    Save the change.
    Click on "General" tab of the bridge, make sure two check boxes are enabled ("Automatically connect to this network when it is available" and "All users may connect to this network").
    Go to "IPv4 Settings" tab, and configure either DHCP or static IP address for the bridge.
      Note that you should use the same IPv4 settings as the enslaved Ethernet interface eth0.
    Finally, save the bridge settings.
    As you have the additional bridge connection you no longer need the previously configured wired connection so delete the original wired connection e.g. Ethernet connection 1.
      You will momentarily lose a connection since the IP address assigned to eth0 is taken by bridge0.
  # update the default profile to use this newly created bridge
  lxc profile edit default # replace lxdbr0 by bridge0
  # delete the old bridge
  lxc network delete lxdbr0
# create an Ubuntu container called steam and install Steam
lxc launch ubuntu:18.04 steam
lxc exec steam -- bash
  apt update
  apt upgrade
  # create user linuxiumcomau
  killall -u ubuntu
  groupmod -n linuxiumcomau ubuntu
  usermod -md /home/linuxiumcomau -l linuxiumcomau ubuntu
  usermod -aG users linuxiumcomau
  loginctl enable-linger linuxiumcomau
  sed -i 's/ubuntu/linuxiumcomau/' /etc/sudoers.d/90-cloud-init-users
  # install desktop packages
  apt install adwaita-icon-theme-full ubuntu-desktop^
  # exit
lxc login steam
  sudo sed -i "s/; enable-shm = yes/enable-shm = no/g" /etc/pulse/client.conf
  echo export PULSE_SERVER=unix:/tmp/.pulse-native | tee --append /home/linuxiumcomau/.profile
  # exit
# assign the gui profile
lxc profile assign steam default,gui
lxc restart steam
lxc login steam
  firefox # go to Steam website (https://store.steampowered.com) and
    # click Install Steam (top right and then centre screen and save the .deb file)
  cd Downloads/
  sudo apt install ./steam_latest.deb
  steam # use ctrl and \ to terminate
  # exit
# create HUD entry
lxc file pull steam/home/linuxiumcomau/.local/share/Steam/tenfoot/resource/images/steam_tray_48.tga ~/.local/share/icons/
cat > ~/.local/share/applications/steam.desktop <<+
[Desktop Entry]
 Name=Steam
 Comment=Play games on Steam
 Exec=lxc exec steam -- sudo --user linuxiumcomau --login steam
 Icon=/home/linuxium/.local/share/icons/steam_tray_48.tga
 Terminal=false
 Type=Application
 Categories=Game;
+
# use HUD to search for Steam and then add to favourites
# launch Steam

# and to remove everything
# remove HUD entry from favourites
rm ~/.local/share/applications/steam.desktop ~/.local/share/icons/steam_tray_48.tga
# remove LXD
lxc stop steam --force
lxc delete steam
lxc list
lxc delete <any other containers in list - should be empty in this example>
lxc image list
lxc image delete <any images in list - should be empty in this example>
lxc network list
lxc network delete <e.g. LXD bridge i.e. lxdbr0 if present>
echo '{"config": {}}' | lxc profile edit default
lxc storage volume list default
lxc storage volume delete default <whatever is in list - should be empty in this example>
lxc storage delete default
sudo snap remove lxd
sudo deluser lxd
sudo groupdel lxd
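
Two of the setup steps above are easy to get wrong: forgetting to log back in after joining the 'lxd' group, and the gui profile pointing at host sockets that do not exist (for example when your UID is not 1000). The following is a small pre-flight sketch; the function names 'in_group' and 'missing_paths' are my own and not part of LXD.

```shell
# Succeed if group $1 appears in the space-separated group list $2.
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print each of the given paths that does not exist on this host.
missing_paths() {
    for p in "$@"; do
        [ -e "$p" ] || echo "$p"
    done
}

# Typical use on the host before launching the container (not run here);
# the two paths are the 'source' entries from the gui profile:
#   in_group lxd "$(id -nG)" || echo "not in the lxd group yet"
#   missing_paths /tmp/.X11-unix/X0 /run/user/1000/pulse/native
```

If 'missing_paths' prints anything, the corresponding device in the profile will need adjusting before the container is restarted.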

Please donate if you find this guide useful using the following link http://goo.gl/nXWSGf.





Thursday, 30 May 2019

'Austin Beach': Intel's Compute Element Fanless NUC

With Intel's announcement today of the NUC Compute Element [1], drawing a long bow may lead to two observations: firstly Intel is using the halo effect of 'NUC' to overcome the presumed horn effect of 'Compute' from the recently cancelled Compute Card products [2]; secondly a passively cooled NUC is to be launched and become part of the NUC's ongoing product family [3].
The Intel NUC Compute Element enables an industry standard for modular compute through a device that incorporates an Intel CPU, memory, connectivity and other components and is capable of powering solutions like laptops, kiosks, smart TVs, appliances and more [1]. At first glance it almost seems the same as the aforementioned Compute Card except that instead of being a fully-enclosed, gadgety card, the NUC Compute Element looks and acts a lot more like a computer part, right down to its exposed connector. Intel claims the changes reduce its footprint inside other devices and also increase the I/O options on tap [4]. Significantly, whilst the Compute Card's sealed design and extra durability added only nominally to the cost of the card itself, incorporating the module added about $50 to each unit on the OEM side, which stunted its adoption [5]. Intel goes on to further emphasise that the NUC Compute Element delivers incredible performance and amazing connectivity at a low cost while making it easy to integrate, upgrade and service computing in next-generation devices [1].

[6]
[6]
Intel concluded their announcement by stating that the initial NUC Compute Element will be available with a range of processors, including versions with Intel vPro™ technology for increased security and manageability and that products based on the Intel NUC Compute Element are expected to be in market in the first half of 2020 [1]. Interestingly this provides credibility to an earlier leaked product map:

Thursday, 23 May 2019

Intel's 'Islay Canyon' NUCs Announced

Built for casual gaming and home entertainment

Introducing the first Intel® NUC with 8th Generation Intel® Core™ processors and Radeon* 540X discrete graphics for all your gaming and entertainment needs. Play casual games, binge watch the latest series, or stream digital music like never before with a quad-core processor that delivers 2x faster performance.

KEY SPECIFICATIONS

  • 8th Generation Intel® Core™ i7-8565U/i5-8265U Whiskey Lake processor
  • AMD Radeon* 540X discrete graphics with 2 GB GDDR5 graphics memory
  • 8 GB dual-channel LPDDR3-1866 (soldered down)
  • 16 GB Intel® Optane™ memory with 1 TB SATA3 HDD, or 256 GB SSD
  • HDMI* 2.0b and Mini DisplayPort* 1.2
  • Windows® 10 Home Operating System


SKUs

Choose an Intel® NUC that’s right for you – accelerate performance with Intel® Optane™ memory paired with a high capacity HDD to load the next game level up to 4.7x faster, or get efficient reliability with an SSD with no moving parts so bumps and drops won’t damage your drive.



FEATURES AND TECHNICAL SPECIFICATIONS



RECOMMENDED CUSTOMER PRICE
NUC8i7INHJA/NUC8i7INHPA - $770.00
NUC8i7INHX - $599.00
NUC8i5INHJA/NUC8i5INHPA - $663.00
NUC8i5INHX - $492.00

DOCUMENTATION

          User Guide

Wednesday, 24 April 2019

crostini: '--enable-gpu' not the panacea

Steam on crostini was not a proposition until recently as hardware acceleration was not available. That has changed with the latest development release of ChromeOS (75.0.3761.0) as it is now possible to manually start the termina container with the '--enable-gpu' flag to solve this situation. However whilst improvements are noted the performance is a long way off native Linux and Windows as will be demonstrated below.

Using two hardware devices, an HP Chromebox G2 and a Vorke V5 Plus configured similarly with identical CPU and RAM and similarly sized SSD, I've run the free games Counter-Strike: Global Offensive and Dota 2 under Steam running on crostini, Windows and Ubuntu.

First the basics: in-game settings. On each platform I changed some key advanced video settings for CS:GO to low:


I've then installed Windows on the Vorke V5 Plus followed by Steam and CS:GO and Dota 2. The FPS for CS:GO were mid 20's when idle:


and for Dota 2 were high 50's:



I then dual-boot installed Ubuntu 18.04 on the Vorke V5 Plus followed by Steam and CS:GO and Dota 2. The FPS for CS:GO were again mid 20's when idle:


and for Dota 2 were in the 60's:


Then on the HP Chromebox G2 I installed Ubuntu 18.04 as a crostini container. Running 'glxinfo -B' shows that GPU acceleration is not enabled:



I then installed Steam, CS:GO and Dota 2. The FPS for CS:GO was only 1 when idle (arguably as expected):


and for Dota 2 were in the 2 to 4 range:


So I restarted my container with the GPU flag:


This time for CS:GO the FPS only increased to 5-6:


and interestingly the mouse's directional movements failed resulting in constantly looking at the floor once the mouse was initially moved.

For Dota 2 the FPS improved similarly also only to 5-6:



I then installed the same Ubuntu container on Ubuntu on the Vorke V5 Plus and ran Steam's CS:GO and got a similar FPS in the 20's range thus showing that running in a container is not a bottleneck. I'll post the full details on this in a companion post later.

Therefore the conclusion is that while the '--enable-gpu' flag has improved the FPS performance, it is still significantly lower than the FPS from natively installed OSes. This must be due to the 'virgl' drivers and hopefully a 'next' release will address this issue.

Latest Update: 

Compatible drivers are now available through installing the package 'cros-gpu-alpha':

sudo apt install cros-gpu-alpha

and then update and upgrade with

sudo apt update
sudo apt upgrade

According to https://bugs.chromium.org/p/chromium/issues/detail?id=892279 starting in release 76 a flag will be introduced to allow Crostini GPU to be enabled / disabled and the 'cros-gpu-alpha' package will be automatically installed.
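
A quick way to check whether acceleration is actually in effect is to look at the renderer string reported by 'glxinfo -B'. The sketch below makes two assumptions: that the line of interest starts with 'OpenGL renderer string:', and that a renderer name containing 'llvmpipe', 'softpipe' or 'swrast' means Mesa has fallen back to software rendering. The function names are my own.

```shell
# Print the renderer name from 'glxinfo -B' output supplied on stdin.
gl_renderer() {
    sed -n 's/^OpenGL renderer string: *//p'
}

# Succeed if the renderer name $1 looks like a Mesa software rasteriser.
# Assumption: llvmpipe, softpipe and swrast indicate no GPU acceleration.
is_software_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*|*swrast*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use inside the container (not run here):
#   r=$(glxinfo -B | gl_renderer)
#   is_software_renderer "$r" && echo "no GPU acceleration: $r"
```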

Now for CS:GO the FPS has increased to around 15:


And for Dota 2 the FPS improved to around 40:


Please donate if you find this information useful using the following link http://goo.gl/nXWSGf.





Thursday, 21 March 2019

What is after Gemini Lake?

By The Osthoff Resort - www.osthoff.com, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=59894043

Elkhart Lake [1]. 

Based on a 10 nm manufacturing process, the Elkhart Lake SoC uses the Tremont (Atom) microarchitecture [2] and features Gen 11 graphics similar to the Ice Lake processors [3]. Intel’s Gen 11 solution offers 64 execution units and has managed over 1 TFLOP in GPU performance [4]. This can be compared with the Nvidia GeForce GT 1030, which offered a peak throughput of 0.94 TFLOPs [5]. Code has already been added to the Linux mainline kernel [6], suggesting a possible Computex announcement and mid to late 2019 availability [7].
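
To put the TFLOP figures in context, peak single-precision throughput is usually estimated as units × operations per clock per unit × clock speed. The sketch below is illustrative only and rests on assumptions that are not from the sources cited here: that each Gen 11 execution unit performs 16 FP32 operations per clock (two SIMD4 units with fused multiply-add), a round 1000 MHz GPU clock, and for the GT 1030 the commonly quoted 384 shader cores at a 1227 MHz base clock with 2 operations per clock each. The function name 'peak_gflops' is my own.

```shell
# Estimate peak throughput in GFLOPS.
# $1 = number of units (EUs or shader cores)
# $2 = floating-point operations per clock per unit (assumed value)
# $3 = clock in MHz (assumed value)
peak_gflops() {
    awk -v u="$1" -v f="$2" -v mhz="$3" \
        'BEGIN { printf "%.1f\n", u * f * mhz / 1000 }'
}

peak_gflops 64 16 1000    # Gen 11 with 64 EUs at an assumed 1000 MHz: 1024.0
peak_gflops 384 2 1227    # GT 1030 under the stated assumptions: 942.3
```

Under these assumptions the 64 EU part passes 1 TFLOP at roughly 1 GHz, and the GT 1030 estimate lines up with the 0.94 TFLOPs quoted above.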

References
[1] https://en.wikichip.org/wiki/intel/cores/elkhart_lake
[2] https://en.wikichip.org/wiki/intel/microarchitectures/tremont
[3] https://lists.freedesktop.org/archives/intel-gfx/2019-March/192343.html
[4] https://www.notebookcheck.net/Intel-Gen11-GT2-GPU-outperforms-the-Vega-10-and-closes-in-on-the-Vega-11-in-leaked-benchmarks.410615.0.html
[5] https://www.notebookcheck.net/Intel-s-Elkhart-Lake-SoC-will-feature-a-Gen11-iGPU.414181.0.html
[6] https://www.phoronix.com/scan.php?page=news_item&px=Intel-Elkart-Lake-DRM-Enable
[7] https://appuals.com/intels-leaked-roadmap-shows-coffee-lake-r-refresh-in-2019-10nm-might-be-delayed-to-late-2020/

Monday, 4 March 2019

Ubuntu announced new point releases for 18.04 LTS and 16.04 LTS

Canonical have released the second point release of Ubuntu 18.04 Long-Term Support (LTS) as Ubuntu 18.04.2 and have also released the sixth point release of Ubuntu 16.04 Long-Term Support (LTS) as Ubuntu 16.04.6.

I’ve respun the desktop ISOs using my ‘isorespin.sh’ script and created ISOs suitable for Intel Atom and Intel Apollo Lake devices:

18.04.2
16.04.6 

Please donate if you find these ISOs useful.