
Flexible and High-Speed System-Level Performance Analysis using Hardware-Accelerated Simulation


Academic year: 2021


Sascha Bischoff¹, Andreas Sandberg², Andreas Hansson³, Dam Sunwoo⁴, Ali G. Saidi⁴, Matthew Horsnell³, Bashir M. Al-Hashimi¹

¹ University of Southampton, Southampton, UK

² Uppsala University, Uppsala, Sweden

³ ARM, Cambridge, UK

⁴ ARM, Austin, TX

1 Abstract

Application performance is critical; state-of-the-art mobile systems are expected to deliver performance on demand, whilst remaining extremely power efficient. Therefore, it is vital to fine-tune the operating system, the set of applications running and the hardware in question, as well as ensuring that the performance of each application is able to meet the end-user’s expectations.

The performance of an application can be measured accurately in hardware. However, this approach is inflexible and the observability is limited by the availability of hardware counters. Hardware-based performance analysis is quick, but it is restricted to the actual hardware and, in some situations, it can lead to large perturbations due to probing effects, i.e. the act of measuring changes the outcome.

On the other hand, the performance of an application can be measured using a full-system simulator, such as gem5 [2]. This offers considerably more observability and flexibility without affecting the performance of the simulated system. Most simulators, however, produce only a low-level, often text-based, statistics output, which does not integrate well with existing tool-sets and requires the designer to use ad hoc visualisation. Simulation is also significantly slower than execution on real hardware, and this slowdown becomes more noticeable when more detailed simulation models are utilised.
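To illustrate why text-based statistics output needs extra tooling, the following is a minimal sketch of a parser that turns such a dump into name/value pairs. The sample text only imitates the general style of a gem5 statistics file; the values are invented for illustration, the exact layout varies between simulator versions, and `parse_stats` is a hypothetical helper rather than part of gem5.

```python
import re

# Sample text in the general style of a gem5 statistics dump.
# The numbers are made up for illustration only.
SAMPLE = """\
---------- Begin Simulation Statistics ----------
sim_seconds                                  0.000500                       # Number of seconds simulated
sim_insts                                     1000000                       # Number of instructions simulated
system.cpu.ipc                               0.812500                       # IPC: instructions per cycle
---------- End Simulation Statistics   ----------
"""

# name, numeric value, optional "# description" (assumed layout)
STAT_LINE = re.compile(
    r"^(\S+)\s+(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*(?:#\s*(.*))?$"
)


def parse_stats(text):
    """Return {name: (value, description)} for each scalar statistic line."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match is None:
            continue  # skip banners, blank lines and non-scalar entries
        name, value, description = match.groups()
        stats[name] = (float(value), description or "")
    return stats


if __name__ == "__main__":
    # Emit a simple CSV that other tools could consume.
    for name, (value, description) in parse_stats(SAMPLE).items():
        print(f"{name},{value},{description}")
```

Even this small example shows the ad hoc nature of the approach: every consumer of the statistics has to agree on the line layout before any visualisation can take place.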

The Linux Kernel-based Virtual Machine (KVM) provides a well-documented API that enables software to use hardware virtualisation outside of the scope of traditional virtualisation environments, and can therefore be used as a core in a simulator. KVM is able to execute any code, provided the peripheral devices are emulated. However, due to limited observability of the virtualised hardware, it is difficult to gather detailed performance metrics whilst running with KVM. Therefore, KVM must be switched for an accurate, simulated CPU model at points of interest.
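As a small illustration of how user-space software reaches the KVM API, the sketch below opens the KVM device node and queries the API version with an `ioctl`. It assumes a Linux host on x86 or ARM, where the `KVM_GET_API_VERSION` request from `<linux/kvm.h>` encodes to `0xAE00`; the function name `kvm_api_version` is hypothetical, and this is not how gem5 itself is implemented.

```python
import fcntl
import os

KVM_DEVICE = "/dev/kvm"
# Request number from <linux/kvm.h>: _IO(KVMIO, 0x00) with KVMIO = 0xAE.
# This encoding is assumed to hold for x86 and ARM Linux hosts.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path=KVM_DEVICE):
    """Return the KVM API version, or None if KVM cannot be used."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # no device node, or insufficient permissions
    try:
        # With an integer argument, ioctl returns the call's integer result.
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    version = kvm_api_version()
    if version is None:
        print("KVM is not available on this host")
    else:
        print(f"KVM API version: {version}")
```

A simulator that wants to use KVM as a core would continue from here by creating a virtual machine and virtual CPUs through further `ioctl` calls, and by emulating the peripheral devices itself.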

In this work, the gem5 simulator is augmented with KVM-based hardware acceleration, and is extended with a portable advanced visualisation tool which allows for various types of statistics output, depending on the particular requirements of the analysis conducted. This provides a platform for high-performance system simulation without sacrificing the accuracy, observability or flexibility of the simulator. We demonstrate:

1. hardware acceleration of the open source gem5 full-system simulator on a Samsung Exynos 5 Dual [3] based development board,

2. visualisation of a large number of statistics using the publicly available ARM Streamline Performance Analyser [1].

Hardware virtualisation on an ARM-based development board is used to fast-forward the simulation to a point of interest, at which point the virtualised core is switched for a gem5 software model. Employing the gem5 core model, we demonstrate that a large number of statistics can be extracted using the publicly available ARM Streamline Performance Analyser [1]. A comparison of the statistics gathered with those collected from real, comparable hardware shows that they provide significantly greater insight into the factors which govern application performance.
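The fast-forward-then-switch flow described above can be sketched with a toy model. Everything here is hypothetical and purely illustrative: `FastCore` stands in for the virtualised core, `DetailedCore` for a detailed software CPU model, and the timing rule is invented. It does not use gem5 or KVM; it only shows that architectural state is carried across the switch while statistics are gathered solely in the region of interest.

```python
from dataclasses import dataclass, field


@dataclass
class GuestState:
    """Architectural state handed over when the CPU model is switched."""
    pc: int = 0
    retired: int = 0


class FastCore:
    """Stands in for a virtualised core: quick, but records no statistics."""

    def run(self, state, until_pc):
        while state.pc < until_pc:
            state.pc += 1
            state.retired += 1


@dataclass
class DetailedCore:
    """Stands in for a detailed software model: slower, but observable."""
    stats: dict = field(default_factory=lambda: {"insts": 0, "cycles": 0})

    def run(self, state, until_pc):
        while state.pc < until_pc:
            state.pc += 1
            state.retired += 1
            self.stats["insts"] += 1
            # Toy timing rule (an assumption): every fourth instruction
            # costs 3 cycles, all others cost 1.
            self.stats["cycles"] += 3 if state.pc % 4 == 0 else 1


def simulate(roi_start, roi_end):
    """Fast-forward to roi_start, then model [roi_start, roi_end) in detail."""
    state = GuestState()
    FastCore().run(state, roi_start)   # fast-forward phase
    detailed = DetailedCore()          # switch CPU models, keep the state
    detailed.run(state, roi_end)       # detailed phase
    return state, detailed.stats


if __name__ == "__main__":
    final_state, stats = simulate(roi_start=1000, roi_end=1100)
    print(final_state, stats)
```

The design point is that the two models share one `GuestState`, so the detailed model resumes exactly where the fast one stopped and its counters cover only the region of interest.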

References

[1] ARM. ARM Streamline Performance Analyzer. http://www.arm.com/products/tools/software-tools/ds-5/streamline.php.

[2] N. Binkert, S. Sardashti, R. Sen, K. Sewell, M. Shoaib, N. Vaish, M. D. Hill, D. A. Wood, B. Beckmann, G. Black, S. K. Reinhardt, A. Saidi, A. Basu, J. Hestness, D. R. Hower, and T. Krishna. The gem5 simulator. ACM SIGARCH Computer Architecture News, 39(2):1, Aug. 2011.

[3] Samsung. Enjoy the Ultimate WQXGA Solution with Exynos 5 Dual. White Paper, 2012.
