
Benchmarking Points-to Analysis

Linnaeus University Dissertations

No 133/2013

BENCHMARKING POINTS-TO ANALYSIS

TOBIAS GUTZMANN

LINNAEUS UNIVERSITY PRESS



Benchmarking Points-to Analysis

Doctoral dissertation, Department of Computer Science, Linnaeus University, Växjö, 2013

ISBN: 978-91-87427-25-1

Published by: Linnaeus University Press, S-351 95, Växjö
Printed by: Elanders Sverige AB, 2013

Abstract

Points-to analysis is a static program analysis that, simply put, computes which objects created at certain points of a given program might show up at which other points of the same program. In particular, it computes possible targets of a call and possible objects referenced by a field. Such information is essential input to many client applications in optimizing compilers and software engineering tools.

Comparing experimental results with respect to accuracy and performance is required in order to distinguish the promising from the less promising approaches to points-to analysis. Unfortunately, comparing the accuracy of two different points-to analysis implementations is difficult, as there are many pitfalls in the details. In particular, there are no standardized means to perform such a comparison, i.e., no benchmark suite – a set of programs with well-defined rules of how to compare different points-to analysis results – exists. Therefore, different researchers use their own means to evaluate their approaches to points-to analysis. To complicate matters, even the same researchers do not stick to the same evaluation methods, which often makes it impossible to take two research publications and reliably tell which one describes the more accurate points-to analysis.

In this thesis, we define a methodology for benchmarking points-to analysis. We create a benchmark suite, compare three different points-to analysis implementations with each other based on this methodology, and explain differences in analysis accuracy.

We also argue for the need of a Gold Standard, i.e., a set of benchmark programs with exact analysis results. Such a Gold Standard is often required to compare points-to analysis results, and it also makes it possible to assess the exact accuracy of points-to analysis results. Since such a Gold Standard cannot be computed automatically, it needs to be created semi-automatically by the research community. We propose a process for creating a Gold Standard based on under-approximating it through optimistic (dynamic) analysis and over-approximating it through conservative (static) analysis. With the help of improved static and dynamic points-to analysis and expert knowledge about benchmark programs, we present a first attempt towards a Gold Standard.

We also provide a Web-based benchmarking platform, through which researchers can compare their own experimental results with those of other researchers, and can contribute towards the creation of a Gold Standard.



This thesis is based on the following refereed publications:

Tobias Gutzmann, Jonas Lundberg, Welf Löwe: Towards Path-Sensitive Points-to Analysis. Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2007).

Jonas Lundberg, Tobias Gutzmann, Welf Löwe: Fast and Precise Points-to Analysis. Eighth IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2008).

Jonas Lundberg, Tobias Gutzmann, Marcus Edvinsson, Welf Löwe: Fast and Precise Points-to Analysis. Information and Software Technology, Volume 51, Issue 10, October 2009.

Tobias Gutzmann, Antonina Khairova, Jonas Lundberg, Welf Löwe: Towards Comparing and Combining Points-to Analyses. Ninth IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2009).

Tobias Gutzmann, Welf Löwe: Reducing the Performance Overhead of Dynamic Analysis through Custom-made Agents. 5th International Workshop on Program Comprehension through Dynamic Analysis (PCODA 2010).

Tobias Gutzmann, Jonas Lundberg, Welf Löwe: Feedback-driven Points-to Analysis. 26th Annual ACM Symposium on Applied Computing (SAC 2011).

Tobias Gutzmann, Welf Löwe: Custom-made Instrumentation Based on Static Analysis. Ninth International Workshop on Dynamic Analysis (WODA 2011).

Tobias Gutzmann, Jonas Lundberg, Welf Löwe: Collections Frameworks for Points-to Analysis. Twelfth IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2012).

This thesis is also a direct extension of:

Tobias Gutzmann: Towards a Gold Standard for Points-to Analysis. Licentiate thesis, Linnaeus University, Växjö, Sweden, March 2010.



Acknowledgments

First, my thanks go to my supervisor, Welf Löwe, without whom this thesis would not have become reality, and to all my colleagues at Växjö and later Linnaeus University, for the pleasant working environment and fruitful discussions.

Second, to my mother, who certainly would have liked me staying closer to home, but who has nevertheless supported me all these years. To my dad, who sadly cannot witness the completion of this thesis. I am sure he would have been proud of me.

Finally, my thanks go to my love, Therése, for all her support and patience these last years.



Contents

Abstract i

Acknowledgments v

1 Introduction 1

1.1 Goals . . . 2

1.2 Restrictions . . . 2

1.3 Goal Criteria . . . 2

1.4 Tasks . . . 3

1.5 Motivation . . . 4

1.6 Thesis Outline . . . 6

1.7 Disclaimer . . . 7

2 Points-to Analysis 8

2.1 Program Analysis . . . 8

2.2 Points-to Analysis Overview . . . 10

2.3 Naming Schemes . . . 11

2.4 Flow Sensitivity . . . 12

2.5 Context Sensitivity . . . 12

2.6 Abstract Heap Modeling . . . 15

2.7 Path Sensitivity . . . 15

2.8 Open Problems . . . 16

2.9 Concrete Implementations . . . 17

2.10 Conclusion . . . 29

3 Related Work 30

3.1 Comparing Dynamic and Static Analyses . . . 30

3.2 Gold Standards . . . 32

3.3 Fast Dynamic Analysis with help of Static Analysis . . . 33

3.4 Evaluation Methods for Points-to Analysis . . . 33

3.5 Conclusion . . . 44

4 Comparing “May” Dataflow Analyses 47

4.1 Comparing Analyses . . . 48

4.2 Improving General Analyses . . . 56

4.3 Conclusion . . . 57



5 Benchmarking Points-to Analysis 59

5.1 Application Entities . . . 60

5.2 Determining Conservative Analysis . . . 62

5.3 Dynamic Analysis . . . 63

5.4 Client Analyses and Metrics . . . 64

5.5 Selecting Benchmark Programs . . . 67

5.6 Conclusion . . . 68

6 A Shared Effort Towards a Gold Standard 70

6.1 Motivation . . . 70

6.2 Process . . . 71

6.3 Improvements to Points-to Analysis . . . 73

6.4 Fast Dynamic Analysis . . . 88

6.5 Conclusion . . . 92

7 Benchmarking Platform 94

7.1 User Stories . . . 94

7.2 Platform Design . . . 98

7.3 The XML Exchange Format . . . 100

7.4 Conclusion . . . 102

8 Experimental Evaluation 103

8.1 Benchmark Programs . . . 104

8.2 Approximating F̌ with help of different G . . . 105

8.3 Evaluating Improvements Based on General Analysis . . . 109

8.4 Comparing Points-to Analyses . . . 110

8.5 Impact of Using Different Java Standard Libraries . . . 112

8.6 Fast Dynamic Analysis . . . 113

8.7 A First Attempt Towards a Gold Standard . . . 116

8.8 Evaluating the Benchmarking Platform . . . 121

8.9 Conclusion . . . 123

9 Conclusion and Future Work 125

9.1 Conclusion . . . 125

9.2 Future Work . . . 128

A Dynamic Agent - Implementation Details 131

B Complete Metrics Results 136


Chapter 1

Introduction

Points-to analysis is a static program analysis that, simply put, computes which objects created at certain points of a given program might show up at which other points of the same program. In particular, it computes possible targets of a call and possible objects referenced by a field. Such information is essential input to many client applications in optimizing compilers and software engineering tools.
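As a small illustration (our own hypothetical example, not taken from the thesis), consider the following Java program; a points-to analysis maps each reference to the set of allocation sites it may refer to, which in turn determines the possible targets of a virtual call:

```java
// Hypothetical example: o1 and o2 label the two allocation sites.
class A { String m() { return "A.m"; } }
class B extends A { @Override String m() { return "B.m"; } }

class Example {
    static A pick(boolean flag) {
        A x = new A();        // allocation site o1
        A y = new B();        // allocation site o2
        return flag ? x : y;  // points-to(return value) = {o1, o2}
    }

    public static void main(String[] args) {
        A r = pick(args.length > 0);
        // Since r may refer to an object from o1 or from o2, a conservative
        // analysis must report both A.m and B.m as possible call targets here.
        System.out.println(r.m());
    }
}
```

A client analysis such as virtual call resolution profits directly: the smaller the points-to set of `r`, the fewer call targets it must consider.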

Comparing research results with respect to accuracy and performance is required in order to distinguish the promising from the less promising approaches to points-to analysis.

Unfortunately, comparing merely the accuracy of two different points-to analysis implementations is difficult, as there are no standardized means to perform such a comparison. In current research, different authors use different accuracy assessment methods and different benchmark programs for evaluating their approaches to points-to analysis. Researchers commonly use their “own” points-to analysis implementation and compare variations thereof, i.e., they compare analysis X with X’. However, comparison of analysis X to another research group’s implementation Y (and Y’ etc.) rarely happens. This makes it difficult to take two research publications and reliably tell which one describes the more accurate points-to analysis.

Further, little can be said about the absolute accuracy of points-to analysis, as there is no Gold Standard for it, i.e., a specific set of benchmark programs with an accepted set of correct analysis results. Such a Gold Standard seems impossible to compute automatically, and thus creating it requires manual steps by researchers. This task has not yet been tackled for points-to analysis.

Finally, points-to analyses are usually assumed to be conservative, i.e., to compute supersets of the exact analysis results. However, points-to analyses quite often turn out to be, at best, conservative only for subsets of programs. Otherwise, they are general analyses, as they lack support for certain language features or native methods. As we shall see, assessing the accuracy of general points-to analysis also requires a Gold Standard.



1.1 Goals

The goals of this thesis are the following:

1. Make points-to analyses comparable in terms of accuracy.

2. Show how to create a Gold Standard for points-to analysis in order to assess the absolute accuracy of points-to analyses.

1.2 Restrictions

We limit our work to the programming language Java, but we expect that many of our findings can be applied to other programming languages as well.

We look only at comparison of analysis accuracy, and omit comparison of performance.

1.3 Goal Criteria

The criteria for fulfilling our first goal are:

1. Theoretical foundation: We define a theoretically-founded benchmarking methodology for assessing the accuracy of points-to analysis.

2. Efficiency: We provide tool support for applying this benchmarking methodology.

3. Practicability: We show the practicability of the benchmarking methodology by presenting an experimental comparison of at least two existing, fundamentally different points-to analysis implementations.

The criteria for fulfilling our second goal are:

4. Distribution: We define a converging process that allows researchers to collaborate on working towards a Gold Standard for points-to analysis.

5. Efficiency: We provide tool support for applying this distributed process.

6. Feasibility: We show the feasibility of the process by presenting a first attempt towards a Gold Standard.


1.4 Tasks

For our first goal, we perform the following tasks:

• Make a survey of evaluation methods for points-to analysis in the literature, and point out commonalities and differences among them.

• Define a theoretical framework for assessing the accuracy of may-dataflow analysis results in general, and points-to analysis in particular. We have published work on this in [32].

• Propose a benchmarking methodology for assessing accuracy of points-to analysis.

• Provide a benchmarking platform where researchers can submit results from their points-to analysis implementations, and compare them to other submitted results (by themselves or other researchers).

• Connect at least two different points-to analysis implementations to the benchmarking platform and experimentally compare them with the help of the platform. We have performed such comparisons for the first time in [32] and described an initial tool set for comparing different points-to analysis implementations in [31].

For our second goal, we perform the following tasks:

• Propose a converging process that researchers should follow for a joint effort on creating a Gold Standard for points-to analysis.

• Extend the benchmarking platform so that it supports the single steps of this process.

• Describe which (semi-)manual efforts must aid the creation of a Gold Standard, and how. What such manual efforts might look like has been described in [35].

• Find and evaluate new ways for improving the accuracy of points-to analysis. We have published several papers on this [34, 52, 35, 36].

• Improve dynamic coverage of benchmark programs with respect to points-to analysis. This has been partly addressed in [35].

• Decrease the performance overhead of dynamic analysis in order to allow efficient collection of dynamic points-to information. We have presented an approach to this in [33].
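The converging process behind these tasks – a dynamic under-approximation and a conservative static over-approximation that bracket the exact result – can be sketched with plain set operations. This is a minimal sketch with made-up names, not the thesis’ tooling:

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: for some program entity, the exact points-to set lies between a
// dynamic under-approximation and a conservative static over-approximation.
class GoldStandardSketch {
    // The elements in (static \ dynamic) are exactly the candidates that
    // remain to be confirmed or refuted, e.g., by manual inspection or by
    // improving either analysis.
    static Set<String> remainingCandidates(Set<String> dynamicUnder,
                                           Set<String> staticOver) {
        Set<String> diff = new TreeSet<>(staticOver);
        diff.removeAll(dynamicUnder);
        return diff;
    }

    public static void main(String[] args) {
        Set<String> dyn  = Set.of("o1");              // observed at run time
        Set<String> stat = Set.of("o1", "o2", "o3");  // statically possible
        System.out.println(remainingCandidates(dyn, stat)); // prints [o2, o3]
    }
}
```

Improving either bound shrinks this difference set, which is why the tasks above target both better dynamic coverage and more accurate static analysis.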



1.5 Motivation

Points-to analysis is an important research area, as it is widely used in many types of client applications in optimizing compilers and software engineering tools. Examples of applications in optimizing compilers are virtual call resolution (e.g., [43, 61]), thread-escape analysis, and type refinement [92].

Examples of software engineering tool applications are: metrics analyses computing coupling and cohesion between objects [26, 38] and architectural recovery by class clustering proposing groupings of classes, either based on coupling and cohesion or directly on reference information [84, 75]. Source code browsers compute forward and backward slices [41] of a program point which, in turn, requires reference information. In software testing, class dependencies determine the test order [13, 88, 59]. Reverse engineering of UML interaction diagrams requires very accurate reference information in order to be useful [89]. Finally, static design pattern detection needs to identify the interaction among participating classes and object instances in order to exclude false positives [76].

Sim et al. observe that the development of benchmarks in computer science disciplines is often accompanied by technical progress and community building [82]. The lack of such benchmarks, in turn, makes it difficult to further develop a field by adopting the successful and avoiding the less promising approaches.

A benchmark for points-to analysis would specify what programs to analyze, and how to actually measure accuracy. The latter requires a set of well-defined client analyses, i.e., concrete applications of points-to information, as points-to information in itself has little value and is, in general, not comparable experimentally among different implementations.

No such commonly accepted benchmark exists for points-to analysis. Quite the contrary: For points-to analysis, even qualitative comparability of results from different research groups is not straightforward because different authors often do not use the same set of client analyses for evaluation. Further, even if authors use the same client analyses, they often use different interpretations of them. To exemplify this matter, consider the commonly used client analysis “call graph construction,” which computes which methods can possibly call what other methods during any program run. Authors often make different assumptions on whether or not to include static initializers, and whether or not (or to what extent) calls to and from library methods are included in the call graph. Additionally, applying the same client analyses to different versions of the same programs makes the results incomparable again, as different versions of the same program must be considered as different programs from a points-to analysis’ perspective. Unfortunately, it happens frequently that versions of analyzed programs are not specified in research publications.
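To illustrate one such interpretation difference (our own hypothetical example, not from the thesis), consider static initializers. Whether the implicit call edge from a class initializer counts changes the reported call graph for the very same program:

```java
// Under one interpretation, the call graph for main contains only the edge
// main -> greet. Under another, it also contains the implicit edge
// Config.<clinit> -> Config.init, triggered by the first access to NAME.
class Config {
    static final String NAME = init();  // runs in the class initializer
    static String init() { return "benchmark"; }
}

class CallGraphExample {
    static String greet() { return "hello, " + Config.NAME; }

    public static void main(String[] args) {
        System.out.println(greet());  // prints "hello, benchmark"
    }
}
```

Two tools reporting “number of call graph edges” for this program can thus legitimately disagree, even if both are correct under their own interpretation.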


Moreover, common points-to analysis implementations often support only a subset of the programming language that they are written for. For instance, in Java, dynamic class loading (or reflection in general) and native methods cause problems. The supported subsets of different implementations are often not equal, which again hampers comparability of analysis results – the lack of support for a given programming language feature means that only a subset P’ of a given program P is actually analyzed. If P’ differs between two points-to analysis implementations, then comparability is not possible even for results for the same program in the same version.
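For instance (a hypothetical sketch, not code from the thesis), reflective instantiation hides the allocated type behind a run-time string, so a static points-to analysis that does not model reflection sees no allocation site here at all:

```java
class ReflectiveLoad {
    // The class name is a run-time value; without a model of reflection,
    // a static analysis cannot tell which type this call allocates.
    static Object load(String className) throws Exception {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object o = load("java.lang.StringBuilder");
        System.out.println(o.getClass().getName());
    }
}
```

An implementation that ignores such calls effectively analyzes only a subset P’ of the program P, which is exactly the comparability problem described above.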

Susan Horwitz is cited in [39]:

Improvements proposed by researchers seem promising, but seldom are claims independently verified, and often promising leads are abandoned. It seems that duplicating others’ results is considered very important in the physical sciences, but gets short shrift in computer science. Should we/can we change that attitude?

To our knowledge, only a few fundamental approaches to points-to analysis have been individually verified, because not all researchers make their implementations publicly available – there may be good reasons for that, like licensing issues or simply shortage of time for making an experimental implementation “fit for public use” – and re-implementing another’s ideas is tedious work and has little chance of success in becoming a publication, which is the researcher’s daily bread.

The criteria for our first goal tackle all of the above problems: By defining a benchmark methodology and providing tool support in the form of a platform where researchers can submit their results for a specified set of programs and according to given rules, results become comparable. Further, results can be easily made available for assessment by third parties, and the need to disclose source code for third-party verification is diminished.

Barbara Ryder is cited in the very same paper:

We can all write an unbounded number of papers that compare different pointer analysis1 approximations in the abstract.

Our first goal aims at easing this issue, as comparing different points-to analyses experimentally is not a straightforward task. However, this citation is taken a bit out of context. Ryder continues:

However, this does not accomplish the key goal, which is to design and engineer pointer analyses that are useful for solving real software problems for realistic programs.

1Note that pointer analysis and points-to analysis are equivalent terms.


The more accurate the underlying points-to analysis becomes, the better the client application (the solution to the “software problem,” as Ryder calls it) becomes. Using a Gold Standard for points-to analysis as input to a client application would show its full potential (or lack thereof) in its respective application area. With this, researchers could determine whether there even exists a points-to analysis that is accurate enough for a given software problem.

The only attempt known to us at creating a Gold Standard for an analysis related to points-to analysis – computing exact call chains for a given program – was performed by Rountev et al. [73, 72]. The authors took a lower bound for the call chains as obtained by a dynamic analysis and an upper bound as obtained from a static analysis. Then, for each call chain reported only by the static analysis, they either created input for the dynamic analysis that exercises that call chain, or they tried to prove that the call chain is infeasible. Beginning with an under- and an over-approximation obtained by dynamic and static analysis, respectively, each element in the difference between the two was inspected manually. The authors did not have any specialized tool support for this task.

A platform that allows researchers to work together on these tasks would distribute, and thus ease, the effort for everybody.

The amount of time required for such an approach is obviously dependent on the quality of both the dynamic analysis and the static analysis: The closer both the under- and the over-approximations are together, the less work is left to do afterwards. For the dynamic analysis, this must be achieved by more and diverse input to running the programs. The static analysis results improve through more accurate algorithms. Both approaches can, in general, never define the Gold Standard by themselves as the general problem is not decidable: A program cannot be fed with all possible input (there is infinitely much), and a static analysis needs to perform certain abstractions, which leads to inaccuracies.
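The process described above boils down to simple set arithmetic: the facts that must be inspected manually are exactly those reported by the static over-approximation but never observed by the dynamic under-approximation. The following sketch illustrates this; the class and method names are our own and do not come from any of the cited tools:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: the set of facts left for manual inspection
// is the difference between the static over- and the dynamic under-approximation.
class GoldStandardCandidates {
    static Set<String> toInspect(Set<String> dynamicFacts, Set<String> staticFacts) {
        Set<String> candidates = new HashSet<>(staticFacts); // start from the upper bound
        candidates.removeAll(dynamicFacts);                  // drop everything already proven by a run
        return candidates;
    }
}
```

Every fact in the difference must either be exercised by new input (moving it into the dynamic result) or be proven infeasible (removing it from the static result); once the difference is empty, both bounds coincide and define the exact result.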

1.6 Thesis Outline

This thesis is organized as follows: In Chapter 2, we describe concepts of points-to analysis and also present three concrete implementations. In Chapter 3, we discuss related work with respect to this thesis. Then, in Chapter 4, we discuss theoretical considerations when comparing experimental results of two may-dataflow analyses in general. As we shall see, this is possible only in


special cases or with the help of a Gold Standard. We present a benchmarking methodology for points-to analysis in Chapter 5 that is based on making these special cases appear more often in practice. In Chapter 6, we describe a process for creating a Gold Standard, which is based on improving both dynamic and conservative static analysis. We also present new improvements to both static and dynamic analysis. In Chapter 7, we describe a benchmarking platform that supports the application of our benchmarking methodology and that also aids our proposed process for creating a Gold Standard. In Chapter 8, we evaluate the different contributions of this thesis. Finally, in Chapter 9, we conclude this thesis and discuss future work.

1.7 Disclaimer

This thesis is based on previous publications, and it is especially an extension of the author's licentiate thesis [31]. Therefore, some text sections are similar, yet revised and/or extended, and in some cases almost identical to text found in those publications. In particular, the following parts of this thesis are closely based on previous publications: Chapter 2 is, with some adaptations, taken from [31]. Chapter 4 is based on [32, 31]; Section 6.3 is based on [34, 32, 36] and Section 6.4 is based on [33].


Chapter 2

Points-to Analysis

In this chapter, we describe the technical aspects of points-to analysis. We describe the general idea of how points-to analyses work, name different variation points that affect its accuracy and performance, point out open problems with respect to soundness, and describe three concrete implementations.

This chapter is organized as follows: First, we present, in Section 2.1, general concepts of program analysis on which points-to analysis is founded. Then we give an overview of how points-to analysis works internally in Section 2.2.

We then discuss different variation points of points-to analysis that influence its accuracy: In Section 2.3, we discuss naming schemes, that is, how points-to analysis deals with the potentially infinitely many runtime objects.

Then we discuss flow sensitivity in Section 2.4, context sensitivity in Section 2.5, modeling of the abstract heap in Section 2.6, and path sensitivity in Section 2.7. We then discuss open problems in Section 2.8 and present three concrete points-to analysis implementations: Spark, Paddle, and Points-to SSA-based Simulated Execution in Section 2.9. We conclude this chapter in Section 2.10.

2.1 Program Analysis

In this section, we first describe some more general concepts of program analysis on which points-to analysis is based.

2.1.1 Fundamental Concepts

There are two ways to formulate a program analysis question: (1) What facts must hold for a given program (must-analysis), and (2) what facts may hold for a given program (may-analysis).

Then, there are two fundamental approaches to solve such a program analysis question: (a) conservative and (b) optimistic analysis. Intuitively, a conservative analysis will be careful when making statements about a program in case of uncertainties, while an optimistic analysis assumes to know the whole truth (even if it does not). In the following, we briefly discuss all four possible combinations of these two concepts.


1a A conservative must-analysis makes statements about a given program only if these statements are 100% guaranteed for each execution of the program; thus, a correct (yet not useful) answer to a given analysis question by this kind of analysis is to say nothing at all.

1b An optimistic must-analysis reports facts about the program that it cannot disprove (but, on the other hand, does not prove either).

2a A conservative may-analysis aims to exclude facts about a program that it can prove as not being possible. A conservative (yet, again, not useful) answer to a “may” question is that everything is possible in the program.

2b An optimistic may-analysis answers a given analysis question by collecting facts that are true for at least one execution of the program. However, other facts that are not found by the analysis may also hold.

Conservative must-analysis and optimistic may-analysis compute under-approximations of the exact answer, whereas optimistic must-analysis and conservative may-analysis compute over-approximations of the exact answer.

As an example for the two may-analyses, consider the following program analysis question: What values may a given integer variable v of a given program assume during an execution of the program? An optimistic analysis could now observe cases in which v assumes values from 5 to 10, while a conservative analysis could prove that v cannot assume values bigger than 20 or smaller than 3. The exact answer must then lie somewhere in between.
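The integer-variable example can be made concrete: an optimistic analysis grows an interval from observed values, while a conservative analysis only offers proven outer bounds. The following sketch is our own illustration, not any of the analyses discussed later:

```java
// Hypothetical illustration: under- vs. over-approximating the range of v.
class RangeExample {
    // Optimistic (observing) view: extend the interval with each observed value.
    static int[] observe(int[] range, int value) {
        return new int[] { Math.min(range[0], value), Math.max(range[1], value) };
    }
    public static void main(String[] args) {
        int[] optimistic = { 5, 5 };          // first observed value: 5
        optimistic = observe(optimistic, 10); // now [5, 10] -- an under-approximation
        int[] conservative = { 3, 20 };       // proven bounds -- an over-approximation
        // The exact range lies in between: [3, 20] contains [5, 10].
        System.out.println(conservative[0] <= optimistic[0] && optimistic[1] <= conservative[1]);
    }
}
```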

A similar must-analysis question could be formulated: Must v > 0 always hold? Now an optimistic analysis would attempt to disprove it, and if it fails to do so, it would answer the question with “yes.” A conservative must-analysis would attempt to prove that this is the case, and it would answer “no” unless it succeeds.

Since points-to analysis is formulated as a “may”-analysis, we omit “must”-analyses in the remainder of this thesis.

2.1.2 Static vs. Dynamic Analysis

There are two approaches to perform program analysis: static and dynamic analysis.

The former analyzes a program without actually executing it, i.e., independent of a given input¹. The latter monitors concrete executions of the program under given inputs in order to collect information.

Consider our problem from above – what values may a given variable assume during any program run. Then, a static analysis might analyze which

¹ Input can be, for example, files read from the hard drive, arguments given on the command line, and also user input via mouse or keyboard.


statements can influence this variable, find constraints on these statements (e.g., what values other variables influencing the variable in question may be assigned), and finally approximate the solution to the problem. Static analysis over-approximates the result and is thus often considered to be conservative; however, lack of support for certain parts of the programming language or for native methods may render the analysis results unsound.

On the other hand, the results of a dynamic analysis are valid for the analyzed runs in question but cannot be generalized. For example, a dynamic analysis solving the same analysis question could simply record the values that the variable is being assigned; however, since the program can be monitored under all possible input only in trivial cases, this will lead to an under-approximation of the exact analysis results. Dynamic analysis is thus an optimistic analysis.
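Such a dynamic analysis can be approximated by plain instrumentation: record every value the variable is assigned during a run. The sketch below, our own simplified illustration, shows why the result is an under-approximation: it contains only values reached by the inputs that were actually tried.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration: recording the observed values of one variable.
class ValueRecorder {
    static Set<Integer> observed = new HashSet<>();
    // Instrumented assignment: records the value, then behaves like "v = value".
    static int assign(int value) {
        observed.add(value);
        return value;
    }
    public static void main(String[] args) {
        int v = assign(5);      // "program under test", run with one concrete input
        v = assign(v + 5);
        // observed = {5, 10}: valid facts, but only for the runs we monitored.
        System.out.println(observed);
    }
}
```

Other inputs might drive v to values outside the recorded set, which is exactly why the dynamic result cannot be generalized.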

2.1.3 Approaches to Static Analysis

There are different approaches to static program analysis, e.g., constraint-based approaches [64, 8] and dataflow analysis. We describe only the latter, as the concrete points-to analysis implementations that we present later in this chapter are dataflow analysis-based approaches.

The basis for dataflow analysis is the theory of monotone dataflow frameworks (MDF) [57, 64]. An MDF is defined by a bounded value lattice, i.e., a partially ordered set with top and bottom elements, LV = {V, ⊔, ⊓, ⊤, ⊥}, and a set F of monotone transfer functions f : LV → LV. Transfer functions are derived from concrete semantics of programming language constructs, whereas value lattices abstract answers to the analysis question, e.g., the values a variable may take during a program run.

It is required that the transfer functions are monotone, i.e., it holds

∀v, w ∈ V, ∀f ∈ F : v ⊑ w ⇒ f(v) ⊑ f(w),

and that the value lattice satisfies the ascending chain condition, i.e., for every infinite ascending chain v0 ⊑ v1 ⊑ … ⊑ vi ⊑ … in V, there is an element vi such that j > i ⇒ vi = vj. This implies that termination of the analysis is guaranteed: Since the intermediate analysis results only get bigger, the algorithm terminates as soon as applying the transfer functions to all intermediate analysis results does not change their values, i.e., the analysis reaches a fixed point. There may, in general, be more than one fixed point for a given analysis problem.
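The fixed-point computation that an MDF guarantees to terminate can be sketched as a worklist algorithm over a powerset lattice (here ⊑ is ⊆ and ⊔ is ∪). This is a generic sketch of the technique, with names of our own choosing, not the implementation of any analysis presented in this thesis:

```java
import java.util.*;
import java.util.function.Function;

// Hypothetical illustration: worklist fixed-point iteration over sets of facts.
// Nodes are 0..n-1; succ lists the successors of each node; each node's
// transfer function must be monotone for termination to be guaranteed.
class Fixpoint {
    static List<Set<String>> solve(int n, int[][] succ,
                                   List<Function<Set<String>, Set<String>>> transfer) {
        List<Set<String>> value = new ArrayList<>();
        for (int i = 0; i < n; i++) value.add(new HashSet<>());
        Deque<Integer> worklist = new ArrayDeque<>();
        for (int i = 0; i < n; i++) worklist.add(i);
        while (!worklist.isEmpty()) {
            int node = worklist.poll();
            Set<String> out = transfer.get(node).apply(value.get(node));
            for (int s : succ[node]) {
                // merge-only update: values can only grow, so the ascending chain
                // condition bounds the number of changes and iteration terminates
                if (value.get(s).addAll(out)) worklist.add(s);
            }
        }
        return value;
    }
}
```

For a three-node chain in which only the first node generates a fact (say, an abstract object o1), the solver propagates {o1} to both successors and stops once no value changes any more.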

2.2 Points-to Analysis Overview

The task of points-to analysis, as well as its use in client applications, has already been presented in the introduction of this thesis. In this and the


following sections, we describe points-to analysis as an instantiation of the MDF and discuss different aspects that affect its analysis accuracy and cost.

Many points-to analyses work as follows: A program is represented by a program graph where nodes correspond to program points, and edges correspond to control and data dependencies among them. Starting from a set of entry points, the program graph is created by computing what code is potentially reachable through them. These entry points are methods that are called through system initialization; in Java, these are at least the program's main() method as well as System.initializeSystemClass(). Analysis values for each node are computed iteratively by merging values from predecessor nodes and by applying transfer functions that represent the abstract program behavior at these nodes. For instance, Allocation-nodes create abstract objects; Load- and Store-nodes read from and write to a common abstract heap. At control flow confluence points, analysis values are merged, as the points-to analysis must assume that every possible control flow path might be taken.

Points-to analysis needs to abstract from the values which expressions may take during a real application run, as it is impossible to statically model the exact program state at any time of any possible run of a program. In particular, the possibly infinitely many runtime objects need to be mapped to a finite set O of abstract objects. An abstraction that maps concrete objects to abstract objects is called a naming scheme. Naming schemes are further discussed in the next section.

In a points-to analysis, reference variables will in general hold references to more than one abstract object. Hence, each points-to value v in the analysis of a program is an element in the points-to value lattice LV = {V, ⊔, ⊓, ⊤, ⊥}, where V = 2^O is the power set of O, ⊤ = O, ⊥ = ∅, and ⊔, ⊓ are the set operations ∪ (union) and ∩ (intersection), respectively. The height of the points-to value lattice is h_o = |O|. We use the notation pt(a) to refer to the points-to value that is referenced by the variable or expression a.

The analysis stops once a fixed point is reached, which is guaranteed to happen with the above abstractions and if the transfer functions are monotone, e.g., if no strong updates are performed (no analysis values are ever overwritten; they are only added and merged). Having no strong updates is sufficient but not necessary for the MDF criterion of monotonicity, cf. Section 2.1.3.
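A merge-only points-to propagation over such a lattice fits in a few lines: points-to sets only ever grow (⊔ is set union), so iterating the assignment constraints until nothing changes reaches a fixed point. The sketch is our own minimal illustration, far simpler than the implementations presented in Section 2.9:

```java
import java.util.*;

// Hypothetical illustration: merge-only propagation of points-to sets
// along simple assignment constraints "target = source".
class TinyPointsTo {
    static Map<String, Set<String>> solve(Map<String, Set<String>> pt, String[][] assigns) {
        boolean changed = true;
        while (changed) {                     // iterate until the fixed point is reached
            changed = false;
            for (String[] a : assigns) {
                Set<String> target = pt.computeIfAbsent(a[0], k -> new HashSet<>());
                Set<String> source = pt.getOrDefault(a[1], Set.of());
                if (target.addAll(source)) changed = true;  // weak (merge-only) update
            }
        }
        return pt;
    }
}
```

Starting from pt(a) = {o1} and the assignments b = a and c = b, the iteration stabilizes with pt(b) = pt(c) = {o1}; no set ever shrinks, which mirrors the absence of strong updates discussed above.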

2.3 Naming Schemes

A program analysis needs to abstract from the values which expressions may take during a real application run in some way, as it is impossible to model the exact program state at any time of any possible run of a program. For objects, such an abstraction is called a naming scheme. For a given program


and naming scheme, there is then a finite set O of abstract objects. Each abstract object o ∈ O represents a set of concrete runtime objects r(o). For this, the following must hold:

∀o1, o2 ∈ O : o1 ≠ o2 ⇒ r(o1) ∩ r(o2) = ∅

Thus, an abstract object may denote an arbitrary number of runtime objects, but each runtime object must be represented by exactly one abstract object.

Two well-known naming schemes are the class naming scheme and the creation site naming scheme. For the former, one abstract object per class is used; for the latter, objects created at the same syntactical location are grouped together. While the former requires fewer resources (for instance, fewer abstract objects can be represented by data structures that require less memory) and is sufficient for, e.g., call graph construction, the latter is more accurate and should be preferred for more sophisticated analyses [74].

Even more accurate naming schemes are also possible; for example, objects can – in addition to their creation site – be categorized by their calling context, cf. the discussion on context sensitivity below. Such approaches have been used by, e.g., Liang et al. [48] as well as Lhoták and Hendren [44].
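The two naming schemes differ only in the key under which runtime objects are grouped. A minimal sketch may make this concrete; it is our own illustration, and the method names are hypothetical:

```java
// Hypothetical illustration: mapping a runtime allocation to an abstract object name.
class NamingSchemes {
    // Class naming scheme: one abstract object per class; the allocation site is ignored.
    static String byClass(String className, int allocationLine) {
        return className;
    }
    // Creation site naming scheme: one abstract object per syntactic allocation site.
    static String byCreationSite(String className, int allocationLine) {
        return className + "@" + allocationLine;
    }
}
```

Two objects allocated by `new T()` at lines 7 and 11 share the single abstract object `T` under the class scheme but get distinct abstract objects `T@7` and `T@11` under the creation site scheme; in both schemes, every runtime object maps to exactly one abstract object, as required above.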

2.4 Flow Sensitivity

Flow sensitivity is a concept that is frequently used, but there is no consensus as to its exact definition [56]. Informally, an analysis is flow-sensitive if it takes control-flow information into account [39]. Many authors also require the use of so-called strong (or killing) updates as a criterion for flow sensitivity [74].

Strong updates occur when an assignment supersedes (or kills the result of) an earlier assignment. The problem with strong updates is that they are only permitted if the ordering of the reads and writes of a given variable is certain, and if the variable identifies a unique memory location. For local variables, these cases can be detected using a def-use analysis, i.e., an analysis that computes for every definition of a variable all uses of that variable along a definition-free control flow path. One way to achieve this is to base the dataflow analysis on an SSA-based representation, which implies local flow sensitivity as demonstrated by Hasti and Horwitz [37]. In such a case, the strong updates do not violate the monotonicity criterion of an underlying MDF.
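A small example may help: in the snippet below, a flow-sensitive analysis may apply a strong update at the second assignment to the local variable t, because t names a unique memory location and the ordering of the two writes is certain. The code is our own illustration:

```java
// Hypothetical illustration of a strong (killing) update on a local variable.
class StrongUpdateExample {
    static Object o1 = new Object();
    static Object o2 = new Object();
    static Object demo() {
        Object t = o1;   // flow-sensitive analysis: pt(t) = { o1 }
        t = o2;          // strong update kills o1:  pt(t) = { o2 }
        return t;        // a flow-insensitive analysis would merge: pt(t) = { o1, o2 }
    }
}
```

At runtime, demo() can only ever return the object in o2, which is exactly the result the strong update preserves.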

2.5 Context Sensitivity

In a context-insensitive analysis, analysis values of different call sites are propagated to the same method and get mixed there. For example, a context-insensitive analysis does not distinguish between arguments passed to (and


values returned by) a method through calls at program points S1 and S2. The analysis value is then the merger of all calls targeting that method. Thus, results from two distinct calls to the same method are merged, which induces inaccuracies in the analysis result of each of these calls. A context-sensitive analysis addresses this source of inaccuracy by distinguishing between different calling contexts of a method. It analyzes a method separately for each calling context [74].

Context sensitivity will therefore, in general, yield a more accurate analysis. The drawbacks are the increased memory cost that comes with maintaining a larger number of contexts and their analysis values, along with the increased analysis time that may be required to reach a fixed point.

For an example of how context sensitivity can improve analysis accuracy, look at Figure 2.1.

     1:  class T {
     2:    T self() {
     3:      return this;
     4:    }
     5:    void foo() {
     6:      // assume pt(this) = { o1 }
     7:      T t1 = self();
     8:    }
     9:    void bar() {
    10:      // assume pt(this) = { o2 }
    11:      T t2 = self();
    12:    }

Figure 2.1: Example for benefits of context-sensitive analysis

Assume that the method foo() is analyzed in a context with points-to set pt(this) = {o1}, and that bar() is analyzed in a context with points-to set pt(this) = {o2}. In the case of context-insensitive analysis, the points-to set for the implicit this-parameter of method self() is, once the fixed point of the analysis is reached, pt(this) = {o1, o2}, which is also the analysis result of the method. Therefore, each of the two variables t1 and t2 at lines 7 and 11, respectively, points to both abstract objects. A context-sensitive analysis could analyze self() in separate contexts for each call: one calling context (self, {o1}) for the call at line 7, and one calling context (self, {o2}) for the call at line 11. Then, t1 at line 7 would point to o1 only, and t2 at line 11 would point to o2 only, which is a more accurate analysis result.
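The difference between the two results for Figure 2.1 can be reproduced mechanically: a context-insensitive analysis keeps one merged points-to set per method, while a context-sensitive one keeps one per calling context. A minimal sketch (our own illustration, with hypothetical names):

```java
import java.util.*;

// Hypothetical illustration: analysis results for self() in Figure 2.1,
// given the receiver points-to sets at its call sites.
class ContextSensitivityDemo {
    // Context-insensitive: one points-to set per method, merged over all calls.
    static Set<String> insensitive(List<Set<String>> receiverSetsAtCalls) {
        Set<String> merged = new HashSet<>();
        for (Set<String> s : receiverSetsAtCalls) merged.addAll(s);
        return merged;
    }
    // Context-sensitive: one points-to set per calling context (here: per call site).
    static List<Set<String>> sensitive(List<Set<String>> receiverSetsAtCalls) {
        return new ArrayList<>(receiverSetsAtCalls);
    }
}
```

For the calls at lines 7 and 11 with receiver sets {o1} and {o2}, insensitive() yields the merged set {o1, o2} for both t1 and t2, while sensitive() keeps {o1} and {o2} apart, one per context.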

Context-sensitive approaches use a finite abstraction of the call stack possibly occurring at each call site in order to separate different calling contexts. The two traditional approaches to define a context are referred to as the call string approach and the functional approach [80]. The call string approach defines a context by the top k callers, i.e., return addresses on the


2.6 Abstract Heap Modeling

An important design decision of points-to analysis is how to model the ab- stract heap.

An analysis is field-sensitive if it, for an expression o.f, takes both o and f to determine the abstract memory location of the referenced fields;

non-field-sensitive approaches would use only either o (field-independent) or f (field-based) instead. Whaley and Lam showed that field sensitivity is essential for points-to analysis for strictly typed languages such as Java, not only for the accuracy but even for the performance of the analysis [91].

Often, there is one global abstract heap, but Trapp mentions the possibil- ity of partitioning the heap [90]. Prabhu and Shankar present an approach to model the heap in a field flow sensitive way, which allows maintaining differ- ent heap states for different analysis paths [69]. This could also be achieved by using so-called χ-terms as presented by Trapp [90].

2.7 Path Sensitivity

An analysis is path-sensitive if it takes the feasibility of different execution paths into account. Feasibility is determined by evaluating the expressions in control flow statements.

Many path-sensitive approaches deal with the meet over all paths (MOP) dataflow problem, e.g., [17]. Since the number of paths is, in general, un- bounded, approaches narrow down the set of paths, e.g., by finding correla- tions between branch conditions [28, 90]. Xie et al. [94] use path-sensitive analysis in their array access checker. Their approach to path sensitivity se- lects a set of execution paths – both a super- and subset of legal paths – and Chapter 2. Points-to Analysis

call stack top [81], referred to as the family of k-CFA (Control Flow Analy- sis). The functional approach uses some abstractions of the call site’s actual parameters to distinguish different contexts [80, 30]. Both the call string ap- proach and the functional approach were evaluated and put into a common framework by Grove et al. [30].

A well-known functional approach designed for object-oriented languages is called object sensitivity [60, 61]. It distinguishes contexts by separately analyzing the targeted method for each abstract object in the implicit this- parameter. Similarly to k-CFA, a family of k-object-sensitive algorithms distinguishes contexts by the top k abstract target objects on the call stack.

The authors evaluated a simplified version of 1-object-sensitivity. Here, only method parameters and return values are treated context-sensitively. Com- pared to 1-CFA, increased accuracy of side-effect analysis and, to a lesser degree, call graph construction, was reported. Both approaches, 1-CFA and 1-object-sensitivity, show similar costs in time and memory. These results generalize to variants where k > 1, which, however, are very costly in terms of memory and provide only a small increase in accuracy [44]. A varia- tion of object sensitivity, this sensitivity, has been presented by Lundberg et al. [53, 52]. In contrast to object sensitivity, which analyzes a method sepa- rately for each abstract object reaching the implicit this-variable, this sensitiv- ity analyzes a method separately for each set of abstract objects reaching the implicit this-parameter. In practice, 1-this-sensitivity is almost as accurate as 1-object-sensitivity but an order of magnitude faster.

2.5.1 Context Definitions

A context definition is a rule that associates a call with a set of contexts under which the target method should be analyzed. Each context is defined by a tuple; the tuple elements (their number and content) depend on what context definition is being used. In this thesis, we will use the following context definitions for a given call from a call site csi: a.m(v1, . . . , vn) where pt(a) = {o1, . . . , op}.

ConIns: csi → {(m)}

All calls targeting method m are mapped to the same context. This is the context-insensitive baseline approach.

CallString: csi → {(m, csi)}

Calls from the same call site csi are mapped to the same context.

ObjSens: csi → {(m)} if m.isStatic, {(m, o1), . . . , (m, op)} otherwise.

Calls targeting the same receiving abstract object oi ∈ pt(a) are mapped to the same context. Static calls are handled context-insensitively.



ThisSens: csi → {(m)} if m.isStatic, {(m, pt(a))} otherwise.

Calls targeting the same points-to set pt(a) are mapped to the same context. Static calls are handled context-insensitively.

For example, given a (non-static) call a.m(v1) with pt(a) = {o1, o2}, ThisSens would map it to the single context (m, {o1, o2}), whereas ObjSens would map it to the two contexts (m, {o1}) and (m, {o2}).

Note that ObjSens is the only context definition in this selection that may associate a call with more than one context; conversely, multiple call sites csi are, in general, associated with the same context.

There exist many more context definitions, e.g., the extensions of ObjSens and ThisSens to all method parameters, not just the implicit this parameter, or the combination of CallString with one of the functional approaches.
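The four context definitions above can be sketched as simple mapping functions. The sketch below is our own illustration: contexts are rendered as strings, and call sites, methods, and abstract objects are plain names rather than parts of a real analysis framework.

```java
import java.util.*;

// Illustrative sketch of the context definitions from Section 2.5.1.
// All names (methods, call sites, abstract objects) are plain strings.
class ContextDefs {
    // ConIns: every call targeting m maps to the single context (m).
    static Set<String> conIns(String m) {
        return Set.of("(" + m + ")");
    }

    // CallString: calls from the same site cs map to the context (m, cs).
    static Set<String> callString(String m, String cs) {
        return Set.of("(" + m + ", " + cs + ")");
    }

    // ObjSens: one context (m, o) per abstract receiver object o in pt(a);
    // static methods are handled context-insensitively.
    static Set<String> objSens(String m, boolean isStatic, SortedSet<String> ptA) {
        if (isStatic) return conIns(m);
        Set<String> ctxs = new TreeSet<>();
        for (String o : ptA) ctxs.add("(" + m + ", " + o + ")");
        return ctxs;
    }

    // ThisSens: a single context (m, pt(a)) keyed by the whole points-to set.
    static Set<String> thisSens(String m, boolean isStatic, SortedSet<String> ptA) {
        if (isStatic) return conIns(m);
        return Set.of("(" + m + ", " + ptA + ")");
    }

    public static void main(String[] args) {
        // The running example: a (non-static) call a.m(v1) with pt(a) = {o1, o2}.
        SortedSet<String> ptA = new TreeSet<>(Set.of("o1", "o2"));
        System.out.println("ObjSens: " + objSens("m", false, ptA));   // two contexts
        System.out.println("ThisSens: " + thisSens("m", false, ptA)); // one context
    }
}
```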

2.6 Abstract Heap Modeling

An important design decision of points-to analysis is how to model the abstract heap.

An analysis is field-sensitive if, for an expression o.f, it uses both o and f to determine the abstract memory location of the referenced field; non-field-sensitive approaches use only either o (field-independent) or f (field-based) instead. Whaley and Lam showed that field sensitivity is essential for points-to analysis of strictly typed languages such as Java, not only for the accuracy but even for the performance of the analysis [91].
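The three policies can be illustrated by how they key the abstract location of a field access o.f (a minimal sketch with our own naming; abstract objects and fields are plain strings):

```java
// Sketch of the three heap abstractions for a field access o.f:
// field-sensitive keys the abstract location on the pair (o, f),
// field-independent on o alone, and field-based on f alone.
class FieldAbstraction {
    static String fieldSensitive(String o, String f)   { return o + "." + f; }
    static String fieldIndependent(String o, String f) { return o; }
    static String fieldBased(String o, String f)       { return f; }

    public static void main(String[] args) {
        // o1.f vs o1.g on the same abstract object:
        // only the field-independent variant conflates them.
        System.out.println("independent conflates o1.f/o1.g: "
                + fieldIndependent("o1", "f").equals(fieldIndependent("o1", "g")));
        // o1.f vs o2.f on different abstract objects:
        // only the field-based variant conflates them.
        System.out.println("based conflates o1.f/o2.f: "
                + fieldBased("o1", "f").equals(fieldBased("o2", "f")));
        // Field sensitivity keeps all these locations apart.
        System.out.println("sensitive conflates o1.f/o1.g: "
                + fieldSensitive("o1", "f").equals(fieldSensitive("o1", "g")));
    }
}
```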

Often, there is one global abstract heap, but Trapp mentions the possibility of partitioning the heap [90]. Prabhu and Shankar present an approach to model the heap in a field flow sensitive way, which allows maintaining different heap states for different analysis paths [69]. This could also be achieved by using so-called χ-terms as presented by Trapp [90].

2.7 Path Sensitivity

An analysis is path-sensitive if it takes the feasibility of different execution paths into account. Feasibility is determined by evaluating the expressions in control flow statements.
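As a hypothetical illustration, in the fragment below both branches are controlled by the same condition, so the path that skips the first branch but takes the second is infeasible. A path-insensitive analysis still merges that path in and would, for instance, report that classify may return null; a path-sensitive analysis evaluates the branch conditions and excludes it.

```java
// Hypothetical fragment with an infeasible path: the two if-statements test
// the same condition, so the combination (skip branch 1, take branch 2)
// can never execute.
class InfeasiblePath {
    static String classify(boolean flag) {
        String s = null;
        if (flag) s = "set";        // branch 1
        if (flag) return s;         // branch 2: s is always non-null here
        return "unset";             // reached only when flag == false
    }

    public static void main(String[] args) {
        System.out.println(classify(true));   // set
        System.out.println(classify(false));  // unset
    }
}
```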

Many path-sensitive approaches deal with the meet over all paths (MOP) dataflow problem, e.g., [17]. Since the number of paths is, in general, unbounded, approaches narrow down the set of paths, e.g., by finding correlations between branch conditions [28, 90]. Xie et al. [94] use path-sensitive analysis in their array access checker. Their approach to path sensitivity selects a set of execution paths – both a super- and subset of legal paths – and
