
http://www.diva-portal.org

Postprint

This is the accepted version of a paper presented at IJCAI-PRICAI 2020 Workshop on Explainable Artificial Intelligence (XAI).

Citation for the original published paper:

Anjomshoae, S., Kampik, T., Främling, K. (2020)
Py-CIU: A Python Library for Explaining Machine Learning Predictions Using Contextual Importance and Utility
In: Proceedings

N.B. When citing this work, cite the original published paper.

Permanent link to this version:


Py-CIU: A Python Library for Explaining Machine Learning Predictions Using Contextual Importance and Utility

Sule Anjomshoae, Timotheus Kampik, Kary Främling

Department of Computing Science
Umeå University, Sweden
{sulea,tkampik,kary.framling}@cs.umu.se

Abstract

In this paper, we present the Py-CIU library, a generic Python tool for applying the Contextual Importance and Utility (CIU) explainable machine learning method. CIU uses concepts from decision theory to explain a machine learning model's prediction specific to a given data point by investigating the importance and usefulness of individual features (or feature combinations) to a prediction. The explanations aim to be intelligible to machine learning experts as well as non-technical users. The library can be applied to any black-box model that outputs a prediction value for all classes.

1 Introduction

In recent years, growing concern regarding the trust and reliability of machine learning-based decision making has drawn attention to more transparent and interpretable algorithms [Gilpin et al., 2018]. Indeed, laws and regulations are moving towards requiring this functionality from information systems to prevent unintended side effects. In particular, the European Union's General Data Protection Regulation (GDPR) gives users the right to be informed regarding machine-generated decisions [Voigt and Von dem Bussche, 2017]. Consequently, individuals who are affected by decisions made by a machine learning-based system may seek to know the reasons for the system's decision outcome. In general, explanations help evaluate the strengths and the limitations of a machine learning model, thereby facilitating trustworthiness and understandability. The Explainable Artificial Intelligence (XAI) research area is investigating various approaches to make the behavior of intelligent autonomous systems intelligible to humans [Gunning, 2017; Samek et al., 2017; Došilović et al., 2018].

One approach to extracting information on how a black-box model reaches a certain decision is post-hoc explanation. Generally, this approach examines the influence of each feature on a local point of interest. This can provide useful information, particularly for practitioners and end-users who are interested in instance-specific explanations rather than the internal workings of the model [Lipton, 2018]. Several tools have been proposed for explaining a specific model prediction, including LIME, SHAP, VIBI, and L2X. LIME explains a prediction by approximating it locally, perturbing the input around the class of interest until it arrives at a linear approximation [Ribeiro et al., 2016]. SHAP explains the outcome by "fairly" distributing the prediction value among the features based on how each feature contributes [Lundberg and Lee, 2017]. Both approaches approximate the local behavior of a black-box system with a linear model. Therefore, they only provide local faithfulness and lose fidelity to the original model. The L2X [Chen et al., 2018] and VIBI [Bang et al., 2019] methods learn a stochastic map that returns a distribution over the subset of features and selects instance-specific features. Both methods provide faithfulness to the original model since they do not restrict the structure of the approximator.

As an alternative, this paper contributes the Py-CIU library to existing XAI tools and methods; it uses the Contextual Importance and Utility (CIU) method for explaining machine learning predictions. In principle, the importance of a feature depends on the other feature values, so that a feature that is important in one context might be irrelevant in another (i.e., the context is the set of input values that are being tested). The Contextual Importance and Utility method explains the model's outcome not only based on the degree of feature importance but also on the utility of the features (their usefulness for the prediction) [Främling, 1996]. Contextual utility gives insight into how much a feature contributes to the current prediction relative to its importance. The utility value adds depth to the explanation alongside the feature importance value. The Python implementation of the CIU method can serve as a tool for researchers to experiment with the algorithm and to apply it for providing explanations to non-technical users.

Figure 1: Example of obtaining CI and CU values for one input-output pair.


2 Contextual Importance and Utility

The Contextual Importance and Utility (CIU) method uses two algorithms to explain predictions made by black-box systems [Främling, 2020; Anjomshoae et al., 2019]. The method originates from a well-known decision-making theory, which suggests that the significance of a feature and the usefulness of its values change according to the other feature values [Keeney et al., 1993]. For instance, the importance of wearing winter clothes in an outdoor temperature of -5 degrees Celsius is high for an individual's optimal comfort level, while the utility value of each piece of clothing would depend on the environment. In the opposite case, a summer outfit would be optimal at 25 degrees; i.e., the usefulness of every item varies depending on the given conditions. Contextual Importance (CI) approximates the overall importance of a feature in the current context, and Contextual Utility (CU) provides an estimation of how favorable or not the current feature value is for a given output class. The contextual importance and contextual utility values are formally defined as:

CI = \frac{Cmax_x(C_i) - Cmin_x(C_i)}{absmax - absmin} \quad (1)

CU = \frac{y_{i,j} - Cmin_x(C_i)}{Cmax_x(C_i) - Cmin_x(C_i)} \quad (2)

where:

• Cmax_x(C_i) and Cmin_x(C_i) are the highest and the lowest prediction values observed by varying the value of the feature(s) x,

• C_i is the context (i.e., the feature or the set of features) being studied, where i is the prediction value,

• absmax − absmin specifies the value range over all predictions,

• y_{i,j} is the instance-specific prediction value.

The approximation of Cmax_x(C_i) and Cmin_x(C_i) is carried out through Monte Carlo simulation by generating a sufficient number of random values within the range of input x and observing the effect on the output. CI and CU values are calculated for each feature x with prediction C_i and y_{i,j} one at a time, while the other input values remain constant. CI corresponds to the ratio of the observed output range [Cmin, Cmax] to the greatest possible output range [absmin, absmax]. If the ratio is greater for one feature than for another, then the former is more important. CU expresses the position of the output value y_{i,j} in the [Cmin, Cmax] range. This shows the degree of contribution to the current prediction value for each feature, considering the contextual importance. Figure 1 illustrates the process of computing contextual importance and utility values for one input-output combination.
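To make this computation concrete, the following is a minimal, library-independent sketch of how Equations 1 and 2 can be approximated for a single feature via Monte Carlo sampling. The function name contextual_importance_utility, the parameter names, and the default absmin/absmax of 0 and 1 (suitable for class probabilities) are illustrative assumptions, not part of the Py-CIU API.

```python
import numpy as np

def contextual_importance_utility(predict_fn, case, feature_index,
                                  feature_range, class_index,
                                  n_samples=1000, absmin=0.0, absmax=1.0):
    """Approximate CI and CU (Equations 1 and 2) for one feature.

    predict_fn: callable returning class probabilities for a batch of inputs.
    case: 1-D array with the instance to explain.
    feature_range: (low, high) interval for the Monte Carlo perturbation.
    absmin, absmax: greatest possible output range (0..1 for probabilities).
    """
    low, high = feature_range
    samples = np.tile(case, (n_samples, 1))
    # Perturb only the studied feature; all other inputs stay constant.
    samples[:, feature_index] = np.random.uniform(low, high, n_samples)

    outputs = predict_fn(samples)[:, class_index]
    c_min, c_max = outputs.min(), outputs.max()
    y = predict_fn(case.reshape(1, -1))[0, class_index]

    ci = (c_max - c_min) / (absmax - absmin)
    cu = (y - c_min) / (c_max - c_min) if c_max > c_min else 0.0
    return ci, cu
```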

Conceptually, the CIU method has several advantages. First, CIU is broadly applicable to various machine learning models; in accordance with this, the library is not tied to specific classifiers. Second, CIU explains the outcomes directly, without transforming the model into an interpretable simplification. Third, CI and CU values can be calculated for more than one input, which means that higher-level concepts that are combinations of more than one input can be used in explanations. Finally, CI and CU values can be represented in different forms and levels of detail, which may increase intelligibility for users.

3 Py-CIU Software Architecture

The Py-CIU library provides a model-agnostic implementation of the CIU method for explaining individual predictions. The source code of Py-CIU is available at https://github.com/TimKam/py-ciu. Figure 2 depicts the architecture and the main components of the library. The initial requirements and the Py-CIU functions are described as follows.

3.1 Initial Setup and Requirements

Py-CIU requires the setup of a model that outputs a prediction value for all classes. This can be a model that is designed for a particular problem or imported through machine learning libraries such as sklearn, which features various algorithms for classification tasks. Then, a case (i.e., a set of feature values) for which the explanations are generated is selected. Users can select a sample from the test set or examine an instance that is specific to a particular application. The prediction_index stores the index of the highest prediction value if the model returns several predictions. The CIU function then takes: (i) prediction_index, (ii) feature names and feature ranges, (iii) categorical values, if any (e.g., gender: ['female', 'male']), and (iv) feature interactions (optional, to determine the combined impact of several features on the prediction).
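As an illustration of this setup, the following is a minimal sketch using scikit-learn. The dataset, the classifier, and the variable names are assumptions chosen for the example; any model that outputs a prediction value for all classes can be used.

```python
# Minimal setup sketch (assumed example): train a classifier that outputs
# a prediction value for all classes, pick a case, and determine the index
# of the highest prediction value.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

case = X_test[0]                                 # the instance to explain
probabilities = model.predict_proba([case])[0]   # one value per class
prediction_index = int(probabilities.argmax())   # index of the highest value
```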

3.2 Py-CIU Functions

The CIU library comprises the following components:

1. generate_samples: This function generates random input values considering the ranges for each feature. The default number of samples is 1000.

2. determine_ciu: This function computes the CI and CU values. It takes case_data, predictor, min_maxs, and category_mapping as parameters and computes CI and CU values for each feature and feature interaction. These parameters and methods are described in detail below:

• case_data: A dictionary that contains the input values of the case.

• predictor: The predictor calls the prediction function of the black-box model and returns the prediction value for each generated sample.

• min_maxs: A dictionary that maps each feature name (key) to a list of its minimal value, its maximal value, and a flag indicating whether the value has to be an integer ('feature_name': [min, max, is_int]).

• category_mapping: The category mapping maps one-hot encoded categorical variables to the list of categories and category names, i.e., it supports the computation of CI and CU values for all kinds of categorical variables, managing the re-encoding of one-hot variables for the user. Note that this requires providing a mapping from "raw data" feature names to one-hot encoded feature names.

• CI and CU: This implements Equations 1 and 2 from Section 2 for each feature and output class.

• feature_interactions: This computes the CI and CU values for intermediate concepts to provide more abstract explanations. The user can set the feature combinations before calling Py-CIU (see Section 4.1).

3. CiuObject: This object allows displaying the contextual importance (importance for a class) and utility (typicality for a class) in different ways:

• Visual explanations: The library auto-generates plots that show the CI and CU values of all configured features and feature combinations.

• Text-based explanations: In addition to visual plots, the library generates template-based textual explanations based on contextual importance and utility. Text explanations are generated by translating the CI and CU values into words according to the degree of the values, as shown in Table 1. A general template is created for each feature with its importance and utility values. Then, each value is replaced with its respective CI and CU words to generate explanation phrases (see the sketch after this list).

Figure 2: Architecture and main components of Py-CIU.
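As a sketch of how this translation can work, the snippet below maps CI and CU values to the words of Table 1 and fills a simple explanation template; the function names and the exact template wording are illustrative assumptions, not the library's internal implementation.

```python
# Illustrative sketch (not the Py-CIU internals): translate CI and CU values
# into the words of Table 1 and fill a simple explanation template.
def ci_to_words(ci):
    if ci < 0.25:
        return "not important"
    if ci < 0.5:
        return "important"
    if ci < 0.75:
        return "very important"
    return "highly important"

def cu_to_words(cu):
    if cu < 0.25:
        return "not typical"
    if cu < 0.5:
        return "unlikely"
    if cu < 0.75:
        return "typical"
    return "very typical"

def explain(feature, ci, cu):
    return (f"The feature '{feature}', which is {ci_to_words(ci)} "
            f"(CI={ci:.2%}), is {cu_to_words(cu)} for its class (CU={cu:.2%}).")

print(explain("petal_length", 0.9778, 0.9924))
```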

4 Running Examples

The Py-CIU library can be used to explain the outcome of various machine learning algorithms, including random forests, support vector machines, and neural networks. Note that only tabular data is supported by the current version of the library. In this section, we demonstrate how to set up the basic parameters and interpret the results for two different models. The examples presented here are available in the library's repository as Jupyter Notebook files.

Table 1: Symbolic representation of the CI and CU values

Value   | Contextual Importance | Contextual Utility
< 0.25  | not important         | not typical
< 0.5   | important             | unlikely
< 0.75  | very important        | typical
< 1.0   | highly important      | very typical

4.1 Random Forest: Loan Approval

This example provides explanations for a decision made by a random forest model regarding the approval or rejection of loans based on a synthetic (script-generated) dataset. The dataset contains five features, namely age, assets, monthly income, gender, and job type. The dataset also includes an explicit gender bias. First, the prediction and the prediction index for the selected instance are obtained as follows:

exp_pre = random_forest.predict([test.values[0]])
prediction_index = 0 if exp_pre[0] > 0.5 else 1

Since two of the features have categorical values, we create a category mapping for these inputs as shown in the following example:

category_mapping = {
    'gender': ['gender_female', 'gender_male', 'gender_other'],
    'job_type': ['job_type_fixed', 'job_type_none', 'job_type_permanent']
}

We also assess CIU for the feature interaction between income and assets. The CIU function randomizes input values for these two features while keeping the other features constant at their original values. It is possible to give more abstract names to intermediate concepts to increase interpretability for the end-user (e.g., 'overall_value').

Figure 3: Contextual importance and utility values in the same context with different gender values and their outcomes (Case 1: 97% loan approved; Case 2: 65% loan rejected).

feature_interactions = [
    {'assets_income': ['assets', 'monthly_income']}
]

Finally, we call the CIU function with the following parameters:

ciu = determine_ciu(
    test_data_encoded.iloc[0, :].to_dict(),
    random_forest.predict_proba,
    {
        'age': [20, 70, True],
        'assets': [-20000, 150000, True],
        'monthly_income': [0, 20000, True],
        'gender_female': [0, 1, True],
        'gender_male': [0, 1, True],
        'gender_other': [0, 1, True],
        'job_type_fixed': [0, 1, True],
        'job_type_none': [0, 1, True],
        'job_type_permanent': [0, 1, True]
    },
    1000,
    prediction_index,
    category_mapping,
    feature_interactions
)

Then we call ciu.plot_ci() and ciu.plot_cu() to plot the CI and CU values. Figure 3 illustrates how the importance and utility values vary in the same context with different gender values for the following input sets:

Case 1: age=48, assets=35700, income=15500, gender=male, job=permanent
Case 2: age=48, assets=35700, income=15500, gender=other, job=permanent

The model predicts the first case as loan approved with 0.974 probability, and the second case as rejected with 0.651 probability. For both cases, the interdependent effect of assets and monthly income has the highest importance on the prediction. Individually, the most important features are gender, monthly income, and assets. In this context, job type is a less important feature. A high importance of a feature means that perturbations in that feature result in the largest changes in the prediction value, while a lower contextual importance value implies that changes to this feature do not affect the outcome significantly.

The contextual utility graph shows the position of the current output value within the importance range. A larger contextual utility value means that the current prediction value is closer to the maximum observed value (Cmax), and a lower contextual utility suggests that it is closer to the lowest observed value (Cmin). This gives insight into how each feature contributes to the current prediction value relative to the feature importance. For the given examples, gender, one of the important features, has the highest utility, which suggests that it has a high contribution to the outcome. Note that although job type has a 100 percent utility value, it does not contribute much to the overall prediction due to its low importance in this context.

4.2 SVM: Iris Flower Classification

The second example uses a Support Vector Machine (SVM) on the Iris flower classification dataset. The dataset contains measurements of sepal length, sepal width, petal length, and petal width for three classes, namely Iris-Setosa, Iris-Versicolor, and Iris-Virginica. First, we get the prediction and the prediction index for the following features:

sepal length=5.6, sepal width=2.7, petal length=4.2, petal width=1.3

exp_pre = model.predict_proba([x_test.values[0]])
pre_inx = list(exp_pre[0]).index(max(exp_pre[0]))

In this example, we provide high-level explanations by combining petal and sepal features as follows:

feature_interactions = [
    {'petal_size': ['petal_length', 'petal_width']},
    {'sepal_size': ['sepal_length', 'sepal_width']}
]
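The full determine_ciu call for this example is not listed in the text; the following sketch mirrors the parameter order used in the loan example, where the min_maxs ranges, the is_int flags, and the use of None for the category mapping (the Iris features are all numeric) are assumptions.

```python
# Hedged sketch of the determine_ciu call for the Iris case, mirroring the
# parameter order shown in the loan example; the feature ranges below are
# assumed, not taken from the paper.
ciu = determine_ciu(
    x_test.iloc[0, :].to_dict(),      # case_data for the instance above
    model.predict_proba,              # predictor of the trained SVM
    {
        'sepal_length': [4.3, 7.9, False],
        'sepal_width': [2.0, 4.4, False],
        'petal_length': [1.0, 6.9, False],
        'petal_width': [0.1, 2.5, False]
    },
    1000,                             # number of Monte Carlo samples
    pre_inx,                          # index of the predicted class
    None,                             # assumed: no categorical features, so no mapping
    feature_interactions
)
```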

The model predicts this instance as "Iris-versicolor" with 0.96 probability. As shown in Figure 4, petal size has the highest contextual importance with high utility. This suggests that the input values of these two features are the most favorable to the prediction in the given setting. Although the sepal measurements have high utility values, their overall importance is low. Thus, we can conclude that petal size contributes more than sepal size to the predicted class Iris-versicolor in this context.

We call print(ciu.text_explain()) to get the text-based explanations including all the features. The explanations generated for this case are as follows:

The feature "sepal_length", which is not important (CI=12.94%), is typical for its class (CU=81.51%).

The feature "sepal_width", which is not important (CI=8.27%), is very typical for its class (CU=99.2%).

The feature "petal_length", which is highly important (CI=97.78%), is very typical for its class (CU=99.24%).

The feature "petal_width", which is highly important (CI=54.8%), is very typical for its class (CU=97.93%).

Figure 4: Contextual Importance and Contextual Utility (96% Iris-versicolor).

5 Conclusion

In this paper, we have presented the Py-CIU library, which provides a handy implementation of the decision-theory-based Contextual Importance and Utility explanation method. Running examples have shown how the library can be applied to explain the outcomes of different machine learning tasks with varying kinds of datasets. In its current implementation, Py-CIU supports two explanation modes: visual and text-based. Furthermore, the feature interaction facet allows for the provision of high-level explanations where feature combinations are appropriate or features have interdependent effects on the prediction. This is important because listing all the causes of an outcome could result in a high cognitive load, particularly when the number of inputs is high. Thus, explanations based on feature interactions help to provide more abstract explanations without compromising on the number of features. Moreover, the library can facilitate future studies that apply CIU in real-world settings, or compare the decision-theoretical approach of CIU with other explanation methods such as LIME, SHAP, VIBI, and L2X.

Acknowledgements

This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.

References

[Anjomshoae et al., 2019] Sule Anjomshoae, Kary Främling, and Amro Najjar. Explanations of black-box model predictions by contextual importance and utility. In International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems, pages 95–109. Springer, 2019.

[Bang et al., 2019] Seojin Bang, Pengtao Xie, Wei Wu, and Eric Xing. Explaining a black-box using deep variational information bottleneck approach. arXiv preprint arXiv:1902.06918, 2019.

[Chen et al., 2018] Jianbo Chen, Le Song, Martin J. Wainwright, and Michael I. Jordan. Learning to explain: An information-theoretic perspective on model interpretation. arXiv preprint arXiv:1802.07814, 2018.

[Došilović et al., 2018] Filip Karlo Došilović, Mario Brčić, and Nikica Hlupić. Explainable artificial intelligence: A survey. In 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pages 0210–0215. IEEE, 2018.

[Främling, 2020] Kary Främling. Decision theory meets explainable AI. In International Workshop on Explainable, Transparent Autonomous Agents and Multi-Agent Systems, pages 57–74. Springer, 2020.

[Främling, 1996] Kary Främling. Modélisation et apprentissage des préférences par réseaux de neurones pour l'aide à la décision multicritère. PhD thesis, Institut National des Sciences Appliquées de Lyon, Ecole Nationale Supérieure des Mines de Saint-Etienne, France, 1996.

[Gilpin et al., 2018] Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. Explaining explanations: An overview of interpretability of machine learning. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), pages 80–89. IEEE, 2018.

[Gunning, 2017] David Gunning. Explainable artificial intelligence (XAI). Defense Advanced Research Projects Agency (DARPA), nd Web, 2, 2017.

[Keeney et al., 1993] Ralph L. Keeney, Howard Raiffa, et al. Decisions with Multiple Objectives: Preferences and Value Trade-offs. Cambridge University Press, 1993.

[Lipton, 2018] Zachary C. Lipton. The mythos of model interpretability. Queue, 16(3):31–57, 2018.

[Lundberg and Lee, 2017] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, pages 4765–4774, 2017.

[Ribeiro et al., 2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1135–1144, 2016.

[Samek et al., 2017] Wojciech Samek, Thomas Wiegand, and Klaus-Robert Müller. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. arXiv preprint arXiv:1708.08296, 2017.

[Voigt and Von dem Bussche, 2017] Paul Voigt and Axel Von dem Bussche. The EU General Data Protection Regulation (GDPR). A Practical Guide, 1st Ed., Cham: Springer International Publishing, 2017.
