Estimating local and global feature importance scores using DiCE
Summaries of counterfactual examples can be used to estimate the importance of features. Intuitively, a feature that is changed more often when generating proximal counterfactuals is a more important feature. We use this intuition to build a feature importance score.
This score can be interpreted as a measure of how necessary a feature is for a particular model output. That is, if the feature's value changes, the model's output class is likely to change as well (or, for a regression model, the model's output is likely to change significantly).
Below we show how counterfactuals can be used to provide local feature importance scores for any input, and how those scores can be combined to yield a global importance score for each feature.
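One way to write this intuition down is sketched below. The notation is ours rather than DiCE's, and the formulas are a reading that is consistent with the outputs shown in this notebook, not a quotation of the library's implementation. The global form assumes every query instance has the same number of counterfactuals.

```latex
% x: a query instance; c^{(1)}, \dots, c^{(k)}: its k counterfactuals; j: a feature index
% Local score: fraction of counterfactuals in which feature j differs from the query
I_j(x) = \frac{1}{k} \sum_{i=1}^{k} \mathbb{1}\!\left[ c^{(i)}_j \neq x_j \right]

% Global score: average of the local scores over N query instances x_1, \dots, x_N
G_j = \frac{1}{N} \sum_{n=1}^{N} I_j(x_n)
```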
[1]:
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.ensemble import RandomForestClassifier
import dice_ml
from dice_ml import Dice
from dice_ml.utils import helpers # helper functions
[2]:
%load_ext autoreload
%autoreload 2
Preliminaries: Loading the data and ML model
[3]:
dataset = helpers.load_adult_income_dataset().sample(5000) # downsampling to reduce ML model fitting time
helpers.get_adult_data_info()
[3]:
{'age': 'age',
'workclass': 'type of industry (Government, Other/Unknown, Private, Self-Employed)',
'education': 'education level (Assoc, Bachelors, Doctorate, HS-grad, Masters, Prof-school, School, Some-college)',
'marital_status': 'marital status (Divorced, Married, Separated, Single, Widowed)',
'occupation': 'occupation (Blue-Collar, Other/Unknown, Professional, Sales, Service, White-Collar)',
'race': 'white or other race?',
'gender': 'male or female?',
'hours_per_week': 'total work hours per week',
'income': '0 (<=50K) vs 1 (>50K)'}
[4]:
target = dataset["income"]
# Split data into train and test
datasetX = dataset.drop("income", axis=1)
x_train, x_test, y_train, y_test = train_test_split(datasetX,
                                                    target,
                                                    test_size=0.2,
                                                    random_state=0,
                                                    stratify=target)
numerical = ["age", "hours_per_week"]
categorical = x_train.columns.difference(numerical)
# We create the preprocessing pipelines for both numeric and categorical data.
numeric_transformer = Pipeline(steps=[
    ('scaler', StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ('onehot', OneHotEncoder(handle_unknown='ignore'))])
transformations = ColumnTransformer(
    transformers=[
        ('num', numeric_transformer, numerical),
        ('cat', categorical_transformer, categorical)])
# Append classifier to preprocessing pipeline.
# Now we have a full prediction pipeline.
clf = Pipeline(steps=[('preprocessor', transformations),
                      ('classifier', RandomForestClassifier())])
model = clf.fit(x_train, y_train)
[5]:
d = dice_ml.Data(dataframe=dataset, continuous_features=['age', 'hours_per_week'], outcome_name='income')
m = dice_ml.Model(model=model, backend="sklearn")
Local feature importance
We first generate counterfactuals for a given input point.
[6]:
exp = Dice(d, m, method="random")
query_instance = x_train[1:2]
e1 = exp.generate_counterfactuals(query_instance, total_CFs=10, desired_range=None,
                                  desired_class="opposite",
                                  permitted_range=None, features_to_vary="all")
e1.visualize_as_dataframe(show_only_changes=True)
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 43 | Private | Bachelors | Married | White-Collar | White | Male | 50 | 1 |
Diverse Counterfactual set (new outcome: 0.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | - | - | - | - | - | 94.0 | 0 |
1 | - | - | - | - | - | - | - | 63.0 | 0 |
2 | - | Self-Employed | - | Widowed | - | - | - | - | 0 |
3 | 33.0 | - | - | - | - | - | - | 15.0 | 0 |
4 | 21.0 | - | - | Separated | - | - | - | - | 0 |
5 | - | - | - | - | Service | - | Female | - | 0 |
6 | - | Government | - | - | Professional | - | - | - | 0 |
7 | - | - | - | - | - | - | - | 14.0 | 0 |
8 | - | - | - | - | - | - | Female | 77.0 | 0 |
9 | - | - | - | - | - | - | - | 64.0 | 0 |
These can now be used to calculate the feature importance scores.
[7]:
imp = exp.local_feature_importance(query_instance, cf_examples_list=e1.cf_examples_list)
print(imp.local_importance)
[{'hours_per_week': 0.6, 'workclass': 0.2, 'marital_status': 0.2, 'occupation': 0.2, 'gender': 0.2, 'age': 0.2, 'education': 0.0, 'race': 0.0}]
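The scores above can be checked by hand against the counterfactual table. The following is a minimal sketch that assumes the local score is the fraction of counterfactuals in which a feature differs from the query instance. The helper `change_fraction` is hypothetical and not part of DiCE, whose own implementation may differ in details such as how continuous values are compared. The query and the changed values are copied from the tables above.

```python
FEATURES = ["age", "workclass", "education", "marital_status",
            "occupation", "race", "gender", "hours_per_week"]

# Query instance, copied from the table above.
query = {"age": 43, "workclass": "Private", "education": "Bachelors",
         "marital_status": "Married", "occupation": "White-Collar",
         "race": "White", "gender": "Male", "hours_per_week": 50}

# One dict per counterfactual, listing only the values that are not "-" in the table.
changes = [
    {"hours_per_week": 94},
    {"hours_per_week": 63},
    {"workclass": "Self-Employed", "marital_status": "Widowed"},
    {"age": 33, "hours_per_week": 15},
    {"age": 21, "marital_status": "Separated"},
    {"occupation": "Service", "gender": "Female"},
    {"workclass": "Government", "occupation": "Professional"},
    {"hours_per_week": 14},
    {"gender": "Female", "hours_per_week": 77},
    {"hours_per_week": 64},
]
# Fill in the unchanged values from the query to get full counterfactual rows.
counterfactuals = [{**query, **delta} for delta in changes]


def change_fraction(query, counterfactuals, features):
    """Hypothetical helper: share of counterfactuals in which each feature differs."""
    n = len(counterfactuals)
    return {f: sum(cf[f] != query[f] for cf in counterfactuals) / n for f in features}


scores = change_fraction(query, counterfactuals, FEATURES)
print(scores)
```

Under this assumption, `hours_per_week` changes in 6 of the 10 counterfactuals, which matches the 0.6 reported above, while `education` and `race` never change and score 0.0.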
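Feature importance can also be estimated directly by omitting the cf_examples_list argument, in which case DiCE generates the counterfactuals itself. Because a new set of counterfactuals is generated, the scores can differ from those computed above.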
[8]:
imp = exp.local_feature_importance(query_instance, posthoc_sparsity_param=None)
print(imp.local_importance)
[{'hours_per_week': 0.4, 'marital_status': 0.3, 'workclass': 0.2, 'education': 0.2, 'gender': 0.2, 'age': 0.2, 'occupation': 0.1, 'race': 0.0}]
Global importance
For global importance, we need to generate counterfactuals for a representative sample of the dataset. For brevity, the example below uses only the first 10 rows of the training set.
[9]:
cobj = exp.global_feature_importance(x_train[0:10], total_CFs=10, posthoc_sparsity_param=None)
print(cobj.summary_importance)
{'marital_status': 0.46, 'age': 0.45, 'hours_per_week': 0.39, 'education': 0.31, 'occupation': 0.28, 'workclass': 0.22, 'gender': 0.17, 'race': 0.1}
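In this run, the summary scores equal the per-feature average of the local scores of the 10 inputs. The sketch below illustrates that relationship in plain Python. It assumes every input has the same number of counterfactuals and is not DiCE's own code. The per-input scores are copied from the `local_importance` field of the JSON output printed later in this notebook, and the name `summary` is ours.

```python
FEATURES = ["age", "workclass", "education", "marital_status",
            "occupation", "race", "gender", "hours_per_week"]

# One row per query instance, one column per feature (same order as FEATURES).
local_importance = [
    [0.5, 0.0, 0.3, 0.7, 0.2, 0.0, 0.0, 0.3],
    [0.3, 0.0, 0.2, 0.3, 0.0, 0.1, 0.4, 0.5],
    [0.9, 0.6, 0.8, 0.4, 0.3, 0.3, 0.2, 0.3],
    [0.1, 0.5, 0.6, 0.6, 0.6, 0.1, 0.2, 0.7],
    [0.2, 0.3, 0.2, 0.3, 0.0, 0.1, 0.1, 0.5],
    [0.7, 0.0, 0.1, 0.5, 0.5, 0.1, 0.1, 0.1],
    [0.0, 0.1, 0.4, 0.7, 0.6, 0.0, 0.2, 0.0],
    [1.0, 0.1, 0.3, 0.7, 0.3, 0.1, 0.2, 0.7],
    [0.3, 0.3, 0.1, 0.2, 0.3, 0.0, 0.2, 0.3],
    [0.5, 0.3, 0.1, 0.2, 0.0, 0.2, 0.1, 0.5],
]

# Average each column over the query instances, rounding to two decimals.
n_inputs = len(local_importance)
summary = {
    name: round(sum(row[j] for row in local_importance) / n_inputs, 2)
    for j, name in enumerate(FEATURES)
}

# Print features from most to least important.
for name, score in sorted(summary.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score}")
```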
Convert the counterfactual output to JSON
[10]:
json_str = cobj.to_json()
print(json_str)
{"test_data": [[[37, "Private", "HS-grad", "Divorced", "Blue-Collar", "White", "Male", 45, 0]], [[43, "Private", "Bachelors", "Married", "White-Collar", "White", "Male", 50, 1]], [[25, "Private", "HS-grad", "Single", "Sales", "White", "Female", 40, 0]], [[44, "Private", "Assoc", "Single", "Blue-Collar", "White", "Male", 25, 0]], [[30, "Government", "HS-grad", "Married", "White-Collar", "White", "Male", 40, 1]], [[33, "Private", "HS-grad", "Divorced", "Blue-Collar", "White", "Male", 45, 0]], [[41, "Government", "Some-college", "Divorced", "Service", "White", "Female", 45, 0]], [[20, "Private", "Assoc", "Single", "Sales", "White", "Female", 40, 0]], [[53, "Private", "Some-college", "Married", "White-Collar", "White", "Male", 40, 1]], [[46, "Government", "Bachelors", "Married", "Professional", "White", "Male", 38, 1]]], "cfs_list": [[[37, "Private", "Masters", "Single", "Blue-Collar", "White", "Male", 45, 1], [73.0, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 45, 1], [46.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1], [37, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 88.0, 1], [37, "Private", "Bachelors", "Married", "Blue-Collar", "White", "Male", 45, 1], [42.0, "Private", "HS-grad", "Divorced", "Blue-Collar", "White", "Male", 83.0, 1], [37, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 77.0, 1], [59.0, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 45, 1], [37, "Private", "Masters", "Separated", "Blue-Collar", "White", "Male", 45, 1], [55.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1]], [[43, "Private", "Bachelors", "Married", "White-Collar", "White", "Male", 2.0, 0], [43, "Private", "Bachelors", "Widowed", "White-Collar", "White", "Female", 50, 0], [43, "Private", "Bachelors", "Single", "White-Collar", "White", "Male", 90.0, 0], [22.0, "Private", "Bachelors", "Married", "White-Collar", "Other", "Male", 50, 0], [27.0, "Private", 
"Bachelors", "Married", "White-Collar", "White", "Male", 38.0, 0], [43, "Private", "HS-grad", "Married", "White-Collar", "White", "Female", 50, 0], [43, "Private", "Bachelors", "Married", "White-Collar", "White", "Female", 80.0, 0], [19.0, "Private", "Bachelors", "Married", "White-Collar", "White", "Male", 83.0, 0], [43, "Private", "Bachelors", "Separated", "White-Collar", "White", "Female", 50, 0], [43, "Private", "HS-grad", "Married", "White-Collar", "White", "Male", 50, 0]], [[81.0, "Other/Unknown", "Doctorate", "Single", "Sales", "Other", "Female", 40, 1], [36.0, "Government", "Doctorate", "Single", "Sales", "Other", "Female", 40, 1], [44.0, "Private", "Assoc", "Married", "White-Collar", "White", "Female", 40, 1], [31.0, "Government", "HS-grad", "Married", "Other/Unknown", "White", "Female", 40, 1], [38.0, "Private", "HS-grad", "Married", "Sales", "White", "Male", 47.0, 1], [47.0, "Private", "Prof-school", "Single", "Professional", "White", "Male", 40, 1], [67.0, "Government", "Doctorate", "Single", "Sales", "White", "Female", 40, 1], [52.0, "Government", "Doctorate", "Single", "Sales", "Other", "Female", 40, 1], [84.0, "Self-Employed", "Prof-school", "Single", "Sales", "White", "Female", 93.0, 1], [25, "Private", "Bachelors", "Married", "Sales", "White", "Female", 57.0, 1]], [[44, "Government", "Prof-school", "Widowed", "Blue-Collar", "White", "Male", 78.0, 1], [44, "Private", "Prof-school", "Single", "Professional", "White", "Male", 25, 1], [44, "Private", "Some-college", "Married", "Other/Unknown", "White", "Male", 50.0, 1], [44, "Other/Unknown", "Assoc", "Married", "Professional", "White", "Female", 25, 1], [44, "Government", "Doctorate", "Single", "Blue-Collar", "White", "Female", 82.0, 0], [44, "Private", "Assoc", "Married", "White-Collar", "Other", "Male", 84.0, 1], [44, "Private", "Masters", "Single", "Blue-Collar", "White", "Male", 51.0, 1], [44, "Government", "Assoc", "Married", "Blue-Collar", "White", "Male", 98.0, 1], [44, "Government", 
"Prof-school", "Single", "Professional", "White", "Male", 25, 1], [57.0, "Private", "Assoc", "Married", "White-Collar", "White", "Male", 72.0, 1]], [[30, "Government", "HS-grad", "Married", "White-Collar", "White", "Male", 45.0, 0], [30, "Private", "School", "Married", "White-Collar", "White", "Male", 40, 0], [57.0, "Government", "HS-grad", "Married", "White-Collar", "White", "Female", 40, 0], [30, "Government", "HS-grad", "Widowed", "White-Collar", "White", "Male", 91.0, 0], [30, "Self-Employed", "HS-grad", "Married", "White-Collar", "White", "Male", 7.0, 0], [30, "Government", "HS-grad", "Married", "White-Collar", "White", "Male", 56.0, 0], [30, "Government", "HS-grad", "Single", "White-Collar", "Other", "Male", 40, 0], [30, "Private", "Assoc", "Married", "White-Collar", "White", "Male", 40, 0], [29.0, "Government", "HS-grad", "Married", "White-Collar", "White", "Male", 66.0, 0], [30, "Government", "HS-grad", "Widowed", "White-Collar", "White", "Male", 40, 0]], [[67.0, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 45, 1], [69.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1], [33, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Female", 45, 1], [75.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1], [69.0, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 45, 1], [33, "Private", "Bachelors", "Divorced", "Professional", "Other", "Male", 45, 1], [66.0, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 45, 1], [33, "Private", "HS-grad", "Married", "Blue-Collar", "White", "Male", 97.0, 1], [77.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1], [90.0, "Private", "HS-grad", "Divorced", "White-Collar", "White", "Male", 45, 1]], [[41, "Government", "Some-college", "Married", "Professional", "White", "Female", 45, 1], [41, "Government", "Some-college", "Married", "Other/Unknown", "White", "Female", 45, 1], [41, "Government", 
"Doctorate", "Married", "Service", "White", "Female", 45, 0], [41, "Government", "Bachelors", "Married", "Service", "White", "Female", 45, 0], [41, "Private", "Some-college", "Divorced", "Professional", "White", "Female", 45, 1], [41, "Government", "Some-college", "Married", "Service", "White", "Male", 45, 1], [41, "Government", "Prof-school", "Divorced", "Service", "White", "Male", 45, 0], [41, "Government", "Prof-school", "Divorced", "Professional", "White", "Female", 45, 1], [41, "Government", "Some-college", "Married", "Sales", "White", "Female", 45, 1], [41, "Government", "Some-college", "Married", "White-Collar", "White", "Female", 45, 1]], [[70.0, "Private", "Doctorate", "Single", "Sales", "White", "Female", 69.0, 0], [75.0, "Private", "Assoc", "Single", "White-Collar", "White", "Male", 65.0, 1], [68.0, "Private", "Assoc", "Married", "Sales", "White", "Female", 54.0, 1], [65.0, "Private", "Assoc", "Married", "Blue-Collar", "White", "Male", 40, 1], [83.0, "Private", "Prof-school", "Married", "White-Collar", "White", "Female", 40, 1], [40.0, "Private", "Assoc", "Married", "Sales", "White", "Female", 48.0, 1], [66.0, "Private", "Assoc", "Married", "Sales", "White", "Female", 67.0, 0], [63.0, "Private", "Assoc", "Married", "Sales", "White", "Female", 79.0, 0], [44.0, "Self-Employed", "Assoc", "Married", "Sales", "White", "Female", 40, 1], [45.0, "Private", "Bachelors", "Single", "Sales", "Other", "Female", 74.0, 1]], [[53, "Private", "Some-college", "Married", "White-Collar", "White", "Male", 33.0, 0], [89.0, "Private", "Some-college", "Married", "White-Collar", "White", "Male", 40, 0], [53, "Private", "School", "Married", "White-Collar", "White", "Male", 71.0, 0], [53, "Self-Employed", "Some-college", "Divorced", "White-Collar", "White", "Male", 40, 0], [53, "Private", "Some-college", "Married", "Sales", "White", "Female", 40, 0], [53, "Government", "Some-college", "Single", "White-Collar", "White", "Male", 40, 0], [53, "Private", "Some-college", "Married", 
"Blue-Collar", "White", "Female", 40, 0], [23.0, "Private", "Some-college", "Married", "Sales", "White", "Male", 40, 0], [53, "Government", "Some-college", "Married", "White-Collar", "White", "Male", 7.0, 0], [42.0, "Private", "Some-college", "Married", "White-Collar", "White", "Male", 40, 0]], [[46, "Self-Employed", "Bachelors", "Married", "Professional", "Other", "Male", 38, 0], [88.0, "Government", "Bachelors", "Married", "Professional", "Other", "Male", 38, 0], [46, "Government", "Bachelors", "Separated", "Professional", "White", "Male", 77.0, 0], [36.0, "Government", "Bachelors", "Married", "Professional", "White", "Male", 62.0, 0], [59.0, "Government", "Bachelors", "Married", "Professional", "White", "Male", 6.0, 0], [59.0, "Self-Employed", "Bachelors", "Married", "Professional", "White", "Male", 38, 0], [46, "Government", "School", "Married", "Professional", "White", "Female", 38, 0], [46, "Government", "Bachelors", "Separated", "Professional", "White", "Male", 63.0, 0], [17.0, "Government", "Bachelors", "Married", "Professional", "White", "Male", 38, 0], [46, "Self-Employed", "Bachelors", "Married", "Professional", "White", "Male", 7.0, 0]]], "local_importance": [[0.5, 0.0, 0.3, 0.7, 0.2, 0.0, 0.0, 0.3], [0.3, 0.0, 0.2, 0.3, 0.0, 0.1, 0.4, 0.5], [0.9, 0.6, 0.8, 0.4, 0.3, 0.3, 0.2, 0.3], [0.1, 0.5, 0.6, 0.6, 0.6, 0.1, 0.2, 0.7], [0.2, 0.3, 0.2, 0.3, 0.0, 0.1, 0.1, 0.5], [0.7, 0.0, 0.1, 0.5, 0.5, 0.1, 0.1, 0.1], [0.0, 0.1, 0.4, 0.7, 0.6, 0.0, 0.2, 0.0], [1.0, 0.1, 0.3, 0.7, 0.3, 0.1, 0.2, 0.7], [0.3, 0.3, 0.1, 0.2, 0.3, 0.0, 0.2, 0.3], [0.5, 0.3, 0.1, 0.2, 0.0, 0.2, 0.1, 0.5]], "summary_importance": [0.45, 0.22, 0.31, 0.46, 0.28, 0.1, 0.17, 0.39], "data_interface": {"outcome_name": "income", "data_df": "dummy_data"}, "feature_names": ["age", "workclass", "education", "marital_status", "occupation", "race", "gender", "hours_per_week"], "feature_names_including_target": ["age", "workclass", "education", "marital_status", "occupation", "race", "gender", 
"hours_per_week", "income"], "model_type": "classifier", "desired_class": "opposite", "desired_range": null, "metadata": {"version": "2.0"}}
Convert the JSON output back to a counterfactual object
[11]:
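# from_json rebuilds an explanations object from the JSON string produced by to_json()
imp_r = imp.from_json(json_str)
# visualize_as_dataframe displays each table and returns None,
# so the printed list that follows the tables contains only None values
print([o.visualize_as_dataframe(show_only_changes=True) for o in imp_r.cf_examples_list])
print(imp_r.local_importance)
print(imp_r.summary_importance)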
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 37 | Private | HS-grad | Divorced | Blue-Collar | White | Male | 45 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | Masters | Single | - | - | - | - | 1 |
1 | 73.0 | - | - | Married | - | - | - | - | 1 |
2 | 46.0 | - | - | - | White-Collar | - | - | - | 1 |
3 | - | - | - | Married | - | - | - | 88.0 | 1 |
4 | - | - | Bachelors | Married | - | - | - | - | 1 |
5 | 42.0 | - | - | - | - | - | - | 83.0 | 1 |
6 | - | - | - | Married | - | - | - | 77.0 | 1 |
7 | 59.0 | - | - | Married | - | - | - | - | 1 |
8 | - | - | Masters | Separated | - | - | - | - | 1 |
9 | 55.0 | - | - | - | White-Collar | - | - | - | 1 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 43 | Private | Bachelors | Married | White-Collar | White | Male | 50 | 1 |
Counterfactual set (new outcome: 0.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | - | - | - | - | - | 2.0 | 0 |
1 | - | - | - | Widowed | - | - | Female | - | 0 |
2 | - | - | - | Single | - | - | - | 90.0 | 0 |
3 | 22.0 | - | - | - | - | Other | - | - | 0 |
4 | 27.0 | - | - | - | - | - | - | 38.0 | 0 |
5 | - | - | HS-grad | - | - | - | Female | - | 0 |
6 | - | - | - | - | - | - | Female | 80.0 | 0 |
7 | 19.0 | - | - | - | - | - | - | 83.0 | 0 |
8 | - | - | - | Separated | - | - | Female | - | 0 |
9 | - | - | HS-grad | - | - | - | - | - | 0 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 25 | Private | HS-grad | Single | Sales | White | Female | 40 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 81.0 | Other/Unknown | Doctorate | - | - | Other | - | - | 1 |
1 | 36.0 | Government | Doctorate | - | - | Other | - | - | 1 |
2 | 44.0 | - | Assoc | Married | White-Collar | - | - | - | 1 |
3 | 31.0 | Government | - | Married | Other/Unknown | - | - | - | 1 |
4 | 38.0 | - | - | Married | - | - | Male | 47.0 | 1 |
5 | 47.0 | - | Prof-school | - | Professional | - | Male | - | 1 |
6 | 67.0 | Government | Doctorate | - | - | - | - | - | 1 |
7 | 52.0 | Government | Doctorate | - | - | Other | - | - | 1 |
8 | 84.0 | Self-Employed | Prof-school | - | - | - | - | 93.0 | 1 |
9 | - | - | Bachelors | Married | - | - | - | 57.0 | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 44 | Private | Assoc | Single | Blue-Collar | White | Male | 25 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | Government | Prof-school | Widowed | - | - | - | 78.0 | 1 |
1 | - | - | Prof-school | - | Professional | - | - | - | 1 |
2 | - | - | Some-college | Married | Other/Unknown | - | - | 50.0 | 1 |
3 | - | Other/Unknown | - | Married | Professional | - | Female | - | 1 |
4 | - | Government | Doctorate | - | - | - | Female | 82.0 | - |
5 | - | - | - | Married | White-Collar | Other | - | 84.0 | 1 |
6 | - | - | Masters | - | - | - | - | 51.0 | 1 |
7 | - | Government | - | Married | - | - | - | 98.0 | 1 |
8 | - | Government | Prof-school | - | Professional | - | - | - | 1 |
9 | 57.0 | - | - | Married | White-Collar | - | - | 72.0 | 1 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 30 | Government | HS-grad | Married | White-Collar | White | Male | 40 | 1 |
Counterfactual set (new outcome: 0.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | - | - | - | - | - | 45.0 | 0 |
1 | - | Private | School | - | - | - | - | - | 0 |
2 | 57.0 | - | - | - | - | - | Female | - | 0 |
3 | - | - | - | Widowed | - | - | - | 91.0 | 0 |
4 | - | Self-Employed | - | - | - | - | - | 7.0 | 0 |
5 | - | - | - | - | - | - | - | 56.0 | 0 |
6 | - | - | - | Single | - | Other | - | - | 0 |
7 | - | Private | Assoc | - | - | - | - | - | 0 |
8 | 29.0 | - | - | - | - | - | - | 66.0 | 0 |
9 | - | - | - | Widowed | - | - | - | - | 0 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 33 | Private | HS-grad | Divorced | Blue-Collar | White | Male | 45 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 67.0 | - | - | Married | - | - | - | - | 1 |
1 | 69.0 | - | - | - | White-Collar | - | - | - | 1 |
2 | - | - | - | Married | - | - | Female | - | 1 |
3 | 75.0 | - | - | - | White-Collar | - | - | - | 1 |
4 | 69.0 | - | - | Married | - | - | - | - | 1 |
5 | - | - | Bachelors | - | Professional | Other | - | - | 1 |
6 | 66.0 | - | - | Married | - | - | - | - | 1 |
7 | - | - | - | Married | - | - | - | 97.0 | 1 |
8 | 77.0 | - | - | - | White-Collar | - | - | - | 1 |
9 | 90.0 | - | - | - | White-Collar | - | - | - | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 41 | Government | Some-college | Divorced | Service | White | Female | 45 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | - | Married | Professional | - | - | - | 1 |
1 | - | - | - | Married | Other/Unknown | - | - | - | 1 |
2 | - | - | Doctorate | Married | - | - | - | - | - |
3 | - | - | Bachelors | Married | - | - | - | - | - |
4 | - | Private | - | - | Professional | - | - | - | 1 |
5 | - | - | - | Married | - | - | Male | - | 1 |
6 | - | - | Prof-school | - | - | - | Male | - | - |
7 | - | - | Prof-school | - | Professional | - | - | - | 1 |
8 | - | - | - | Married | Sales | - | - | - | 1 |
9 | - | - | - | Married | White-Collar | - | - | - | 1 |
Query instance (original outcome : 0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 20 | Private | Assoc | Single | Sales | White | Female | 40 | 0 |
Counterfactual set (new outcome: 1.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 70.0 | - | Doctorate | - | - | - | - | 69.0 | - |
1 | 75.0 | - | - | - | White-Collar | - | Male | 65.0 | 1 |
2 | 68.0 | - | - | Married | - | - | - | 54.0 | 1 |
3 | 65.0 | - | - | Married | Blue-Collar | - | Male | - | 1 |
4 | 83.0 | - | Prof-school | Married | White-Collar | - | - | - | 1 |
5 | 40.0 | - | - | Married | - | - | - | 48.0 | 1 |
6 | 66.0 | - | - | Married | - | - | - | 67.0 | - |
7 | 63.0 | - | - | Married | - | - | - | 79.0 | - |
8 | 44.0 | Self-Employed | - | Married | - | - | - | - | 1 |
9 | 45.0 | - | Bachelors | - | - | Other | - | 74.0 | 1 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 53 | Private | Some-college | Married | White-Collar | White | Male | 40 | 1 |
Counterfactual set (new outcome: 0.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | - | - | - | - | - | - | 33.0 | 0 |
1 | 89.0 | - | - | - | - | - | - | - | 0 |
2 | - | - | School | - | - | - | - | 71.0 | 0 |
3 | - | Self-Employed | - | Divorced | - | - | - | - | 0 |
4 | - | - | - | - | Sales | - | Female | - | 0 |
5 | - | Government | - | Single | - | - | - | - | 0 |
6 | - | - | - | - | Blue-Collar | - | Female | - | 0 |
7 | 23.0 | - | - | - | Sales | - | - | - | 0 |
8 | - | Government | - | - | - | - | - | 7.0 | 0 |
9 | 42.0 | - | - | - | - | - | - | - | 0 |
Query instance (original outcome : 1)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | 46 | Government | Bachelors | Married | Professional | White | Male | 38 | 1 |
Counterfactual set (new outcome: 0.0)
| | age | workclass | education | marital_status | occupation | race | gender | hours_per_week | income |
---|---|---|---|---|---|---|---|---|---|
0 | - | Self-Employed | - | - | - | Other | - | - | 0 |
1 | 88.0 | - | - | - | - | Other | - | - | 0 |
2 | - | - | - | Separated | - | - | - | 77.0 | 0 |
3 | 36.0 | - | - | - | - | - | - | 62.0 | 0 |
4 | 59.0 | - | - | - | - | - | - | 6.0 | 0 |
5 | 59.0 | Self-Employed | - | - | - | - | - | - | 0 |
6 | - | - | School | - | - | - | Female | - | 0 |
7 | - | - | - | Separated | - | - | - | 63.0 | 0 |
8 | 17.0 | - | - | - | - | - | - | - | 0 |
9 | - | Self-Employed | - | - | - | - | - | 7.0 | 0 |
[None, None, None, None, None, None, None, None, None, None]
[{'marital_status': 0.7, 'age': 0.5, 'education': 0.3, 'hours_per_week': 0.3, 'occupation': 0.2, 'workclass': 0.0, 'race': 0.0, 'gender': 0.0}, {'hours_per_week': 0.5, 'gender': 0.4, 'age': 0.3, 'marital_status': 0.3, 'education': 0.2, 'race': 0.1, 'workclass': 0.0, 'occupation': 0.0}, {'age': 0.9, 'education': 0.8, 'workclass': 0.6, 'marital_status': 0.4, 'occupation': 0.3, 'race': 0.3, 'hours_per_week': 0.3, 'gender': 0.2}, {'hours_per_week': 0.7, 'education': 0.6, 'marital_status': 0.6, 'occupation': 0.6, 'workclass': 0.5, 'gender': 0.2, 'age': 0.1, 'race': 0.1}, {'hours_per_week': 0.5, 'workclass': 0.3, 'marital_status': 0.3, 'age': 0.2, 'education': 0.2, 'race': 0.1, 'gender': 0.1, 'occupation': 0.0}, {'age': 0.7, 'marital_status': 0.5, 'occupation': 0.5, 'education': 0.1, 'race': 0.1, 'gender': 0.1, 'hours_per_week': 0.1, 'workclass': 0.0}, {'marital_status': 0.7, 'occupation': 0.6, 'education': 0.4, 'gender': 0.2, 'workclass': 0.1, 'age': 0.0, 'race': 0.0, 'hours_per_week': 0.0}, {'age': 1.0, 'marital_status': 0.7, 'hours_per_week': 0.7, 'education': 0.3, 'occupation': 0.3, 'gender': 0.2, 'workclass': 0.1, 'race': 0.1}, {'age': 0.3, 'workclass': 0.3, 'occupation': 0.3, 'hours_per_week': 0.3, 'marital_status': 0.2, 'gender': 0.2, 'education': 0.1, 'race': 0.0}, {'age': 0.5, 'hours_per_week': 0.5, 'workclass': 0.3, 'marital_status': 0.2, 'race': 0.2, 'education': 0.1, 'gender': 0.1, 'occupation': 0.0}]
{'marital_status': 0.46, 'age': 0.45, 'hours_per_week': 0.39, 'education': 0.31, 'occupation': 0.28, 'workclass': 0.22, 'gender': 0.17, 'race': 0.1}
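If only the scores are needed, the JSON string can also be read without DiCE. The sketch below uses Python's standard `json` module on a trimmed copy of the output of `to_json()` shown earlier, keeping only the `feature_names` and `summary_importance` fields. It assumes the two lists share the same order, as they do in that output. The variable names are ours.

```python
import json

# Trimmed copy of the JSON printed earlier; the full string has more fields.
payload = """
{"summary_importance": [0.45, 0.22, 0.31, 0.46, 0.28, 0.1, 0.17, 0.39],
 "feature_names": ["age", "workclass", "education", "marital_status",
                   "occupation", "race", "gender", "hours_per_week"]}
"""

parsed = json.loads(payload)

# Pair each feature name with its score, then sort from most to least important.
ranked = sorted(
    zip(parsed["feature_names"], parsed["summary_importance"]),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}: {score}")
```

The resulting order matches the `summary_importance` dictionary printed above, with `marital_status` first and `race` last.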