SolasAI Disparity Plots#

import solas_disparity as sd
import pandas as pd

Some notebook environments cannot render interactive plotly figures. If plots are not displaying, uncomment the cell below as a potential workaround; it switches the default renderer to static PNG images.

# import plotly.io as pio
# pio.renderers.default = "png"

It’s preferable to explicitly and specifically handle warnings. For the purposes of this notebook, we will filter out all warnings.

from warnings import simplefilter
simplefilter("ignore")
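Outside of a demonstration notebook, a narrower approach is usually safer. The following is a minimal sketch using only the standard library `warnings` module: it silences one known warning inside a limited scope and still records anything unexpected. The `noisy_step` function and its messages are made up for illustration and are not part of solas_disparity.

```python
import warnings


def noisy_step() -> int:
    # Hypothetical stand-in for a library call that emits two warnings.
    warnings.warn("values were coerced to float", UserWarning)
    warnings.warn("something else happened", RuntimeWarning)
    return 42


# Filters changed inside this block are restored when the block exits.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning by default
    warnings.filterwarnings(
        "ignore",
        message="values were coerced",  # regex matched at the start of the message
        category=UserWarning,
    )
    result = noisy_step()

# Only the warning that was not explicitly ignored is left to review.
unexpected = [str(w.message) for w in caught]
```

Here the understood warning is dropped while the other one stays visible in `unexpected`, rather than everything being hidden at once.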

Some predictions have already been created using a tree model run on an HMDA dataset.

label = "Interest Rate"
data = pd.read_parquet("hmda_test.parquet")

Store commonly reused function arguments.

protected_groups = ["Black", "Asian", "Native American", "Hispanic", "Female"]
reference_groups = ["White", "White", "White", "Non-Hispanic", "Male"]
groups = sd.pgrg_ordered(
    protected_groups=protected_groups,
    reference_groups=reference_groups,
)
reused_arguments = dict(
    group_data=data[groups],
    protected_groups=protected_groups,
    reference_groups=reference_groups,
    group_categories=["Race", "Race", "Race", "Ethnicity", "Sex"],
    sample_weight=None,
)
binary_outcome = data["Prediction"] <= data["Prediction"].quantile(0.5)
binary_label = data[label] <= data[label].quantile(0.5)
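The `reused_arguments` dictionary relies on Python's keyword-argument unpacking (`**`). As a small sketch of that pattern on its own, with a hypothetical `describe` function standing in for a disparity function (it is not part of solas_disparity):

```python
def describe(outcome, group_data, protected_groups, reference_groups, sample_weight=None):
    """Hypothetical stand-in for a disparity function, for illustration only."""
    n_protected = len(protected_groups)
    n_reference = len(set(reference_groups))
    return f"{outcome}: {n_protected} protected vs {n_reference} reference groups"


# Arguments shared by several calls are stored once...
shared = dict(
    group_data=None,
    protected_groups=["Black", "Asian", "Female"],
    reference_groups=["White", "White", "Male"],
)

# ...and unpacked into each call next to the arguments that differ.
first = describe(outcome="binary", **shared)
second = describe(outcome="continuous", **shared)
```

Keeping the shared arguments in one place means each later call only spells out what changes, such as the outcome.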

Single-Level Plots#

Certain disparity functions provide a result for each group. Their associated plots are single figures and are referred to as single-level plots.

Calculate Disparity#

Let’s use a result from the AIR function as an example for single-level plots.

air = sd.adverse_impact_ratio(
    outcome=binary_outcome,
    air_threshold=0.8,
    percent_difference_threshold=0.0,
    **reused_arguments
)

Output Results#

The default output for a disparity calculation result object includes a default plot.

air

Disparity Calculation: Adverse Impact Ratio

┌───────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ Protected Groups                          │ Black, Asian, Native American, Hispanic, Female                     │
│ Reference Groups                          │ White, White, White, Non-Hispanic, Male                             │
│ Group Categories                          │ Race, Race, Race, Ethnicity, Sex                                    │
│ AIR Threshold                             │ 0.8                                                                 │
│ Percent Difference Threshold              │ 0.0                                                                 │
│ Shortfall Method                          │ to_reference_mean                                                   │
│ Affected Groups                           │ Hispanic                                                            │
│ Affected Reference                        │ Non-Hispanic                                                        │
│ Affected Categories                       │ Ethnicity                                                           │
└───────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘

Adverse Impact Ratio Summary Table

* Percent Missing: Ethnicity: 13.68%, Race: 13.56%, Sex: 46.88%

Group	Reference Group	Group Category	Total	Favorable	Percent Favorable	Percent Difference Favorable	AIR	P-Values	Practically Significant	Shortfall
Black	White	Race	340.0	141.0	41.47%	9.70%	0.810	0.001	No
Asian	White	Race	327.0	243.0	74.31%	-23.14%	1.452	0.000	No
Native American	White	Race	20.0	9.0	45.00%	6.17%	0.879	0.657	No
White		Race	3,623.0	1,854.0	51.17%
Hispanic	Non-Hispanic	Ethnicity	508.0	167.0	32.87%	21.54%	0.604	0.000	Yes	109.4
Non-Hispanic		Ethnicity	3,808.0	2,072.0	54.41%
Female	Male	Sex	1,034.0	414.0	40.04%	9.78%	0.804	0.000	No
Male		Sex	1,622.0	808.0	49.82%
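The columns in the summary table can be checked by hand from the Total and Favorable counts. The sketch below recomputes the Hispanic row. The `air_summary` helper is illustrative and not part of solas_disparity, and the shortfall formula is an assumption inferred from the `to_reference_mean` label: the number of additional favorable outcomes the protected group would need to reach the reference group's favorable rate.

```python
def air_summary(protected_total, protected_favorable,
                reference_total, reference_favorable):
    """Recompute AIR-style summary values from raw counts (illustrative helper)."""
    protected_rate = protected_favorable / protected_total
    reference_rate = reference_favorable / reference_total
    return {
        "percent_favorable": protected_rate,
        # Positive when the protected group is favored less often.
        "percent_difference": reference_rate - protected_rate,
        # Adverse impact ratio: protected rate relative to reference rate.
        "air": protected_rate / reference_rate,
        # Assumed "to_reference_mean" shortfall (see the note above).
        "shortfall": reference_rate * protected_total - protected_favorable,
    }


# Counts taken from the Hispanic and Non-Hispanic rows of the table.
hispanic = air_summary(
    protected_total=508, protected_favorable=167,
    reference_total=3808, reference_favorable=2072,
)
```

Rounded to the table's precision, these values agree with the Hispanic row: an AIR of 0.604, a percent difference of 21.54%, and a shortfall of 109.4.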

The .plot() method on the result object returns the plotly figure directly.

figure = air.plot()
type(figure)

plotly.graph_objs._figure.Figure

figure

In the case of AIR, the plot function also takes a column argument, which selects a different column of the summary table to plot.

air.plot(column=sd.const.TOTAL)

The .plot() method is simply a convenience wrapper for the associated plot function in the solas_disparity.plots namespace. For further information, see that plot function in the rendered documentation. For stronger linting support, you can optionally call the function directly.

sd.plots.plot_adverse_impact_ratio(disparity=air)

Multi-Level Plots#

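Certain other disparity functions provide a result for each secondary level (for example, each quantile) for each group. Their associated plots contain one subplot per level and are referred to as multi-level plots.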

Calculate Disparity#

Use AIR by quantile as an example for multi-level plots.

airq = sd.adverse_impact_ratio_by_quantile(
    outcome=data["Prediction"],
    air_threshold=0.8,
    percent_difference_threshold=0.0,
    quantiles=[decile / 10 for decile in range(1, 11)],
    **reused_arguments,
)
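As a rough illustration of what a by-quantile calculation does with a continuous outcome: each quantile is turned into a score cutoff, and a score at or below that cutoff counts as favorable, in the same way `binary_outcome` was built earlier from the 0.5 quantile. The sketch below uses made-up scores and a small linear-interpolation helper; both are assumptions for illustration and say nothing definitive about the library's internals.

```python
def linear_quantile(values, q):
    """Quantile with linear interpolation between neighboring sorted values."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Made-up scores; lower is treated as more favorable.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
quantiles = [decile / 10 for decile in range(1, 11)]

favorable_counts = {}
for q in quantiles:
    cutoff = linear_quantile(scores, q)
    # An observation is favorable at this level if it falls at or below the cutoff.
    favorable_counts[q] = sum(score <= cutoff for score in scores)
```

The favorable count can only grow as the quantile increases, and at the final quantile every observation is favorable, so the last level carries no disparity information.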

Output Results#

The default output for a disparity calculation result object includes a default plot. Note that a new subplot is created for each quantile.

airq

Disparity Calculation: Adverse Impact Ratio By Quantile

┌───────────────────────────────────────────┬─────────────────────────────────────────────────────────────────────┐
│ Protected Groups                          │ Black, Asian, Native American, Hispanic, Female                     │
│ Reference Groups                          │ White, White, White, Non-Hispanic, Male                             │
│ Group Categories                          │ Race, Race, Race, Ethnicity, Sex                                    │
│ AIR Threshold                             │ 0.8                                                                 │
│ Percent Difference Threshold              │ 0.0                                                                 │
│ Lower Score Favorable                     │ True                                                                │
│ Affected Groups                           │ Black, Hispanic, Female                                             │
│ Affected Reference                        │ White, Non-Hispanic, Male                                           │
│ Affected Categories                       │ Race, Ethnicity, Sex                                                │
└───────────────────────────────────────────┴─────────────────────────────────────────────────────────────────────┘

Adverse Impact Ratio By Quantile Summary Table

Group	Quantile	Reference Group	Group Category	Quantile Cutoff	Observations	Percent Missing	Total	Favorable	Percent Favorable	Percent Difference Favorable	AIR	P-Values	Practically Significant
Black	10.0%	White	Race	0.044761	4,322	13.56%	340.0	13.0	3.82%	4.93%	0.437	0.001	Yes
Asian	10.0%	White	Race	0.044761	4,322	13.56%	327.0	91.0	27.83%	-19.08%	3.181	0.000	No
Native American	10.0%	White	Race	0.044761	4,322	13.56%	20.0	1.0	5.00%	3.75%	0.571	1.000	No
White	10.0%		Race	0.044761	4,322	13.56%	3,623.0	317.0	8.75%
Hispanic	10.0%	Non-Hispanic	Ethnicity	0.044761	4,316	13.68%	508.0	15.0	2.95%	7.79%	0.275	0.000	Yes
Non-Hispanic	10.0%		Ethnicity	0.044761	4,316	13.68%	3,808.0	409.0	10.74%
Female	10.0%	Male	Sex	0.044761	2,656	46.88%	1,034.0	66.0	6.38%	3.05%	0.677	0.006	Yes
Male	10.0%		Sex	0.044761	2,656	46.88%	1,622.0	153.0	9.43%
Black	20.0%	White	Race	0.045863	4,322	13.56%	340.0	37.0	10.88%	9.85%	0.525	0.000	Yes
Asian	20.0%	White	Race	0.045863	4,322	13.56%	327.0	132.0	40.37%	-19.64%	1.947	0.000	No
Native American	20.0%	White	Race	0.045863	4,322	13.56%	20.0	2.0	10.00%	10.73%	0.482	0.403	No
White	20.0%		Race	0.045863	4,322	13.56%	3,623.0	751.0	20.73%
Hispanic	20.0%	Non-Hispanic	Ethnicity	0.045863	4,316	13.68%	508.0	42.0	8.27%	14.95%	0.356	0.000	Yes
Non-Hispanic	20.0%		Ethnicity	0.045863	4,316	13.68%	3,808.0	884.0	23.21%
Female	20.0%	Male	Sex	0.045863	2,656	46.88%	1,034.0	155.0	14.99%	4.92%	0.753	0.002	Yes
Male	20.0%		Sex	0.045863	2,656	46.88%	1,622.0	323.0	19.91%
Black	30.0%	White	Race	0.046427	4,322	13.56%	340.0	62.0	18.24%	11.30%	0.617	0.000	Yes
Asian	30.0%	White	Race	0.046427	4,322	13.56%	327.0	175.0	53.52%	-23.98%	1.812	0.000	No
Native American	30.0%	White	Race	0.046427	4,322	13.56%	20.0	4.0	20.00%	9.53%	0.677	0.464	No
White	30.0%		Race	0.046427	4,322	13.56%	3,623.0	1,070.0	29.53%
Hispanic	30.0%	Non-Hispanic	Ethnicity	0.046427	4,316	13.68%	508.0	69.0	13.58%	19.14%	0.415	0.000	Yes
Non-Hispanic	30.0%		Ethnicity	0.046427	4,316	13.68%	3,808.0	1,246.0	32.72%
Female	30.0%	Male	Sex	0.046427	2,656	46.88%	1,034.0	225.0	21.76%	5.74%	0.791	0.001	Yes
Male	30.0%		Sex	0.046427	2,656	46.88%	1,622.0	446.0	27.50%
Black	40.0%	White	Race	0.046703	4,322	13.56%	340.0	103.0	30.29%	16.38%	0.649	0.000	Yes
Asian	40.0%	White	Race	0.046703	4,322	13.56%	327.0	238.0	72.78%	-26.11%	1.559	0.000	No
Native American	40.0%	White	Race	0.046703	4,322	13.56%	20.0	8.0	40.00%	6.67%	0.857	0.656	No
White	40.0%		Race	0.046703	4,322	13.56%	3,623.0	1,691.0	46.67%
Hispanic	40.0%	Non-Hispanic	Ethnicity	0.046703	4,316	13.68%	508.0	139.0	27.36%	22.30%	0.551	0.000	Yes
Non-Hispanic	40.0%		Ethnicity	0.046703	4,316	13.68%	3,808.0	1,891.0	49.66%
Female	40.0%	Male	Sex	0.046703	2,656	46.88%	1,034.0	380.0	36.75%	7.27%	0.835	0.000	No
Male	40.0%		Sex	0.046703	2,656	46.88%	1,622.0	714.0	44.02%
Black	50.0%	White	Race	0.047009	4,322	13.56%	340.0	141.0	41.47%	9.70%	0.810	0.001	No
Asian	50.0%	White	Race	0.047009	4,322	13.56%	327.0	243.0	74.31%	-23.14%	1.452	0.000	No
Native American	50.0%	White	Race	0.047009	4,322	13.56%	20.0	9.0	45.00%	6.17%	0.879	0.657	No
White	50.0%		Race	0.047009	4,322	13.56%	3,623.0	1,854.0	51.17%
Hispanic	50.0%	Non-Hispanic	Ethnicity	0.047009	4,316	13.68%	508.0	167.0	32.87%	21.54%	0.604	0.000	Yes
Non-Hispanic	50.0%		Ethnicity	0.047009	4,316	13.68%	3,808.0	2,072.0	54.41%
Female	50.0%	Male	Sex	0.047009	2,656	46.88%	1,034.0	414.0	40.04%	9.78%	0.804	0.000	No
Male	50.0%		Sex	0.047009	2,656	46.88%	1,622.0	808.0	49.82%
Black	60.0%	White	Race	0.047266	4,322	13.56%	340.0	161.0	47.35%	13.62%	0.777	0.000	Yes
Asian	60.0%	White	Race	0.047266	4,322	13.56%	327.0	260.0	79.51%	-18.54%	1.304	0.000	No
Native American	60.0%	White	Race	0.047266	4,322	13.56%	20.0	11.0	55.00%	5.97%	0.902	0.648	No
White	60.0%		Race	0.047266	4,322	13.56%	3,623.0	2,209.0	60.97%
Hispanic	60.0%	Non-Hispanic	Ethnicity	0.047266	4,316	13.68%	508.0	214.0	42.13%	21.53%	0.662	0.000	Yes
Non-Hispanic	60.0%		Ethnicity	0.047266	4,316	13.68%	3,808.0	2,424.0	63.66%
Female	60.0%	Male	Sex	0.047266	2,656	46.88%	1,034.0	520.0	50.29%	7.60%	0.869	0.000	No
Male	60.0%		Sex	0.047266	2,656	46.88%	1,622.0	939.0	57.89%
Black	80.0%	White	Race	0.048018	4,322	13.56%	340.0	248.0	72.94%	7.96%	0.902	0.001	No
Asian	80.0%	White	Race	0.048018	4,322	13.56%	327.0	308.0	94.19%	-13.29%	1.164	0.000	No
Native American	80.0%	White	Race	0.048018	4,322	13.56%	20.0	14.0	70.00%	10.90%	0.865	0.250	No
White	80.0%		Race	0.048018	4,322	13.56%	3,623.0	2,931.0	80.90%
Hispanic	80.0%	Non-Hispanic	Ethnicity	0.048018	4,316	13.68%	508.0	364.0	71.65%	10.83%	0.869	0.000	No
Non-Hispanic	80.0%		Ethnicity	0.048018	4,316	13.68%	3,808.0	3,141.0	82.48%
Female	80.0%	Male	Sex	0.048018	2,656	46.88%	1,034.0	765.0	73.98%	4.44%	0.943	0.010	No
Male	80.0%		Sex	0.048018	2,656	46.88%	1,622.0	1,272.0	78.42%
Black	90.0%	White	Race	0.048694	4,322	13.56%	340.0	288.0	84.71%	5.41%	0.940	0.003	No
Asian	90.0%	White	Race	0.048694	4,322	13.56%	327.0	321.0	98.17%	-8.05%	1.089	0.000	No
Native American	90.0%	White	Race	0.048694	4,322	13.56%	20.0	17.0	85.00%	5.12%	0.943	0.441	No
White	90.0%		Race	0.048694	4,322	13.56%	3,623.0	3,265.0	90.12%
Hispanic	90.0%	Non-Hispanic	Ethnicity	0.048694	4,316	13.68%	508.0	428.0	84.25%	6.85%	0.925	0.000	No
Non-Hispanic	90.0%		Ethnicity	0.048694	4,316	13.68%	3,808.0	3,469.0	91.10%
Female	90.0%	Male	Sex	0.048694	2,656	46.88%	1,034.0	887.0	85.78%	4.04%	0.955	0.002	No
Male	90.0%		Sex	0.048694	2,656	46.88%	1,622.0	1,457.0	89.83%
Black	100.0%	White	Race	0.058530	4,322	13.56%	340.0	340.0	100.00%	0.00%	1.000	1.000	No
Asian	100.0%	White	Race	0.058530	4,322	13.56%	327.0	327.0	100.00%	0.00%	1.000	1.000	No
Native American	100.0%	White	Race	0.058530	4,322	13.56%	20.0	20.0	100.00%	0.00%	1.000	1.000	No
White	100.0%		Race	0.058530	4,322	13.56%	3,623.0	3,623.0	100.00%
Hispanic	100.0%	Non-Hispanic	Ethnicity	0.058530	4,316	13.68%	508.0	508.0	100.00%	0.00%	1.000	1.000	No
Non-Hispanic	100.0%		Ethnicity	0.058530	4,316	13.68%	3,808.0	3,808.0	100.00%
Female	100.0%	Male	Sex	0.058530	2,656	46.88%	1,034.0	1,034.0	100.00%	0.00%	1.000	1.000	No
Male	100.0%		Sex	0.058530	2,656	46.88%	1,622.0	1,622.0	100.00%
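The Practically Significant column in this table is consistent with a simple rule: the AIR falls below the AIR threshold, the percent difference exceeds its threshold, and the p-value is small. The sketch below encodes that reading. It is an inference from the table rather than the library's documented logic, and the 0.05 p-value cutoff in particular is an assumption that does not appear in the output above.

```python
def is_flagged(air, p_value, percent_difference,
               air_threshold=0.8,
               percent_difference_threshold=0.0,
               p_value_threshold=0.05):  # assumed cutoff, not shown in the output
    """Illustrative rule for a 'Practically Significant' flag."""
    return (
        air < air_threshold
        and percent_difference > percent_difference_threshold
        and p_value < p_value_threshold
    )


# (group, quantile, AIR, p-value, percent difference) copied from the table.
rows = [
    ("Black", "10.0%", 0.437, 0.001, 0.0493),
    ("Native American", "10.0%", 0.571, 1.000, 0.0375),
    ("Female", "40.0%", 0.835, 0.000, 0.0727),
    ("Black", "50.0%", 0.810, 0.001, 0.0970),
]

flags = {
    (group, quantile): is_flagged(air, p_value, difference)
    for group, quantile, air, p_value, difference in rows
}
```

Under this rule only the Black 10.0% row is flagged. The Native American row has a low AIR but a large p-value, and the other two rows sit above the 0.8 AIR threshold, which matches the Yes and No values shown for those rows.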

The .plot() method on the result object returns the plotly figure directly.

type(airq.plot())

plotly.graph_objs._figure.Figure

The .plot() method also takes an optional column argument, just like a single-level plot.

airq.plot(column=sd.const.PERCENT_DIFFERENCE_FAVORABLE)

You can also pass a group argument to get a plot of a single group across all levels.

airq.plot(group="Black")

airq.plot(group="Black", column=sd.const.PERCENT_DIFFERENCE_FAVORABLE)

.plot() also has a quantile argument to return a figure for a single quantile. The quantile argument is specific to AIR by quantile. For example, the equivalent argument for a categorical AIR calculation would be category.

airq.plot(quantile=0.1)

airq.plot(quantile=0.5)

Another argument exposed by multi-level plots is separate. It splits a single plotly figure containing multiple subplots into a list of separate plotly figures, one for each level. It is a convenience argument equivalent to calling .plot() with the quantile argument for every quantile.

airq_figures = airq.plot(separate=True)
type(airq_figures)

list

airq_figures[0]

airq_figures[4]

As with any other plot, the full documentation and typing support can be found in the solas_disparity.plots namespace.

sd.plots.plot_adverse_impact_ratio_by_quantile

<cyfunction plot_adverse_impact_ratio_by_quantile at 0x7f25fa61c860>

More Plot Functionality#

Since the figures returned by plot functions are plotly figures, reference the plotly documentation for more functionality. https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html#plotly.graph_objects.Figure

The update_layout method modifies overall attributes of the figure, including its height and width. https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html#plotly.graph_objects.Figure.update_layout

air.plot().update_layout(height=500, width=500)

Plots can be saved as images using the write_image method. Here’s an example saving a plot as an svg file. https://plotly.com/python-api-reference/generated/plotly.graph_objects.Figure.html#plotly.graph_objects.Figure.write_image

air.plot().write_image("air.svg")

Or as a png…

air.plot().write_image("air.png")

The size of a plot saved as an image can also be controlled without affecting the original figure object.

air.plot().write_image("air_resized.svg", height=800, width=1100)

Clean up files.

from pathlib import Path

to_clean = ["air.svg", "air_resized.svg", "air.png"]
for name in to_clean:
    if Path(name).exists():
        Path(name).unlink()
