solas_disparity.categorical_adverse_impact_ratio

solas_disparity.categorical_adverse_impact_ratio(group_data: pandas.core.frame.DataFrame, protected_groups: List[str], reference_groups: List[str], group_categories: List[str], outcome: pandas.core.series.Series, air_threshold: float, percent_difference_threshold: float, category_order: List[str], label: Optional[pandas.core.series.Series] = None, sample_weight: Optional[pandas.core.series.Series] = None, max_for_fishers: int = 100) → solas_disparity.types._disparity.Disparity

Calculate the Adverse Impact Ratio for a set of favorability-ordinal categorical outcomes.

AIR is defined as the percentage of favorable outcomes of the protected group divided by the percentage of favorable outcomes of the reference group.

\[\text{AIR}_\text{Protected Group} = \frac{\text{\% Favorable Outcome}_\text{Protected Group}}{\text{\% Favorable Outcome}_\text{Reference Group}}\]
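As a rough illustration of the ratio above, the following standard-library sketch computes an AIR for two small, made-up groups. The helper names and the choice of which categories count as favorable are assumptions made for this example; they are not part of solas_disparity, and the sketch does not reproduce how the library handles ordered categories, sample weights, or significance testing.

```python
from typing import Iterable, Set


def favorable_rate(outcomes: Iterable[str], favorable: Set[str]) -> float:
    """Share of outcomes that fall in the (assumed) favorable set."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("cannot compute a rate for an empty group")
    return sum(o in favorable for o in outcomes) / len(outcomes)


def adverse_impact_ratio(
    protected: Iterable[str], reference: Iterable[str], favorable: Set[str]
) -> float:
    """Favorable rate of the protected group divided by that of the reference group."""
    reference_rate = favorable_rate(reference, favorable)
    if reference_rate == 0:
        raise ZeroDivisionError("reference group has no favorable outcomes")
    return favorable_rate(protected, favorable) / reference_rate


# Made-up data: 2 of 5 protected outcomes and 4 of 5 reference outcomes are favorable.
protected = ["bad", "good", "bad", "great", "bad"]
reference = ["good", "great", "bad", "good", "best"]
air = adverse_impact_ratio(protected, reference, favorable={"good", "great", "best"})
print(round(air, 3))  # 0.5
```

Here the protected group's favorable rate (40%) is half the reference group's (80%), so the AIR is 0.5.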

An AIR is considered practically significant if all three of the following hold:

the AIR is less than a chosen air_threshold,
the AIR is statistically significantly different from parity,
AND the percent difference in favorable outcomes is greater than a chosen percent_difference_threshold.
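The three conditions can be sketched as a single boolean check. This is only an illustration: the function name, the p_value and alpha arguments (standing in for "statistically significantly different than parity"), and the reading of the third condition as a comparison of the percent difference in favorable-outcome rates against percent_difference_threshold are all assumptions, not the library's documented internals.

```python
def is_practically_significant(
    air: float,
    p_value: float,
    percent_difference: float,
    air_threshold: float,
    percent_difference_threshold: float,
    alpha: float = 0.05,  # assumed significance level, not taken from the source
) -> bool:
    """Return True only when all three conditions hold at once."""
    below_air_threshold = air < air_threshold
    statistically_significant = p_value < alpha
    large_enough_difference = percent_difference > percent_difference_threshold
    return below_air_threshold and statistically_significant and large_enough_difference


# Hypothetical values: AIR of 0.5, p-value of 0.01, 40-point gap in favorable rates.
flagged = is_practically_significant(
    air=0.5,
    p_value=0.01,
    percent_difference=0.4,
    air_threshold=0.8,
    percent_difference_threshold=0.2,
)
# Same inputs, but the AIR is above the threshold, so the result is not flagged.
not_flagged = is_practically_significant(
    air=0.9,
    p_value=0.01,
    percent_difference=0.4,
    air_threshold=0.8,
    percent_difference_threshold=0.2,
)
print(flagged, not_flagged)  # True False
```

Because the conditions are joined with AND, failing any one of them is enough for the result not to be flagged.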

Parameters

group_data (DataFrame) – Dataframe containing columns for group data.
protected_groups (List[str]) – List of protected groups.
reference_groups (List[str]) – List of reference groups with the same length as protected_groups.
group_categories (List[str]) – List of group categories to which each protected and reference group pair belongs (e.g. race, gender, age, etc.). Has the same length as protected_groups.
outcome (Series) – Series of outcomes, each of which is an element of category_order.
air_threshold (float) – Adverse Impact Ratio threshold value.
percent_difference_threshold (float) – Percent difference threshold value. For example, a 20% difference is input as percent_difference_threshold=0.2.
category_order (List[str]) – List of outcome categories in ascending order of favorability (e.g. ["bad", "good", "great", "best"]).
label (Optional[Series], optional) – Label, true outcome, and/or target series evaluated alongside outcome. Defaults to None.
sample_weight (Optional[Series], optional) – Sample weight series. Has the same length as group_data. Defaults to None.
max_for_fishers (int, optional) – Maximum number of samples for which Fisher’s exact test is used. Defaults to MAX_FOR_FISHERS (100).
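The parameters above describe three lists of equal length plus an outcome series restricted to category_order. The sketch below shows one way to sanity-check such inputs before calling the function. The helper names and group names are hypothetical and exist only for this illustration; solas_disparity may perform its own validation differently.

```python
from typing import Iterable, List, Tuple


def pair_groups(
    protected_groups: List[str],
    reference_groups: List[str],
    group_categories: List[str],
) -> List[Tuple[str, str, str]]:
    """Line up the i-th category, protected group, and reference group."""
    if not (len(protected_groups) == len(reference_groups) == len(group_categories)):
        raise ValueError("the three group lists must have the same length")
    return list(zip(group_categories, protected_groups, reference_groups))


def unknown_outcomes(outcomes: Iterable[str], category_order: List[str]) -> List[str]:
    """Return outcome values that do not appear in category_order."""
    allowed = set(category_order)
    return sorted({o for o in outcomes if o not in allowed})


# Illustrative inputs; each position across the three lists describes one comparison.
protected_groups = ["Black", "Hispanic", "Female"]
reference_groups = ["White", "White", "Male"]
group_categories = ["Race", "Race", "Gender"]
category_order = ["bad", "good", "great", "best"]  # ascending favorability

pairs = pair_groups(protected_groups, reference_groups, group_categories)
for category, protected, reference in pairs:
    print(f"{category}: {protected} compared against {reference}")

# "ok" is not in category_order, so it is reported as an unknown outcome.
print(unknown_outcomes(["good", "ok", "best"], category_order))  # ['ok']
```

Note that a reference group such as "White" can appear more than once, since each protected group needs its own entry in reference_groups.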

Returns

Object containing results of the disparity calculation.

Return type

Disparity
