solas_disparity.adverse_impact_ratio_by_quantile

solas_disparity.adverse_impact_ratio_by_quantile(group_data: pandas.core.frame.DataFrame, protected_groups: List[str], reference_groups: List[str], group_categories: List[str], outcome: pandas.core.series.Series, air_threshold: float, percent_difference_threshold: float, quantiles: List[float], label: Optional[pandas.core.series.Series] = None, sample_weight: Optional[pandas.core.series.Series] = None, max_for_fishers: int = 100, lower_score_favorable: bool = True, merge_bins: bool = True) → solas_disparity.types._disparity.Disparity

Calculate the Adverse Impact Ratio (AIR) at the specified quantiles.

AIR is defined as the percentage of favorable outcomes of the protected group divided by the percentage of favorable outcomes of the reference group.

\[\text{AIR}_\text{Protected Group} = \frac{\text{% Favorable Outcome}_\text{Protected Group}}{\text{% Favorable Outcome}_\text{Reference Group}}\]
An AIR is considered practically significant if all of the following hold:
  1. the AIR is less than the chosen air_threshold,

  2. the AIR is statistically significantly different from parity, and

  3. the percent difference between the groups is greater than the chosen percent_difference_threshold.
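The definition and the three criteria above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names, the counts, the p-value, and the 0.05 significance level are hypothetical, and the percent difference is assumed here to be the reference rate minus the protected rate, which the text above does not specify.

```python
def adverse_impact_ratio(protected_favorable, protected_total,
                         reference_favorable, reference_total):
    """AIR = protected favorable rate / reference favorable rate."""
    protected_rate = protected_favorable / protected_total
    reference_rate = reference_favorable / reference_total
    return protected_rate / reference_rate


def is_practically_significant(air, p_value, percent_difference,
                               air_threshold, percent_difference_threshold,
                               alpha=0.05):
    """Apply the three criteria together (alpha is a hypothetical level)."""
    return (air < air_threshold
            and p_value < alpha
            and percent_difference > percent_difference_threshold)


# Hypothetical counts: 30 of 100 protected and 50 of 100 reference
# records received the favorable outcome.
air = adverse_impact_ratio(30, 100, 50, 100)

# Assumption: percent difference = reference rate - protected rate.
percent_difference = 50 / 100 - 30 / 100

flagged = is_practically_significant(
    air,
    p_value=0.01,  # hypothetical result of a significance test
    percent_difference=percent_difference,
    air_threshold=0.8,
    percent_difference_threshold=0.1,
)
print(round(air, 3), flagged)
```

In this made-up case the protected group's favorable rate is 60% of the reference group's, so all three conditions are met and the result is flagged.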

Parameters
  • group_data (DataFrame) – Dataframe containing columns for group data.

  • protected_groups (List[str]) – List of protected groups.

  • reference_groups (List[str]) – List of reference groups with the same length as protected_groups.

  • group_categories (List[str]) – List of group categories to which each protected and reference group pair belongs (e.g. race, gender, age, etc.). Has the same length as protected_groups.

  • outcome (Series) – Outcome series.

  • air_threshold (float) – Adverse Impact Ratio threshold value. An AIR below this value meets the first practical-significance criterion.

  • percent_difference_threshold (float) – Percent difference threshold value. For example, a 20% difference is input as percent_difference_threshold=0.2.

  • quantiles (List[float]) – Set of quantiles at which the AIR will be calculated (e.g. [0.2, 0.4, 0.6, 0.8, 1.0]).

  • label (Optional[Series], optional) – Label, true outcome, and/or target series evaluated alongside outcome. Defaults to None.

  • sample_weight (Optional[Series], optional) – Sample weight series. Has the same length as group_data. Defaults to None.

  • max_for_fishers (int, optional) – Maximum number of samples at which Fisher’s exact test is still used. Defaults to MAX_FOR_FISHERS (100).

  • lower_score_favorable (bool, optional) – Whether a lower value of outcome is favorable. Defaults to True.

  • merge_bins (bool, optional) – Whether quantiles that share the same cutoff are merged into one bin. Defaults to True.

Returns

Object containing results of the disparity calculation.

Return type

Disparity
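To make the quantiles and merge_bins parameters more concrete, the following pure-Python sketch computes an AIR at each quantile cutoff and merges quantiles that land on the same cutoff. It is a rough illustration under stated assumptions, not the library's code: the function names are hypothetical, cutoffs use a simple nearest-rank rule, a record counts as favorable when its score is at or below the cutoff (the lower_score_favorable=True case), and no weighting or significance testing is done.

```python
import math


def nearest_rank_cutoffs(scores, quantiles):
    """Return the score at each quantile using the nearest-rank method."""
    ordered = sorted(scores)
    n = len(ordered)
    return [ordered[max(1, math.ceil(q * n)) - 1] for q in quantiles]


def air_by_quantile_sketch(scores, is_protected, quantiles, merge_bins=True):
    """Illustrative only: AIR at each cutoff, lower scores favorable.

    Assumes `quantiles` is sorted in ascending order.
    """
    cutoffs = nearest_rank_cutoffs(scores, quantiles)

    # Collect quantiles per cutoff so that duplicates can share one bin.
    bins = []  # list of (cutoff, [quantiles that share it])
    for q, cutoff in zip(quantiles, cutoffs):
        if merge_bins and bins and bins[-1][0] == cutoff:
            bins[-1][1].append(q)
        else:
            bins.append((cutoff, [q]))

    protected = [s for s, p in zip(scores, is_protected) if p]
    reference = [s for s, p in zip(scores, is_protected) if not p]

    results = []
    for cutoff, qs in bins:
        protected_rate = sum(s <= cutoff for s in protected) / len(protected)
        reference_rate = sum(s <= cutoff for s in reference) / len(reference)
        # Guard against a reference group with no favorable outcomes.
        air = protected_rate / reference_rate if reference_rate else float("nan")
        results.append({"quantiles": qs, "cutoff": cutoff, "air": air})
    return results


# Hypothetical data: eight scores and a flag marking the protected group.
scores = [1, 2, 2, 2, 2, 5, 6, 7]
is_protected = [False, False, True, False, True, True, False, True]

merged = air_by_quantile_sketch(scores, is_protected, [0.25, 0.5, 0.75, 1.0])
unmerged = air_by_quantile_sketch(scores, is_protected, [0.25, 0.5, 0.75, 1.0],
                                  merge_bins=False)
for row in merged:
    print(row)
```

With this made-up data the 0.25 and 0.5 quantiles both fall on a cutoff of 2, so merging yields three bins instead of four. How the library labels a merged bin is not stated above; listing every merged quantile is a choice made for this sketch.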