bootstrapping

solas_disparity.statistical_significance.bootstrapping(...)

Conduct bootstrapping for statistical significance.

Parameters
  • group_data (DataFrame) – Dataframe containing columns for group data.

  • protected_groups (List[str]) – List of protected groups.

  • reference_groups (List[str]) – List of reference groups with the same length as protected_groups.

  • group_categories (List[str]) – List of group categories to which each protected and reference group pair belongs (e.g. race, gender, age). Has the same length as protected_groups.

  • outcome (Series) – Outcome series.

  • sample_weight (Optional[Series], optional) – Sample weight series. Has the same length as group_data. Defaults to None.

  • resamples (int, optional) – The number of independent resamples. Defaults to const.RESAMPLES.

  • sample (Union[float, int], optional) – The sample size or sample fraction. Defaults to const.SAMPLE.

  • seed (Optional[int], optional) – Random seed passed through to numpy.random.default_rng. Defaults to None.

  • replace (bool, optional) – Whether to sample with replacement. Defaults to False.
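
The resamples, sample, seed and replace parameters together describe a resampling scheme. The sketch below illustrates one plausible reading of them using numpy.random.default_rng; it is not the library's implementation. The helper name resample_indices is hypothetical, and treating a float sample as a fraction of rows and an int sample as an absolute row count is an assumption based on the parameter description above.

```python
import numpy as np


def resample_indices(n_rows, resamples, sample, seed=None, replace=False):
    """Hypothetical helper: return one array of row positions per resample."""
    # The seed is passed straight to default_rng, as the seed parameter describes.
    rng = np.random.default_rng(seed)

    # Assumption: a float is a fraction of the rows, an int is a row count.
    if isinstance(sample, float):
        size = int(round(sample * n_rows))
    else:
        size = int(sample)

    # Without replacement, a draw cannot be larger than the data itself.
    if not replace and size > n_rows:
        raise ValueError("sample exceeds the number of rows when replace=False")

    # Each resample is drawn from the same generator, one after another.
    return [rng.choice(n_rows, size=size, replace=replace) for _ in range(resamples)]


# Three resamples, each holding half of ten rows, drawn without replacement.
draws = resample_indices(n_rows=10, resamples=3, sample=0.5, seed=0)
```

Each returned array could then be used to select rows from group_data, outcome and sample_weight by position, so that the three inputs stay aligned within a resample. Fixing seed makes the draws repeatable across runs.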

Raises

NotImplementedError – Bootstrapping will be implemented soon.

Returns

Statistical significance result object.

Return type

StatSig