
majority_sampling_ratio: <list[float]> (Optional)

Description

A list of majority sampling ratios for AutoML to explore. The majority_sampling_ratio parameter controls undersampling of the majority class in binary classification tasks.
It specifies how many majority-class examples to keep per minority-class example during training.

In other words:

For every example in the minority class, we sample majority_sampling_ratio examples from the majority class.
Each ratio in the list must be greater than 0.

Behavior

  • If the dataset’s actual majority-to-minority ratio is greater than the specified majority_sampling_ratio, undersampling is applied to reduce the imbalance.
  • If the dataset’s actual ratio is less than or equal to the specified ratio, the parameter has no effect (i.e., all data are used).
If row weights are also present for binary classification (from SDK/custom training-table workflows or from weight_col), both signals are used together according to weight_mode:
  • weight_mode=sample: row and class weights are combined for sampling.
  • weight_mode=weighted_loss: row and class weights are combined for loss weighting.
  • weight_mode=None: invalid when majority_sampling_ratio is set.
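The weight_mode rule above can be sketched as a small pre-flight check. This is only an illustration: check_weight_mode and has_row_weights are hypothetical names, not part of the product's API, and the sketch assumes the restriction on weight_mode=None applies only when row weights are present, which is how the paragraph above reads.

```python
from typing import Optional, Sequence


def check_weight_mode(
    majority_sampling_ratio: Optional[Sequence[float]],
    weight_mode: Optional[str],
    has_row_weights: bool,
) -> None:
    """Hypothetical helper: raise ValueError for combinations the text calls invalid."""
    if majority_sampling_ratio is None:
        # Parameter not set: nothing to validate here.
        return
    if any(ratio <= 0 for ratio in majority_sampling_ratio):
        raise ValueError("every majority_sampling_ratio value must be greater than 0")
    # Assumption: the weight_mode restriction only matters when row weights exist.
    if has_row_weights and weight_mode is None:
        raise ValueError(
            "weight_mode=None is invalid when majority_sampling_ratio is set; "
            "use 'sample' or 'weighted_loss'"
        )


# Row weights plus undersampling, combined at sampling time: accepted.
check_weight_mode([20.0, 50.0], "sample", has_row_weights=True)
```

Running a check like this before submitting a job surfaces the invalid combination early instead of at training time.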

Example 1: Undersampling applied

Suppose your dataset has:
  • Majority-class examples: 10,000
  • Minority-class examples: 100
    → Actual ratio = 100:1
If you set:
majority_sampling_ratio = 20
Then for each minority-class example, we keep 20 majority-class examples.
Resulting sampled data:
  • Majority-class examples kept: 100 × 20 = 2,000
  • Minority-class examples: 100
    → Resulting ratio = 20:1
Undersampling is applied because the actual ratio (100) is greater than the desired ratio (20).

Example 2: No effect (ignored)

Using the same dataset (10,000 majority, 100 minority → 100:1 ratio), if you set:
majority_sampling_ratio = 150
Then the desired ratio (150:1) is larger than the dataset’s actual ratio (100:1).
Since the dataset is already less imbalanced than the target, no undersampling occurs.
All majority examples are kept, and this setting is ignored.
Summary table
Dataset Majority:Minority   majority_sampling_ratio   Action Taken                      Resulting Ratio
100:1                       20                        Undersample majority              20:1
100:1                       50                        Undersample majority              50:1
100:1                       100                       No change (equal ratio)           100:1
100:1                       120                       Ignored (ratio already smaller)   100:1
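
The arithmetic behind these rows can be reproduced with a few lines of Python. This is a minimal sketch of the rule as described on this page, not the actual AutoML implementation: majority_examples_kept is a hypothetical helper, and rounding fractional targets down to whole rows is an assumption.

```python
import math


def majority_examples_kept(n_majority: int, n_minority: int, ratio: float) -> int:
    """Return how many majority-class examples remain for a given ratio."""
    if ratio <= 0:
        raise ValueError("majority_sampling_ratio must be greater than 0")
    if n_minority <= 0:
        raise ValueError("the minority class must have at least one example")
    actual_ratio = n_majority / n_minority
    if actual_ratio <= ratio:
        # Already at or below the target imbalance: the setting has no effect.
        return n_majority
    # Assumption: a fractional target is rounded down to a whole number of rows.
    return math.floor(n_minority * ratio)


# Same dataset as the examples: 10,000 majority and 100 minority examples.
for ratio in [20, 50, 100, 120]:
    kept = majority_examples_kept(10_000, 100, ratio)
    print(f"majority_sampling_ratio={ratio}: keep {kept} majority examples")
```

For ratios of 20 and 50 the helper keeps 2,000 and 5,000 majority examples; for 100 and 120 it keeps all 10,000, matching the table.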

Supported Task Types

  • Binary Classification

Default Values

run_mode   Default Value
FAST       None
NORMAL     None
BEST       None