Model Tuner 0.0.14a
Pre-release
Version 0.0.14a
In previous versions, the train_val_test_split method allowed stratification either by y (stratify_y) or by specified columns (stratify_cols), but not both at the same time. Some use cases need stratification by both the target variable (y) and specific columns to ensure a balanced, representative split across different data segments.
Enhancement
Modified the train_val_test_split method to support simultaneous stratification by both stratify_y and stratify_cols. This was achieved inside the method with the following logic, which builds a combined stratification key from y and the specified columns so that both are considered during the split:
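# Combined key: the stratify_cols columns of X plus the target y
stratify_key = pd.concat([X[stratify_cols], y], axis=1)
# Same key rebuilt for the validation/test portion
strat_key_val_test = pd.concat(
    [X_valid_test[stratify_cols], y_valid_test], axis=1
)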
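
To show how such combined keys can drive a split, here is a minimal sketch that passes them to scikit-learn's train_test_split through its stratify argument. This is an illustration only, not the method's actual implementation: the toy data, the 60/20/20 proportions, the column names group and feature, and the direct use of train_test_split are all assumptions.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data (made up for illustration): 200 rows, one stratification column.
# Joint (group, y) counts: (0,0)=80, (0,1)=40, (1,0)=60, (1,1)=20.
X = pd.DataFrame({
    "group": [0] * 120 + [1] * 80,
    "feature": range(200),
})
y = pd.Series([0] * 80 + [1] * 40 + [0] * 60 + [1] * 20, name="y")
stratify_cols = ["group"]

# First split: 60% train, 40% held out, stratified on the combined key.
stratify_key = pd.concat([X[stratify_cols], y], axis=1)
X_train, X_valid_test, y_train, y_valid_test = train_test_split(
    X, y, test_size=0.4, stratify=stratify_key, random_state=0
)

# Second split: halve the held-out rows into validation and test,
# rebuilding the key from the held-out rows only.
strat_key_val_test = pd.concat(
    [X_valid_test[stratify_cols], y_valid_test], axis=1
)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_valid_test, y_valid_test, test_size=0.5,
    stratify=strat_key_val_test, random_state=0,
)
```

Because every (group, y) combination is treated as its own stratum, each of the three subsets should keep the joint proportions of the full data (40%, 20%, 30%, and 10% in this toy example). Every combination needs enough rows to be divided across the subsets, otherwise scikit-learn raises an error.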