Categorical features

Attention. Do not apply one-hot encoding during preprocessing. Doing so can degrade both the training speed and the resulting quality.

CatBoost supports numerical and categorical features. Categorical features and their combinations are used to build new numerical features. See the "Transforming categorical features to numerical features" section for more details.

CatBoost uses one-hot encoding for categorical features with a small number of different values (at most 2 by default). Use the following parameters to change this behaviour:

CLI parameters        Python parameters    R parameters
--one-hot-max-size    one_hot_max_size     one_hot_max_size

Description: Use one-hot encoding for all features with a number of different values less than or equal to the given parameter value. Ctrs are not calculated for such features.
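
The threshold rule described above can be sketched in plain Python. This is only an illustration of the "less than or equal to" comparison, not CatBoost's internal implementation. The helper name split_by_one_hot_max_size and the toy columns are hypothetical and are not part of any CatBoost API.

```python
def split_by_one_hot_max_size(columns, one_hot_max_size=2):
    """Partition categorical columns by their number of different values.

    Hypothetical helper for illustration only: columns whose number of
    different values is <= one_hot_max_size go to the one-hot group,
    all others go to the group for which Ctrs would be calculated.
    """
    one_hot, ctr = [], []
    for name, values in columns.items():
        n_different = len(set(values))
        if n_different <= one_hot_max_size:
            one_hot.append(name)
        else:
            ctr.append(name)
    return one_hot, ctr


# Toy data (made up for this sketch): 2, 3 and 4 different values.
columns = {
    "is_member": ["yes", "no", "yes", "no"],
    "color": ["red", "green", "blue", "red"],
    "city": ["Oslo", "Rome", "Lima", "Cairo"],
}

# With the default threshold of 2, only the two-valued column is one-hot encoded.
default_split = split_by_one_hot_max_size(columns)
# Raising the threshold to 3 moves the three-valued column to one-hot as well.
raised_split = split_by_one_hot_max_size(columns, one_hot_max_size=3)

print(default_split)  # (['is_member'], ['color', 'city'])
print(raised_split)   # (['is_member', 'color'], ['city'])
```

In the Python package the threshold is set with the one_hot_max_size training parameter, so the sketch only mirrors which features fall on each side of that value.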