CatBoostClassifier

class CatBoostClassifier(iterations=None,
                         learning_rate=None,
                         depth=None,
                         l2_leaf_reg=None,
                         model_size_reg=None,
                         rsm=None,
                         loss_function='Logloss',
                         border_count=None,
                         feature_border_type=None,
                         old_permutation_block_size=None,
                         od_pval=None,
                         od_wait=None,
                         od_type=None,
                         nan_mode=None,
                         counter_calc_method=None,
                         leaf_estimation_iterations=None,
                         leaf_estimation_method=None,
                         thread_count=None,
                         random_seed=None,
                         use_best_model=None,
                         verbose=None,
                         logging_level=None,
                         metric_period=None,
                         ctr_leaf_count_limit=None,
                         store_all_simple_ctr=None,
                         max_ctr_complexity=None,
                         has_time=None,
                         allow_const_label=None,
                         classes_count=None,
                         class_weights=None,
                         one_hot_max_size=None,
                         random_strength=None,
                         name=None,
                         ignored_features=None,
                         train_dir=None,
                         custom_loss=None,
                         custom_metric=None,
                         eval_metric=None,
                         bagging_temperature=None,
                         save_snapshot=None,
                         snapshot_file=None,
                         fold_len_multiplier=None,
                         used_ram_limit=None,
                         gpu_ram_part=None,
                         allow_writing_files=None,
                         final_ctr_computation_mode=None,
                         approx_on_full_history=None,
                         boosting_type=None,
                         simple_ctr=None,
                         combinations_ctr=None,
                         per_feature_ctr=None,
                         task_type=None,
                         device_config=None,
                         devices=None,
                         bootstrap_type=None,
                         subsample=None,
                         max_depth=None,
                         n_estimators=None,
                         num_boost_round=None,
                         num_trees=None,
                         colsample_bylevel=None,
                         random_state=None,
                         reg_lambda=None,
                         objective=None,
                         eta=None,
                         max_bin=None,
                         scale_pos_weight=None,
                         gpu_cat_features_storage=None,
                         data_partition=None,
                         metadata=None)
Purpose

Train and apply models for classification problems. The applying methods return only the probability that the object belongs to the class. Provides compatibility with scikit-learn tools.

Parameters
Parameter Description Default value
metadata The key-value string pairs to store in the model's metadata storage after training. None

See Training parameters for the full list of parameters.

Note. Some parameters duplicate the ones specified for the fit method. In these cases the values specified for the fit method take precedence.
Attributes
Attribute Type Description
is_fitted_ bool

Check whether the model is trained.

tree_count_ int

Return the number of trees in the model.

This number can differ from the value specified in the iterations training parameter in the following cases:
  • The training is stopped by the overfitting detector.
  • The use_best_model training parameter is set to “True”.

feature_importances_ list

Output the calculated feature importances.

random_seed_ int

The random seed used for training.

learning_rate_ float

The learning rate used for training.

metadata_ string

Return a proxy object with metadata from the model's internal key-value string storage. The object emulates a Python dictionary and supports iterating over, getting, setting and deleting key-value pairs in the model's metadata storage.

Methods
Method Description
fit

Train a model.

predict

Apply the model to the given dataset.

predict_proba

Apply the model to the given dataset to predict the probability that the object belongs to the given classes.

staged_predict

Apply the model to the given dataset and calculate the results for each i-th tree of the model taking into consideration only the trees in the range [1;i].

staged_predict_proba

Apply the model to the given dataset to predict the probability that the object belongs to the class and calculate the results for each i-th tree of the model taking into consideration only the trees in the range [1;i].

eval_metrics

Calculate the specified metrics for the specified dataset.

get_feature_importance

Calculate and return the feature importances.

get_object_importance

Calculate the effect of objects from the training dataset on the optimized metric values for the objects from the input dataset:
  • Positive values reflect that the optimized metric increases.
  • Negative values reflect that the optimized metric decreases.

load_model

Load the model from a file.

save_model

Save the model to a file.

shrink

Shrink the model. Only trees with indices from the range [ntree_start, ntree_end) are kept.

get_param

Return the value of the specified training parameter.

get_params

Return the training parameters.

set_params

Set the training parameters.

score

Calculate the Accuracy metric for the objects in the given dataset.

copy

Copy the CatBoost object.

get_test_eval

Return the formula values that were calculated for the objects from the test dataset provided for training.
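
To illustrate the tree ranges used by staged_predict ([1;i]) and shrink ([ntree_start, ntree_end)), here is a plain-Python sketch on a toy additive ensemble. It does not call CatBoost and is not the library's implementation. The per-tree values and the helper names staged_sums and shrink_trees are hypothetical, and zero-based indexing for shrink is an assumption made for this sketch.

```python
from itertools import accumulate

# Hypothetical per-tree contributions to the raw formula value of one object.
tree_values = [0.5, -0.2, 0.3, 0.1]


def staged_sums(values):
    """Return the running total after each tree i, using only trees 1..i."""
    return list(accumulate(values))


def shrink_trees(values, ntree_start, ntree_end):
    """Keep only trees whose zero-based index lies in [ntree_start, ntree_end)."""
    return values[ntree_start:ntree_end]


# One staged result per tree: the i-th entry uses the first i trees.
stages = staged_sums(tree_values)

# The half-open range keeps the trees at indices 1 and 2 and drops the rest.
kept = shrink_trees(tree_values, 1, 3)
```

The last staged value equals the result computed from the full ensemble, which is why staged results are useful for seeing how predictions change as trees are added.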