Implemented metrics

CatBoost provides built-in metrics for various machine learning problems. These functions can be used for model optimization or reference purposes. See the Objectives and metrics section for details on the calculation principles.

Python package

The following parameters can be set for the corresponding classes and are used when the model is trained.

Parameters for trained model

Classes:

loss_function

The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
  • RMSE

  • Logloss

  • MAE

  • CrossEntropy

  • Quantile

  • LogLinQuantile

  • Lq

  • MultiRMSE

  • MultiClass

  • MultiClassOneVsAll

  • MultiLogloss

  • MultiCrossEntropy

  • MAPE

  • Poisson

  • PairLogit

  • PairLogitPairwise

  • QueryRMSE

  • QuerySoftMax

  • Tweedie

  • YetiRank

  • YetiRankPairwise

  • StochasticFilter

  • StochasticRank

A custom Python object can also be set as the value of this parameter (see an example).

For example, use the following construction to calculate the value of Quantile with the coefficient α = 0.1:

Quantile:alpha=0.1
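The format above can be assembled programmatically. The following is a minimal sketch; `format_metric` is a hypothetical helper written for this illustration, not part of the CatBoost API.

```python
def format_metric(name, **params):
    """Build '<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]'."""
    if not params:
        # A metric without optional parameters is just its name.
        return name
    # Parameters are joined with ';' and attached to the name with ':'.
    joined = ";".join(f"{key}={value}" for key, value in params.items())
    return f"{name}:{joined}"


print(format_metric("RMSE"))                 # RMSE
print(format_metric("Quantile", alpha=0.1))  # Quantile:alpha=0.1
```

The resulting string can then be passed as the value of the parameter described here.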

custom_metric

Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Examples:

  • Calculate the value of CrossEntropy

    CrossEntropy
    
  • Calculate the value of Quantile with the coefficient α = 0.1

    Quantile:alpha=0.1
    
  • Calculate the values of Logloss and AUC

    ['Logloss', 'AUC']
    

Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.
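The metric output files can be loaded for offline inspection. The sketch below assumes a tab-separated layout with a header row whose first column is named `iter`; both this layout and the sample values are assumptions made for illustration, so check them against a real file in the training directory. `read_metric_file` is a hypothetical helper.

```python
import csv
import io

# Hypothetical sample content; real files are written to the train_dir directory.
SAMPLE = "iter\tLogloss\tAUC\n0\t0.65\t0.71\n1\t0.61\t0.74\n2\t0.59\t0.75\n"


def read_metric_file(text):
    """Return {metric name: [value per iteration]} from tab-separated text."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    columns = {}
    for row in reader:
        for name, value in row.items():
            if name == "iter":
                # Assumed iteration index column; not a metric.
                continue
            columns.setdefault(name, []).append(float(value))
    return columns


history = read_metric_file(SAMPLE)
print(history["Logloss"])
```

For a file on disk, read its contents first and pass the text to the same function.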

Use the visualization tools to see a live chart with the dynamics of the specified metrics.

use_best_model

If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:

  1. Build the number of trees defined by the training parameters.
  2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric).

No trees are saved after this iteration.

This option requires a validation dataset to be provided.
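The selection rule described above can be sketched in a few lines. This is an illustration of the logic only, not CatBoost's implementation; `best_tree_count` is a hypothetical helper, iterations are assumed to be zero-based, and whether the optimum is a minimum or a maximum depends on the chosen metric.

```python
def best_tree_count(validation_values, higher_is_better=False):
    """Number of trees kept when truncating at the best validation iteration."""
    if not validation_values:
        raise ValueError("a validation dataset is required")
    pick = max if higher_is_better else min
    # Index of the iteration with the optimal metric value (first one on ties).
    best_iteration = pick(
        range(len(validation_values)), key=validation_values.__getitem__
    )
    # Trees built after the best iteration are dropped.
    return best_iteration + 1


rmse_by_iteration = [0.90, 0.72, 0.65, 0.61, 0.63, 0.66]
print(best_tree_count(rmse_by_iteration))  # 4
```

In this sample the validation error starts rising after the fourth tree, so the last two trees would not be saved.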

eval_metric

The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

A user-defined function can also be set as the value (see an example).

Examples:

R2
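A user-defined metric might look like the sketch below. It assumes the three-method protocol (`is_max_optimal`, `evaluate`, `get_final_error`) that the Python package is understood to expect from a custom metric object; verify this against the linked example before relying on it. The class name is hypothetical and the metric itself (weighted mean absolute error) is chosen only for illustration.

```python
class MeanAbsErrorMetric:
    """Weighted mean absolute error as a user-defined evaluation metric."""

    def is_max_optimal(self):
        # Smaller values are better for an error metric.
        return False

    def evaluate(self, approxes, target, weight):
        # approxes holds one indexable container per prediction dimension;
        # a single-dimensional regression output is assumed here.
        approx = approxes[0]
        error_sum = 0.0
        weight_sum = 0.0
        for i in range(len(approx)):
            w = 1.0 if weight is None else weight[i]
            weight_sum += w
            error_sum += w * abs(approx[i] - target[i])
        return error_sum, weight_sum

    def get_final_error(self, error, weight):
        # Guard against division by zero when no objects were evaluated.
        return error / (weight + 1e-38)


metric = MeanAbsErrorMetric()
error, weight = metric.evaluate([[1.0, 2.0]], [1.5, 2.0], None)
print(metric.get_final_error(error, weight))
```

An instance of such a class would be passed as the parameter value in place of a metric name.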

The following parameters can be set for the corresponding methods and are used when the model is trained or applied.

Parameters for trained or applied model

Classes:

use_best_model

If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:

  1. Build the number of trees defined by the training parameters.
  2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric).

No trees are saved after this iteration.

This option requires a validation dataset to be provided.

verbose

Output the measured evaluation metric to stderr.

plot

Plot the following information during training:

  • the metric values;
  • the custom loss values;
  • the loss function change during feature selection;
  • the time that has passed since training started;
  • the remaining time until the end of training.

This option can be used if training is performed in a Jupyter notebook.
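The training-time and fit-time parameters on this page might be combined as in the sketch below. The parameter values, the `catboost_info` directory name, and the `fit_with_validation` helper are illustrative assumptions; the import is deferred so that the parameter dictionary can be inspected without the package installed.

```python
# Parameters passed to the model class (used when the model is trained).
params = {
    "iterations": 200,
    "loss_function": "RMSE",                         # optimized metric
    "custom_metric": ["MAE", "Quantile:alpha=0.1"],  # reported only
    "eval_metric": "R2",                             # drives best model selection
    "train_dir": "catboost_info",                    # where metric files are written
}


def fit_with_validation(X_train, y_train, X_val, y_val):
    """Train a regressor and keep only the trees up to the best iteration."""
    from catboost import CatBoostRegressor  # imported lazily

    model = CatBoostRegressor(**params)
    model.fit(
        X_train,
        y_train,
        eval_set=(X_val, y_val),  # validation dataset required by use_best_model
        use_best_model=True,
        verbose=False,
        plot=False,               # set to True inside a Jupyter notebook
    )
    return model
```

Calling the helper with training and validation arrays returns the fitted model.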

R package

The following parameters can be set for the corresponding methods and are used when the model is trained or applied.

Method: catboost.train

loss_function

The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
  • RMSE

  • Logloss

  • MAE

  • CrossEntropy

  • Quantile

  • LogLinQuantile

  • Lq

  • MultiRMSE

  • MultiClass

  • MultiClassOneVsAll

  • MultiLogloss

  • MultiCrossEntropy

  • MAPE

  • Poisson

  • PairLogit

  • PairLogitPairwise

  • QueryRMSE

  • QuerySoftMax

  • Tweedie

  • YetiRank

  • YetiRankPairwise

  • StochasticFilter

  • StochasticRank

For example, use the following construction to calculate the value of Quantile with the coefficient α = 0.1:

Quantile:alpha=0.1

custom_loss

Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Examples:

  • Calculate the value of CrossEntropy

    c('CrossEntropy')
    

    Or simply:

    'CrossEntropy'
    
  • Calculate the values of Logloss and AUC

    c('Logloss', 'AUC')
    
  • Calculate the value of Quantile with the coefficient α = 0.1

    c('Quantile:alpha=0.1')
    

Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.

use_best_model

If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:

  1. Build the number of trees defined by the training parameters.
  2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric (eval_metric).

No trees are saved after this iteration.

This option requires a validation dataset to be provided.

eval_metric

The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Example:

Quantile:alpha=0.3

Command-line version

The following command keys can be specified for the corresponding commands and are used when the model is trained or applied.

Params for the catboost fit command:

--loss-function

The metric to use in training. The specified value also determines the machine learning problem to solve. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]
Supported metrics
  • RMSE

  • Logloss

  • MAE

  • CrossEntropy

  • Quantile

  • LogLinQuantile

  • Lq

  • MultiRMSE

  • MultiClass

  • MultiClassOneVsAll

  • MultiLogloss

  • MultiCrossEntropy

  • MAPE

  • Poisson

  • PairLogit

  • PairLogitPairwise

  • QueryRMSE

  • QuerySoftMax

  • Tweedie

  • YetiRank

  • YetiRankPairwise

  • StochasticFilter

  • StochasticRank

For example, use the following construction to calculate the value of Quantile with the coefficient α = 0.1:

Quantile:alpha=0.1
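To show what the alpha coefficient changes, here is a sketch of the standard weighted quantile (pinball) loss in Python. It is assumed, not confirmed by this page, that this matches the definition in the Objectives and metrics section; `quantile_loss` is a hypothetical helper.

```python
def quantile_loss(targets, predictions, alpha=0.5, weights=None):
    """Weighted pinball loss: sum((alpha - 1[t <= a]) * (t - a) * w) / sum(w)."""
    if weights is None:
        weights = [1.0] * len(targets)
    total = 0.0
    for t, a, w in zip(targets, predictions, weights):
        # Over-predictions (t <= a) are penalized with weight (1 - alpha),
        # under-predictions with weight alpha.
        indicator = 1.0 if t <= a else 0.0
        total += (alpha - indicator) * (t - a) * w
    return total / sum(weights)


# One under-prediction and one over-prediction, each off by 1.
print(quantile_loss([1.0, 2.0], [0.0, 3.0], alpha=0.1))
```

With alpha set to 0.1, an over-prediction costs nine times as much as an under-prediction of the same size, which pushes the model toward the 10th percentile.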

--custom-metric

Metric values to output during training. These functions are not optimized and are displayed for informational purposes only. Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric 1>[:<parameter 1>=<value>;..;<parameter N>=<value>],<Metric 2>[:<parameter 1>=<value>;..;<parameter N>=<value>],..,<Metric N>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Examples:

  • Calculate the value of CrossEntropy

    CrossEntropy
    
  • Calculate the value of Quantile with the coefficient α = 0.1

    Quantile:alpha=0.1
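The comma-separated format can be illustrated by splitting such a value back into its parts. The sketch below is in Python; `parse_metrics` is a hypothetical helper, and it assumes that parameter values contain none of the separator characters `,`, `;` or `:`.

```python
def parse_metrics(value):
    """Split '<Metric 1>[:p=v;..],<Metric 2>[..]' into (name, params) pairs."""
    metrics = []
    for item in value.split(","):             # ',' separates metrics
        name, _, raw_params = item.partition(":")
        params = {}
        if raw_params:
            for pair in raw_params.split(";"):  # ';' separates parameters
                key, _, val = pair.partition("=")
                params[key] = val               # values are kept as strings
        metrics.append((name, params))
    return metrics


print(parse_metrics("CrossEntropy,Quantile:alpha=0.1"))
# [('CrossEntropy', {}), ('Quantile', {'alpha': '0.1'})]
```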
    

Values of all custom metrics for learn and validation datasets are saved to the Metric output files (learn_error.tsv and test_error.tsv respectively). The directory for these files is specified in the --train-dir (train_dir) parameter.

--use-best-model

If this parameter is set, the number of trees that are saved in the resulting model is defined as follows:

  1. Build the number of trees defined by the training parameters.
  2. Use the validation dataset to identify the iteration with the optimal value of the metric specified in --eval-metric.

No trees are saved after this iteration.

This option requires a validation dataset to be provided.

--eval-metric

The metric used for overfitting detection (if enabled) and best model selection (if enabled). Some metrics support optional parameters (see the Objectives and metrics section for details on each metric).

Format:

<Metric>[:<parameter 1>=<value>;..;<parameter N>=<value>]

Supported metrics

Examples:

R2
Quantile:alpha=0.3
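The command keys above could be assembled into a `catboost fit` invocation as sketched below in Python. The dataset file names and the `build_fit_command` helper are hypothetical, and the `--learn-set` and `--test-set` keys are assumed from the command-line interface rather than taken from this page. The command is only built and printed, not executed.

```python
import shlex


def build_fit_command(learn_set, test_set, loss_function,
                      custom_metrics=(), eval_metric=None, train_dir=None):
    """Return the argument list for a 'catboost fit' call."""
    argv = [
        "catboost", "fit",
        "--learn-set", learn_set,
        "--test-set", test_set,   # validation dataset
        "--loss-function", loss_function,
    ]
    if custom_metrics:
        # Several metrics are passed as one comma-separated value.
        argv += ["--custom-metric", ",".join(custom_metrics)]
    if eval_metric:
        argv += ["--eval-metric", eval_metric]
    if train_dir:
        argv += ["--train-dir", train_dir]
    return argv


argv = build_fit_command(
    "train.tsv", "validation.tsv",   # hypothetical file names
    loss_function="Quantile:alpha=0.1",
    custom_metrics=["MAE", "R2"],
    eval_metric="Quantile:alpha=0.3",
    train_dir="catboost_info",
)
# shlex.join quotes any value that a shell would otherwise split,
# for example one containing ';' between metric parameters.
print(shlex.join(argv))
```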

--logging-level

The logging level to output to stdout.

Possible values:

  • Silent — Do not output any logging information to stdout.

  • Verbose — Output the following data to stdout:

    • optimized metric
    • elapsed time of training
    • remaining time of training
  • Info — Output additional information and the number of trees.

  • Debug — Output debugging information.