predict

Apply the model to the given dataset.

Note. The model cannot be applied correctly if the order of the columns in the testing and training datasets differs.
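One way to guard against a column-order mismatch is to reorder the testing data to the training column order before calling predict. The sketch below is a minimal illustration using pandas only; the helper name align_to_training_columns and the sample column names are hypothetical and are not part of the library.

```python
import pandas as pd

def align_to_training_columns(test_df, train_columns):
    """Return test_df with its columns in the same order as train_columns.

    Hypothetical helper for illustration: columns absent from the test
    data raise an error, and extra test columns are dropped.
    """
    missing = [c for c in train_columns if c not in test_df.columns]
    if missing:
        raise ValueError(f"Test data is missing columns: {missing}")
    return test_df[list(train_columns)]

# Made-up frames: same features, different column order.
train_df = pd.DataFrame({"age": [25, 40], "income": [30.0, 80.0], "city_id": [1, 2]})
test_df = pd.DataFrame({"city_id": [2], "age": [33], "income": [55.0]})

aligned = align_to_training_columns(test_df, train_df.columns)
```

The aligned frame can then be passed as the data argument.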

Method call format

predict(data,
    prediction_type='RawFormulaVal',
    ntree_start=0,
    ntree_end=0,
    thread_count=-1,
    verbose=None)

Parameters

Each parameter is listed with its possible types, description, and default value.
data
  • catboost.Pool
  • list of lists
  • numpy.ndarray of shape (doc_count, feature_count)
  • pandas.DataFrame
  • pandas.Series
  • catboost.FeaturesData

A file or matrix with the input dataset.

Required parameter
prediction_type
  • string

The required prediction type.

Supported prediction types:
  • Probability
  • Class
  • RawFormulaVal

Default value: RawFormulaVal
ntree_start
  • int

To reduce the number of trees to use when the model is applied or the metrics are calculated, set the range of the tree indices to [ntree_start; ntree_end).

This parameter defines the index of the first tree to be used when applying the model or calculating the metrics (the inclusive left border of the range). Indices are zero-based.

Default value: 0
ntree_end
  • int

To reduce the number of trees to use when the model is applied or the metrics are calculated, set the range of the tree indices to [ntree_start; ntree_end).

This parameter defines the index of the first tree not to be used when applying the model or calculating the metrics (the exclusive right border of the range). Indices are zero-based.

Default value: 0 (the index of the last tree to use equals the number of trees in the model minus one)
thread_count
  • int

The number of threads to use when applying the model.

Optimizes the speed of execution. This parameter doesn't affect results.

Default value: -1 (the number of threads is equal to the number of processor cores)
verbose
  • bool

Output the measured evaluation metric to stderr.

Default value: None

Type of return value

numpy.ndarray