catboost.save_pool

catboost.save_pool(data, 
                   label = NULL, 
                   weight = NULL, 
                   baseline = NULL, 
                   pool_path = "data.pool", 
                   cd_path = "cd.pool")

Purpose

Save the dataset in the CatBoost format. Two files are created:
  • the dataset description (written to pool_path)
  • the column descriptions (written to cd_path)

Use the catboost.load_pool function to read the resulting files. These files can also be used with the command-line version and the Python package.
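The save-and-reload round trip can be sketched as follows. This is a minimal illustration, assuming the catboost R package is installed. The data frame, its column names, and the temporary file paths are made up for the example. The column descriptions file is passed back through the column_description argument of catboost.load_pool.

```r
# Illustrative data: two double features and one categorical (factor) feature.
features <- data.frame(
  height = c(1.2, 3.4, 5.6, 7.8),
  width  = c(0.5, 0.1, 0.9, 0.3),
  color  = factor(c("red", "green", "red", "blue"))
)
labels <- c(1, 0, 1, 0)

# Write to temporary files instead of the defaults "data.pool" and "cd.pool".
pool_path <- tempfile(fileext = ".pool")
cd_path   <- tempfile(fileext = ".cd")

# The calls are skipped if the catboost package is not available.
if (requireNamespace("catboost", quietly = TRUE)) {
  # Save the dataset description and the column descriptions.
  catboost::catboost.save_pool(features,
                               label = labels,
                               pool_path = pool_path,
                               cd_path = cd_path)

  # Read the two files back into a pool object.
  pool <- catboost::catboost.load_pool(pool_path,
                                       column_description = cd_path)
}
```

Explicit paths keep the output out of the working directory. If pool_path and cd_path are omitted, the defaults listed under Arguments are used.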

Arguments

Argument / Description / Default value
data

A data.frame or matrix with features.

The following column types are supported:
  • double
  • factor. Categorical features are assumed to be given in columns of this type. The standard CatBoost processing procedure is applied to such columns:
    1. The values are converted to strings.
    2. The ConvertCatFeatureToFloat function is applied to the resulting string.
Required argument
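The column-type requirement above can be checked in base R before saving. This is a minimal sketch: the data frame and the helper name unsupported_columns are hypothetical. The last line mirrors only step 1 of the factor processing (conversion to strings). ConvertCatFeatureToFloat is internal to CatBoost and is not reproduced here.

```r
# Illustrative data with one integer and one character column,
# neither of which is in the list of supported types above.
df <- data.frame(
  price = c(10.5, 20.0, 7.25),
  count = c(3L, 1L, 4L),
  city  = c("Oslo", "Lima", "Oslo"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of columns that are neither double nor factor.
unsupported_columns <- function(d) {
  ok <- vapply(d, function(col) is.double(col) || is.factor(col), logical(1))
  names(d)[!ok]
}

before <- unsupported_columns(df)   # "count" "city"

# Convert the offending columns to the listed types.
df$count <- as.numeric(df$count)    # integer -> double
df$city  <- factor(df$city)         # character -> factor (categorical feature)

after <- unsupported_columns(df)    # character(0)

# Step 1 of the factor processing: the values as strings.
city_strings <- as.character(df$city)
```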
label

The target variables (in other words, the objects' label values) for the training dataset.

NULL
weight

The weights of the label values vector.

NULL
baseline

A vector of formula values for all input objects. Training starts from these values instead of from zero.

NULL
pool_path

The path to the output file that contains the dataset description.

data.pool
cd_path

The path to the output file that contains the column descriptions.

cd.pool