Model values

The results of applying the model to a dataset.

The output information and format depend on the machine learning problem being solved:

Regression

Contains

A number resulting from applying the model.

Header format

The first row in the output file contains a tab-separated description of data in the corresponding column.

Format:
[EvalSet:]DocId<\t><Prediction type 1><\t>..<\t><Prediction type N><\t>Label
  • EvalSet: is output for the evaluation file only if several test datasets are input.
  • Prediction type is specified in the starting parameters and takes one or several of the following values:

    • Probability
    • Class
    • RawFormulaVal
  • Label is output for test datasets and in cross-validation mode only.
Format

Each row starting from the second contains tab-separated information about a single object from the input dataset.

Format:
[<Test dataset ID>:]<DocId><\t><model value for prediction type 1><\t>..<\t><model value for prediction type N><\t><Label>
  • Test dataset ID is the serial number of the input test dataset. The value is output if several test datasets are input for model evaluation purposes.
  • DocId is an alphanumeric ID of the object given in the Dataset description. If the identifiers are not set in the input data, the objects are numbered sequentially, starting from zero.
  • model value for prediction type is the float number resulting from applying the model for the corresponding prediction type.
  • Label is the label value for the object. This value is output in model training and cross-validation modes.
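The row format above can be split into its parts with a few lines of Python. This is a minimal sketch, not part of the tool itself: the function name `parse_row` and the `has_label` flag are hypothetical, and it assumes that DocId never contains a colon, so that any colon in the first field marks the test dataset ID prefix.

```python
def parse_row(line, has_label=False):
    """Split one data row into dataset ID, DocId, model values and label.

    Assumption of this sketch: DocId itself contains no ':' character.
    """
    fields = line.rstrip("\n").split("\t")
    first = fields[0]
    dataset_id = None
    if ":" in first:
        # Optional "<Test dataset ID>:" prefix in front of DocId.
        prefix, first = first.split(":", 1)
        dataset_id = int(prefix)
    # The label, when present, is always the last column.
    label = fields[-1] if has_label else None
    values = fields[1:-1] if has_label else fields[1:]
    return {
        "dataset_id": dataset_id,
        "doc_id": first,
        "values": [float(v) for v in values],
        "label": label,
    }
```

The caller has to know whether the Label column is present (for example, by checking the header row), because a label cannot be told apart from a model value by position alone.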
Example

The resulting file without alphanumeric IDs:

DocId<\t>Probability<\t>Class
0<\t>0.8<\t>1
1<\t>0.3<\t>0

The resulting file for the cross-validation mode with alphanumeric IDs set:

DocId<\t>Probability<\t>Label
LT<\t>75.1<\t>73.6
LV<\t>73.2<\t>72.15
PL<\t>78.22<\t>77.5
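Because the file is plain tab-separated text with a header row, the standard `csv` module is enough to load it. The sketch below reads the cross-validation example above from an in-memory string; the names `SAMPLE` and `read_model_values` are illustrative, and a real file would be opened with `open(path, newline="")` instead.

```python
import csv
import io

# Sample content copied from the cross-validation example above.
SAMPLE = (
    "DocId\tProbability\tLabel\n"
    "LT\t75.1\t73.6\n"
    "LV\t73.2\t72.15\n"
    "PL\t78.22\t77.5\n"
)

def read_model_values(stream):
    """Return a list of dicts keyed by the header column names."""
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)

rows = read_model_values(io.StringIO(SAMPLE))
for row in rows:
    # All fields are read as strings; convert numeric columns explicitly.
    print(row["DocId"], float(row["Probability"]), float(row["Label"]))
```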

Classification

Contains

Depends on the selected output mode for approximated values of the formula:

  • RawFormulaVal: A number resulting from applying the model.
  • Probability: A number indicating the probability that the object belongs to the class (a sigmoid of the result of applying the model).
  • Class: The predicted class (output with the value “1” if the probability is higher than 0.5, otherwise “0”).
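The relationship between the three output modes can be illustrated with a short sketch. It follows the description above (Probability is a sigmoid of the raw value, Class is 1 when the probability exceeds 0.5); the function names are hypothetical and the code is not taken from the tool.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def to_probability_and_class(raw_value, threshold=0.5):
    """Derive the Probability and Class outputs from a RawFormulaVal value."""
    probability = sigmoid(raw_value)
    # Strictly higher than the threshold gives class 1, otherwise class 0.
    predicted_class = 1 if probability > threshold else 0
    return probability, predicted_class
```

A positive raw value therefore always maps to class 1 and a negative one to class 0; a raw value of exactly zero gives a probability of 0.5 and class 0.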
Header format

The first row in the output file contains a tab-separated description of data in the corresponding column.

Format:
[EvalSet:]DocId<\t><Prediction type 1><\t>..<\t><Prediction type N><\t>Label
  • EvalSet: is output for the evaluation file only if several test datasets are input.
  • Prediction type is specified in the starting parameters and takes one or several of the following values:

    • Probability
    • Class
    • RawFormulaVal
  • Label is output for test datasets and in cross-validation mode only.
Format

Each row in the output file contains tab-separated information about a single object from the input dataset.

Format:
[<Test dataset ID>:]<DocId><\t><model value><\t><Label>
  • Test dataset ID is the serial number of the input test dataset. The value is output if several test datasets are input for model evaluation purposes.
  • DocId is an alphanumeric ID of the object given in the Dataset description. If the identifiers are not set in the input data, the objects are numbered sequentially, starting from zero.
  • model value is the number resulting from applying the model for the corresponding prediction type.
  • Label is the label value for the object. This value is output in model training and cross-validation modes.
Example

The resulting file for the RawFormulaVal cross-validation mode:

DocId<\t>RawFormulaVal<\t>Label
0<\t>0.1685379577<\t>1
1<\t>0.2379356203<\t>1
2<\t>-0.04871954376<\t>1

The resulting file for the Probability cross-validation mode with alphanumeric IDs set for objects:

DocId<\t>Probability<\t>Label
DocId1<\t>0.5592048528<\t>1
DocId2<\t>0.5595881735<\t>1
DocId3<\t>0.5592048528<\t>1

The resulting file for the Class mode:

DocId<\t>Class
0<\t>0
1<\t>1
2<\t>1
3<\t>0

Multiclassification

Contains

Depends on the selected output mode for approximated values of the formula:

  • RawFormulaVal: A list of numbers resulting from applying the model. Values for the different classes are tab-separated.
  • Probability: A list of numbers indicating the probability that the object belongs to each of the classes. Values for the different classes are tab-separated.
  • Class: The number of the class that the object most likely belongs to.
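As an illustration of how the Class value relates to the Probability list, the sketch below picks the position of the largest probability. It assumes that class numbers are zero-based and follow the order of the probability columns; how the tool itself breaks ties is not stated here, so the tie rule in the code is only an assumption.

```python
def most_likely_class(probabilities):
    """Return the index of the largest probability.

    Ties go to the lowest index (an assumption of this sketch).
    """
    return max(range(len(probabilities)), key=lambda i: probabilities[i])
```

For example, the probabilities 0.3232259635, 0.315456703 and 0.3613173334 give class 2.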
Header format

The first row in the output file contains a tab-separated description of data in the corresponding column.

Format:

[EvalSet:]DocId<\t><PredictionType1>[:Class=<ClassID>]<\t>..<\t><PredictionTypeN>[:Class=<ClassID>][<\t>Label]
  • EvalSet: is output for the evaluation file only if several test datasets are input.
  • Prediction type is specified in the starting parameters and takes one or several of the following values:

    • Probability
    • Class
    • RawFormulaVal
  • ClassID is the identifier of the class being described in the column. It is omitted for the Class prediction type.
  • Label is the label value for the object. This value is output in model training and cross-validation modes.

The number of “Prediction type–ClassID” pairs depends on the input parameters. It is always limited to one pair for the Class prediction type.
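The header layout described above can be decoded into (prediction type, class ID) pairs. The following is a minimal sketch with a hypothetical function name; it assumes the literal prefix `EvalSet:` and the literal last column name `Label`, as given in the format string, and keeps ClassID as a string because it is an identifier.

```python
def parse_header(line):
    """Decode a multiclassification header row.

    Returns (has_eval_set, predictions, has_label), where predictions is a
    list of (prediction_type, class_id) tuples; class_id is None for columns
    without a ":Class=" suffix, such as the Class prediction type.
    """
    columns = line.rstrip("\n").split("\t")
    has_eval_set = columns[0].startswith("EvalSet:")
    has_label = columns[-1] == "Label"
    prediction_columns = columns[1:-1] if has_label else columns[1:]
    predictions = []
    for column in prediction_columns:
        prediction_type, sep, class_id = column.partition(":Class=")
        predictions.append((prediction_type, class_id if sep else None))
    return has_eval_set, predictions, has_label
```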

Format

Each row in the output file contains tab-separated information about a single object from the input dataset.

Format:
[<Test dataset ID>:]<DocId><\t><Model value 1><\t>..<\t><Model value N><\t><Label>
  • Test dataset ID is the serial number of the input test dataset. The value is output if several test datasets are input for model evaluation purposes.
  • DocId is an alphanumeric ID of the object given in the Dataset description. If the identifiers are not set in the input data, the objects are numbered sequentially, starting from zero.
  • Model value is a number or a list of numbers depending on the selected output mode for approximated values of the formula for the corresponding prediction type.
  • Label is the label value for the object. This value is output in model training and cross-validation modes.
Example

The resulting file for prediction in Class mode with alphanumeric IDs set for objects:

DocId<\t>Class
DocId1<\t>2
DocId2<\t>1
DocId3<\t>2

The resulting file for the Probability cross-validation mode:

DocId<\t>Probability:Class=0<\t>Probability:Class=1<\t>Probability:Class=2<\t>Label
1<\t>0.3232259635<\t>0.315456703<\t>0.3613173334<\t>2
2<\t>0.335771253<\t>0.3247524917<\t>0.3394762553<\t>0
3<\t>0.3181931812<\t>0.3242628483<\t>0.3575439705<\t>1

The resulting file for the RawFormulaVal cross-validation mode:

DocId<\t>RawFormulaVal:Class=0<\t>RawFormulaVal:Class=1<\t>RawFormulaVal:Class=2<\t>Label
1<\t>0.001232427024<\t>-0.04141999431<\t>0.04018756728<\t>2
2<\t>-0.04822847313<\t>-0.05520994445<\t>0.1034384176<\t>2
3<\t>-0.05717915565<\t>-0.06548867981<\t>0.1226678355<\t>2

The resulting file for prediction in RawFormulaVal and Probability modes:

DocId<\t>Probability:Class=0<\t>Probability:Class=1<\t>RawFormulaVal:Class=0<\t>RawFormulaVal:Class=1
1<\t>0.01593276944<\t>0.02337982256<\t>-1.494255509<\t>-1.110760101
2<\t>0.4060707366<\t>0.09565861257<\t>0.4137085351<\t>-1.032033103
3<\t>0.006235130003<\t>0.01759049831<\t>-2.03020042<\t>-0.9930409613
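When several prediction types are written to one file, it can be convenient to regroup each row by prediction type and class. The sketch below does this for the header and the first data row of the last example; the names are hypothetical, and it assumes a file with no Label column and no test dataset ID prefix.

```python
from collections import defaultdict

# Header and first data row copied from the example above.
HEADER = (
    "DocId\tProbability:Class=0\tProbability:Class=1"
    "\tRawFormulaVal:Class=0\tRawFormulaVal:Class=1"
)
ROW = "1\t0.01593276944\t0.02337982256\t-1.494255509\t-1.110760101"

def group_by_prediction_type(header, row):
    """Map prediction type -> {class ID -> value} for one data row."""
    names = header.split("\t")[1:]   # skip the DocId column
    values = row.split("\t")[1:]
    grouped = defaultdict(dict)
    for name, value in zip(names, values):
        prediction_type, _, class_id = name.partition(":Class=")
        grouped[prediction_type][class_id] = float(value)
    return dict(grouped)

grouped = group_by_prediction_type(HEADER, ROW)
print(grouped["Probability"])
print(grouped["RawFormulaVal"])
```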