Feature importance

The following types of feature importance files are created depending on the task and the execution parameters:

FeatureImportance

Contains

The individual importance values for each of the input features.

Format
  • The rows are sorted in descending order of the feature importance value.

  • Each row contains information related to one feature.

    Format:
    <feature strength><\t><feature name>
    • feature strength is the value of the regular feature importance.
    • feature name is the zero-based index of the feature.

      An alphanumeric identifier is used instead if specified in the corresponding Num or Categ column of the input data.

      For example, let's assume that the column descriptions file has the following structure:
      0<\t>Label value<\t>
      1<\t>Num
      2<\t>Num<\t>ratio
      3<\t>Categ
      4<\t>Auxiliary
      5<\t>Num
      The input dataset description file contains the following line:
      120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
      The table below shows the correspondence between the given feature values and the feature indices.
      Feature value    Feature index
      80               0
      0.8              ratio
      rock             2
      12               3
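The mapping in the example above can be reproduced with a short Python sketch. It assumes, as the table suggests, that only Num and Categ columns count as features, that the zero-based index skips all other columns, and that a name given in the third field replaces the index. The function name feature_columns is hypothetical and is not part of any library.

```python
# Column types treated as input features (assumption inferred from the example table).
FEATURE_TYPES = {"Num", "Categ"}


def feature_columns(cd_text):
    """Return (column index, feature identifier) pairs for the feature columns."""
    result = []
    for line in cd_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        column_index = int(fields[0])
        column_type = fields[1].strip()
        name = fields[2].strip() if len(fields) > 2 else ""
        if column_type not in FEATURE_TYPES:
            continue  # e.g. "Label value" and "Auxiliary" columns are not features
        # The feature index counts feature columns only; a given name replaces it.
        result.append((column_index, name or str(len(result))))
    return result


cd_text = "0\tLabel value\t\n1\tNum\n2\tNum\tratio\n3\tCateg\n4\tAuxiliary\n5\tNum\n"
row = "120\t80\t0.8\trock\tsome useless information\t12".split("\t")

columns = feature_columns(cd_text)
for column_index, identifier in columns:
    print(f"{row[column_index]} -> {identifier}")
```

With the sample column descriptions and dataset line from the example, the loop pairs each feature value with the identifier that appears in the feature importance file.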
Example
8.4<\t>2
5.5<\t>0
2.6<\t>3
1.5<\t>ratio
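A file in this format can be read with a few lines of Python. The sketch below assumes each row holds a strength and a feature name separated by a single tab character, as in the format description. The function name parse_feature_importance is hypothetical.

```python
def parse_feature_importance(text):
    """Parse rows of '<feature strength>\\t<feature name>' into (name, strength) pairs."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        strength, name = line.split("\t", 1)
        rows.append((name.strip(), float(strength)))
    return rows


# Sample content matching the example above, with real tab characters.
sample = "8.4\t2\n5.5\t0\n2.6\t3\n1.5\tratio\n"
rows = parse_feature_importance(sample)

# The rows are expected to be sorted in descending order of strength.
is_sorted = all(a[1] >= b[1] for a, b in zip(rows, rows[1:]))

for name, strength in rows:
    print(f"feature {name}: {strength}")
```

Feature names are kept as strings because a row may hold either a zero-based index or an alphanumeric identifier.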

InternalFeatureImportance

Contains

The importance values for each input feature and for each combination of features (if any).

Format
  • The rows are sorted in descending order of the feature importance value.

  • Each row contains information related to one feature or a combination of features.

    Format:
    <feature strength><\t>{<feature name 1>, ..., <feature name n>} pr<value> tb<value> type<value>
    • feature strength is the value of the internal feature importance.

    • feature name is the zero-based index of the feature.

      An alphanumeric identifier is used instead if specified in the corresponding Num or Categ column of the input data.

      For example, let's assume that the column descriptions file has the following structure:
      0<\t>Label value<\t>
      1<\t>Num
      2<\t>Num<\t>ratio
      3<\t>Categ
      4<\t>Auxiliary
      5<\t>Num
      The input dataset description file contains the following line:
      120<\t>80<\t>0.8<\t>rock<\t>some useless information<\t>12
      The table below shows the correspondence between the given feature values and the feature indices.
      Feature value    Feature index
      80               0
      0.8              ratio
      rock             2
      12               3
    • pr is the prior value.
    • tb is the border for the label value.
    • type is the feature border type.
Example
8.4<\t>0
5.2<\t>{2, ratio} pr2 tb0 type0
2.6<\t>{2} pr2 tb0 type0
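Rows in this format can be split into their parts with a regular expression. The sketch below assumes a tab between the strength and the description, and treats a description without braces (as in the first example row) as a single feature with no pr, tb or type values. The function name parse_internal_row is hypothetical, and the pr, tb and type values are kept as strings because their exact types are not stated here.

```python
import re

# Matches '{name 1, ..., name n} pr<value> tb<value> type<value>'.
COMBINATION = re.compile(
    r"^\{(?P<names>[^}]*)\}\s+pr(?P<pr>\S+)\s+tb(?P<tb>\S+)\s+type(?P<type>\S+)$"
)


def parse_internal_row(line):
    """Parse one row of an InternalFeatureImportance file into a dictionary."""
    strength, description = line.rstrip("\n").split("\t", 1)
    description = description.strip()
    match = COMBINATION.match(description)
    if match is None:
        # A plain feature row, for example '8.4\t0'.
        return {"strength": float(strength), "features": [description],
                "pr": None, "tb": None, "type": None}
    names = [name.strip() for name in match.group("names").split(",")]
    return {"strength": float(strength), "features": names,
            "pr": match.group("pr"), "tb": match.group("tb"),
            "type": match.group("type")}


# Sample rows matching the example above, with real tab characters.
sample = ["8.4\t0", "5.2\t{2, ratio} pr2 tb0 type0", "2.6\t{2} pr2 tb0 type0"]
parsed = [parse_internal_row(line) for line in sample]
for entry in parsed:
    print(entry)
```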