- CatBoost: unbiased boosting with categorical features
Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev. arXiv:1706.09516
NIPS 2018 paper explaining the principles of ordered boosting and ordered categorical feature statistics.
- CatBoost: gradient boosting with categorical features support
Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin. Workshop on ML Systems at NIPS 2017
A paper explaining how CatBoost works: how it handles categorical features, how it fights overfitting, and how GPU training and the fast formula applier are implemented.
- Finding Influential Training Samples for Gradient Boosted Decision Trees
Boris Sharchilev, Yury Ustinovsky, Pavel Serdyukov, Maarten de Rijke. arXiv:1802.06640
A paper on finding influential training samples for a particular case of tree ensemble-based models. It explains several ways of extending the framework to non-parametric GBDT ensembles under the assumption that tree structures remain fixed, and introduces a general scheme for obtaining further approximations to this method that balance the trade-off between performance and computational complexity.
- A Unified Approach to Interpreting Model Predictions
Scott Lundberg, Su-In Lee. arXiv:1705.07874
A paper explaining a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations).
- Consistent feature attribution for tree ensembles
Scott M. Lundberg, Su-In Lee. arXiv:1706.06060
A paper explaining fast exact solutions for SHAP (SHapley Additive exPlanation) values, a unique additive feature attribution method based on conditional expectations that is both consistent and locally accurate.
- Winning The Transfer Learning Track of Yahoo!’s Learning To Rank Challenge with YetiRank
Andrey Gulin, Igor Kuralenok, Dimitry Pavlov. PMLR 14:63-76
The theory underlying the YetiRank and YetiRankPairwise modes in CatBoost.