Linear Aggregation in Tree-based Estimators

Author(s):

Sören R Künzel, Theo F Saarinen, Edward W Liu, Jasjeet S Sekhon

ISPS ID:

isps22-41

Full citation:

Sören R. Künzel, Theo F. Saarinen, Edward W. Liu & Jasjeet S. Sekhon (2022) Linear Aggregation in Tree-Based Estimators, Journal of Computational and Graphical Statistics, 31:3, 917-934, DOI: 10.1080/10618600.2022.2026780

Abstract:

Regression trees and their ensemble methods are popular methods for nonparametric regression: they combine strong predictive performance with interpretable estimators. To improve their utility for locally smooth response surfaces, we study regression trees and random forests with linear aggregation functions. We introduce a new algorithm that finds the best axis-aligned split to fit linear aggregation functions on the corresponding nodes, and we offer a quasilinear time implementation. We demonstrate the algorithm’s favorable performance on real-world benchmarks and in an extensive simulation study, and we demonstrate its improved interpretability using a large get-out-the-vote experiment. We provide an open-source software package that implements several tree-based estimators with linear aggregation functions. Supplementary materials for this article are available online.

Supplemental information:

Link to article (gated): https://doi.org/10.1080/10618600.2022.2026780

Publication date:

2022

Publication type:

Peer Reviewed Article

Publication name:

Journal of Computational and Graphical Statistics

Discipline:

Political Science

Institution for Social and Policy Studies

