Abstract of Enhanced variable selection for distributional regression

We present an approach for enhanced variable selection in distributional regression via component-wise boosting. Boosting is an alternative method for fitting regression models and is applicable to high-dimensional data problems.

Furthermore, the algorithm leads to data-driven variable selection. In practice, however, it still tends to select too many variables in some situations, including false positives. This occurs particularly for low-dimensional data (p < n), where we observe slow overfitting behavior. Because of the slow overfitting, the stopping iteration becomes larger and more variables are included in the model. Many of the false positives enter with a small coefficient and therefore have a small impact, but they lead to a larger model that is difficult to interpret.

We try to overcome this issue by giving the algorithm the chance to de-select those variables. We consider the impact on variable selection and prediction and additionally compare the new approach to the One Standard Error Rule.
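
The idea of selecting and then de-selecting variables can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: it models only the mean with squared-error loss rather than a full distributional regression, it uses simple linear base-learners, and it de-selects a variable when its share of the total risk reduction falls below a hypothetical threshold `tau`. The function names `componentwise_l2_boost` and `deselect` and the parameters `n_iter`, `nu` and `tau` are illustrative only.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per column.

    Returns the intercept, the coefficient vector, and the risk
    reduction (drop in residual sum of squares) attributed to each column.
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    risk_reduction = np.zeros(p)
    resid = y - intercept
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        # Least-squares fit of every single column to the current residuals.
        beta = X.T @ resid / col_ss
        # RSS reduction each column would achieve with a full step.
        gain = beta ** 2 * col_ss
        j = int(np.argmax(gain))
        risk_before = resid @ resid
        # Update only the best column, shrunk by the step length nu.
        coef[j] += nu * beta[j]
        resid = resid - nu * beta[j] * X[:, j]
        risk_reduction[j] += risk_before - resid @ resid
    return intercept, coef, risk_reduction


def deselect(X, y, tau=0.01, n_iter=200, nu=0.1):
    """Drop columns whose share of the total risk reduction is below tau,
    then refit the boosting model on the remaining columns (assumed rule)."""
    _, _, rr = componentwise_l2_boost(X, y, n_iter, nu)
    keep = np.flatnonzero(rr >= tau * rr.sum())
    _, coef_keep, _ = componentwise_l2_boost(X[:, keep], y, n_iter, nu)
    final = np.zeros(X.shape[1])
    final[keep] = coef_keep
    return keep, final
```

In this sketch, variables that enter the model with only a small coefficient contribute little to the total risk reduction, so they are the ones removed before the refit. The choice of `tau` trades sparsity against predictive performance and would need to be tuned for a given data set.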
