- Statist. Sci.
- Volume 32, Number 3 (2017), 367-384.
Logistic Regression: From Art to Science
A high-quality logistic regression model has several desirable properties: predictive power, interpretability, significance, robustness to error in data and sparsity, among others. To achieve these competing goals, modelers incorporate these properties iteratively as they home in on a final model. In the period 1991–2015, algorithmic advances in Mixed-Integer Linear Optimization (MILO) coupled with hardware improvements have resulted in an astonishing 450-billion-fold speedup in solving MILO problems. Motivated by this speedup, we propose modeling logistic regression problems algorithmically with a mixed-integer nonlinear optimization (MINLO) approach in order to explicitly incorporate these properties in a joint, rather than sequential, fashion. The resulting MINLO is flexible and can be adjusted based on the needs of the modeler. Using both real and synthetic data, we demonstrate that the overall approach is generally applicable and provides high-quality solutions in realistic timelines as well as a guarantee of suboptimality. When the MINLO is infeasible, we obtain a guarantee that imposing the desired statistical properties simultaneously is simply not feasible.
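To make the idea of imposing properties jointly more concrete, the following is a minimal sketch of a standard cardinality-constrained logistic regression written as a mixed-integer nonlinear problem. It is an illustration only and not necessarily the formulation used in the paper. The labels $y_i \in \{-1, +1\}$, the big-$M$ bound $M$, the sparsity budget $k$, and the set $\mathcal{HC}$ of highly correlated feature pairs are all assumptions introduced here.

```latex
\begin{align*}
\min_{\beta_0,\,\beta,\,z} \quad
  & \sum_{i=1}^{n} \log\!\Bigl(1 + \exp\bigl(-y_i(\beta_0 + \beta^{\top} x_i)\bigr)\Bigr) \\
\text{s.t.} \quad
  & -M z_j \le \beta_j \le M z_j, && j = 1, \dots, p, \\
  & \sum_{j=1}^{p} z_j \le k, \\
  & z_j + z_l \le 1, && (j, l) \in \mathcal{HC}, \\
  & z \in \{0, 1\}^{p}.
\end{align*}
```

Here $z_j = 0$ forces $\beta_j = 0$, so the budget $k$ enforces sparsity, and the pairwise constraints prevent two highly correlated features from entering the model together. Because all requirements sit in a single problem, a solver either returns a model that satisfies them all or reports that no such model exists.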
First available in Project Euclid: 1 September 2017
Bertsimas, Dimitris; King, Angela. Logistic Regression: From Art to Science. Statist. Sci. 32 (2017), no. 3, 367--384. doi:10.1214/16-STS602. https://projecteuclid.org/euclid.ss/1504253122
- Supplement to “Logistic Regression: From Art to Science”.