High-dimensional datasets, in which the number of predictors p exceeds the sample size n, have become increasingly common in recent years. Such datasets pose great challenges for building a good linear prediction model. In addition, when a dataset contains a fraction of outliers or other contamination, linear regression becomes a difficult problem. We therefore need methods that are sparse and robust at the same time. In this paper, we implement the MM-estimation approach and propose L1-penalized MM-estimation (MM-Lasso). The proposed estimator combines the sparse LTS estimator with penalized M-estimators to obtain sparse model estimates with a high breakdown value and good prediction performance. We implemented MM-Lasso in the C programming language. A simulation study demonstrates the favorable prediction performance of MM-Lasso.
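To make the idea of an L1-penalized M-estimator concrete, the following is a minimal sketch and not the authors' C implementation. It assumes Tukey's bisquare loss with the conventional tuning constant 4.685, solves the penalized problem by iteratively reweighted least squares with a coordinate-descent lasso step, and takes the initial coefficients and residual scale as plain arguments. In the paper the robust starting point comes from sparse LTS; here a MAD-based scale of the response and a zero start are swapped in purely for illustration. All function names (`tukey_weight`, `soft_threshold`, `weighted_lasso`, `robust_scale`, `mm_lasso_sketch`) are hypothetical.

```python
import statistics


def tukey_weight(u, c=4.685):
    """IRLS weight psi(u)/u for Tukey's bisquare; zero beyond the cutoff c."""
    if abs(u) >= c:
        return 0.0
    return (1.0 - (u / c) ** 2) ** 2


def soft_threshold(z, t):
    """Soft-thresholding operator: the closed-form lasso update for one coordinate."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, w, lam, beta, sweeps=100):
    """Coordinate descent for 0.5 * sum_i w_i * (y_i - x_i'beta)^2 + lam * ||beta||_1."""
    n, p = len(X), len(X[0])
    beta = list(beta)
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(sweeps):
        for j in range(p):
            denom = sum(w[i] * X[i][j] ** 2 for i in range(n))
            if denom == 0.0:
                continue  # column carries no weight; leave the coefficient alone
            # Weighted inner product of column j with the partial residual
            # (the residual with the contribution of beta_j added back).
            rho = sum(w[i] * X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n))
            new = soft_threshold(rho, lam) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def robust_scale(values):
    """Normalized median absolute deviation (illustrative stand-in for a robust scale)."""
    med = statistics.median(values)
    return 1.4826 * statistics.median(abs(v - med) for v in values)


def mm_lasso_sketch(X, y, lam, beta_init, scale, iters=20):
    """IRLS: reweight by bisquare weights of scaled residuals, then re-solve the weighted lasso.

    Note: lam penalizes the weighted least-squares surrogate, so it is not on the
    same numerical scale as a penalty attached directly to the rho-loss.
    """
    p = len(X[0])
    beta = list(beta_init)
    for _ in range(iters):
        resid = [yi - sum(xi[j] * beta[j] for j in range(p)) for xi, yi in zip(X, y)]
        w = [tukey_weight(r / scale) for r in resid]
        beta = weighted_lasso(X, y, w, lam, beta)
    return beta


# Toy data (assumed for illustration): y = 2x exactly, except one gross outlier.
X_demo = [[float(i)] for i in range(1, 9)]
y_demo = [2.0 * i for i in range(1, 8)] + [100.0]
beta_demo = mm_lasso_sketch(X_demo, y_demo, lam=1.0, beta_init=[0.0],
                            scale=robust_scale(y_demo))
print(beta_demo)
```

In this toy setting the outlying observation receives weight zero, so the fit is driven by the seven clean points and the L1 penalty only shrinks the slope slightly. A zero start works here because the outlier is extreme in the response alone; with leverage points or p > n, a high-breakdown initial estimate such as sparse LTS is what makes the MM step reliable.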
Published in | American Journal of Theoretical and Applied Statistics (Volume 4, Issue 3) |
DOI | 10.11648/j.ajtas.20150403.12 |
Page(s) | 78-84 |
Creative Commons |
This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited. |
Copyright |
Copyright © The Author(s), 2015. Published by Science Publishing Group |
Keywords | MM Estimate, Sparse Model, LTS Estimate, Robust Regression |
APA Style
Kamal Darwish, Ali Hakan Buyuklu. (2015). Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data. American Journal of Theoretical and Applied Statistics, 4(3), 78-84. https://doi.org/10.11648/j.ajtas.20150403.12
TY - JOUR
T1 - Robust Linear Regression Using L1-Penalized MM-Estimation for High Dimensional Data
AU - Kamal Darwish
AU - Ali Hakan Buyuklu
Y1 - 2015/03/30
PY - 2015
N1 - https://doi.org/10.11648/j.ajtas.20150403.12
DO - 10.11648/j.ajtas.20150403.12
T2 - American Journal of Theoretical and Applied Statistics
JF - American Journal of Theoretical and Applied Statistics
JO - American Journal of Theoretical and Applied Statistics
SP - 78
EP - 84
PB - Science Publishing Group
SN - 2326-9006
UR - https://doi.org/10.11648/j.ajtas.20150403.12
VL - 4
IS - 3
ER -