R Regression: Comparing Speed Using lm(), lm.fit(), and Rcpp

Two of the common problems with R are speed and memory use. Below I compare three methods of performing multiple linear regression.

The built-in R function is lm(), and it is the slowest of the three. lm.fit() is the bare-bones fitting routine that lm() itself calls; it is substantially faster but still slow. The fastest method uses Rcpp with RcppArmadillo, which exposes the C++ Armadillo linear algebra library to R.

Using a 31400 x 4 design matrix, a simulation is run to compare the three methods.

A simulation of 1000 multiple linear regressions using the R function lm() gives the following average system time:

> mean(s_lm)
[1] 0.067614

A simulation of 1000 multiple linear regressions using the R function lm.fit() gives the following average system time:

> mean(s_lmfit)
[1] 0.006888

This is almost 10 times faster than lm().

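A simulation of 1000 multiple linear regressions using the C++ implementation below, built with Rcpp and RcppArmadillo, gives the following average system time:

 // [[Rcpp::depends(RcppArmadillo)]]
 #include <RcppArmadillo.h>
 using namespace Rcpp;
 using namespace arma;

 // Ordinary least squares via the normal equations: b_hat = (X'X)^{-1} X'y
 // [[Rcpp::export]]
 arma::vec lm_rcpp(const arma::mat& X, const arma::vec& y) {
   // X.t() is the transpose of X and .i() is the matrix inverse
   arma::vec b_hat = (X.t() * X).i() * X.t() * y;
   return b_hat;
 }

> mean(s_rcpp)
[1] 0.002169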

The Rcpp code is about 31 times faster than the basic R lm() implementation, and about 3 times faster than lm.fit()!
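
To make the one-line estimator in lm_rcpp() more concrete, here is a minimal sketch of the same normal-equations calculation in plain C++, with no Rcpp or Armadillo. It is not the code that was timed above. The function name ols_normal_equations and the Matrix alias are hypothetical names chosen for illustration, and the sketch solves (X'X) b = X'y with a Cholesky factorization rather than forming the inverse explicitly, which is usually the more stable choice. Armadillo's own solve() function serves the same purpose inside Rcpp code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major dense matrix: X[i] is observation i, X[i][j] is predictor j.
using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not part of Rcpp or Armadillo): ordinary least squares
// via the normal equations (X'X) b = X'y, solved by Cholesky factorization.
std::vector<double> ols_normal_equations(const Matrix& X, const std::vector<double>& y) {
    const std::size_t n = X.size();
    if (n == 0 || y.size() != n)
        throw std::invalid_argument("X and y need the same non-zero number of rows");
    const std::size_t p = X[0].size();

    // Accumulate the lower triangle of A = X'X and the vector c = X'y.
    Matrix A(p, std::vector<double>(p, 0.0));
    std::vector<double> c(p, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        assert(X[i].size() == p);  // every row must have p columns
        for (std::size_t j = 0; j < p; ++j) {
            c[j] += X[i][j] * y[i];
            for (std::size_t k = 0; k <= j; ++k) A[j][k] += X[i][j] * X[i][k];
        }
    }

    // In-place Cholesky factorization A = L L'; L overwrites the lower triangle.
    for (std::size_t j = 0; j < p; ++j) {
        for (std::size_t k = 0; k < j; ++k) A[j][j] -= A[j][k] * A[j][k];
        if (A[j][j] <= 0.0) throw std::runtime_error("X'X is not positive definite");
        A[j][j] = std::sqrt(A[j][j]);
        for (std::size_t i = j + 1; i < p; ++i) {
            for (std::size_t k = 0; k < j; ++k) A[i][j] -= A[i][k] * A[j][k];
            A[i][j] /= A[j][j];
        }
    }

    // Forward substitution L z = c, then back substitution L' b = z (both in c).
    for (std::size_t i = 0; i < p; ++i) {
        for (std::size_t k = 0; k < i; ++k) c[i] -= A[i][k] * c[k];
        c[i] /= A[i][i];
    }
    for (std::size_t i = p; i-- > 0;) {
        for (std::size_t k = i + 1; k < p; ++k) c[i] -= A[k][i] * c[k];
        c[i] /= A[i][i];
    }
    return c;  // estimated coefficients b_hat
}
```

For a design matrix with only 4 columns, almost all of the work is the single pass over the rows that builds X'X and X'y; the 4 x 4 solve itself is negligible.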


Robust Regression Package for R

I wrote this package in 2006, when the major statistical software companies did not have a robust regression package available. It has been downloaded over 100k times. Using iteratively reweighted least squares (IRLS), the function calculates the optimal weights to perform M-estimator or bounded-influence regression. It returns robust beta estimates and prints a robust ANOVA table, using either a Huber or a bisquare function.
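
For readers who have not seen IRLS before, here is a minimal sketch of the idea in plain C++ for a straight-line fit with Huber weights. It is not the package's code. All names (median, mad, huber_weight, weighted_line_fit, huber_irls) are hypothetical, the tuning constant 1.345 and the MAD scale factor 1.4826 are the conventional defaults and are assumptions here, and the sketch handles only one predictor plus an intercept.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of v; nth_element avoids a full sort.
double median(std::vector<double> v) {
    assert(!v.empty());
    const std::size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double hi = v[mid];
    if (v.size() % 2 == 1) return hi;
    // Even length: average the two middle values.
    const double lo = *std::max_element(v.begin(), v.begin() + mid);
    return 0.5 * (lo + hi);
}

// Median absolute deviation, scaled by 1.4826 (assumed conventional constant).
double mad(const std::vector<double>& v) {
    const double m = median(v);
    std::vector<double> dev(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) dev[i] = std::fabs(v[i] - m);
    return 1.4826 * median(dev);
}

// Huber weight: 1 for |u| <= k, k / |u| beyond that.
double huber_weight(double u, double k = 1.345) {
    const double a = std::fabs(u);
    return a <= k ? 1.0 : k / a;
}

struct Line { double intercept; double slope; };

// Weighted least squares for y = a + b * x (closed form for two coefficients).
Line weighted_line_fit(const std::vector<double>& x, const std::vector<double>& y,
                       const std::vector<double>& w) {
    double sw = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sw += w[i]; sx += w[i] * x[i]; sy += w[i] * y[i];
        sxx += w[i] * x[i] * x[i]; sxy += w[i] * x[i] * y[i];
    }
    const double slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx);
    return {(sy - slope * sx) / sw, slope};
}

// IRLS: start from the unweighted fit, then repeat
// residuals -> robust scale -> weights -> weighted refit until the fit settles.
Line huber_irls(const std::vector<double>& x, const std::vector<double>& y,
                int max_iter = 50, double tol = 1e-10) {
    assert(x.size() == y.size() && x.size() >= 3);
    std::vector<double> w(x.size(), 1.0), r(x.size());
    Line fit = weighted_line_fit(x, y, w);
    for (int it = 0; it < max_iter; ++it) {
        for (std::size_t i = 0; i < x.size(); ++i)
            r[i] = y[i] - fit.intercept - fit.slope * x[i];
        const double s = mad(r);
        if (s == 0.0) break;  // degenerate scale: residuals cannot be standardized
        for (std::size_t i = 0; i < x.size(); ++i) w[i] = huber_weight(r[i] / s);
        const Line next = weighted_line_fit(x, y, w);
        const double change = std::fabs(next.intercept - fit.intercept) +
                              std::fabs(next.slope - fit.slope);
        fit = next;
        if (change < tol) break;
    }
    return fit;
}
```

On data that follow a line apart from one gross outlier, the outlier ends up with a weight close to zero, so the fitted line stays near the bulk of the points instead of being pulled towards the outlier as the unweighted fit is. A bisquare weight function could be swapped in for huber_weight() without changing the loop.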

Recent changes make the structure of the arguments similar to glm() or lm(), and speed has increased dramatically.

Download Robust Regression Package for R

*Updated 8/25/2015, version robustreg_0.1-9
*R functions median(), mad() replaced with faster Rcpp equivalents
*R functions Huber and bisquare replaced with faster Rcpp equivalents