R Regression: Comparing Speed Using lm(), lm.fit(), and Rcpp

Two common complaints about R are its speed and its memory use. Below I compare three ways to perform multiple linear regression.

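The standard built-in R function is lm(), which is the slowest of the three. lm.fit() is the bare-bones fitting routine that lm() itself calls; it takes the design matrix and response directly, skips the formula and data-frame handling, and is substantially faster but still slow. The fastest method uses Rcpp with RcppArmadillo, which gives R access to Armadillo, a C++ linear algebra library.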

A simulation using a 31400 x 4 design matrix compares the three methods.

A simulation of 1000 multiple linear regressions using the R function lm() gives the following average system time:

> mean(s_lm)
[1] 0.067614

A simulation of 1000 multiple linear regressions using the R function lm.fit() gives the following average system time:

> mean(s_lmfit)
[1] 0.006888

This is almost 10 times faster than lm() (0.067614 / 0.006888 is about 9.8).

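The third method is a C++ implementation using Rcpp and RcppArmadillo:

 // [[Rcpp::depends(RcppArmadillo)]]
 #include <RcppArmadillo.h>
 using namespace Rcpp;
 using namespace arma;

 // Ordinary least squares via the normal equations.
 // X: n x p design matrix, y: response vector of length n.
 // [[Rcpp::export]]
 arma::vec lm_rcpp(arma::mat X, arma::vec y) {
   // b_hat = (X'X)^{-1} X'y
   arma::vec b_hat = (X.t() * X).i() * X.t() * y;
   return b_hat;
 }

A simulation of 1000 multiple linear regressions using this function gives the following average system time:

> mean(s_rcpp)
[1] 0.002169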
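
The function above follows the textbook formula b_hat = (X'X)^{-1} X'y and forms the inverse explicitly. A commonly used alternative is to solve the normal equations (X'X) b = X'y with a Cholesky factorisation, which avoids the explicit inverse. The sketch below illustrates that idea in plain C++ with no Armadillo dependency. The Matrix alias and the function name ols_cholesky are hypothetical, the code assumes X has full column rank, and it was not part of the benchmark, so no timing is claimed for it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense matrix stored as a vector of rows (illustrative type, not from the post).
using Matrix = std::vector<std::vector<double>>;

// Least squares through the normal equations (X'X) b = X'y, solved with a
// Cholesky factorisation X'X = L L' instead of an explicit inverse.
// Assumption: X has full column rank, so X'X is symmetric positive definite.
std::vector<double> ols_cholesky(const Matrix& X, const std::vector<double>& y) {
    const std::size_t n = X.size();
    assert(n > 0 && n == y.size());
    const std::size_t p = X[0].size();

    // Accumulate the lower triangle of A = X'X and the vector c = X'y.
    Matrix A(p, std::vector<double>(p, 0.0));
    std::vector<double> c(p, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < p; ++j) {
            c[j] += X[i][j] * y[i];
            for (std::size_t k = 0; k <= j; ++k) {
                A[j][k] += X[i][j] * X[i][k];
            }
        }
    }

    // In-place Cholesky: the lower triangle of A is overwritten with L.
    for (std::size_t j = 0; j < p; ++j) {
        for (std::size_t k = 0; k < j; ++k) {
            A[j][j] -= A[j][k] * A[j][k];
        }
        assert(A[j][j] > 0.0);  // fails if X is rank deficient
        A[j][j] = std::sqrt(A[j][j]);
        for (std::size_t i = j + 1; i < p; ++i) {
            for (std::size_t k = 0; k < j; ++k) {
                A[i][j] -= A[i][k] * A[j][k];
            }
            A[i][j] /= A[j][j];
        }
    }

    // Forward substitution: solve L z = c (z is stored in b).
    std::vector<double> b(c);
    for (std::size_t i = 0; i < p; ++i) {
        for (std::size_t k = 0; k < i; ++k) {
            b[i] -= A[i][k] * b[k];
        }
        b[i] /= A[i][i];
    }

    // Back substitution: solve L' b = z, using L'[i][k] = L[k][i].
    for (std::size_t i = p; i-- > 0;) {
        for (std::size_t k = i + 1; k < p; ++k) {
            b[i] -= A[k][i] * b[k];
        }
        b[i] /= A[i][i];
    }
    return b;
}
```

For example, with a column of ones plus x = 1, 2, 3, 4 as the design matrix and y = 3, 5, 7, 9, the function returns an intercept of 1 and a slope of 2, up to floating-point rounding. Inside an Rcpp function the same effect is usually obtained by calling the linear algebra library's solver rather than hand-written loops; the loops here only make the steps visible.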

The Rcpp code is about 31 times faster than the basic R lm() implementation, and about 3 times faster than lm.fit()!
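
The post reports an average over 1000 runs for each method but does not show the timing harness. As a rough illustration of the repeat-and-average pattern, here is a minimal C++ sketch using std::chrono. The name mean_seconds and the callable fit are hypothetical, and the sketch measures wall-clock time, which is not necessarily the same quantity as the "system time" quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Call `fit` `reps` times and return the mean wall-clock time per call,
// in seconds. `fit` is any callable taking no arguments; it stands in for
// a single regression fit (hypothetical, not from the post).
template <typename F>
double mean_seconds(F&& fit, std::size_t reps) {
    assert(reps > 0);
    using steady = std::chrono::steady_clock;  // monotonic clock
    double total = 0.0;
    for (std::size_t r = 0; r < reps; ++r) {
        const auto start = steady::now();
        fit();
        const auto stop = steady::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / static_cast<double>(reps);
}
```

Passing a lambda that performs one regression, together with reps = 1000, yields a per-fit average comparable in spirit to the means reported in the post. Timing each call separately, rather than timing the whole loop once, also keeps the individual samples available if a median or spread is wanted later.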