Shige's Research Blog

Wednesday, September 17, 2014

R package to convert statistical analysis objects to tidy data frames

This post introduces the broom package, which tidies up R output in the same way that the dplyr and tidyr packages tidy up R data frames.

Monday, September 15, 2014

ffbase2

The "ffbase2" package aims to combine the "dplyr" and "ff" packages. Really cool.

Tuesday, September 09, 2014

Ceemple vs. Rcpp

Ceemple is a cool way to do C++. Rcpp is another cool way to do C++. Each of them has its own strengths and weaknesses. I am amazed to see how little change is required to get the same source to compile and run under these environments. For example, Ceemple comes with an example that uses the Eigen matrix library:

--------------------------------------------
#include <Eigen/Dense>
#include <iostream>
using namespace Eigen;
using namespace std;

int main()
{
  ArrayXXf  m(2,2);
  // assign some values coefficient by coefficient
  m(0,0) = 1.0; m(0,1) = 2.0;
  m(1,0) = 3.0; m(1,1) = m(0,1) + m(1,0);
  // print values to standard output
  cout << m << endl << endl;
  // using the comma-initializer is also allowed
  m << 1.0,2.0,
       3.0,4.0;
  // print values to standard output
  cout << m << endl;
}
-------------------------------------------

With Rcpp (using the RStudio IDE), this becomes:
-------------------------------------------
// [[Rcpp::depends(RcppEigen)]]
#include <RcppEigen.h>
using namespace std;
using namespace Rcpp;
using namespace Eigen;

// [[Rcpp::export]]
int test_eigen()
{
  ArrayXXf  m(2,2);
  // assign some values coefficient by coefficient
  m(0,0) = 1.0; m(0,1) = 2.0;
  m(1,0) = 3.0; m(1,1) = m(0,1) + m(1,0);
  // print values to standard output
  cout << m << endl << endl;
  // using the comma-initializer is also allowed
  m << 1.0,2.0,
       3.0,4.0;
  // print values to standard output
  cout << m << endl;
  return 0;
}

/*** R
test_eigen()
*/
-------------------------------------------

Virtually no changes required! 
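
One way to read the comparison above is that only the entry point differs between the two environments, while the body stays the same. The sketch below illustrates that idea and is not taken from either project. It assumes a plain std::array in place of Eigen's ArrayXXf so that it builds without extra libraries, and the names Array2x2, print_array, run_example and run_example_to_string are all hypothetical.

```cpp
#include <array>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Eigen's ArrayXXf(2,2), kept to the standard
// library so the sketch builds without Eigen installed.
using Array2x2 = std::array<std::array<float, 2>, 2>;

// Write the array row by row, values separated by a single space.
void print_array(std::ostream& out, const Array2x2& m) {
  for (const auto& row : m) {
    out << row[0] << ' ' << row[1] << '\n';
  }
}

// Shared body: the same coefficient-by-coefficient assignments as in the
// post, written to whatever stream the caller supplies.
void run_example(std::ostream& out) {
  Array2x2 m{};
  m[0][0] = 1.0f; m[0][1] = 2.0f;
  m[1][0] = 3.0f; m[1][1] = m[0][1] + m[1][0];
  print_array(out, m);
}

// Helper that captures the output in a string, handy for checking it.
std::string run_example_to_string() {
  std::ostringstream buf;
  run_example(buf);
  return buf.str();
}
```

A standalone program would call run_example(std::cout) from main(), while an Rcpp build could call run_example(Rcpp::Rcout) from a function marked with // [[Rcpp::export]]. Everything else stays shared.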

The rgl package needs a new libpng!

The new version of the "rgl" package (0.94.1131) requires "libpng15.so.15". On my Ubuntu 14.04 system, I have to get the libpng source tarball, build and install it, and make a soft link to the library in "/usr/lib".

Wednesday, August 27, 2014

Data science toolkit

Here is a list of useful resources for data science.

Friday, August 22, 2014

Two R packages for pipelining

PipeR and magrittr are two packages for pipelining in R. This post explains some design differences between them.

Regular expressions in R

This post explains how to use different regular expression commands to achieve various tasks.

Here is a good tutorial, and a YouTube video here.

Sunday, August 17, 2014

Friday, August 15, 2014

Atom editor

The Atom editor is really cool. The Windows build keeps pace with development, but the Linux build seriously lags behind. The development version is 0.124, the Windows build is 0.123, whereas the Linux build is 0.161. I have to build it from source myself; I don't have a choice.
