Shige's Research Blog

Wednesday, November 30, 2016


Apache Bigtop is by far the easiest Hadoop distribution to install. I was able to get it to work on a CentOS VM. Unfortunately the bundled Spark is 1.5.1, which does not work well with the sparklyr package. I have not yet figured out how to make the YARN manager work with a third-party copy of Spark, so I need to wait for the Bigtop distribution to upgrade.

Saturday, November 26, 2016

The importance of googling ...

I was trying to submit a manuscript today. The main manuscript was written in R Markdown, whereas the supplementary materials were in Word. The online submission system kept complaining about my PDF file. After some googling and tweaking, it turned out the problem was caused by the indicator function symbol "\mathbbm{1}". Following the suggestion given here, I replaced it with "\mathds{1}". Problem solved!
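For reference, here is a minimal sketch of the replacement in a plain LaTeX document. It assumes the dsfont package is installed, since that is the package that provides \mathds (\mathbbm comes from the bbm package). The set A and the formula are made up purely for illustration and are not from my manuscript.

```latex
\documentclass{article}
% \usepackage{bbm}   % provides \mathbbm; this was the symbol the system rejected
\usepackage{dsfont}  % provides \mathds, used as the replacement

\begin{document}
% Illustrative indicator function of a hypothetical set A
\[
  \mathds{1}_{A}(x) =
  \begin{cases}
    1, & x \in A, \\
    0, & x \notin A.
  \end{cases}
\]
\end{document}
```

In an R Markdown document compiled to PDF, the \usepackage{dsfont} line would go under header-includes in the YAML header instead of in a preamble.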

Sunday, November 20, 2016

Saturday, November 12, 2016

RStudio IDE Easy Tricks You Might’ve Missed

I missed a few of the tricks mentioned here. Very neat indeed!

Saturday, November 05, 2016

Tidy Text Mining with R

Cool text mining book.


This is a simple but useful package for downloading product information and reviews from