Apache Bigtop is by far the easiest Hadoop distribution to install. I was able to get it working on a CentOS VM. Unfortunately, the bundled Spark is version 1.5.1, which does not work well with the sparklyr package. I have not yet figured out how to make the YARN resource manager work with a third-party copy of Spark, so I will have to wait for the Bigtop distribution to upgrade.
Saturday, November 26, 2016
The importance of googling ...
I was trying to submit a manuscript today. The main manuscript was written in R Markdown, whereas the supplementary materials were in Word. The online submission system kept complaining about my PDF file. After some googling and tweaking, it turned out the problem was caused by the indicator function symbol "\mathbbm{1}". Following the suggestion given here, I replaced it with "\mathds{1}". Problem solved!
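For anyone hitting the same issue, here is a minimal sketch of the replacement in a standalone LaTeX file. It assumes the dsfont package (which provides \mathds) is installed. The event {X > 0} is only an illustration and is not taken from my manuscript. In R Markdown, the same \usepackage line would typically go under header-includes in the YAML header.

```latex
% Minimal sketch: assumes the dsfont package is available in the TeX distribution.
\documentclass{article}
\usepackage{amsmath}  % for the cases environment and \text
\usepackage{dsfont}   % provides \mathds, used here in place of \mathbbm (bbm package)
\begin{document}
% Indicator function of the illustrative event {X > 0}.
\[
  \mathds{1}_{\{X > 0\}} =
  \begin{cases}
    1, & X > 0, \\
    0, & \text{otherwise}.
  \end{cases}
\]
\end{document}
```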
Sunday, November 20, 2016
Building Scalable Data Pipelines with Microsoft R Server and Azure Data Factory
Useful information on big data computation using the Microsoft platform.
Saturday, November 12, 2016
RStudio IDE Easy Tricks You Might’ve Missed
I missed a few of the tricks mentioned here. Very neat indeed!
Saturday, November 05, 2016