# Shige's Research Blog

## Monday, August 15, 2016

### Sparklyr

The new sparklyr package from RStudio provides a convenient interface between R/RStudio and Spark. It runs well on Linux; it also works on Windows with Spark 1.6.2 and lower. For some reason, it does not work with Spark 2.0 on Windows. I assume this will be fixed in subsequent releases.

## Saturday, July 16, 2016

### LaplacesDemon is back

Looks like the LaplacesDemon package is back. Now we have a pure R-based Bayesian computation platform again.

## Friday, July 01, 2016

### Microsoft Analytics in 2016

Here is a thorough introduction to the data science solutions offered by Microsoft.

## Saturday, June 18, 2016

## Wednesday, May 25, 2016

## Wednesday, April 06, 2016

## Thursday, March 31, 2016

## Friday, February 26, 2016

### Multiple imputation using R

R has a long list of packages for multiple imputation. The main problem is integration: statistical procedures in other packages may or may not work with the imputation procedures. I have been using Amelia together with Zelig; because they were written by the same group, they work well together. However, I have had trouble getting multiple imputation to work with the plm package. After searching the internet, I found the following solution:

- Impute the missing data using Amelia or mice.
- Estimate the model on each imputed data set.
- Use the mitools package to extract and combine results.

Here is a simple example:

...

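    library(mice)     # multiple imputation
    library(mitools)  # imputationList(), MIextract(), MIcombine()
    library(plm)      # panel data models

    imp <- mice(d)    # m = 5 imputed data sets by default

    # Collect the completed data sets (recent mice versions name the first argument `data`, not `x`)
    mydata <- imputationList(lapply(seq_len(imp$m), function(i) mice::complete(imp, i)))

    # Fit the same pooled model on each completed data set
    fit <- lapply(mydata$imputations, function(x) {
      plm(cog3pl ~ oc + grade9 + boy + han + ruralbirth, data = x,
          index = c("schids"), model = "pooling")
    })

    # Extract the coefficients and covariance matrices, then combine them
    betas <- MIextract(fit, fun = coef)
    vars <- MIextract(fit, fun = vcov)
    summary(MIcombine(betas, vars))

I bet this will work for most, if not all, estimation procedures in R.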
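To make the combining step less of a black box, here is a minimal base-R sketch of Rubin's rules, which is the kind of pooling that `MIcombine` performs. The helper `pool_rubin` is a hypothetical name of my own, not part of mitools, and it returns only pooled estimates and standard errors (no degrees of freedom or missing-information rates). The five `lm` fits use simulated stand-in data rather than real imputations, just to exercise the arithmetic.

```r
# Hypothetical helper (not from mitools): pool m sets of estimates with Rubin's rules.
pool_rubin <- function(betas, vars) {
  m <- length(betas)
  est <- do.call(rbind, betas)      # m x k matrix of coefficient estimates
  qbar <- colMeans(est)             # pooled point estimates
  # Within-imputation variance: average of the m sampling variances
  ubar <- Reduce(`+`, lapply(vars, diag)) / m
  # Between-imputation variance: variance of the estimates across the m fits
  b <- apply(est, 2, var)
  total <- ubar + (1 + 1 / m) * b   # total variance
  data.frame(estimate = qbar, se = sqrt(total))
}

# Simulated stand-ins for m = 5 completed data sets (assumption: illustration only)
set.seed(1)
fits <- lapply(1:5, function(i) {
  d <- data.frame(x = rnorm(100))
  d$y <- 1 + 2 * d$x + rnorm(100)
  lm(y ~ x, data = d)
})

pooled <- pool_rubin(lapply(fits, coef), lapply(fits, vcov))
print(pooled)
```

Any list of fitted models that supports `coef` and `vcov` can be fed through the same two `lapply` calls, which is why the approach carries over to plm and, I expect, to most other estimation functions.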

Labels: missing values, multiple imputation

## Sunday, February 21, 2016

### Another text analysis package

Quanteda seems to be a serious contender for analyzing textual data using R.
