Monday, February 28, 2011


RStudio, an open source cross-platform IDE for R, looks really attractive. It probably will not drag me away from Emacs, but having more options is always a good thing.

The only downside is that it automatically installs R from the Ubuntu software repository, even though I already have a manually compiled version installed on the system. It would be better if RStudio did not install a version of R by default but left the choice to the user.

I think this IDE is an excellent choice for teaching purposes: until now, the only reason I have hesitated to teach my students with R rather than another statistical package has been the lack of a "modern" IDE. I mean, Emacs + ESS suits me just fine, but I cannot imagine what would happen if I tried to make my undergraduate students learn to do this on their Windows machines!

After a few tweaks, and with many thanks to the guys working on RStudio, I was able to compile and install the source distribution.

Commands used:

  1. git clone
  2. git submodule update --init --recursive
  3. cmake -DRSTUDIO_TARGET=Desktop -DCMAKE_BUILD_TYPE=Release
  4. sudo make install

Sunday, February 27, 2011

LibreOffice 3.3.1

LibreOffice 3.3.1 starts up much faster than both LibreOffice 3.3 and OpenOffice. This improvement makes a real difference for an ordinary user.

Saturday, February 26, 2011

Simulate parameters

This blog shows how to simulate parameters from a tobit model with a few lines of R code.
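
The linked post's own code is not reproduced here. As a rough sketch of the general idea only, and under my own assumptions: the data below are made up, the tobit model is fitted with survreg() from the survival package as a left-censored Gaussian regression, and the parameters are drawn with mvrnorm() from MASS using the usual normal approximation around the estimates. All variable names are hypothetical.

```r
library(survival)  # survreg(), Surv()
library(MASS)      # mvrnorm()

set.seed(1)

# Made-up data: a latent outcome that is observed only when positive
n <- 500
x <- rnorm(n)
y_star <- 1 + 2 * x + rnorm(n)
y <- pmax(y_star, 0)  # left-censored at 0

# Tobit model written as a Gaussian regression with left censoring;
# the event indicator is TRUE when y is actually observed
fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")

# vcov(fit) also covers log(scale), so append it to the coefficient vector
mu <- c(coef(fit), "Log(scale)" = log(fit$scale))

# 1000 draws from the approximate sampling distribution of the parameters
draws <- mvrnorm(1000, mu = mu, Sigma = vcov(fit))

# Simulated summaries for each parameter
apply(draws, 2, quantile, probs = c(0.025, 0.5, 0.975))
```

Each row of draws is one simulated parameter vector (intercept, slope, log of the scale), which can then be pushed through whatever quantity of interest is needed.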

Tuesday, February 22, 2011

Population Studies

Population Studies is one of my favorite academic journals. Only recently did it start accepting PDF submissions. This is a very positive development.

Tuesday, February 15, 2011

Sunday, February 13, 2011

A few summary functions used by ggplot2

I could not quite figure out what the summary function "mean_sdl" stands for, so I took a look at the source code in "stat-summary.r". It turns out that it is a wrapper around "smean.sdl()" from the Hmisc package, which returns the mean plus or minus a multiple of the standard deviation. Likewise, "median_hilow", "mean_cl_normal" and "mean_cl_boot" are wrappers around the corresponding Hmisc functions.
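
To make the meaning concrete, here is a small base R sketch of what "mean_sdl" computes. It is not the ggplot2 or Hmisc source: mean_sdl_sketch is a hypothetical name, and the default multiplier of 2 is my understanding of the Hmisc default.

```r
# Hypothetical stand-in for mean_sdl: mean plus/minus mult standard deviations,
# returned in the y / ymin / ymax layout that ggplot2 summaries use
mean_sdl_sketch <- function(x, mult = 2) {
  x <- x[!is.na(x)]  # drop missing values before summarising
  m <- mean(x)
  s <- sd(x)
  data.frame(y = m, ymin = m - mult * s, ymax = m + mult * s)
}

mean_sdl_sketch(c(1, 2, 3, 4, 5))
```

So "sdl" reads as "standard deviation limits": the interval is a multiple of the sample standard deviation around the mean, not a confidence interval for the mean.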

Friday, February 11, 2011

Simulating second difference using Zelig

I am trying to simulate second differences using Zelig. Here is my code:


# estimation

library(Zelig)
data(turnout)  # example data set that ships with Zelig

z.out <- zelig(vote ~ race*age + educate + income,
               model = "logit", data = turnout)


# first difference

x.low <- setx(z.out, educate = 12)
x.high <- setx(z.out, educate = 16)

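# with both x and x1 supplied, the first difference is in s.out$qi$fd
s.out <- sim(z.out, x = x.low, x1 = x.high)

# the same quantity computed by hand from two separate sim() calls
s.low <- sim(z.out, x = x.low)
s.high <- sim(z.out, x = x.high)

dif <- (s.high$qi$ev - s.low$qi$ev)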

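# second difference

# four profiles: educate (12 vs 16) crossed with age (20 vs 30)
x.low.low <- setx(z.out, educate = 12, age = 20)
x.low.high <- setx(z.out, educate = 12, age = 30)
x.high.low <- setx(z.out, educate = 16, age = 20)
x.high.high <- setx(z.out, educate = 16, age = 30)

# first difference for age (20 -> 30) at educate = 12 and at educate = 16
s1 <- sim(z.out, x = x.low.low, x1 = x.low.high)
s2 <- sim(z.out, x = x.high.low, x1 = x.high.high)

# second difference: how the age effect changes with education
did1 <- s1$qi$fd - s2$qi$fd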

# or, equivalently, difference the education effect (12 -> 16) across ages

s3 <- sim(z.out, x = x.low.low, x1 = x.high.low)
s4 <- sim(z.out, x = x.low.high, x1 = x.high.high)

did2 <- s3$qi$fd - s4$qi$fd

It would be great if Zelig could do this directly, though.
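
For comparison, the same quantity can be sketched by hand without Zelig. This is only an illustration under my own assumptions: the data are simulated rather than the turnout data, the model is a plain glm() logit without the race*age interaction, and the helper ev() is a hypothetical name. The one design choice worth noting is that all four profiles reuse a single set of parameter draws.

```r
library(MASS)  # mvrnorm()

set.seed(42)

# Made-up data standing in for the turnout example
n <- 2000
d <- data.frame(educate = sample(8:18, n, replace = TRUE),
                age = sample(18:80, n, replace = TRUE))
d$vote <- rbinom(n, 1, plogis(-3 + 0.2 * d$educate + 0.02 * d$age))

fit <- glm(vote ~ educate + age, family = binomial, data = d)

# One shared set of parameter draws for every profile
beta <- mvrnorm(1000, mu = coef(fit), Sigma = vcov(fit))

# Expected value (predicted probability) per draw at a given profile
ev <- function(educate, age) {
  drop(plogis(beta %*% c(1, educate, age)))
}

# Age effect (20 -> 30) at educate = 12 minus the age effect at educate = 16
did1 <- (ev(12, 30) - ev(12, 20)) - (ev(16, 30) - ev(16, 20))

# Education effect (12 -> 16) at age 20 minus the education effect at age 30
did2 <- (ev(16, 20) - ev(12, 20)) - (ev(16, 30) - ev(12, 30))

quantile(did1, c(0.025, 0.5, 0.975))
```

Because the draws are shared, did1 and did2 agree draw by draw here, which is the sense in which the two orderings are equivalent.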

Monday, February 07, 2011

R, windows, linux, etc.

For some reason I had to work on a Windows machine for the last couple of days, so I installed Revolution R and played with it. It is an elegant piece of software with a clearly defined target user group, and it fits the Windows world well. On the other hand, this means that it has distanced itself from the free software world as represented by Linux and R. Linux and R make it easy to get your hands dirty by playing with the C/C++ code under the surface, and they blur the distinction between "developer" and "user". In the Windows world (and, I assume, in the Mac world too) that distinction is prominent: viewing, changing, and compiling source code is made very difficult, and binary files are deified.