Tuesday, December 31, 2019

Saturday, December 28, 2019


I enjoyed reading the book Agent-Based Modeling & Geographical Information Systems. I first got interested in agent-based modeling around 1999. At that time, the only software platform for this kind of modeling was Mathematica. Now, with free and open-source software like NetLogo, it is much easier to translate ideas into executable code and see the results.

A large number of agent-based models have been developed. OpenABM is an online repository for such models, the majority of which were developed in NetLogo.

Saturday, November 16, 2019

Interactive computing using C++ within the Atom editor

With the xeus-cling project, it is possible to do interactive computing using C++ inside a Jupyter notebook. With the Hydrogen project, I thought the same could be done inside the Atom editor. My first try failed. The problem was that while xeus-cling created a number of C++ kernels such as "C++11", "C++14", and "C++17", it did not create a kernel named "C++", which is what Hydrogen looks for. Following the suggestion mentioned in this post, I was able to get rid of the error message and get the system to work. Wow!

Wednesday, November 06, 2019

Spatial Data Analysis with INLA

A post on the use of INLA. Worth reading.

Tuesday, October 01, 2019

How to use math symbols with ggdag

The wonderful package ggdag can easily make a DAG like this:

However, what we really want to include in publications is something like this:

The second one can include subscripts and superscripts, among many other things. After some tweaking, I found a solution that is not perfect but usable for now.


```{r, echo=FALSE}
library(dplyr)    # %>% and arrange()
library(ggplot2)  # ggplot() and aes()
library(ggdag)    # dagify(), tidy_dagitty(), geom_dag_*()

dag <- dagify(Y1 ~ X + Z1 + Z0 + U + P,
              Y0 ~ Z0 + U,
              X ~ Y0 + Z1 + Z0 + P,
              Z1 ~ Z0,
              P ~ Y0 + Z1 + Z0,
              exposure = "X",
              outcome = "Y1")

# Sort by node name so the labels below can be given in alphabetical order
dag %>% 
  tidy_dagitty(layout = "auto", seed = 12345) %>%
  arrange(name) %>% 
  ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_dag_point() +
  geom_dag_edges() +
  geom_dag_text(parse = TRUE, label = c("P", "U", "X", expression(Y[0]), expression(Y[1]), expression(Z[0]), expression(Z[1])))
```

The trick here is to sort the tidy version of the DAG data by "name", so that the labels can be assigned in the order of the node names. I hope a more automated approach can be developed in the future.
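As a rough sketch of what a more automated approach might look like, the plotmath labels could be derived from the node names themselves instead of being typed by hand. The helper `to_plotmath()` below is hypothetical (it is not part of ggdag) and assumes node names made of letters followed by an optional run of digits. It uses base R only, and I have not tried passing the result to `geom_dag_text()`.

```r
# Node names used in the DAG above, in no particular order.
node_names <- c("Y1", "X", "Z1", "Z0", "U", "P", "Y0")

# Hypothetical helper: turn a trailing run of digits into a
# plotmath subscript, e.g. "Y0" -> "Y[0]". Names without digits
# are returned unchanged.
to_plotmath <- function(x) {
  sub("^([A-Za-z]+)([0-9]+)$", "\\1[\\2]", x)
}

# Sort first so the labels follow the same order as arrange(name).
label_strings <- to_plotmath(sort(node_names))

# Parse the strings into plotmath expressions.
label_exprs <- parse(text = label_strings)

print(label_strings)
```

The sorted strings are "P", "U", "X", "Y[0]", "Y[1]", "Z[0]", "Z[1]", which match the hand-written labels in the chunk above.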

By the way, with the package latex2exp, it is straightforward to use LaTeX instead of plotmath commands.
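A minimal sketch of the latex2exp route, assuming the package is installed: `TeX()` converts a LaTeX string into a plotmath expression, which can then be used wherever an `expression()` label would go. The particular labels here are only for illustration.

```r
library(latex2exp)

# Convert LaTeX strings into plotmath expressions.
lab_y0 <- TeX("$Y_0$")
lab_z1 <- TeX("$Z_1$")

# The results are ordinary R expressions, so they can stand in
# for hand-written labels such as expression(Y[0]).
print(is.expression(lab_y0))
print(is.expression(lab_z1))
```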

Friday, September 06, 2019

A useful disk.frame tutorial

Useful package and useful tutorial here.

Monday, August 26, 2019

Sunday, July 07, 2019

Saturday, July 06, 2019

V8 on Arch/Manjaro

There is finally a V8 package for Arch/Manjaro Linux that supports the V8 R package.

Wednesday, June 26, 2019

Thursday, June 13, 2019

R vs. Python for Data Science

Here is something from somebody who understands what he is talking about.

Tuesday, June 04, 2019

Top IDE index

Here is a ranking of IDEs. The same site also provides a ranking of programming languages.

Wednesday, May 29, 2019

Antergos is dead

Very unfortunate news. It was one of the useful efforts to bring Arch Linux to the masses. I tried it a few times but, in the end, I always found myself crawling back to Manjaro.

Sunday, March 24, 2019

Play with the cyphr package

The cyphr package seems to be a good choice for a small research group that shares sensitive data over the internet (e.g., via Dropbox). I ran a simple experiment myself to make sure it can actually serve my purpose.

I did my experiment on two computers (using openssl): I created the test data on my Linux workstation running Manjaro, and then I tried to access the data on a Windows 7 laptop.

For creating the data (Linux workstation):


# Create the test data

data_dir <- file.path("~/Dropbox/temp_files", "data")

# Encrypt the test data


key <- cyphr::data_key(data_dir)

filename <- file.path(data_dir, "iris.rds")

cyphr::encrypt(saveRDS(iris, filename), key)

# Cannot read the data without decrypting it


# Read the decrypted version of the data

head(cyphr::decrypt(readRDS(filename), key))

For accessing the data (Windows laptop):


key <- data_key("C:/Users/Ssong/Dropbox/temp_files/data", path_user = "C:/Users/Ssong/.ssh")

# Make data access request

data_request_access("C:/Users/Ssong/Dropbox/temp_files/data", path_user = "C:/Users/Ssong/.ssh")

On Windows 7, the system cannot locate the public key in "~/.ssh" on its own, which is pretty dumb, so the location has to be given explicitly through path_user.

Going back to the Linux workstation to approve the data access request:

# Review the request and approve (to share with other users)
req <- data_admin_list_requests(data_dir)
data_admin_authorise(data_dir, yes = TRUE)

Now I can access the data on my Windows laptop:

key <- data_key("C:/Users/Ssong/Dropbox/temp_files/data", path_user = "C:/Users/Ssong/.ssh")

d <- decrypt( readRDS( "C:/Users/Ssong/Dropbox/temp_files/data/iris.rds"), key)
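For reference, here is a sketch of the same request-and-approve workflow run on a single machine, with two throwaway key pairs standing in for the two computers. It assumes the cyphr package (and its openssl dependency) is installed. The names `path_key_admin` and `path_key_user` and the use of a temporary directory are my own choices for illustration, not part of the experiment above.

```r
library(cyphr)

# Throwaway SSH key pairs: one for the data owner, one for a collaborator.
path_key_admin <- ssh_keygen(password = FALSE)
path_key_user <- ssh_keygen(password = FALSE)

# A temporary directory stands in for the shared Dropbox folder.
data_dir <- file.path(tempdir(), "cyphr_demo_data")
dir.create(data_dir)

# Owner: set up the directory for encrypted data and save an encrypted file.
data_admin_init(data_dir, path_user = path_key_admin)
key_admin <- data_key(data_dir, path_user = path_key_admin)
filename <- file.path(data_dir, "iris.rds")
encrypt(saveRDS(iris, filename), key_admin)

# Collaborator: request access using their own key pair.
data_request_access(data_dir, path_user = path_key_user)

# Owner: review the pending request and approve it.
req <- data_admin_list_requests(data_dir)
data_admin_authorise(data_dir, path_user = path_key_admin, yes = TRUE)

# Collaborator: now able to obtain the data key and decrypt the file.
key_user <- data_key(data_dir, path_user = path_key_user)
d <- decrypt(readRDS(filename), key_user)
head(d)
```

In the real setup the two `path_user` values live on different computers and only the shared data directory travels through Dropbox.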

Monday, February 18, 2019

‘Meta’ machine learning packages in R

Instead of using multiple packages for different machine learning tasks, it is now possible to use one of those meta packages to do (almost) all of them.

Wednesday, February 06, 2019

Thursday, January 31, 2019

How GPL makes me leave R for Python

Informative post, especially David's response.