Tuesday, October 09, 2012

Prediction, missing data, etc. in Stan

library(rstan)

N <- 1001
N_miss <- ceiling(N / 10)
N_obs <- N - N_miss

mu <- 3
sigma <- 2

y_obs <- rnorm(N_obs, mu, sigma)

missing_data_code <-
'
data {
  int N_obs;
  int N_miss;
  real y_obs[N_obs];
}
parameters {
  real mu;
  real<lower=0> sigma;  // scale must be positive; unbounded, the sampler proposes negative values
  real y_miss[N_miss];
}
model {
  // add prior on mu and sigma here if you want
  y_obs ~ normal(mu,sigma);
  y_miss ~ normal(mu,sigma);
}
generated quantities {
  real y_diff;
  y_diff = y_miss[101] - y_miss[1];
}
'

results <- stan(model_code = missing_data_code,
                data = list(N_obs = N_obs, N_miss = N_miss, y_obs = y_obs))

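# Same quantity computed after the fact. With permuted = FALSE, extract() returns an
# iterations x chains x parameters array, so diff() over the last dimension
# gives y_miss[101] - y_miss[1] for every draw.
y_diff <- apply(extract(results, pars = c("y_miss[1]", "y_miss[101]"), permuted = FALSE),
                1:2, diff)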
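
A minimal sketch of what that last line does, without needing Stan installed. It assumes the draws come back as an iterations x chains x parameters array; the `draws` array below is a made-up stand-in filled with plain normal draws, not real sampler output, and its dimensions (1000 iterations, 4 chains) are chosen only for illustration.

```r
set.seed(1)

# Hypothetical stand-in for sampler output: 1000 iterations, 4 chains, 2 parameters.
# These are ordinary normal draws, not posterior draws from the model above.
draws <- array(rnorm(1000 * 4 * 2, mean = 3, sd = 2),
               dim = c(1000, 4, 2),
               dimnames = list(iterations = NULL,
                               chains = paste0("chain:", 1:4),
                               parameters = c("y_miss[1]", "y_miss[101]")))

# For each (iteration, chain) cell, diff() of the length-2 parameter vector
# gives y_miss[101] - y_miss[1].
y_diff <- apply(draws, 1:2, diff)

dim(y_diff)  # 1000 x 4: one difference per iteration and chain

# The same thing written as an explicit subtraction
y_diff_direct <- draws[, , "y_miss[101]"] - draws[, , "y_miss[1]"]
all.equal(unname(y_diff), unname(y_diff_direct))

# Summaries one might compare against the y_diff generated quantity
mean(y_diff)
quantile(y_diff, c(0.025, 0.5, 0.975))
```

With real output, these summaries should agree with the ones Stan reports for the `y_diff` generated quantity, since both are computed from the same draws.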

5 comments:

Unknown said...

Trying this code produced the following error and warning messages:

SAMPLING FOR MODEL 'missing_data_code' NOW (CHAIN 4).
Error in function stan::prob::normal_log(N4stan5agrad3varE): Scale parameter is -1.76001910463052:0, but must be > 0!
Rejecting proposed initial value with zero density.
Iteration: 1 / 2000 [ 0%] (Warmup)

Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::normal_log(N4stan5agrad3varE): Scale parameter is -14285.211098355563:0, but must be > 0!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::normal_log(N4stan5agrad3varE): Scale parameter is -408.24603833169351:0, but must be > 0!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::normal_log(N4stan5agrad3varE): Scale parameter is -0.59762021868567095:0, but must be > 0!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

Informational Message: The current Metropolis proposal is about to be rejected becuase of the following issue:
Error in function stan::prob::normal_log(N4stan5agrad3varE): Scale parameter is -0.11291993295658864:0, but must be > 0!
If this warning occurs sporadically, such as for highly constrained variable types like covariance matrices, then the sampler is fine,
but if this warning occurs often then your model may be either severely ill-conditioned or misspecified.

How can we deal with these?

Shige said...

According to the Stan team, the "informational messages" should not be a problem as long as they do not show up too frequently.

Unknown said...

Clearly, this message occurs more than once. It occurs in other chains as well. Does that count as sporadic? Or is it necessary to look for possible ill-conditioning or misspecification?

Shige said...

It runs on simulated data and the results seem fine. If you post this question on the Stan user list, you will probably get a short answer assuring you that things are fine and that you don't need to worry about it.
