I am running a GEE logistic regression model for my fetal loss paper. As usual, I compare the results from Stata and R to make sure they are consistent. To my surprise, the models assuming an independence working correlation structure give similar results in both packages, but the models assuming an exchangeable correlation structure give drastically different results.

It turns out that only one woman in my sample reported a total of eleven pregnancies (all others reported ten or fewer), and the presence of this single observation had a huge influence on the algorithm used in R but not on the one used in Stata. After excluding this single observation, the two sets of results look identical.
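To make the check concrete, here is a minimal sketch of how a cluster with a one-of-a-kind size can be spotted and set aside before refitting. It is written in plain Python rather than R or Stata, it is not the code used for the paper, and it does not fit any GEE model. The toy data and the names `woman_ids`, `cluster_sizes`, and `lone_size_clusters` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def cluster_sizes(ids):
    """Return a mapping from cluster id to its number of observations."""
    return Counter(ids)


def lone_size_clusters(ids):
    """Return the ids of clusters whose size is shared by no other cluster."""
    sizes = cluster_sizes(ids)
    # How many clusters have each size (e.g. how many women report 2 pregnancies).
    size_freq = Counter(sizes.values())
    return sorted(cid for cid, n in sizes.items() if size_freq[n] == 1)


# Hypothetical toy data: one entry per pregnancy, labelled by woman id.
woman_ids = ["a"] * 2 + ["b"] * 2 + ["c"] * 3 + ["d"] * 3 + ["e"] * 11

flagged = lone_size_clusters(woman_ids)
# Drop every observation belonging to a flagged cluster before refitting.
kept = [w for w in woman_ids if w not in flagged]

print("flagged clusters:", flagged)
print("observations before/after:", len(woman_ids), len(kept))
```

Refitting the exchangeable model on the reduced data and comparing the coefficients with the full-sample fit is then a simple way to see how much a single cluster drives the estimates in each package.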


## 5 comments:

Hi,

I am not a statistician, but statistics has been a favorite subject of mine recently. So, based on your article, do you mean that R is more sensitive than Stata? Is that good or bad? Have you already published your paper, so I can get more explanation? Thanks.

That's what the results seem to suggest. It will be worthwhile to dig deeper to figure out how these different packages handle such "abnormal" cases.

My paper is not about GEE; instead, it is a demographic study of involuntary fetal loss that makes use of GEE and statistical simulation.

how did you assess influence in R in the GEE model?

Hi Shige,

How did you assess influence in R for the GEE model? I get errors when I try influence.measures(model). Would be curious to find out how you did it?

how did you assess influence in R in the GEE model?
