RockJake28 - 1 month ago

Question

glmer gives different results on different machines (non-deterministic)

Is there any reason why the glmer function from lme4 would produce different results on different machines? The hardware in the machines is substantially different, though they are all running the same OS, R and package versions (it turns out this is not actually true; see the edit below).

The formula has a grouped binomial response variable and 22 continuous fixed effects, all on the same scale, plus several random effects (which are strings), and I am using the logit link function.

cbind(ill, not_ill) ~ 0 + fix1 + fix2 + ... + fix22 + (1|id/region/country) +
(1|season)
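For concreteness, a fitting call matching this formula might look like the sketch below. The data frame name `dat` is an assumption, and fix3 through fix21 are elided for brevity; this is not the poster's actual code.

```r
library(lme4)

# Sketch of a fit matching the formula above; `dat` is an assumed
# data frame and fix3 ... fix21 are elided for brevity
fit <- glmer(
  cbind(ill, not_ill) ~ 0 + fix1 + fix2 + fix22 +
    (1 | id/region/country) + (1 | season),
  data   = dat,
  family = binomial(link = "logit")
)
```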


When using a train and test data set for leave-one-out cross-validation, I get very similar results. However, on one machine I get consistently clean output with no warnings; on another I get convergence warnings on every fold of the test.
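One way to check whether such warnings are an artifact of a particular optimizer rather than the data is to refit with every available optimizer and compare the results. This is a sketch assuming a fitted model object `fit` from one fold; allFit() ships with recent lme4 releases.

```r
library(lme4)

# Refit an assumed model `fit` with all available optimizers;
# if the estimates agree closely across optimizers, the
# convergence warnings are likely benign
all_fits <- allFit(fit)
ss <- summary(all_fits)
ss$fixef   # fixed-effect estimates, one row per optimizer
ss$msgs    # warnings/messages produced by each optimizer
```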

N.B. The train/test sets are identical across machines

EDIT: adding sessionInfo() output.


Machine 1 (this is the one that puts out nice results):

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] blmeco_1.1 arm_1.9-1 MASS_7.3-45 lme4_1.1-12 Matrix_1.2-7.1

loaded via a namespace (and not attached):
[1] minqa_1.2.4 coda_0.18-1 abind_1.4-5 Rcpp_0.12.7
[5] MuMIn_1.15.6 splines_3.3.1 nlme_3.1-128 grid_3.3.1
[9] nloptr_1.0.4 stats4_3.3.1 lattice_0.20-34


Machine 2 (not-so-nice results):

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_GB.UTF-8 LC_COLLATE=en_GB.UTF-8
[5] LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] blmeco_1.1 arm_1.9-1 MASS_7.3-45 lme4_1.1-12 Matrix_1.2-3

loaded via a namespace (and not attached):
[1] minqa_1.2.4 coda_0.18-1 abind_1.4-5 Rcpp_0.12.7
[5] MuMIn_1.15.6 splines_3.2.3 nlme_3.1-124 grid_3.2.3
[9] nloptr_1.0.4 stats4_3.2.3 lattice_0.20-33


Obviously there are a few differences here that I missed, so I will rectify that and see whether the output changes. Of the differences that exist, Matrix is the one most likely to be causing an issue, as (I think) it is a dependency of lme4. Thanks for the comments that led me here.

Answer

I'm not sure what you mean by "non-deterministic" here; I would usually take that to mean that successive runs of the same code, on the same machine, could give different results.

For large, unstable problems it would be mildly surprising, but not impossible, to get different results on different hardware platforms under the same operating system. We certainly see cases where the same version of the package (same R and C++ code) gives different results when compiled with different compilers under different operating systems. If those differences fall on either side of a tolerance test, then you will get warnings in one case and not in the other. I would be more concerned by how far apart the estimates are on different platforms than by whether you get warnings or not.
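As a concrete illustration of that last point, one could export the coefficient vectors from both machines and compare them directly. Here `est1` and `est2` are assumed to be fixef() vectors saved from each machine; differences near machine precision are expected across hardware, while differences in the leading significant digits would be worrying.

```r
# Compare fixed-effect estimates exported from the two machines;
# est1/est2 are assumed vectors saved via fixef(fit) on each machine
abs_diff <- max(abs(est1 - est2))
rel_diff <- max(abs(est1 - est2) / pmax(abs(est1), .Machine$double.eps))
abs_diff  # absolute worst-case discrepancy
rel_diff  # relative worst-case discrepancy
```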

It would certainly narrow things down to make sure you were doing everything as similarly as possible (e.g. you are still using different versions of R and, as you pointed out, different versions of Matrix, on the different machines ...).
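A quick sanity check before re-running, using standard base-R and utils functions, is to confirm that the components relevant to lme4 actually match across machines:

```r
# Confirm the pieces that matter for lme4 reproducibility
R.version.string
packageVersion("lme4")
packageVersion("Matrix")
packageVersion("Rcpp")

# Or capture the full environment to a file for diffing across machines
writeLines(capture.output(sessionInfo()), "session_info.txt")
```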

Comments