user3747260 - 1 year ago
R Question

environment not behaving as expected after using transformEnvir in RevoScaleR function

I have a function that reads an xdf file with rxXdfToDataFrame and uses a variable in the rowSelection expression. If I don't pass transformEnvir, the variable is not found. My problem is that after calling the function with transformEnvir, I can't seem to reliably access functions from loaded packages (setDT in the example below). If I hardcode a number into rowSelection, I don't need to use transformEnvir and everything works correctly. I tried setting the environment, but I'm not sure I was even doing it correctly.

The following code reproduces my problem:

library(RevoScaleR)
library(data.table)

envirtest = function() {
  df = data.frame(x=1:10)
  selectnum = 5
  rxDataFrameToXdf(df, "testxdf.xdf")
  testdf = rxXdfToDataFrame("testxdf.xdf",rowSelection=(x==selectnum),transformEnvir=environment())
  testdt = setDT(testdf)  # this is the call that fails
}

The error that occurs:

Error in envirtest() : could not find function "setDT"

However, if instead of
is used, then the function executes.

edit: I forgot to mention that I had tried it without transformEnvir set and everything worked properly. Also, tables() was changed to setDT() to avoid possible confusion.

Answer

Here is a solution to your problem, together with a partial explanation:

  • At the completion of the transformation, the transformation environment gets cleared.
  • This means it is safer to create a separate environment and add any required objects to it before calling the rx function.
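The effect of a callee clearing the environment it was given can be sketched in base R. eval_then_clear below is a hypothetical stand-in that mimics the "evaluate, then clear" behaviour described in the bullets. It is an assumption for illustration, not RevoScaleR's actual code, and it only covers the clearing part of the explanation.

```r
# Hypothetical stand-in for an rx function (assumption, not RevoScaleR code):
# evaluates `expr` in `envir`, then removes every object from `envir`.
eval_then_clear <- function(expr, envir) {
  result <- eval(expr, envir)
  rm(list = ls(envir = envir, all.names = TRUE), envir = envir)
  result
}

# Passing the function's own environment: its locals are wiped afterwards.
uses_own_env <- function() {
  selectnum <- 5
  eval_then_clear(quote(selectnum + 1), environment())
  exists("selectnum", envir = environment(), inherits = FALSE)
}

# Passing a dedicated environment: the function's locals survive.
uses_new_env <- function() {
  keep_me <- "still here"
  env <- new.env()
  env$selectnum <- 5
  eval_then_clear(quote(selectnum + 1), env)
  exists("keep_me", envir = environment(), inherits = FALSE)
}

own_env_intact <- uses_own_env()   # FALSE: selectnum was removed
new_env_intact <- uses_new_env()   # TRUE: only the throwaway env was cleared
```

With a throwaway environment, anything the callee does to it stays contained, so the calling function's own frame is left alone.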


env <- new.env()
env$selectnum = 5

Set up your function like this:

envirtest = function() {
  df = data.frame(x=1:10)
  env <- new.env()
  env$selectnum = 5

  rxDataFrameToXdf(df, "testxdf.xdf", overwrite=TRUE)
  testdf <- rxXdfToDataFrame("testxdf.xdf",
                             rowSelection=(x==selectnum),
                             transformEnvir=env)  # dedicated environment
  setDT(testdf)
}

Now try it:

x <- envirtest()

Rows Read: 10, Total Rows Processed: 10, Total Chunk Time: 0.006 seconds 
Rows Processed: 1
Time to read data file: 0.00 secs.
Time to convert to data frame: less than .001 secs.

str(x)

Classes ‘data.table’ and 'data.frame':  1 obs. of  1 variable:
 $ x: int 5
 - attr(*, ".internal.selfref")=<externalptr>