Brandon McCormick - 3 months ago
R Question

Error in FUN(X[[i]], ...) : object 'X' not found

I have the following data set:

> str(e.2015.1990)
'data.frame': 4813807 obs. of 42 variables:
$ GAME.ID : Factor w/ 60464 levels "ANA201504100",..: 1 1 1 1 1 1 1 1 1 1 ...
$ INNING : num 1 1 1 1 1 1 1 1 1 2 ...
$ BATTING.TEAM : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 2 2 2 1 ...
$ OUTS : int 0 1 2 2 2 2 0 1 2 0 ...
$ BATTER : Factor w/ 5107 levels "abrej003","ackld001",..: 73 167 33 120 163 100 34 256 200 209 ...
$ BATTER.HAND : Factor w/ 2 levels "L","R": 2 1 2 1 2 1 1 2 2 2 ...
$ RES.BATTER : Factor w/ 5107 levels "abrej003","ackld001",..: 73 167 33 120 163 100 34 256 200 209 ...
$ RES.BATTER.HAND : Factor w/ 2 levels "L","R": 2 1 2 1 2 1 1 2 2 2 ...
$ PITCHER : Factor w/ 3481 levels "abadf001","albem001",..: 187 187 187 187 187 187 204 204 204 187 ...
$ PITCHER.HAND : Factor w/ 2 levels "L","R": 1 1 1 1 1 1 1 1 1 1 ...
$ RES.PITCHER : Factor w/ 3481 levels "abadf001","albem001",..: 187 187 187 187 187 187 204 204 204 187 ...
$ RES.PITCHER.HAND : Factor w/ 2 levels "L","R": 1 1 1 1 1 1 1 1 1 1 ...
$ FIRST.RUNNER : Factor w/ 4369 levels "","abrej003",..: 1 1 1 1 104 140 1 1 1 1 ...
$ SECOND.RUNNER : Factor w/ 4048 levels "","abrej003",..: 1 1 1 26 1 90 1 1 1 1 ...
$ THIRD.RUNNER : Factor w/ 3729 levels "","ackld001",..: 1 1 1 1 1 1 1 1 1 1 ...
$ EVENT.TEXT : chr "63/G" "6/P" "D8/L+" "S9/G.2-H" ...
$ EVENT.TYPE : num 1 1 19 18 18 1 1 1 1 1 ...
$ AB.FLAG : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ HIT.VALUE : int 1 1 3 2 2 1 1 1 1 1 ...
$ SH.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ SF.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ DOUBLE.PLAY.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ TRIPLE.PLAY.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ RBI.ON.PLAY : num 0 0 0 1 0 0 0 0 0 0 ...
$ BATTED.BALL.TYPE : Factor w/ 5 levels "","F","G","L",..: 3 5 4 3 4 5 3 3 5 4 ...
$ BATTER.DEST : int 0 0 2 1 1 0 0 0 0 0 ...
$ RUNNER.ON.1ST.DEST : int 0 0 0 0 2 1 0 0 0 0 ...
$ RUNNER.ON.2ND.DEST : int 0 0 0 4 0 2 0 0 0 0 ...
$ RUNNER.ON.3RD.DEST : int 0 0 0 0 0 0 0 0 0 0 ...
$ SB.FOR.RUNNER.ON.1ST.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ SB.FOR.RUNNER.ON.2ND.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ SB.FOR.RUNNER.ON.3RD.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ CS.FOR.RUNNER.ON.1ST.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ CS.FOR.RUNNER.ON.2ND.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ CS.FOR.RUNNER.ON.3RD.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ PO.FOR.RUNNER.ON.1ST.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ PO.FOR.RUNNER.ON.2ND.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ PO.FOR.RUNNER.ON.3RD.FLAG : logi FALSE FALSE FALSE FALSE FALSE FALSE ...
$ RESPONSIBLE.PITCHER.FOR.RUNNER.ON.1ST: Factor w/ 3433 levels "","albua001",..: 1 1 1 1 161 161 1 1 1 1 ...
$ RESPONSIBLE.PITCHER.FOR.RUNNER.ON.2ND: Factor w/ 3408 levels "","abadf001",..: 1 1 1 133 1 133 1 1 1 1 ...
$ RESPONSIBLE.PITCHER.FOR.RUNNER.ON.3RD: Factor w/ 3337 levels "","abadf001",..: 1 1 1 1 1 1 1 1 1 1 ...
$ EVENT.NUM : Factor w/ 177 levels "1","10","100",..: 1 90 101 112 123 134 145 156 167 2 ...


I was able to create the following data sets successfully:

p.hit = aggregate(x = list(HIT = e.2015.1990$HIT.VALUE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x > 1))
p.single = aggregate(x = list(SINGLE = e.2015.1990$HIT.VALUE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 2))
p.double = aggregate(x = list(DOUBLE = e.2015.1990$HIT.VALUE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 3))
p.triple = aggregate(x = list(TRIPLE = e.2015.1990$HIT.VALUE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 4))
p.home.run = aggregate(x = list(HOME.RUN = e.2015.1990$HIT.VALUE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 5))
p.at.bat = aggregate(x = list(AT.BAT = e.2015.1990$AB.FLAG), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == "TRUE"))
p.rbi = aggregate(x = list(RBI = e.2015.1990$RBI.ON.PLAY), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x > 0))
p.sf = aggregate(x = list(SACRIFICE.FLY = e.2015.1990$SF.FLAG), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == "TRUE"))
p.hbp = aggregate(x = list(HIT.BY.PITCH = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 16))
p.ibb = aggregate(x = list(INTENTIONAL.WALK = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == 15))


However, when I try to create the following data sets in the same way:

p.sh = aggregate(x = list(SACRIFICE.HIT = e.2015.1990$SH.FLAG), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(X == "TRUE"))
p.so = aggregate(x = list(STRIKE.OUT = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X == 3))
p.ha = aggregate(x = list(HITS.ALLOWED = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X > 1))
p.hb = aggregate(x = list(HIT.BATSMAN = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X == 16))


I get the same error message:

> p.sh = aggregate(x = list(SACRIFICE.HIT = e.2015.1990$SH.FLAG), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(X == "TRUE"))
Error in FUN(X[[i]], ...) : object 'X' not found
> p.so = aggregate(x = list(STRIKE.OUT = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X == 3))
Error in FUN(X[[i]], ...) : object 'X' not found
> p.ha = aggregate(x = list(HITS.ALLOWED = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X > 1))
Error in FUN(X[[i]], ...) : object 'X' not found
> p.hb = aggregate(x = list(HIT.BATSMAN = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(X == 16))
Error in FUN(X[[i]], ...) : object 'X' not found


What's the difference? What's going on here, and how do I fix it?

In similar questions that I found, this error appeared to come from some kind of self-reference, where a variable refers to itself. However, that's not the case here.

Thank you for your help!

Answer

It seems to be just a letter-case problem.

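In the code that produced the error, you wrote X instead of x inside the sum() call. R is case-sensitive, and the argument of your anonymous function is named x, so no object called X exists inside it.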
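To see the effect in isolation, here is a minimal sketch on a made-up three-row data frame (the names d, g, v, ok and msg are only illustrative and are not part of your data). The first call uses x consistently and works; the second uses X in the body and should fail with the same kind of "object 'X' not found" message, which tryCatch captures here so the script keeps running.

```r
# Toy data, invented for illustration (not the asker's data set)
d <- data.frame(g = c("a", "a", "b"), v = c(3, 1, 3))

# Argument is lowercase x and the body uses lowercase x: this works
ok <- aggregate(x = list(N = d$v), by = list(G = d$g),
                FUN = function(x) sum(x == 3))

# Argument is lowercase x but the body uses uppercase X: R looks for an
# object named X, finds none, and raises an error, which we capture as text
msg <- tryCatch(
  aggregate(x = list(N = d$v), by = list(G = d$g),
            FUN = function(x) sum(X == 3)),
  error = function(e) conditionMessage(e)
)

print(ok)
print(msg)
```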

Please try the following:

p.sh = aggregate(x = list(SACRIFICE.HIT = e.2015.1990$SH.FLAG), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$BATTER), FUN = function(x) sum(x == "TRUE"))
p.so = aggregate(x = list(STRIKE.OUT = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(x == 3))
p.ha = aggregate(x = list(HITS.ALLOWED = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(x > 1))
p.hb = aggregate(x = list(HIT.BATSMAN = e.2015.1990$EVENT.TYPE), by = list(GAME.ID = e.2015.1990$GAME.ID, PLAYER.ID = e.2015.1990$RES.PITCHER), FUN = function(x) sum(x == 16))
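
As a side note, for logical columns such as SH.FLAG the comparison with the string "TRUE" is not strictly needed, because sum() counts TRUE as 1. The sketch below checks this on a small invented data frame (toy, a and b are hypothetical names; only the column names mimic yours), so treat it as an illustration rather than a tested replacement for your full data set.

```r
# Hypothetical toy data; the column names only mimic the question's
toy <- data.frame(
  GAME.ID = c("g1", "g1", "g1", "g2"),
  BATTER  = c("p1", "p1", "p2", "p1"),
  SH.FLAG = c(TRUE, FALSE, TRUE, TRUE)
)

# Comparing with the string "TRUE" works through type coercion ...
a <- aggregate(x = list(SACRIFICE.HIT = toy$SH.FLAG),
               by = list(GAME.ID = toy$GAME.ID, PLAYER.ID = toy$BATTER),
               FUN = function(x) sum(x == "TRUE"))

# ... and summing the logical vector directly counts the TRUE values too
b <- aggregate(x = list(SACRIFICE.HIT = toy$SH.FLAG),
               by = list(GAME.ID = toy$GAME.ID, PLAYER.ID = toy$BATTER),
               FUN = sum)

# Compare the two sets of per-group counts
print(all(a$SACRIFICE.HIT == b$SACRIFICE.HIT))
```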