Ted Mosby - 1 month ago
R Question

Blanking on how to perform a chisq test in R

I know I learned this back in some class but I can't recall right now.

I have data like so:

dput(tbl)
structure(c(160L, 7094L, 0L, 0L, 3287L, 373L, 164L, 2406L, 0L,
0L, 33L, 0L, 0L, 0L, 0L, 122L, 20775L, 0L, 0L, 0L, 0L, 0L, 0L,
417L, 0L, 1709L, 0L, 0L, 471L, 0L, 499L, 0L, 0L, 0L, 1029L, 4399L,
3413L, 0L, 890L, 57L, 3185L, 0L, 0L, 1137L, 103L, 105L, 899L,
0L, 0L, 7L, 0L, 0L, 0L, 0L, 69L, 8852L, 0L, 0L, 0L, 0L, 0L, 0L,
53L, 0L, 776L, 0L, 0L, 222L, 0L, 193L, 0L, 0L, 0L, 312L, 1889L,
1417L, 0L, 352L), .Dim = c(39L, 2L), .Dimnames = structure(list(
c("ARSON", "ASSAULT", "BAD CHECKS", "BRIBERY", "BURGLARY",
"DISORDERLY CONDUCT", "DRIVING UNDER THE INFLUENCE", "DRUG/NARCOTIC",
"DRUNKENNESS", "EMBEZZLEMENT", "EXTORTION", "FAMILY OFFENSES",
"FORGERY/COUNTERFEITING", "FRAUD", "GAMBLING", "KIDNAPPING",
"LARCENY/THEFT", "LIQUOR LAWS", "LOITERING", "MISSING PERSON",
"NON-CRIMINAL", "OTHER OFFENSES", "PORNOGRAPHY/OBSCENE MAT",
"PROSTITUTION", "RECOVERED VEHICLE", "ROBBERY", "RUNAWAY",
"SECONDARY CODES", "SEX OFFENSES, FORCIBLE", "SEX OFFENSES, NON FORCIBLE",
"STOLEN PROPERTY", "SUICIDE", "SUSPICIOUS OCC", "TREA", "TRESPASS",
"VANDALISM", "VEHICLE THEFT", "WARRANTS", "WEAPON LAWS"),
c("Weekday", "Weekend")), .Names = c("", "")), class = "table")


I tried
chisq.test(tbl)
but the results come back as NA, most likely because of the zeros. Does anyone have any insight? I want to test for a difference between weekday and weekend; the crime types can be combined into a single "all crime" category.

Answer

Well, if you think it's the zeroes, try it without them:

> chisq.test(tbl[tbl[,1]!=0,])

    Pearson's Chi-squared test

data:  tbl[tbl[, 1] != 0, ]
X-squared = 194.13, df = 16, p-value < 2.2e-16

And that seems to produce some numbers.
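Note that the filter above only checks the Weekday column. In this table the zeros happen to occur in both columns of the same rows, so it works, but a row with 0 on weekdays and some crimes on weekends would be dropped too. A minimal sketch of a variant that drops only rows that are empty in both columns, using a toy four-row subset of the question's table (the names `toy`, `nonempty` and `res` are just for illustration):

```r
# Toy subset of the question's table (4 of the 39 rows), so this runs on its own.
# matrix() fills column by column: first Weekday, then Weekend.
toy <- matrix(c(160, 7094, 0, 3287,
                57, 3185, 0, 1137),
              ncol = 2,
              dimnames = list(c("ARSON", "ASSAULT", "BAD CHECKS", "BURGLARY"),
                              c("Weekday", "Weekend")))

# Keep only rows with at least one observation in either column.
# drop = FALSE keeps the result a matrix even if a single row survives.
nonempty <- toy[rowSums(toy) > 0, , drop = FALSE]

res <- chisq.test(nonempty)
res$statistic   # a finite number once the all-zero row is gone
res$parameter   # degrees of freedom: (3 rows - 1) * (2 columns - 1) = 2
```

On the full table the same idea would be chisq.test(tbl[rowSums(tbl) > 0, ]), which should select the same rows as the filter above, since no row there is zero in only one column.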

Is one pair of zeroes enough to throw it?

> chisq.test(rbind(tbl[tbl[,1]!=0,],c(0,0)))

    Pearson's Chi-squared test

data:  rbind(tbl[tbl[, 1] != 0, ], c(0, 0))
X-squared = NaN, df = 17, p-value = NA

Warning message:
In chisq.test(rbind(tbl[tbl[, 1] != 0, ], c(0, 0))) :
  Chi-squared approximation may be incorrect

Yes. Clearly, a crime type with 0 crimes on both weekdays and weekends can't add any information about whether weekdays or weekends are worse. I suppose you could submit a feature request for chisq.test to drop such rows and warn that it has done so, but I can't see that being implemented.
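If you want to see where the NaN comes from, look at the expected counts that chisq.test returns. A rough sketch, again on a toy four-row subset of the question's table rather than the full data (`toy` and `res_all` are illustrative names): a row whose total is 0 gets expected counts of 0, so its contribution (observed - expected)^2 / expected is 0/0.

```r
# Toy subset of the question's table, including one all-zero row.
toy <- matrix(c(160, 7094, 0, 3287,
                57, 3185, 0, 1137),
              ncol = 2,
              dimnames = list(c("ARSON", "ASSAULT", "BAD CHECKS", "BURGLARY"),
                              c("Weekday", "Weekend")))

# chisq.test warns "Chi-squared approximation may be incorrect" here,
# because some expected counts are below 5; silence it for the demo.
res_all <- suppressWarnings(chisq.test(toy))

# Expected counts are (row total * column total) / grand total,
# so the all-zero row has expected counts of 0 in both columns.
res_all$expected["BAD CHECKS", ]

# Each cell contributes (observed - expected)^2 / expected to the statistic.
# For the empty row that is 0/0, and one NaN makes the whole sum NaN.
res_all$statistic
```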
