Nick - 1 year ago 250

R Question

I am having trouble creating a data frame of `prop.test` results.

Sample data:

```
set.seed(1234)
tc <- sample(c("test", "control"), 1000, replace = TRUE, prob = c(.8, .2))
# three target groups, so that the three probabilities match the number of levels
target <- sample(LETTERS[1:3], 1000, replace = TRUE, prob = c(1/3, 1/3, 1/3))
pa <- sample(c(0, 1), 1000, replace = TRUE)
sc <- sample(c(0, 1), 1000, replace = TRUE)
ig <- sample(c(0, 1), 1000, replace = TRUE)
test <- data.frame(tc, target, pa, sc, ig)
```

Run `prop.test`:

```
# define loop variables
target_var <- c("A", "B")        # targets
metric <- c("pa", "sc", "ig")    # columns to loop through

# loop through combinations of targets and metrics and run prop.test
for (i in target_var) {
  for (j in metric) {
    d <- subset(test, target == i)
    X <- d[, "tc"]
    Y <- d[, j]
    print(prop.test(table(X, Y), c(1, 0), alternative = "two.sided",
                    conf.level = 0.95, correct = FALSE))
  }
}
```

I am unsure how to write the results of all the `prop.test` runs to a data frame. Specifically, I need `i`, `j`, `statistic`, `parameter`, `p.value`, `estimate`, `conf.int`, `null.value`, `alternative`, `method`, and `data.name` for each test run.


Answer Source

To augment @alistaire's comment: you can call `broom::tidy` to turn the `prop.test` output into a data frame, then wrap the call in a couple of `do.call(rbind, lapply(...))` constructs:

```
library(broom)
out <- do.call(rbind, lapply(c("A", "B"), function(i) {
  do.call(rbind, lapply(c("pa", "sc", "ig"), function(j) {
    d <- subset(test, target == i)
    X <- d[, "tc"]
    Y <- d[, j]
    tidy(prop.test(table(X, Y), c(1, 0), alternative = "two.sided",
                   conf.level = 0.95, correct = FALSE))
  }))
}))
```

The inner `lapply` creates a list of length 3 (for "pa", "sc", and "ig"), with each element of the list a data frame returned by `tidy(prop.test(...))`, which we then `rbind` together; the outer `lapply` creates a list of length 2 (for "A" and "B"), with each element a data frame returned by the inner loop, which we again `rbind` together.
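To see the stacking order this pattern produces, here is a minimal toy sketch that uses no test at all. `make_row`, `inner`, and `toy` are hypothetical names introduced only for illustration; each "result" is just a one-row data frame recording which combination produced it.

```r
# Toy illustration of the do.call(rbind, lapply(...)) pattern.
# make_row() is a hypothetical stand-in for tidy(prop.test(...)).
make_row <- function(i, j) {
  data.frame(target = i, metric = j, stringsAsFactors = FALSE)
}

inner <- function(i) {
  # one data frame per metric, stacked into a 3-row data frame
  do.call(rbind, lapply(c("pa", "sc", "ig"), function(j) make_row(i, j)))
}

# one 3-row data frame per target, stacked into a 6-row data frame
toy <- do.call(rbind, lapply(c("A", "B"), inner))
toy
```

The metric varies fastest and the target slowest, so rows come out as all metrics for "A" followed by all metrics for "B".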

We can finish off by adding `target_var` and `metric` to the data frame to identify the rows:

```
out <- cbind(
  setNames(expand.grid(c("pa", "sc", "ig"), c("A", "B")), c("metric", "target_var")),
  out)
```

Output:

```
out
# metric target_var estimate1 estimate2 statistic p.value parameter ...
# 1 pa A 0.5142857 0.5169492 0.00153355 0.96876237 1 ...
# 2 sc A 0.5142857 0.4872881 0.15742455 0.69153883 1 ...
# 3 ig A 0.4285714 0.4915254 0.85764039 0.35439986 1 ...
# 4 pa B 0.5000000 0.4629630 0.31977168 0.57174489 1 ...
# 5 sc B 0.4324324 0.5592593 3.75231435 0.05273445 1 ...
# 6 ig B 0.5540541 0.4851852 1.10190190 0.29384909 1 ...
```

If the `broom` package is unavailable, we can make our own stripped-down version of the `tidy` method for `htest` objects (like the ones produced by `prop.test()`):

```
tidy.proptest <- function(x) {
  # keep the core components of the htest object
  ret <- x[c("estimate", "statistic", "p.value", "parameter")]
  # spread the estimates into separate columns: estimate1, estimate2, ...
  names(ret$estimate) <- paste0("estimate", seq_along(ret$estimate))
  ret <- c(ret$estimate, ret)
  ret$estimate <- NULL
  ret <- c(ret, conf.low = x$conf.int[1], conf.high = x$conf.int[2],
           method = as.character(x$method),
           alternative = as.character(x$alternative))
  data.frame(ret)
}
```

Replace `tidy` with `tidy.proptest` in the above code snippet. Then take a couple more steps to prettify the output:

```
rownames(out) <- seq_len(nrow(out)) # replace the row names with plain row numbers
out <- cbind(
  setNames(expand.grid(c("pa", "sc", "ig"), c("A", "B")), c("metric", "target_var")),
  out)
```
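The `expand.grid` step relies on the rows of `out` being in the same order as the grid. A possible alternative, sketched here in base R only, is to record the target and metric inside the function that runs each test, so the labels cannot drift out of step with the results. `run_one`, `combos`, and `out2` are hypothetical names, the sample data is regenerated so the block runs on its own, and the spurious `c(1, 0)` argument is dropped on the assumption that `prop.test` ignores `n` when `x` is a two-column table.

```r
# Regenerate the sample data so this sketch is self-contained
set.seed(1234)
test <- data.frame(
  tc     = sample(c("test", "control"), 1000, replace = TRUE, prob = c(.8, .2)),
  target = sample(LETTERS[1:3], 1000, replace = TRUE, prob = c(1/3, 1/3, 1/3)),
  pa     = sample(c(0, 1), 1000, replace = TRUE),
  sc     = sample(c(0, 1), 1000, replace = TRUE),
  ig     = sample(c(0, 1), 1000, replace = TRUE)
)

# All (target, metric) combinations, metric varying fastest
combos <- expand.grid(metric = c("pa", "sc", "ig"), target_var = c("A", "B"),
                      stringsAsFactors = FALSE)

# Hypothetical helper: run one test and return a labelled one-row data frame
run_one <- function(target_var, metric) {
  d  <- test[test$target == target_var, ]
  ht <- prop.test(table(d$tc, d[[metric]]), alternative = "two.sided",
                  conf.level = 0.95, correct = FALSE)
  data.frame(
    target_var  = target_var,
    metric      = metric,
    estimate1   = unname(ht$estimate[1]),
    estimate2   = unname(ht$estimate[2]),
    statistic   = unname(ht$statistic),
    parameter   = unname(ht$parameter),
    p.value     = ht$p.value,
    conf.low    = ht$conf.int[1],
    conf.high   = ht$conf.int[2],
    alternative = ht$alternative,
    method      = ht$method,
    stringsAsFactors = FALSE
  )
}

out2 <- do.call(rbind, Map(run_one, combos$target_var, combos$metric))
rownames(out2) <- NULL
out2
```

The exact numbers depend on the R version, because the default sampling algorithm behind `sample()` changed in R 3.6.0, so they may not match the output shown earlier.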
