Starbucks - 1 year ago 61

# R Converting Factors into New Variables

I have two factor variables with many levels: V1 has 400 levels and V2 has ≈ 250 levels. How can I spread V2's levels across several new variables, using V1 as the unique identifier?

``````
V1             V2
Garza, Mike    a
Garza, Mike    b
Smith, James   a
Smith, James   f
Smith, James   z
Moore, Jen     b
Klein, April   f
``````

The data frame should look like the example below. Note that each new variable can hold several different levels; there is not one variable per level. Mike has two levels associated with him, so a and b go into V2 and V3, whereas Jen's single level b goes into V2, not V3.

``````
V1             V2 V3 V4 V5
Garza, Mike    a  b
Smith, James   a  f  z
Moore, Jen     b
Klein, April   f
``````

Any help would be greatly appreciated!

Thank you.

You can do the first part with `dcast` from the `reshape2` package, and then shift the values to the left with `apply` to get your desired output.

``````
library(reshape2)  # dcast() comes from reshape2

dat <- data.frame(V1 = factor(c("Garza", "Garza",
                                "Smith", "Smith", "Smith",
                                "Moore", "Klein")),
                  V2 = c("a","b","a","f","z","b","f"))

# one row per V1, one column per level of V2; missing combinations are NA
dd <- dcast(dat, V1 ~ V2, value.var = "V2")

# make a function to use with apply:
# move the non-NA values of a row (everything after V1) to the left
# and pad the remaining cells with empty strings
shift_values <- function(x){
  notna <- which(!is.na(x[-1]))
  val <- x[notna + 1]
  x[-1] <- c(as.character(val), rep("", (length(x) - 1 - length(val))))
  return(x)
}

# use it in an apply loop, transpose the data, and turn it into a data.frame
result <- data.frame(t(apply(dd, 1, shift_values)))

# change the column names
colnames(result)[-1] <- paste0("V", 2:(ncol(result)))
``````

The data then looks like this:

``````
     V1 V2 V3 V4 V5
1 Garza  a  b
2 Klein  f
3 Moore  b
4 Smith  a  f  z
``````
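
If you would rather not create one column per level first (with ≈ 250 levels in V2 the intermediate table gets wide), the same layout can probably be built in base R alone. The following is only a sketch under a few assumptions: the names `by_id`, `width`, `padded` and `wide` are made up for illustration, the toy data uses surnames only, and V1 and V2 are treated as plain character vectors.

```r
# Toy data: one row per (V1, V2) pair.
dat <- data.frame(
  V1 = c("Garza", "Garza", "Smith", "Smith", "Smith", "Moore", "Klein"),
  V2 = c("a", "b", "a", "f", "z", "b", "f"),
  stringsAsFactors = FALSE
)

# Collect the V2 values of each V1, in their order of appearance.
by_id <- split(dat$V2, dat$V1)

# Pad every group with "" up to the size of the largest group.
width <- max(lengths(by_id))
padded <- do.call(rbind, lapply(by_id, function(v) {
  c(v, rep("", width - length(v)))
}))

# Assemble the wide data frame and name the new columns V2, V3, ...
wide <- data.frame(V1 = rownames(padded), padded, stringsAsFactors = FALSE)
rownames(wide) <- NULL
colnames(wide)[-1] <- paste0("V", seq_len(width) + 1)

print(wide)
```

Unlike the `dcast` version, this creates only as many new columns as the largest number of levels attached to a single identifier (three here, so V2 to V4), rather than one column per distinct level.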