Dale Kube, 4 years ago
R Question

Recursive Manipulation of Employee/Supervisor Data to Produce Org Tree Hierarchy Columns in R

I commonly need to analyze data in an "org tree" format to understand the frequency of activities under a given leader within the organization. I need to produce a wide hierarchy from two columns of data: employee name and supervisor name.

----------
df <- data.frame("Employee"=c("Bill","James","Amy","Jen","Henry"),
"Supervisor"=c("Jen","Jen","Steve","Amy","Amy"))
df
#   Employee Supervisor
# 1     Bill        Jen
# 2    James        Jen
# 3      Amy      Steve
# 4      Jen        Amy
# 5    Henry        Amy


I want to end up with a wide data frame that spells out each employee's chain of command, starting with the CEO (or the topmost leader) in the first hierarchy column:

#   Employee    H1   H2   H3
# 1     Bill Steve  Amy  Jen
# 2    James Steve  Amy  Jen
# 3      Amy Steve   NA   NA
# 4      Jen Steve  Amy   NA
# 5    Henry Steve  Amy   NA


After much research, the data.tree package seems to offer the most help. How can I perform this operation?

Answer Source

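Try this with data.table. Each pass of the loop joins the table to itself to look up the supervisor of the most recently added supervisor column, adding one column per level until none of those supervisors is also listed as an employee. Note that the loop will not terminate if the data contain a reporting cycle.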

library(data.table)
setDT(df)

setnames(df, 'Supervisor', 'Supervisor.1')

j=1
while (df[, any(get(paste0('Supervisor.',j)) %in% Employee)]) {
  df[df, on=paste0('Supervisor.',j,'==Employee'),
     paste0('Supervisor.',j+1):= i.Supervisor.1]
  j = j + 1
}

> df
#    Employee Supervisor.1 Supervisor.2 Supervisor.3
# 1:     Bill          Jen          Amy        Steve
# 2:    James          Jen          Amy        Steve
# 3:      Amy        Steve           NA           NA
# 4:      Jen          Amy        Steve           NA
# 5:    Henry          Amy        Steve           NA

To reorder within rows, so that the topmost leader comes first and the NAs are pushed to the right:

df = cbind(df[, 1], t(apply(df[, -1], 1, function(r) c(rev(r[!is.na(r)]), r[is.na(r)]))))
> df
#    Employee    V1  V2  V3
# 1:     Bill Steve Amy Jen
# 2:    James Steve Amy Jen
# 3:      Amy Steve  NA  NA
# 4:      Jen Steve Amy  NA
# 5:    Henry Steve Amy  NA
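
If you would rather avoid the data.table dependency, or want the columns named H1, H2, ... as in the question, here is a minimal base R sketch of the same idea: walk up from each employee, collect the supervisors, reverse the chain, and pad with NA. The function name `org_hierarchy` and the data frame name `org` are made up for illustration and are not part of the original answer. The sketch assumes each employee has a single supervisor and stops with an error if it finds a reporting cycle.

```r
# Hypothetical helper, not from the original answer: base R only.
org_hierarchy <- function(employee, supervisor) {
  employee   <- as.character(employee)
  supervisor <- as.character(supervisor)

  # Lookup vector: employee name -> direct supervisor
  boss_of <- setNames(supervisor, employee)

  # Walk upward from one employee, collecting supervisors bottom-up
  chain_for <- function(e) {
    chain   <- character(0)
    current <- e
    while (current %in% names(boss_of)) {
      nxt <- boss_of[[current]]
      if (is.na(nxt)) break
      # Guard against reporting cycles, which would otherwise loop forever
      if (nxt %in% c(e, chain)) stop("Reporting cycle detected at ", nxt)
      chain   <- c(chain, nxt)
      current <- nxt
    }
    rev(chain)  # topmost leader first
  }

  chains <- lapply(employee, chain_for)
  depth  <- max(lengths(chains))

  # Pad every chain with NA on the right so all rows have the same width
  mat <- do.call(rbind, lapply(chains, `length<-`, depth))
  colnames(mat) <- paste0("H", seq_len(depth))

  data.frame(Employee = employee, mat, stringsAsFactors = FALSE)
}

# Example data from the question, under a separate name
org <- data.frame(Employee   = c("Bill", "James", "Amy", "Jen", "Henry"),
                  Supervisor = c("Jen", "Jen", "Steve", "Amy", "Amy"),
                  stringsAsFactors = FALSE)

res <- org_hierarchy(org$Employee, org$Supervisor)
print(res)
```

For the sample data this should give the layout requested in the question, with Steve in H1 for every row. It loops row by row, so for a very large organization the join-based data.table version above is likely to be faster.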