user2117258 - 27 days ago 6x
R Question

# Create data.frame by subtracting row and column name delimiters

I have a distance matrix whose row and column names each contain a numeric value after the first underscore (e.g., in 7A_0_AAGCCTAGCGAC the value is 0). I would like a way to compare these values row versus column. For example, I'd like to subtract each column's value from each row's value.

Input:

``````
                  7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
7A_0_AAGCCTAGCGAC        0.00000000       0.034312102        0.04539427
7A_4_AAATGACTGGCC        0.03431210       0.000000000        0.01422137
7A_7_CATCTCGTTCTA        0.04539427       0.014221369        0.00000000
``````

Expected output:

``````
                  7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
7A_0_AAGCCTAGCGAC        0.00000000                -4                -7
7A_4_AAATGACTGGCC                 4       0.000000000                -3
7A_7_CATCTCGTTCTA                 7                 3        0.00000000
``````

Any help would be much appreciated.

You can extract the numeric values from the column names and row names respectively and then do an outer subtraction:

``````
# extract the numeric values from the dimension names of the matrix
rows <- as.numeric(sub(".*_(\\d+)_.*", "\\1", rownames(mat)))
cols <- as.numeric(sub(".*_(\\d+)_.*", "\\1", colnames(mat)))

# outer difference of the two vectors: element [i, j] is rows[i] - cols[j]
# (rows go first so the result has the same shape as mat)
output <- outer(rows, cols, "-")

# restore the dimension names
dimnames(output) <- list(rownames(mat), colnames(mat))

output
#                  7A_0_AAGCCTAGCGAC 7A_4_AAATGACTGGCC 7A_7_CATCTCGTTCTA
#7A_0_AAGCCTAGCGAC                 0                -4                -7
#7A_4_AAATGACTGGCC                 4                 0                -3
#7A_7_CATCTCGTTCTA                 7                 3                 0
``````
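
For a self-contained check, the sketch below rebuilds the example matrix from the question and repeats the calculation. It rests on a few assumptions: `extract_value` is a hypothetical helper (not part of the original answer) that uses `strsplit()` instead of a regular expression, and it assumes every name carries the numeric value as its second underscore-separated field. The row values are passed to `outer()` first, so element [i, j] is the row value minus the column value and the result has the same shape as `mat` even if the matrix is not square. Since the title asks for a data.frame, the last step converts the result with `check.names = FALSE`, which should keep the names that start with a digit unchanged.

```r
# Rebuild the example distance matrix (values copied from the question)
ids <- c("7A_0_AAGCCTAGCGAC", "7A_4_AAATGACTGGCC", "7A_7_CATCTCGTTCTA")
mat <- matrix(
  c(0.00000000, 0.034312102, 0.04539427,
    0.03431210, 0.000000000, 0.01422137,
    0.04539427, 0.014221369, 0.00000000),
  nrow = 3, byrow = TRUE, dimnames = list(ids, ids)
)

# Hypothetical helper: return the second underscore-separated field as a number
extract_value <- function(x) {
  as.numeric(vapply(strsplit(x, "_", fixed = TRUE), `[`, character(1), 2))
}

rows <- extract_value(rownames(mat))
cols <- extract_value(colnames(mat))

# Element [i, j] is rows[i] - cols[j]
output <- outer(rows, cols, "-")
dimnames(output) <- dimnames(mat)

# Convert to a data.frame without mangling names that start with a digit
output_df <- data.frame(output, check.names = FALSE)
```

If the names might not all follow the `prefix_number_suffix` pattern, `extract_value` would return `NA` (with a warning) for the ones that do not, so it may be worth checking `anyNA(rows)` and `anyNA(cols)` before calling `outer()`.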