StevenL - 1 year ago
R Question

Extracting variables from file names in R

I have files that contain multiple rows. I want to add two new columns that I create by extracting variables from the file name and applying them to the existing columns.
For example, I have a bunch of files that are named something like this:



between the
there are always 2 numbers separated by a comma.

The file itself has multiple columns, for example

For each file, I want to extract the 2 values in the file name and then use them as variables to make 2 new columns that use those variables to modify the existing values.

for example


the file contains two columns

column1 column2
1 2
2 4

In the end, I want to add the first file name value to column1 to create column3, and add the second file name value to column2 to create column4, ending up with something like this:

column1 column2 column3 column4
1 2 1001 2002
2 4 1002 2004

Thanks for the help. Almost there, just a few more issues.
The original file has 2 columns, "X_Parameter" and "Y_Parameter"; the file name is "test(64084,4224).txt".
Your code works great at extracting the two values, V1 "64084" and V2 "4224", from the file name. I then add these values to the original dataset. This yields 4 columns: "X_Parameter", "Y_Parameter", "V1" and "V2".

txt_names = list.files(pattern = ".txt")
for (i in 1:length(txt_names)) {
  assign(txt_names[i], read.delim(txt_names[i]))
  DS1 <- read.delim(file = txt_names[i], header = TRUE, stringsAsFactors = TRUE)
  remove_text <- str_extract(txt_names, pattern = "\\[[0-9,0-9]+\\]")
  step1 <- gsub("(\\[)", "", remove_text)
  step2 <- gsub("(\\])", "", step1)
  DS2 <- as.data.frame(do.call("rbind", (str_split(step2, ","))))
}

My issue arises when trying to sum "X_Parameter" and "V1" to make "absoluteX", and to sum "Y_Parameter" and "V2" to make "absoluteY", for each row.

Below are the two ways I have tried, with the errors:


In Ops.factor(DS1$X_Parameter, DS1$V1) : ‘+’ not meaningful for factors

The other try was


Error in rowSums(DS1[, c("X_Parameter", "V1")]) : 'x' must be numeric

Any thoughts? Thanks.

Answer
## set path to wherever your files are

## load stringr, which provides str_extract() and str_split()
library(stringr)

## make a vector with names of your files (the dot is escaped so only a ".txt" ending matches)
txt_names <- list.files(pattern = "\\.txt$")

## read your files in
for (i in seq_along(txt_names)) assign(txt_names[i], read.csv(txt_names[i], sep = "whatever your separator is"))

## grab the text you require from the file names

remove_text <- str_extract(txt_names, pattern = "\\[[0-9,0-9]+\\]")
step1 <- gsub("(\\[)", "", remove_text)
step2 <- gsub("(\\])", "", step1)

## step2 should look like this
## > step2
##
## [1] "1000,1001"

## split each string and convert to data frame with two columns
## (DS2 is the name used in the question's code)
DS2 <- as.data.frame(do.call("rbind", (str_split(step2, ","))))
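
For comparison, here is a minimal base-R sketch of the same extraction. It is only an illustration under a few assumptions: the two file names are made up, `regmatches()` with `regexpr()` stands in for `str_extract()`, and the names `inside`, `parts` and `file_values` are hypothetical. The pattern looks only for digits, a comma, and more digits, so it should not matter whether the numbers sit in square brackets or parentheses. The pieces are converted to numeric straight away so they can be used in arithmetic later.

```r
# Hypothetical file names; real ones would come from list.files()
txt_names <- c("sample[1000,2000].txt", "test(64084,4224).txt")

# Pull out the "number,number" part of each name (base-R stand-in for str_extract)
inside <- regmatches(txt_names, regexpr("[0-9]+,[0-9]+", txt_names))

# Split on the comma and stack the pieces into a two-column character matrix
parts <- do.call(rbind, strsplit(inside, ",", fixed = TRUE))

# Convert to numeric so the values can be added to other columns later
file_values <- data.frame(V1 = as.numeric(parts[, 1]),
                          V2 = as.numeric(parts[, 2]))
print(file_values)
```

Each row of `file_values` corresponds to one file name, in the same order as `txt_names`. Note that `regmatches()` silently drops names that do not match, so this sketch assumes every file name contains a number pair.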

Now you have a data.frame with two columns, each containing the file name parts you need. Just rep them to the number of rows of each of your data.frames and cbind.
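
A rough sketch of that last step, including the row-wise sums from the question. It rests on assumptions: a small in-memory data frame stands in for one file, the file-name values 1000 and 2000 are hypothetical, and they are assumed to be stored as factors (as happens when text is turned into a data frame with stringsAsFactors = TRUE). Factor columns would explain both errors quoted in the question, since `+` and `rowSums()` need numeric input. The helper `to_num` is a made-up name.

```r
# Stand-in for the contents of one file
DS1 <- data.frame(X_Parameter = c(1, 2), Y_Parameter = c(2, 4))

# Hypothetical file-name values, assumed here to be stored as factors
DS2 <- data.frame(V1 = factor("1000"), V2 = factor("2000"))

# Go through as.character() first: as.numeric() on a factor alone
# returns the internal level codes, not the printed numbers
to_num <- function(x) as.numeric(as.character(x))

# Repeat each file-name value once per row and bind the new columns on
DS1 <- cbind(DS1,
             V1 = rep(to_num(DS2$V1), nrow(DS1)),
             V2 = rep(to_num(DS2$V2), nrow(DS1)))

# With numeric columns, the row-wise sums are plain vector addition
DS1$absoluteX <- DS1$X_Parameter + DS1$V1
DS1$absoluteY <- DS1$Y_Parameter + DS1$V2
print(DS1)
```

With these inputs, absoluteX comes out as 1001 and 1002 and absoluteY as 2002 and 2004, matching the layout asked for in the question. If the data columns themselves were read in as factors, the same `to_num()` conversion could be applied to them before adding.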
