Ashwin - 1 year ago
R Question

Extract as Numbers from a String to Calculate

I have a vector as below

data <- c("6X75ML","24X37.5ML(KKK)", "6X2X75ML", "168X5CL (UUU)", "168X5CLKK (BUU)")

The above data is pack sizes of bottles in a case. For example, "6X75ML" means the case holds 6 bottles of 75ML (Milli liters) of liquid each. "6X2X75ML", which is a promotion pack, has 6X2 = 12 bottles of 75ML (Milli liters) in a case.
I need to find the volume in liters available in each case:
e.g -

"6X75CL" should be 4.5 Liters <- 6 * 75 * 0.01
"24X37.5ML" should be 0.9 Liters <- 24 * 37.5 * 0.001
"6X2X75CL" should be 9 Liters <- 6 * 2 * 75 * 0.01 [there can only be a maximum of 3 digits between the X]

ML - Milli liters
CL - Centi liters
LTR - Liters

1ML = 0.001LTR
1CL = 0.01LTR

In some cases as in the above there could be values like "168X5CLKK (BUU)" where only CL needs to be taken.

I have the below code helping me to find the quantity of bottles in a case

dataList <- strsplit(data, split="X")
Pack <- sapply(dataList, function(x) prod(as.numeric(head(x, -1))))

e.g. "6X2X75ML" will give 12; "168X5CL (UUU)" will give 168, etc.

strsplit breaks up each string in the vector along "X". The resulting list is fed to sapply, which then performs an operation on all but the final element of each vector in the list. The operation is to convert the elements to numerics and then multiply them. The final element is dropped using head(x, -1).
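To make the steps concrete, here is a small sketch that walks one element through them. The variable name parts is only for illustration.

```r
# Walk a single pack string through the split / drop-last / multiply steps
parts <- strsplit("6X2X75ML", split = "X")[[1]]
parts                                         # "6" "2" "75ML"

# head(x, -1) drops the last element, which still carries the unit
head(parts, -1)                               # "6" "2"

# Convert what is left to numbers and multiply to get the bottle count
bottles <- prod(as.numeric(head(parts, -1)))
bottles                                       # 12
```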

I am not able to find an efficient way to split the last element to get the volume.

Answer Source
data <- c("6X75ML","24X37.5ML(KKK)", "6X2X75ML", "168X5CL (UUU)", "168X5CLKK (BUU)")

Replace ML with X0.001

data <- gsub("ML", "X0.001", data)

Replace CL with X0.01

data <- gsub("CL", "X0.01", data)
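As a quick check, this sketch shows what the strings look like after the two substitutions. It uses its own variable names (packs, step1, step2) so that data is left untouched.

```r
packs <- c("6X75ML", "24X37.5ML(KKK)", "6X2X75ML", "168X5CL (UUU)", "168X5CLKK (BUU)")

# Turn each unit into one more "X"-separated factor
step1 <- gsub("ML", "X0.001", packs)
step2 <- gsub("CL", "X0.01", step1)
step2
# "6X75X0.001" "24X37.5X0.001(KKK)" "6X2X75X0.001"
# "168X5X0.01 (UUU)" "168X5X0.01KK (BUU)"
```

Note that trailing annotations such as "(KKK)" and "KK (BUU)" are still attached at this point, so they have to be stripped before the pieces can be converted to numbers.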

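Strip the trailing annotation (capital letters, brackets and spaces; the space must be in the character class, otherwise "0.01KK " is left behind and converts to NA), then split the string and do the multiplication

unlist(lapply(strsplit(gsub("[A-Z() ]*$", "", data), "X"), function(x){ prod(as.numeric(x))}))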


[1] 0.45 0.90 0.90 8.40 8.40
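An alternative sketch, not part of the original answer: instead of rewriting the unit into a factor, capture the numeric part and the unit with one regular expression and look the factor up in a table. The function name case_volume_litres and the factors table are hypothetical. The LTR factor of 1 follows from the unit list in the question, and the sketch assumes the string starts with the pack size and that the unit directly follows the last number.

```r
# Hypothetical helper (name chosen for illustration only)
case_volume_litres <- function(pack) {
  # Conversion factors to litres, as listed in the question
  factors <- c(ML = 0.001, CL = 0.01, LTR = 1)

  # Group 1: digits, dots and X separators; group 2: the unit right after them
  m <- regmatches(pack, regexec("^([0-9.X]+)(ML|CL|LTR)", pack))

  vapply(m, function(p) {
    if (length(p) == 0) return(NA_real_)   # string did not match the pattern
    nums <- as.numeric(strsplit(p[2], "X", fixed = TRUE)[[1]])
    prod(nums) * factors[[p[3]]]
  }, numeric(1))
}

packs <- c("6X75ML", "24X37.5ML(KKK)", "6X2X75ML", "168X5CL (UUU)", "168X5CLKK (BUU)")
volumes <- case_volume_litres(packs)
volumes   # expected, up to floating point: 0.45 0.9 0.9 8.4 8.4
```

Because anything after the unit is ignored, annotations such as "KK (BUU)" need no separate clean-up step, and strings that do not fit the pattern come back as NA rather than raising a warning.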
