Analyzer Analyzer - 8 days ago 6
R Question

Extracting the text before the last forward slash from a string using a regular expression in R

I am reading a column from a data.csv file and trying to extract the text before the last forward slash from each string in that column using a regular expression. My column data looks like:

class:

org/apache/flume/api/virtual/loeadBalancing.java
org/apache/flume/file/Channel/testing/test2.java
org/apache/flume/recoverable/memory/test1.java
org/apache/flume/source/scribe/LogEntry.java
org/apache/flume/source/jms/TestJMSMessageConsumer.java


My desired output is:

org/apache/flume/api/virtual
org/apache/flume/file/Channel/testing
org/apache/flume/recoverable/memory
org/apache/flume/source/scribe
org/apache/flume/source/jms


So, basically, I am trying to extract the substring from the class column that excludes the last forward slash and the text appearing after it. My current code is:

dfkg<- gsub( "\\.[^/]*$", "", data$class)


Can someone correct my regular expression to generate the desired output?

Answer

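We can use sub to match the last forward slash plus the one or more non-slash characters that follow it up to the end of the string, and replace that match with an empty string: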

sub("\\/[^/]+$", "", data$class)
#[1] "org/apache/flume/api/virtual"          "org/apache/flume/file/Channel/testing" "org/apache/flume/recoverable/memory"  
#[4] "org/apache/flume/source/scribe"        "org/apache/flume/source/jms"      
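
For comparison, here is a minimal, self-contained sketch that puts the pattern from the question next to the pattern from the answer. The data frame below is a hypothetical stand-in for the class column of data.csv, built from two of the sample strings. The dirname() call is an extra base R alternative that is not part of the original answer; it should give the same result for paths like these.

```r
# Hypothetical stand-in for the 'class' column read from data.csv
data <- data.frame(
  class = c(
    "org/apache/flume/api/virtual/loeadBalancing.java",
    "org/apache/flume/source/jms/TestJMSMessageConsumer.java"
  ),
  stringsAsFactors = FALSE
)

# Pattern from the question: matches a literal "." followed by
# non-slash characters up to the end, so only ".java" is removed
ext_only <- gsub("\\.[^/]*$", "", data$class)

# Pattern from the answer: matches the last "/" followed by one or
# more non-slash characters up to the end, so the file name goes too
dirs <- sub("\\/[^/]+$", "", data$class)

# Regex-free alternative (an addition, not from the answer):
# dirname() returns the part of a path before the last separator
dirs_alt <- dirname(data$class)

print(ext_only)
print(dirs)
print(identical(dirs, dirs_alt))
```

The question's pattern anchors on the dot, which is why it only strips the extension. Anchoring on the slash instead removes the whole last path component.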