John Doe - 3 years ago 168
R Question

# Splitting a column in R

I'm facing the following problem. I've got a table with a column called 'title'.
The title column contains rows with values like 'To kill a mockingbird (1960)'.

So basically the format of the column is [title] ([year]). What I need is two
columns: title and year, year without brackets.

One other problem is that some rows contain a title that itself includes brackets. But
the last 6 characters of every row are always the year wrapped in brackets.

How do I create the two columns, title and year?

What I have is: `Books$title <- c("To kill a mockingbird (1960)", "Harry Potter and the order of the phoenix (2003)", "Of mice and men (something something) (1937)")`

``````
Title
To kill a mockingbird (1960)
Harry Potter and the order of the phoenix (2003)
Of mice and men (something something) (1937)
``````

What I need is: `Books$title <- c("To kill a mockingbird", "Harry Potter and the order of the phoenix", "Of mice and men (something something)")`
and `Books$year <- c("1960", "2003", "1937")`

``````
Title                                             Year
To kill a mockingbird                             1960
Harry Potter and the order of the phoenix         2003
Of mice and men (something something)             1937
``````

Since every row ends with the 6-character bracketed year, we can split each value by position with `substr`.
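To make the positions concrete, here is a minimal sketch on a single string, assuming (as the question states) that every value ends in a space followed by a four-digit year in brackets. The names `x`, `n`, `year` and `title` are only illustrative.

```r
# One title from the question, used as a stand-in for a single row.
x <- "Of mice and men (something something) (1937)"
n <- nchar(x)

# The string ends in " (1937)": one space, "(", four digits, ")".
year  <- substr(x, n - 4, n - 1)  # the four digits between the brackets
title <- substr(x, 1, n - 7)      # everything before the space and brackets

print(year)
print(title)
```

Because the positions are counted from the end of the string, brackets inside the title itself do not get in the way.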

First we recreate your `data.frame`:

``````
df <- read.table(header = TRUE, sep = "\n", stringsAsFactors = FALSE,
                 text = "
Title
To kill a mockingbird (1960)
Harry Potter and the order of the phoenix (2003)
Of mice and men (something something) (1937)")
``````

Then we create a new one. The first column, `Title`, is everything from `df$Title` except the last 7 characters (the bracketed year plus the space before it). The second column, `Year`, is taken from those last 7 characters of `df$Title`, with the space and the opening and closing brackets removed. (`gsub("[[:punct:]]", "", ...)` would also strip the brackets, though not the space.)

``````
# Title: drop the trailing " (yyyy)" (7 characters, including the space).
# Year: take those last 7 characters and strip the space and both brackets.
n <- nchar(df$Title)
data.frame(Title = substr(df$Title, 1, n - 7),
           Year  = gsub(" |\\(|\\)", "", substr(df$Title, n - 6, n)))

                                      Title Year
1                     To kill a mockingbird 1960
2 Harry Potter and the order of the phoenix 2003
3     Of mice and men (something something) 1937
``````
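If you would rather not count characters, a regular expression is another option. This is only a sketch, not part of the approach above: it assumes every value ends in a space and a four-digit year in brackets, and the names `titles`, `pattern` and `books` are made up for the example.

```r
# Sample values taken from the question.
titles <- c("To kill a mockingbird (1960)",
            "Harry Potter and the order of the phoenix (2003)",
            "Of mice and men (something something) (1937)")

# Group 1: everything before the final " (yyyy)"; group 2: the four digits.
# The greedy ".*" keeps any earlier brackets inside the title.
pattern <- "^(.*) \\(([0-9]{4})\\)$"

books <- data.frame(
  title = sub(pattern, "\\1", titles),
  year  = sub(pattern, "\\2", titles),
  stringsAsFactors = FALSE
)

print(books)
```

Note that `sub` returns its input unchanged when the pattern does not match, so a row without a trailing year would end up with the full string in both columns. Wrap the `year` column in `as.integer()` if you want numbers rather than strings.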