I'm facing the following problem. I've got a table with a column called 'title'.
The title column contains rows with values like 'To kill a mockingbird (1960)'.
So basically the format of the column is [title] ([year]). What I need is two
columns: title and year, year without brackets.
One complication is that some titles themselves contain brackets. However,
the last 6 characters of every row are always the year wrapped in brackets.
How do I create the two columns, title and year?
What I have is: Books$title <- c("To kill a mockingbird (1960)", "Harry Potter and the order of the phoenix (2003)", "Of mice and men (something something) (1937)")
Title
To kill a mockingbird (1960)
Harry Potter and the order of the phoenix (2003)
Of mice and men (something something) (1937)
What I need is:
Title Year
To kill a mockingbird 1960
Harry Potter and the order of the phoenix 2003
Of mice and men (something something) 1937
We can work around this by using substr to split off the last 6 characters.
First we recreate your data.frame:
# One record per line: the newline is the only field separator
df <- read.table(header = TRUE, sep = "\n", stringsAsFactors = FALSE,
text = "
Title
To kill a mockingbird (1960)
Harry Potter and the order of the phoenix (2003)
Of mice and men (something something) (1937)")
Then we create a new one. The first column, Title, is everything from df$Title
except the last 7 characters (this also removes the trailing space). The second
column, Year, comes from the last 7 characters of df$Title, from which we remove
the space and the opening and closing brackets. (gsub("[[:punct:]]", ...) would
have worked as well if we had taken only the last 6 characters, since it strips
the brackets but not the space.)
data.frame(Title=substr(df$Title, 1, nchar(df$Title)-7),
Year=gsub(" |\\(|\\)", "", substr(df$Title, nchar(df$Title)-6, nchar(df$Title))))
Title Year
1 To kill a mockingbird 1960
2 Harry Potter and the order of the phoenix 2003
3 Of mice and men (something something) 1937
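If you would rather not count characters, a regular expression anchored at the end of the string should give the same result. This is only a sketch of an alternative, not the approach used above: the names books, year_pattern and result are made up for illustration, and it assumes every row ends in a four-digit year in brackets.

```r
# Hypothetical data frame mirroring the example data from the question.
books <- data.frame(
  Title = c("To kill a mockingbird (1960)",
            "Harry Potter and the order of the phoenix (2003)",
            "Of mice and men (something something) (1937)"),
  stringsAsFactors = FALSE
)

# Assumed pattern: optional whitespace, "(", four digits, ")", at the very end.
year_pattern <- "[[:space:]]*\\(([0-9]{4})\\)$"

result <- data.frame(
  # Drop the trailing " (yyyy)"; brackets earlier in the title are left alone.
  Title = sub(year_pattern, "", books$Title),
  # Replace the whole string with the captured digits, then convert to integer.
  Year = as.integer(sub(paste0("^.*", year_pattern), "\\1", books$Title)),
  stringsAsFactors = FALSE
)

print(result)
```

Because the pattern is tied to the end of the string with $, brackets inside a title are not touched. A row that does not end in a bracketed year would keep its full title and get NA (with a warning) in Year, so it is worth checking for NA afterwards.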
Does that solve your problem?