sophie.h - 4 months ago
R Question

vector of punctuation

For digits I can write a vector like this:

digits <- c("0","1","2","3","4","5","6","7","8","9")


How can I get an analogous vector of punctuation marks?

Answer

You could convert numbers to punctuation using Unicode code points (thanks, Konrad, for pointing that out).

strsplit(intToUtf8(c(33:47, 58:64, 91:96)), "")[[1]]
# [1] "!"  "\"" "#"  "$"  "%"  "&"  "'"  "("  ")"  "*"  "+"  ","  "-"  "." 
#[15] "/"  ":"  ";"  "<"  "="  ">"  "?"  "@"  "["  "\\" "]"  "^"  "_"  "`"

The same approach works for other scripts, for example some Ethiopic punctuation (0x1361:0x1367):

strsplit(intToUtf8(0x1361:0x1367), "")[[1]]
# [1] "፡" "።" "፣" "፤" "፥" "፦" "፧"

If the punctuation you need is missing, you can look up its Unicode code points (e.g. somewhere like http://www.fileformat.info/info/unicode/category/Po/list.htm) and add them to the vector. You can also get the code point of a character with utf8ToInt. For instance, "~" isn't included above, and neither are "{", "|" and "}" (the ASCII range 123:126 is left out):

utf8ToInt("~")
#[1] 126
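
Putting the pieces together, here is a minimal sketch of the full ASCII punctuation vector, assuming you also want the 123:126 range ("{", "|", "}", "~") that the first snippet leaves out. The variable names ascii_punct_codes and punct are only illustrative. It uses the multiple argument of intToUtf8, which returns one string per code point and so avoids the strsplit step:

```r
# Code points of the four ASCII punctuation ranges, including 123:126 ({ | } ~).
ascii_punct_codes <- c(33:47, 58:64, 91:96, 123:126)

# multiple = TRUE gives a character vector with one element per code point.
punct <- intToUtf8(ascii_punct_codes, multiple = TRUE)

length(punct)
# [1] 32

# Optional cross-check against the POSIX [[:punct:]] class over printable ASCII.
# The result may depend on the locale and regex engine, so no output is shown.
ascii_chars <- intToUtf8(33:126, multiple = TRUE)
identical(punct, ascii_chars[grepl("[[:punct:]]", ascii_chars)])
```

Non-ASCII punctuation, such as the Ethiopic range above, can be appended to ascii_punct_codes in the same way before the conversion.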