petergensler - 3 months ago 23
Markdown Question

Print out Regular Expression Table with kable

I am trying to create a basic reference table in RMarkdown for regular expressions, and I'm having some trouble trying to concatenate strings of characters together. I'm not really sure if I should be using the `` instead of the "" to specify these strings literally, but I'm pretty stuck. Seems like I keep getting a ton of errors around syntax. Any help would be appreciated. Thanks.

This is how the table would look in markdown code:


POSIX Class Name | Description | Examples
---------------- | ------------------------ | ------------------------
[:alnum:] | Alphanumeric characters | [[:alpha:][:digit:]] or [A-z0-9]
[:punct:] | Punctuation characters | ! \ \" # $ % & '( ) * + , - . / : ; ? @ [ \ \ ] ^ _`{ | }~


However, some of these characters are hard to write in a string (such as a null character). Below is my code for attempting to build this sample table as a data frame.

# Create character class table
class_name <- c("[:alnum:]","[:alpha:]","[:ascii:]","[:blank:]","[:cntrl:]","[:digit:]","[:graph:]","[:lower:]","[:print:]",
"[:punct:]","[:space:]","[:upper:]","[:xdigit:]")
description <- c("Alphanumeric characters","Alphabetic characters","ASCII characters","Space and tab","Control characters",
"Digits","Visible characters (anything except spaces and control characters)","Lowercase letters","Visible characters and spaces (anything except control characters)","Punctuation and symbols.","All whitespace characters, including line breaks",
"Uppercase letters","Hexadecimal digits")
examples <- c(`[[:alpha:][:digit:]] or [A-z0-9]`,
`[[:lower:][:upper:]] or [A-z]`,
`ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz`,
`[ \t]`,
"\nor\r,[\x\0\0-\x1F\x7F]",
`or \d: digits, 0 1 2 3 4 5 6 7 8 9, equivalent to [0-9]`,
`[:alnum:] and [:punct:]`,
`[a-z]`,
`[[:alnum:][:punct:]\\s]`,
`! \ \" # $ % & '( ) * + , - . / : ; < = > ? @ [ \ \ ] ^ _`{ | }~` )

char_class <- data.frame(class_name,description,examples)
names(char_class) <- c("Class Name","Description","Examples")

#View the Table
kable(char_class, col.names = names(char_class), align = c('c','l'), caption = "Character Class Examples")


Errors I'm getting:

Error: '\x' used without hex digits in character string starting ""\nor\r,[\x"
Error: nul character not allowed (line 5)


Part of what I'm trying to do is put together a reference guide for regular expressions in R. It's pretty hard to print out these characters other than with normal markdown, but I'd like to get the data into a data frame if possible so that I can use kable's formatting.

Any help would be greatly appreciated. Thanks.

42-
Answer

After replacing the single backslashes with doubled backslashes wherever they did not precede a recognized R escape such as "\n" or "\t" (see ?Syntax and ?Quotes), quoting the strings with ' or " rather than backticks (backticks quote names, not character strings), and omitting the last three elements of the class_name and description vectors (since they had no corresponding items in the examples vector), it is possible to make a legal R data frame:

class_name <- c("[:alnum:]","[:alpha:]","[:ascii:]","[:blank:]","[:cntrl:]","[:digit:]","[:graph:]","[:lower:]","[:print:]",
"[:punct:]","[:space:]","[:upper:]","[:xdigit:]")
description <- c("Alphanumeric characters","Alphabetic characters","ASCII characters","Space and tab","Control characters",
"Digits","Visible characters (anything except spaces and control characters)","Lowercase letters","Visible characters and spaces (anything except control characters)","Punctuation and symbols.","All whitespace characters, including line breaks",
"Uppercase letters","Hexadecimal digits")
examples <- c('[[:alpha:][:digit:]] or [A-z0-9]',
              '[[:lower:][:upper:]] or [A-z]',
              'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',
              '[ \t]',
              "\\nor\\r,[\\x\\0\\0-\\x1F\\x7F]",
              'or \\d: digits, 0 1 2 3 4 5 6 7 8 9, equivalent to [0-9]',
              '[:alnum:] and [:punct:]',
              '[a-z]',
              '[[:alnum:][:punct:]\\s]',
              '! \\ \\" # $ % & \\\' ( ) * + \\, - . / : ; < = > ? @ [ ] ^ _ { | } ~ ' )

char_class <- data.frame(class_name[1:10],description[1:10],examples)
names(char_class) <- c("Class Name","Description","Examples")
#
> char_class
   Class Name                                                        Description
1   [:alnum:]                                            Alphanumeric characters
2   [:alpha:]                                              Alphabetic characters
3   [:ascii:]                                                   ASCII characters
4   [:blank:]                                                      Space and tab
5   [:cntrl:]                                                 Control characters
6   [:digit:]                                                             Digits
7   [:graph:] Visible characters (anything except spaces and control characters)
8   [:lower:]                                                  Lowercase letters
9   [:print:] Visible characters and spaces (anything except control characters)
10  [:punct:]                                           Punctuation and symbols.
                                                                Examples
1                                       [[:alpha:][:digit:]] or [A-z0-9]
2                                          [[:lower:][:upper:]] or [A-z]
3                   ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
4                                                                  [ \t]
5                                        \\nor\\r,[\\x\\0\\0-\\x1F\\x7F]
6               or \\d: digits, 0 1 2 3 4 5 6 7 8 9, equivalent to [0-9]
7                                                [:alnum:] and [:punct:]
8                                                                  [a-z]
9                                                [[:alnum:][:punct:]\\s]
10 ! \\ \\" # $ % & \\' ( ) * + \\, - . / : ; < = > ? @ [ ] ^ _ { | } ~ 
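
As a quick check on the escaping rules applied above, here is a minimal base-R sketch. The helper `fails_to_parse` is a hypothetical name introduced only for this illustration; it reports whether a piece of source text is rejected by the parser, which is how the two errors in the question arise.

```r
# Hypothetical helper (not from the original post):
# TRUE if `src` is not valid R source text.
fails_to_parse <- function(src) {
  tryCatch({
    parse(text = src)
    FALSE
  }, error = function(e) TRUE)
}

# "\\d" in source code is stored as two characters: a backslash and "d".
n_chars <- nchar("\\d")

# "\t" is a recognized escape, so "[ \t]" holds four characters:
# "[", a space, a tab, and "]".
n_tab <- nchar("[ \t]")

# The source text "[\x]" has \x with no hex digits after it, so it is rejected.
bad_hex <- fails_to_parse('"[\\x]"')

# The source text "\0" would embed a nul character, which R strings cannot hold.
bad_nul <- fails_to_parse('"\\0"')

# With the backslash doubled, the source text "[\\x]" parses without error.
ok_doubled <- !fails_to_parse('"[\\\\x]"')

# Backticks quote a name (symbol), not a character string.
backtick_is_name <- is.name(quote(`[a-z]`))
quoted_is_string <- is.character('[a-z]')
```

All of the logical values above should come out TRUE, which is consistent with the two error messages in the question: one from an incomplete "\x" escape and one from an escape that would produce a nul character.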

The R print function (which is what displays the values above) shows each backslash as "\\". A character value of "\\" contains a single character, namely a backslash. If you display it with cat you will see only that character, but there is no cat method for objects of class "data.frame":

> print("\\")
[1] "\\"
> cat("\\")
\
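
The same difference can be seen on a whole column. The sketch below is only an illustration: `demo` is a hypothetical stand-in data frame holding two of the example strings from above. It compares the text produced by printing the data frame with the text produced by `writeLines`, which, like cat, writes the characters as they are stored.

```r
# Stand-in data frame with two of the example strings (hypothetical name `demo`).
demo <- data.frame(
  Examples = c('[[:alnum:][:punct:]\\s]',
               'or \\d: digits, 0 1 2 3 4 5 6 7 8 9, equivalent to [0-9]'),
  stringsAsFactors = FALSE
)

# print() escapes each stored backslash, so it is displayed doubled.
printed <- capture.output(print(demo))

# writeLines() writes the characters as stored: one backslash each.
raw <- capture.output(writeLines(demo$Examples))

printed
raw
```

Here `raw` should match the stored strings exactly, while the lines in `printed` show the doubled backslashes. The doubling is a display convention of print, not a change to the data.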