Vincent Vincent - 1 year ago 400
R Question

Why doesn't subset mind a missing subset argument for data frames?

Normally I wonder where mysterious errors come from but now my question is where a mysterious lack of error comes from.


numbers <- c(1, 2, 3)
frame <-

If I type

subset(numbers, )

(so I want to take some subset but forget to specify the subset-argument of the subset function) then R reminds me (as it should):

Error in subset.default(numbers, ) :
  argument "subset" is missing, with no default

However when I type


subset(frame, )

(so the same thing with a data.frame instead of a vector), it doesn't give an error but instead just returns the (full) dataframe.

What is going on here? Why don't I get my well deserved error message?

lmo lmo
Answer Source

R has a couple of object-oriented systems built in. The simplest and most common is called S3. This OO programming style implements what Wickham calls a "generic-function OO." Under this style of OO, an object called a generic function looks at the class of an object and then applies the proper method to the object. (This is a brief sketch of S3. To get a better idea of how it works, you might check out the relevant portion of the Advanced R site.)
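The dispatch described above can be sketched with a made-up generic. The name describe and both of its methods are hypothetical and exist only for illustration; they are not part of base R.

```r
# Hypothetical generic: UseMethod() picks a method based on class(x)
describe <- function(x, ...) UseMethod("describe")

# Method chosen when x has class "data.frame"
describe.data.frame <- function(x, ...) {
  paste("data frame with", nrow(x), "rows")
}

# Fallback method chosen when no class-specific method exists
describe.default <- function(x, ...) {
  paste("object of length", length(x))
}

res_df  <- describe(data.frame(a = 1:3))  # dispatches to describe.data.frame
res_vec <- describe(c(1, 2, 3))           # dispatches to describe.default
```

The same mechanism decides whether a call to subset ends up in the data.frame method or in the default method.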

The subset function works on this principle. If the first argument to subset is an object with the data.frame class, then R uses the function subset.data.frame. It is defined as below:
function (x, subset, select, drop = FALSE, ...)
{
    r <- if (missing(subset))
        rep_len(TRUE, nrow(x))
    else {
        e <- substitute(subset)
        r <- eval(e, x, parent.frame())
        if (!is.logical(r))
            stop("'subset' must be logical")
        r & !is.na(r)
    }
    vars <- if (missing(select))
        TRUE
    else {
        nl <- as.list(seq_along(x))
        names(nl) <- names(x)
        eval(substitute(select), nl, parent.frame())
    }
    x[r, vars, drop = drop]
}

Note that if the subset argument is missing, the first lines

    r <- if (missing(subset)) 
        rep_len(TRUE, nrow(x))

produce a vector of TRUEs with one element per row of the data.frame, and the last line

    x[r, vars, drop = drop]

feeds this vector into the row argument. This means that if you did not include a subset argument, then the subset function will return all of the rows of the data.frame.
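A small check of this behaviour, as a sketch: the data frame df below is made-up example data, not the frame from the question.

```r
# Hypothetical example data; not the data frame from the question
df <- data.frame(id = 1:4, value = c(10, NA, 30, 40))

# No subset argument: the row selector is rep_len(TRUE, nrow(df)),
# so every row is kept
all_rows <- subset(df)

# With a condition, only rows where it is TRUE are kept; the row where
# the condition evaluates to NA is dropped by the `r & !is.na(r)` step
big <- subset(df, value > 20)
```

Here all_rows has the same four rows as df, while big keeps only the rows with id 3 and 4.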

As your error

Error in subset.default(numbers, )

shows, when you apply subset to a vector, R calls the subset.default method which is defined as

function (x, subset, ...)
{
    if (!is.logical(subset))
        stop("'subset' must be logical")
    x[subset & !is.na(subset)]
}

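Here there is no missing(subset) check. The call is.logical(subset) forces the argument to be evaluated, so an error is thrown when the subset argument is missing.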
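The difference between the two methods comes down to whether the function checks missing() before touching the argument. A minimal sketch with two hypothetical helpers (keep_guarded and keep_unguarded are illustrative names, not base R functions):

```r
# Hypothetical helper mirroring subset.data.frame: checks missing() first
keep_guarded <- function(x, keep) {
  if (missing(keep)) keep <- rep_len(TRUE, length(x))
  x[keep]
}

# Hypothetical helper mirroring subset.default: uses `keep` straight away
keep_unguarded <- function(x, keep) {
  if (!is.logical(keep)) stop("'keep' must be logical")
  x[keep]
}

numbers <- c(1, 2, 3)

# Returns all elements, because the guard supplies a default selector
guarded <- keep_guarded(numbers)

# Capture the error message instead of stopping the script
unguarded_msg <- tryCatch(
  keep_unguarded(numbers),
  error = function(e) conditionMessage(e)
)
```

After running this, unguarded_msg holds the message of the missing-argument error for keep, the same kind of error that subset.default raises for subset.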
