Micah Stubbs - 2 months ago
R Question

Error: could not find function "distinct" when using dplyr library for R on Windows 7

I'd like to get the unique values from a column in a dataframe. With the R package dplyr, it should be possible.

This

distinct(select(dataframe, column))
works great on my Mac. In RStudio on Windows 7 I get the error from the title (could not find function "distinct")

when I run this R code:

library(dplyr)
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))
unique_values <- distinct(select(df, X1))


EDIT

Please check if dplyr::distinct(select(df, X1)) works? – akrun

Of course - here is the console output:

[screenshot: console output]

EDIT

I've not used distinct, but perhaps unique would work for you? unique(df$X1) – NPE

It does work, and it's concise too! I would still like to understand this dplyr error...
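For anyone comparing the two approaches, here is a minimal base-R sketch of what unique() returns. The data frame is hypothetical and filled in by hand (not the random one from the question) so that the result is predictable:

```r
# Hypothetical data frame with hand-picked values (an assumption for illustration).
df <- data.frame(X1 = c(0, 1, 1, 0, 1), X2 = c(1, 1, 0, 0, 1))

# unique() on a column extracted with $ returns a plain vector.
vec <- unique(df$X1)

# unique() on a one-column data frame keeps the data frame structure,
# which is closer in shape to what dplyr's distinct() gives back.
sub <- unique(df["X1"])

print(vec)
print(sub)
```

So unique(df$X1) is the shorter option when a vector is enough, while the data frame form is handy if the result feeds into further data frame operations.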

EDIT

Please add the output of sessionInfo() instead. – Roland

[screenshot: sessionInfo() output]

EDIT

Some comments note that the dplyr_0.2 version is old. install.packages("dplyr") gets a CRAN link to the old package. Now to figure out how to manually install dplyr_0.3.0.2.
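A quick way to confirm this kind of version problem without reading the whole sessionInfo() output is to ask R directly. This is only a sketch: it assumes dplyr may or may not be installed, and it checks the package's exports rather than relying on any particular version number:

```r
# Report the version of R that is currently running.
r_version <- getRversion()
print(r_version)

# Only inspect dplyr if it is actually installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Version of the installed dplyr package.
  dplyr_version <- packageVersion("dplyr")
  print(dplyr_version)

  # TRUE when the installed dplyr exports a function named distinct.
  has_distinct <- "distinct" %in% getNamespaceExports("dplyr")
  print(has_distinct)
}
```

If has_distinct comes back FALSE, the installed dplyr predates distinct(), which matches the error above.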

Answer

Figured it out! Old R means old dplyr means no distinct() function.

To fix this, install the latest version of R:

  1. go to http://www.r-project.org
  2. click on 'CRAN'
  3. then choose the CRAN site that you like. I like Kansas: http://rweb.quant.ku.edu/cran/
  4. click on 'Download R for X' [where X is your operating system]
  5. follow the installation procedure for your operating system
  6. restart RStudio
  7. rejoice

source: this very nice answer

Then run the command install.packages("dplyr") in the RStudio Console.

Now you can create a dataframe and use the distinct() function to get the unique values from one of its columns:

library(dplyr)

# create a dataframe with some values
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))
df

# select a column from that dataframe and get a data frame of its unique values
unique_values <- distinct(select(df, X1))
unique_values

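In the console you should see a one-column data frame holding the unique values of X1. Because the data is sampled randomly, the order of the rows can differ between runs.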
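The same call can also be written with dplyr's pipe operator, which some people find easier to read. This is a sketch under two assumptions: the data frame is a hypothetical hand-filled one (so the result does not depend on random sampling), and dplyr is recent enough to provide distinct():

```r
# Hypothetical data frame with hand-picked values (an assumption for illustration).
df <- data.frame(X1 = c(0, 1, 1, 0), X2 = c(1, 1, 0, 0))

# Only run the dplyr part if the package is installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  # Same as distinct(select(df, X1)), written as a pipeline.
  unique_values <- df %>% select(X1) %>% distinct()
  print(unique_values)
}
```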

Thanks to David Arenburg and Richard Scriven for pointing out that dplyr-0.2 is old and lacks the distinct() function. This line of thinking led to the answer.