Micah Stubbs - 5 months ago 14

R Question

I'd like to get the unique values from a column in a dataframe. With the R package dplyr, it should be possible.

Something like `distinct(select(dataframe, column))` should work, but I get an error when I run this R code:

`library(dplyr)`

`df <- data.frame(replicate(4,sample(0:1,10,rep=TRUE)))`

`unique_values <- distinct(select(df, X1))`

Please check if `dplyr::distinct(select(df, X1))` works.

Of course - here is the console output:

I've not used distinct, but perhaps unique would work for you?

`unique(df$X1)`

It does work, and it's concise too! I would still like to understand this dplyr error...

Please add the output of `sessionInfo()`.

Some comments note that `dplyr_0.2` is old and suggest running `install.packages("dplyr")` to update to `dplyr_0.3.0.2`.

Answer

Figured it out! Old `R` means old `dplyr` means no `distinct()` function.

To fix this, install the latest version of R:

- go to http://www.r-project.org
- click on 'CRAN'
- then choose the CRAN site that you like. I like Kansas: http://rweb.quant.ku.edu/cran/
- click on 'Download R for X' [where X is your operating system]
- follow the installation procedure for your operating system
- restart RStudio
- rejoice

source: this very nice answer

Then run the command `install.packages("dplyr")` in the RStudio Console.
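To confirm that the update took effect, a quick check along these lines may help. This is only a sketch: it assumes that `distinct()` is missing from `dplyr_0.2` and present in later releases, as the comments on the question suggest, and `has_distinct` is just an illustrative variable name.

```r
# Report the R version that is currently running
print(R.version.string)

# Only inspect dplyr if it is installed; requireNamespace() returns FALSE otherwise
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Installed dplyr version (the question had 0.2, which was too old)
  print(packageVersion("dplyr"))

  # TRUE when the installed dplyr exports a distinct() function
  has_distinct <- "distinct" %in% getNamespaceExports("dplyr")
  print(has_distinct)
} else {
  has_distinct <- FALSE
  message("dplyr is not installed; run install.packages(\"dplyr\")")
}
```

If `has_distinct` comes back `FALSE` even after reinstalling, the R installation itself is probably too old to receive a current dplyr build, which is the situation described above.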

Now you can create a dataframe and use the `distinct()` function to get the unique values from one of its columns:

```
library(dplyr)
# create a data frame with four columns of ten random 0/1 values
df <- data.frame(replicate(4, sample(0:1, 10, replace = TRUE)))
df
# select one column and keep only its distinct rows (the result is a data frame)
unique_values <- distinct(select(df, X1))
unique_values
```

In the console you should see:

Thanks to David Arenburg and Richard Scriven for pointing out that dplyr-0.2 is old and lacks the `distinct()` function. This line of thinking led to the answer.
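For comparison, here is a small sketch that puts the base R `unique()` approach from the comments next to the dplyr one. It uses a fixed, made-up data frame (`df2` is hypothetical, not the random one from above) so the result is reproducible. The main difference is that `unique()` on a column returns a plain vector, while `distinct()` returns a data frame.

```r
# Hypothetical data with known contents, so the result does not depend on sample()
df2 <- data.frame(X1 = c(0, 1, 1, 0, 1), X2 = c(1, 1, 0, 0, 1))

# Base R: returns a plain vector, in order of first appearance
base_values <- unique(df2$X1)
print(base_values)  # [1] 0 1

# dplyr: only run this part if the package is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  # Returns a one-column data frame holding the distinct rows
  dplyr_values <- distinct(select(df2, X1))
  print(dplyr_values)

  # The column inside that data frame holds the same values as the base R vector
  print(identical(dplyr_values$X1, base_values))
}
```

Which one to use is mostly a matter of what you need next: a vector for quick inspection, or a data frame to keep working in a dplyr pipeline.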