Marre -4 years ago 113
R Question

# Generating random latitude/longitude data in R

I simulated a dataset for an online retail market. The customers can purchase their products in different stores in Germany (e.g. Munich, Berlin, Hamburg) and in online stores. To get the latitude/longitude data for the cities I use `geocode` from the `ggmap` package. But customers who purchase online are able to do so from all over the country. Now I want to generate random latitude/longitude data within Germany for the online purchases, to map them later with shiny leaflet. Is there any way to do this?

My df looks like this:

``````View(df)
ClientId   Store ... lat   lon
1          Berlin    52    13
2          Munich    48    11
3          Online    x     x
4          Online    x     x
``````

But my aim is a data frame for example like this:

``````ClientId   Store ... lat   lon
1          Berlin    52    13
2          Munich    48    11
3          Online    50    12
4          Online    46    10
``````

Is there any way to generate these random latitude/longitude values and integrate them into my data frame?

Your problem is twofold. First of all, as a newbie to R, you are not yet used to the semantics required to do what you need. Fundamentally, what you are asking to do is:

• First, identify which orders are sourced from Online
• Second, generate a random lat and lon for these orders

First, to identify elements of your data frame which fit a criterion, you use the `which` function. Thus, to find the rows in your data frame which have the Store column equal to "Online", you do:

``````df[which(df$Store=="Online"), ]
``````

To update the lat or lon for a particular row, we need to be able to access the column. To get the values of a particular column, we use `$`. For example, to get the lat values for the online orders you use:

``````df$lat[which(df$Store=="Online")]
``````
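As a small self-contained illustration of both selections, here is a sketch on a toy version of the data frame from the question. The values are made up, `NA` stands in for the missing coordinates, and `online_rows` is just a name chosen for this example:

```r
# Toy data frame mirroring the question; NA marks the missing coordinates
df <- data.frame(
  ClientId = 1:4,
  Store    = c("Berlin", "Munich", "Online", "Online"),
  lat      = c(52, 48, NA, NA),
  lon      = c(13, 11, NA, NA),
  stringsAsFactors = FALSE
)

# Row indices of the online orders
online_rows <- which(df$Store == "Online")
print(online_rows)

# All columns for those rows (note the comma after the row index)
print(df[online_rows, ])

# Only the lat values for those rows
print(df$lat[online_rows])
```

Without the comma, `df[online_rows]` would be interpreted as a column selection rather than a row selection.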

Great! The problem now diverges and increases in complexity. For the new values, do you want to generate simple values to accomplish your demo, or do you want to come up with new logic to generate spatial results in a given region? You indicate you would like to generate data points within Germany itself; accomplishing that, however, is beyond the scope of this question. For now, we will consider the easy example of generating values in a bounding box and updating your `data.frame` accordingly.

To generate integer values in a given range, we can use the `sample` function. Assuming that you want `lat` values in the range of 45 to 55 and `lon` values in the range of 9 to 14, we can do the following:

``````# replace = TRUE allows more draws than there are values in the range
df$lat[which(df$Store=="Online")]<-sample(45:55,length(which(df$Store=="Online")),replace=TRUE)
df$lon[which(df$Store=="Online")]<-sample(9:14,length(which(df$Store=="Online")),replace=TRUE)
``````

Reading this code, we update the `lat` values in `df` that belong to "Online" orders with a vector of random integers from 45:55 that has the proper length (the number of "Online" orders), and we do the same for `lon` with integers from 9:14.
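To see the whole update end to end, here is a minimal sketch on a made-up data frame with eight online orders. The data, the seed, and the helper names `online_rows` and `n_online` are assumptions for illustration only. Note that `sample` draws without replacement by default, so it fails when asked for more values than the range contains (eight draws from the six values in 9:14, for instance); the sketch therefore passes `replace = TRUE`.

```r
set.seed(42)  # only to make the sketch reproducible

# Made-up data: two store orders and eight online orders
df <- data.frame(
  ClientId = 1:10,
  Store    = c("Berlin", "Munich", rep("Online", 8)),
  lat      = c(52, 48, rep(NA, 8)),
  lon      = c(13, 11, rep(NA, 8)),
  stringsAsFactors = FALSE
)

# Compute the row indices once and reuse them
online_rows <- which(df$Store == "Online")
n_online    <- length(online_rows)

# Draw integer coordinates inside the bounding box, with replacement
df$lat[online_rows] <- sample(45:55, n_online, replace = TRUE)
df$lon[online_rows] <- sample(9:14,  n_online, replace = TRUE)

print(df)
```

Storing the indices in a variable avoids repeating the `which` call four times and keeps the two assignments aligned to the same rows.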

If you want more decimal precision, you can use similar logic with the `runif` function, which samples from the uniform distribution, and `round` to get the appropriate amount of precision. Good luck!
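For reference, the `runif` variant might look like the sketch below. It reuses the same bounding box (45 to 55 for `lat`, 9 to 14 for `lon`); the toy data, the seed, and the choice of four decimal places are assumptions made for this example. The points are uniform over the box, not over Germany's actual borders.

```r
set.seed(1)  # only to make the sketch reproducible

# Toy data frame mirroring the question; NA marks the missing coordinates
df <- data.frame(
  ClientId = 1:4,
  Store    = c("Berlin", "Munich", "Online", "Online"),
  lat      = c(52, 48, NA, NA),
  lon      = c(13, 11, NA, NA),
  stringsAsFactors = FALSE
)

online_rows <- which(df$Store == "Online")
n_online    <- length(online_rows)

# runif() draws continuous values; round() trims them to 4 decimal places
df$lat[online_rows] <- round(runif(n_online, min = 45, max = 55), 4)
df$lon[online_rows] <- round(runif(n_online, min = 9,  max = 14), 4)

print(df)
```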
