Andrew Budsock - 7 months ago 36

R Question

Example of My Data

I have three matrix csv data files that I need to flatten and combine in R, so that I have three columns (Lat, Long, Data). The code I have for this is in matlab, but I need to convert this to R. Any thoughts? This is the matlab code that does this:

`LON=csvread(‘LONGITUDE.csv’);`

LAT=csvread(‘LATITUDE.csv’);

SM=csvread(‘soil_moisture20151008.csv’);

xyz=zeros(101*210,3);

k=0;

for i=1:101

for j=1:210

k=k+1;

xyz(k,1)=LAT(i,j);

xyz(k,2)=LON(i,j);

xyz(k,3)=SM(i,j);

end

end

csvwrite(‘xyz.csv’,xyz);

So far this is how I have changed it in R:

`LON<-read.csv("LONGITUDE.csv", header = T)`

LAT<-read.csv("LATITUDE.csv", header = T)

ET<-read.csv("actual_ET20100101.csv")

xyz=matrix(3,101,210)

k=0

for (i in 1:101){

for (j in 1:210){

k=k+1

xyz[k,1]=LAT[i,j]

xyz[k,2]=LON[i,j]

xyz[k,3]=ET[i,j]

}

}

write.csv("xyz.csv",xyz);

I'm not sure what I'm doing wrong. Any guidance on this issue would be greatly appreciated.

Finally, I have a whole directory of files that I need to run this script on, so any ideas on how to apply this to a directory would be great. The LAT/LON files don't change, just the data files.

Thank you!!

Answer

If I am understanding your data correctly, you have a large number of matrix files, where each index (row/column position) is assigned to the same data value. That is, (1,1) in each matrix gives the value of interest for the 1st data point, and (1,2) gives values for a different data point.

In that case, you should just be able to convert them all to a matrix, extract the values as a vector, then stitch them together.

To illustrate, here are three identical data.frames (so that we can see if they align correctly:

```
A <- B <- C <-
data.frame(matrix(runif(36), nrow = 6))
```

Each data.frame is this:

```
X1 X2 X3 X4 X5 X6
1 0.2462450 0.6887587 0.216578122 0.5982332 0.2402868 0.9588999
2 0.5924075 0.7511237 0.813704807 0.6892747 0.6253069 0.4648226
3 0.7482773 0.4808986 0.006036452 0.6576487 0.5752148 0.5554258
4 0.8545323 0.6822942 0.654128179 0.6582181 0.8173544 0.5191778
5 0.1748737 0.7456279 0.992209169 0.4468014 0.3491022 0.9736064
6 0.7189847 0.3424291 0.581840006 0.1460138 0.8071445 0.2920479
```

Then, I put them all in a `list`

(named, so that the columns come out named):

```
myList <- list(A = A, B = B, C = C)
```

Then, we loop through the list, converting each data.frame to a matrix, then extracting the values as a vector. Then, I convert the resulting list to a data.frame to get the column/row behavior you likely want (data.frames are just lists with special properties; each column is an element of the list, but data.frames assumes the value orders match). Note that I am using `magrittr`

/`dplyr`

piping to simplify the nesting in the code:

```
flattened <-
lapply(myList, function(x){
as.matrix(x) %>%
as.numeric()
}) %>%
as.data.frame()
```

Then, the head of this (from my randomization) looks like:

```
A B C
1 0.2462450 0.2462450 0.2462450
2 0.5924075 0.5924075 0.5924075
3 0.7482773 0.7482773 0.7482773
4 0.8545323 0.8545323 0.8545323
5 0.1748737 0.1748737 0.1748737
6 0.7189847 0.7189847 0.7189847
```

Of note, you mentioned that you may have multiple data sources that you want to merge -- as long as you load them all up into this list, the approach will generate a column for each.