R Question

How to get the value at a specific row and column from multiple text files?

I have a folder "data" with 36 text files. Each file has more than 3000 columns and rows. I want to get the value at a specific column and row from each file and collect the results as a vector.

For example, column 10 and row 10. I want to loop over the 36 text files in the folder "data" and get the value at that column and row from each one. I'm new to R.

This is my code in MATLAB:

function data = readImage

data = [];
listImage = ls('*.hdf');

for i = 1:size(listImage,1)
    name = strtrim(listImage(i,:));
    citra = hdfread(name,'PIXEL DATA');
    result = point(citra);
    data = [data; result];
end
end

function p = point(image)

i = 3941; %column number
j = 1595; %row number
p = image(i,j);
end

I have successfully imported the files:

temp = list.files(pattern = "\\.txt$")
for (i in 1:length(temp)) assign(temp[i], read.table(temp[i]))

Answer

If you want to grab a specific row and column from a collection of files, I recommend data.table::fread(). The select argument picks the column, and skip together with nrows lets you grab any number of rows. Try the following to read only row 10, column 10 from each file:

datalist <- lapply(temp, fread, select = 10, skip = 9, nrows = 1)

If each of those files has a header row, use skip = 10 instead of 9, or keep skip = 9 and add header = TRUE. Then you can name each element with

names(datalist) <- paste0("temp", seq_along(datalist)) 

Now you've got a list with named elements that can be accessed with the $ operator by name. This is usually better than assigning them all to the global environment.
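
As a small illustration of that access pattern, here is a sketch that names the elements after the files themselves rather than temp1, temp2, and so on. The file names and the stand-in values are made up for the example.

```r
# Hypothetical file names and stand-in values, for illustration only
temp <- c("scene_a.txt", "scene_b.txt")
datalist <- list(3.2, 4.1)

# Name each element after its file, minus the extension
names(datalist) <- tools::file_path_sans_ext(basename(temp))

datalist$scene_a        # access by name with $
datalist[["scene_b"]]   # or with [[ ]], handy when the name is held in a variable
```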

The list elements in datalist will be data tables, each with one row and one column. If you need plain atomic values instead, the following may be better:

datalist <- lapply(temp, function(x) fread(x, select = 10, skip = 9, nrows = 1)[[1L]])

With this you could use unlist(datalist) to collapse the list into an atomic vector of all the values (named, if the list elements are named), should you not want them in a list.
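
Putting the pieces together for the folder layout described in the question, a minimal sketch might look like the following. It assumes header-less, whitespace-separated numeric files. The demo files, the temporary folder, and the names dir_path, row_i and col_j are invented for illustration.

```r
library(data.table)

# Hypothetical demo data: three small header-less, space-separated files
dir_path <- tempfile("data_")
dir.create(dir_path)
for (k in 1:3) {
  m <- matrix(seq_len(144) + 1000 * k, nrow = 12, ncol = 12)
  write.table(m, file.path(dir_path, sprintf("file%02d.txt", k)),
              row.names = FALSE, col.names = FALSE)
}

# Full paths, so the working directory does not matter
files <- list.files(dir_path, pattern = "\\.txt$", full.names = TRUE)

row_i <- 10L  # row to extract
col_j <- 10L  # column to extract

# vapply returns an atomic vector directly, so no unlist() is needed
values <- vapply(files, function(f) {
  as.numeric(fread(f, select = col_j, skip = row_i - 1L,
                   nrows = 1L, header = FALSE)[[1L]])
}, numeric(1))
names(values) <- basename(files)

values
```

Here values is a named numeric vector with one entry per file. For the real data, dir_path would point at the "data" folder and the demo-file loop would be dropped.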

Another thing to take into consideration is that if the files contain row names, you'll need to compensate for those too, since they typically occupy the first column. If you play around with the select and skip arguments it won't take long to get it right.
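
As a rough sketch of that compensation, assume a file written by write.csv() with row.names = TRUE, so the row names sit in the first column and every data column shifts right by one. The temporary file csv_path is a made-up demo file.

```r
library(data.table)

# Hypothetical demo file: iris written WITH row names in the first column
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, file = csv_path, quote = FALSE, row.names = TRUE)

# Line 4 of the file now reads: 3,4.7,3.2,1.3,0.2,setosa
# Data column 2 is therefore file column 3, so select = 2 + 1.
# skip = 3 skips the header line plus data rows 1 and 2.
value <- fread(csv_path, select = 3, skip = 3, nrows = 1, header = FALSE)[[1L]]

value
iris[3, 2]  # same cell, read from the in-memory data frame
```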

For a full example of these methods, we can look at the following. Here we are grabbing row 3, column 2 from the iris dataset, three times.

library(data.table)

## write iris to a csv file
write.csv(iris, file = "iris.csv", quote = FALSE, row.names = FALSE)

temp <- rep("iris.csv", 3)
## skip = 3 skips the header line plus data rows 1 and 2
datalist <- lapply(temp, function(x) fread(x, select = 2, skip = 3, nrows = 1)[[1L]])
names(datalist) <- paste0("temp", seq_along(datalist))

## results
datalist
# $temp1
# [1] 3.2
#
# $temp2
# [1] 3.2
#
# $temp3
# [1] 3.2

unlist(datalist)
# temp1 temp2 temp3 
#   3.2   3.2   3.2 

## compare to
iris[3, 2]
# [1] 3.2
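
If data.table is not available, a similar result can be sketched in base R with scan(), which also accepts skip and can be limited to a single line with nlines. This is an alternative to the fread() approach above, not part of it. The helper get_cell and its arguments are hypothetical names chosen for this example.

```r
# Hypothetical demo file: iris with a header line and no row names
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, file = csv_path, quote = FALSE, row.names = FALSE)
temp <- rep(csv_path, 3)

# Hypothetical helper: read one line of the file and return one field as text
get_cell <- function(path, row, col, has_header = TRUE, sep = ",") {
  fields <- scan(path, what = "", sep = sep,
                 skip = row - 1 + has_header,  # lines before the wanted row
                 nlines = 1, quiet = TRUE)
  fields[col]
}

# Row 3, column 2 from each file, converted to numeric
values <- as.numeric(vapply(temp, get_cell, character(1), row = 3, col = 2))
values
```

For whitespace-separated files, sep = "" (the scan() default) would be passed instead of a comma.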