Jen Mer - 21 days ago
Perl Question

Passing an array as an argument from a Perl script to a R script

I am new to R and I have a Perl script from which I want to call an R script that calculates something for me (what exactly is not important in this context). I want to pass three arguments: an input file, an array containing some numbers, and the total number of clusters. medoid.r is the name of my R script.

my $R_out;
$R_out = qx{./script/medoid.r $output @cluster $NUMBER_OF_CLUSTERS};


My current R code looks like this. Right now I just print cluster to see what is inside.

args <- commandArgs(TRUE)
filename = args[1]
cluster = as.vector(args[2])
number_of_cluster = args[3]

matrix = read.table(filename, sep='\t', header=TRUE, row.names=1, quote="")
print(cluster)


Is it possible to give an array as an argument? How can I save it in R? Right now only the first number of the array is stored and printed, but I would like to have every number in a vector or something similar.

Answer

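In Perl, qx expects a string. You can certainly use an array to build that string, but what reaches the shell is still just text: qx interpolates @cluster like a double-quoted string, joining its elements with spaces, so each element arrives in R as a separate argument and args[2] holds only the first one. You cannot "pass an array" to a system call; you can only pass command-line text/arguments.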
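To see why only the first number shows up in args[2], it can help to print the string that qx would hand to the shell. The following is a minimal sketch; the sample values for $output, @cluster and $NUMBER_OF_CLUSTERS are assumptions, since the question does not show the real data.

```perl
use strict;
use warnings;

# Hypothetical sample values; the question does not show the real data.
my $output             = 'input.tsv';
my @cluster            = (3, 7, 12);
my $NUMBER_OF_CLUSTERS = 3;

# qx{} interpolates like a double-quoted string, so this is the text
# the shell would receive. Array elements are joined with single spaces.
my $command = "./script/medoid.r $output @cluster $NUMBER_OF_CLUSTERS";
print "$command\n";
# prints: ./script/medoid.r input.tsv 3 7 12 3
```

With these values the R script receives five separate arguments, so args[2] is "3" and args[3] is "7" rather than the cluster count.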

Keep in mind that you are making a system call that runs Rscript as a child process. As you describe it, there is no inter-process communication beyond the command line. Think of it this way: how would you type an array on the command line? You can type a textual representation of an array, but not the array itself. Each language stores and accesses arrays in memory in its own way, so an array cannot simply be handed from one language to another the way you're suggesting.

One solution: all that said, there may be a simple solution for you. You haven't provided any information on the type of data you want to pass in your array. If it is simple enough, you may try passing it on the command line as delimited text, and then break it up to use in your Rscript.

Here is an Rscript that shows you what I mean:

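# Keep only the arguments that follow the script name
args = commandArgs(trailingOnly=TRUE)
filename = args[1]
# strsplit() returns a list holding one character vector per input string
cluster <- strsplit(args[2],"~")

sprintf("Filename: %s",filename)
sprintf("Cluster list: %s",cluster)

print("Cluster:")
cluster

# cluster[[1]] is the character vector of split items
sprintf("First Item: %s",cluster[[1]][1])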

Save it as "test.r" and run it with "Rscript test.r test.txt one~two". You'll get the following output (tested on Rscript 46084, OpenBSD):

[1] "Filename: test.txt"
[1] "Cluster list: c(\"one\", \"two\")"
[1] "Cluster:"
[[1]]
[1] "one" "two"

[1] "First Item: one"

So, all you'd have to do on the Perl side is join() your array using "~" or any other delimiter that cannot occur in your data. The right choice depends heavily on your data, which you haven't provided.
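On the Perl side, that could look roughly like the sketch below. The sample values are assumptions, and the actual qx call is left commented out so the snippet runs without R installed.

```perl
use strict;
use warnings;

# Hypothetical sample values; substitute your own data.
my $output             = 'input.tsv';
my @cluster            = (3, 7, 12);
my $NUMBER_OF_CLUSTERS = 3;

# Collapse the array into a single delimited argument.
my $cluster_arg = join '~', @cluster;

my $command = "./script/medoid.r $output $cluster_arg $NUMBER_OF_CLUSTERS";
print "$command\n";
# prints: ./script/medoid.r input.tsv 3~7~12 3

# To actually run it:
# my $R_out = qx{$command};
```

Now the R script sees exactly three arguments. Since the items are numbers, something like as.numeric(strsplit(args[2], "~")[[1]]) on the R side should turn them back into a numeric vector.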

Summary: rethink how you want to communicate between Perl and Rscript. Consider sending the data as a delimited string (if it's small enough to fit on the command line) and splitting it on the other side. If that won't work, look into IPC, environment variables, or other options. There is no way to send an array reference on the command line.
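As one illustration of the environment-variable option, here is a minimal sketch. The variable name CLUSTER is made up for this example, and a second perl process ($^X) stands in for Rscript so the snippet runs without R.

```perl
use strict;
use warnings;

my @cluster = (3, 7, 12);   # hypothetical sample data

# CLUSTER is a hypothetical variable name; child processes inherit %ENV.
$ENV{CLUSTER} = join ',', @cluster;

# $^X is the running perl binary; it stands in for Rscript here.
# The list form of open starts the child without going through a shell.
open(my $child, '-|', $^X, '-e', 'print $ENV{CLUSTER}')
    or die "Cannot start child process: $!";
my $seen_by_child = <$child>;
close($child) or die "Child process failed: $! $?";

print "$seen_by_child\n";
# prints: 3,7,12
```

In an R script, Sys.getenv("CLUSTER") would return the same string, which could then be split just like a command-line argument.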

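Note: you may want to read up on the security risks of the different ways of making system calls in Perl. qx hands its string to the shell, so any shell metacharacters in interpolated values will be interpreted by the shell.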
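One common way to reduce that risk is to bypass the shell by passing the command as a list. The sketch below assumes a deliberately hostile file name and again uses a second perl process ($^X) as a stand-in for the R script, so it runs without R. The child simply echoes back the arguments it received.

```perl
use strict;
use warnings;

my @cluster  = (3, 7, 12);                   # hypothetical sample data
my $filename = 'input.tsv; echo injected';   # hostile sample value

# Each list element becomes exactly one argument; no shell parses them.
my @command = (
    $^X, '-e', 'print join "|", @ARGV',      # stand-in for ./script/medoid.r
    $filename, join('~', @cluster), 3,
);

open(my $child, '-|', @command)
    or die "Cannot start child process: $!";
my $out = do { local $/; <$child> };         # slurp all child output
close($child) or die "Child process failed: $! $?";

print "$out\n";
# prints: input.tsv; echo injected|3~7~12|3
```

The "; echo injected" part reaches the child as plain text instead of being run as a second command, which is what would happen if the same value were interpolated into qx.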