Colleen - 1 year ago 63
SQL Question

# How do I set one variable equal to another in Pig Latin?

I would like to do:

register s3n://uw-cse344-code/myudfs.jar
-- load the test file into Pig
-- later you will load to other files, example:

-- parse each line into ntriples
ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as (subject:chararray,predicate:chararray,object:chararray);

--filter 1
subjects1 = filter ntriples by subject matches '.*rdfabout\\.com.*' PARALLEL 50;
--filter 2
subjects2 = subjects1;


but I get the error:

2012-03-10 01:19:18,039 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input ';' expecting LEFT_PAREN

So it seems Pig doesn't like that. How do I accomplish this?

I don't think that kind of 'typical' assignment works in Pig. It's not really a programming language in the strict sense; it's a high-level language on top of Hadoop with some specialized functions.

I think you'll need to simply re-project the data from subjects1 to subjects2, such as:

subjects2 = foreach subjects1 generate $0, $1, $2;
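
If the positional references feel opaque, the same re-projection can probably be written with `*` to carry every field over, or with the field names from the schema declared in the question. This is only a sketch and I haven't run it against the original data; `subjects3` is a made-up alias used for illustration:

```pig
-- re-project every field of subjects1 under a new alias
subjects2 = foreach subjects1 generate *;

-- same idea with the fields spelled out
-- (subject, predicate, object are the names declared on ntriples in the question)
subjects3 = foreach subjects1 generate subject, predicate, object;
```

Either alias can then be used downstream in place of subjects1.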


Another approach might be to use the LIMIT operator with some absurdly high parameter:

subjects2 = LIMIT subjects1 100000000;

There could be a lot of reasons why that doesn't make sense, but it's a thought.

I sense you are trying to do things as you would in a programming language. I have found that rarely works out like you want it to, but you can always get the job done once you think like Pig.