iJava - 5 months ago
Linux Question

Linux / Unix — pipe the output to join

I am very new to Linux / Unix, and from time to time I work through some exercises.
I was getting through my exercises until I reached one part.


Plain sort quotes.t5 and pipe the output to join.
In join use field separator, read from stdin and from quotes.comms, output to quotes.t6


The problem is, I don't understand what this part is asking.


A few days ago I ran this command on the server:

wget 'http://finance.yahoo.com/d/quotes.csv?s=BARC.L+0992.HK+RHT+AAPL+ADI+AEIS+AGNC+AMAT+AMGN+AMRN+ARCC+ARIA+ARNA+ATVI+BBRY+BIDU+BRCD+BRCM+BSFT+CENX+CERE+CMCSA+COCO+CSCO+CSIQ+CSOD+CTRP+CTSH+CYTX+DRYS+DTV+DXM+EA+EBAY+EGLE+ENDP+ESRX+EXPD+EXTR+FANG+FAST+FB+FCEL+FITB+FLEX+FOXA+FSLR+FTR+GALE+GERN+GILD+GMCR+GRPN+GTAT+HBAN+HDS+HIMX+HSOL+IMGN+INTC+JASO+JBLU+JDSU+KERX+LINE+LINTA+MDLZ+MNKD+MPEL+MSFT+MU+MXIM+MYL+NFLX+NIHD+NUAN+NVDA+ONNN+ORIG+OTEX+OXBT+PENN+PMCS+PSEC+QCOM+RBCN+REGN+RFMD+RSOL+SCTY+SINA+SIRI+SNDK+SPWR+SYMC+TSLA+TUES+TWGP+TXN+VOLC+WEN+YHOO+ZNGA&f=nab' -O quotes.csv


But the produced file quotes.csv was not good enough to get insight into finances and stuff so I need some help from you!

Checkpointing. When you have finished this lesson, you must get this:

$ sha256sum -c quotesshasums
quotes.t1: OK
quotes.t2: OK
quotes.t3: OK
quotes.t4: OK
quotes.t5: OK
quotes.t6: OK



  • quotes.csv

    We have a source file with stock prices data
    Lines are terminated with CRLF, which is not Unix style. Make them LF terminated.
    This means removing the CR (\r) byte from each line. To do this, use the sed (man sed)
    substitute command and write the output to quotes.t1
    More info at http://en.wikipedia.org/wiki/Newline

  • Run checkpoint to test if quotes.t1 is OK.

  • Use the head and tail commands to output everything except the first and last lines
    of quotes.t1 to quotes.t2

  • Separate the fields with a pipe (vertical bar |) instead of a comma.

    sed -re 's/,([0-9.]+),([0-9.]+)/|\1|\2/g' quotes.t2 > quotes.t3

  • Numeric sort by the third field (key), don't forget the new separator, output to quotes.t4
    Output the last five lines and cut them, leaving the first and third fields in the result;
    output to quotes.t5

  • Plain sort quotes.t5 and pipe the output to join.
    In join use field separator, read from stdin and from quotes.comms, output to quotes.t6
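
For orientation, the earlier steps in this list might look like the sketch below. It runs on a tiny made-up file, sample.csv, rather than the real quotes.csv, and the sample.t1 to sample.t5 names are hypothetical stand-ins for quotes.t1 to quotes.t5. It assumes GNU sed and GNU coreutils; the exact options the exercise expects are not known, so only the checkpoint can confirm them.

```shell
# Hypothetical sample input with CRLF line endings (not the real quotes.csv).
printf 'header\r\n"Apple Inc.",101.50,101.40\r\n"Microsoft Corpora",45.20,45.10\r\n"Intel Corporation",30.00,29.90\r\nfooter\r\n' > sample.csv

# t1: strip the trailing CR from every line (\r in the pattern needs GNU sed).
sed 's/\r$//' sample.csv > sample.t1

# t2: drop the first and the last line (head -n -1 needs GNU head).
tail -n +2 sample.t1 | head -n -1 > sample.t2

# t3: the substitution given in the exercise (commas before numbers become |).
sed -re 's/,([0-9.]+),([0-9.]+)/|\1|\2/g' sample.t2 > sample.t3

# t4: numeric sort on the third |-separated field.
sort -t'|' -k3,3n sample.t3 > sample.t4

# t5: last five lines (all three here), keeping only fields 1 and 3.
tail -n 5 sample.t4 | cut -d'|' -f1,3 > sample.t5
cat sample.t5
```

With the real data, the same commands would read quotes.csv and write quotes.t1 through quotes.t5, and the checkpoint would show whether each intermediate file matches.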




If needed, I can post all parts of this exercise, but I think you may already know what I need to do in this part.
Mainly, I need to know what join means here. I searched on Google about this, but I still don't get it.

Answer

Transferring an abbreviated version of the comments into an answer.

The original version of the question was asking about:

Plain sort quotes.t5 and pipe the output to join. In join use field separator, read from stdin and from quotes.comms, output to quotes.t6

You need to know that join is a command. It can read from standard input if you specify - as one of its two input file names.
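
As an illustration of what join does, here is a minimal sketch on made-up data; the files prices.txt and names.txt and their contents are hypothetical, not part of the exercise. By default join pairs up lines from two sorted inputs that share the same first field, -t sets the field separator, and - stands for standard input.

```shell
# Hypothetical sample data; both inputs must be sorted on the join field.
printf 'MSFT|45.2\nAAPL|101.5\n' > prices.txt
printf 'AAPL|Apple Inc.\nMSFT|Microsoft Corporation\n' > names.txt

# "-" tells join to take its first input from standard input,
# which here is the sorted copy of prices.txt.
sort prices.txt | join -t'|' - names.txt > joined.txt
cat joined.txt
```

Each output line holds the shared first field followed by the remaining fields from the two inputs, so the AAPL line carries both its price and its name.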

The steps are then, it seems to me, quite straightforward:

sort quotes.t5 | join -t'|' - quotes.comms > quotes.t6

or perhaps:

sort quotes.t5 | join -t'|' quotes.comms - > quotes.t6

I'm not sure how you tell which is required, except by interpreting 'read from stdin and quotes.comms' as meaning standard input first and quotes.comms second.
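
The order of the two file arguments does change the result, which a small sketch on made-up data can show. The files a.txt and b.txt here are hypothetical stand-ins for quotes.t5 and quotes.comms. For each match, join prints the join field, then the remaining fields of the first file, then those of the second.

```shell
# Hypothetical sample data, already sorted on the first field.
printf 'AAPL|101.5\nMSFT|45.2\n' > a.txt
printf 'AAPL|buy\nMSFT|hold\n' > b.txt

# Standard input as the first file: its fields come before those of b.txt.
sort a.txt | join -t'|' - b.txt > stdin_first.txt

# Standard input as the second file: the fields of b.txt come first.
sort a.txt | join -t'|' b.txt - > stdin_second.txt

cat stdin_first.txt stdin_second.txt
```

Since the two orderings produce different bytes, running the checkpoint from the question (sha256sum -c quotesshasums) after each attempt should show which one yields quotes.t6: OK.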