Dnaiel - 1 year ago
Linux Question

Count how many lines in large files

Forgive me for such a basic question.

I commonly work with text files of about 20 GB, and I find myself counting the number of lines in a given file very often.

The way I do it now is just cat fname | wc -l, and it takes a very long time. Is there a solution that would be much faster?

I work on a high-performance cluster with Hadoop installed, and I was wondering whether a MapReduce approach could help.

Any ideas?

I'd like the solution to be as simple as a one-line command, like the cat solution, but I'm not sure how feasible that is...


Answer

Try: sed -n '$=' filename

Also, cat is unnecessary: wc -l filename is enough for your current approach.
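
To make the two commands concrete, here is a small sketch you can run on a throwaway file before trying it on the 20 GB ones. The sample file and the variable names (f, n_wc, n_sed) are made up for illustration. I haven't timed either command on your data, so treat this as a correctness check rather than a benchmark.

```shell
# Hypothetical three-line sample file; substitute your own path for "$f".
f=$(mktemp)
printf 'a\nb\nc\n' > "$f"

# wc -l counts newline characters. Reading from stdin keeps the
# filename out of the output, so only the number is captured.
n_wc=$(wc -l < "$f")
n_wc=$((n_wc))   # normalize: some wc implementations pad the number with spaces

# sed -n '$=' suppresses normal output and prints the line number
# of the last line, which is the total line count.
n_sed=$(sed -n '$=' "$f")

echo "wc: $n_wc, sed: $n_sed"
rm -f "$f"
```

One difference to keep in mind: if the file does not end with a newline, wc -l reports one less than sed does, because wc counts newline characters while sed counts the final unterminated line as a line. On an empty file, sed prints nothing at all.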

