patrick patrick - 2 months ago 8
Bash Question

How to select a specific string within a line with sed or awk, without printing the whole line

I want to select a specific string within a line in a big text file with sed or awk. But I always get the whole line, and each line is 100,000+ characters long.

For example, I have:

</div><div class="follow withFollow" id="user-id-1234567890"> <a href="/app/users/id-1234567890/test/ </div><div class="follow withFollow" id="user-id-0123456789"> <a href="/app/users/id-0123456789/test/" 12345678990 1234877890 1234767890 1245456780 123456790 withFollow" id="user-id-9873456789">


The only thing I want is the numbers in:

withFollow" id="user-id-1234567890">, withFollow" id="user-id-0123456789">, withFollow" id="user-id-9873456789">



Desired output:

1234567890

0123456789

9873456789


I tried several things, such as:


sed -n '/user-id-/,/">/p' FILE

awk '/user-id-/,/">/p' FILE

awk '/user-id-/,/">/p' FILE | grep -Eo "[0-9]{1,15}" > output.txt


With the last one I also got the other numbers on the same line, not only the ones within id="user-id-1234567890">

Answer

You could use grep:

$ grep -oP 'user-id-\K[^"]*' file
1234567890
0123456789
9873456789

Or if you only want to match digits:

grep -oP 'user-id-\K\d*' file
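
The -P option needs a grep built with PCRE support (typically GNU grep). Since the question asked for sed or awk, here is a rough sketch of two alternatives that avoid -P. The range form /start/,/end/ used in the question's attempts selects whole lines, so it cannot isolate part of a line; these versions work on the matched text instead. The function names are made up for illustration, and the sketch assumes the IDs are always digits directly after "user-id-".

```shell
# Hypothetical helper: grep -o prints each match on its own line,
# then sed strips the fixed "user-id-" prefix.
# -h suppresses file name prefixes when several files are given.
extract_ids_grep() {
  grep -ohE 'user-id-[0-9]+' "$@" | sed 's/^user-id-//'
}

# Hypothetical helper: awk only, walking along each line with match().
# RSTART and RLENGTH describe the current match; 8 is the length
# of the "user-id-" prefix that gets skipped when printing.
extract_ids_awk() {
  awk '{
    rest = $0
    while (match(rest, /user-id-[0-9]+/)) {
      print substr(rest, RSTART + 8, RLENGTH - 8)
      rest = substr(rest, RSTART + RLENGTH)
    }
  }' "$@"
}
```

Either one can be called as extract_ids_grep file or extract_ids_awk file, and both also read standard input when no file is given.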