Linux Question

Trying to grep url with random ending

The URL always ends with 8 random characters.

I can easily grep for https://websitef.com/ with:

grep https://websitef.com/ test.txt

but I can't figure out how to get the 8 random characters that come after it.

This is what it looks like in the file:

..."num_comments": 16, "url": "https://websitef.com/vkl6owav", "_has_fetched": true.....

Answer Source

If your input is JSON, you may want to consider using a JSON-specific tool.
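As a rough sketch of the JSON-tool route, assuming jq is installed and that each record is a complete JSON object with a top-level "url" key (the real file is elided in the question, so the object below is a hypothetical stand-in):

```shell
# Hypothetical complete JSON object; the actual file contents are truncated
# in the question, so this structure is an assumption.
json='{"num_comments": 16, "url": "https://websitef.com/vkl6owav", "_has_fetched": true}'

if command -v jq >/dev/null 2>&1; then
  # -r prints the raw string without quotes; split on "/" and keep the last piece.
  url_code=$(printf '%s\n' "$json" | jq -r '.url | split("/") | .[-1]')
  echo "$url_code"
else
  echo "jq is not installed" >&2
fi
```

If the "url" key is nested deeper in the real data, the jq filter would need to be adjusted to match.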

Let's consider your test file:

$ cat file
..."num_comments": 16, "url": "https://websitef.com/vkl6owav", "_has_fetched": true.....    

To grep the string that you want:

$ grep -Po '(?<=https://websitef.com/)\w+' file

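\w+ matches a string of word characters (letters, digits, and underscores). (?<=https://websitef.com/) is a look-behind that restricts the match to characters that follow the string https://websitef.com/. The -o option prints only the matched text, and -P enables Perl-compatible regular expressions, which is a GNU grep extension.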
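Since the question says the ending is always 8 characters, the look-behind pattern can be tightened. The following is a minimal sketch, assuming GNU grep with -P support; the temporary file and the variable names are only for illustration:

```shell
# Hypothetical sample file reproducing the line from the question.
sample=$(mktemp)
printf '%s\n' '..."num_comments": 16, "url": "https://websitef.com/vkl6owav", "_has_fetched": true.....' > "$sample"

# \. matches a literal dot (a bare . would match any character);
# \w{8} matches exactly eight word characters after the look-behind.
code=$(grep -Po '(?<=https://websitef\.com/)\w{8}' "$sample")
echo "$code"

rm -f "$sample"
```

For the sample line this prints vkl6owav.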

If GNU grep is not available, sed can be used:

$ sed -En 's|.*https://websitef.com/([[:alnum:]]+).*|\1|p' file

If you want the whole URL, not just the random string:

$ grep -o 'https://websitef.com/[[:alnum:]]*' file
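The whole-URL match can also be used to get the random string without -P, by splitting the result on "/". This is only a sketch; the temporary file and variable names are assumptions made for illustration:

```shell
# Hypothetical sample file reproducing the line from the question.
sample=$(mktemp)
printf '%s\n' '..."num_comments": 16, "url": "https://websitef.com/vkl6owav", "_has_fetched": true.....' > "$sample"

# Extract the full URL; the dots are escaped so they match literally.
url=$(grep -o 'https://websitef\.com/[[:alnum:]]*' "$sample")

# Split on "/": field 1 is "https:", field 2 is empty, field 3 is the host,
# and field 4 is the random string.
suffix=$(printf '%s\n' "$url" | cut -d/ -f4)

echo "$url"
echo "$suffix"

rm -f "$sample"
```

Because cut works line by line, this also handles files where grep -o finds more than one URL.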
