natehome - 2 months ago
Bash Question

Multiple cURL downloads with 404 detection

I'm trying to write a shell script that downloads files with cURL and detects whether a URL returns a 404 error. If a URL is a 404, I want to save the URL (or the filename) to a text file.

The URL format is http://server.com/somefile[00-31].txt

I've been experimenting with what I've found on Google and currently have the following code:

#!/bin/bash

if curl --fail -O "http://server.com/somefile[00-31].txt"
then
    echo "$var" >> "files.txt"
else
    echo "Saved!"
fi

Answer
#!/bin/bash

URLFORMAT="http://server.com/somefile%02d.txt"

for num in {0..31}; do
    # build the URL and attempt the download
    url=$(printf "$URLFORMAT" "$num")

    # --fail makes curl exit non-zero on HTTP errors; capture stderr
    if ! CURL=$(curl --fail -O "$url" 2>&1); then
        # on a 404, record the failing URL
        echo "$CURL" | grep --quiet 'The requested URL returned error: 404' \
            && echo "$url" >> "files.txt"
    fi
done
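
If you prefer to check the HTTP status code directly instead of grepping curl's error text (whose wording can vary between curl versions), here is a sketch using curl's --write-out option, assuming the same URL pattern as above:

#!/bin/bash

URLFORMAT="http://server.com/somefile%02d.txt"

for num in {0..31}; do
    url=$(printf "$URLFORMAT" "$num")

    # -w '%{http_code}' prints the status code on stdout after the
    # transfer; -s hides the progress meter, --fail keeps error pages
    # from being written, and -O saves successes under the remote name
    code=$(curl -s --fail -O -w '%{http_code}' "$url")

    [ "$code" = "404" ] && echo "$url" >> "files.txt"
done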

Here is a version that takes the URLs as arguments:

#!/bin/sh

for url in "$@"; do
    # --fail makes curl exit non-zero on HTTP errors; capture stderr
    if ! CURL=$(curl --fail -O "$url" 2>&1); then
        # on a 404, record the failing URL
        echo "$CURL" | grep --quiet 'The requested URL returned error: 404' \
            && echo "$url" >> "files.txt"
    fi
done

You can use it like this: sh download-script.sh http://server.com/files{00..21}.png http://server.com/otherfiles{00..12}.gif

The brace-range expansion is performed by the calling shell, so it requires bash (zero-padded ranges like {00..21} need bash 4 or later).
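
For example, you can preview what the calling shell will pass to the script by echoing the expansion (hypothetical URLs):

$ echo http://server.com/files{00..03}.png
http://server.com/files00.png http://server.com/files01.png http://server.com/files02.png http://server.com/files03.png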
