Josh Whittington - 10 months ago
Python Question

wget: How do I specify both --directory-prefix AND --output-document

When I use either the -P (--directory-prefix) or the -O (--output-document) option alone with wget, everything works as advertised.

$: wget -P "test"
Saving to: `test/logo3w.png'


$: wget -O "google.png"
2012-01-23 21:47:33 (1.20 MB/s) - `google.png' saved [7007/7007]

However, combining the two causes wget to ignore -P:

$: wget -P "test" -O "google.png"
2012-01-23 21:47:51 (5.87 MB/s) - `google.png' saved [7007/7007]

I've set a variable for both the directory (generated by the last chunk of the URL) and the filename (generated through a counting loop) such that
, or, for item 1,

When I substitute this into the code:

Popen(['wget', '-O', file, theImg], stdout=PIPE, stderr=STDOUT)

wget silently fails (on each iteration of the loop).

When I turn on debugging and logging (-a log.log), each iteration prints:

DEBUG output created by Wget 1.13.4 on darwin10.8.0.

When I remove the -O option, the operation proceeds normally.

My question is:
Is there a way to

A) Specify both -P and -O (preferred) or

B) Pass -O a string containing / characters without causing it to fail?

Any help would be appreciated.

Answer

You should just pass dir/000.jpg to -O of wget:

import subprocess
import os.path

subprocess.Popen(['wget', '-O', os.path.join(directory, filename), theImg])

It's not completely clear from your question whether you were already doing something similar to this, but if you were and it still failed, I can think of two reasons:

  • The argument to -O contains a leading /, which makes the path absolute. wget then fails because it usually doesn't have permission to write directly under / (root).

  • The directory you're telling wget to write to doesn't exist. You can make sure it exists by creating it first with os.mkdir (or os.makedirs for nested directories) from the Python standard library.
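Both checks can be folded into a small helper. The following is only a sketch under assumptions: the names prepare_output_path and wget_command are hypothetical (not from the question), Python 3.2+ is assumed for exist_ok, and stripping a leading / is one possible way to handle that case, not necessarily what you want if you really do mean an absolute path.

```python
import os


def prepare_output_path(directory, filename):
    """Create `directory` if needed and return the path to pass to wget -O.

    Hypothetical helper: the name and behavior are illustrative only.
    """
    # A leading "/" makes the path absolute (rooted at /); strip it so the
    # download stays relative to the current working directory.
    directory = directory.lstrip("/")
    if directory:
        # makedirs also creates missing parent directories; exist_ok=True
        # avoids an error when the directory already exists (Python 3.2+).
        os.makedirs(directory, exist_ok=True)
    return os.path.join(directory, filename)


def wget_command(url, directory, filename):
    """Build the argument list for Popen; nothing is executed here."""
    return ["wget", "-O", prepare_output_path(directory, filename), url]
```

The returned list can be handed to subprocess.Popen in place of the hand-built ['wget', '-O', ...] list, so the directory is guaranteed to exist before wget runs.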

You can also try removing the arguments stdout= and stderr= from the Popen call so you can see the errors directly, or print them using Python.
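If you would rather keep capturing the output, one option is to wait for the process and print whatever it wrote whenever the exit code is non-zero. This is a minimal sketch, assuming Python 3.5+ for subprocess.run; run_and_report is a hypothetical name, and cmd stands for whatever argument list you pass to wget.

```python
import subprocess


def run_and_report(cmd):
    """Run `cmd`, wait for it, and return (exit_code, combined_output).

    Hypothetical helper for illustration; `cmd` is an argument list such as
    ['wget', '-O', path, url].
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout, as in the question
        universal_newlines=True,   # decode the output from bytes to str
    )
    if result.returncode != 0:
        # Surface the failure instead of letting it pass silently.
        print("command failed with exit code", result.returncode)
        print(result.stdout)
    return result.returncode, result.stdout
```

Because the call waits for the process to finish, a missing directory or permission problem shows up immediately in the loop rather than being swallowed by the pipe.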