John John - 1 year ago 193
Python Question

How do I import a local Python module when running a Python script in Oozie?

I have two Python files. The first file references the second:

from my_python_B import *

I'm executing the first Python file from a shell action in Oozie (the shell script simply runs that file), and am receiving the following error:

Traceback (most recent call last):
  File "", line 2, in <module>
    from my_python_B import *
ImportError: No module named my_python_B
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]

Both python files are located under the same directory in HDFS. How can I get this import statement to work?

Answer Source

I faced the same issue and worked around it by setting the PYTHONPATH environment variable to the current working directory inside the shell script, before running my Python code:

export PYTHONPATH=`pwd`
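If you would rather not depend on the wrapper script's environment, roughly the same effect can be had from inside the first Python file. A likely reason the import fails is that Oozie exposes each shipped file in the working directory as a symlink, while Python puts the symlink-resolved script directory on `sys.path`, so the sibling module is not found there. The sketch below is an alternative to the `export` line, not what I actually ran; `add_cwd_to_path` is a hypothetical helper name.

```python
import os
import sys


def add_cwd_to_path():
    """Put the current working directory at the front of sys.path (once)."""
    cwd = os.getcwd()
    if cwd not in sys.path:
        sys.path.insert(0, cwd)
    return cwd


# Run this at the top of the first script, before the sibling import.
add_cwd_to_path()
# from my_python_B import *   # the import from the question goes after the call
```

This assumes the shell action starts Python from the directory that holds the shipped files, which is the same assumption `export PYTHONPATH=`pwd`` makes.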

Make sure that your shell action lists every required Python module inside <file></file> tags. Assuming the commands above live in a shell script, your workflow.xml file should look something like this:

<workflow-app name="shellTest" xmlns="uri:oozie:workflow:0.4">
    <start to="shell-action"/>
    <action name="shell-action">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <!-- job-tracker, name-node, the <exec> for the shell script, and one <file> per script and Python module go here -->
        </shell>
        <ok to="end"/>
        <error to="shell-action-failed"/>
    </action>
    <kill name="shell-action-failed">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
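If the import still fails, it can help to confirm that the modules actually arrived in the action's working directory. Here is a minimal diagnostic sketch you could run before the import; `missing_modules` is a hypothetical helper, and it only looks for plain `.py` files (not packages or `.pyc` files).

```python
import os


def missing_modules(module_names, directory=None):
    """Return the module names that have no matching .py file in `directory`."""
    directory = directory or os.getcwd()
    return [
        name for name in module_names
        if not os.path.isfile(os.path.join(directory, name + ".py"))
    ]


if __name__ == "__main__":
    # my_python_B is the module from the question; extend the list as needed.
    absent = missing_modules(["my_python_B"])
    if absent:
        print("Not found in the working directory: " + ", ".join(absent))
```

Anything it reports is probably missing a <file> entry in the shell action.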