Matthias J. Sax - 1 month ago
Python Question

How to run WordCountTopology from storm-starter in Intellij

I have been working with Storm for a while, but now want to get started with development. As suggested, I am using Intellij (up to now, I was using Eclipse and only wrote topologies against the Java API).

I was also looking at
https://github.com/apache/storm/tree/master/examples/storm-starter#intellij-idea

This documentation is not complete. At first, I was not able to run anything in Intellij. I figured out that I need to remove the scope of the storm-core dependency (in the storm-starter pom.xml). (Found here: storm-starter with intellij idea,maven project could not find class)

After that, I was able to build the project. I can also run ExclamationTopology without problems within Intellij. However, WordCountTopology fails.

First I got the following error:


java.lang.RuntimeException: backtype.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:
Traceback (most recent call last):
  File "splitsentence.py", line 16, in <module>
    import storm
ImportError: No module named storm


Update: installing python-storm is not required to make it work.


I was able to resolve it via: apt-get install python-storm (from StackOverflow)

However, I don't speak Python and was wondering what the problem is and why I could resolve it like this. I just want to understand it better. Maybe someone can explain.


Unfortunately, I am getting a different error now:


java.lang.RuntimeException: backtype.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:
Traceback (most recent call last):
  File "splitsentence.py", line 18, in <module>
    class SplitSentenceBolt(storm.BasicBolt):
AttributeError: 'module' object has no attribute 'BasicBolt'
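
An error like this suggests that Python found some module named storm, but not one that defines BasicBolt. One way to investigate is to check which file is actually loaded for `import storm`. The sketch below is a generic diagnostic only; the helper name `describe_module` is made up for illustration and is not part of Storm.

```python
import importlib
import importlib.util


def describe_module(name, attr):
    """Return (origin, has_attr) for a module, or (None, False) if not importable.

    origin is the file the module is loaded from; has_attr tells whether
    the loaded module defines the given attribute.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # No module of that name anywhere on sys.path
        return None, False
    module = importlib.import_module(name)
    return spec.origin, hasattr(module, attr)


if __name__ == "__main__":
    origin, has_basic_bolt = describe_module("storm", "BasicBolt")
    print("storm is loaded from:", origin)
    print("defines BasicBolt:", has_basic_bolt)
```

If the reported file is not the storm.py shipped with Storm's multilang resources, a different module with the same name is shadowing it.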


I did not find any solution on the Internet. Asking at dev@storm.apache.org did not help either. I got the following suggestion:


I think it was always assumed that a topology would be invoked through the storm command line. Thus, the working directory would be ${STORM-INSTALLATION}/bin/storm. Since storm.py is in this directory, splitSentence.py would be able to find the storm modules. Can you set the working directory to a path where storm.py is present and then try again? If it works, we can add it to the documentation later.


However, changing the working directory did not solve the problem.

As I am not familiar with Python and am new to Intellij, I am stuck now. Because ExclamationTopology runs, I guess my basic setup is correct.

What am I doing wrong? Is it possible at all to run WordCountTopology in a LocalCluster in Intellij?

Answer

Unfortunately, AFAIK you can't use the multilang feature with LocalCluster without a packaged jar file.

ShellProcess relies on the codeDir of TopologyContext, which is used by the supervisor. Workers are serialized to stormcode.ser, but multilang files have to be extracted outside of the serialized file so that python/ruby/node/etc. can load them.

Accomplishing this in distributed mode is easy because there is always a user-submitted jar, and the supervisor knows that it is what the user submitted.

But accomplishing this in local mode is not easy, because the supervisor cannot know which jar the user submitted, and users can run a topology in local mode without packaging it at all.

So, the supervisor in local mode looks for a resource directory ("resources") in each jar on the classpath (each entry ending with "jar"), and copies the first occurrence to codeDir.
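
The lookup described above can be sketched roughly as follows. This is only an illustration of the described behaviour, not Storm's actual implementation; the function name `first_jar_with_resources` and its parameters are made up for the example.

```python
import zipfile


def first_jar_with_resources(classpath_entries, resources_dir="resources"):
    """Return the first classpath entry that is a jar containing resources_dir.

    Only entries ending in "jar" are considered, and the first match wins.
    Returns None if no entry matches.
    """
    prefix = resources_dir.rstrip("/") + "/"
    for entry in classpath_entries:
        if not entry.endswith("jar"):
            continue  # e.g. an IDE output directory such as target/classes
        if not zipfile.is_zipfile(entry):
            continue  # missing file or not a readable archive
        with zipfile.ZipFile(entry) as jar:
            if any(name.startswith(prefix) for name in jar.namelist()):
                return entry
    return None
```

Under this logic, a classpath that contains only plain class directories (as an IDE run configuration might produce) yields no match, so nothing would be copied to codeDir.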

storm jar places the user's topology jar first on the classpath, so the topology can be run without issue.

So normally, one would expect ShellProcess not to find "splitsentence.py" at all. Maybe your working directory or PYTHONPATH did the trick.
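
The working directory and PYTHONPATH affect a Python subprocess differently: when a script is started as `python script.py`, Python puts the script's own directory (not the working directory) at the front of its module search path, and adds the PYTHONPATH entries after it. The sketch below shows a way to try both settings on a script. The helper `run_script` is hypothetical and only demonstrates standard Python behaviour, not how ShellProcess launches its subprocess.

```python
import os
import subprocess
import sys


def run_script(script_path, cwd=None, pythonpath=None):
    """Run a Python script in a subprocess and return (returncode, stderr).

    script_path should be absolute if cwd is given. cwd sets the child's
    working directory; pythonpath, if given, is exported as PYTHONPATH so
    the child can import modules from that directory.
    """
    env = dict(os.environ)
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    result = subprocess.run(
        [sys.executable, script_path],
        cwd=cwd,
        env=env,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr
```

Running a script whose imports live in another directory typically fails when only `cwd` points there, and succeeds when `pythonpath` does.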
