This is the first time I'm working with Apache Storm, and I have the following problem. My application requires a different topology graph for each user, and a single user can also have multiple topology graphs.
Therefore, I had the idea to create the topology graph dynamically using the TopologyBuilder. For example, using the topology example from Storm, this would just be:
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("1", new TestWordSpout(true), 5);
builder.setSpout("2", new TestWordSpout(true), 3);
builder.setBolt("3", new TestWordCounter(), 3)
        .fieldsGrouping("1", new Fields("word"))
        .fieldsGrouping("2", new Fields("word"));
builder.setBolt("4", new TestGlobalCount())
        .globalGrouping("1");
Map defaultConf = Utils.readStormConfig();
Map conf = new HashMap();
conf.put(Config.NIMBUS_HOST, "IP to my remote cluster");
Submitting to a local cluster:
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("mytopology", conf, builder.createTopology());
Submitting to the remote cluster with StormSubmitter instead:
StormSubmitter.submitTopology("mytopology", conf, builder.createTopology());
fails with the following exception:
java.lang.RuntimeException: Must submit topologies using the 'storm' client script so that StormSubmitter knows which jar to upload.
If you submit a topology to a remote cluster, the code (i.e., the class files) of all spouts and bolts it uses must be available on every node in the cluster. That is the purpose of the jar file you submit: it has to contain all those class files. Internally, Storm's Nimbus distributes this jar to all worker nodes to make the code available to them.
The jar only needs to contain the classes you want to use (in your case TestWordSpout, TestWordCounter, and TestGlobalCount, plus any dependent classes those three rely on, for example from some other library). Note that nested jars are not supported, i.e., a jar contained in a jar does not work. In that case, you need to extract the classes of the inner jar first and add them directly to the final jar.
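As an illustration of the "extract the inner jar and add its classes directly" step, here is a minimal sketch that uses only the JDK's java.util.jar classes. The class name JarFlattener and its methods are hypothetical, not part of Storm. It assumes that dropping everything under META-INF/ (manifests, signatures, service files) is acceptable for your jar, and it keeps the first copy of any duplicated entry name. In practice, a build-tool plugin that produces a flat "fat" jar may be the more convenient route.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper: rewrites a jar so that classes from nested jars
// end up as top-level entries of the output jar.
class JarFlattener {

    /** Flattens {@code input} into {@code output}; returns the entry names written. */
    static Set<String> flatten(Path input, Path output) throws IOException {
        Set<String> written = new LinkedHashSet<>();
        try (JarInputStream in = new JarInputStream(Files.newInputStream(input));
             JarOutputStream out = new JarOutputStream(Files.newOutputStream(output))) {
            copyEntries(in, out, written);
        }
        return written;
    }

    private static void copyEntries(JarInputStream in, JarOutputStream out,
                                    Set<String> written) throws IOException {
        byte[] buffer = new byte[8192];
        JarEntry entry;
        while ((entry = in.getNextJarEntry()) != null) {
            String name = entry.getName();
            // Assumption: directory entries and META-INF metadata are not needed.
            if (entry.isDirectory() || name.startsWith("META-INF/")) {
                continue;
            }
            if (name.endsWith(".jar")) {
                // Nested jar: read it straight from the current entry's bytes and
                // copy its contents. The inner stream is deliberately not closed,
                // because closing it would also close the outer stream.
                copyEntries(new JarInputStream(in), out, written);
                continue;
            }
            if (!written.add(name)) {
                continue; // duplicate name: keep the first copy
            }
            out.putNextEntry(new JarEntry(name));
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            out.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        Set<String> names = flatten(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Wrote " + names.size() + " entries to " + args[1]);
    }
}
```

Running it as `java JarFlattener.java nested.jar flat.jar` (both file names are placeholders) would produce a jar whose entries are all plain files, which is the form the answer describes.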
The structure of the topology is completely independent of the jar file. And yes, this is the jar you specify via the system property (storm.jar). The reason many people build a jar that contains a main together with a topology definition (which is often static but could be flexible, too) is that they submit the topology not from an IDE, as you do, but from the command line via bin/storm. For this to work, the jar must contain an entry-point class with a main method that assembles the topology structure. The same jar is then also used to distribute the class files, because this is more convenient than providing a single entry-point class plus an additional jar file.
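For submission from an IDE, the system property has to be set before StormSubmitter is called. The sketch below shows only that step with plain JDK calls, so it runs without Storm on the classpath. It assumes a Storm release in which StormSubmitter reads the storm.jar system property (as far as I know, the exception quoted in the question is thrown when that property is unset). The class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: points the "storm.jar" system property at a jar on disk.
class StormJarProperty {

    /** Checks that the jar exists, then sets "storm.jar" to its absolute path. */
    static Path pointStormAt(String jarPath) {
        Path jar = Paths.get(jarPath).toAbsolutePath();
        if (!Files.isRegularFile(jar)) {
            throw new IllegalArgumentException("No jar file at " + jar);
        }
        System.setProperty("storm.jar", jar.toString());
        return jar;
    }

    public static void main(String[] args) {
        Path jar = pointStormAt(args[0]);
        System.out.println("storm.jar = " + jar);
        // With Storm on the classpath, the submission from the question would follow:
        // StormSubmitter.submitTopology("mytopology", conf, builder.createTopology());
    }
}
```

The jar passed in only has to carry the spout and bolt classes. The topology itself can still be assembled dynamically in the IDE-launched program, since its structure is independent of the jar.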