Friday, November 21, 2014

Set up Storm in a cluster

To set up Storm in a cluster, suppose we have the following computers networked together:

192.168.2.1
192.168.2.2
192.168.2.3
192.168.2.4

Let's say we use computers 192.168.2.2 and 192.168.2.4 to run ZooKeeper, which maintains state and coordinates communication for the Storm cluster. We also use 192.168.2.1 as the Storm master node (i.e., the nimbus host) and 192.168.2.2 and 192.168.2.3 as the Storm worker nodes (i.e., the supervisor nodes). Furthermore, let's suppose we work on 192.168.2.4 as a client which submits Storm topologies to the Storm cluster.

First we need to set up ZooKeeper on the cluster consisting of 192.168.2.2 and 192.168.2.4. Follow this link for instructions on setting up ZooKeeper in a cluster: http://czcodezone.blogspot.sg/2014/11/setup-zookeeper-in-cluster.html

Next, on computers 192.168.2.1, 192.168.2.2, 192.168.2.3, and 192.168.2.4, which form the Storm cluster, set up Storm using the instructions for a single-machine setup from this link: http://czcodezone.blogspot.sg/2014/11/setup-storm-on-single-machine-running.html

Once Storm has been set up on the 4 computers, on each computer navigate to the STORM_HOME/conf directory and edit storm.yaml using the following commands:

> cd $STORM_HOME/conf
> gedit storm.yaml

In storm.yaml, modify the settings as follows:

storm.zookeeper.servers:
  - "192.168.2.2"
  - "192.168.2.4"
storm.zookeeper.port: 2181
nimbus.host: "192.168.2.1"
storm.local.dir: "/tmp/storm-data"
java.library.path: "/usr/local/lib"
storm.messaging.transport: backtype.storm.messaging.netty.Context
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703

Save and close storm.yaml on each of the Storm cluster computers. Now on the client computer (i.e., 192.168.2.4), create a .storm folder in the user's home folder and copy the updated STORM_HOME/conf/storm.yaml into it:
> cd $HOME
> mkdir .storm
> cp $STORM_HOME/conf/storm.yaml $HOME/.storm

Now on the master node computer 192.168.2.1, run the following command to start the master node:

> cd $STORM_HOME
> bin/storm nimbus

Now on the two supervisor node computers (192.168.2.2 and 192.168.2.3), run the following commands to start the worker nodes:
> cd $STORM_HOME
> bin/storm supervisor
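
Before moving on, it can help to confirm that the daemons are actually listening. The following is a minimal Python sketch, not part of the original setup: it tries a TCP connection to each ZooKeeper server on port 2181 (taken from storm.yaml) and to the nimbus host on port 6627, which is assumed here to be the default nimbus.thrift.port since the configuration does not override it. The names port_open, CHECKS and report are made up for this illustration.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


# Hosts and the ZooKeeper port come from storm.yaml; 6627 is an assumption
# (the default nimbus.thrift.port, which this setup does not set explicitly).
CHECKS = [
    ("zookeeper", "192.168.2.2", 2181),
    ("zookeeper", "192.168.2.4", 2181),
    ("nimbus", "192.168.2.1", 6627),
]


def report(checks=CHECKS, probe=port_open):
    """Return one status line per (role, host, port) entry."""
    lines = []
    for role, host, port in checks:
        state = "reachable" if probe(host, port) else "NOT reachable"
        lines.append("%s %s:%d %s" % (role, host, port, state))
    return lines
```

Calling print("\n".join(report())) from the client computer prints one line per service. The supervisors are easier to confirm from the Storm UI.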

At this point, Storm is set up and running. To view the status of the Storm cluster, choose one of the computers (or the client computer) and run the following commands to start the Storm web UI:

> cd $STORM_HOME
> bin/storm ui

Now navigate to http://localhost:8080 in that computer's web browser; you should see the Storm cluster status.
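
With the configuration used here, the UI should list two supervisors (192.168.2.2 and 192.168.2.3) with four slots each, i.e. eight slots in total. As a rough sketch, the same check can be scripted in Python, assuming a Storm release whose UI serves the REST endpoint /api/v1/cluster/summary (older releases may not have it) and that the response carries the fields supervisors and slotsTotal. The function names and the default expected counts are illustrative, not part of Storm.

```python
import json
from urllib.request import urlopen


def fetch_summary(base_url="http://localhost:8080"):
    """Fetch the cluster summary JSON from the Storm UI (assumes the REST API is available)."""
    with urlopen(base_url + "/api/v1/cluster/summary", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_summary(summary, supervisors=2, slots_per_supervisor=4):
    """Compare a summary dict against the cluster layout described in this post."""
    problems = []
    expected_slots = supervisors * slots_per_supervisor
    if summary.get("supervisors") != supervisors:
        problems.append("expected %d supervisors, UI reports %r"
                        % (supervisors, summary.get("supervisors")))
    if summary.get("slotsTotal") != expected_slots:
        problems.append("expected %d slots, UI reports %r"
                        % (expected_slots, summary.get("slotsTotal")))
    return problems
```

For example, check_summary(fetch_summary()) returns an empty list when both supervisors have registered with all four slots, and a list of mismatch messages otherwise.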

Now we are ready to submit a Storm topology. On the client computer 192.168.2.4, run the following commands to submit the topology:
> cd $STORM_HOME
> bin/storm jar [fullPathToTopologyJarFile] [mainClassInJar] [arg0] [arg1] ...

Now navigate back to http://localhost:8080; you should see the submitted topology running in the cluster.
