You are supposed to use JDK 1.6.x, but I had left 1.7.x in place. Apache Hadoop components have bugs with JDKs other than 1.6.x; this matters mostly for Apache Zookeeper compatibility. I have no information on the other Storm components and how the JDKs interact with different versions of Clojure.
[dc@vivian-y1639vf3 ~]$ java -version
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build
1.7.0_07-b10)
Java HotSpot(TM) 64-Bit Server VM
(build 23.3-b01, mixed mode)
Remove JDK 7 and OpenJDK. While these JDKs may work, there are known issues with OpenJDK and Hadoop. Since Zookeeper is a Hadoop component it wasn't worth the time investment to figure out whether they were compatible. Same with JDK 7.
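To see which JDKs are currently on the system before removing anything, something like this works (a sketch; the exact package names vary by system):
>rpm -qa | grep -i -e jdk -e java
>sudo yum remove java-1.7.0-openjdk
Packages installed outside yum can be removed with rpm -e using the names the query returns.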
Download java-6u34-linux-x64.bin, then:
>sudo chmod 777 java-6u34-linux-x64.bin
>./java-6u34-linux-x64.bin
Most Hadoop components only run on JDK 6, not JDK 7, which is the more recent version. Zookeeper is a Hadoop-related component; it may work on JDK 7, but the core Hadoop programs like HDFS haven't been debugged on JDK 7 yet. Most Hadoop-related components also aren't guaranteed to run on OpenJDK. There are known issues with OpenJDK which nobody is actively working on debugging.
Once the Java 6 JDK is installed, edit .bashrc to set the environment variables:
>cd
>nano .bashrc
If nano isn't installed:
>sudo yum install nano
Edit .bashrc to add JAVA_HOME and add $JAVA_HOME/bin to PATH. My .bashrc file looks like:
[dc@vivian-y1639vf3 ~]$ cat .bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# User specific aliases and functions
export MONGO_HOME=/home/dc/mongodb-linux-x86_64-2.2.0
export JAVA_HOME=/home/dc/jdk1.6.0_34
#export STORM_HOME=/home/dc/storm-0.8.1
export PATH=$PATH:$JAVA_HOME/bin
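After saving, reload the file and sanity-check the variables (a quick check, assuming the paths above):
>source .bashrc
>echo $JAVA_HOME
>java -version
With the old JDKs removed, java -version should now report 1.6.0_34.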
Read the Google Groups for updates on Storm installs. Fortunately there was one. Let's see if the Storm packaging works. It says to install zookeeper-cluster from CDH 4.
Run this command:
>sudo wget -O /etc/yum.repos.d/bigtop.repo http://archive.apache.org/dist/bigtop/bigtop-0.5.0/repos/centos6/bigtop.repo
You should see a new repo file at /etc/yum.repos.d/bigtop.repo after running the above command.
To test that we can install zookeeper, run:
> sudo yum update
You don't have to say Y to the update. Say N; the repo metadata is still refreshed and bigtop.repo is now registered with the yum service.
>yum search zookeeper
============================ N/S Matched: zookeeper ============================
zookeeper-server.noarch : The Hadoop Zookeeper server
zookeeper.noarch : A high-performance coordination service for distributed
: applications.
Install zookeeper using:
>sudo yum install zookeeper-server.noarch zookeeper.noarch
On centos we can use RPM or YUM to install. Use YUM, since it takes care of any dependencies. YUM works using files ending in .repo. The convention on centos is to store all repo files for all software packages under /etc/yum.repos.d/. These repo files describe the URL of where to download the software, and when you do a sudo yum update the yum program reads all the repo files in this directory and checks those repos for updates to the packages installed on the system. From the above CDH 4 link, the repo file for CDH4, of which zookeeper is one component, should look like the file shown below.
Click on "Add the CDH 4 repository" under the heading "On Red Hat-compatible Systems". Click on the RedHat/CentOS 6 link under the heading "To add the CDH4 repository". You should end up at this url:
http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/cloudera-cdh4.repo
showing this:
[cloudera-cdh4]
name=Cloudera's Distribution for Hadoop, Version 4
baseurl=http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/4/
gpgkey = http://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
gpgcheck = 1
If the above screenshots can't be read, just cut and paste the above contents into a file and store it as /etc/yum.repos.d/cdh.repo.
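To confirm yum picked up the new repo without running a full update, you can list the registered repos (assuming stock yum on centos 6):
>yum repolist | grep -i cloudera
You should see the cloudera-cdh4 repo in the output.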
Then do:
> sudo yum update
The system will do a lot of stuff, downloading packages and updating the system; just say Y to the prompts. After it finishes, many minutes later, verify zookeeper-server is there using:
>yum search zookeeper-server
You should see confirmation that you have it:
[dc@vivian-y1639vf3 ~]$ yum search
zookeeper-server
Loaded plugins: fastestmirror,
refresh-packagekit, security
Loading mirror speeds from cached
hostfile
* base: mirror.nwresd.org
* extras: centos.mirror.lstn.net
* updates: centos.mirrors.hoobly.com
======================== N/S Matched:
zookeeper-server =========================
zookeeper-server.noarch : The Hadoop
Zookeeper server
Name and summary matches only, use
"search all" for everything.
[dc@vivian-y1639vf3 ~]$
OK, now install zookeeper-server:
>sudo yum install zookeeper-server
and answer Y when it asks. You should see:
Is this ok [y/N]: y
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing :
bigtop-utils-0.4+352-1.cdh4.1.0.p0.28.el6.noarch 1/3
Installing :
zookeeper-3.4.3+25-1.cdh4.1.0.p0.28.el6.noarch 2/3
Installing :
zookeeper-server-3.4.3+25-1.cdh4.1.0.p0.28.el6.noarch 3/3
Verifying :
zookeeper-3.4.3+25-1.cdh4.1.0.p0.28.el6.noarch 1/3
Verifying :
bigtop-utils-0.4+352-1.cdh4.1.0.p0.28.el6.noarch 2/3
Verifying :
zookeeper-server-3.4.3+25-1.cdh4.1.0.p0.28.el6.noarch 3/3
Installed:
zookeeper-server.noarch
0:3.4.3+25-1.cdh4.1.0.p0.28.el6
Dependency Installed:
bigtop-utils.noarch
0:0.4+352-1.cdh4.1.0.p0.28.el6
zookeeper.noarch
0:3.4.3+25-1.cdh4.1.0.p0.28.el6
Complete!
[dc@vivian-y1639vf3 ~]$
OK, let's test zookeeper first. Usually these Hadoop components work out of the box, but they may require some configuration parameters to be modified first. One of the advantages of using the Cloudera CDH is they do all the compatibility testing for us. We know zookeeper works with all the other components like Hadoop, bigtop, etc. We don't care about that here because we are just going to use Zookeeper by itself with storm.
One of the confusing parts of using the CDH or any YUM based installer is that there are certain conventions which nobody really explains. The config files for a YUM installed component usually go under /etc. For zookeeper they are under /etc/zookeeper/conf. There are four files:
[dc@vivian-y1639vf3 conf]$ ls
configuration.xsl log4j.properties
zoo.cfg zoo_sample.cfg
[dc@vivian-y1639vf3 conf]$
OK, this is good. log4j.properties is the standard logger config file for Java programs; it specifies the log formatting and location. We don't need to mess with this. zoo.cfg is important. The last line in that file specifies port 2181 as the default client port. We need to connect to it to make sure our zookeeper server works, and we may have to configure storm to make sure it connects here. This is for 1 zookeeper instance; not sure if storm is happy with only 1. Usually zookeeper servers run in a group so if one fails the others still provide state data for the rest of the cluster and there is no single point of failure. Centralizing the cluster's state in zookeeper makes the administration of the cluster easier.
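To double-check which port the server will listen on, a quick grep of the config works (assuming the YUM install layout above):
>grep clientPort /etc/zookeeper/conf/zoo.cfg
clientPort=2181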
If storm requires a cluster of
zookeeper servers in an ensemble then we can set that up using these
instructions:
http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html
This is out of scope for now.
OK, let's try starting zookeeper. The convention for CDH installs is to use:
>sudo service zookeeper-server start
OK, I get an error like this:
[dc@vivian-y1639vf3 ~]$ sudo service
zookeeper-server start
[sudo] password for dc:
JMX enabled by default
Using config:
/etc/zookeeper/conf/zoo.cfg
ZooKeeper data directory is missing at
/var/lib/zookeeper fix the path or run initialize
Time to do a web search: it turns out you have to run the server init first. Didn't know that.
[dc@vivian-y1639vf3 ~]$ sudo service
zookeeper-server init
No myid provided, be sure to specify it
in /var/lib/zookeeper/myid if using non-standalone
OK, another error, another web search. The web search was kinda vague; just create a myid file and see what happens.
[dc@vivian-y1639vf3 ~]$ sudo nano
/var/lib/zookeeper/myid
I just entered an integer, 1, and saved the file. This integer corresponds to a server.N entry in the zookeeper config when running in distributed mode, i.e. when you want to run more than 1 zookeeper. Here is an example config for a cluster of 4 zookeepers:
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
server.4=172.16.144.252:2888:3888
server.3=172.16.144.251:2888:3888
server.2=172.16.144.250:2888:3888
server.1=172.16.144.249:2888:3888
NOTE: The server.1 setting refers to the 1 in the myid setting you just created.
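A one-liner that does the same thing as the nano session above; adjust the integer per host if you ever run an ensemble:
>echo 1 | sudo tee /var/lib/zookeeper/myid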
Start again:
[dc@vivian-y1639vf3 ~]$ sudo service
zookeeper-server start
JMX enabled by default
Using config:
/etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... STARTED
OK, the server looks to be up. We should do 2 things for verification, which apply to all Hadoop components:
- Look at the logs to make sure there is nothing funky going on.
- See if there is an HTTP port with an administration interface to show everything is good.
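ZooKeeper also answers simple four-letter commands on its client port, which makes a quick liveness check easy. A minimal sketch, assuming nc (netcat) is installed:
>echo ruok | nc localhost 2181
imok
>echo stat | nc localhost 2181
The first command should answer imok; the second prints the server version, mode, and client connection stats.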
By convention, all logs for a YUM installed component are under /var/log. In zookeeper's case we have 2 logs, zookeeper.log and zookeeper.out:
[dc@vivian-y1639vf3 ~]$ ls
/var/log/zookeeper
zookeeper.log zookeeper.out
[dc@vivian-y1639vf3 ~]$
Let's just see what is in them:
[dc@vivian-y1639vf3 ~]$ cat
/var/log/zookeeper/zookeeper.log
2012-10-05 22:38:29,743 [myid:] - INFO
[main:QuorumPeerConfig@101] - Reading configuration from:
/etc/zookeeper/conf/zoo.cfg
2012-10-05 22:38:29,758 [myid:] - INFO
[main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2012-10-05 22:38:29,758 [myid:] - INFO
[main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
2012-10-05 22:38:29,759 [myid:] - INFO
[main:DatadirCleanupManager@101] - Purge task is not scheduled.
2012-10-05 22:38:29,759 [myid:] - WARN
[main:QuorumPeerMain@118] - Either no config or no quorum defined in
config, running in standalone mode
2012-10-05 22:38:29,775 [myid:] - INFO
[main:QuorumPeerConfig@101] - Reading configuration from:
/etc/zookeeper/conf/zoo.cfg
2012-10-05 22:38:29,776 [myid:] - INFO
[main:ZooKeeperServerMain@100] - Starting server
2012-10-05 22:38:29,792 [myid:] - INFO
[main:Environment@100] - Server
environment:zookeeper.version=3.4.3-cdh4.1.0--1, built on 09/29/2012
17:54 GMT
2012-10-05 22:38:29,794 [myid:] - INFO
[main:Environment@100] - Server environment:host.name=vivian-y1639vf3
2012-10-05 22:38:29,795 [myid:] - INFO
[main:Environment@100] - Server environment:java.version=1.7.0_07
2012-10-05 22:38:29,795 [myid:] - INFO
[main:Environment@100] - Server environment:java.vendor=Oracle
Corporation
2012-10-05 22:38:29,795 [myid:] - INFO
[main:Environment@100] - Server
environment:java.home=/usr/java/jre1.7.0_07
2012-10-05 22:38:29,796 [myid:] - INFO
[main:Environment@100] - Server
environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.0.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/etc/zookeeper/conf::/etc/zookeeper/conf:/usr/lib/zookeeper/zookeeper.jar:/usr/lib/zookeeper/zookeeper-3.4.3-cdh4.1.0.jar:/usr/lib/zookeeper/lib/log4j-1.2.15.jar:/usr/lib/zookeeper/lib/jline-0.9.94.jar:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/lib/netty-3.2.2.Final.jar
2012-10-05 22:38:29,796 [myid:] - INFO
[main:Environment@100] - Server
environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2012-10-05 22:38:29,797 [myid:] - INFO
[main:Environment@100] - Server environment:java.io.tmpdir=/tmp
2012-10-05 22:38:29,797 [myid:] - INFO
[main:Environment@100] - Server environment:java.compiler=
2012-10-05 22:38:29,798 [myid:] - INFO
[main:Environment@100] - Server environment:os.name=Linux
2012-10-05 22:38:29,798 [myid:] - INFO
[main:Environment@100] - Server environment:os.arch=amd64
2012-10-05 22:38:29,799 [myid:] - INFO
[main:Environment@100] - Server
environment:os.version=2.6.32-279.9.1.el6.x86_64
2012-10-05 22:38:29,799 [myid:] - INFO
[main:Environment@100] - Server environment:user.name=zookeeper
2012-10-05 22:38:29,800 [myid:] - INFO
[main:Environment@100] - Server
environment:user.home=/var/run/zookeeper
2012-10-05 22:38:29,800 [myid:] - INFO
[main:Environment@100] - Server environment:user.dir=/
2012-10-05 22:38:29,808 [myid:] - INFO
[main:ZooKeeperServer@726] - tickTime set to 2000
2012-10-05 22:38:29,808 [myid:] - INFO
[main:ZooKeeperServer@735] - minSessionTimeout set to -1
2012-10-05 22:38:29,809 [myid:] - INFO
[main:ZooKeeperServer@744] - maxSessionTimeout set to -1
2012-10-05 22:38:29,844 [myid:] - INFO
[main:NIOServerCnxnFactory@99] - binding to port 0.0.0.0/0.0.0.0:2181
2012-10-05 22:38:29,859 [myid:] - INFO
[main:FileTxnSnapLog@270] - Snapshotting: 0x0 to
/var/lib/zookeeper/version-2/snapshot.0
[dc@vivian-y1639vf3 ~]$
Looks OK, no ERROR messages, just INFO. The Hadoop convention is to log at ERROR level if something is wrong.
How about the other one?
[dc@vivian-y1639vf3 ~]$ cat
/var/log/zookeeper/zookeeper.out
[dc@vivian-y1639vf3 ~]$
Blank, OK, the logs are good. Now let's see if we can find an admin interface. A quick web search turned up http://zookeeper.apache.org/doc/r3.2.2/zookeeperStarted.html and it looks like there is no admin web page, but there is a client called zkCli.sh you can use to test the connection, as described in the link above:
cd to /usr/lib/zookeeper
>bin/zkCli.sh -server localhost:2181
OK, this makes sense because in the earlier zoo.cfg file we saw port 2181 as the last entry. Let's try the above command. We don't have a local bin directory because we used YUM to install instead of downloading zookeeper directly, so use the full path:
[dc@vivian-y1639vf3 ~]$
/usr/lib/zookeeper/bin/zkCli.sh -server localhost:2181
Connecting to localhost:2181
2012-10-05 22:52:17,904 [myid:] - INFO
[main:Environment@100] - Client
environment:zookeeper.version=3.4.3-cdh4.1.0--1, built on 09/29/2012
17:54 GMT
2012-10-05 22:52:17,908 [myid:] - INFO
[main:Environment@100] - Client environment:host.name=vivian-y1639vf3
2012-10-05 22:52:17,908 [myid:] - INFO
[main:Environment@100] - Client environment:java.version=1.6.0_34
2012-10-05 22:52:17,909 [myid:] - INFO
[main:Environment@100] - Client environment:java.vendor=Sun
Microsystems Inc.
2012-10-05 22:52:17,909 [myid:] - INFO
[main:Environment@100] - Client
environment:java.home=/home/dc/jdk1.6.0_34/jre
2012-10-05 22:52:17,909 [myid:] - INFO
[main:Environment@100] - Client
environment:java.class.path=/usr/lib/zookeeper/bin/../build/classes:/usr/lib/zookeeper/bin/../build/lib/*.jar:/usr/lib/zookeeper/bin/../lib/slf4j-log4j12-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/slf4j-api-1.6.1.jar:/usr/lib/zookeeper/bin/../lib/netty-3.2.2.Final.jar:/usr/lib/zookeeper/bin/../lib/log4j-1.2.15.jar:/usr/lib/zookeeper/bin/../lib/jline-0.9.94.jar:/usr/lib/zookeeper/bin/../zookeeper-3.4.3-cdh4.1.0.jar:/usr/lib/zookeeper/bin/../src/java/lib/*.jar:/usr/lib/zookeeper/bin/../conf:
2012-10-05 22:52:17,910 [myid:] - INFO
[main:Environment@100] - Client
environment:java.library.path=/home/dc/jdk1.6.0_34/jre/lib/amd64/server:/home/dc/jdk1.6.0_34/jre/lib/amd64:/home/dc/jdk1.6.0_34/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2012-10-05 22:52:17,910 [myid:] - INFO
[main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2012-10-05 22:52:17,911 [myid:] - INFO
[main:Environment@100] - Client environment:java.compiler=
2012-10-05 22:52:17,911 [myid:] - INFO
[main:Environment@100] - Client environment:os.name=Linux
2012-10-05 22:52:17,912 [myid:] - INFO
[main:Environment@100] - Client environment:os.arch=amd64
2012-10-05 22:52:17,912 [myid:] - INFO
[main:Environment@100] - Client
environment:os.version=2.6.32-279.9.1.el6.x86_64
2012-10-05 22:52:17,912 [myid:] - INFO
[main:Environment@100] - Client environment:user.name=dc
2012-10-05 22:52:17,913 [myid:] - INFO
[main:Environment@100] - Client environment:user.home=/home/dc
2012-10-05 22:52:17,913 [myid:] - INFO
[main:Environment@100] - Client environment:user.dir=/home/dc
2012-10-05 22:52:17,915 [myid:] - INFO
[main:ZooKeeper@433] - Initiating client connection,
connectString=localhost:2181 sessionTimeout=30000
watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3dac2f9c
Welcome to ZooKeeper!
2012-10-05 22:52:18,143 [myid:] - INFO
[main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@958]
- Opening socket connection to server
localhost.localdomain/127.0.0.1:2181. Will not attempt to
authenticate using SASL (Unable to locate a login configuration)
JLine support is enabled
2012-10-05 22:52:18,213 [myid:] - INFO
[main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@850]
- Socket connection established to
localhost.localdomain/127.0.0.1:2181, initiating session
[zk: localhost:2181(CONNECTING) 0]
2012-10-05 22:52:18,588 [myid:] - INFO
[main-SendThread(localhost.localdomain:2181):ClientCnxn$SendThread@1187]
- Session establishment complete on server
localhost.localdomain/127.0.0.1:2181, sessionid = 0x13a3494fb6a0000,
negotiated timeout = 30000
WATCHER::
WatchedEvent state:SyncConnected
type:None path:null
[zk: localhost:2181(CONNECTED) 0]
OK, type help to make sure the client interface to the server works...
ZooKeeper -server host:port cmd args
connect host:port
get path [watch]
ls path [watch]
set path data [version]
rmr path
delquota [-n|-b] path
quit
printwatches on|off
create [-s] [-e] path data acl
stat path [watch]
close
ls2 path [watch]
history
listquota path
setAcl path acl
getAcl path
sync path
redo cmdno
addauth scheme auth
delete path [version]
setquota -n|-b val path
[zk: localhost:2181(CONNECTED) 1]
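As one more smoke test, you can create and read back a throwaway znode from the same prompt. A minimal sketch; /mytest is just a scratch path:
[zk: localhost:2181(CONNECTED) 1] create /mytest hello
[zk: localhost:2181(CONNECTED) 2] get /mytest
[zk: localhost:2181(CONNECTED) 3] delete /mytest
[zk: localhost:2181(CONNECTED) 4] quit
create should print Created /mytest, and get should echo hello along with the znode metadata.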
OK, good enough I think. On to storm...
Back to the original link for storm. OK, download the storm package and follow step 2) on the link.
OK, this is strange: the fact that you have to become the root user. For correctly built packages on linux you shouldn't have to do this. It looks like this isn't a debugged packager like RPM or YUM; there are a bunch of scripts you are supposed to run to change users, groups, etc. Something funny, but it's OK, we can try it his way. The problem with a root install is that only root can access these programs.
Install uuid first.
[dc@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]$ yum search uuid
Loaded plugins: fastestmirror,
refresh-packagekit, security
Loading mirror speeds from cached
hostfile
* base: mirror.nwresd.org
* extras: centos.mirror.lstn.net
* updates: centos.mirrors.hoobly.com
============================== N/S
Matched: uuid ===============================
uuidd.x86_64 : Helper daemon to
guarantee uniqueness of time-based UUIDs
libuuid.i686 : Universally unique ID
library
libuuid.x86_64 : Universally unique ID
library
libuuid-devel.i686 : Universally unique
ID library
libuuid-devel.x86_64 : Universally
unique ID library
uuid.i686 : Universally Unique
Identifier library
uuid.x86_64 : Universally Unique
Identifier library
uuid-c++.i686 : C++ support for
Universally Unique Identifier library
uuid-c++.x86_64 : C++ support for
Universally Unique Identifier library
uuid-c++-devel.i686 : C++ development
support for Universally Unique Identifier
: library
uuid-c++-devel.x86_64 : C++ development
support for Universally Unique
: Identifier
library
uuid-dce.i686 : DCE support for
Universally Unique Identifier library
uuid-dce.x86_64 : DCE support for
Universally Unique Identifier library
uuid-dce-devel.i686 : DCE development
support for Universally Unique Identifier
: library
uuid-dce-devel.x86_64 : DCE development
support for Universally Unique
: Identifier
library
uuid-devel.i686 : Development support
for Universally Unique Identifier library
uuid-devel.x86_64 : Development support
for Universally Unique Identifier
: library
uuid-perl.x86_64 : Perl support for
Universally Unique Identifier library
uuid-pgsql.x86_64 : PostgreSQL support
for Universally Unique Identifier library
uuid-php.x86_64 : PHP support for
Universally Unique Identifier library
Name and summary matches only, use
"search all" for everything.
[dc@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]$ sudo yum install uuid.x86_64
[sudo] password for dc:
Loaded plugins: fastestmirror,
refresh-packagekit, security
Loading mirror speeds from cached
hostfile
* base: mirrors.arpnetworks.com
* extras: mirror.stanford.edu
* updates: mirrors.xmission.com
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package uuid.x86_64
0:1.6.1-10.el6 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===============================================================================================
Package Arch
Version Repository Size
===============================================================================================
Installing:
uuid x86_64
1.6.1-10.el6 base 54 k
Transaction Summary
===============================================================================================
Install 1 Package(s)
Total download size: 54 k
Installed size: 113 k
Is this ok [y/N]: y
Downloading Packages:
uuid-1.6.1-10.el6.x86_64.rpm
| 54 kB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : uuid-1.6.1-10.el6.x86_64
1/1
Verifying : uuid-1.6.1-10.el6.x86_64
1/1
Installed:
uuid.x86_64 0:1.6.1-10.el6
Complete!
[dc@vivian-y1639vf3 storm-installer-0.8.0_1.el6.x86_64]$
>su -
We are now in root's home directory; go back to where our downloads are:
>cd /home/dc/Downloads
>cd storm-installer-0.8.0_1.el6.x86_64
OK, following the instructions from the webpage:
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# rpm -ivh
zeromq-2.1.7-1.el6.x86_64.rpm
Preparing...
########################################### [100%]
1:zeromq
########################################### [100%]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# rpm -ivh
zeromq-devel-2.1.7-1.el6.x86_64.rpm
Preparing...
########################################### [100%]
1:zeromq-devel
########################################### [100%]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# rpm -ivh
jzmq-2.1.0-1.el6.x86_64.rpm
Preparing...
########################################### [100%]
1:jzmq
########################################### [100%]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# rpm -ivh
storm-0.8.0-1.el6.x86_64.rpm
Preparing...
########################################### [100%]
1:storm
########################################### [100%]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# rpm -ivh
storm-service-0.8.0-1.el6.x86_64.rpm
Preparing...
########################################### [100%]
1:storm-service
########################################### [100%]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# sudo updatedb
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# ls /opt/storm/conf/storm.yaml
/opt/storm/conf/storm.yaml
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# nano /opt/storm/conf/storm.yaml
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# service storm-nimbus start
Starting storm nimbus...
Storm nimbus is running.
[ OK ]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# service storm-ui start
Starting storm ui...
Storm ui is running.
[ OK ]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]# service storm-supervisor start
Starting storm supervisor...
Storm supervisor is running.
[ OK ]
[root@vivian-y1639vf3
storm-installer-0.8.0_1.el6.x86_64]#
Verifying Storm
Once the daemons are started, verify using ps:
>ps -ef | grep storm
[root@vivian-y1639vf3
~]# ps -ef | grep storm
root 5826
1 0 Oct05 pts/1 00:00:48 java -server -Xmx768m
-Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64
-Dstorm.options= -Dstorm.home=/opt/storm -Dlogfile.name=nimbus.log
-Dlog4j.configuration=storm.log.properties -cp
/opt/storm/lib/*:/opt/storm/storm-0.8.0.jar:/opt/storm/conf:/opt/storm/log4j
backtype.storm.daemon.nimbus
root 5859
1 0 Oct05 pts/1 00:00:34 java -server -Xmx768m
-Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64
-Dstorm.options= -Dstorm.home=/opt/storm -Dlogfile.name=ui.log
-Dlog4j.configuration=storm.log.properties -cp
/opt/storm/lib/*:/opt/storm/storm-0.8.0.jar:/opt/storm/conf:/opt/storm/log4j:/opt/storm
backtype.storm.ui.core
root 5901
1 0 Oct05 pts/1 00:01:02 java -server -Xmx1024m
-Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64
-Dstorm.options= -Dstorm.home=/opt/storm
-Dlogfile.name=supervisor.log
-Dlog4j.configuration=storm.log.properties -cp
/opt/storm/lib/*:/opt/storm/storm-0.8.0.jar:/opt/storm/conf:/opt/storm/log4j
backtype.storm.daemon.supervisor
root 9124
9046 0 02:26 pts/1 00:00:00 grep storm
You can also look at the config file for more clues about how Storm works. These are the default values for storm; storm.yaml is used to override them. We may have to change the parameter storm.cluster.mode from distributed to local. The admin UI is on port 8080, from the ui.port setting.
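A quick way to confirm the UI daemon is actually serving on that port, assuming curl is installed and storm-ui was started as above:
>curl -s http://localhost:8080 | head
You should get Storm UI HTML back; a connection refused error means the ui daemon isn't up.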
OK, let's test the maven files for compiling and running the java sample code under storm-starter. The instructions at https://github.com/nathanmarz/storm-starter in the last section say that to run WordCountTopology in local mode, use this command:
mvn -f m2-pom.xml compile exec:java -Dexec.classpathScope=compile -Dexec.mainClass=storm.starter.WordCountTopology
Let's try setting storm.yaml to local mode first.
root>updatedb
root>locate storm.yaml
[root@vivian-y1639vf3 storm]# locate storm.yaml
/home/dc/storm-0.8.1/conf/storm.yaml
/opt/storm-0.8.0/conf/storm.yaml
root@vivian-y1639vf3> nano /opt/storm-0.8.0/conf/storm.yaml
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
#    - org.mycompany.MyType
#    - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## Locations of the drpc servers
# drpc.servers:
#    - "server1"
#    - "server2"
java.library.path: "/usr/local/lib:/opt/local/lib:/usr/lib:/usr/lib64"
storm.local.dir: "/opt/storm"
#storm.cluster.mode: "local"
OK, I had to comment out the local mode line above. It didn't work; I found this in /var/log/storm/nimbus.log:
2012-10-06 02:49:07 NIOServerCnxn [ERROR] Thread Thread[main,5,main] died
java.lang.IllegalArgumentException: Cannot start server in local mode!
    at backtype.storm.daemon.common$validate_distributed_mode_BANG_.invoke(common.clj:77)
    at backtype.storm.daemon.nimbus$launch_server_BANG_.invoke(nimbus.clj:1078)
    at backtype.storm.daemon.nimbus$_launch.invoke(nimbus.clj:1110)
    at backtype.storm.daemon.nimbus$_main.invoke(nimbus.clj:1134)
    at clojure.lang.AFn.applyToHelper(AFn.java:159)
    at clojure.lang.AFn.applyTo(AFn.java:151)
    at backtype.storm.daemon.nimbus.main(Unknown Source)
Back to distributed mode. Crappy docs. Restart the daemons. Forgot what they were called; by convention all the daemons are under /etc/init.d, so let's do an ls and grep for them:
[root@vivian-y1639vf3
storm]# ls /etc/init.d | grep storm
storm-nimbus
storm-supervisor
storm-ui
[root@vivian-y1639vf3
storm]#
Now let's restart them:
[root@vivian-y1639vf3
storm]# service storm-nimbus start
Starting storm
nimbus...
Storm nimbus is
running. [ OK ]
[root@vivian-y1639vf3
storm]# service storm-supervisor restart
Stopping storm
supervisor...
Storm supervisor
is stopped. [ OK ]
Starting storm
supervisor...
Storm supervisor
is running. [ OK ]
[root@vivian-y1639vf3
storm]# service storm-ui restart
Stopping storm
ui...
Storm ui is
stopped. [ OK ]
Starting storm
ui...
Storm ui is
running. [ OK ]
[root@vivian-y1639vf3
storm]#
Test the storm CLI, the command line interface. This isn't well documented. The command line interface is at /opt/storm-0.8.0/bin/storm; run this program and you should get something like this:
[root@vivian-y1639vf3
storm-starter]# /opt/storm-0.8.0/bin/storm
Commands:
activate
classpath
deactivate
dev-zookeeper
drpc
help
jar
kill
list
localconfvalue
nimbus
rebalance
remoteconfvalue
repl
shell
supervisor
ui
version
Help:
help
help
Documentation for
the storm client can be found at
https://github.com/nathanmarz/storm/wiki/Command-line-client
Configs can be
overridden using one or more -c flags, e.g. "storm list -c
nimbus.host=nimbus.mycompany.com"
[root@vivian-y1639vf3
storm-starter]#
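Of the commands above, list is a handy end-to-end check because it makes the client talk to nimbus. A sketch, assuming nimbus.host is localhost as configured earlier:
>/opt/storm-0.8.0/bin/storm list
With no topologies submitted yet, it should come back with an empty topology list rather than a connection error.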
OK, back to maven to try to run the sample starter word count program. Do a web search for Apache Maven and download apache-maven-3.0.4.tar.gz from the Apache Maven site.
Unzip and untar this using:
>gunzip apache-maven-3.0.4.tar.gz
then:
>tar -xvf apache-maven-3.0.4.tar
cd into the newly extracted directory, apache-maven-3.0.4, and run pwd to get the full path.
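Equivalently, GNU tar can do both steps in one go (a small shortcut, not what the transcript below used):
>tar -xzvf apache-maven-3.0.4.tar.gz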
[dc@vivian-y1639vf3
apache-maven-3.0.4]$ pwd
/home/dc/apache-maven-3.0.4
[dc@vivian-y1639vf3
apache-maven-3.0.4]$
cd again to get to your home directory, add the pwd path /home/dc/apache-maven-3.0.4 as MAVEN_HOME, and set PATH to include $MAVEN_HOME/bin. This follows the same convention as the Java setup. Your .bashrc file should look like this:
[dc@vivian-y1639vf3 apache-maven-3.0.4]$ cat ~/.bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# User specific aliases and functions
export MONGO_HOME=/home/dc/mongodb-linux-x86_64-2.2.0
export JAVA_HOME=/home/dc/jdk1.6.0_34
#export STORM_HOME=/home/dc/storm-0.8.1
export MAVEN_HOME=/home/dc/apache-maven-3.0.4
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin
>source .bashrc
to pick up the Maven executable directory in PATH. Test you have maven by running:
>mvn -v
[dc@vivian-y1639vf3
apache-maven-3.0.4]$ mvn -v
Apache Maven 3.0.4
(r1232337; 2012-01-17 00:44:56-0800)
Maven home:
/home/dc/apache-maven-3.0.4
Java version:
1.6.0_34, vendor: Sun Microsystems Inc.
Java home:
/home/dc/jdk1.6.0_34/jre
Default locale:
en_US, platform encoding: UTF-8
OS name: "linux",
version: "2.6.32-279.9.1.el6.x86_64", arch: "amd64",
family: "unix"
[dc@vivian-y1639vf3
apache-maven-3.0.4]$
OK, we are good to go. Now we can try the maven storm-starter instructions here: https://github.com/nathanmarz/storm-starter
cd into storm-starter. You should be in the same directory as the m2-pom.xml file. Because we installed everything as root, before we build storm-starter run:
>su -
to become root.
[dc@vivian-y1639vf3
storm-starter]$ ls
LICENSE
m2-pom.xml multilang project.clj README.markdown src target
[dc@vivian-y1639vf3
storm-starter]$
Let's build the source and create the jar files for storm-starter first.
>mvn -f m2-pom.xml package
You should see a long output:
[root@vivian-y1639vf3
storm-starter]# mvn -f m2-pom.xml package
The output is too big to add here, but you should see a lot of lines like:
Downloading:
http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jar
Downloaded:
http://repo.maven.apache.org/maven2/org/apache/maven/surefire/surefire-junit3/2.10/surefire-junit3-2.10.jar
(26 KB at 171.3 KB/sec)
In the end you should see a build success message like this:
[root@vivian-y1639vf3
storm-starter]# mvn -f m2-pom.xml package
[INFO] Scanning
for projects...
[WARNING]
[WARNING] Some
problems were encountered while building the effective model for
storm.starter:storm-starter:jar:0.0.1-SNAPSHOT
[WARNING]
'build.plugins.plugin.version' for
org.apache.maven.plugins:maven-compiler-plugin is missing. @ line
127, column 12
[WARNING]
[WARNING] It is
highly recommended to fix these problems because they threaten the
stability of your build.
[WARNING]
[WARNING] For this
reason, future Maven versions might no longer support building such
malformed projects.
[WARNING]
[INFO]
[INFO]
------------------------------------------------------------------------
[INFO] Building
storm-starter 0.0.1-SNAPSHOT
[INFO]
------------------------------------------------------------------------
[INFO]
[INFO] ---
maven-resources-plugin:2.5:resources (default-resources) @
storm-starter ---
[debug] execute
contextualize
[INFO] Using
'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4
resources
[INFO]
[INFO] ---
maven-compiler-plugin:2.3.2:compile (default-compile) @ storm-starter
---
[INFO] Compiling 2
source files to /home/dc/storm-starter/target/classes
[INFO]
[INFO] ---
clojure-maven-plugin:1.3.8:compile (compile) @ storm-starter ---
[INFO]
[INFO] ---
maven-resources-plugin:2.5:testResources (default-testResources) @
storm-starter ---
[debug] execute
contextualize
[INFO] Using
'UTF-8' encoding to copy filtered resources.
[INFO] skip non
existing resourceDirectory /home/dc/storm-starter/src/test/resources
[INFO]
[INFO] ---
maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @
storm-starter ---
[INFO] No sources
to compile
[INFO]
[INFO] ---
maven-surefire-plugin:2.10:test (default-test) @ storm-starter ---
[INFO] No tests to
run.
[INFO] Surefire
report directory: /home/dc/storm-starter/target/surefire-reports
-------------------------------------------------------
T E
S T S
-------------------------------------------------------
Results :
Tests run: 0,
Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] ---
clojure-maven-plugin:1.3.8:test (test) @ storm-starter ---
Testing
com.theoryinpractise.clojure.testrunner
Ran 0 tests
containing 0 assertions.
0 failures, 0
errors.
[INFO]
[INFO] ---
maven-jar-plugin:2.3.2:jar (default-jar) @ storm-starter ---
[INFO]
[INFO] ---
maven-assembly-plugin:2.2-beta-5:single (make-assembly) @
storm-starter ---
[INFO] META-INF/
already added, skipping
[INFO]
META-INF/MANIFEST.MF already added, skipping
[INFO] twitter4j/
already added, skipping
[INFO]
META-INF/LICENSE.txt already added, skipping
[INFO]
META-INF/maven/ already added, skipping
[INFO]
META-INF/maven/org.twitter4j/ already added, skipping
[INFO] META-INF/
already added, skipping
[INFO]
META-INF/MANIFEST.MF already added, skipping
[INFO] Building
jar:
/home/dc/storm-starter/target/storm-starter-0.0.1-SNAPSHOT-jar-with-dependencies.jar
[INFO] META-INF/
already added, skipping
[INFO]
META-INF/MANIFEST.MF already added, skipping
[INFO] twitter4j/
already added, skipping
[INFO]
META-INF/LICENSE.txt already added, skipping
[INFO]
META-INF/maven/ already added, skipping
[INFO]
META-INF/maven/org.twitter4j/ already added, skipping
[INFO] META-INF/
already added, skipping
[INFO]
META-INF/MANIFEST.MF already added, skipping
[INFO]
------------------------------------------------------------------------
[INFO] BUILD
SUCCESS
[INFO]
------------------------------------------------------------------------
[INFO] Total time:
8.059s
[INFO] Finished
at: Sat Oct 06 04:11:49 PDT 2012
[INFO] Final
Memory: 17M/270M
[INFO]
-------------------------------
If you get a build failure with permission denied, then you aren't running as root.
Once the jars are built you can run the word count program. (The instructions on the web page list these steps in reverse order.) You should still be in the storm-starter directory.
[root@vivian-y1639vf3
storm-starter]# mvn -f m2-pom.xml compile exec:java
-Dexec.classpathScope=compile
-Dexec.mainClass=storm.starter.WordCountTopology
You should see a BUILD SUCCESS message at the end and output showing words being counted:
11505 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["four"]
11505 [Thread-21]
INFO backtype.storm.daemon.executor - Processing received message
source: split:5, stream: default, id: {}, ["four"]
11506 [Thread-21]
INFO backtype.storm.daemon.task - Emitting: count default [four,
58]
11506 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["score"]
11506 [Thread-21]
INFO backtype.storm.daemon.executor - Processing received message
source: split:5, stream: default, id: {}, ["score"]
11506 [Thread-21]
INFO backtype.storm.daemon.task - Emitting: count default [score,
58]
11506 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["and"]
11507 [Thread-23]
INFO backtype.storm.daemon.executor - Processing received message
source: split:5, stream: default, id: {}, ["and"]
11507 [Thread-23]
INFO backtype.storm.daemon.task - Emitting: count default [and,
103]
11507 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["seven"]
11507 [Thread-19]
INFO backtype.storm.daemon.executor - Processing received message
source: split:5, stream: default, id: {}, ["seven"]
11508 [Thread-19]
INFO backtype.storm.daemon.task - Emitting: count default [seven,
103]
11508 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["years"]
11508 [Thread-19]
INFO backtype.storm.daemon.executor - Processing received message
source: split:5, stream: default, id: {}, ["years"]
11508 [Thread-19]
INFO backtype.storm.daemon.task - Emitting: count default [years,
58]
11508 [Thread-25]
INFO backtype.storm.daemon.task - Emitting: split default ["ago"]
For example, "years" has a count of 58 occurrences so far.