Wednesday, October 24, 2012

Mongo Setting up Sharding Example Tutorial


MongoDB Configuring Sharding


We will use an abbreviated setup where we have 3 replica sets but each replica set is just one mongod process instead of 3.

Run the following 3 commands to start 3 mongod servers, each one a single-member replica set in shard-server mode (--replSet plus --shardsvr). This simulates 3 nodes in a cluster where each node is a replica set, called rs1, rs2 and rs3. Instead of running 3 mongo servers in each replica set, we run only 1 per set on our local development machine.

mongodb-linux-x86_64-2.2.0/bin/mongod --rest --dbpath /rs1/db1 --logpath /rs1/db1/log --fork --replSet rs1 --shardsvr --port 10001

mongodb-linux-x86_64-2.2.0/bin/mongod --rest --dbpath /rs2/db1 --logpath /rs2/db1/log --fork --replSet rs2 --shardsvr --port 10002

mongodb-linux-x86_64-2.2.0/bin/mongod --rest --dbpath /rs3/db1 --logpath /rs3/db1/log --fork --replSet rs3 --shardsvr --port 10003
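Since the three commands differ only in the set number, they can also be generated in a loop. The following is a minimal sketch, not part of the original setup: MONGO_BIN and shard_cmd are names made up for illustration, and the loop only prints the commands so you can inspect them before running anything.

```shell
# MONGO_BIN is an assumed variable pointing at the unpacked MongoDB 2.2.0 bin directory.
MONGO_BIN="mongodb-linux-x86_64-2.2.0/bin"

# shard_cmd N prints the mongod command line for replica set rsN on port 1000N.
shard_cmd() {
    n="$1"
    echo "$MONGO_BIN/mongod --rest --dbpath /rs$n/db1 --logpath /rs$n/db1/log --fork --replSet rs$n --shardsvr --port 1000$n"
}

# Dry run: print the three commands instead of executing them.
for n in 1 2 3; do
    shard_cmd "$n"
done
```

To start the servers for real, paste the printed lines into your shell once the data directories exist.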

Before running these commands, create the 3 directories /rs1/db1, /rs2/db1 and /rs3/db1, and verify that you own them or that their permissions otherwise let the mongod processes write into them. Note: the ls -ld command shows the owner of a directory. Make sure it isn't root.

[dc@localhost ~]$ ls -ld /rs1/db1
drwxrwxr-x. 5 dc dc 4096 Oct 24 18:21 /rs1/db1
[dc@localhost ~]$ ls -ld /rs1
drwxrwxrwx. 3 mongod mongod 4096 Oct 24 15:52 /rs1
[dc@localhost ~]$
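The directory preparation described above can be scripted. This is only a sketch under one assumption: BASE is a hypothetical prefix that defaults to a throwaway temp directory so the script can be tried safely. Set BASE to an empty string (and run with enough privileges) to work on the real /rs1/db1, /rs2/db1 and /rs3/db1.

```shell
# BASE is a hypothetical prefix; BASE="" targets the real /rsN/db1 paths.
BASE="${BASE-$(mktemp -d)}"

for rs in rs1 rs2 rs3; do
    dir="$BASE/$rs/db1"
    mkdir -p "$dir"
    # mongod must be able to write here; flag anything that is not writable.
    if [ -w "$dir" ]; then
        echo "ok: $dir is writable by $(id -un)"
    else
        echo "not writable: $dir (check owner with ls -ld)" >&2
    fi
done
```

If a directory is reported as not writable, changing its owner to your own user is the usual fix.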


Run the commands, then verify with ps -ef | grep mongod that the processes are running.
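If you prefer a number over eyeballing the ps listing, the check can be wrapped in a small filter. This is a sketch with a made-up helper name, count_shard_procs. It reads ps -ef style text on standard input and counts the lines that look like a mongod started with --shardsvr.

```shell
# count_shard_procs: read `ps -ef` output on stdin, print the number of
# mongod --shardsvr lines. The [m] keeps grep from matching its own entry
# in the process list. `|| true` keeps the exit status 0 when the count is 0.
count_shard_procs() {
    grep -c '[m]ongod .*--shardsvr' || true
}

# Usage: ps -ef | count_shard_procs
```

With the three servers from this tutorial running, ps -ef | count_shard_procs should print 3.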



We started each mongo server with --replSet, but before doing anything else you have to initialize the replica set state machine. We are not going to add more servers to each replica set; a replica set with 1 member is enough for testing. Connect to each mongo server we created above and convert it from a standalone mongo server to a replica set using rs.initiate(), as shown after the optional step below.

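Optional step: if you run rs.conf() from the mongo command prompt before initiating, you will see that the replica set is not configured yet. (This session was captured on a different server and port; substitute your own, for example localhost:10001.)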


[root@dvsnode4 ~]# mongo localhost:10000
MongoDB shell version: 2.2.1
connecting to: localhost:10000/test
> rs.conf()
null
>



[dc@localhost ~]$ mongo localhost:10001
>rs.initiate()
>rs.conf()

After issuing rs.initiate(), wait for the command prompt to change, indicating that the mongo server has converted to a replica set. Here is what your session should look like. Note that the command prompt changes from > to rs1:STARTUP2> and then to rs1:PRIMARY>.


>rs.initiate()
{
        "info2" : "no configuration explicitly specified -- making one",
        "me" : "DChang-PC:10001",
        "info" : "Config now saved locally.  Should come online in about a minute.",
        "ok" : 1
}
rs1:STARTUP2>rs.conf()
{
        "_id" : "rs1",
        "version" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "host" : "DChang-PC:10001"
                }
        ]
}
rs1:PRIMARY> exit
bye


Repeat the same steps for the other two servers, connecting to each one with mongo localhost:port:
[dc@localhost ~]$ mongo localhost:10002
>rs.initiate()
>rs.conf()

[dc@localhost ~]$ mongo localhost:10003
>rs.initiate()
>rs.conf()



rs.initiate() starts the replica set state machine on each server, and rs.conf() lets you verify that the configuration was saved. Shortly afterwards the server is elected primary. Reconnect to each one (for example at localhost.localdomain:10001) and check that you get a PRIMARY prompt such as rs1:PRIMARY>, indicating that the mongod process is running as a replica set.
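The three interactive sessions can also be driven from the OS shell with the mongo shell's --eval option. The sketch below is an illustration rather than the procedure used in this post: DRY_RUN, MONGO and initiate are invented names, and by default the script only prints the commands it would run.

```shell
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"
# MONGO is an assumed path to the mongo shell from the same 2.2.0 download.
MONGO="mongodb-linux-x86_64-2.2.0/bin/mongo"

# initiate PORT: run rs.initiate() against the mongod listening on PORT.
initiate() {
    port="$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "$MONGO localhost:$port --eval 'printjson(rs.initiate())'"
    else
        "$MONGO" "localhost:$port" --eval 'printjson(rs.initiate())'
    fi
}

for port in 10001 10002 10003; do
    initiate "$port"
done
```

Each server still needs a little time after rs.initiate() before its prompt shows PRIMARY, so check the prompts by hand afterwards.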






Start a config server:

[dc@localhost ~]$ sudo chown -R dc:dc /config/db1
[dc@localhost ~]$ mongodb-linux-x86_64-2.2.0/bin/mongod --configsvr --rest --dbpath /config/db1 --logpath /config/db1/log --port 20000 --fork --nojournal
all output going to: /config/db1/log


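Start a mongos. Unlike mongod, mongos stores no data and does not accept the mongod-only --dbpath and --rest options, so it only needs the config server address and a log path (the /mongos/log directory must already exist and be writable). Without --port it listens on the default port 27017.
[dc@localhost ~]$ mongodb-linux-x86_64-2.2.0/bin/mongos --configdb localhost:20000 --logpath /mongos/log/log.txt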

The remaining steps are: add the shards, enable sharding for the database test, create a collection, insert a value, create an index on the id field, and enable sharding for the collection with id as the shard key. Then insert some values and verify that the inserts are sharded.

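Connect a mongo shell to the mongos (port 27017 unless you passed --port) and add the replica sets you started as shards, using the setname/server:port form. The host names below are the ones that appear in the sharding status output at the end of this post; use whatever host name rs.conf() reports for your own members:
mongos> sh.addShard("rs1/localhost.localdomain:10001")
mongos> sh.addShard("rs2/localhost.localdomain:10002")
mongos> sh.addShard("rs3/localhost.localdomain:10003")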


Note: once you are connected to mongos, sh.help() lists the sharding helpers:

mongos>sh.help()
sh.addShard( host )                       server:port OR setname/server:port
sh.enableSharding(dbname)                 enables sharding on the database dbname
sh.shardCollection(fullName,key,unique)   shards the collection
sh.splitFind(fullName,find)               splits the chunk that find is in at the median
sh.splitAt(fullName,middle)               splits the chunk that middle is in at middle
sh.moveChunk(fullName,find,to)            move the chunk where 'find' is to 'to' (name of shard)
sh.setBalancerState( <bool on or not> )   turns the balancer on or off true=on, false=off
sh.getBalancerState()                     return true if enabled
sh.isBalancerRunning()                    return true if the balancer has work in progress on any mongos
sh.addShardTag(shard,tag)                 adds the tag to the shard
sh.removeShardTag(shard,tag)              removes the tag from the shard
sh.addTagRange(fullName,min,max,tag)      tags the specified range of the given collection
sh.status()                               prints a general overview of the cluster
mongos>




mongos> sh.enableSharding("test")
{ "ok" : 1 }

If you try to shard a collection before the shard key is indexed, you may get an error message:
mongos> sh.shardCollection("test.nonshard",{"id":1} )
{
        "proposedKey" : {
                "id" : 1
        },
        "curIndexes" : [
                {
                        "v" : 1,
                        "key" : {
                                "_id" : 1
                        },
                        "ns" : "test.nonshard",
                        "name" : "_id_"
                }
        ],
        "ok" : 0,
        "errmsg" : "please create an index that starts with the shard key before sharding."
}
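The error means that some index on the collection must begin with the shard key fields, and the default _id index does not begin with id. As a rough illustration of that prefix rule (index_starts_with is a hypothetical helper, not how MongoDB performs the check, and it ignores index direction), comma-separated field lists can be compared like this:

```shell
# index_starts_with SHARD_KEY INDEX_FIELDS
# Both arguments are comma-separated field names, e.g. "id" and "id,name".
# Succeeds (exit 0) when the index fields begin with the shard key fields.
index_starts_with() {
    shard_key="$1"
    index_fields="$2"
    # Append a comma so "id" does not match a field such as "idx".
    case "$index_fields," in
        "$shard_key,"*) return 0 ;;
        *)              return 1 ;;
    esac
}

for index in "_id" "id" "id,name" "name,id"; do
    if index_starts_with "id" "$index"; then
        echo "index {$index}: usable for shard key {id}"
    else
        echo "index {$index}: not usable for shard key {id}"
    fi
done
```

Only the indexes on id and on id,name pass, which is why an index on id has to be created before sharding on it.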

Add an index to the id key:
mongos> db.nonshard.ensureIndex({id:1})


Shard the collection:
mongos> sh.shardCollection("test.nonshard",{"id":1} )
{ "collectionsharded" : "test.nonshard", "ok" : 1 }

Insert docs into sharded collection:
mongos> for(var i=0;i<100000;i++){ db.nonshard.save({id:i}) }


Verify the sharding status is correct:

mongos> db.printShardingStatus()
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "rs1", "host" : "rs1/localhost.localdomain:10001" }
{ "_id" : "rs2", "host" : "rs2/localhost.localdomain:10002" }
{ "_id" : "rs3", "host" : "rs3/localhost.localdomain:10003" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "testshard", "partitioned" : true, "primary" : "rs1" }
{ "_id" : "test", "partitioned" : true, "primary" : "rs1" }
test.nonshard chunks:
rs3 1
rs1 1
rs2 1
{ "id" : { $minKey : 1 } } -->> { "id" : 0 } on : rs3 Timestamp(3000, 0)
{ "id" : 0 } -->> { "id" : 16687 } on : rs1 Timestamp(3000, 1)
{ "id" : 16687 } -->> { "id" : { $maxKey : 1 } } on : rs2 Timestamp(2000, 0)
mongos>  
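The chunk ranges in the output above decide where each document lives: a chunk includes its lower bound and excludes its upper bound. The sketch below hard-codes the three ranges from this particular run in a made-up helper, shard_for_id, and only handles integer ids. The split points on your cluster will differ, so treat the numbers as an example.

```shell
# shard_for_id ID: print the shard that owns ID according to the chunk
# ranges shown in the sharding status output above.
shard_for_id() {
    id="$1"
    if [ "$id" -lt 0 ]; then
        echo rs3        # { $minKey : 1 } -->> 0
    elif [ "$id" -lt 16687 ]; then
        echo rs1        # 0 -->> 16687
    else
        echo rs2        # 16687 -->> { $maxKey : 1 }
    fi
}

for id in 0 16686 16687 99999; do
    echo "id $id -> $(shard_for_id "$id")"
done
```

The boundary value 16687 itself belongs to the rs2 chunk, because it is that chunk's inclusive lower bound.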
