Wednesday, February 6, 2013

zookeeper logs snapshots PurgeTxnLog example

As the ZooKeeper documentation warns, ZooKeeper keeps its logs and snapshots forever. Nothing deletes or cleans them up by default, so if you don't add a cleanup step after installing ZooKeeper you will eventually run out of disk space, and your system will stop working in the middle of your development. That is what happened to me: a 50G disk drive, 100% utilized at the root volume level.

The default configuration file at /etc/zookeeper/conf/zoo.cfg sets dataDir to /var/lib/zookeeper.

Your snapshots and logs are in the directory /var/lib/zookeeper/version-2. After ZooKeeper has been running for a while, it fills up with snapshot.* and log.* files.

It is a big mess of files that I have no use for. Today is 2/5 and I still have files from November of last year.
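To see how bad things are on your own machine, something like the following sketch may help. It assumes the data directory from this post; the helper name count_zk_files and the ZK_VERSION_DIR variable are my own, not part of ZooKeeper.

```shell
#!/bin/sh
# Hypothetical helper: count snapshot.* and log.* files directly inside a directory.
count_zk_files() {
    dir="$1"
    count=0
    for f in "$dir"/snapshot.* "$dir"/log.*; do
        # An unmatched glob stays literal, so check that the file really exists.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Assumed location, taken from the dataDir described above.
ZK_VERSION_DIR="${ZK_VERSION_DIR:-/var/lib/zookeeper/version-2}"

if [ -d "$ZK_VERSION_DIR" ]; then
    df -h "$ZK_VERSION_DIR"                # how full the volume is
    du -sh "$ZK_VERSION_DIR"               # how much of that is ZooKeeper data
    echo "snapshot and log files: $(count_zk_files "$ZK_VERSION_DIR")"
    ls -ltr "$ZK_VERSION_DIR" | head -n 6  # oldest entries first
fi
```

If the count keeps growing between runs, nothing is purging the directory.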

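To delete them, cd into /usr/lib/zookeeper and run PurgeTxnLog. The first path is the transaction log directory, the second is the snapshot directory (both are dataDir here), and -n 4 keeps the four most recent snapshots along with their transaction logs: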

[dc@localhost zookeeper]$ sudo java -cp zookeeper-3.4.3-cdh4.1.2.jar:lib/slf4j-log4j12-1.6.1.jar:lib/slf4j-api-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog /var/lib/zookeeper/ /var/lib/zookeeper/ -n 4

The command prints each file as it deletes it:

Removing file: Jan 23, 2013 10:56:36 PM /var/lib/zookeeper/version-2/snapshot.e91c
Removing file: Jan 23, 2013 10:59:55 PM /var/lib/zookeeper/version-2/snapshot.e942
Removing file: Dec 27, 2012 8:15:27 PM /var/lib/zookeeper/version-2/snapshot.23d9
Removing file: Jan 2, 2013 2:00:19 PM /var/lib/zookeeper/version-2/snapshot.240a
Removing file: Dec 30, 2012 1:53:16 PM /var/lib/zookeeper/version-2/snapshot.23ed
Removing file: Jan 4, 2013 8:17:07 AM /var/lib/zookeeper/version-2/snapshot.2422
Removing file: Jan 2, 2013 4:14:02 AM /var/lib/zookeeper/version-2/snapshot.2409

When it completes, the version-2 directory holds only the most recent snapshots and their logs.

Much cleaner!!!

Add this exact command to a cron job so that it runs at regular intervals without manual intervention.
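One way to wire this up is a small wrapper script that cron can call. This is only a sketch: the jar names are the ones from my install, and ZK_HOME, DATA_DIR, KEEP, RUN_PURGE and the script path in the crontab comment are names I made up for illustration. It prints the command by default and only deletes when RUN_PURGE=1 is set.

```shell
#!/bin/sh
# Sketch of a cron wrapper around the PurgeTxnLog command shown earlier.
# Example crontab entry (hypothetical script path), daily at 03:00:
#   0 3 * * * RUN_PURGE=1 /usr/local/bin/zk-purge.sh
# ZK_HOME, DATA_DIR, KEEP and RUN_PURGE are illustrative, not ZooKeeper settings.
ZK_HOME="${ZK_HOME:-/usr/lib/zookeeper}"
DATA_DIR="${DATA_DIR:-/var/lib/zookeeper}"
KEEP="${KEEP:-4}"

# Same classpath as the manual command, made absolute because cron
# does not start in /usr/lib/zookeeper.
purge_classpath() {
    echo "$ZK_HOME/zookeeper-3.4.3-cdh4.1.2.jar:$ZK_HOME/lib/slf4j-log4j12-1.6.1.jar:$ZK_HOME/lib/slf4j-api-1.6.1.jar:$ZK_HOME/lib/log4j-1.2.15.jar:$ZK_HOME/conf"
}

# Print the full command line, useful for checking before anything is deleted.
purge_command() {
    echo "java -cp $(purge_classpath) org.apache.zookeeper.server.PurgeTxnLog $DATA_DIR $DATA_DIR -n $KEEP"
}

if [ "${RUN_PURGE:-0}" = "1" ]; then
    # Really purge: keep the KEEP most recent snapshots and their logs.
    java -cp "$(purge_classpath)" org.apache.zookeeper.server.PurgeTxnLog \
        "$DATA_DIR" "$DATA_DIR" -n "$KEEP"
else
    # Dry run: only show what would be executed.
    purge_command
fi
```

Install the entry in the crontab of a user that is allowed to delete files under dataDir, since I needed sudo for the manual run.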

1 comment:

  1. Since zookeeper 3.4.0 you can use:

    # Auto purge feature keeps this number of most recent snapshots
    # and the corresponding transaction logs
    autopurge.snapRetainCount = 5

    # The time interval in hours for which the purge task has to be triggered
    autopurge.purgeInterval = 24