Persistence Adaptors and ActiveMQ Options

October 01, 2013

Where do you find ActiveMQ options that aren't listed on the ActiveMQ site pages? And how do you configure the newer, more pluggable persistence and lock adapters?

Recently, we've had problems around ActiveMQ again - less with the brokers themselves than with clients opening an ever-increasing number of connections, which eventually crashes the broker. While we try to understand the issue with the clients (it is unlikely to be caused by ActiveMQ), we tried reconfiguring the ActiveMQ brokers to cope with the load. One thing mentioned previously was turning off Network of Brokers - it fails too easily and ends up in a split brain. Restarting a broker fixes that, but the restart can push the clients - consumers and producers - into opening still more connections. In other words, we were trapped - increasing connections causing ActiveMQ problems causing increasing connections causing ...

While we were experimenting, we went back to a SQL persistence layer rather than KahaDB. In pseudo-random testing it was about 60-70% as fast as KahaDB, but under real workloads it seemed noticeably slower. The consumers and producers seemed to think so as well: the spiraling connections were much worse with SQL-backed storage, presumably due to slower performance causing more connection issues causing slower performance causing ... you get the picture.

We're tempted by a solution based completely on NFS shared storage, but some colleagues are nervous about KahaDB stale file issues over NFS when the files are shared between brokers. If we have KahaDB files that won't clear and are shared between two brokers, it seems even more problematic than our current setup.

A couple of ideas: ActiveMQ supports mKahaDB, which allows you to have different KahaDB files for different queues or topics (with "catch all" defaults available). That way the slow and fast consumers can be separated from each other and the file size can be controlled better. It might also help with stale file issues. See this page for more: http://activemq.apache.org/kahadb.html
Another idea is to switch to LevelDB, which should be faster than KahaDB according to the ActiveMQ site and also might not suffer from the stale file problems. Or maybe it does.
Yet another option is to use the ActiveMQ pluggable storage lockers http://activemq.apache.org/pluggable-storage-lockers.html and set the storage to be a local directory for each broker while the lock file lives in a shared directory.** OK, this option isn't great because messages could be stranded on the failed broker, especially if you have some slow consumers, but we might prefer a little manual work after a failover to risking downtime - well, some of the other guys might.

Regardless of approach, the pluggable storage lets you separate locks from data if needed and set some options.** Here is a quick example:
<persistenceAdapter>
  <kahaDB directory="activemq-data">
    <locker>
      <shared-file-locker lockAcquireSleepInterval="100000"/>
    </locker>
  </kahaDB>
</persistenceAdapter>
For setting the lock directory, use:
<shared-file-locker directory="activemq-lock-directory"/>
or similar. A few of the options for the pluggable storage and lockers are on the main ActiveMQ site, but there are more that aren't.

The easiest way to find all the ActiveMQ XML configuration options, including the storage locking ones, is to look directly at the ActiveMQ XML schema reference. Here's the link to the kahaDB store:
http://activemq.apache.org/schema/core/activemq-core-5.8.0-schema.html#kahaDB
and to the shared-file-locker which is a possible element of that store:
http://activemq.apache.org/schema/core/activemq-core-5.8.0-schema.html#shared-file-locker
Be aware of version numbers in the two links above (both are for 5.8.0).
The ActiveMQ guys are still finishing the separation of storage from locks, and not all of it made it into 5.8 - mixing KahaDB with SQL lease locks might have to wait until 5.9 (5.9 is now out with the SQL lease locks):
https://issues.apache.org/jira/browse/AMQ-4365
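
Since the two schema links differ only in the version number and the element name, a small helper can build them. This is just a sketch: schema_url is a made-up function name, the pattern is copied from the 5.8.0 links above, and it is an assumption that other versions publish a page at the same path.

```shell
#!/bin/sh
# Sketch: build the schema reference URL for an ActiveMQ version and element.
# The URL pattern comes from the 5.8.0 links; whether other versions use the
# same path is an assumption to verify in a browser.
schema_url() {
    version=$1
    element=$2
    echo "http://activemq.apache.org/schema/core/activemq-core-${version}-schema.html#${element}"
}

# Reproduces the two links given above.
schema_url 5.8.0 kahaDB
schema_url 5.8.0 shared-file-locker
```

Swap in the version you actually run before trusting the option list, because elements and attributes change between releases.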

*** Update - a little late in mentioning this, but ActiveMQ 5.9 appears to have everything needed to use shared locking with individual KahaDB stores - we'll test that and report back.
The options in the <statements/> section can be found by looking at the code, which also has useful pieces like the SQL create statements. Hopefully, we can drop the workaround below now!

**OK, one problem - the XML accepts the shared-file-locker directory="..." syntax, but the code does NOT do anything with it! After trying this, we realized the lock file was still being put in the same location as the data files. Reviewing the ActiveMQ code (search for SharedFileLocker.java) made it clear that the feature hasn't been finished yet. So how can you get the same effect? With Linux file locking and ownership: a script that either locks the 'lock' file or changes its ownership lets you control ActiveMQ startup. Detecting and setting the lock takes a little work with NFS, flock, or perhaps something like Python - actually, just changing the ownership is easier. Since you're really trying to detect whether the other ActiveMQ is running, looking for a lock is nice, but you could just curl one of the standard URLs on the other broker to see if it is running - not foolproof, but perhaps workable with the right supplemental checks.
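
As a rough sketch of the curl idea: the functions below treat any HTTP response from the other broker as "running". The function names, the other-broker host name and the /admin/ path are assumptions for illustration; 8161 is the usual web console port, but check your own jetty configuration. HTTP error statuses such as 401 are deliberately counted as "up", since a console that asks for a password is still a running broker.

```shell
#!/bin/sh
# Sketch: decide whether another broker is running by probing an HTTP URL.
# OTHER_BROKER_URL is a hypothetical default; adjust host, port and path.
OTHER_BROKER_URL=${OTHER_BROKER_URL:-http://other-broker:8161/admin/}

# Exit 0 when curl completes an HTTP exchange with the URL, non-zero when the
# connection is refused, times out or the host cannot be resolved.
broker_is_up() {
    # -s silences progress output, -o discards the body,
    # --max-time stops a hung broker from blocking the check
    curl -s -o /dev/null --max-time 5 "$1"
}

# Print "up" or "down" for the given URL (defaults to OTHER_BROKER_URL).
broker_state() {
    if broker_is_up "${1:-$OTHER_BROKER_URL}"
    then
        echo "up"
    else
        echo "down"
    fi
}

# Example use inside a startup wrapper:
#   [ "$(broker_state)" = "down" ] && echo "other broker not answering"
```

A broker that is alive but wedged can still answer HTTP, and a network blip can make a healthy one look down, so this only makes sense alongside the supplemental checks mentioned above.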

One alternative for a shared lock is the DB shared locker, but in the spirit of avoiding the DB (though it would be fine to use it just for a lock!), here's a little script that flags the lock by leaving a file on the NFS mount (msg_dir). It's set to look at /proc/kmsg as a test; change it to the activemq/lock file instead.

#!/bin/sh
# Check whether a file is locked, or at least opened, and indicate it with a marker file.
# Using marker files because flock across NFS didn't initially work for me.
fn=/proc/kmsg
#fn=$activemqdir/lock
hn=$(hostname)
msg_dir=/tmp
lock_marker=$msg_dir/file_is_locked_on_$hn
unknown_marker=$msg_dir/file_NOT_or_unknown
standby_server="no"
amq_lock_file=/tmp/activemq_home_lock

# lsof exits 0 and prints a process id when the file is open, meaning it is in
# use (by activemq once fn points at its lock file); otherwise it is not open.

if [ -e "$fn" ]
then
   fopened=$(lsof -wt "$fn")
   exit_value=$?
else
   echo "no file to check!"
   fopened=""
   exit_value=1
fi

if [ -n "$fopened" ] && [ "$exit_value" -eq 0 ]
then
   echo "file is locked"
   echo "$fopened" > "$lock_marker"
   rm -f "$unknown_marker"
   if [ "$standby_server" = "yes" ]
   then
      sleep 5 # make the standby server wait in case it is racing another server for the lock
      remote_lock=$(ls -1 "$msg_dir" | grep locked | grep -vc "$hn")
      if [ "$remote_lock" -gt 0 ]
      then
         touch "$unknown_marker"
         rm -f "$lock_marker"
      else
         echo "still not locked remotely so setting for local startup"
      fi
   fi
else
   echo "not locked or unknown"
   touch "$unknown_marker"
   rm -f "$lock_marker"
fi

# A marker now shows whether an instance is up; use it to control activemq

remote_lock=$(ls -1 "$msg_dir" | grep locked | grep -vc "$hn")
local_lock=$(ls -1 "$msg_dir" | grep locked | grep -c "$hn")
echo "remote lock: $remote_lock; locally locked: $local_lock"
if [ "$remote_lock" -gt 0 ]
then
   chown root:root "$amq_lock_file" # prevent activemq from locking and starting
elif [ "$local_lock" -eq 1 ]
then
   chown activemq:activemq "$amq_lock_file" # allow activemq to lock and start
   # could just start activemq at this point
fi
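
One thing to watch in the marker check: it greps for the host name, so a host called amq1 would also match a marker left by amq10. Here is a sketch of a stricter count that compares the host suffix of each marker file exactly. count_markers is a made-up helper, not part of the script above, and the host names in the demo are invented.

```shell
#!/bin/sh
# Sketch: count "locked" markers left by other hosts and by this host.
# count_markers is a hypothetical helper; it prints "<remote> <local>".
count_markers() {
    dir=$1
    host=$2
    remote=0
    local_count=0
    for f in "$dir"/file_is_locked_on_*
    do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Strip everything up to the marker prefix, leaving only the host name.
        if [ "${f##*/file_is_locked_on_}" = "$host" ]
        then
            local_count=$((local_count + 1))
        else
            remote=$((remote + 1))
        fi
    done
    echo "$remote $local_count"
}

# Demo in a throwaway directory with invented host names.
demo_dir=$(mktemp -d)
touch "$demo_dir/file_is_locked_on_amq10" "$demo_dir/file_NOT_or_unknown"
count_markers "$demo_dir" amq1    # the amq10 marker counts as remote
count_markers "$demo_dir" amq10   # the same marker counts as local
rm -rf "$demo_dir"
```

The two numbers can stand in for remote_lock and local_lock in the final if/elif, for example with read remote_lock local_lock fed from the function's output.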

Comments

Anonymous, August 28, 2014 at 8:01 AM
Hi JM,
It was a very nice blog post. Currently, we're trying to set up a HA activemq cluster and I'm trying to choose the correct option for our system. Have you checked newer versions of AMQ and how do you compare Master/Slave options such as Shared Disk and Replicated LevelDBStore?
Regards

Anonymous, October 3, 2014 at 2:49 AM
Hi, thanks for sharing your experiences with AMQ.
One thing that concerns me about your setup, which I assume is multiple KahaDBs + DB locking, is failover. What will happen if one of your KahaDB fails? Do you somehow prevent losing messages from a failed KahaDB instance?

jm, September 23, 2015 at 11:02 PM
Sorry for the late reply! With separate KahaDBs and a single DB lock, there is definitely the possibility that messages will be held in the unavailable ActiveMQ. We've handled this by bringing that ActiveMQ up on a different port (to prevent clients from connecting) and moving the messages to the active instance. It's not ideal in terms of manual recovery work - although if your messages are short lived you don't need to do this. We would use shared storage or replicated storage as our main setup if possible. We're trying replicated LevelDB now, so will feed back shortly on that experience.