7. Advanced QueueMetrics monitoring

Recent versions of the Java Virtual Machine (JVM) offer powerful APIs to monitor and diagnose live systems while they are running; they are designed to be used in production with negligible performance impact.

This can be useful to diagnose specific problems, e.g. Java heap exhaustion issues, or to monitor the activity of your QM servers.

7.1. Prerequisites

  • A QueueMetrics instance running under JDK 6 or newer. The Java version in use is shown on the License page of QueueMetrics.

Recent versions of QueueMetrics installed using ’yum’ should already be running under JDK 6. If this is not the case, you should upgrade the ’queuemetrics-java’ package.

7.2. Assessing memory problems

If you suspect memory issues, you should take several memory and thread dumps, a couple of hours apart, and send them to Loway for inspection.

We will usually need:

  • The current memory settings
  • A memory dump
  • A thread dump

They should be obtained as described below.

7.2.1. Finding the current QueueMetrics PID

To perform the procedures described below, you must know the PID of your currently running QueueMetrics instance. It can usually be found by running:

[root@qm ~]# ps fax | grep catalina
32313 pts/0    S+     0:00          \_ grep catalina
12345 ?        Sl     0:14 /usr/java/jdk1.6.0_17/bin/java -Xms128M .....

In this example, QM is running with a PID of 12345; the other line is the ’grep’ command itself and can be ignored.
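If you need the PID in a script, the lookup above can be wrapped in a small filter. This is only a sketch: the function name ’qm_pids’ is hypothetical, and it assumes that the QueueMetrics command line contains both "java" and "catalina", as in the example output.

```shell
# Print the PID (first column) of every "ps fax"-style line read on stdin
# that mentions both "catalina" and "java", skipping the grep process itself.
# The name qm_pids is hypothetical, chosen for this example only.
qm_pids() {
    awk '/catalina/ && /java/ && !/grep/ { print $1 }'
}

# Typical use on the live server:
#   ps fax | qm_pids
```

If more than one PID is printed, more than one Tomcat instance is running on the server and you will have to pick the right one by hand.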

The PID is used to attach to the current JVM and query it. It is also possible to start the JVM so that it allows administrative access over a network; therefore all the procedures described below can be run on a remote JVM as well.
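The memory settings of a running instance are normally visible on its command line (note the ’-Xms128M’ in the output above). As a rough sketch, the filter below pulls the usual heap-related options out of a command line; the function name ’jvm_memory_flags’ is hypothetical, and the list of options it looks for is an assumption that may not cover every setting in use on your system.

```shell
# Read a JVM command line on stdin and print its memory-related options,
# one per line. Only a few common options are matched (assumption):
# -Xms, -Xmx, -Xss, -XX:PermSize and -XX:MaxPermSize.
jvm_memory_flags() {
    tr ' ' '\n' | grep -E '^-(Xms|Xmx|Xss|XX:(Max)?PermSize)'
}

# Typical use, with the PID found above:
#   ps -o args= -p 12345 | jvm_memory_flags
```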

7.2.2. Taking a memory dump

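A memory dump (class histogram) presents a (long) list of all the loaded Java classes, with the number of instances of each and the memory they occupy. The ’live’ option counts only objects that are still reachable, so the JVM runs a full garbage collection before producing the list.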

[root@qm ~]# /usr/java/jdk1.6.0_17/bin/jmap -histo:live 12345

You should also collect general memory area usage statistics by running:

[root@qm ~]# /usr/java/jdk1.6.0_17/bin/jmap 12345

7.2.3. Taking a thread dump

A thread dump shows, thread by thread, what each thread is doing at a given moment. This is useful to diagnose load-based issues where too many requests and open sessions "flood" the QM server.

[root@qm ~]# /usr/java/jdk1.6.0_17/bin/jstack -l 12345

This lets you know what a "frozen" server with high CPU usage is actually doing.
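Since several sets of dumps are usually needed, it may be convenient to collect all three outputs in one go. The following is a minimal sketch, not an official Loway tool: the function name ’collect_dumps’, the ’JDK_BIN’ variable and the file names are hypothetical, and the JDK path is simply the one used in the examples above.

```shell
# Directory holding the JDK tools; adjust it to your installation.
JDK_BIN="${JDK_BIN:-/usr/java/jdk1.6.0_17/bin}"

# collect_dumps PID OUTDIR
# Writes a class histogram, a memory summary and a thread dump into OUTDIR,
# all tagged with the same timestamp so that each set can be told apart.
collect_dumps() {
    pid="$1"
    outdir="$2"
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$outdir" || return 1
    "$JDK_BIN/jmap" -histo:live "$pid" > "$outdir/histo-$stamp.txt"   2>&1
    "$JDK_BIN/jmap" "$pid"             > "$outdir/memory-$stamp.txt"  2>&1
    "$JDK_BIN/jstack" -l "$pid"        > "$outdir/threads-$stamp.txt" 2>&1
}

# Typical use, repeated a couple of hours apart:
#   collect_dumps 12345 /tmp/qm-dumps
```

The commands must be run by the same user that owns the QueueMetrics process (or by root), otherwise the tools will not be able to attach to the JVM.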

7.3. Remote monitoring with VisualVM

’VisualVM’ is a graphical tool developed by Sun that lets you monitor a remote QueueMetrics instance while it’s running (it can actually be used with any Java-based process).

It allows monitoring over a network, so it is common to run it on a workstation to monitor one or more remote servers.

You can find it at: https://visualvm.dev.java.net/

7.3.1. Allowing remote access

The standard JVM settings for QM ’do not’ allow remote access over a network, for obvious security reasons. In order to allow it, you should add the following line to ’/etc/init.d/queuemetrics’ (all on one line):

export JAVA_OPTS="-Dcom.sun.management.jmxremote.port=9999
                  -Dcom.sun.management.jmxremote.authenticate=false
                  -Dcom.sun.management.jmxremote.ssl=false $JAVA_OPTS"

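Restart the JVM after adding it. The port (9999 in this example) can be changed to suit your environment. As these settings disable both authentication and SSL, access to the port should be limited to trusted hosts.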
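As the options must end up on a single line, it may be easier to build them with a small helper. This is only a sketch: the function name ’jmx_opts’ is hypothetical, while the three properties are exactly the ones listed above.

```shell
# jmx_opts PORT
# Prints, on a single line, the three JMX options shown above for the given port.
# Authentication and SSL are disabled, exactly as in the manual's example.
jmx_opts() {
    port="$1"
    printf '%s %s %s\n' \
        "-Dcom.sun.management.jmxremote.port=$port" \
        "-Dcom.sun.management.jmxremote.authenticate=false" \
        "-Dcom.sun.management.jmxremote.ssl=false"
}

# Typical use inside /etc/init.d/queuemetrics:
#   export JAVA_OPTS="$(jmx_opts 9999) $JAVA_OPTS"
```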

7.3.2. Starting VisualVM

To start VisualVM, you run ’bin/visualvm.exe’.

Once it has started, click on "Add server" and enter the IP address of your QM server. Click on "Advanced settings" and set the port to the one you specified in the QM configuration (9999 in this example).

Then select your server and choose "Add JMX connection" from the right-click menu. Enter the JMX connection as "IP:9999".

By clicking on it, you get a working connection, like in the picture below:

./Pictures/visualvm.png

7.3.3. Things you can do in VisualVM

A number of interesting things can be done with VisualVM:

  • ’Know your JVM’: you can see the JVM settings from ’Overview’ / ’JVM arguments’.
  • ’Memory monitoring’: you can see the current CPU, memory and thread usage from the ’Monitor’ page. Note that with most settings it is normal for all available memory to be used up before a garbage collection is performed, so you should expect to see peaks and drops in the graph. You can also force a garbage collection if you want to see the "true" memory usage, but this may be unwise on heavily loaded production servers.
  • ’Thread monitoring’: you can get a textual thread dump like the one discussed above by selecting ’Threads’ / ’Thread dump’.
  • ’Sampling’: you can use the ’Sampler’ to acquire a breakdown of memory and CPU usage per class.