We wanted to monitor RabbitMQ through Wavefront, so we followed the steps below to configure the integration:
- Install the Telegraf agent.
- Enable the RabbitMQ management plugin.
- Configure the Telegraf agent to use the RabbitMQ input plugin.
- Restart Telegraf.
Even after completing these steps, we could not see complete data in the Wavefront dashboards. Basic node-level data appeared, but queue-level data did not. We checked /var/log/messages, where Telegraf writes its logs, and found the following messages:
Apr 10 03:48:36 test-app1 telegraf: 2022-04-10T10:48:36Z E! [inputs.rabbitmq] Error in plugin: getting "/api/overview" failed: 401 Unauthorized
Apr 10 03:48:36 test-app1 telegraf: 2022-04-10T10:48:36Z E! [inputs.rabbitmq] Error in plugin: getting "/api/nodes" failed: 401 Unauthorized
Apr 10 03:48:36 test-app1 telegraf: 2022-04-10T10:48:36Z E! [inputs.rabbitmq] Error in plugin: getting "/api/exchanges" failed: 401 Unauthorized
Apr 10 03:48:36 test-app1 telegraf: 2022-04-10T10:48:36Z E! [inputs.rabbitmq] Error in plugin: getting "/api/queues" failed: 401 Unauthorized
Apr 10 03:48:39 test-app1 telegraf: 2022-04-10T10:48:39Z E! [inputs.rabbitmq] Error in plugin: Get "http://testrmq.vmware.com:15672/api/queues": net/http: timeout awaiting response headers
So there were two types of errors: 401 Unauthorized and a timeout. We tested whether the credentials in the inputs.rabbitmq configuration were correct with the curl command below:
curl -s -u <monitoringUser>:<password> http://testrmq.vmware.com:15672/api/queues
The curl command returned the expected results, so we ruled out authentication as the cause. Our environment has thousands of queues, so we suspected that fetching data for all queues from RabbitMQ was simply taking too long. After a web search on the header timeout, we changed the header_timeout value in inputs.rabbitmq and restarted the Telegraf agent. This time we got a different error:
[inputs.rabbitmq] Error in plugin: Get "http://testrmq.vmware.com:15672/api/queues": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
To fix this error, we increased client_timeout and restarted the Telegraf agent. Unfortunately, we got one more error: [agent] ["outputs.wavefront"] did not complete within its flush interval. To resolve this, we increased flush_interval from the default 10s to 20s. Bingo! The issue was finally resolved, and we could see the metrics in Wavefront.
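For reference, flush_interval is an agent-level setting rather than part of the RabbitMQ input. A minimal sketch of the change, assuming the setting lives in the [agent] section of the main telegraf.conf (the other agent settings are left out here):

```toml
[agent]
# Default is "10s"; raised so that outputs.wavefront can finish each flush.
flush_interval = "20s"
```

The header_timeout and client_timeout values, by contrast, belong to the [[inputs.rabbitmq]] table itself.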
But the 401 Unauthorized error was still in the logs. We raised Telegraf's log level by setting debug = true in telegraf.conf and found the following message: [inputs.rabbitmq] Requesting "http://localhost:15672/api/queues". localhost is not the URL we configured in inputs.rabbitmq. Looking further into the 401 Unauthorized messages in the logs, we found something suspicious: Loaded inputs: cpu disk kernel mem net processes rabbitmq (2x) swap system.
That means Telegraf had loaded the rabbitmq input twice, yet /etc/telegraf/telegraf.d contained only one configuration file related to RabbitMQ. We grepped the whole file system for "localhost:15672" but could not find anything. Our rabbitmq input config file looked something like this:
[[inputs.rabbitmq]]
url = "http://testrmq.vmware.com:15672"
username = "myuser"
password = "mypassword"
header_timeout = "10s"
client_timeout = "10s"
[[inputs.rabbitmq]]
nodes = ["rabbit@rmq1", "rabbit@rmq2","rabbit@rmq3"]
We tried commenting out the second inputs.rabbitmq to see what would happen. Surprisingly, the 401 Unauthorized errors disappeared. So Telegraf treats each [[inputs.rabbitmq]] directive as the configuration for a separate RabbitMQ instance. But the second inputs.rabbitmq directive had no URL or credentials, so what was Telegraf connecting to? It looks like Telegraf defaults the URL to http://localhost:15672 and the credentials to guest/guest.
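A minimal sketch of the merged configuration, assuming the nodes filter was always meant to apply to the same broker as the URL and credentials. With only one [[inputs.rabbitmq]] table, Telegraf loads a single plugin instance and nothing falls back to the localhost defaults:

```toml
# One table per RabbitMQ instance to monitor.
[[inputs.rabbitmq]]
url = "http://testrmq.vmware.com:15672"
username = "myuser"
password = "mypassword"
header_timeout = "10s"
client_timeout = "10s"
# Previously under a second [[inputs.rabbitmq]] header, which created
# a separate instance with default url and credentials.
nodes = ["rabbit@rmq1", "rabbit@rmq2", "rabbit@rmq3"]
```

After a restart, the "Loaded inputs" line in the Telegraf log should list rabbitmq once instead of rabbitmq (2x).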