graphite-dev team mailing list archive

[Question #137386]: relay-rules.conf seems to cause loss of some metrics

 

New question #137386 on Graphite:
https://answers.launchpad.net/graphite/+question/137386

I ran into a strange problem today. I've added a new relay-rules.conf file to my Graphite servers, and I'm not getting the results I expected.

Our system layout consists of four hosts that are divided into two redundant pairs. I'm using a naming convention like this for my metrics: source-of-metric.HOSTGROUP.hostname.metric-name.

Our initial use for Graphite is to feed in data from Ganglia and build more robust graphs on top of that data. However, we have a fairly large number of machines, with many metrics per host, so we've decided to split the load of incoming metrics across the two redundant pairs. Our 'FTR' hostgroups comprise about half of the metrics, so our initial relay-rules.conf looks like this:

[ganglia-ftr]
pattern = ganglia\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[default]
default = true
servers = graphite0.ourdomain.net, graphite1.ourdomain.net

This is working just fine. I can submit metrics to Graphite and they end up exactly where I expect them to. For example:

ganglia.FTRP-WEB1.ftrp-web1_ourdomain_net.cpu_usage -> Matches the first rule and goes to graphite2+graphite3
ganglia.FTRQ-DBA3.ftrq-dba3_ourdomain_net.disk_slash -> Matches the first rule and goes to graphite2+graphite3
ganglia.FLAP-SQL1.flap-sql1_ourdomain_net.swap_used -> Matches the second rule and goes to graphite0+graphite1
ganglia.FROD-CAT4.frod-cat4_ourdomain_net.nfs_activity -> Matches the second rule and goes to graphite0+graphite1

All is well with the world.

Now, I'm trying to add additional sources of data (such as GroundWork and one-off metrics), so I've added two rules in relay-rules.conf to send these additional metrics to the corresponding pairs. Our FTR* hostgroups still comprise about half of the overall load, so I intend to send the groundwork.FTR* and metrics.FTR* metrics over to graphite2+graphite3, just as I have with the Ganglia metrics. 

[ganglia-ftr]
pattern = ganglia\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[metrics-ftr]
pattern = metrics\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[groundwork-ftr]
pattern = groundwork\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[default]
default = true
servers = graphite0.ourdomain.net, graphite1.ourdomain.net
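As a sanity check on the rules themselves, the file above can be loaded into a small script that reports which section a given metric name would hit. This is only a sketch under stated assumptions: it assumes rules are tried in file order with the first match winning, and that patterns are applied as unanchored regular-expression searches; it is not carbon's own code. The metrics.* and groundwork.* sample names are hypothetical, modelled on the Ganglia examples.

```python
import configparser
import re

# The relay-rules.conf from the post, embedded so the sketch is self-contained.
CONF = r"""
[ganglia-ftr]
pattern = ganglia\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[metrics-ftr]
pattern = metrics\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[groundwork-ftr]
pattern = groundwork\.FTR.*
servers = graphite2.ourdomain.net, graphite3.ourdomain.net

[default]
default = true
servers = graphite0.ourdomain.net, graphite1.ourdomain.net
"""


def load_rules(text):
    """Return (pattern rules in file order, default rule) from the config text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    rules, default = [], None
    for section in parser.sections():
        servers = [s.strip() for s in parser.get(section, "servers").split(",")]
        if parser.getboolean(section, "default", fallback=False):
            default = (section, servers)
        else:
            regex = re.compile(parser.get(section, "pattern"))
            rules.append((section, regex, servers))
    return rules, default


def which_rule(metric, rules, default):
    """Assumed semantics: first matching pattern wins, otherwise the default."""
    for section, regex, servers in rules:
        if regex.search(metric):
            return section, servers
    return default


rules, default = load_rules(CONF)
for name in (
    "ganglia.FTRP-WEB1.ftrp-web1_ourdomain_net.cpu_usage",
    "ganglia.FLAP-SQL1.flap-sql1_ourdomain_net.swap_used",
    "metrics.FTRP-WEB1.ftrp-web1_ourdomain_net.test_metric",        # hypothetical
    "groundwork.FTRQ-DBA3.ftrq-dba3_ourdomain_net.service_status",  # hypothetical
):
    print(name, "->", which_rule(name, rules, default))
```

If the new names land on the sections you expect here, a typo in the patterns becomes less likely as the cause, though this says nothing about how the running relay actually loaded the file.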

However, after making this change (and this is the only change I've made), I'm unable to create new metrics in Graphite matching the patterns "metrics.*" or "groundwork.*". If I submit these metrics, carbon does not create any 'metrics' or 'groundwork' subdirectory in /opt/graphite/storage/whisper, no matter how many times I send the metrics.
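For reproducing the problem, a single test datapoint can be submitted by hand. The sketch below assumes the relay accepts carbon's plaintext line protocol ("path value timestamp", one datapoint per line) and that it listens on port 2003; the host name and metric name in the usage comment are placeholders, not values from this setup.

```python
import socket
import time


def format_line(metric, value, timestamp=None):
    # One datapoint in plaintext form: "<metric path> <value> <unix timestamp>" plus newline.
    if timestamp is None:
        timestamp = int(time.time())
    return f"{metric} {value} {int(timestamp)}\n"


def send_metric(host, port, metric, value, timestamp=None):
    # Open a TCP connection, send exactly one line, and close again.
    line = format_line(metric, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
    return line


# Usage (placeholder host and assumed port; adjust to wherever the relay listens):
# send_metric("relay.example.net", 2003,
#             "metrics.FTRP-WEB1.ftrp-web1_ourdomain_net.test_metric", 1)
```

Sending the same line once with each version of relay-rules.conf in place gives a like-for-like comparison of what gets created on the backends.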

If I revert the change and go back to the first relay-rules.conf, I can submit "metrics.*" and "groundwork.*" metrics to Graphite and carbon creates the appropriate subdirectories, populates them with whisper files, and I'm able to graph them in the web UI like you'd expect. However, since the rules are no longer present, all of the groundwork.* and metrics.* metrics end up matching the default rule and appear together on the same pair of backend servers, which is not what I want.
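To check each backend after a test submission, a metric name can be mapped to the whisper file it should produce. This sketch assumes the usual layout in which each dot-separated component becomes a directory under /opt/graphite/storage/whisper and the last component becomes a .wsp file; the metric name in the usage comment is hypothetical.

```python
from pathlib import Path

WHISPER_ROOT = Path("/opt/graphite/storage/whisper")


def whisper_path(metric, root=WHISPER_ROOT):
    # "a.b.c" -> <root>/a/b/c.wsp
    parts = metric.split(".")
    return root.joinpath(*parts[:-1], parts[-1] + ".wsp")


def check_metric(metric, root=WHISPER_ROOT):
    # Report whether the top-level directory and the whisper file exist on this host.
    path = whisper_path(metric, root)
    top_level = root / metric.split(".")[0]
    return {
        "path": str(path),
        "top_level_dir_exists": top_level.is_dir(),
        "file_exists": path.is_file(),
    }


# Usage, run on each of the four backends:
# print(check_metric("metrics.FTRP-WEB1.ftrp-web1_ourdomain_net.test_metric"))
```

Running it on all four hosts shows whether the datapoint reached neither pair, the wrong pair, or the intended pair.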

I'm probably doing something stupid - let me know if you need any more information to assist.

Thanks in advance,
Sean 
