
graphite-dev team mailing list archive

Re: [Question #228472]: DESTINATIONS vs CARBONLINK_HOSTS in a cluster

 

Question #228472 on Graphite changed:
https://answers.launchpad.net/graphite/+question/228472

    Status: Open => Answered

Anatoliy Dobrosynets proposed the following answer:
Yeah, I wonder too.

When the webapp receives a render request for a metric, it first checks whether the corresponding .wsp file(s) can be found locally (under the DATA_DIRS path from local_settings.py).
If found, the data are read from the file.
If the .wsp file is not found locally, the webapp queries each of the other webapp nodes defined in CLUSTER_SERVERS in turn, and each of them looks for the data file in its own local DATA_DIRS storage. Eventually, one of the nodes has it.
Then the webapp needs to merge the 'cold' file data it received with the 'hot' carbon-cache data (data points that carbon-cache holds but has not yet written to disk). To do this, the webapp uses consistent hashing to select the cache instance (host:port) to query.
The cold and hot data are merged, converted (csv/json/png), and sent back in response to the initial request.
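
To make the merge step concrete, here is a minimal sketch of overlaying 'hot' points on 'cold' ones. This is not the webapp's actual code: the function name, the (timestamp, value) representation and the "hot wins on collision" rule are my assumptions for illustration.

```python
def merge_cold_and_hot(cold_points, hot_points):
    """Overlay cached (hot) datapoints on datapoints read from disk (cold).

    Both arguments are iterables of (timestamp, value) pairs. A value of
    None marks a slot with no data. Assumption: hot values win whenever
    both sources have a point for the same timestamp.
    """
    merged = dict(cold_points)
    for timestamp, value in hot_points:
        if value is not None:
            merged[timestamp] = value
    # Return the series ordered by timestamp.
    return sorted(merged.items())


# Hypothetical data: the last slot on disk is still empty, the cache has it.
cold = [(60, 1.0), (120, 2.0), (180, None)]
hot = [(180, 3.0), (240, 4.0)]
merged_series = merge_cold_and_hot(cold, hot)
```

If the webapp asks the wrong cache instance, hot_points is simply empty and the most recent slots stay unfilled until carbon-cache flushes them to disk.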

Both carbon-relay and the webapp use the same consistent-hashing
get_node(metric_key) function to select the node to work with. But the
hash_ring is built from a different set of arguments in each case.

For carbon-relay: hash_ring = ConsistentHashRing(DESTINATIONS)
For webapp:       hash_ring = ConsistentHashRing(CARBONLINK_HOSTS)

I just made a quick test to confirm that these two rings may return
different hosts:

DESTINATIONS = ["127.0.0.1:2013:a","127.0.0.1:2023:b","127.0.0.2:2013:a","127.0.0.2:2023:b"]
CARBONLINK_HOSTS = ["127.0.0.1:2013:a","127.0.0.1:2023:b"]

>>> hash_ring = ConsistentHashRing(DESTINATIONS)
>>> hash_ring2 = ConsistentHashRing(CARBONLINK_HOSTS)

They may return the same instance for one key:

>>> hash_ring.get_node('system')
'127.0.0.1:2013:a'

>>> hash_ring2.get_node('system')
'127.0.0.1:2013:a'

But they may return a completely different instance for another key:

>>> hash_ring.get_node('system.load')
'127.0.0.2:2023:b'

>>> hash_ring2.get_node('system.load')
'127.0.0.1:2023:b'
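
For anyone who wants to reproduce the effect without a carbon install, here is a small self-contained sketch. SimpleHashRing is a toy stand-in that I made up for illustration; it is not carbon's ConsistentHashRing, so the hosts it picks for a given key will not necessarily match the output above. The replica count, the md5-based positions and the sample metric names are all assumptions.

```python
import bisect
import hashlib


class SimpleHashRing:
    """Toy consistent-hash ring (hypothetical, NOT carbon's implementation)."""

    def __init__(self, nodes, replicas=100):
        # Each node is placed on the ring at `replicas` pseudo-random points.
        self._ring = []
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._position(f"{node}:{i}"), node))
        self._ring.sort()
        self._positions = [position for position, _ in self._ring]

    @staticmethod
    def _position(key):
        # Map a string to a 32-bit position on the ring.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest()[:8], 16)

    def get_node(self, key):
        # Pick the first ring point at or after the key, wrapping around.
        index = bisect.bisect_left(self._positions, self._position(key))
        return self._ring[index % len(self._ring)][1]


DESTINATIONS = ["127.0.0.1:2013:a", "127.0.0.1:2023:b",
                "127.0.0.2:2013:a", "127.0.0.2:2023:b"]
CARBONLINK_HOSTS = ["127.0.0.1:2013:a", "127.0.0.1:2023:b"]

relay_ring = SimpleHashRing(DESTINATIONS)
webapp_ring = SimpleHashRing(CARBONLINK_HOSTS)

# Hypothetical metric names, used only to sample the two rings.
sample_keys = [f"system.metric{i}" for i in range(200)]
mismatches = [key for key in sample_keys
              if relay_ring.get_node(key) != webapp_ring.get_node(key)]
```

Every key that the relay ring sends to a 127.0.0.2 instance ends up in mismatches, because the webapp ring has no way to return a host it was never given.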

My opinion is that CARBONLINK_HOSTS should always match
DESTINATIONS; otherwise hash_ring.get_node(key) can simply return the wrong
result, as above. In this example, carbon-relay writes the system.load
metric to 127.0.0.2:2023:b. The webapp is able to get the 'cold' data but fails
to merge it with the 'hot' data, because it looks for the hot data on the wrong
carbon-cache host. That is not fatal in the end, but we lose real-time
statistics.
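
A quick way to spot such a mismatch in a config is sketched below. The helper names are hypothetical, and comparing on (host, instance) while ignoring the port is my assumption, made because the two settings may legitimately list different ports for the same cache instance.

```python
def instance_ids(entries):
    """Return the set of (host, instance) pairs from 'host:port:instance' strings."""
    ids = set()
    for entry in entries:
        host, _port, instance = entry.split(":")
        ids.add((host, instance))
    return ids


def missing_from_carbonlink(destinations, carbonlink_hosts):
    """List cache instances the relay writes to but the webapp never queries."""
    return sorted(instance_ids(destinations) - instance_ids(carbonlink_hosts))


DESTINATIONS = ["127.0.0.1:2013:a", "127.0.0.1:2023:b",
                "127.0.0.2:2013:a", "127.0.0.2:2023:b"]
CARBONLINK_HOSTS = ["127.0.0.1:2013:a", "127.0.0.1:2023:b"]

missing = missing_from_carbonlink(DESTINATIONS, CARBONLINK_HOSTS)
```

With the two lists from my test, missing contains both instances on 127.0.0.2, which are exactly the ones whose 'hot' data the webapp cannot see.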

The carbon.conf documentation adds a lot of confusion here by saying that only
local carbon-cache instances should be listed in CARBONLINK_HOSTS.

It is possible that I'm just getting it wrong, of course, but I'd be very grateful to anyone who points out my mistakes here, and I'd buy them a pint of beer :)
Chris, we need you!

With regards,
Anatoliy Dobrosynets
