
drizzle-discuss team mailing list archive

Re: Wild feature: query de-duplication


(1) smart enough to ignore SQL comments (some folks stick a comment at
the end of the query with a unique host Id or something for stats and/or

(2) configurable so the site can decide it doesn't care about
transaction state and is willing to say that 2 queries are duplicates if
they match on a very small set of features (user makes sense, but not sure
about tx isolation level and other "exotic stuff")
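
As a rough illustration of points (1) and (2), a dedup key could be built by dropping comments and collapsing whitespace, then hashing the result together with whichever session features the site opts into. This is a Python sketch only, not Drizzle code: normalize, dedup_key, the session dict and the feature names are all hypothetical, and the comment handling is deliberately simple (for instance, it makes no special case for MySQL-style /*! ... */ hints).

```python
import hashlib
import re

# String literals are matched first so that comment markers and whitespace
# inside them are left untouched. A run of comments and/or whitespace is
# matched as one "junk" token and replaced by a single space.
_TOKEN = re.compile(
    r"""(?P<str>'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*")
      | (?P<junk>(?:/\*.*?\*/|--[^\n]*|\#[^\n]*|\s+)+)
    """,
    re.VERBOSE | re.DOTALL,
)


def normalize(sql):
    """Return sql with comments removed and whitespace runs collapsed."""
    def repl(match):
        return match.group(0) if match.group("str") else " "
    return _TOKEN.sub(repl, sql).strip()


def dedup_key(sql, session, features=("user",)):
    """Hash the normalized query plus the configured session features.

    session is a hypothetical dict of per-connection state; features is the
    site-configurable list of keys that must match for two queries to count
    as duplicates.
    """
    parts = [normalize(sql)]
    parts += ["%s=%r" % (name, session.get(name)) for name in features]
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()
```

With the default features, "SELECT 1 /* web01 */" and "SELECT  1 -- web02" from the same user produce the same key even if the two sessions differ in isolation level; passing features=("user", "isolation") makes them distinct again.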

The reason I suggest a "related" approach to seeing that two queries are
equal is that I bet most folks doing high volumes of possibly duplicate
queries are not doing anything fancy in the first place.  They're just
hitting a bank of slaves from their webbies and hoping to repopulate the
failed memcache tier without killing the DB boxes in the process.

Or am I nuts?

No, you are not nuts. That is how most people learn what a cache stampede is. =)

Yes, an active and a passive mode seem logical. Passive mode would take session state and similar context into consideration. Active mode could allow a query id to be sent (as with gearman). If that query id is already being worked on, the request is attached to that thread to wait for the answer. That way it could be controlled entirely by the client code.
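
To make the active mode concrete, here is a minimal Python sketch of the "attach to the in-flight request" idea. It is only an illustration of the pattern, not a proposal for how the engine would do it: the SingleFlight class, its run method, and the execute callback are hypothetical names, and the query id is whatever string the client chooses to send.

```python
import threading
from concurrent.futures import Future


class SingleFlight:
    """Run at most one execution per query id; later callers share its result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # query id -> Future of the running execution

    def run(self, query_id, execute):
        # Decide under the lock whether this caller leads or follows.
        with self._lock:
            future = self._in_flight.get(query_id)
            leader = future is None
            if leader:
                future = Future()
                self._in_flight[query_id] = future

        if leader:
            try:
                future.set_result(execute())
            except Exception as exc:
                # Followers see the same failure as the leader.
                future.set_exception(exc)
            finally:
                # Once finished, the next request with this id runs afresh.
                with self._lock:
                    del self._in_flight[query_id]

        # Followers block here until the leader has finished.
        return future.result()
```

The result is shared only while the query is in flight. Nothing is cached afterwards, so this addresses the stampede case without introducing stale results.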

I still want to hear from a Drizzle dev on the feasibility of this in the engine, whether it would be core or a plugin.

