← Back to team overview

openerp-expert-framework team mailing list archive

a plan for dealing with the many (Magento) custom EAV product attributes

 

Hello folks,

This is a know issue of Magentoerpconnect that we tend to pay a huge
performance penalty when Magento brings lot's of custom attributes to
OpenERP as extra columns in the 'product_product' table.
We discussed this here already:
https://bugs.launchpad.net/magentoerpconnect/+bug/648667 There are
historical reasons why we did it this way because OpenERP inherits was
vastly broken in OpenERP v5.
Sharoon and Sebastien just dicsussed it here again:
https://bugs.launchpad.net/openobject-server/+bug/731156


I've thought extensively to the topic and here is the progressive migration
path I propose toward a better performance (this can also be a generic
custom attributes pattern for OpenERP that's why I post it th the framework
list too):

1) move the magento attributes in a different table and define a new object
like "product.extended" which would inherit from product.product. Hence
loading Magento extra attributes will only happen when browsing that
extended object via a dedicated menu/wizard. Regular OpenERP browse/read of
"product.product" would then be as fast as usual. Doing this would be pretty
straightforward and bring a lot of performance back.
The "product.extended" view can be created dynamically overriding
fields_view_get much like we are doing and reusing the 'product.product'
view, just injecting the extra attributes in an extra tab as we are doing
already.

2) I don't think there is a real need for a full blown EAV pattern (or at
least I think it's not our job to take care of doing that in an ecommerce
connector): to gain extra perf, we could keep all the optional Magento
defined fields and make them "fields.function". Instead of all having their
custom columns, they could rather pull/push their value form/into a a JSON
( http://www.json.org/ )  text element inside a single field of
"product.extended" (could also be XML but I tend to find it more bloated).
Thus we would have only ~1 extra text field no matter how many extra custom
attributes, now matter how sparse they are. I think those sparse optional
attribute don't deserve to be SQL, it's just a custom data bag where we when
to store custom attributes in a formatted manner. The advantage of doing 1)
too in the scenario woudl be we avoid hitting always that JSON field +
in/out costly methods when we don't need to; that is however is still open
question fro me.

Magento had no suitable DDL engine, so they went EAV all the way, this is
also a very bad design choice having huge performances impacts on Magento as
soon as caching cannot be used (fortunately for Magento caching can often be
used in the web shop static display and then perceived speed is rather bound
to Apache/Foo serving static HTML)
http://inchoo.net/ecommerce/magento/escape-from-eav-the-magento-way/
http://thebuild.com/presentations/10-ways-to-kill-performance.pdf
BTW, there are rumors that Magento will try to step out of
that unbelievable design error in what they will call "Magento 2".

I think my design for 2) would bring the best of the SQL and NoSQL worlds to
OpenERP: attributes that will be used for search/join will be SQL; sparse
custom ones would be a JSON NoSQL databag.


What do you think?

Follow ups