← Back to team overview

maria-discuss team mailing list archive

Re: The future of Dynamic Columns

 

2015-01-29 11:38 GMT-02:00 Tom Worster <fsb@xxxxxxxxxx>:

> Hi Frederico,
>
> You identify three problems.
>
> 1. Developers are mostly ignorant of dynamic column or that they can solve
> this problem. I agree.
>
hum, i will point some problems and ideas...
normally developers don't think about how data is saved and handled by
database / storage engine / storage media / cached, etc...
that's something that not all developers take care, today we have 1TB hdd
easily, some years ago 250GB was very much, and some years ago 4GB was very
very big (we couldn't use 32bits machines with more than 4GB)
with hdd we have problem with data access... no more, ssd is something that
will save many guys 250mb/s of random access is very good, some years ago
we only have this with hdd+raid systems

dynamic column consume more space, but are more flexible, i think the use
of dynamic column are more related to the developer knowledge... for
example, let's talk about a issue tracker system where we don't know what
we will expand
let's get mariadb issue tracker exeample... Jira, create a new variable at
Jira system, does jira use a table with "issue_id / variable/ value"
columns to save it? or it use a dynamic column at "issues" table? both
works, but what is better?
that's where json/yaml/xml/dynamic columns can help, today i see many many
many developers using "issue_id / variable / value" cause we don't have a
index to fast query system, other point is... with a single table (issues
table) we need more lock control probably, updating a row constantly is
slower than updating just the table that we need if you don't tune database

well... i think users want options to solve problems, dynamic columns is a
option (very good), but without index, i think we can't increase the use of
dynamic columns, users will use dynamic columns, but when they get a table
with >1M rows, they will think about well that query with dynamic column
take 2seconds to return, how could i optimize it?... create a new field?
create a table? we can't index it easly.. that's the main problem that i
see with dynamic columns




> 2. Maria's interface is not easy.

I think mariadb/mysql is easier to use than pgsql, oracle, sql server,
odbc, and some others


> I agree but my opinion has changed
> during this discussion. There's a limit to what the server can do about
> this. Also, the CRUD app developer is unlikely to be writing SQL.

yes, that's a big problem in my opnion, i prefer write each sql with
strings and stress with sql injection tests


> So I think we need to look farther up the stack for the solution(s).
>
that's a nice solution when you use it :), but think about where to solve
this problem... at client side or at server side? is your server better
than your client (cpu/memory/network/cache/etc)?


>
> 3. No indexes. I partly agree. On one hand, if you really want to sell the
> feature to the broad user community then yes, this is an obvious hole in
> your marketing story.

hum... well many years ago i used postgresql, interbase, sql server and
oracle... i prefer mysql cause it's easy to use and fast to solve (simple)
queries, with dynamic columns i don't see 'simple' queries running fast, if
i really need dynamic columns in my project where i can't live without it,
i will use another database, or probably will use a virtual column or
something to solve this and use more disk/cache/buffer than i should



> But the feature is still very useful without it.
>
yes! increase number of columns is a problem without online alter table,
when you have more than 1 dimension inside dynamic column, a table with
id-key-value can't handle well


> Many applications will function fine without indexes.

yeap


> The general problem (a type with too many properties) is old and has been
> solved in the past
> without indexes. Dynamic columns are a big step forwards even without
> the indexes.
>
yes and no... dynamic columns today are nice, cause we have space, and big
machines, but without index we can't use it with big tables >1M rows for
example...


>
> The way I see it now, we are complaining to the wrong people. We have to
> do some work on the client side (where exactly, I'm not sure) and we need
> better marketing.
>
well if you want to handle data (convert to/from json, include a column,
remove a column) probably it's nice to do something at client side, but if
you want to optimize queires , optimize space, and others things you will
need to improve database side, index is a first good step to optimize any
table scan query


>
> Tom
>
>
> On 1/29/15, 2:56 AM, "Federico Razzoli" <federico_raz@xxxxxxxx> wrote:
>
> >Hi
> >
> >I agree that EAV is a pain in the ass. Having several very similar tables
> >that represent flavors of the same "parent entity" is a pain too.
>
>Dynamic Columns are logically simple.
>
yeap, agree


> >
> >I think that the problem is that people don't really know them. And if
> >one tries to explore them, he finds a syntax that is not really easy.
> >Also they are slow, unless you index them, but indexing a dynamic column
> >is not easy - also, requires more space than a normal index, because you
> >are supposed to create a persistent column first.
>
yeap, agree


> >I don't have any suggestions, except that importing data from JSON would
> >be easier and probably useful (something like a JSON_TO_DYNCOL function).
>
nice, that's a example of data handle, instead of storage optimization
(specific field type to save data with less space, or save data with a
faster method to query, like geometry index) or query optimization (index +
optimizer changes)


> >
> >Regards
> >Federico
>

References