← Back to team overview

maria-developers team mailing list archive

Re: GSoC 2015: Indexes on Virtual Columns and Array Based UDFs


Hi, Richa!

On Mar 13, Richa Sehgal wrote:
> Dear mentors,
> I am currently pursuing Master's in University of Illinois at Urbana
> Champaign, USA and completed by B.Tech from Indian Institute of Technology
> - Delhi (IIT- Delhi)
> This is regarding GSoC 2015. I am really interested in databases, and I was
> very excited to see all these projects listed here. The exciting part was
> that some of the projects are really “hard” as in they have challenged the
> database community since a long time, and thus it would be very interesting
> to solve some of these challenges as part of GSoC.
> I want to discuss 2 projects:
> A. Indexes on virtual columns
> Materialization gives us two things:
> 1. A name to the column which we can use in queries

Not quite. Even non-materialized columns always have names.
See https://mariadb.com/kb/en/mariadb/virtual-computed-columns/

> 2. A formal "regular" column which is stored and indexed in the regular
> fashion - Disadvantage: Extra memory requirements for the materialized
> column.


> My initial thoughts on this project are the following:
> We do need the name of the column which can be used to query. So maybe we
> can expose a command such as:
> create virtual_index <name> on <column_name> <expression>
> What this would do would run a regular query which evaluates expressions
> (like in WHERE clause) and the feed the result into the indexer. This index
> can then be stored in the regular fashion.

Yes, in a sense.

The update_virtual_fields() function calculates values of virtual
columns (e.g. if one does SELECT * from table_with_virtual columns).

In MyISAM, _mi_make_key() creates a key from the row image.

Pure virtual columns are not in the row image, so they cannot be indexed
that way.

So, in a very simplified way, this project is about calling
update_virtual_fields() from _mi_make_key().

> B. Having UDFs returning an array/set
> There are three approaches that I can think of:
> I would like to discuss these and then decide on one of them. Am I
> approaching this in the right direction? Can you please point me to
> the next steps?

I'm afraid, the project was badly formulated.

I've rewritten it now to highlight what it is really about.
It is not about arrays, the important part is a "result set".
See MDEV-5199


Follow ups