← Back to team overview

maria-developers team mailing list archive

Re: [Spatial] On current implementation approach

 

hi again :)
i'm not a mariadb team developer, so please consider me as an user/udf
developer =]

2013/9/23 Mateusz Loskot <mateusz@xxxxxxxxxx>

> On 23 September 2013 22:10, Alexey Botchkov <holyfoot@xxxxxxxxxxxx> wrote:
> >>
> >> 1. Is it possible to implement MariaDB extensions like Spatial (custom
> >> type + set of functions) without such a tight coupling with the
> >> internal implementation of the type system (without messing Field
> >> class with geometry types directly, etc.)?
> >
> >
> > Yes, it is possible. The core algorithms are separated from the Field
> > structure and any other database internals.
> > They are placed in sql/gcalc_slicescan.cc and sql/gcalc_tools.cc files.
>
> Yes, but my question is not really about location of computational geometry
> bits, but about the data management: SQL data type for geometry objects,
> input/output routines.
>
> Due to my lack of experience with MariaDB/MySQL UDF, I simply assumed that
> if:
> 1. Field is the only place that defines GEOMETRY type (and there is no
> CREATE TYPE support)
>

create type probably will be a 10.1 feature:
https://mariadb.atlassian.net/browse/MDEV-4912
and maybe you will not have a spatial key optimization in the first version
of this feature

in my opnion if you start a new udf today with gis, you should use the WKB
+ a second lib (geos is very good) to handle spatial data
geos can use the WKB with a fast "unserialize": GEOSGeomFromWKB_buf



> 2. UDF prototypes will use of GEOMETRY in their prototypes to declare
> input/output parameters
> then I couldn't understand how it is possible to remove geometry
> definitions from Field
> and other internal definitions.
>
> But, I've just found this project [1] with extra spatial UDFs, so I
> think I understand the UDF
> protocol regarding I/O arguments would not require explicit GEOMETRY type
>

yes, you don't have a GEOMTRY_TYPE for arg_type[] at udf
check your example at your mysql-spatial-udf git project:

my_bool msudf_within_init(UDF_INIT *initid,UDF_ARGS *args,char *message)
...

*	args->arg_type[0] = STRING_RESULT;*
...


long long msudf_within(UDF_INIT *initid,UDF_ARGS *args,char *is_null,
char *error)
...

geom1 = msudf_getGeometry((unsigned char *)args->args[0],args->lengths[0]);		

set arg_type to STRING_RESULT, and use a cast (unsigned char *) to handle
raw geometry data


making it possible to move Spatial Extensions completely out of
> built-ins (trunk/sql/ files).
>
> [1] https://github.com/krandalf75/MySQL-Spatial-UDF



mariadb 10.0 have plans about OPENGIS:
https://mariadb.com/kb/en/plans-for-10x/#opengis-compliance
but i didn't found JIRA report about it, or another worklog or something
similar (must check if it's in lauchpad bug track or another lauchpad
branch)
and i don't know if mariadb will use GEOS... but from what i know, geos is
the best opengis lib today, why not use it at mariadb?! =)



>
> >> 2. Is it possible to implement Spatial using User-Defined Functions
> >> (UDF) defined in shared binary?
> >
> >
> > The spatial functions/operations can be implemented with UDF, but
> > that makes query optimization and using Spatial keys problemmatic.
>
> So, for real use case, the idea I brainstormed above would not make sense.
> Unless, there is workaround for those problems you mean.
>

well i don't know what problemmatic means at high/low level, but i think
it's something like this at sql layer:

WHERE udf_function(x)
in theory this udf_function() could be optimized with rtree index Y...
but it will do a table scan... optimizer don't know how to use index with
udf functions yet :(
check that some internal functions don't have optimizations too, like:
SUBSTRING(indexed_field,1,4)='abcd' could be rewrite as (indexed_field LIKE
"abcd%" OR indexed_field='abcd')


a workaround about index should be done at application side, could be
something like:
WHERE udf_function(x) and other_builtin_function_that_use_index(xxxx)
with this "other_builtin_function_that_use_index" function (envelop
funciont for example) , you could use the spatial index and optimize the
query... but it's not the "best" solution at server side, but the only i
can think as a udf developer :)

well if you know how to code at mysql/mariadb server side... you can patch
the optimizer, but i think it's a hard work, optimizer is black magic to me
yet =]



> >> 3. What is the reason behind using Well-Known-Binary (WKB) stream of
> >> bytes to transport geometry values into/from functions? Is it due to
> >> limitations of MariaDB type system where String is  the only universal
> >> carrier for complex data? This concern is related to necessity of
> >> encoding/decoding WKB when chaining spatial function calls, and
> >> possibilities to avoid it.
> >
> >
> > The reason was mostly historical. It was sufficient for the first
> > implementations of the Geometry field types and somewhat convenient as
> > we don't need to perform conversions
> > when we need to import/export features in their WKB representation.
> > But yes, that format is inefficient and difficult to handle properly. I
> plan
> > to get rid of it internally - only support importing-exporting it.
>

hum... what could be a better format? i think that at disk level (storage)
WKB is a small format, not? maybe we could do a better data type? but why?
the mysql built-in functions can't handle GIS fast? the RTREE is slow with
this datatype? any good point to consider with a new data type?


>
>
I roughly understand, but how do you plan to pass geometry data around,
> in what format?
>
> AFAIU, it is not possible to pass user-defined types into/from SQL
> functions,
> so geometries would have to be passed as String objects anyway, wouldn't
> they?


well from what i know, mysql "built-in" geometric function use ~wkb format
too



> IOW, there are only 3 types available (integer, real, string), so
> String is the only one

usable to pass geometry objects around, regardless of actual encoding
> format,
> WKB, WKT, any other binary stream...
>

when using GEOMETRIC data type at sql functions, it's always ~WKB
from your example at git, internally WKB start at position 4, and SRID is
the first 4 bytes, check:

GEOSGeom msudf_getGeometry(unsigned char *buf,unsigned int length)
{
	int srid;
	GEOSGeom geom;

	geom = GEOSGeomFrom*WKB*_buf(*buf + 4*,length - 4);
	if (geom != NULL) {
		srid = *msudf_getInt*(buf);
		GEOSSetSRID(geom,srid);
	}
	return geom;
}

int *msudf_getInt*(const unsigned char *buf)
{
	int result;
    unsigned char *p = (unsigned char *)&result;

*    p[0] = buf[0];*
*    p[1] = buf[1];*
*    p[2] = buf[2];*
*    p[3] = buf[3];*

	return result;
}





>
> It means, that if I want to pass geometry to my_foo UDF:
>
> MSUDF_API char*
> my_foo(UDF_INIT *initid,UDF_ARGS *args, char *buf, unsigned long
> *length, char *is_null, char *error);
>
> the only option available is to make geometry into a kind of stream of
> bytes
> and passed as one of args item.
> So, a kind of serialising/deserialising is in fact unavoidable.
>

unserialize/serialize only if you use different string format than
mariadb/mysql source code =) like geos...


>
> Is my understanding correct?
>

well i think yes :)

did you checked that dev.mysql.com have a poll about
"What are the top 3 GIS related features that you are most interested in
for your new and existing MySQL projects?"

maybe we are talking about something in the near future right now...
mariadb 10.0 have this 'problem' too check the mariadb 10.0 plans



>
> Best regards,
> --
> Mateusz  Loskot, http://mateusz.loskot.net
> "Participation in this whole process is a form of torture" ~~ Szalony
>
> _______________________________________________
> Mailing list: https://launchpad.net/~maria-developers
> Post to     : maria-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~maria-developers
> More help   : https://help.launchpad.net/ListHelp
>


thanks reading this email :)

sorry, i can't help a lot here :/

i want some opengis features too, but my apps aren't high read/write yet,
parsing the geometry at php (client side) is ok to me

well... mariadb team is very good and create many features that mysql don't
create in 7 years with just weeks :) like my base64 functions...
wait is a option at mariadb world :) at mysql you could forget it hehe :)

if you don't have time.... i don't know what to say ='(
the best opensource database with many opengis features today is
postgresql, sorry mariadb team :(

well this project (https://github.com/krandalf75/MySQL-Spatial-UDF) is very
interesting, maybe we could port manythings... i will report it at jira to
don't forget it

-- 
Roberto Spadim

Follow ups

References