maria-discuss team mailing list archive
-
maria-discuss team
-
Mailing list archive
-
Message #02359
Re: stored programs
Hi,
This is the kind of thing you can do in PostgreSQL and it is amazingly very
fast. I like MySQL but come on guys, this is 2015. SQL92 is 23 years
old. By the way this adds two more requests: stored aggregate functions
and stored table functions
-- Table stored function which combines shuttle radar topography raster
data (SRTM)
-- and Open street map geometry data
CREATE FUNCTION cache_1m_tile(integer) RETURNS SETOF tile
LANGUAGE sql
AS $_$
with lands as -- common table expressions
-- Use raster and geometry objects together with GiST index
(select (st_pixelaspoints(st_clip(rast, geometry),1,false)).x,
(st_pixelaspoints(st_clip(rast,geometry),1,false)).y, array_accum(type) as
"types", array_accum(name) as names, array_accum(z_order) as z_orders
from import.osm_landusages
join srtm_1m on st_intersects(rast, geometry)
where rid = $1
group by 1,2
),
buildings as
( select * from srtm_buildings where rid = $1),
places_and_amenities as
-- array_accum is a stored AGGREGATE function that works like group_concat
(select x,y, array_accum(name) thing_name, array_accum(type) thing_type
from
( select (st_pixelaspoints(st_clip(rast, geometry),1,false)).x,
(st_pixelaspoints(st_clip(rast,geometry),1,false)).y, 'place::' || name as
name, 'place::' || type as type
from import.osm_places
join srtm_1m on st_intersects(rast, geometry)
where rid = $1
union all
select (st_pixelaspoints(st_clip(rast, geometry),1,false)).x,
(st_pixelaspoints(st_clip(rast,geometry),1,false)).y, 'place::' || name as
name, 'place::' || type as type
from import.osm_amenities
join srtm_1m on st_intersects(rast, geometry)
where rid = $1
) foo
group by 1,2
)
insert into srtm_1m_tiles
select p.rid,p.x,
p.y,
p.val height,
p.geom,
floor(b.area) building_area,
b.id building_id,
r.id road_ids ,
r.z_order road_z_order,
r.bridge road_bridge,
r.tunnel road_tunnel,
r.rtype road_type,
w.val is not null has_water,
lands.names as land_names,
lands.types as land_types,
lands.z_orders as land_z,
pa.thing_name,
pa.thing_type
from srtm_1m_pixels p
left join buildings b
on b.rast_geom && p.geom
and st_contains(b.geometry, p.geom)
and p.rid = b.rid
left join srtm_road_accum r
on b.rid = r.rid
and p.x = r.x
and p.y = r.y
left join srtm_water_pixels w
on b.rid = w.rid
and p.x = w.x
and p.y = w.y
left join srtm_transport_pixels tp
on b.rid = tp.rid
and p.x = tp.x
and p.y = tp.y
left join lands
on p.x = lands.x
and p.y = lands.y
left join places_and_amenities pa
on p.x = pa.x
and p.y = pa.y
where p.rid=$1
and not exists (select 1 from srtm_1m_tiles t2 where p.rid = t2.rid and
p.x = t2.x and p.y = t2.y)
returning *;
$_$;
On Tue, Mar 3, 2015 at 7:05 AM, Justin Swanhart <greenlion@xxxxxxxxx> wrote:
> Hi,
>
> Just to avoid ambiguity: the stored function is executed as byte code in
> each call, and the native function is called by the byte code for each call.
>
> --Justin
>
> On Tue, Mar 3, 2015 at 7:04 AM, Justin Swanhart <greenlion@xxxxxxxxx>
> wrote:
>
>> Hi,
>>
>> The server parses the function into a form of bytecode which is then
>> cached until the routine changes. In the past any routine change would
>> invalidate cache for all routines and that sometimes can still happen but
>> it is rare. The server then interprets the bytecode for every call to the
>> function and yes, the function is executed every time unless you memoize
>> the function ( if(function_input != @last_input) then @last_input =
>> function_input; return @last_result := function(function_input); else
>> return @last_result;
>>
>> On Tue, Mar 3, 2015 at 6:59 AM, Peter Laursen <peter_laursen@xxxxxxxxxx>
>> wrote:
>>
>>> I really don't know if it should take 2, 5 or 60 seconds. The numbers
>>> surprise me (in the bad maner). There is an incredible overhead when
>>> wrapping this query in a function. I wonder if there is an invocation of
>>> the function for every iteration.
>>>
>>> -- Peter
>>>
>>> On Tue, Mar 3, 2015 at 2:51 PM, Federico Razzoli <federico_raz@xxxxxxxx>
>>> wrote:
>>>
>>>> SELECT 1 was .42. What I find amazing is not the absolute numbers (it's
>>>> an old computer with desktop installed, etc) but the difference between the
>>>> tests. If SELECT 1 takes .42, test 2 shouldn't take more than 1 min... do
>>>> you agree?
>>>>
>>>> Federico
>>>>
>>>>
>>>> --------------------------------------------
>>>> Mar 3/3/15, Justin Swanhart <greenlion@xxxxxxxxx> ha scritto:
>>>>
>>>> Oggetto: Re: [Maria-discuss] stored programs
>>>> A: "Peter Laursen" <peter_laursen@xxxxxxxxxx>
>>>> Cc: "Federico Razzoli" <federico_raz@xxxxxxxx>, "Maria Discuss" <
>>>> maria-discuss@xxxxxxxxxxxxxxxxxxx>
>>>> Data: Martedì 3 marzo 2015, 14:28
>>>>
>>>> You
>>>> probably have either a faster CPU or bigger cache on your
>>>> cpu. How long does benchmark select 1 take? You should
>>>> find it is faster on your system too, right?
>>>> --Justin
>>>>
>>>> Sent from my iPhone
>>>> On Mar 3,
>>>> 2015, at 6:18 AM, Peter Laursen <peter_laursen@xxxxxxxxxx>
>>>> wrote:
>>>>
>>>> Your 3rd test case takes
>>>> 2:02 in MariaDB 10.1 and 2:13 in MySQL 5.6 on my system
>>>> (when otherwise idle) .
>>>> --
>>>> Peter
>>>> On Tue,
>>>> Mar 3, 2015 at 1:58 PM, Federico Razzoli <federico_raz@xxxxxxxx>
>>>> wrote:
>>>> I made
>>>> some quick test to show what I mean by "performance
>>>> problem". Note that I'm not saying the the first
>>>> and the second test cases should perform equally. But there
>>>> is too much difference - see the conclusion.
>>>>
>>>>
>>>>
>>>> 1)
>>>>
>>>>
>>>>
>>>> MariaDB [test]> SELECT BENCHMARK(50000000, (SELECT
>>>> 1));
>>>>
>>>> +---------------------------------+
>>>>
>>>> | BENCHMARK(50000000, (SELECT 1)) |
>>>>
>>>> +---------------------------------+
>>>>
>>>> | 0 |
>>>>
>>>> +---------------------------------+
>>>>
>>>> 1 row in set (0.42 sec)
>>>>
>>>>
>>>>
>>>> 2)
>>>>
>>>>
>>>>
>>>> DELIMITER ||
>>>>
>>>> CREATE FUNCTION f()
>>>>
>>>> RETURNS TINYINT
>>>>
>>>> BEGIN
>>>>
>>>> RETURN 1;
>>>>
>>>> END ||
>>>>
>>>> DELIMITER ;
>>>>
>>>>
>>>>
>>>> MariaDB [test]> SELECT BENCHMARK(50000000, (SELECT
>>>> f()));
>>>>
>>>> +-----------------------------------+
>>>>
>>>> | BENCHMARK(50000000, (SELECT f())) |
>>>>
>>>> +-----------------------------------+
>>>>
>>>> | 0 |
>>>>
>>>> +-----------------------------------+
>>>>
>>>> 1 row in set (2 min 5.70 sec)
>>>>
>>>>
>>>>
>>>> 3)
>>>>
>>>>
>>>>
>>>> DELIMITER ||
>>>>
>>>> CREATE FUNCTION f(x TINYINT)
>>>>
>>>> RETURNS TINYINT
>>>>
>>>> BEGIN
>>>>
>>>> RETURN x;
>>>>
>>>> END ||
>>>>
>>>> DELIMITER ;
>>>>
>>>>
>>>>
>>>> MariaDB [test]> SELECT BENCHMARK(50000000, (SELECT
>>>> f(1)));
>>>>
>>>> +------------------------------------+
>>>>
>>>> | BENCHMARK(50000000, (SELECT f(1))) |
>>>>
>>>> +------------------------------------+
>>>>
>>>> | 0 |
>>>>
>>>> +------------------------------------+
>>>>
>>>> 1 row in set (3 min 35.20 sec)
>>>>
>>>>
>>>>
>>>> -- Conclusions:
>>>>
>>>>
>>>>
>>>> Times in seconds:
>>>>
>>>>
>>>>
>>>> 0.42
>>>>
>>>> 125.70
>>>>
>>>> 215.20
>>>>
>>>>
>>>>
>>>> I don't know which ratio would be acceptable, but the
>>>> difference betweem a trivial query and a trivial function is
>>>> too high.
>>>>
>>>> Also, the difference between 2) and 3) is that the function
>>>> in 3) accepts and returns a parameter. Again, the
>>>> performance difference seems to me too high.
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> Federico
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>>
>>>> Mailing list: https://launchpad.net/~maria-discuss
>>>>
>>>> Post to : maria-discuss@xxxxxxxxxxxxxxxxxxx
>>>>
>>>> Unsubscribe : https://launchpad.net/~maria-discuss
>>>>
>>>> More help : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Mailing list: https://launchpad.net/~maria-discuss
>>>> Post to : maria-discuss@xxxxxxxxxxxxxxxxxxx
>>>> Unsubscribe : https://launchpad.net/~maria-discuss
>>>> More help : https://help.launchpad.net/ListHelp
>>>>
>>>>
>>>
>>
>
References