kicad-developers team mailing list archive

Re: [PATCH] Increase initial vertex container size by order of magnitude

 

It looks like the "sort layers" feature is causing removal and re-adding of
all items on the canvas, which is extremely slow.
I'm going to see if that is actually necessary, since my initial thought is
that it should be possible to avoid that work.
Unfortunately RTREE::Remove is super slow if there are lots of items, so we
have to be careful to not call that when we don't need to...
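
A minimal sketch of that idea, assuming the canvas can tell which layers actually moved in the new ordering. The function name and the layer names are hypothetical and are not KiCad API. It only shows that comparing the old and new orderings yields the set of layers whose items would need any update, so items on untouched layers could stay in the spatial index:

```python
def layers_needing_update(old_order, new_order):
    """Return the layers whose position differs between two orderings.

    Hypothetical helper: layers that keep their position need no
    remove/re-add of their items after a sort.
    """
    old_pos = {layer: i for i, layer in enumerate(old_order)}
    return [layer for i, layer in enumerate(new_order)
            if old_pos.get(layer) != i]


if __name__ == "__main__":
    before = ["F.Cu", "B.Cu", "Edge"]
    after = ["F.Cu", "Edge", "B.Cu"]
    # Only the two swapped layers are reported; "F.Cu" is left alone.
    print(layers_needing_update(before, after))
```

If the sort leaves the order unchanged, the list is empty and no item would have to be removed at all.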

On Thu, Feb 1, 2018 at 1:51 PM, jp charras <jp.charras@xxxxxxxxxx> wrote:

> On 01/02/2018 at 18:16, Jon Evans wrote:
> > Thanks for the data, Seth.
> >
> > I'm curious, has anyone tried my other patch (changing from bitset to
> > vector) to see if it has any performance impact on your machines?
>
> I just tested it in Gerbview, and I do not really see any performance
> change.
>
> During my tests, I noticed a really strange behavior (not depending on this
> patch):
>
> I loaded all .gbr samples:
> - The first time I sort layers by X2 order, the calculation time is 23 -
> 25 s.
> - The second time (and next times) I sort layers by X2 order, the
> calculation time is 45 s.
>
> >
> > -Jon
> >
> > On Thu, Feb 1, 2018 at 11:58 AM, Seth Hillbrand
> > <seth.hillbrand@xxxxxxxxx> wrote:
> >
> >     Hi Jon-
> >
> >     I see a similar situation to JP (although my machine is not so
> >     fast).  Loading all layers is fast.  Changing the dcode display
> >     takes about 40s on my machine, with or without the patch.
> >     Interestingly, it is faster for me to close all layers, change the
> >     dcode display and re-open the files than it is to change the
> >     display with the files open.
> >
> >     I also tried the halftone image from
> >     https://bugs.launchpad.net/kicad/+bug/1745050 in pcbnew.  Without
> >     the patch, it takes about 30 seconds to re-draw the vertices.  With
> >     the patch, it takes about 25s.  By making the new size a power of
> >     two (right now it is 2^20, so if we make it 2^24), I get a 20s load
> >     time.  However, increasing beyond my GPU's memory (2^26) results in
> >     no image whatsoever.  So I suspect that more limited GPUs may see a
> >     similar issue sooner.
> >
> >     -S
> >
> >     2018-02-01 7:48 GMT-08:00 jp charras <jp.charras@xxxxxxxxxx>:
> >
> >         On 01/02/2018 at 15:34, Jon Evans wrote:
> >         > Thanks for feedback Orson and JP, I will definitely keep
> >         > investigating.
> >         > FYI I only see a big speed change in GerbView when loading all
> >         > of the test files (not just a single layer) from the bug report.
> >         > I also see no difference in pcbnew (which tends to have far
> >         > fewer vertices)
> >
> >         Exactly, loading all files is fast.
> >
> >         It is only running "Sort Layers if X2 Mode" (or Show / hide
> >         DCodes) that takes a while (25s with or without the patch).
> >
> >         >
> >         > -Jon
> >         >
> >         > On Thu, Feb 1, 2018 at 6:05 AM, jp charras
> >         > <jp.charras@xxxxxxxxxx> wrote:
> >         >
> >         >     On 01/02/2018 at 10:43, Maciej Sumiński wrote:
> >         >     > Hi Jon,
> >         >     >
> >         >     > TL;DR: I have no idea why it boosts the performance. I
> >         >     > would love to see some extra input from others, to see
> >         >     > if the patch helps on other systems or at least does not
> >         >     > decrease performance on low-spec cards.
> >         >     >
> >         >     >
> >         >     > It is hard to explain why you see a performance boost. I
> >         >     > tried it on my machine (integrated Intel GPU), and it
> >         >     > takes the same time to load the files attached to the bug
> >         >     > report. You can set the WXTRACE environment variable to
> >         >     > trace vertex allocation (only for debug builds):
> >         >
> >         >     I have a similar result: no significant performance boost.
> >         >     I loaded only one test file "acquire-PWM-F.Cu.gbr".
> >         >     It is fast to load, but takes 12 - 13 seconds to redraw
> >         >     when I run "Sort Layers if X2 Mode"
> >         >
> >         >     >
> >         >     > WXTRACE=GAL_CACHED_CONTAINER_GPU ./gerbview/gerbview
> >         >     > 10:20:00 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting container from 1048576 to 2097152
> >         >     > 10:20:00 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container storing 1048576 vertices / 44.6 ms
> >         >     > 10:20:04 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting container from 2097152 to 4194304
> >         >     > 10:20:04 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container storing 2097147 vertices / 80.1 ms
> >         >     > 10:20:11 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting container from 4194304 to 8388608
> >         >     > 10:20:11 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container storing 4194300 vertices / 144.7 ms
> >         >     >
> >         >     > It is true that once your patch is applied there are no
> >         >     > resize & defragment operations, but it looks as if it
> >         >     > saved only several hundred msecs. Maybe it is different
> >         >     > for nVidia cards.
> >         >     >
> >         >     > Regarding low-end GPUs: each vertex is 32 bytes, so
> >         >     > resizing the initial container size from 1048576 to
> >         >     > 10485760 changes the initial occupied video memory (or RAM
> >         >     > if the GPU cannot cope with video memory mapping) from
> >         >     > 32 MB to 320 MB. I traced the container code and it turns
> >         >     > out only very complex boards (A64-OLinuXino and Chris's
> >         >     > motherboard) require 2M or 4M vertices in pcbnew, so the
> >         >     > majority of the allocated memory stays unused for most
> >         >     > cases. I think we should look for performance improvements
> >         >     > elsewhere.
> >         >     >
> >         >     > One of the ideas I never had the time to implement is to
> >         >     > reduce vertex size (struct VERTEX) by replacing full color
> >         >     > information with an index to a color palette stored in a
> >         >     > uniform object.
> >         >     >
> >         >     > Another possible modification is to get rid of shader
> >         >     > parameters for vertices that do not use them. It would
> >         >     > require two cached containers - one for plain vertices,
> >         >     > the other for vertices with parameters.
> >         >     >
> >         >     > Cheers,
> >         >     > Orson
> >         >     >
> >         >     > On 02/01/2018 04:29 AM, Jon Evans wrote:
> >         >     >> Next in the series of OpenGL performance work related to
> >         >     >> this bug:
> >         >     >> https://bugs.launchpad.net/kicad/+bug/1745203
> >         >     >>
> >         >     >> In the scenario of this bug report (load lots of gerber
> >         >     >> files, each with lots of items) we struggle with vertex
> >         >     >> allocation.
> >         >     >> On my Linux system with NVIDIA GPU, the attached patch
> >         >     >> speeds up load/display of the file (and showing of Dcodes)
> >         >     >> by 50%.
> >         >     >>
> >         >     >> This patch feels kind of like a hack; and I'm not sure if
> >         >     >> it would cause any issues on low-spec systems.
> >         >     >> I'm curious for input from Orson/Tom and anyone else who
> >         >     >> has looked at our OpenGL system.
> >         >     >>
> >         >     >> But, if it is safe, it could be a nice one-line speed
> >         >     >> improvement for scenarios where we throw tons of items at
> >         >     >> the OpenGL GAL.
> >         >     >>
> >         >     >> Maybe a longer-term fix would be to take a look at the
> >         >     >> vertex container algorithms and see if we can be more
> >         >     >> intelligent about when and how much we allocate memory to
> >         >     >> reduce this overhead without having such a huge initial
> >         >     >> size.
> >         >     >>
> >         >     >> -Jon
> >         >     >>
>
> --
> Jean-Pierre CHARRAS
>
> _______________________________________________
> Mailing list: https://launchpad.net/~kicad-developers
> Post to     : kicad-developers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~kicad-developers
> More help   : https://help.launchpad.net/ListHelp
>
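
The memory figures in Orson's quoted message can be checked with a few lines of arithmetic. This is only a sketch: the 32-byte vertex size is the figure given in the thread, the doubling growth is what the WXTRACE output shows (1048576 to 2097152 to 4194304 to 8388608), and the function names are made up for illustration rather than taken from the KiCad sources.

```python
VERTEX_SIZE_BYTES = 32  # per-vertex size as stated in the thread


def container_mib(vertex_count):
    """Memory taken by a container holding vertex_count vertices, in MiB."""
    return vertex_count * VERTEX_SIZE_BYTES / (1024 * 1024)


def resizes_needed(initial_capacity, required_vertices):
    """Count capacity doublings until required_vertices fit.

    Assumes the container doubles on each resize, as in the trace.
    """
    capacity = initial_capacity
    count = 0
    while capacity < required_vertices:
        capacity *= 2
        count += 1
    return count


if __name__ == "__main__":
    for size in (1048576, 10485760):
        print(size, "vertices ->", container_mib(size), "MiB")
    # Just over 4194304 vertices forces the third doubling.
    print(resizes_needed(1048576, 4194305), "resizes from the old size")
    print(resizes_needed(10485760, 4194305), "resizes from the patched size")
```

The larger initial size removes all three resizes for that load, at the cost of holding 320 MiB up front instead of 32 MiB.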
