kicad-developers team mailing list archive
Message #33630
Re: [PATCH] Increase initial vertex container size by order of magnitude
On 01/02/2018 at 18:16, Jon Evans wrote:
> Thanks for the data, Seth.
>
> I'm curious, has anyone tried my other patch (changing from bitset to vector) to see if it has any
> performance impact on your machines?
I just tested it in GerbView, and I do not really see any performance change.
During my tests, I noticed a really strange behavior (not depending on this patch):
I loaded all .gbr samples:
- The first time I sort layers by X2 order, the calculation time is 23 - 25 s.
- The second time (and next times) I sort layers by X2 order, the calculation time is 45 s.
>
> -Jon
>
> On Thu, Feb 1, 2018 at 11:58 AM, Seth Hillbrand <seth.hillbrand@xxxxxxxxx> wrote:
>
> Hi Jon-
>
> I see a similar situation to JP (although my machine is not so fast). Loading all layers is
> fast. Changing the dcode display takes about 40s on my machine, with no difference from the patch.
> Interestingly, it is faster for me to close all layers, change the dcode display and re-open the
> files than it is to change the display with the files open.
>
> I also tried the halftone image from https://bugs.launchpad.net/kicad/+bug/1745050
> in pcbnew. Without the patch, it takes about 30
> seconds to re-draw the vertices. With the patch, it takes about 25s. By making the new size a
> power of two (right now it is 2^20, so if we make it 2^24), I get a 20s load time. However,
> increasing beyond my GPU's memory (2^26) results in no image whatsoever. So I suspect that more
> limited GPUs may see a similar issue sooner.
>
> -S
>
> 2018-02-01 7:48 GMT-08:00 jp charras <jp.charras@xxxxxxxxxx>:
>
> On 01/02/2018 at 15:34, Jon Evans wrote:
> > Thanks for feedback Orson and JP, I will definitely keep investigating.
> > FYI I only see a big speed change in GerbView when loading all of the test files (not just a single
> > layer) from the bug report.
> > I also see no difference in pcbnew (which tends to have far fewer vertices).
>
> Exactly, loading all files is fast.
>
> It is only running "Sort Layers if X2 Mode" (or Show / hide DCodes) that takes a while.
> (25s with or without the patch)
>
> >
> > -Jon
> >
> > On Thu, Feb 1, 2018 at 6:05 AM, jp charras <jp.charras@xxxxxxxxxx> wrote:
> >
> > On 01/02/2018 at 10:43, Maciej Sumiński wrote:
> > > Hi Jon,
> > >
> > > TL;DR: I have no idea why it boosts the performance. I would love to see
> > > some extra input from others, to see if patch helps on other systems or
> > > at least does not decrease performance on low-spec cards.
> > >
> > >
> > > It is hard to explain why you see a performance boost. I tried it on
> > > my machine (integrated Intel GPU), and it takes the same time to load the
> > > files attached to the bug report. You can set the WXTRACE environment
> > > variable to trace vertex allocation (only for debug builds):
> >
> > I have a similar result: no significant performance boost.
> > I loaded only one test file "acquire-PWM-F.Cu.gbr".
> > It is fast to load, but takes 12 - 13 seconds to redraw when I run "Sort Layers if X2 Mode"
> >
> > >
> > > WXTRACE=GAL_CACHED_CONTAINER_GPU ./gerbview/gerbview
> > > 10:20:00 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting
> > > container from 1048576 to 2097152
> > > 10:20:00 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container
> > > storing 1048576 vertices / 44.6 ms
> > > 10:20:04 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting
> > > container from 2097152 to 4194304
> > > 10:20:04 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container
> > > storing 2097147 vertices / 80.1 ms
> > > 10:20:11 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Resizing & defragmenting
> > > container from 4194304 to 8388608
> > > 10:20:11 AM: Trace: (GAL_CACHED_CONTAINER_GPU) Defragmented container
> > > storing 4194300 vertices / 144.7 ms
> > >
> > > It is true that once your patch is applied there are no resize &
> > > defragment operations, but it looks as if it saved only several hundred
> > > msecs. Maybe it is different for nVidia cards.
> > >
> > > Regarding low-end GPUs: each vertex is 32 bytes, so resizing the initial
> > > container size from 1048576 to 10485760, changes the initial occupied
> > > video memory (or RAM if GPU cannot cope with video memory mapping) from
> > > 32 MB to 320 MB. I traced the container code and it turns out only very
> > > complex boards (A64-OLinuXino and Chris's motherboard) require 2M or 4M
> > > vertices in pcbnew, so the majority of the allocated memory stays unused
> > > for most cases. I think we should look for performance improvements
> > > elsewhere.
> > >
> > > One of the ideas I never had the time to implement is to reduce vertex size
> > > (struct VERTEX) by replacing full color information with an index to a
> > > color palette stored in a uniform object.
> > >
> > > Another possible modification is to get rid of shader parameters for
> > > vertices that do not use them. It would require two cached containers -
> > > one for plain vertices, the other for vertices with parameters.
> > >
> > > Cheers,
> > > Orson
> > >
> > > On 02/01/2018 04:29 AM, Jon Evans wrote:
> > >> Next in the series of OpenGL performance work related to this bug:
> > >> https://bugs.launchpad.net/kicad/+bug/1745203
> > >>
> > >> In the scenario of this bug report (load lots of gerber files, each with
> > >> lots of items) we struggle with vertex allocation.
> > >> On my Linux system with NVIDIA GPU, the attached patch speeds up
> > >> load/display of the file (and showing of Dcodes) by 50%.
> > >>
> > >> This patch feels kind of like a hack; and I'm not sure if it would cause
> > >> any issues on low-spec systems.
> > >> I'm curious for input from Orson/Tom and anyone else who has looked at our
> > >> OpenGL system.
> > >>
> > >> But, if it is safe, it could be a nice one-line speed improvement for
> > >> scenarios where we throw tons of items at the OpenGL GAL.
> > >>
> > >> Maybe a longer-term fix would be to take a look at the vertex container
> > >> algorithms and see if we can be more intelligent about when and how much we
> > >> allocate memory to reduce this overhead without having such a huge initial
> > >> size.
> > >>
> > >> -Jon
> > >>
--
Jean-Pierre CHARRAS
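
The memory and resize figures quoted in the thread can be checked with a few lines of arithmetic. The following is only a minimal sketch, not KiCad code: it assumes 32 bytes per vertex (as stated in Orson's message) and a container that doubles its capacity on each resize (as the WXTRACE output suggests). The names kVertexBytes, containerMiB and doublingsNeeded are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Assumption taken from the thread: each vertex occupies 32 bytes.
constexpr std::uint64_t kVertexBytes = 32;

// Size in MiB of a container that holds `vertices` vertices.
constexpr std::uint64_t containerMiB( std::uint64_t vertices )
{
    return vertices * kVertexBytes / ( 1024 * 1024 );
}

// Number of capacity doublings before `needed` vertices fit, assuming
// the capacity doubles on every resize (hypothetical growth model
// inferred from the trace, not read from the KiCad sources).
constexpr int doublingsNeeded( std::uint64_t initial, std::uint64_t needed )
{
    int count = 0;

    while( initial < needed )
    {
        initial *= 2;
        ++count;
    }

    return count;
}

// Current initial size versus the size proposed in the patch.
static_assert( containerMiB( 1048576 ) == 32, "2^20 vertices -> 32 MiB" );
static_assert( containerMiB( 10485760 ) == 320, "patched size -> 320 MiB" );

// The power-of-two sizes tried in the pcbnew halftone test.
static_assert( containerMiB( 1ull << 24 ) == 512, "2^24 vertices -> 512 MiB" );
static_assert( containerMiB( 1ull << 26 ) == 2048, "2^26 vertices -> 2048 MiB" );

// Just over 4M vertices: three doublings from 2^20, none with the patch.
static_assert( doublingsNeeded( 1048576, 4194305 ) == 3, "three resizes" );
static_assert( doublingsNeeded( 10485760, 4194305 ) == 0, "no resize" );
```

Under these assumptions the patch trades three resize and defragment passes (roughly 270 ms in total in the trace above) for about 288 MiB of extra initial allocation, which is consistent with the concern raised about low-end GPUs.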