
lightspark-users team mailing list archive

Re: Plan for memory usage and GC support

 

On Tue, 24/04/2012 at 16:34 +0200, Matthias Gehre wrote:
> 2012/4/11 Alessandro Pignotti <a.pignotti@xxxxxxxx>:
> > Hi everyone,
> >
> > I'd like to discuss a problem I've been working on in the last few days.
> > I'm currently spending some time trying to get FarmVille to work, which
> > would be a fairly large milestone for the project, being a heavyweight,
> > interactive game. In my progress I'm currently hitting a memory usage
> > wall. I think the main problem is that lightspark does not yet implement
> > a garbage collector besides reference counting, which means that object
> > cycles are not freed until the flash instance is terminated. I've tried
> > to use Boehm GC (libgc) to solve the issue
> >
> > https://github.com/lightspark/lightspark/tree/experimental-boehm-gc
> >
> > Basically Boehm GC is a stop-the-world mark and sweep collector. The
> > stop-the-world part is really problematic since the collector will stop
> > all threads using signals every once in a while. This causes some very
> > serious slowdowns and that made me currently put such library aside. If
> > anyone has experience with it feedback would be very welcome.
> I thought about this some time ago. On the pro-side of the mark and sweep
> collector, we won't have to do all those awful incRef/decRef's.
> Especially the decRef's are bad, because it's not trivial to JIT that
> call. Okay, one could JIT the fast path (ref > 0) and only do a function
> call in case ref = 0.
> But your testing indicated that incRef/decRef is faster than mark and sweep?
After some time spent experimenting with Boehm, the major advantage I see
in reference counting is that you don't need to freeze the execution of
all the threads to collect. The point is that references on the stack of
other threads would not be found by the mark and sweep system, while the
reference count keeps the object alive. I know that Tamarin uses a
tracing GC, but I also know that Python uses only reference counting
(plus a cycle detector to get rid of reference cycles). I think both are
respectable solutions, and since we already have reference counting in
place I would go for the second choice.
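To make the fast path concrete, a minimal sketch of intrusive reference
counting with an inlined decrement could look like the following (the
RefCountedObject name is just illustrative, not the actual lightspark
class, and a real counter would have to be atomic since objects are
shared between threads):

  #include <cassert>

  // Sketch only: intrusive reference counting with an inlineable fast path.
  class RefCountedObject
  {
  private:
      int refCount; // a real implementation would use an atomic counter
  protected:
      virtual ~RefCountedObject() {}
  public:
      RefCountedObject() : refCount(1) {}
      // Trivial to JIT: a single increment.
      void incRef() { ++refCount; }
      // The fast path is just decrement-and-test; only when the count
      // reaches zero is an out-of-line call needed.
      void decRef()
      {
          assert(refCount > 0);
          if(--refCount == 0)
              delete this;
      }
  };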
> 
> > The
> > experiment has been useful anyway since I've used the leak detection
> > capabilities of the library to fix quite a few reference counting errors
> > that were causing memory leaks. Another side issue is that the collector
> > is "global", that is to say that it's not possible to isolate an instance
> > of flash from another one and the collector has to work on all the
> > objects from all instances together.
> This is a problem of the boehm-gc implementation, right? Because conceptually,
> one should be able to run one gc per flash instance.
Sure, it's just a matter of choosing the right subset of roots. Still,
it would not come for free with an unmodified Boehm GC.
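For illustration, the libgc calls involved look roughly like this;
GC_add_roots, GC_INIT, GC_MALLOC and GC_gcollect are the real Boehm GC
entry points, but the per-instance grouping is hypothetical, since stock
libgc keeps a single global root set and scans everything together:

  #include <gc/gc.h>

  // Hypothetical per-instance root range; with unmodified Boehm GC this
  // still ends up in the one global root set.
  struct InstanceRoots
  {
      char* begin;
      char* end;
  };

  void registerInstanceRoots(const InstanceRoots& r)
  {
      GC_add_roots(r.begin, r.end); // adds [begin, end) to the global roots
  }

  int main()
  {
      GC_INIT();
      void* obj = GC_MALLOC(64); // collected object, traced from the roots
      (void)obj;
      GC_gcollect();             // stops all threads, scans every root
      return 0;
  }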
> 
> >
> > Moreover, I also think that objects themselves are bloated, since I've
> > actually never spent any time in optimizing their size.
> Tamarin does some quite clever things there. For example, it uses the lower
> two bits of the "object pointer" to encode the type of the object (those
> bits are unused when objects are 4-byte aligned). Then 32-bit Integers,
> Booleans, Null and Void can be directly encoded in the remaining bits of
> the "object pointer" without allocating memory for them.
I'm not sure that using the low bits as flags is actually a good idea. It
also makes the code less robust, since null and undefined values would not
behave as regular objects. Moreover, on 32-bit platforms you can only
encode a pointer to an integer, not the integer itself. This limitation
does not exist on 64-bit platforms for integers, but it is still there for
doubles. Moreover, I've recently uniqued all Nulls and Undefineds: they
are all references to the same object now.
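For the record, the kind of low-bit tagging Matthias describes would look
roughly like this (a generic sketch, not Tamarin's actual atom layout);
note that only pointer-width minus two bits are available for the inline
integer, which is the 32-bit limitation mentioned above:

  #include <stdint.h>
  #include <cassert>

  // Sketch of low-bit tagging on 4-byte-aligned object pointers.
  // The tag values are illustrative, not Tamarin's real encoding.
  enum Tag { TAG_OBJECT = 0, TAG_INT = 1, TAG_SPECIAL = 2 }; // low 2 bits

  typedef uintptr_t Atom;

  inline Atom fromObject(void* o)
  {
      assert((reinterpret_cast<uintptr_t>(o) & 3) == 0); // needs alignment
      return reinterpret_cast<uintptr_t>(o) | TAG_OBJECT;
  }
  inline Atom fromInt(intptr_t i)
  {
      // Only (pointer width - 2) bits fit, so on 32-bit platforms a full
      // int32 cannot always be encoded inline.
      return (static_cast<uintptr_t>(i) << 2) | TAG_INT;
  }
  inline Tag tagOf(Atom a) { return static_cast<Tag>(a & 3); }
  inline void* toObject(Atom a)
  {
      return reinterpret_cast<void*>(a & ~static_cast<uintptr_t>(3));
  }
  inline intptr_t toInt(Atom a) { return static_cast<intptr_t>(a) >> 2; }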
> 
> > The way I would like to approach the problem is twofold:
> >
> > 1) Employ custom allocators to wrap heap traffic while accounting for
> > who is using memory (similarly to what Mozilla developers have been
> > doing for their MemShrink project,
> > https://wiki.mozilla.org/Performance/MemShrink).
> > This would make it possible to profile what should be optimized.
> > 2) Use object cycle detection to deal with the limitations of reference
> > counting, similarly to what Python does. I'm still investigating this.
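As a rough strawman for point 1, an accounting wrapper around allocations
could look like this (MemoryAccount and the function names are made up
for the example; a real version would plug into the object allocators and
keep proper alignment for the header):

  #include <stdlib.h>
  #include <new>

  // Hypothetical per-category memory accounting, in the spirit of the
  // MemShrink memory reporters: every allocation is booked against a
  // counter so it is possible to see who is using memory.
  struct MemoryAccount
  {
      const char* name;
      size_t bytes;
  };

  void* accountedMalloc(MemoryAccount& account, size_t size)
  {
      // Store the size in a small header so it can be subtracted on free.
      size_t* p = static_cast<size_t*>(malloc(size + sizeof(size_t)));
      if(!p)
          throw std::bad_alloc();
      *p = size;
      account.bytes += size;
      return p + 1;
  }

  void accountedFree(MemoryAccount& account, void* ptr)
  {
      if(!ptr)
          return;
      size_t* p = static_cast<size_t*>(ptr) - 1;
      account.bytes -= *p;
      free(p);
  }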
> Are object cycles that common in the AS code we execute?
> Otherwise we could run the object cycle detection at a very low frequency.
I think they are not so common. This also means that switching to mark
and sweep would probably not give much of an advantage anyway.
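A rough sketch of the Python-style algorithm, under the assumption of a
hypothetical GCObject that exposes its refcount and the references it
holds (lightspark would traverse its real object graph instead): copy the
refcounts, subtract the references coming from inside the tracked set,
and whatever is not reachable from the externally referenced objects is
cyclic garbage.

  #include <cstddef>
  #include <map>
  #include <set>
  #include <vector>

  // Hypothetical object model for the sketch: refCount is the ordinary
  // reference count, references lists the objects this one keeps alive.
  struct GCObject
  {
      int refCount;
      std::vector<GCObject*> references;
  };

  // Python-style cycle detection over the tracked objects:
  // 1) copy every refcount, 2) subtract references held by other tracked
  // objects, 3) anything still positive is referenced from outside, so
  // everything reachable from it is live, 4) the rest is cyclic garbage
  // that plain reference counting alone would never free.
  std::set<GCObject*> findCyclicGarbage(const std::vector<GCObject*>& tracked)
  {
      std::map<GCObject*, int> external;
      for(size_t i = 0; i < tracked.size(); ++i)
          external[tracked[i]] = tracked[i]->refCount;
      for(size_t i = 0; i < tracked.size(); ++i)
          for(size_t j = 0; j < tracked[i]->references.size(); ++j)
          {
              std::map<GCObject*, int>::iterator it =
                  external.find(tracked[i]->references[j]);
              if(it != external.end())
                  it->second--;
          }
      // Propagate liveness from the externally referenced objects.
      std::set<GCObject*> live;
      std::vector<GCObject*> queue;
      for(std::map<GCObject*, int>::iterator it = external.begin();
          it != external.end(); ++it)
          if(it->second > 0)
          {
              live.insert(it->first);
              queue.push_back(it->first);
          }
      while(!queue.empty())
      {
          GCObject* cur = queue.back();
          queue.pop_back();
          for(size_t j = 0; j < cur->references.size(); ++j)
          {
              GCObject* ref = cur->references[j];
              if(external.count(ref) && live.insert(ref).second)
                  queue.push_back(ref);
          }
      }
      // Tracked objects not proven live are only kept alive by cycles.
      std::set<GCObject*> garbage;
      for(size_t i = 0; i < tracked.size(); ++i)
          if(!live.count(tracked[i]))
              garbage.insert(tracked[i]);
      return garbage;
  }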
> 
> >
> > Any feedback/suggestion will be much appreciated, sorry for the length
> > of the mail,
> > Alessandro
> >
> >
> > --
> > Mailing list: https://launchpad.net/~lightspark-users
> > Post to     : lightspark-users@xxxxxxxxxxxxxxxxxxx
> > Unsubscribe : https://launchpad.net/~lightspark-users
> > More help   : https://help.launchpad.net/ListHelp



