launchpad-dev team mailing list archive
Message #03108
Re: Trouble loading a meliae dump
On Sat, 2010-03-20 at 00:40 -0500, John Arbash Meinel wrote:
> Guilherme Salgado wrote:
> > Hi John,
> >
> > I've used meliae to get a memory dump from Launchpad, but when I tried
> > to load that dump I got http://paste.ubuntu.com/397273/ (the first line
> > there shows the line that causes simplejson.loads() to choke).
> >
> > From my understanding of [1], this seems to be expected, but I wonder
> > how these unpaired surrogates ended up in the dump. Any ideas?
> >
> > BTW, I did some hacks in my local copy of meliae to replace the
> > problematic bits on that line, and after that I was able to load the
> > dump. Maybe with that I could try and find out where the unpaired
> > surrogates are coming from?
> >
> > [1] <http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters#Surrogates>
> >
> > Cheers,
> >
>
> I'm mostly offline on vacation right now, but I'll try to help out when
> I get back. I can think of 2 causes:

Thanks for the help, John. It turned out that a memory dump from staging
loaded just fine, so I'm not worrying about this for now. I'm hoping the
same will hold for a production dump the next time we see a memory leak.
If for some reason I can't load the production dump, I'll check whether
one of the two causes below is responsible.
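
If it comes to that, a first step could be to find out which lines of the
dump fail to load. Below is a minimal sketch, assuming the dump holds one
JSON object per line and using the standard json module rather than
simplejson. find_bad_lines is a hypothetical helper, not part of meliae,
and the sample records are made up.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, error_message) for each line that does not
    hold clean JSON. Hypothetical helper, not part of meliae."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        try:
            obj = json.loads(line)
            # The stdlib parser accepts lone surrogate escapes, so
            # round-trip through UTF-8, which refuses to encode them.
            json.dumps(obj, ensure_ascii=False).encode("utf-8")
        except ValueError as exc:
            # JSONDecodeError and UnicodeEncodeError both derive
            # from ValueError.
            bad.append((lineno, str(exc)))
    return bad


# Made-up records: line 2 has a lone high surrogate, line 3 a complete
# pair, line 4 is cut off mid-string.
sample = [
    '{"address": 1, "value": "ok"}',
    '{"address": 2, "value": "\\ud83d"}',
    '{"address": 3, "value": "\\ud83d\\ude00"}',
    '{"address": 4, "value": "cut',
]
bad = find_bad_lines(sample)
for lineno, message in bad:
    print(lineno, message)
```

With a real dump, the same function could be fed an open file object,
since iterating over a file yields its lines.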
>
> 1) I trim most output to 100 characters. (So if you have a 1,000 byte
> string, I only output 100 bytes.) It is possible that a Unicode
> surrogate pair sat at positions 100 and 101, so the second half was
> cut off and only an unpaired surrogate was written.
>
> 2) I use a pretty stupid method for encoding 8-bit strings, just mapping
> each byte to the Unicode code point with the same value
> ('\xff' => U+00FF). Some of the resulting output may be invalid.
>
> 3) Other bugs I don't even know about... :)
>
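
To make causes 1 and 2 concrete, here is a rough sketch. It is not taken
from meliae's source. It assumes strings are handled as UTF-16 code units
(as on a narrow Python 2 build) and that the cut happens after 100 units.
The names utf16_units, has_unpaired_surrogate and bytes_to_text are
hypothetical.

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def has_unpaired_surrogate(units):
    """True if any surrogate code unit is missing its partner."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 >= len(units) or not 0xDC00 <= units[i + 1] <= 0xDFFF:
                return True
            i += 2
            continue
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate that was not preceded by a high one.
            return True
        i += 1
    return False


def bytes_to_text(raw):
    """Map each byte to the code point with the same value (cause 2)."""
    return "".join(chr(b) for b in raw)


# Cause 1: 99 ASCII characters followed by one character outside the BMP.
# Its surrogate pair lands at positions 100 and 101 (counting from 1).
units = utf16_units("a" * 99 + "\U0001F600")
truncated = units[:100]
print(has_unpaired_surrogate(units))      # False
print(has_unpaired_surrogate(truncated))  # True

# Cause 2: the byte-to-code-point mapping matches a latin-1 decode.
raw = bytes(range(256))
mapped = bytes_to_text(raw)
print(mapped == raw.decode("latin-1"))    # True
```

In this sketch the byte mapping only ever produces U+0000 through U+00FF,
none of which are surrogates. If cause 2 is involved at all, it would
more likely be through raw control characters, which JSON requires to be
escaped inside strings. That is an inference from the sketch, not
something confirmed in this thread.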
> I'm happy to debug this with you sometime soon. (If you're getting
> this, it probably means I'm back home, rather than offline in an airport.)
>
> John
> =:->
>
--
Guilherme Salgado <salgado@xxxxxxxxxxxxx>