
launchpad-dev team mailing list archive

Re: text markup in launchpad

 

It turns out there are two Python Markdown implementations, with not
much obvious to choose between them, except that the second one might
(or might not) be faster.  So for now I'm just going to go with the
one packaged as python-markdown, <http://pypi.python.org/Markdown>

On 18 November 2011 22:57, Barry Warsaw <barry@xxxxxxxxxx> wrote:
> On Nov 18, 2011, at 07:35 PM, Martin Pool wrote:
>
>> * There's a lot of existing text in Launchpad that was entered with
>>no concept of it being rendered as markup, which may look weird if we
>>change it that way.  I propose to handle it by, in the first instance,
>>seeing how much of it does actually look bad.  If it's a serious
>>problem, then we can update the database to put everything affected in
>>to a literal block.  Or, we can perhaps do some schema change to say
>>"this text is pre-markup" but that seems like it's going to need lots
>>of one-off changes.
>
> I think the way to handle this, and also allow room for future adoption of
> additional markup languages, is to include a Content-Type like value
> designating the style of markup being used.  PEPs for example have exactly
> this header to designate plain text (i.e. pre-markup) or reST.

Yes, but where would we track this?

I can think of three options:
1- Add a "markup_type" column next to every column in the Launchpad
database that holds nontrivial user text content:
bug.description_content_type, person.homepage_content_content_type,
...  That is a lot of database updates (and I believe they're still
not painless to land?) and it seems it will be a bit annoying to
actually get that data all the way up through the stack to the TAL
formatter.

2- Split out every text field that would be marked up into a separate
table, holding text and the content type.  That seems to have most of
the risks of #1, but even more code changes and somewhat scary
possible performance impact.

3- Put a magic string at the start of the text to declare its format, say

#markdown
#plaintext

This will take a little care that it never leaks, but aside from that
it ought to be feasible.  Within this there are variations about doing
a (huge) bulk update of the database to insert this at the start of
every string, and about whether it's automatically or invisibly
inserted in new user text.  There is also some interesting interaction
with the API here: we don't want to break API clients by having it
appear everywhere in what they think is real text, but we also need
bug.description += 'foo' not to lose the format marker.

0- Just punt and don't explicitly track it: everything in the database
is assumed to be in the same format.  That format might be "markdown
unless it's obviously not". For user and API client sanity we desire
that anyhow.  Maybe optionally allow a marker at the top where that is
wanted.  (So a bit like Moin.)
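
To make the marker idea in options 3 and 0 a bit more concrete, here
is a rough sketch of how a leading marker line might be split off,
hidden from API clients, and preserved across an update such as
bug.description += 'foo'.  All the function names are hypothetical and
nothing here is existing Launchpad code; the fallback to "markdown"
for unmarked text is the option 0 assumption.

```python
# Hypothetical sketch only: none of these names exist in Launchpad.
MARKERS = {"#markdown": "markdown", "#plaintext": "plaintext"}


def split_marker(stored):
    """Split stored text into (format or None, text without marker)."""
    first, _, rest = stored.partition("\n")
    fmt = MARKERS.get(first.strip())
    if fmt is None:
        # No recognised marker on the first line: leave text untouched.
        return None, stored
    return fmt, rest


def visible_text(stored):
    """What an API client should see: the text with any marker hidden."""
    return split_marker(stored)[1]


def update_text(stored, new_visible):
    """Store new client-supplied text, keeping the existing marker if any."""
    fmt, _ = split_marker(stored)
    if fmt is None:
        return new_visible
    return "#%s\n%s" % (fmt, new_visible)


def effective_format(stored, default="markdown"):
    """Format to render with; unmarked text falls back to the default."""
    fmt, _ = split_marker(stored)
    return fmt or default


stored = "#plaintext\nhello"
# Equivalent of bug.description += 'foo' as seen from the API side.
stored = update_text(stored, visible_text(stored) + "foo")
```

This does not deal with user text that genuinely starts with a line
like "#markdown", which is the "never leaks" care mentioned above.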

-- 
Martin

