openteachermaintainers team mailing list archive

Re: Synonyms as answers

 

Sending this again, as it didn't seem to get through.

On Mon, Nov 15, 2010 at 5:56 PM, Marten de Vries <
marten-de-vries@xxxxxxxxxxx> wrote:

> When you have to translate, for example, the Dutch word "bank" into English,
> there are many valid translations, most of which are synonyms (e.g.
> couch, seat, bench, sofa; but also 'bank', the place to deposit money in
> this case).
>
> I thought it would be smart to check if the answered word is a synonym
> of the "correct answer", because they're both correct.
>
> Whiteboard:
>
> I (marten-de-vries) think it's an interesting option, but it requires a
> huge wordlist. Maybe it's possible to combine this one with the spell
> check and the automatic translation blueprint.
>
> (milan-boers): And every language has its own synonym lists, so it
> should be known which language (French, Dutch, German...) is the
> questioned language, and OpenTeacher should then contain synonym lists for
> many languages. Automatic translation programs like
> Google/Bing/Babelfish Translate only give one word.
> If there were a website with an API offering a synonym list for every
> language, that would be a lot better, because you wouldn't need to ship the
> word list with the program, which would make it a lot bigger. I don't know
> if there is one though.
> If there isn't, it's still an idea to provide synonym lists on the
> OpenTeacher website, free to download, which can be loaded into
> OpenTeacher. This wouldn't make the normal OpenTeacher package bigger,
> and if you want this functionality, you can still get it.
>
> (milan-boers): It's fairly easy to make a custom API, and only make it
> (automatically) download the database when the functionality is used.
> Right now, I have a simple API with 500 commonly used English synonyms
> (found on the internet). This takes only 5 kB (gzipped).
>
>
> http://www.milanboers.nl/py-synonyms/synonymlist.php?token=163dc7cb5c492d3cc903e724b6594ea52fc1eb08&lang=en&word=bad
>
> gives synonyms for "bad" in JSON encoding.
>
> The python class currently looks up the word in the online database via
> the API, and downloads the entire database if it's not up to date (it
> can check that). If the connection could not be made (IOError, no
> internet), it uses the offline database. So you don't have a problem when
> you're away and there's no internet available.
>
> What would be a nice idea is if the "correct anyway" button would send
> the supposed answer and the given answer to the database, and store them
> as a new synonym. Then it only needs to be reviewed before it can get
> in. Also this would be fairly easy to implement on the server side. The
> only thing we need to do then is to find a synonym list for the common
> languages (English, French, German, Dutch) to start with. Then the list
> grows automatically and gets updated automatically.
>
> I think it would be a very nice feature for 2.1 or 3.0. What do you
> think?
>
> (marten-de-vries): I like the idea, but I see some problems:
> 1) we don't have really reliable web hosting for OpenTeacher. Yes, we've
> got your webhost, mine, and sourceforge.net, but can we trust them for
> years of use? ( Making an OpenTeacher version useless in less than 5
> years doesn't seem right to me. )
>

SourceForge is pretty reliable, but it doesn't support PHP (or any other
server-side programming language), so it's not usable. We can, however, use
the SourceForge domain and redirect the requests to my webhost. If my
webhost ever fails, we adjust the redirect and it would still work. My webhost
does not have 100% uptime, but that's not very important, because the database
is also stored locally.
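
A minimal sketch of that fallback, not the actual class: it assumes the
online lookup is a callable that raises IOError when no connection can be
made, and that the local copy is a plain dict mapping a word to its list of
synonyms. The names lookup_synonyms, fetch_online and local_db are
hypothetical and chosen only for illustration.

```python
def lookup_synonyms(word, fetch_online, local_db):
    """Return synonyms for `word`, preferring the online API.

    fetch_online: callable taking a word and returning a list of synonyms;
                  assumed to raise IOError when the server is unreachable.
    local_db:     dict mapping word -> list of synonyms (the stored copy).
    """
    try:
        return fetch_online(word)
    except IOError:
        # Server down or no internet: answer from the local database instead.
        return local_db.get(word, [])


# Usage: simulate a webhost that is currently unreachable.
def unreachable(word):
    raise IOError("no connection")


local_db = {"bad": ["poor", "awful"]}
print(lookup_synonyms("bad", unreachable, local_db))
```

Because the lookup is passed in, the same function works whether requests go
directly to the webhost or through a redirect.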


> 2) An argument from Cas (he's an MSN contact of mine): We can't just
> rely on users' clicks on the 'correct anyway' button, because you're
> never sure if they're right. Requiring a synonym to be sent multiple
> times before adding it to the list could fix this, but it's still easy for
> somebody to mess up the database if he/she wants to.
>

Yes, I thought about that too. But it's fine if we just check the submitted
synonyms before they get into the database. Also, the maximum number of
submissions per day can be limited by IP, to prevent large-scale attacks.


>
> About the offline database, this is exactly what CouchDB is designed
> for, and it would be the nicest way of implementing something like this. It
> also partly solves the hosting problem if implemented smartly, because
> other CouchDB servers could just host an alternative database when our
> server goes down. It's a huge dependency however, so maybe an sqlite
> DB would do too... (sqlite is supported via the sqlite3 module in Python
> itself.)
>

I don't really see why that is necessary, as you can just save the database
to a gzipped JSON-encoded file. This is very small and doesn't require extra
software. See the attachment for an example of the file it creates from
http://www.milanboers.nl/py-synonyms/synonymlist.php?token=163dc7cb5c492d3cc903e724b6594ea52fc1eb08&lang=en&entiredb=yes.
It can then be read from the file just like the online database, using
JSON decoding.
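
As an illustration, writing and reading such a file needs only the standard
library. This is a sketch, not the code that produced the attachment: the
layout of the database (a dict mapping each word to a list of synonyms) and
the function names save_db and load_db are assumptions.

```python
import gzip
import json


def save_db(db, path):
    """Write the synonym database to a gzipped JSON file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(db, f)


def load_db(path):
    """Read the synonym database back from a gzipped JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

Both gzip and json ship with Python, so no extra software is needed.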


>
>
> _______________________________________________
> Mailing list: https://launchpad.net/~openteachermaintainers
> Post to     : openteachermaintainers@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~openteachermaintainers
> More help   : https://help.launchpad.net/ListHelp
>

Attachment: en-synonyms.db
Description: Binary data

