zeitgeist team mailing list archive

Thread
Date
[Bug 784011] [NEW] clean up sql.py

To: zeitgeist@xxxxxxxxxxxxxxxxxxx
From: "J.P. Lacerda" <jpaflacerda@xxxxxxxxx>
Date: Tue, 17 May 2011 13:32:54 -0000
Reply-to: Bug 784011 <784011@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
Public bug reported:

There are quite a few clean-ups that are possibly in sql.py.
I've already made a few of these cleanups, but some more complex one's would have to rely on better sql tests:

10:16 < jplacerda> thekorn: back to sql.py... :) In create_db if the database is not new, and it is up to date, then the cursor is returned immediately
10:17 < jplacerda> However, the code in check_core_schema_upgrade only returns True if it is already upgraded prior to entering the function
10:17 < jplacerda> False is returned if: 1) the database does not exist 2) the database is not up to date prior to entering the database
10:18 < jplacerda> The latter poses a problem, as check_core_schema_upgrade calls do_schema_upgrade, and the database is upgraded to CORE_SCHEMA_VERSION
10:18 < jplacerda> *but* False is still returned
10:18 < jplacerda> Which means that create_db then tries to insert the same things again into the table
10:19 < jplacerda> This is a bug, I assume? 
10:27 < thekorn> let me have a closer look to the code
10:28 < jplacerda> thekorn: sure. it looks particularly suspicious, as at the end of create_db you have: _set_schema_version (cursor, constants.CORE_SCHEMA, constants.CORE_SCHEMA_VERSION)
10:41 < thekorn> jplacerda: in short: it is not a problem at all, as all SQL statements in create_db have sth. along "IF NOT EXIST"
10:41 < thekorn> or similar
10:42 < jplacerda> thekorn: agreed, but does it really make sense to go through all those now that we have upgrade checking?
10:43 < jplacerda> At the end of _do_schema_upgrade it will be good to go
10:43 < jplacerda> Is there a compelling reason not to?
10:49 < jplacerda> I mean, if it is not a problem, then we shouldn't even have do_schema_upgrade :P
10:50 < jplacerda> It's just that do_schema_upgrade provides a better incremental picture of what's going on, and is probably faster than running through all of create_db
10:52 < thekorn> I know we had a good reason to do it this way, but I cannot remember which, let me think about it a bit
10:53 < jplacerda> thekorn: sure :)
10:55 < jplacerda> thekorn: the only reason I can think of (from looking at the code) is in case an upgrade fails -- you still get the same db structure after going through the statements in create_db
10:56 < thekorn> yeah, sth. alog the lines
10:56 < thekorn> along
10:57 < jplacerda> well, now that we have a method of ensuring n -> n+1 works, we can do away with the repetition, and have that code only for new db's :)
10:58 < thekorn> jplacerda: it's also nicer to maintain, because we knoe that the sql statements in create_db() represent the up-to-date version of our db scheme
10:58 < thekorn> and we don't have to look at many files to understand how the current db structure looks like
10:59 < thekorn> ....plus I really don't think it has significant performance issues this way,
10:59 < thekorn> e.g. impact on startup time
10:59 < jplacerda> thekorn: sure :) I think that the statements in create_db can be left unchanged, but I think it now makes sense to only have them being reached by new databases
10:59 < jplacerda> would you agree with this?
11:00 < thekorn> jplacerda: yeah, but this would mean the upgrade scripts would get more complicated
11:00 < thekorn> e.g. we have to find out which indecies were added (or removed) at which point
11:00 < thekorn> etc.
11:01 < jplacerda> hmmm
11:01 < thekorn> as we have it right now, the upgrade scripts are mostly all about upgrading the data
11:01 < jplacerda> But don't you do that already?
11:02 < jplacerda> I mean, the statements in create_db give you an absolute picture of the current schema
11:02 < thekorn> sorry, what I mean is: if we change it the way you suggest, we have to go back in history and adjust each upgrade script
11:02 < thekorn> and see if they are really compatible
11:02 < jplacerda> the ones in core_n_n+1 are relative to previous versions
11:02 < jplacerda> I see.
11:02 < jplacerda> What is the best way to test this, then?
11:02 < thekorn> and given that we have no good way to test our upgrade pathes it might get some realy big pain
11:03 < thekorn> jplacerda: I thought about how we can test it intensively, and I failed miserably
11:03 < thekorn> the best is to use some sample data
11:03 < jplacerda> ok
11:03 < jplacerda> Should the tests be done in sql.py?
11:03 < thekorn> at different db scheme versions,
11:03 < jplacerda> woops
11:03 < jplacerda> i mean
11:03 < jplacerda> in tests/sql
11:04 < thekorn> jplacerda: I think it would be woth adding a new file t/sql_upgrade
11:04 < thekorn> to get some more structure
11:04 < jplacerda> thekorn: agreed
11:05 < jplacerda> would any tests designed previously have to be moved there?
11:06 < thekorn> but honestly, if you want to work on it, I think having a kind of testing framework, which generats dbs at a specified version, and tests upgrades to each other versions would be *the most awesome 
                 solution* (tm)
11:06 < thekorn> which would scale in a good way for the future
11:07 < thekorn> jplacerda: I don't think so, because we don't have tests for upgrades atm

** Affects: zeitgeist
     Importance: Undecided
         Status: New

** Branch linked: lp:~jplacerda/zeitgeist/660307

-- 
You received this bug notification because you are a member of Zeitgeist
Framework Team, which is subscribed to Zeitgeist Framework.
https://bugs.launchpad.net/bugs/784011

Title:
  clean up sql.py

Status in Zeitgeist Framework:
  New

Bug description:
  There are quite a few clean-ups that are possibly in sql.py.
  I've already made a few of these cleanups, but some more complex one's would have to rely on better sql tests:

  10:16 < jplacerda> thekorn: back to sql.py... :) In create_db if the database is not new, and it is up to date, then the cursor is returned immediately
  10:17 < jplacerda> However, the code in check_core_schema_upgrade only returns True if it is already upgraded prior to entering the function
  10:17 < jplacerda> False is returned if: 1) the database does not exist 2) the database is not up to date prior to entering the database
  10:18 < jplacerda> The latter poses a problem, as check_core_schema_upgrade calls do_schema_upgrade, and the database is upgraded to CORE_SCHEMA_VERSION
  10:18 < jplacerda> *but* False is still returned
  10:18 < jplacerda> Which means that create_db then tries to insert the same things again into the table
  10:19 < jplacerda> This is a bug, I assume? 
  10:27 < thekorn> let me have a closer look to the code
  10:28 < jplacerda> thekorn: sure. it looks particularly suspicious, as at the end of create_db you have: _set_schema_version (cursor, constants.CORE_SCHEMA, constants.CORE_SCHEMA_VERSION)
  10:41 < thekorn> jplacerda: in short: it is not a problem at all, as all SQL statements in create_db have sth. along "IF NOT EXIST"
  10:41 < thekorn> or similar
  10:42 < jplacerda> thekorn: agreed, but does it really make sense to go through all those now that we have upgrade checking?
  10:43 < jplacerda> At the end of _do_schema_upgrade it will be good to go
  10:43 < jplacerda> Is there a compelling reason not to?
  10:49 < jplacerda> I mean, if it is not a problem, then we shouldn't even have do_schema_upgrade :P
  10:50 < jplacerda> It's just that do_schema_upgrade provides a better incremental picture of what's going on, and is probably faster than running through all of create_db
  10:52 < thekorn> I know we had a good reason to do it this way, but I cannot remember which, let me think about it a bit
  10:53 < jplacerda> thekorn: sure :)
  10:55 < jplacerda> thekorn: the only reason I can think of (from looking at the code) is in case an upgrade fails -- you still get the same db structure after going through the statements in create_db
  10:56 < thekorn> yeah, sth. alog the lines
  10:56 < thekorn> along
  10:57 < jplacerda> well, now that we have a method of ensuring n -> n+1 works, we can do away with the repetition, and have that code only for new db's :)
  10:58 < thekorn> jplacerda: it's also nicer to maintain, because we knoe that the sql statements in create_db() represent the up-to-date version of our db scheme
  10:58 < thekorn> and we don't have to look at many files to understand how the current db structure looks like
  10:59 < thekorn> ....plus I really don't think it has significant performance issues this way,
  10:59 < thekorn> e.g. impact on startup time
  10:59 < jplacerda> thekorn: sure :) I think that the statements in create_db can be left unchanged, but I think it now makes sense to only have them being reached by new databases
  10:59 < jplacerda> would you agree with this?
  11:00 < thekorn> jplacerda: yeah, but this would mean the upgrade scripts would get more complicated
  11:00 < thekorn> e.g. we have to find out which indecies were added (or removed) at which point
  11:00 < thekorn> etc.
  11:01 < jplacerda> hmmm
  11:01 < thekorn> as we have it right now, the upgrade scripts are mostly all about upgrading the data
  11:01 < jplacerda> But don't you do that already?
  11:02 < jplacerda> I mean, the statements in create_db give you an absolute picture of the current schema
  11:02 < thekorn> sorry, what I mean is: if we change it the way you suggest, we have to go back in history and adjust each upgrade script
  11:02 < thekorn> and see if they are really compatible
  11:02 < jplacerda> the ones in core_n_n+1 are relative to previous versions
  11:02 < jplacerda> I see.
  11:02 < jplacerda> What is the best way to test this, then?
  11:02 < thekorn> and given that we have no good way to test our upgrade pathes it might get some realy big pain
  11:03 < thekorn> jplacerda: I thought about how we can test it intensively, and I failed miserably
  11:03 < thekorn> the best is to use some sample data
  11:03 < jplacerda> ok
  11:03 < jplacerda> Should the tests be done in sql.py?
  11:03 < thekorn> at different db scheme versions,
  11:03 < jplacerda> woops
  11:03 < jplacerda> i mean
  11:03 < jplacerda> in tests/sql
  11:04 < thekorn> jplacerda: I think it would be woth adding a new file t/sql_upgrade
  11:04 < thekorn> to get some more structure
  11:04 < jplacerda> thekorn: agreed
  11:05 < jplacerda> would any tests designed previously have to be moved there?
  11:06 < thekorn> but honestly, if you want to work on it, I think having a kind of testing framework, which generats dbs at a specified version, and tests upgrades to each other versions would be *the most awesome 
                   solution* (tm)
  11:06 < thekorn> which would scale in a good way for the future
  11:07 < thekorn> jplacerda: I don't think so, because we don't have tests for upgrades atm
Follow ups

[Bug 784011] Re: clean up sql.py
From: Siegfried Gevatter, 2011-09-21
[Bug 784011] Re: clean up sql.py
From: Seif Lotfy, 2011-05-21
[Bug 784011] Re: clean up sql.py
From: Seif Lotfy, 2011-05-21
[Bug 784011] Re: clean up sql.py
From: J.P. Lacerda, 2011-05-18
[Bug 784011] Re: clean up sql.py
From: Launchpad Bug Tracker, 2011-05-18
[Bug 784011] Re: clean up sql.py
From: Siegfried Gevatter, 2011-05-17
[Bug 784011] [NEW] clean up sql.py
From: J.P. Lacerda, 2011-05-17
References

[Bug 784011] [NEW] clean up sql.py
From: J.P. Lacerda, 2011-05-17