
fuel-dev team mailing list archive

Re: Discussing DB migrations

 

> You can still do that with a file per release.

>>> Regardless of the one-per-release vs. many-files argument, we still haven't created fuel_4.1.py, which is necessary if we are doing one per release. There is no point in managing db migrations if we don't create the files per release.

> While we don't have upgrades, we need to have a single migration file, i.e. 5.0.
> When we start developing the 5.1 release, we will create the 5.1 migration file.

Again, we still do not have a 4.1 migration file which should have
been created already. So we aren't even creating one file per release
as you indicated we are.
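
On the ordering point (item 2 in the quoted message below): Alembic does not
order migrations by file name. Each migration file declares its own revision
id and a down_revision pointing at its parent, and "alembic history" prints
the resulting chain. The sketch below is not Alembic's code, only a minimal
illustration of that idea. The revision ids are the ones mentioned in this
thread; the parent links between them, and the use of fuel_4_0 as a revision
id, are assumptions made for the example.

```python
# Hypothetical revision metadata, as it would appear at the top of each
# Alembic migration file (revision / down_revision). The ids come from this
# thread; the links between them are assumed for illustration.
REVISIONS = {
    "fuel_4_0": None,                # first migration, nothing before it
    "330ec2ab2bbf": "fuel_4_0",      # add_nodegroup
    "596b7e3f2b11": "330ec2ab2bbf",  # upgrade
}


def upgrade_order(revisions):
    """Return revision ids from oldest to newest by following down_revision."""
    # Invert the mapping: parent -> child (a linear history is assumed).
    children = {}
    for rev, parent in revisions.items():
        if parent in children:
            raise ValueError("branching history: %r has two children" % parent)
        children[parent] = rev

    order = []
    current = children.get(None)  # the root has down_revision = None
    while current is not None:
        order.append(current)
        current = children.get(current)

    if len(order) != len(revisions):
        raise ValueError("broken chain: some revisions are unreachable")
    return order


print(upgrade_order(REVISIONS))
```

The file names play no part here, so a per-release name such as fuel_5_1.py
and a hash-prefixed name are applied in the same way; only the declared
parent matters.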


On Tue, Apr 15, 2014 at 10:10 AM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
> Hi guys, sorry, but I was really busy and didn't have time to respond to you
>
>>>  I have also never seen database migrations handled in any other way than
>>> with multiple files.
>
> We will have multiple files, but it will be a single file per release.
>
>>> I agree with Nikolay and Ryan, multiple files make more sense. One,
>>> Alembic tracks the dependencies between them and applies them in order. Two,
>>> it allows us to safely revert changes that include db changes. Three, it
>>> allows people with working db's to migrate safely between
>>> revisions.
>
> You can still do that with a file per release.
>
>>> Regardless of the one-per-release vs. many-files argument, we still
>>> haven't created fuel_4.1.py, which is necessary if we are doing one per
>>> release. There is no point in managing db migrations if we don't create
>>> the files per release.
>
> While we don't have upgrades, we need to have a single migration file, i.e.
> 5.0.
> When we start developing the 5.1 release, we will create the 5.1 migration file.
>
> I'll try to describe the problems we will have with several files
> per release.
>
> 1. We have to create some kind of naming convention for these files, and
> there will be a lot of -1s over it.
> E.g.
> 330ec2ab2bbf_add_nodegroup.py - where was nodegroup added?
> 596b7e3f2b11_upgrade.py - this file's name tells almost nothing
>
> With a file per release it is much simpler:
>
> fuel_5_0.py
> fuel_5_1.py
>
> 2. From these file names it's not obvious in what order the migration
> files will be applied; you need to run some script to find the order.
>
> With a file per release the order is obvious:
>
> fuel_5_0.py
> fuel_5_1.py
>
> 3. And the argument from my previous email:
>
> Developer A added field "a".
> Developer B, during development, found this field and decided to delete
> it or rename it.
>
> 4_0_fix_project_user_quotas_resource_length.py
> 4_0_add_a_in_compute_nodes.py - Developer A added this migration file
> 4_0_add_extra_resources_in_compute_nodes.py
> 4_0_add_details_column_to_instance_actions_events.py
> 4_0_add_ephemeral_key_uuid.py - Last migration
>
> What should developer B do? Should he create a new
> migration file, or should he change/remove previous files?
> It's very easy to miss the file '4_0_add_a_in_compute_nodes.py'
> in the list; in that case the developer will create an extra migration
> file to remove or rename field "a".
>
> With a single migration file per release, the developer will be able
> to see that this field was added in the current release, and
> he will be able to remove/rename it.
>
> [0]
> https://github.com/stackforge/fuel-web/tree/master/nailgun/nailgun/db/migration/alembic_migrations/versions
>
> Thanks,
>
>
> On Sat, Apr 12, 2014 at 1:25 AM, Andrew Woodward <xarses@xxxxxxxxx> wrote:
>>
>> Ryan helped to find that the changes I found from [1] are in fact due to
>> buggy migrations from Alembic 0.6.2, moving to 0.6.4 resolves this issue. So
>> that was a false alarm. I am intrigued as to why no one has raised this.
>>
>> [1] https://gist.github.com/xarses/10498338
>>
>>
>> On Fri, Apr 11, 2014 at 1:21 PM, Andrew Woodward <xarses@xxxxxxxxx> wrote:
>>>
>>> I agree with Nikolay and Ryan, multiple files make more sense. One,
>>> Alembic tracks the dependencies between them and applies them in order. Two,
>>> it allows us to safely revert changes that include db changes. Three, it
>>> allows people with working db's to migrate safely between
>>> revisions.
>>>
>>> Regardless of the one-per-release vs. many-files argument, we still
>>> haven't created fuel_4.1.py, which is necessary if we are doing one per
>>> release. There is no point in managing db migrations if we don't create
>>> the files per release.
>>>
>>> Also, I have found that there are changes currently in master that are
>>> not covered by a migration [1]. This shows that either changes aren't being
>>> tracked properly in current.py or people don't know what or how to update.
>>> If we are going to keep the one-per-release approach, it would be
>>> better to just not manage the migration files until we are ready to generate
>>> the release, and create the file once.
>>>
>>> [1] https://gist.github.com/xarses/10498338
>>>
>>>
>>> On Fri, Apr 11, 2014 at 12:28 PM, Ryan Moe <rmoe@xxxxxxxxxxxx> wrote:
>>>>
>>>> Have we reached a consensus on how we're handling migrations? I see some
>>>> reviews modifying current.py and some adding new migration files. FWIW I
>>>> agree with everything Nikolay said. I have also never seen database
>>>> migrations handled in any other way than with multiple files.
>>>>
>>>> Thanks,
>>>> Ryan
>>>>
>>>> On Mon, Mar 31, 2014 at 7:12 AM, Nikolay Markov <nmarkov@xxxxxxxxxxxx>
>>>> wrote:
>>>>>>
>>>>>> I think it will be easier to add changes in a single
>>>>>> schema instead of merging before release, because
>>>>>> merging means additional manual labour: we need to
>>>>>> remember to do it before release, and we need to
>>>>>> merge the migration files manually.
>>>>>
>>>>>
>>>>> All we need to do in this case is a simple copy-paste; it can even be
>>>>> automated if we are not happy about doing it by hand. The code in the upgrade()
>>>>> and downgrade() methods runs one migration after another; it doesn't matter whether
>>>>> it's located in one file or several.
>>>>>
>>>>>> Common practice is to keep all changes made during a
>>>>>> development cycle in a single migration file.
>>>>>
>>>>>
>>>>> As a long-time web developer, I have never seen this practice. It
>>>>> was always multiple files.
>>>>>
>>>>> I would say you're thinking too much about developers looking through
>>>>> migrations. You almost never need to look at previous migrations;
>>>>> you just need to write yours to go from the previous state (whatever it is) to
>>>>> yours.
>>>>>
>>>>> Also, it doesn't really matter how long it takes to apply a DB
>>>>> migration. In the scope of the upgrade process as a whole it will be a tiny
>>>>> thing, and even if we add a field and then delete it, it doesn't make any
>>>>> notable difference for users, but it's easier for developers not to look
>>>>> back.
>>>>>
>>>>>> If release == new database, we will have an N-fold performance
>>>>>> degradation (where N is the number of releases).
>>>>>
>>>>>
>>>>> Why? We can do requests in parallel. And what are possible problems
>>>>> with transactions? We still keep all the objects with v1 in DBv1 and objects
>>>>> v2 in DBv2. They will never intersect, in transactions as well.
>>>>>
>>>>> On Mon, Mar 31, 2014 at 3:28 PM, Evgeniy L <eli@xxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> >> The question is, do we need to keep it single during development
>>>>>> >> process or we should just merge all the files into one migration just before
>>>>>> >> release?
>>>>>>
>>>>>> I think it will be easier to add changes in a single
>>>>>> schema instead of merging before release, because
>>>>>> merging means additional manual labour: we need to
>>>>>> remember to do it before release, and we need to
>>>>>> merge the migration files manually.
>>>>>>
>>>>>> >> As for me, I don't see any issues with keeping multiple migrations
>>>>>> >> in code repo (that's the common practice of majority of projects). Please
>>>>>> >> write your objections.
>>>>>>
>>>>>> Common practice is to keep all changes made during a
>>>>>> development cycle in a single migration file. Our
>>>>>> development cycles are much longer than those of
>>>>>> regular web services (it's a specific of our product);
>>>>>> as a result, our migration files are bigger.
>>>>>>
>>>>>> I can provide several examples why 1 migration file
>>>>>> per release is better than hundreds of small migration files.
>>>>>>
>>>>>> 1. it looks better to have a single file per release
>>>>>>
>>>>>> current.py # I think we need to rename it to 5.0
>>>>>> fuel_4_0.py
>>>>>>
>>>>>> If you want to see what was changed between two
>>>>>> versions you can just open a single file.
>>>>>>
>>>>>> .... here a lot of files
>>>>>> 4_0_fix_project_user_quotas_resource_length.py
>>>>>> 4_0_add_metrics_in_compute_nodes.py
>>>>>> 4_0_add_extra_resources_in_compute_nodes.py
>>>>>> 4_0_add_details_column_to_instance_actions_events.py
>>>>>> 4_0_add_ephemeral_key_uuid.py
>>>>>> 4_0_drop_dump_tables.py
>>>>>> 4_0_add_stats_in_compute_nodes.py
>>>>>>
>>>>>> Here you have to follow some additional file naming
>>>>>> convention.
>>>>>> And not all of these names are obvious; as a result you
>>>>>> have to look inside these files anyway.
>>>>>>
>>>>>> 2. development
>>>>>>
>>>>>> Developer A added field "a".
>>>>>> Developer B, during development, found this field and decided to
>>>>>> delete it or rename it.
>>>>>>
>>>>>> 4_0_fix_project_user_quotas_resource_length.py
>>>>>> 4_0_add_a_in_compute_nodes.py - Developer A added this migration file
>>>>>> 4_0_add_extra_resources_in_compute_nodes.py
>>>>>> 4_0_add_details_column_to_instance_actions_events.py
>>>>>> 4_0_add_ephemeral_key_uuid.py - Last migration
>>>>>>
>>>>>> What should developer B do? Should he create a new
>>>>>> migration file, or should he change/remove previous files?
>>>>>> It's very easy to miss the file '4_0_add_a_in_compute_nodes.py'
>>>>>> in the list; in that case the developer will create an extra migration
>>>>>> file to remove or rename field "a".
>>>>>>
>>>>>> With a single migration file per release, the developer will be able
>>>>>> to see that this field was added in the current release, and
>>>>>> he will be able to remove/rename it.
>>>>>>
>>>>>> >> I proposed to use separate DB for each major API version (which may
>>>>>> >> have completely independent schemas) and just write data migration scripts
>>>>>> >> (v1->v2 and v2->v1), for example, to allow adding nodes to v1 cluster.
>>>>>>
>>>>>> If release == new database, we will have an N-fold performance
>>>>>> degradation (where N is the number of releases).
>>>>>> How are you going to use transactions when you have several databases?
>>>>>> It adds complexity.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 28, 2014 at 7:12 PM, Nikolay Markov <nmarkov@xxxxxxxxxxxx>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hello colleagues,
>>>>>>>
>>>>>>> Right now we already have a working DB migration mechanism provided by
>>>>>>> Alembic, but it becomes more and more complex as we move towards
>>>>>>> upgrades.
>>>>>>>
>>>>>>> First, as we agreed, the migration from the previous version of the Fuel DB
>>>>>>> to the next one will be presented as a single file. The question is, do we
>>>>>>> need to keep it single during the development process, or should we just
>>>>>>> merge all the files into one migration just before release?
>>>>>>>
>>>>>>> To clarify things, it's not really possible to generate a completely
>>>>>>> working migration from scratch by taking the diff between two
>>>>>>> releases, because there are some issues in auto-generated scripts
>>>>>>> which can only be fixed by hand during development. And our single
>>>>>>> migration script (current.py) keeps growing, as we
>>>>>>> don't keep small updates in separate files.
>>>>>>>
>>>>>>> As for me, I don't see any issues with keeping multiple migrations in the
>>>>>>> code repo (that's the common practice of the majority of projects).
>>>>>>> Please write your objections.
>>>>>>>
>>>>>>> Second, it's not clear right now how we're going to achieve backward
>>>>>>> compatibility. We will have separate versions of almost all objects
>>>>>>> in code and will select the corresponding ones by Environment version. The
>>>>>>> thing is, it will be very hard for us to write working migrations in
>>>>>>> both directions without serious data loss, especially if we have
>>>>>>> lots of changes in the DB schema.
>>>>>>>
>>>>>>> I proposed to use a separate DB for each major API version (which may
>>>>>>> have completely independent schemas) and just write data migration
>>>>>>> scripts (v1->v2 and v2->v1), for example, to allow adding nodes to a v1
>>>>>>> cluster. This seems like a huge overhead, but it actually helps to avoid
>>>>>>> the bad headache of writing DB migrations.
>>>>>>>
>>>>>>> Please let's discuss all these things in this thread.
>>>>>>>
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Nick Markov
>>>>>>>
>>>>>>> --
>>>>>>> Mailing list: https://launchpad.net/~fuel-dev
>>>>>>> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
>>>>>>> Unsubscribe : https://launchpad.net/~fuel-dev
>>>>>>> More help   : https://help.launchpad.net/ListHelp
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Nick Markov
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Andrew
>>> Mirantis
>>> Ceph community
>>
>>
>>
>>
>> --
>> Andrew
>> Mirantis
>> Ceph community
>
>

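On Nikolay's point in the quoted thread above, that the code in upgrade()
runs step by step whether it lives in one file or several: below is a rough
sketch of that equivalence. It uses only the sqlite3 module from the Python
standard library rather than Alembic, the table and column names are made up,
and the functions merely stand in for migration files, so treat it as an
illustration under those assumptions and not as how Fuel's migrations are
written.

```python
import sqlite3


# Three small steps, standing in for three separate migration files.
# Table and column names are hypothetical.
def step_create_nodes(conn):
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")


def step_add_status(conn):
    conn.execute("ALTER TABLE nodes ADD COLUMN status TEXT")


def step_add_group_id(conn):
    conn.execute("ALTER TABLE nodes ADD COLUMN group_id INTEGER")


# The same statements copy-pasted into one per-release upgrade().
def release_upgrade(conn):
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")
    conn.execute("ALTER TABLE nodes ADD COLUMN status TEXT")
    conn.execute("ALTER TABLE nodes ADD COLUMN group_id INTEGER")


def columns(conn):
    """Return (name, type) for every column of the nodes table."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(nodes)")]


# Apply the steps one file at a time.
many_files = sqlite3.connect(":memory:")
for step in (step_create_nodes, step_add_status, step_add_group_id):
    step(many_files)

# Apply the single per-release migration.
one_file = sqlite3.connect(":memory:")
release_upgrade(one_file)

# Compare the resulting column lists of the two databases.
print(columns(many_files) == columns(one_file))
```

What the sketch does not cover is the part the two approaches really differ
on: with many files, a database that stopped halfway through the list can
continue from there, while a single file is applied as one unit.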


-- 
Andrew
Mirantis
Ceph community

