fuel-dev team mailing list archive

Re: Discussing DB migrations

To: Nikolay Markov <nmarkov@xxxxxxxxxxxx>
From: Evgeniy L <eli@xxxxxxxxxxxx>
Date: Mon, 31 Mar 2014 15:28:31 +0400
Cc: "fuel-dev@xxxxxxxxxxxxxxxxxxx" <fuel-dev@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <CAG8wB_RE0_rsfp-c+hMvNt123JhBuxGZqdTY=UF0gg13K3hWsw@mail.gmail.com>
Hi,

>> The question is, do we need to keep it single during development process
or we should just merge all the files into one migration just before
release?

I think it will be easier to add changes to a single
migration file than to merge before release, because
merging means additional manual work: we have to
remember to do it before every release, and we have
to merge the migration files by hand.

>> As for me, I don't see any issues with keeping multiple migrations in code
repo (that's the common practice of majority of projects). Please write
your objections.

Common practice is to keep all changes made during
a development cycle in a single migration file.
Our development cycles are much longer than those
of regular web services (this is specific to our
product), and as a result our migration files are
bigger.

I can provide several examples of why one migration file
per release is better than hundreds of small migration files.

1. it looks better to have a single file per release

current.py # I think we need to rename it to 5.0
fuel_4_0.py

If you want to see what was changed between two
versions you can just open a single file.

.... here a lot of files
4_0_fix_project_user_quotas_resource_length.py
4_0_add_metrics_in_compute_nodes.py
4_0_add_extra_resources_in_compute_nodes.py
4_0_add_details_column_to_instance_actions_events.py
4_0_add_ephemeral_key_uuid.py
4_0_drop_dump_tables.py
4_0_add_stats_in_compute_nodes.py

Here you have to follow an additional file naming
convention.
And not all of these names are obvious, so you
have to look inside the files anyway.

2. development

Developer A added field "a".
Developer B, during development, found this field and decided to delete
or rename it.

4_0_fix_project_user_quotas_resource_length.py
4_0_add_a_in_compute_nodes.py - Developer A added this migration file
4_0_add_extra_resources_in_compute_nodes.py
4_0_add_details_column_to_instance_actions_events.py
4_0_add_ephemeral_key_uuid.py - Last migration

What should developer B do? Should he create a new
migration file, or should he change/remove the previous files?
It's very easy to miss the file '4_0_add_a_in_compute_nodes.py'
in the list; in that case the developer will create an extra
migration file just to remove or rename field "a".
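
To make the "easy to miss" point concrete, here is a small sketch of the
kind of search developer B would need to run over many small files. It is
only an illustration: the helper name find_migrations_touching, the file
bodies, and the temporary directory are all hypothetical, and the file
contents are placeholder text rather than real Alembic migrations.

```python
import tempfile
from pathlib import Path


def find_migrations_touching(directory, column):
    """Return names of migration files whose source mentions `column`.

    Hypothetical helper: it does a plain text search for the quoted
    column name, nothing smarter.
    """
    needle = f'"{column}"'
    return sorted(
        path.name
        for path in Path(directory).glob("*.py")
        if needle in path.read_text()
    )


# Build a throwaway directory that mimics part of the listing above.
# The file bodies are placeholder text, not real migration code.
with tempfile.TemporaryDirectory() as tmp:
    files = {
        "4_0_add_a_in_compute_nodes.py":
            'add_column("compute_nodes", "a")',
        "4_0_add_ephemeral_key_uuid.py":
            'add_column("instances", "ephemeral_key_uuid")',
    }
    for name, body in files.items():
        Path(tmp, name).write_text(body)
    hits = find_migrations_touching(tmp, "a")

print(hits)
```

With one file per release there is nothing to search across: every change
made in the current release is already in the file being edited.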

With a single migration file per release, the developer will be able
to see that this field was added in the current release, and
he will be able to remove/rename it in place.
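
The same idea as a toy model, under loud assumptions: this is not Alembic
code, and the (action, table, column) tuples and the squash helper are
made up purely for illustration. It shows that when developer A's "add"
and developer B's "drop" belong to the same release, the net change to
ship is nothing at all for that column.

```python
# Toy model of one release's schema changes; not Alembic's API.
# Each operation is (action, table, column); all names are hypothetical.

def squash(ops):
    """Cancel add/drop pairs on the same column within one release."""
    result = []
    for action, table, column in ops:
        if action == "drop" and ("add", table, column) in result:
            # The column was introduced in this same release, so
            # deleting the original "add" is all that is needed.
            result.remove(("add", table, column))
        else:
            result.append((action, table, column))
    return result


release_ops = [
    ("add", "compute_nodes", "a"),        # developer A
    ("add", "compute_nodes", "metrics"),
    ("drop", "compute_nodes", "a"),       # developer B
]

net_ops = squash(release_ops)
print(net_ops)
```

In a single per-release file developer B does this by hand: he deletes
developer A's lines instead of appending a new migration.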

>> I proposed to use separate DB for each major API version (which may have
completely independent schemas) and just write data migration scripts
(v1->v2 and v2->v1), for example, to allow adding nodes to v1 cluster.

If each release gets its own database, performance will degrade by a
factor of N (where N is the number of releases).
And how are you going to use transactions when you have several databases?
It adds complexity.
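
A minimal sketch of the transaction problem, assuming two in-memory
SQLite databases stand in for the per-release databases (our real setup
is not shown here, and the table and the copy_node function are
hypothetical). Each connection has its own transaction, so a write that
succeeds in one database and fails in the other leaves them out of sync.

```python
import sqlite3

# Two independent databases, standing in for "one DB per release".
db_v1 = sqlite3.connect(":memory:")
db_v2 = sqlite3.connect(":memory:")
db_v1.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT)")
db_v2.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)


def copy_node(node_id, name):
    """Write the same node to both databases; no shared transaction."""
    db_v1.execute("INSERT INTO nodes VALUES (?, ?)", (node_id, name))
    db_v1.commit()  # the v1 write is now permanent
    db_v2.execute("INSERT INTO nodes VALUES (?, ?)", (node_id, name))
    db_v2.commit()


try:
    copy_node(1, None)  # allowed by the v1 schema, rejected by v2
except sqlite3.IntegrityError:
    db_v2.rollback()  # only v2 can be rolled back at this point

v1_rows = db_v1.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
v2_rows = db_v2.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
print(v1_rows, v2_rows)
```

Keeping the two in step would need something like two-phase commit or
compensating writes, which is the extra complexity I mean.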

Thanks,



On Fri, Mar 28, 2014 at 7:12 PM, Nikolay Markov <nmarkov@xxxxxxxxxxxx> wrote:

> Hello colleagues,
>
> Right now we already have working DB migration mechanism presented by
> Alembic, but it becomes more and more complex as we move towards
> upgrades.
>
> First, as we agreed, migration from previous version of Fuel DB to the
> next one will be presented by a single file. The question is, do we
> need to keep it single during development process or we should just
> merge all the files into one migration just before release?
>
> To clarify things, it's not really possible to generate a completely
> working migration from scratch by taking the diff between two
> releases, because there are some issues in auto-generated scripts
> which can only be fixed by hand during development. And our single
> migration script (current.py) keeps growing, as we
> don't keep small updates in separate files.
>
> As for me, I don't see any issues with keeping multiple migrations in
> code repo (that's the common practice of majority of projects). Please
> write your objections.
>
> Second, it's not clear right now how we're going to achieve backward
> compatibility. We will have separate versions of almost all objects in
> code and will select corresponding ones by Environment versions. The
> thing is, it will be very hard for us to write working migrations in
> both directions without serious data loss, especially if we'll have
> lots of changes in DB schema.
>
> I proposed to use separate DB for each major API version (which may
> have completely independent schemas) and just write data migration
> scripts (v1->v2 and v2->v1), for example, to allow adding nodes to v1
> cluster. This seems like a huge overhead, but it actually spares us
> the headache of writing DB migrations.
>
> Please let's discuss all these things in this thread.
>
> --
> Best regards,
> Nick Markov
>
> --
> Mailing list: https://launchpad.net/~fuel-dev
> Post to     : fuel-dev@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~fuel-dev
> More help   : https://help.launchpad.net/ListHelp
>
Follow ups

Re: Discussing DB migrations
From: Nikolay Markov, 2014-03-31
References

Discussing DB migrations
From: Nikolay Markov, 2014-03-28