yahoo-eng-team team mailing list archive: Message #11356
[Bug 1290700] [NEW] nova-manage db archive_deleted_rows stops at the first failure with insufficient diagnostics
Public bug reported:
1- After a long provisioning run that creates and deletes VMs over
several days, we have accumulated a long history of deleted instances
in the DB that could be archived.
2- We attempted to run:
nova-manage db archive_deleted_rows --max_rows 1000000
The command was accepted but did not complete in a reasonable time and
was then stopped with Ctrl-C.
Possibly the same happens on a time-out, because the command performs a multi-row INSERT into the target shadow table and only afterwards deletes the entries that were logically deleted in the on-line table. It is not clear whether both statements are in the same commit cycle, and whether the first is rolled back when the second fails to complete.
It is also not clear whether the command can be executed concurrently by multiple users without problems. In our case the DB was left in an inconsistent state, with rows still present in the on-line tables that had already been copied to the shadow tables.
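The transactional question above can be illustrated with a minimal sketch. This is not Nova code: the schema and table names are hypothetical stand-ins, using an in-memory SQLite DB. The point is that when the INSERT into the shadow table and the DELETE from the on-line table share one commit cycle, a failing DELETE also rolls back the INSERT, so the inconsistent "rows in both tables" state we observed cannot occur.

```python
import sqlite3

def archive_deleted_rows(conn, max_rows):
    # Sketch of the archive pattern described above, against a
    # simplified hypothetical schema (not the real Nova tables).
    # The "with conn:" block is one commit cycle: COMMIT on success,
    # ROLLBACK on any error, so a failed DELETE also undoes the INSERT.
    with conn:
        conn.execute(
            "INSERT INTO shadow_instances "
            "SELECT id, deleted FROM instances WHERE deleted != 0 "
            "ORDER BY id LIMIT ?",
            (max_rows,),
        )
        conn.execute(
            "DELETE FROM instances WHERE id IN ("
            "SELECT id FROM instances WHERE deleted != 0 "
            "ORDER BY id LIMIT ?)",
            (max_rows,),
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, deleted INTEGER)")
conn.execute("CREATE TABLE shadow_instances (id INTEGER PRIMARY KEY, deleted INTEGER)")
conn.executemany("INSERT INTO instances VALUES (?, ?)",
                 [(1, 0), (2, 2), (3, 3)])  # ids 2 and 3 are soft-deleted
archive_deleted_rows(conn, max_rows=10)
live = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM shadow_instances").fetchone()[0]
print(live, archived)  # prints: 1 2
```

If instead the two statements were committed separately, an interruption (Ctrl-C or a time-out) between them would leave the copied rows in both tables, which matches what we saw.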
3- As a consequence of this situation, any further invocation of the
command, even with a small max_rows value, will fail. It would be
better to skip the row in error and continue with the others,
reporting which row failed and needs further action. As it stands, the
behavior leaves the user with the suspicion that archiving doesn't
work at all, as many report in the OpenStack forums.
4- The problem here is one of serviceability. First, the command
returns no output when everything goes fine, which does not help make
things clear.
5- Second, when something goes wrong the output of the command is not clear about what happened: it just prints the SQL statement that failed. If that statement is a multi-row INSERT with a large set of values, as is likely with a high max_rows parameter, only the final part of the statement is shown.
If max_rows is large, the part of the output that fits in the shell is just a list of values of the last field in the multi-row INSERT, usually the content of the 'deleted' column for the processed rows, which is a counter and not meaningful to the user.
e.g. .......1401601, 1401602, 1401603, 1401604, 1401605, 1401606, 1401607, 1401608, 1401609, 1401610, 1401611)
Please note that in this case the command may have been partially executed, and any further attempt blocks at the same point.
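One way the diagnostic could be improved is to summarize the failing statement rather than dump it. The helper below is purely a suggestion sketch, not part of nova-manage: it keeps the head of the statement, where the operation and table name live, and replaces the long VALUES tail with a character count.

```python
def summarize_failed_statement(statement, max_len=80):
    # Hypothetical helper (not nova code): truncate a failing SQL
    # statement so the meaningful part (operation and table name)
    # survives, instead of flooding the shell with row values.
    if len(statement) <= max_len:
        return statement
    return "%s ... [%d characters truncated]" % (
        statement[:max_len], len(statement) - max_len)

# A multi-row INSERT like the one whose tail fills the shell above.
stmt = ("INSERT INTO shadow_instances (id, deleted) VALUES " +
        ", ".join("(%d, %d)" % (i, i) for i in range(1401600, 1401612)))
print(summarize_failed_statement(stmt))
```

The user then sees which table and operation failed, which is what is needed to investigate, rather than a list of counter values.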
6- As a workaround, the user can only execute the command with
max_rows=1, inspect the output, and fix each problem manually in the
DB. This is not really practical for the purpose of the command.
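The skip-and-report behavior requested in point 3 can be sketched as follows. Here `archive_one` is a hypothetical stand-in for a single `nova-manage db archive_deleted_rows --max_rows 1` run; the loop records each failing row and continues instead of aborting, which is what would make the manual max_rows=1 workaround unnecessary.

```python
def drain_one_by_one(row_ids, archive_one):
    # Sketch of the proposed behavior: archive each row, collect
    # failures instead of aborting, and report them at the end.
    failures = []
    for row_id in row_ids:
        try:
            archive_one(row_id)
        except Exception as exc:
            failures.append((row_id, str(exc)))
    return failures

def fake_archive(row_id):
    # Simulated row-level error: id 2 is the "bad" row that blocks
    # the real command at the same point on every run.
    if row_id == 2:
        raise RuntimeError("constraint violation")

failures = drain_one_by_one([1, 2, 3], fake_archive)
print(failures)  # prints: [(2, 'constraint violation')]
```

With this pattern the good rows are archived in one pass and the user gets a short list of the rows that need manual attention.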
** Affects: nova-project
Importance: Undecided
Status: New
** Project changed: neutron => nova-project
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1290700
Title:
nova-manage db archive_deleted_rows stops at the first failure with
insufficient diagnostics
Status in The Nova Project:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova-project/+bug/1290700/+subscriptions