Message #92507
[Bug 2024258] [NEW] Performance degradation archiving DB with large numbers of FK related records
Public bug reported:
Observed downstream in a large-scale cluster with constant server
create/delete activity and hundreds of thousands of deleted instances rows.
Currently, we archive deleted rows in batches of max_rows parent rows plus
their child rows in a single database transaction. This limits how high
a max_rows value the caller can specify, because the database
transaction it generates can become very large.
For example, in a large-scale deployment with hundreds of thousands of
deleted rows and constant server creation and deletion activity, a
value of max_rows=1000 might exceed the database's configured maximum
packet size or time out due to a database deadlock, forcing the operator
to use a much lower max_rows value such as 100 or 50.
And when the operator is trying to archive e.g. 500,000 deleted instances
rows (and millions of deleted rows in total), being forced to use a
max_rows value several orders of magnitude lower than the number of rows
to archive is a poor user experience and also makes it unclear whether
archive progress is actually being made.
** Affects: nova
Importance: Undecided
Assignee: melanie witt (melwitt)
Status: New
** Affects: nova/antelope
Importance: Undecided
Status: New
** Affects: nova/wallaby
Importance: Undecided
Status: New
** Affects: nova/xena
Importance: Undecided
Status: New
** Affects: nova/yoga
Importance: Undecided
Status: New
** Affects: nova/zed
Importance: Undecided
Status: New
** Tags: db performance
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/2024258
Title:
Performance degradation archiving DB with large numbers of FK related
records
Status in OpenStack Compute (nova):
New
Status in OpenStack Compute (nova) antelope series:
New
Status in OpenStack Compute (nova) wallaby series:
New
Status in OpenStack Compute (nova) xena series:
New
Status in OpenStack Compute (nova) yoga series:
New
Status in OpenStack Compute (nova) zed series:
New
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/2024258/+subscriptions