yahoo-eng-team team mailing list archive
-
yahoo-eng-team team
-
Mailing list archive
-
Message #45723
[Bug 1524124] Re: unscalable database schema design
** Changed in: keystone
Status: New => Opinion
--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1524124
Title:
unscalable database schema design
Status in OpenStack Identity (keystone):
Opinion
Bug description:
I have noticed the keystone SQL database schema design is not
scalable. It can hold maybe a few hundreds or maximum thousands of
entries, but beyond this, it is going to certainly create very serious
efficiency problems, both in terms storage space and query response
time. Here are the main problem points I have spotted:
i) most of the tables use primary keys of varchar(64) type: role,
domain, project, token, user, group etc., supposed to contain unique
hex identifiers. I am not exactly sure about the rationale behind this
design? If the idea is to be able to accommodate up to 16**64=10**77
distinct records, than this is clearly flawed, as the tables won't
hold more than a few thousand entries given the current length of the
primary key (and foreign keys, for those minor entity tables that
refer to the major entity).
ii) some tables have composite keys on multiple varchar(64) fields:
Create Table: CREATE TABLE `assignment` (
`type` enum('UserProject','GroupProject','UserDomain','GroupDomain') NOT NULL,
`actor_id` varchar(64) NOT NULL,
`target_id` varchar(64) NOT NULL,
`role_id` varchar(64) NOT NULL,
`inherited` tinyint(1) NOT NULL,
PRIMARY KEY (`type`,`actor_id`,`target_id`,`role_id`),
KEY `ix_actor_id` (`actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
iii) some tables have unique keys defined on varchar(255) columns:
Create Table: CREATE TABLE `role` (
`id` varchar(64) NOT NULL,
`name` varchar(255) NOT NULL,
`extra` text,
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
iv) the generated public id for (user,domain) entities is currently 64
hex chars, while only 32 hex chars are needed to ensure uniqueness up
to 16**16=2**64=10**19 entries, which should be more than sufficient
for any practical installation.
In order to remedy these problems, I propose the following
improvements:
i) replace the varchar(64) hex primary key by an auto-incremented
integer(4) column. This will hold up to 4 billion records and will
greatly reduce the storage requirements and improve query
performance.
ii) reduce the generated public id for (user, domain) entities to 32
hex chars, stored in binary form as two bigint(8) columns.
iii) reduce the "name" field length to more manageable length or
reduce index size using a hash function.
To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1524124/+subscriptions
References