maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #13294
Re: 940d028521f: MDEV-30164 System variable for default collations
Hi, Alexander,
I'm sorry it took a while.
I'm still thinking about the whole thing, it's a rather big change for
a really fringe functionality. But I failed to come up with something
better so far.
Code-wise the patch is mostly fine. See few small comments below, and
one slightly larger comment re. replication.
On Mar 07, Alexander Barkov wrote:
> revision-id: 940d028521f (mariadb-10.11.1-4-g940d028521f)
> parent(s): 71dedd0ebcd
> author: Alexander Barkov
> committer: Alexander Barkov
> timestamp: 2022-12-14 20:00:22 +0400
> message:
>
> MDEV-30164 System variable for default collations
>
> This patch adds a way to override default collations
> (or "character set collations") for desired character sets.
>
> diff --git a/sql/lex_charset.h b/sql/lex_charset.h
> --- a/sql/lex_charset.h
> +++ b/sql/lex_charset.h
> @@ -544,6 +699,20 @@ struct Lex_exact_charset_extended_collation_attrs_st
> {
> return m_ci;
> }
> + CHARSET_INFO *charset_info(const Charset_collation_map_st &map) const
> + {
> + switch (m_type)
> + {
> + case Lex_exact_charset_extended_collation_attrs_st::TYPE_CHARACTER_SET:
> + return map.get_collation_for_charset(m_ci);
> + case TYPE_EMPTY:
Lex_exact_charset_extended_collation_attrs_st::TYPE_EMPTY
(or all case labels without a struct name)
> + case Lex_exact_charset_extended_collation_attrs_st::TYPE_CHARACTER_SET_COLLATE_EXACT:
> + case Lex_exact_charset_extended_collation_attrs_st::TYPE_COLLATE_CONTEXTUALLY_TYPED:
COLLATE DEFAULT is TYPE_COLLATE_CONTEXTUALLY_TYPED.
shouldn't it use the map too?
> + case Lex_exact_charset_extended_collation_attrs_st::TYPE_COLLATE_EXACT:
> + break;
> + }
> + return m_ci;
> + }
> Type type() const
> {
> return m_type;
> diff --git a/sql/log_event.h b/sql/log_event.h
> --- a/sql/log_event.h
> +++ b/sql/log_event.h
> @@ -2147,6 +2153,8 @@ class Query_log_event: public Log_event
> bool sql_mode_inited;
> bool charset_inited;
>
> + LEX_CSTRING character_set_collations_str;
better LEX_CUSTRING, don't you think? It's not even a _str.
> +
> uint32 flags2;
> sql_mode_t sql_mode;
> ulong auto_increment_increment, auto_increment_offset;
> diff --git a/sql/log_event_server.cc b/sql/log_event_server.cc
> --- a/sql/log_event_server.cc
> +++ b/sql/log_event_server.cc
> @@ -1194,6 +1194,14 @@ bool Query_log_event::write()
> int2store(start+2, auto_increment_offset);
> start+= 4;
> }
> +
> + if (thd && thd->variables.character_set_collations.count())
> + {
> + *start++= Q_COLLATIONS_SESSION;
> + size_t len= thd->variables.character_set_collations.to_binary((char*)start);
> + start+= len;
> + }
Perhaps, detect if it's needed? A cheap way of doing it would be to extend
your Elem_st with a query_id. And every time you find_elem_by_charset,
you set this elem's query_id to thd->query_id. And here you write only
elements with the current query id. If any.
Another approach would be to have a bitmap, like
uchar used_default_coll_mapping;
and in find_elem_by_charset() you set the bit, like
used_default_coll_mapping |= 1 << i;
and then, again, print affected collations, if any. Most often
used_default_coll_mapping will likely be zero
one more question. In, say, 10.10->11.1 replication
master and slave will have different default collations, but
thd->variables.character_set_collations will not reflect that.
How do you plan to solve it?
> +
> if (charset_inited)
> {
> *start++= Q_CHARSET_CODE;
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx
Follow ups