maria-developers team mailing list archive
-
maria-developers team
-
Mailing list archive
-
Message #12696
Re: e7762b3a725: MDEV-8334: Rename utf8 to utf8mb3
Hi, Rucha!
Looks great!
Just one question below re. using global vs session old_behavior
variable.
On May 05, Rucha Deodhar wrote:
> revision-id: e7762b3a725 (mariadb-10.5.2-583-ge7762b3a725)
> parent(s): bfedf1eb4b6
> author: Rucha Deodhar <rucha.deodhar@xxxxxxxxxxx>
> committer: Rucha Deodhar <rucha.deodhar@xxxxxxxxxxx>
> timestamp: 2021-04-20 12:50:32 +0530
> message:
>
> MDEV-8334: Rename utf8 to utf8mb3
>
> This patch changes the main name of 3 byte character set from utf8 to
> utf8mb3. New old_mode UTF8_IS_UTF8MB3 is added and set TRUE by default,
> so that utf8 would mean utf8mb3. If not set, utf8 would mean utf8mb4.
> diff --git a/libmariadb b/libmariadb
> --- a/libmariadb
> +++ b/libmariadb
> @@ -1 +1 @@
> -Subproject commit b6f8883d9687936a50a7ed79bd9e5af2340efccd
> +Subproject commit 03d983b287f8a1fe855cb5ed479a3f7ab4f922ab
when rebasing and pushing into 10.6 take care not to
rollback C/C changes. That is, only update C/C submodule reference
if it's earlier than your 03d983b287f8a1fe855cb5ed479a3f7ab4f922ab
> Binary files a/mysql-test/suite/sys_vars/r/character_set_results_basic.result and b/mysql-test/suite/sys_vars/r/character_set_results_basic.result differ
> diff --git a/sql/item.cc b/sql/item.cc
> --- a/sql/item.cc
> +++ b/sql/item.cc
> @@ -2359,6 +2359,9 @@ left_is_superset(const DTCollation *left, const DTCollation *right)
>
> bool DTCollation::aggregate(const DTCollation &dt, uint flags)
> {
> +
> + myf utf8_flag= global_system_variables.old_behavior &
> + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
if old_behavior is a session variable, then you should use session
value here.
> if (!my_charset_same(collation, dt.collation))
> {
> /*
> diff --git a/sql/mysqld.cc b/sql/mysqld.cc
> --- a/sql/mysqld.cc
> +++ b/sql/mysqld.cc
> @@ -4099,6 +4098,8 @@ static int init_common_variables()
> test purposes, to be able to start "mysqld" even if
> the requested character set is not available (see bug#18743).
> */
> + myf utf8_flag= global_system_variables.old_behavior &
> + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
ok, here there's no "session" yet, so global value is correct
> for (;;)
> {
> char *next_character_set_name= strchr(default_character_set_name, ',');
> diff --git a/sql/set_var.cc b/sql/set_var.cc
> --- a/sql/set_var.cc
> +++ b/sql/set_var.cc
> @@ -533,11 +533,12 @@ static my_old_conv old_conv[]=
> CHARSET_INFO *get_old_charset_by_name(const char *name)
> {
> my_old_conv *conv;
> -
> + myf utf8_flag= global_system_variables.old_behavior &
> + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
technically, you should use a session value here too.
but see the old_conv array, it doesn't have "utf8" anywhere,
so it doesn't matter what the flag is, you can use 0 there.
> for (conv= old_conv; conv->old_name; conv++)
> {
> if (!my_strcasecmp(&my_charset_latin1, name, conv->old_name))
> - return get_charset_by_csname(conv->new_name, MY_CS_PRIMARY, MYF(0));
> + return get_charset_by_csname(conv->new_name, MY_CS_PRIMARY, MYF(utf8_flag));
> }
> return NULL;
> }
> diff --git a/sql/sp.cc b/sql/sp.cc
> --- a/sql/sp.cc
> +++ b/sql/sp.cc
> @@ -291,7 +291,8 @@ bool load_charset(MEM_ROOT *mem_root,
> CHARSET_INFO **cs)
> {
> LEX_CSTRING cs_name;
> -
> + myf utf8_flag= global_system_variables.old_behavior &
> + OLD_MODE_UTF8_IS_UTF8MB3 ? MY_UTF8_IS_UTF8MB3 : 0;
but this needs a session value, I suppose
> if (field->val_str_nopad(mem_root, &cs_name))
> {
> *cs= dflt_cs;
> @@ -324,9 +325,10 @@ bool load_collation(MEM_ROOT *mem_root,
> *cl= dflt_cl;
> return TRUE;
> }
> + myf utf8_flag= thd->get_utf8_flag();
Hmm, here you do use a session value. So, I suppose your using global
value above was not an oversight, but you intentionally did it.
What were your reasons?
>
> DBUG_ASSERT(cl_name.str[cl_name.length] == 0);
> - *cl= get_charset_by_name(cl_name.str, MYF(0));
> + *cl= get_charset_by_name(cl_name.str, MYF(utf8_flag));
>
> if (*cl == NULL)
> {
> diff --git a/sql/sql_class.h b/sql/sql_class.h
> --- a/sql/sql_class.h
> +++ b/sql/sql_class.h
> @@ -1051,14 +1052,16 @@ static inline void update_global_memory_status(int64 size)
> @retval NULL on error
> @retval Pointter to CHARSET_INFO with the given name on success
> */
> -static inline CHARSET_INFO *
> -mysqld_collation_get_by_name(const char *name,
> +inline CHARSET_INFO *
static inline, please
> +mysqld_collation_get_by_name(const char *name, bool utf8_is_utf8mb3,
up to you, but wouldn't it be more convenient to
pass the flag here? Then you can invoke it with thd->utf8_flag()
> CHARSET_INFO *name_cs= system_charset_info)
> {
> CHARSET_INFO *cs;
> MY_CHARSET_LOADER loader;
> + myf utf8_flag= utf8_is_utf8mb3 ? MY_UTF8_IS_UTF8MB3 : 0;
> my_charset_loader_init_mysys(&loader);
> - if (!(cs= my_collation_get_by_name(&loader, name, MYF(0))))
> +
> + if (!(cs= my_collation_get_by_name(&loader, name, MYF(utf8_flag))))
> {
> ErrConvString err(name, name_cs);
> my_error(ER_UNKNOWN_COLLATION, MYF(0), err.ptr());
> diff --git a/sql/sql_parse.cc b/sql/sql_parse.cc
> --- a/sql/sql_parse.cc
> +++ b/sql/sql_parse.cc
> @@ -10449,7 +10449,10 @@ merge_charset_and_collation(CHARSET_INFO *cs, CHARSET_INFO *cl)
> CHARSET_INFO *find_bin_collation(CHARSET_INFO *cs)
> {
> const char *csname= cs->csname;
> - cs= get_charset_by_csname(csname, MY_CS_BINSORT, MYF(0));
> + myf utf8_flag= global_system_variables.old_behavior &
> + OLD_MODE_UTF8_IS_UTF8MB3 ?
> + MY_UTF8_IS_UTF8MB3 : 0;
Why not a session value here?
> + cs= get_charset_by_csname(csname, MY_CS_BINSORT, MYF(utf8_flag));
> if (!cs)
> {
> char tmp[65];
Regards,
Sergei
VP of MariaDB Server Engineering
and security@xxxxxxxxxxx