← Back to team overview

maria-developers team mailing list archive

Re: review of MDEV-4011 per thread memory counting and usage

 

Hi!

>>>>> "Sergei" == Sergei Golubchik <serg@xxxxxxxxxxxx> writes:

Sergei> Hi.
Sergei> Here it is:

>> === modified file 'client/completion_hash.cc'
>> --- client/completion_hash.cc	2011-06-30 15:46:53 +0000
>> +++ client/completion_hash.cc	2013-01-15 21:00:33 +0000
>> @@ -49,7 +49,7 @@ int completion_hash_init(HashTable *ht,
ht-> initialized = 0;
>> return FAILURE;
>> }
>> -  init_alloc_root(&ht->mem_root, 8192, 0);
>> +  init_alloc_root(&ht->mem_root, 8192, 0, 0);

Sergei> very often you write 0 or MY_THREAD_SPECIFIC instead of
Sergei> MYF(0) and MYF(MY_THREAD_SPECIFIC)

Yes. MYF() was important in C, but not that critical anymore in C++ I
can change all MY_THREAD_SPECIFIC to MYF(MY_THREAD_SPECIFIC) if you
want.

>> === modified file 'include/my_sys.h'
>> --- include/my_sys.h	2012-12-23 21:37:11 +0000
>> +++ include/my_sys.h	2013-01-15 21:00:33 +0000
>> @@ -86,6 +86,8 @@ typedef struct my_aio_result {
>> #define MY_SYNC       4096      /* my_copy(): sync dst file */
>> #define MY_SYNC_DIR   32768     /* my_create/delete/rename: sync directory */
>> #define MY_SYNC_FILESIZE 65536  /* my_sync(): safe sync when file is extended */
>> +#define MY_THREAD_SPECIFIC 0x10000  /* my_malloc(): thread specific */
>> +#define MY_THREAD_MOVE     0x20000  /* realloc(); Memory can move */

Sergei> may be, better to remove this, as it's unused?
Sergei> it's easier to add it later (just a couple of lines of code),
Sergei> than to keep the dead code around.

It's hard for someone else to know the removed lines of code if they
ever need to allocate memory for another thread.  I rather keep that
there for now.

>> #define MY_CHECK_ERROR	1	/* Params to my_end; Check open-close */
>> #define MY_GIVE_INFO	2	/* Give time info about process*/
>> @@ -148,6 +150,42 @@ typedef struct my_aio_result {
>> /* Extra length needed for filename if one calls my_create_backup_name */
>> #define MY_BACKUP_NAME_EXTRA_LENGTH 17
>> 
>> +/* If we have our own safemalloc (for debugging) */
>> +#if defined(SAFEMALLOC)
>> +#define MALLOC_SIZE_AND_FLAG(p,b) sf_malloc_usable_size(p,b)
>> +#define MALLOC_PREFIX_SIZE 0
>> +#define MALLOC_STORE_SIZE(a,b,c,d)
>> +#define MALLOC_FIX_POINTER_FOR_FREE(a) a
>> +#define SAFEMALLOC_REPORT_MEMORY(X) sf_report_leaked_memory(X)
>> +void sf_report_leaked_memory(my_thread_id id);
>> +extern my_thread_id (*sf_malloc_dbug_id)(void);
>> +#else
>> +/*
>> + *   We use double as prefix size as this guarantees the correct
>> + *   alignment on all platforms and will optimize things for
>> + *   memcpy(), memcmp() etc.
>> + */
>> +#define MALLOC_PREFIX_SIZE (sizeof(double))
>> +#define MALLOC_SIZE(p) (*(size_t*) ((char*)(p) - MALLOC_PREFIX_SIZE))
>> +#define MALLOC_STORE_SIZE(p, type_of_p, size, flag)      \
>> +{\
>> +  *(size_t*) p= (size) | (flag);    \
>> +  (p)= (type_of_p) (((char*) (p)) + MALLOC_PREFIX_SIZE); \
>> +} 
>> +static inline size_t malloc_size_and_flag(void *p, myf *flags)
>> +{
>> +  size_t size= MALLOC_SIZE(p);
>> +  *flags= (size & 1);
>> +  return size & ~ (ulonglong) 1;
>> +}
>> +#define MALLOC_SIZE_AND_FLAG(p,b) malloc_size_and_flag(p, b);
>> +#define MALLOC_FIX_POINTER_FOR_FREE(p) (((char*) (p)) - MALLOC_PREFIX_SIZE)
>> +#define SAFEMALLOC_REPORT_MEMORY(X) do {} while(0)
>> +#endif /* SAFEMALLOC */
>> +
>> +typedef void (*MALLOC_SIZE_CB) (long long size, myf my_flags); 
>> +extern void set_malloc_size_cb(MALLOC_SIZE_CB func);

Sergei> I don't think you need many of these defines here. my_sys.h is the interface
Sergei> header.  MALLOC_STORE_SIZE and MALLOC_FIX_POINTER_FOR_FREE (for example, I
Sergei> didn't check the others) are only used in my_malloc.c - they should not be
Sergei> used elsewhere, and thus, I think, they should be defined in my_malloc.c

Originally here we had definitions for usage of the different malloc
function and to get size for the memory blocks.
I agree that most of the above can now be moved to my_malloc.c.
I will do that.

Sergei> And, frankly speaking, I wouldn't bother doing two different ways of
Sergei> storing the size and the flags. my_malloc can always allocate one
Sergei> sizeof(double) more and store its data there. Who cares that in the debug
Sergei> builds safemalloc duplicates that...

I prefer to do it right. As this is just a small set of defines, it's
not hard to keep both around.

>> /* defines when allocating data */
>> extern void *my_malloc(size_t Size,myf MyFlags);
>> extern void *my_multi_malloc(myf MyFlags, ...);
>> @@ -323,6 +361,7 @@ typedef struct st_dynamic_array
>> uint elements,max_element;
>> uint alloc_increment;
>> uint size_of_element;
>> +  myf malloc_flags;

Sergei> hm. you've preserved the MEM_ROOT structure for compatibiity reasons,
Sergei> but you've changed DYNAMIC_ARRAY. Why?
Sergei> I'd expect you either to change everything or to change nothing.

MEMROOT is used the public client library, so I didn't want to change
it. I spent a lot of time encoding the memory in other variables just
to not break compatibility with 5.5 or 5.6

dynamic_array is not used in the structs, so there I did things correctly.

Sergei> Besides, what's the point of preserving MEM_ROOT, if you
Sergei> change prototypes of functions that work with it?

Becasue a MySQL client doesn't directly access MEMROOT; They do hower
allocate the MYSQL struct on stack that includes MEMROOT.

>> } DYNAMIC_ARRAY;
>> 
>> typedef struct st_my_tmpdir
>> @@ -768,16 +807,13 @@ extern my_bool real_open_cached_file(IO_
>> extern void close_cached_file(IO_CACHE *cache);
>> File create_temp_file(char *to, const char *dir, const char *pfx,
>> int mode, myf MyFlags);
>> -#define my_init_dynamic_array(A,B,C,D) init_dynamic_array2(A,B,NULL,C,D)
>> -#define my_init_dynamic_array_ci(A,B,C,D) init_dynamic_array2(A,B,NULL,C,D)
>> -#define my_init_dynamic_array2(A,B,C,D,E) init_dynamic_array2(A,B,C,D,E)
>> -#define my_init_dynamic_array2_ci(A,B,C,D,E) init_dynamic_array2(A,B,C,D,E)
>> +#define my_init_dynamic_array(A,B,C,D,E) init_dynamic_array2(A,B,NULL,C,D,E)
>> +#define my_init_dynamic_array_ci(A,B,C,D,E) init_dynamic_array2(A,B,NULL,C,D,E)
>> +#define my_init_dynamic_array2(A,B,C,D,E,F) init_dynamic_array2(A,B,C,D,E,F)
>> +#define my_init_dynamic_array2_ci(A,B,C,D,E,F) init_dynamic_array2(A,B,C,D,E,F)

Sergei> as you're changing function prototypes anyway,
Sergei> you can remove the _ci functions and all other "preserve the ABI" garbage,

Will do.

>> extern my_bool init_dynamic_array2(DYNAMIC_ARRAY *array, uint element_size,
>> void *init_buffer, uint init_alloc,
>> -                                   uint alloc_increment);
>> -/* init_dynamic_array() function is deprecated */
>> -extern my_bool init_dynamic_array(DYNAMIC_ARRAY *array, uint element_size,
>> -                                  uint init_alloc, uint alloc_increment);
>> +                                   uint alloc_increment, myf my_flags);
>> extern my_bool insert_dynamic(DYNAMIC_ARRAY *array, const uchar * element);
>> extern uchar *alloc_dynamic(DYNAMIC_ARRAY *array);
>> extern uchar *pop_dynamic(DYNAMIC_ARRAY*);
>> @@ -829,7 +865,8 @@ extern void my_free_lock(void *ptr);
>> #define ALLOC_ROOT_MIN_BLOCK_SIZE (MALLOC_OVERHEAD + sizeof(USED_MEM) + 8)
>> #define clear_alloc_root(A) do { (A)->free= (A)->used= (A)->pre_alloc= 0; (A)->min_malloc=0;} while(0)
>> extern void init_alloc_root(MEM_ROOT *mem_root, size_t block_size,
>> -			    size_t pre_alloc_size);
>> +			    size_t pre_alloc_size, myf my_flags);
>> +extern void alloc_root_set_min_malloc(MEM_ROOT *mem_root, size_t min_malloc);

Sergei> if you'd store the flag in the lowest bit of the block_size
Sergei> or if you wouldn't try to keep the old MEM_ROOT structure,
Sergei> this function wouldn't be needed

I would still need a function to change the block size, but I agree
that this is not likely to happen.

I will change that.

>> extern void *alloc_root(MEM_ROOT *mem_root, size_t Size);
>> extern void *multi_alloc_root(MEM_ROOT *mem_root, ...);
>> extern void free_root(MEM_ROOT *root, myf MyFLAGS);
>> === modified file 'include/my_tree.h'
>> --- include/my_tree.h	2011-11-03 18:17:05 +0000
>> +++ include/my_tree.h	2013-01-15 21:00:33 +0000
>> @@ -68,13 +68,15 @@ typedef struct st_tree {
>> MEM_ROOT mem_root;
>> my_bool with_delete;
>> tree_element_free free;
>> +  myf malloc_flags;
>> uint flag;
>> } TREE;
>> 
>> /* Functions on whole tree */
>> void init_tree(TREE *tree, size_t default_alloc_size, size_t memory_limit,
>> int size, qsort_cmp2 compare, my_bool with_delete,
>> -	       tree_element_free free_element, void *custom_arg);
>> +	       tree_element_free free_element, void *custom_arg,
>> +               myf my_flags_for_malloc);

Sergei> I'd just call it my_flags. it doesn't matter what tree is doing with them.
Sergei> and there may be other flags too. E.g. you could've removed 'with_delete'
Sergei> and added MY_TREE_WITH_DELETE

Will do (both).

>> void delete_tree(TREE*);
>> void reset_tree(TREE*);
>> 
>> === modified file 'include/mysql.h'
>> --- include/mysql.h	2012-11-20 14:24:39 +0000
>> +++ include/mysql.h	2013-01-15 21:00:33 +0000
>> @@ -167,7 +167,7 @@ enum mysql_option
>> MYSQL_OPT_GUESS_CONNECTION, MYSQL_SET_CLIENT_IP, MYSQL_SECURE_AUTH,
>> MYSQL_REPORT_DATA_TRUNCATION, MYSQL_OPT_RECONNECT,
>> MYSQL_OPT_SSL_VERIFY_SERVER_CERT, MYSQL_PLUGIN_DIR, MYSQL_DEFAULT_AUTH,
>> -  MYSQL_PROGRESS_CALLBACK,
>> +  MYSQL_PROGRESS_CALLBACK, MYSQL_THREAD_SPECIFIC_MALLOC,

Sergei> I don't think you need this on the client side.

Sergei> (the only use case is federated, and I don't think you need to extend
Sergei> the client API specifically for it, federated can set whatever it needs
Sergei> directliy).

I think it's better to use a clear interface for that.
CONNECTDB will need to also use this.


>> /* MariaDB options */
>> MYSQL_OPT_NONBLOCK=6000
>> };
>> === modified file 'mysys/array.c'
>> --- mysys/array.c	2012-01-13 14:50:02 +0000
>> +++ mysys/array.c	2013-01-15 21:00:33 +0000
>> @@ -62,17 +64,19 @@ my_bool init_dynamic_array2(DYNAMIC_ARRA
>> should not throw an error
>> */
>> if (init_alloc &&
>> -      !(array->buffer= (uchar*) my_malloc(element_size*init_alloc, MYF(0))))
>> +      !(array->buffer= (uchar*) my_malloc(element_size*init_alloc,
>> +                                          MYF(my_flags))))
array-> max_element=0;
>> DBUG_RETURN(FALSE);
>> }
>> 
>> my_bool init_dynamic_array(DYNAMIC_ARRAY *array, uint element_size,
>> -                           uint init_alloc, uint alloc_increment)
>> +                           uint init_alloc, uint alloc_increment,
>> +                           myf my_flags)
>> {
>> /* placeholder to preserve ABI */
>> return my_init_dynamic_array_ci(array, element_size, init_alloc,

Sergei> You can remove this, as you're changing the ABI anyway

Will do.

>> -                                  alloc_increment);
>> +                                  alloc_increment, my_flags);
>> }
>> /*
>> Insert element at the end of array. Allocate memory if needed.
>> === modified file 'mysys/my_alloc.c'
>> --- mysys/my_alloc.c	2012-01-13 14:50:02 +0000
>> +++ mysys/my_alloc.c	2013-01-15 21:00:33 +0000
>> @@ -41,16 +48,20 @@
>> Altough error can happen during execution of this function if
>> pre_alloc_size is non-0 it won't be reported. Instead it will be
>> reported as error in first alloc_root() on this memory root.
>> +
>> +    We don't want to change the api for MEMROOT.
>> +    Becasue of this, we store in min_malloc of my_thread_specific is set.
>> */
>> 
>> void init_alloc_root(MEM_ROOT *mem_root, size_t block_size,
>> -		     size_t pre_alloc_size __attribute__((unused)))
>> +		     size_t pre_alloc_size __attribute__((unused)),
>> +                     myf my_flags)
>> {
>> DBUG_ENTER("init_alloc_root");
>> DBUG_PRINT("enter",("root: 0x%lx", (long) mem_root));
>> 
mem_root-> free= mem_root->used= mem_root->pre_alloc= 0;
>> -  mem_root->min_malloc= 32;
>> +  mem_root->min_malloc= 32 | (test(my_flags & MY_THREAD_SPECIFIC) << 16);

Sergei> it'd be safer to put it in the lowest bit. min_malloc or - particularly -
Sergei> block size, are always even numbers, the lowest bit is always 0.

Will put in block size, as agreed earlier.

Sergei> or to add a new field to the MEM_ROOT, as compatibility is already
Sergei> broken anyway.

Can't add things to MEM_ROOT as it would break client struct compatiblity.

mem_root-> block_size= block_size - ALLOC_ROOT_MIN_BLOCK_SIZE;
mem_root-> error_handler= 0;
mem_root-> block_num= 4;			/* We shift this with >>2 */
>> === modified file 'mysys/my_malloc.c'
>> --- mysys/my_malloc.c	2012-01-13 14:50:02 +0000
>> +++ mysys/my_malloc.c	2013-01-15 21:00:33 +0000
>> @@ -18,6 +18,31 @@
>> #include "mysys_err.h"
>> #include <m_string.h>
>> 
>> +MALLOC_SIZE_CB malloc_size_cb_func= NULL;
>> +
>> +/**
>> +  Inform appliction that memory usage has changed

Sergei> application

ok.

>> === modified file 'mysys/mysys_priv.h'
>> --- mysys/mysys_priv.h	2011-12-12 21:58:24 +0000
>> +++ mysys/mysys_priv.h	2013-01-15 21:00:33 +0000
>> @@ -70,12 +70,13 @@ extern PSI_file_key key_file_charset, ke
>> #endif /* HAVE_PSI_INTERFACE */
>> 
>> #ifdef SAFEMALLOC
>> -void *sf_malloc(size_t size);
>> -void *sf_realloc(void *ptr, size_t size);
>> +void *sf_malloc(size_t size, myf my_flags);
>> +void *sf_realloc(void *ptr, size_t size, myf my_flags);
>> void sf_free(void *ptr);
>> +size_t sf_malloc_usable_size(void *ptr, myf *my_flags);

Sergei> this is used nowhere. dead code again.

sf_malloc_usable_size() is used in my_malloc.c trough the
MALLOC_SIZE_AND_FLAG macro.

<cut>

>> === modified file 'mysys/safemalloc.c'
>> --- mysys/safemalloc.c	2012-04-18 01:29:26 +0000
>> +++ mysys/safemalloc.c	2013-01-15 21:00:33 +0000
>> @@ -79,11 +81,21 @@ static int bad_ptr(const char *where, vo
>> static void free_memory(void *ptr);
>> static void sf_terminate();
>> 
>> +/* Setup default call to get a thread id for the memory */
>> +
>> +my_thread_id default_sf_malloc_dbug_id(void)
>> +{
>> +  return my_thread_dbug_id();
>> +}
>> +
>> +my_thread_id (*sf_malloc_dbug_id)(void)= default_sf_malloc_dbug_id;

Sergei> Why do you need different functions for getting the thread id?

Because in mysqld we want to count memory per THD, not per thread.
One thread can allocate the THD and another one can use it.

<cut>

>> === modified file 'sql/event_scheduler.cc'
>> --- sql/event_scheduler.cc	2012-09-27 23:06:56 +0000
>> +++ sql/event_scheduler.cc	2013-01-15 21:00:33 +0000
>> @@ -182,12 +180,15 @@ deinit_event_thread(THD *thd)
>> void
>> pre_init_event_thread(THD* thd)
>> {
>> +  THD *orig_thd= current_thd;
>> DBUG_ENTER("pre_init_event_thread");
>> +
>> +  my_pthread_setspecific_ptr(THR_THD,  thd);

Sergei> I'd rather see set_current_thd(thd) here:

Sergei> #define set_current_thd(X) my_pthread_setspecific_ptr(THR_THD, (X))

I can add that.

<cut>

>> === modified file 'sql/handler.cc'
>> --- sql/handler.cc	2012-12-17 00:49:19 +0000
>> +++ sql/handler.cc	2013-01-15 21:00:33 +0000
>> @@ -5395,6 +5395,7 @@ fl_log_iterator_buffer_init(struct handl
iterator-> buffer= buff;
iterator-> next= &fl_log_iterator_next;
iterator-> destroy= &fl_log_iterator_destroy;
>> +  my_dir_end(dirp);

Sergei> Why did we never see memory leaks because of that?

This is test code that you only get when you compile with
TRANS_LOG_MGM_EXAMPLE_CODE.

>> return HA_ITERATOR_OK;
>> }
>> 
>> === modified file 'sql/mysqld.cc'
>> --- sql/mysqld.cc	2013-01-09 03:34:33 +0000
>> +++ sql/mysqld.cc	2013-01-15 21:00:33 +0000
>> @@ -2606,9 +2607,10 @@ bool one_thread_per_connection_end(THD *
>> 
>> /* It's safe to broadcast outside a lock (COND... is not deleted here) */
>> DBUG_PRINT("signal", ("Broadcasting COND_thread_count"));
>> +  mysql_cond_broadcast(&COND_thread_count);
>> +
>> DBUG_LEAVE;                                   // Must match DBUG_ENTER()
>> my_thread_end();
>> -  mysql_cond_broadcast(&COND_thread_count);

Sergei> why?

You can't all mysql_cond_broadcast(&COND_thread_count) after
my_thread_end().  This causes problems with safemutex that needs
mysys_var to work.

my_thread_end() is the last function that a thread should call.

>> pthread_exit(0);
>> return 0;                                     // Avoid compiler warnings
>> @@ -7024,6 +7096,7 @@ SHOW_VAR status_vars[]= {
>> {"Key",                      (char*) &show_default_keycache, SHOW_FUNC},
>> {"Last_query_cost",          (char*) offsetof(STATUS_VAR, last_query_cost), SHOW_DOUBLE_STATUS},
>> {"Max_used_connections",     (char*) &max_used_connections,  SHOW_LONG},
>> +  {"Memory_used",              (char*) &total_memory_used, SHOW_LONGLONG},

Sergei> May be this should rather be:
Sergei>   SHOW SESSION STATUS -> thd->memory_used
Sergei>   SHOW GLOBAL STATUS -> total_memory_used
Sergei> (which means that total_memory_used/memory_used should be in the status_var struct)

That's a good idea. I can easily fix that.

>> {"Not_flushed_delayed_rows", (char*) &delayed_rows_in_use,    SHOW_LONG_NOFLUSH},
>> {"Open_files",               (char*) &my_file_opened,         SHOW_LONG_NOFLUSH},
>> {"Open_streams",             (char*) &my_stream_opened,       SHOW_LONG_NOFLUSH},
>> === modified file 'sql/net_serv.cc'
>> --- sql/net_serv.cc	2012-02-28 17:53:05 +0000
>> +++ sql/net_serv.cc	2013-01-15 21:00:33 +0000
>> @@ -139,6 +140,7 @@ my_bool my_net_init(NET *net, Vio* vio)
net-> net_skip_rest_factor= 0;
net-> last_errno=0;
net-> unused= 0;
>> +  net->thread_specific_malloc= test(my_flags & MY_THREAD_SPECIFIC);

Sergei> why do you need that in NET ?

As NET is almost always thread specific, except in a few cases.
I need to be able to remember the original flag for future mallocs
done by NET.


>> === modified file 'sql/sql_analyse.h'
>> --- sql/sql_analyse.h	2011-11-03 18:17:05 +0000
>> +++ sql/sql_analyse.h	2013-01-15 21:00:33 +0000
>> @@ -121,7 +121,8 @@ class field_str :public field_info
>> must_be_blob(0), was_zero_fill(0),
>> was_maybe_zerofill(0), can_be_still_num(1)
>> { init_tree(&tree, 0, 0, sizeof(String), (qsort_cmp2) sortcmp2,
>> -		0, (tree_element_free) free_string, NULL); };
>> +		0, (tree_element_free) free_string, NULL,
>> +                MY_THREAD_SPECIFIC); };

Sergei> Now, when I'm thinking of it, I got the idea that it's better to have
Sergei> MY_THREAD_SPECIFIC as a default, and add MY_GLOBAL_ALLOC instead to mark
Sergei> global allocations explicitly. It's more future-proof.

This would cause more problems:
- There are many more global mallocs than thread specific.
- Allocating things globally will never cause a problem; You can allocate
  that in one thread and free it in another.
- Using MY_THREAD_SPECIFIC will lead to crashes if you do things
  wrongly.

In other words, it doesn't matter much if you forget
MY_THREAD_SPECIFIC. If it would be the other way around, you would
likely get a lot of crashes both in current code an if you do a wrong
merge from MySQL and forget to add MY_GLOBAL_ALLOC.

Sergei> When one copies 
Sergei> init_dynamic_array() from somewhere and zeros out unknown or unneeded
Sergei> parameters, the memory will be per-thread by default. And if that's
Sergei> wrong, the test suite will find it, you have an assert for this.
Sergei> To mark memory global as an explicit action, one needs to think about it.

It's not clear when memory can be marked as MY_THREAD_SPECIFIC. I
spent a LOT of time finding all the safe places.

Doing it the other way around would cause people to work much more as
they would get crashes they can't explain. Also finding the missing
malloc is not trivial as valgrind is of no help here.

Sergei> Currently, the memory is global by default, and if one doesn't think,#
Sergei> copy-pastes, and leaves new allocation global, it will not be noticed,
Sergei> the memory will simply be unaccounted for.

It will be in the global pool, not in the thread pool. This is not
optimal, but also not fatal.  The other way around would cause a lot
of grief from developers that doesn't understand why things crashes...

>> 
>> void	 add();
>> void	 get_opt_type(String*, ha_rows);
>> === modified file 'sql/sql_error.cc'
>> --- sql/sql_error.cc	2012-01-13 14:50:02 +0000
>> +++ sql/sql_error.cc	2013-01-15 21:00:33 +0000
>> @@ -457,25 +457,37 @@ Diagnostics_area::disable_status()
>> m_status= DA_DISABLED;
>> }
>> 
>> -Warning_info::Warning_info(ulonglong warn_id_arg, bool allow_unlimited_warnings)
>> +Warning_info::Warning_info(ulonglong warn_id_arg,
>> +                           bool allow_unlimited_warnings, bool initialize)

Sergei> I'd rather add a special constructor for Warning_info
Sergei> that THD would use. Because there's only one place where you need
Sergei> initialize=false, and all other "normal" uses, should not suffer (be
Sergei> changed) because of that.

I can do that. thanks for the idea.

Regards,
Monty


Follow ups

References