maria-developers team mailing list archive, Message #07947
Re: MDEV-7004 review
Hi Sergey,
thanks for looking into this!
On Wed, Nov 26, 2014 at 08:00:44PM +0100, Sergei Golubchik wrote:
> Hi, Sergey
>
> I didn't review MDEV-7200 (InnoDB) and patches that are already in 10.1.
The patch for MDEV-7200 is not good to go into 10.1 as is anyway. It is here just
as a reminder that we can gain another 2% with a simple fix.
Some of my patches depend on Monty's patches that haven't actually been pushed
to 10.1 yet. I'll hold those back.
Further comments inline.
> Almost everything else was fine, see few comments below.
>
> > ------------------------------------------------------------
> > revno: 4513
> > committer: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> > branch nick: 10.0-power
> > timestamp: Fri 2014-11-07 22:03:18 +0400
> > message:
> > Multi-rwlock lock for table definition cache hash.
> > added:
> > modified:
> > diff:
> > === modified file 'sql/table_cache.cc'
> > --- sql/table_cache.cc 2014-10-21 12:20:39 +0000
> > +++ sql/table_cache.cc 2014-11-07 18:03:18 +0000
> > @@ -53,6 +53,77 @@
> > #include "table.h"
> > #include "sql_base.h"
> >
> > +
> > +/**
> > + Table definition hash lock.
> > +
> > + Lock consists of multiple rw-locks (8 by default). Lock is write locked
> > + when all instances are write locked. Lock is read locked when any instance
> > + is read locked.
> > +
> > + This class was implemented to workaround scalability bottleneck of table
> > + definition hash rw-lock.
> > +
> > + To be replaced with lock-free hash when we understand how to solve the
> > + following problems:
> > + - lock-free hash iterator (can be workarounded by introducing global
> > + all_tables list);
>
>
> I don't particularly like that solution.
> Could you try to switch to lock-free hash now?
>
> I'll try to do the iterator asap
This is the change that allowed us to break through 27kTPS and reach 33kTPS. I
don't like it either. My raw lf-hash prototype shows similar results.
I think we badly need this bottleneck solved in 10.1 one way or another. So my
question is: do you think we will be able to implement all of this in time for
10.1?
>
> > + - lf_hash_search() may fail because of OOM. It is rather complex to handle
> > + it properly.
> > +*/
> > +
> > +class TDC_lock
> > +{
> > + uint m_instances;
> > + bool m_write_locked;
> > + char pad[128];
Interesting, why did I add the above padding... probably a leftover from an
intermediate patch. I will remove it.
> > + struct st_locks
> > + {
> > + mysql_rwlock_t lock;
> > + char pad[128];
>
> In these cases I usually do: pad[128 - sizeof(mysql_rwlock_t)];
Hmm... strange that I didn't define a macro for this; I thought I did.
Your suggestion guarantees that st_locks will be at least 128 bytes. But those
128 bytes may still be split across 2 cache lines. Or does malloc return a
pointer aligned to the cache line size?
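To illustrate the two concerns (a sketch only, with std::shared_mutex as a hypothetical stand-in for mysql_rwlock_t): sizing the pad makes the struct at least one cache line, but only an explicit alignment guarantee keeps each element from straddling two lines, since plain malloc typically aligns to 16 bytes, not 128:

```cpp
#include <shared_mutex>

// (a) Sergei's suggestion: size the pad so the struct occupies >= 128 bytes.
struct st_locks_padded
{
  std::shared_mutex lock;                     // stand-in for mysql_rwlock_t
  char pad[128 - sizeof(std::shared_mutex)];  // struct is at least 128 bytes
};

// (b) Without an alignment guarantee from the allocator, those 128 bytes can
//     still straddle two cache lines. alignas(128) makes every object (and
//     every array element) start on a 128-byte boundary; C++17 operator new
//     honors over-aligned types, so heap allocation is covered too.
struct alignas(128) st_locks_aligned
{
  std::shared_mutex lock;
};

static_assert(sizeof(st_locks_padded) >= 128, "pad computed too small");
static_assert(alignof(st_locks_aligned) == 128, "element not line-aligned");
```

The padded variant fixes the size, the aligned variant fixes the placement; in a real patch one would likely want both, with 128 replaced by the detected cache-line size.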
>
> > + } *locks;
> > +public:
> > + int init(PSI_rwlock_key key, uint instances)
> > + {
> > + m_instances= instances;
> > + m_write_locked= false;
> > + locks= (st_locks *) my_malloc(sizeof(struct st_locks) * m_instances, MYF(MY_WME));
> > + if (!locks)
> > + return 1;
> > + for (uint i= 0; i < m_instances; i++)
> > + mysql_rwlock_init(key, &locks[i].lock);
> > + return 0;
> > + }
> > + void destroy(void)
> > + {
> > + for (uint i= 0; i < m_instances; i++)
> > + mysql_rwlock_destroy(&locks[i].lock);
> > + my_free(locks);
> > + }
> > + void wrlock(void)
> > + {
> > + for (uint i= 0; i < m_instances; i++)
> > + mysql_rwlock_wrlock(&locks[i].lock);
> > + m_write_locked= true;
> > + }
> > + void rdlock(THD *thd= 0)
> > + {
> > + mysql_rwlock_rdlock(&locks[thd ? thd->thread_id % m_instances : 0].lock);
> > + }
> > + void unlock(THD *thd= 0)
> > + {
> > + if (m_write_locked)
> > + {
> > + m_write_locked= false;
> > + for (uint i= 0; i < m_instances; i++)
> > + mysql_rwlock_unlock(&locks[i].lock);
> > + }
> > + else
> > + mysql_rwlock_unlock(&locks[thd ? thd->thread_id % m_instances : 0].lock);
> > + }
> > +};
> > +
> > +
> > /** Configuration. */
> > ulong tdc_size; /**< Table definition cache threshold for LRU eviction. */
> > ulong tc_size; /**< Table cache threshold for LRU eviction. */
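For readers following along, the idea behind the quoted TDC_lock (readers lock only the instance picked by their thread id, a writer locks every instance, so a write lock excludes all readers) can be sketched in portable C++. This is a minimal illustration, not the patch itself; std::shared_mutex stands in for mysql_rwlock_t and the m_write_locked bookkeeping is omitted:

```cpp
#include <shared_mutex>
#include <vector>

// Partitioned rw-lock: N independent rw-locks. A reader takes a shared lock
// on one instance (chosen by thread id), so readers on different instances
// never contend on the same cache line. A writer takes an exclusive lock on
// every instance, which excludes all readers and other writers.
class PartitionedRWLock
{
  std::vector<std::shared_mutex> m_locks;
public:
  explicit PartitionedRWLock(unsigned instances) : m_locks(instances) {}

  unsigned instances() const { return (unsigned) m_locks.size(); }

  void rdlock(unsigned thread_id)
  { m_locks[thread_id % m_locks.size()].lock_shared(); }

  void rdunlock(unsigned thread_id)
  { m_locks[thread_id % m_locks.size()].unlock_shared(); }

  void wrlock()
  { for (auto &l : m_locks) l.lock(); }

  void wrunlock()
  { for (auto &l : m_locks) l.unlock(); }
};
```

The trade-off is the classic one for partitioned locks: reads scale with the instance count, while writes become proportionally more expensive, which is acceptable here because write-locking the table definition hash is rare.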
> > ------------------------------------------------------------
> > revno: 4512
> > committer: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> > branch nick: 10.0-power
> > timestamp: Fri 2014-10-31 15:44:36 +0400
> > message:
> > Preallocate locks on THD mem_root to avoid expensive malloc.
> > modified:
> > diff:
> > === modified file 'sql/lock.cc'
> > --- sql/lock.cc 2014-09-30 17:31:14 +0000
> > +++ sql/lock.cc 2014-10-31 11:44:36 +0000
> > @@ -700,7 +699,7 @@ MYSQL_LOCK *get_lock_data(THD *thd, TABL
> > TABLE **to, **table_buf;
> > DBUG_ENTER("get_lock_data");
> >
> > - DBUG_ASSERT((flags == GET_LOCK_UNLOCK) || (flags == GET_LOCK_STORE_LOCKS));
> > + DBUG_ASSERT((flags & GET_LOCK_UNLOCK) || (flags & GET_LOCK_STORE_LOCKS));
>
> Not the same, old assert also verified that these flags aren't set both.
> New assert doesn't.
> But I don't think it matters much.
You're right, I'll change this: the stricter, the better.
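One way to keep the new `&`-based flexibility while restoring the exclusivity the old equality check enforced is an exactly-one-of-two test. A hedged sketch with hypothetical flag values standing in for the real GET_LOCK_* constants:

```cpp
// Hypothetical flag values standing in for GET_LOCK_UNLOCK and
// GET_LOCK_STORE_LOCKS; the real constants live in sql/lock.h.
enum { GET_LOCK_UNLOCK = 1 << 0, GET_LOCK_STORE_LOCKS = 1 << 1 };

// True iff exactly one of the two flags is set. Unlike the old
// equality-based assert, extra unrelated bits in 'flags' are still allowed;
// unlike the relaxed '&'-based assert, setting both flags is rejected.
static bool exactly_one_lock_flag(unsigned flags)
{
  return ((flags & GET_LOCK_UNLOCK) != 0) != ((flags & GET_LOCK_STORE_LOCKS) != 0);
}
```

The corresponding assert would then read `DBUG_ASSERT(exactly_one_lock_flag(flags));` (or the boolean expression inlined).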
>
> > DBUG_PRINT("info", ("count %d", count));
> >
> > for (i=lock_count=table_count=0 ; i < count ; i++)
> > ------------------------------------------------------------
> > revno: 4508
> > committer: Sergey Vojtovich <svoj@xxxxxxxxxxx>
> > branch nick: 10.0
> > timestamp: Tue 2014-10-21 16:20:39 +0400
> > message:
> > modified:
> > diff:
> > === modified file 'sql/handler.cc'
> > --- sql/handler.cc 2014-11-13 10:01:31 +0000
> > +++ sql/handler.cc 2014-10-21 12:20:39 +0000
> > @@ -410,13 +410,16 @@ static int hton_ext_based_table_discover
> > static void update_discovery_counters(handlerton *hton, int val)
> > {
> > if (hton->discover_table_existence == full_discover_for_existence)
> > - my_atomic_add32(&need_full_discover_for_existence, val);
> > + my_atomic_add32_explicit(&need_full_discover_for_existence, val,
> > + MY_MEMORY_ORDER_RELAXED);
> >
> > if (hton->discover_table_names)
> > - my_atomic_add32(&engines_with_discover_table_names, val);
> > + my_atomic_add32_explicit(&engines_with_discover_table_names, val,
> > + MY_MEMORY_ORDER_RELAXED);
> >
> > if (hton->discover_table)
> > - my_atomic_add32(&engines_with_discover, val);
> > + my_atomic_add32_explicit(&engines_with_discover, val,
> > + MY_MEMORY_ORDER_RELAXED);
>
> I wouldn't do that. Technically there's a dependency between these counters
> and loaded engines, so the order, kind of, matters. I don't think there
> is any practical risk of bugs here, but, on the other hand, these
> counters are only updated when an engine is loaded or unloaded,
> so there's no need to bother with the relaxed memory ordering here.
Ok, I'll revert this.
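For context, the distinction being discussed maps directly onto std::atomic semantics (shown here with std::atomic rather than the my_atomic wrappers): the plain form is sequentially consistent, while the relaxed form only guarantees atomicity of the counter itself, with no ordering relative to surrounding loads and stores:

```cpp
#include <atomic>

// Stand-in for a discovery counter such as engines_with_discover.
std::atomic<int> engines_with_discover{0};

// Analogue of my_atomic_add32(): default ordering is memory_order_seq_cst,
// so the update is globally ordered with respect to other seq_cst operations
// (e.g. the engine load/unload that this counter tracks).
void update_counter_seq_cst(int val)
{
  engines_with_discover.fetch_add(val);
}

// Analogue of my_atomic_add32_explicit(..., MY_MEMORY_ORDER_RELAXED): the
// increment is still atomic, but imposes no ordering on neighboring memory
// accesses, which is why it is cheaper on weakly ordered CPUs like POWER.
void update_counter_relaxed(int val)
{
  engines_with_discover.fetch_add(val, std::memory_order_relaxed);
}
```

Sergei's point stands either way: these counters change only on engine load/unload, so the relaxed form buys nothing on this path, and reverting keeps the implicit ordering with engine registration.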
>
> > }
> >
> > int ha_finalize_handlerton(st_plugin_int *plugin)
> >
> > === modified file 'sql/mysqld.cc'
> > --- sql/mysqld.cc 2014-11-18 21:25:47 +0000
> > +++ sql/mysqld.cc 2014-10-21 12:20:39 +0000
> > @@ -3884,7 +3884,7 @@ static void my_malloc_size_cb_func(long
> > }
> > // workaround for gcc 4.2.4-1ubuntu4 -fPIE (from DEB_BUILD_HARDENING=1)
> > int64 volatile * volatile ptr=&global_status_var.memory_used;
> > - my_atomic_add64(ptr, size);
> > + my_atomic_add64_explicit(ptr, size, MY_MEMORY_ORDER_RELAXED);
>
> good!
>
> > }
> > }
> >
> >
> > === modified file 'sql/mysqld.h'
> > --- sql/mysqld.h 2014-11-13 08:56:28 +0000
> > +++ sql/mysqld.h 2014-10-21 12:20:39 +0000
> > @@ -629,7 +629,7 @@ inline __attribute__((warn_unused_result
> > {
> > query_id_t id;
> > my_atomic_rwlock_wrlock(&global_query_id_lock);
> > - id= my_atomic_add64(&global_query_id, 1);
> > + id= my_atomic_add64_explicit(&global_query_id, 1, MY_MEMORY_ORDER_RELAXED);
>
> I believe this is fine
>
> > my_atomic_rwlock_wrunlock(&global_query_id_lock);
> > return (id);
> > }
> > === modified file 'sql/table_cache.cc'
> > --- sql/table_cache.cc 2014-06-10 18:20:33 +0000
> > +++ sql/table_cache.cc 2014-10-21 12:20:39 +0000
> > @@ -139,7 +139,7 @@ uint tc_records(void)
> > {
> > uint count;
> > my_atomic_rwlock_rdlock(&LOCK_tdc_atomics);
> > - count= my_atomic_load32(&tc_count);
> > + count= my_atomic_load32_explicit(&tc_count, MY_MEMORY_ORDER_RELAXED);
>
> I'm not sure about this file. You should know better whether
> it's safe to use relaxed memory ordering here.
It should be OK. This counter does not need strict memory ordering
constraints.
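A short sketch of why a relaxed load suffices for a statistic like tc_count (using std::atomic as a stand-in for the my_atomic wrappers): nothing is published "through" the counter, so no acquire/release pairing is needed, only atomicity of the read itself:

```cpp
#include <atomic>

// Stand-in for tc_count: a free-standing statistic read for monitoring.
std::atomic<unsigned> tc_count_demo{0};

// Analogue of my_atomic_load32_explicit(&tc_count, MY_MEMORY_ORDER_RELAXED).
// The returned value may be momentarily stale relative to concurrent
// increments, which is acceptable for a cache-size report; the load itself
// is still atomic (never a torn value).
unsigned tc_records_demo()
{
  return tc_count_demo.load(std::memory_order_relaxed);
}
```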
Thanks,
Sergey