yade-dev team mailing list archive

Re: multicore speed / threads issues

To: Václav Šmilauer <eudoxos@xxxxxxxx>
From: Janek Kozicki <janek_listy@xxxxx>
Date: Fri, 30 Apr 2010 17:11:17 +0200
Cc: yade-dev@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1272568681.2326.16.camel@flux>

Václav Šmilauer said:     (by the date of Thu, 29 Apr 2010 21:18:01 +0200)

> That is quite possible. I also had best results with 5 cores on
> Opterons, but Xeons were getting faster up to 8 cores (about 5.5x).
> Depends on memory bus and other stuff I don't understand that much.
> 
> The jumping speed could also be explained by the GUI thread being blocked
> by computation, so it updates only very rarely and shows nonsense values.
> 
> Note that there are still some global locks (such as when creating
> interaction) and non-parallel parts like the collider.
> 
> See https://www.yade-dem.org/wiki/Triaxial_Test_Parallel and
> https://www.yade-dem.org/wiki/Performance_Tuning, in most cases 3-4
> cores give the best performance.

I see. How about using this library to avoid global locking?

http://www.chaoticmind.net/~hcb/projects/boost.atomic/

It seems, however, that currently only fundamental types are supported.
I have asked the author about that. If you remove 'Something' from the
attached test file, it works nicely.

-- 
Janek Kozicki                               http://janek.kozicki.pl/  |

#include<iostream>
#include<boost/thread.hpp>
#include<boost/bind.hpp>
#include<boost/atomic.hpp>
#include<vector>

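// User-defined type to be wrapped in boost::atomic<>.
class Something
{
	private:
		int val;
	public:
		Something():val(0){}	// initialize val so it is never read uninitialized
		Something(int i):val(i){}
		void set_val(int i)	{val=i;}
		int get_val() const	{return val;}
};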

class Numbers
{
	public:
		boost::atomic<int> sum,t1,t2;
		Numbers():sum(0),t1(0),t2(0){};
};

class Thread1
{
	public:
		void run(Numbers& c)
		{
			int i(100000);
			while(i-->0)
				++c.sum,c.t1++;
		}
};

class Thread2
{
	public:
		void run(Numbers& c)
		{
			int i(100000);
			while(i-->0)
				c.sum++,c.t2++;
		}
};

class Display
{
	public:
		void run(Numbers& n)
		{
			int i(103);
			while(i-->0)
			{
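				// Take one snapshot of each counter and print the snapshot, so the
				// printed values are the ones used in the SUM==sum check. The three
				// loads are separate, so the check may still print 0 while the
				// writer threads are running.
				int sum(n.sum);
				int t1(n.t1);
				int t2(n.t2);
				int SUM(t1+t2);
				std::cout << sum << " " << t1 << " " << t2 << " .. " << SUM << " " << (int)(SUM==sum) << "\n";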
			}
		}
};

int main(int argc, char** argv)
{
	Numbers num;
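	// Atomic over the user-defined type Something: this is the part that
	// has to be removed for the test to work with fundamental types only.
	std::vector<boost::atomic<Something> > v;
	v.resize(100);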
	Thread1 t1;
	Thread2 t2;
	Display disp;

	boost::thread_group run;
	run.create_thread(boost::bind(&Thread1::run,boost::ref(t1),  boost::ref(num)));
	run.create_thread(boost::bind(&Thread2::run,boost::ref(t2),  boost::ref(num)));
	run.create_thread(boost::bind(&Display::run,boost::ref(disp),boost::ref(num)));
	run.join_all();
}

Attachment: makefile
Description: Binary data
