dolfin team mailing list archive

Re: Extending DOLFIN CMake scripts for CUDA interoperability

To: johan.hake@xxxxxxxxx
From: Florian Rathgeber <florian.rathgeber@xxxxxxxxxxxxxx>
Date: Fri, 28 Jan 2011 15:18:02 +0000
Cc: Florian Rathgeber <florian.rathgeber@xxxxxxxxxxxxxx>, dolfin@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201101260852.13331.johan.hake@gmail.com>
Openpgp: id=C72D0316
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.6.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Great that you're interested in leveraging my work. Clearly GPU computing
requires quite fundamental changes that will not make it to the trunk
anytime soon. And as you've rightly pointed out, it's not sufficient to
change DOLFIN alone; UFC and FFC need changes too, and these changes need
to be coordinated.

There actually is a blueprint about UFC support for many-core
architectures which I commented on:
https://blueprints.launchpad.net/ufc/+spec/ufc-many-core

I'd welcome others' comments on this and I can try to update it with the
latest findings. If there are any other movements or plans in this
direction in the FEniCS community, I'm happy to discuss issues.

Florian

On 26/01/11 16:52, Johan Hake wrote:
> On Wednesday January 26 2011 05:39:05 Florian Rathgeber wrote:
>> Johan,
>>
>> Glad you're interested! That FFC branch is actually already published:
>> lp:~florian-rathgeber/ffc/gpu-wrappers
> 
> Ok cool!
> 
>> It is rather a dirty hack, though, and I'd be happy about suggestions on
>> how to improve its design.
> 
> It would be cool to try to leverage your efforts into the FEniCS project in
> one way or another. Unfortunately, experience with GPU coding is limited
> among the core developers. We are catching up on distributed and shared
> memory parallelism, but we still lack functionality here.
> 
> Reading your suggestions, it also seems as if we would need to make changes in
> many places (ufc, ffc and dolfin) to get it going. Such changes are always
> painful and it is not obvious where they best fit.
> 
> I am not an ffc wiz, so I would probably not be able to tell what should go
> where. But generally speaking, ffc only deals with ufc code generation.
> Code for dolfin wrappers is placed in dolfin/site-packages/dolfin_tools. But
> I guess this has not always been the case, especially not during development ;)
> 
> Depending on other efforts on gpu integration in FEniCS it might be a good 
> time to start scribbling on a Blueprint? Maybe you have time to sketch some 
> thoughts from your experience?
> 
> 
> Johan
> 
>> Florian
>>
>> On 22/01/11 08:49, Johan Hake wrote:
>>> Florian!
>>>
>>> Thanks for sharing your Master's thesis. It was informative reading! You
>>> mention that you modified FFC too. Will this branch also be published?
>>>
>>> Johan
>>>
>>> On Friday January 21 2011 02:34:43 Florian Rathgeber wrote:
>>>> There is no master plan I know of. For my MSc project with Johan Jansson
>>>> at KTH last year I implemented GPU assembly and solve using DOLFIN.
>>>> There are 2 backends: a native CUDA one and one using the cusp library
>>>> from NVIDIA. I'm currently trying to get my code in shape and working
>>>> with the current dolfin-dev to publish in a branch on launchpad
>>>> (lp:~florian-rathgeber/dolfin/gpu-backend). The code currently in there
>>>> is broken; I had pushed it to show the CMake problems I was seeing.
>>>> Hopefully I can push something working soon.
>>>>
>>>> If you want to read up on the background my MSc thesis is probably a
>>>> good start:
>>>> http://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2010/rapporter10/rathgeber_florian_10106.pdf
>>>>
>>>> I don't know how much time I will have to maintain this, but I thought
>>>> it would be useful to have something out for people to play with and
>>>> show there is work in this direction in the FEniCS community.
>>>>
>>>> Florian
>>>>
>>>> On 20/01/11 22:47, Anders Logg wrote:
>>>>> On Thu, Jan 20, 2011 at 11:03:51PM +0100, Marie E. Rognes wrote:
>>>>>> On 01/20/2011 10:10 PM, Johan Hake wrote:
>>>>>>     Florian!
>>>>>>     
>>>>>>     Out of curiosity, are you planning to implement GPU assembly too?
>>>>>>     To me it looked like your code "only" exploited solve on the GPU.
>>>>>>     
>>>>>>     I guess GPU assembly is even more parallelizable than the solving
>>>>>>     process. At least if you settle with gathering the element
>>>>>>     matrices in parallel and then fanning them out in some sort of
>>>>>>     serial operation. In this way you miss the possibility to solve on
>>>>>>     the GPU, which I guess you are exploiting.
>>>>>>
>>>>>> I'm really interested too in hearing more about your plans!
>>>>>>
>>>>>> There have been quite a few mentions of "doing stuff on
>>>>>> GPUs" from different parts of the FEniCS community over the last year
>>>>>> or so. Is there a master plan out there somewhere?
>>>>>
>>>>> I don't think there's a master plan (yet), but many are interested and
>>>>> the group at Imperial have been working on it for some time. It would
>>>>> be interesting to hear more about the progress.
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~dolfin
>>>>> Post to     : dolfin@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~dolfin
>>>>> More help   : https://help.launchpad.net/ListHelp
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAk1C3ioACgkQ8Z6llsctAxb6EQCfR/twUjuJ0NoGfBuDp92ErfyL
degAoNbGDQiihRF0HlgWxIKJLCz+60QG
=+Xu5
-----END PGP SIGNATURE-----

References

Extending DOLFIN CMake scripts for CUDA interoperability
From: Florian Rathgeber, 2011-01-14
Re: Extending DOLFIN CMake scripts for CUDA interoperability
From: Johan Hake, 2011-01-22
Re: Extending DOLFIN CMake scripts for CUDA interoperability
From: Florian Rathgeber, 2011-01-26
Re: Extending DOLFIN CMake scripts for CUDA interoperability
From: Johan Hake, 2011-01-26