Bruno Chareyre proposed the following answer:

I am not able to answer either.

I can tell you that typical Yade performance on a typical desktop is on the order of 0.5e6 particle*iteration/second.
This "particle*iteration/second" value is sometimes called "Cundall's number", and it is a good way to evaluate performance.
Unfortunately, I have not been able to find that value in the literature for any other code.
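
For anyone who wants to measure this on their own runs, here is a minimal sketch of the arithmetic. It only assumes you know the particle count, the number of iterations and the elapsed wall-clock time. The function names and the example figures are hypothetical, made up for illustration; they are not part of Yade.

```python
def cundall_number(n_particles, n_iterations, wall_seconds):
    """Return particle*iteration/second for a finished run."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return n_particles * n_iterations / wall_seconds


def estimated_runtime(n_particles, n_iterations, cundall):
    """Estimate wall-clock seconds for a planned run at a given Cundall number."""
    if cundall <= 0:
        raise ValueError("cundall must be positive")
    return n_particles * n_iterations / cundall


# Hypothetical finished run: 10,000 particles, 5,000 iterations, 100 s of wall time.
print(cundall_number(10_000, 5_000, 100.0))

# Hypothetical planned run at the 0.5e6 figure quoted above:
# 100,000 particles for 1,000,000 iterations, result in seconds.
print(estimated_runtime(100_000, 1_000_000, 0.5e6))
```

The same number can be used in reverse, as in the second call, to get a rough idea of how long a planned simulation would take before launching it.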

I agree with Robert that some highly specialized codes can probably
achieve better performance (imagine you have just one contact model and
you optimize an entire GPU code for that unique model). That's just a
guess though.


