schooltool-developers team mailing list archive


Re: Down the rabbit hole, profiling your Python code - Remco Wendt

To: Tom Hoffman <tom.hoffman@xxxxxxxxx>
From: Kit BLAKE <kitblake@xxxxxxxxxx>
Date: Thu, 19 Apr 2012 17:06:23 +0200
Cc: SchoolTool Developers <schooltool-developers@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <bcaec51d255e5f972904bdfb43ed@google.com>

Remco posted a link to his slides on the python-nl list:

Hello all,

Thanks again for an interesting and inspiring Django Meetup. You can find the slides of my presentation 'Down the rabbit hole, profiling your Python code' on slideshare: http://www.slideshare.net/sshanx/down-the-rabbit-hole-profiling-in-django/. Of course Reinout also made notes, which you can find here: http://reinout.vanrees.org/weblog/2012/04/18/profiling-python.html

Next meetup will be the 4th of July. Hope to see you all there! And do remember that anything interesting you discover, the classical A-HA erlebnis, could be great material for a lightning talk!

Also worth noting: on May 11th is the Pygrunn conference in Groningen. Featuring Michael Bayer (author of sqlalchemy) as keynote speaker and also Armin Ronacher, see: http://www.pygrunn.nl/ I'm attending and speaking, so hope to see you all there!


Cheers,
Remco
--
Maykin Media
Herengracht 416, 1017 BZ Amsterdam
tel.: +31 (0)20 753 05 23
mob.: +31 (0)6 187 967 06
http://www.maykinmedia.nl


On 19 Apr 2012, at 00:06, Tom Hoffman wrote:

Sent to you by Tom Hoffman via Google Reader:


Down the rabbit hole, profiling your Python code - Remco Wendt
via Reinout van Rees' weblog by Reinout van Rees on 4/18/12

(Talk at the April 2012 Dutch Django meeting)
There's a lot happening between an incoming request and an outgoing response. Part of it is your code, part is in libraries. You don't care about most of those parts, as you probably mostly care about the resulting end product for the customer.
There is a lot of interest in scaling, but not so much in profiling your performance. Profiling means running your code in such a way that Python's interpreter gathers statistics on all the calls you make. This has a huge performance impact, so don't use it in production. But it gives you invaluable data on what's actually happening in your code.
The most interesting thing about profiling is the low hanging fruit. Often there are two or three expensive functions that you can easily improve: with a limited effort you get a lot of extra performance. It is not effective to focus on a hard problem that you can only improve by 2%.
Python has lots of tools. The most well-known is cProfile (profile is not that good; hotshot seems deprecated). Line profiler looks at the number of times a line is executed.
Run cProfile like this:

import cProfile
cProfile.run('your_method()')
An alternative is:

python -m cProfile -o your_script.profile your_script.py
With that -o option you get an output file that you can run through Python's pstats to get the actual statistics.
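
The pstats step can be sketched as follows. This is a minimal example assuming Python 3; slow_sum and profile_call are hypothetical names made up for illustration, not something from the talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum of squares in a plain loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.strip_dirs().sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 100000)
    print(report)
```

For a file written with -o, passing the filename to pstats.Stats (for example pstats.Stats("your_script.profile")) loads the same statistics from disk.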
A very handy visualizer is RunSnakeRun, which displays the profiling information as a "tree map". An alternative is kcachegrind, but you need to call pyprof2calltree to convert Python's profiling information to kcachegrind's format.
What to look for:
• Things you didn't expect. Perhaps you spot something sub-optimal or strange that needs investigating.
• A lot of time spent in just one single function. This is possible low-hanging fruit.
• Lots of calls to the same function.
Some things you can do to improve your performance:
• Caching.
• Get stuff out of inner loops.
• Remove logging. And especially watch out when logging database objects in Django: your object's __unicode__() might call more than you want, like self.parent.xyz...
Regarding debug logging: you can make the calls conditional with if __debug__:. Running python with -O optimizes them away.
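
A rough sketch of these three ideas together, assuming Python 3. The functions expensive_lookup and scale_all are hypothetical, and functools.lru_cache is just one possible way to cache, not necessarily what the talk had in mind.

```python
import functools
import logging

logger = logging.getLogger(__name__)


@functools.lru_cache(maxsize=None)
def expensive_lookup(key):
    """Hypothetical slow function; results are cached per key."""
    return sum(ord(ch) for ch in key) ** 2


def scale_all(values, factor_key):
    # Hoisted out of the loop: the factor does not change per item.
    factor = expensive_lookup(factor_key)
    scaled = []
    for value in values:
        if __debug__:
            # This block is compiled away when Python runs with -O.
            logger.debug("scaling %r by %r", value, factor)
        scaled.append(value * factor)
    return scaled


if __name__ == "__main__":
    print(scale_all([1, 2, 3], "a"))
    print(expensive_lookup.cache_info())
```

Passing the objects as arguments to logger.debug, instead of formatting the string yourself, also means they are only converted to text when the message is actually emitted.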
Apart from code profiling (cpu/IO) there's also memory profiling for looking at memory usage. Small note on Django: it has an (intended) memory leak in debug mode (the query cache). No real problem, but keep it in mind when doing memory profiling.
Tools for memory profiling: heapy and meliae. Meliae is nice as you can run it on your server (ahem) and then copy the dump to your local machine for evaluation with, again, RunSnakeRun.
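
To give an idea of what memory profiling looks like in code, here is a small sketch using tracemalloc from the standard library (Python 3.4 and later). Note that this is a different tool from the heapy and meliae mentioned above, swapped in only because it needs no extra install. build_cache and top_allocations are hypothetical names.

```python
import tracemalloc


def build_cache(n):
    """Hypothetical allocation-heavy function."""
    return [str(i) * 10 for i in range(n)]


def top_allocations(func, *args, limit=5):
    """Run func and return (result, biggest allocation changes by line)."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = func(*args)
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # Differences in allocated memory, grouped by source line.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]


if __name__ == "__main__":
    data, stats = top_allocations(build_cache, 10000)
    for stat in stats:
        print(stat)
```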
Profiling is all good and fun, but the environment is different on your production server. How to do profiling there? You might have one of several wsgi processes that runs in profiling mode, for instance, with a load balancer that only trickles a few requests to that single wsgi process.
Or you can use Boaz Leskes' pycounters, "instrumenting production code".
To close off: you should know about this. It should be part of your professional toolkit. And... it should be in IDEs. Several of them already have it. Komodo already has it, but what about PyCharm? Remco hopes that this blog entry sparks IDE vendors into action when needed :-)
Some input from the questions:
• Django has profiling middleware that you can switch on for a specific request with a GET parameter.
• There's WSGI middleware (like dozer).
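
The idea of switching profiling on per request with a GET parameter can be sketched as plain WSGI middleware. This is not the Django middleware or dozer itself; ProfileOnDemand, the "profile" parameter name and hello_app are all assumptions made for illustration, and the sketch assumes Python 3.

```python
import cProfile
import io
import pstats
from urllib.parse import parse_qs


class ProfileOnDemand:
    """WSGI middleware: profile a request only when ?profile=1 is present."""

    def __init__(self, app, param="profile"):
        self.app = app
        self.param = param

    def __call__(self, environ, start_response):
        query = parse_qs(environ.get("QUERY_STRING", ""))
        if self.param not in query:
            # Normal requests pass straight through, unprofiled.
            return self.app(environ, start_response)

        def discard_start_response(status, headers, exc_info=None):
            # The wrapped app's own status and headers are dropped;
            # the profiling report is returned instead.
            return lambda data: None

        def run_app():
            # Exhaust the body inside the profiler so lazy iterables count.
            return list(self.app(environ, discard_start_response))

        profiler = cProfile.Profile()
        profiler.runcall(run_app)

        buffer = io.StringIO()
        stats = pstats.Stats(profiler, stream=buffer)
        stats.sort_stats("cumulative").print_stats(20)
        body = buffer.getvalue().encode("utf-8")
        start_response("200 OK", [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ])
        return [body]


def hello_app(environ, start_response):
    """Hypothetical application to wrap."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


application = ProfileOnDemand(hello_app)
```

A request to /?profile=1 then returns the profiler report as plain text, while every other request is served as usual. On a real site you would want to restrict who can trigger this.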






--
Gauss - The People Magnet  http://getGauss.com/

References

Down the rabbit hole, profiling your Python code - Remco Wendt
From: Tom Hoffman, 2012-04-18