Django profiler

November 10, 2007 [last comment: November 19, 2007]

I wrote a simple script that can be used to profile any Django-powered website and find out how many SQL queries each page uses, how heavy the HTML pages are, and so on.

To use the script, put django-profile.py somewhere in PYTHONPATH and call it from the directory that holds the project's settings.py. The script uses Django's internal client to fetch pages and collect additional information about them, so you don't even need to have your server running.

The idea of the script is simple: it loads links from the homepage, discovers URLs of 'details' pages (by testing the get_absolute_url() method of the project's models) and optionally loads additional links from ./profile-pages.txt. Then it acts like a web spider, trying to obtain new links from the pages it already knows.

def get_urls(depth=3, apps=None):
    urls = set(['/'])
    if options.read_urls:
        # optional extra pages listed in a file such as ./profile-pages.txt
        urls.update(get_predefined_pages(options.read_urls))
    # 'details' pages discovered through get_absolute_url() of the models
    urls.update(get_model_urls())
    if depth > 1:
        urls.update(get_base_urls(urls))
    new_urls = urls
    # spider step: follow links found on pages we already know about
    while depth > 2 and new_urls:
        new_urls = get_urls_from_content(new_urls) - urls
        urls.update(new_urls)
        depth -= 1
    if not options.all_urls:
        urls = remove_dublicated_views(urls)
    return list(urls)
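
The spider step depends on pulling links out of pages that were already fetched. The real get_urls_from_content() lives in django-profile.py and is not shown in this post, so the snippet below is only a minimal sketch of the idea, written for current Python 3 rather than the Python 2 of the listing above. The names LinkCollector and extract_local_links are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.hrefs.append(value)


def extract_local_links(html):
    """Return the set of site-local paths linked from an HTML page."""
    collector = LinkCollector()
    collector.feed(html)
    local = set()
    for href in collector.hrefs:
        parts = urlsplit(href)
        # Keep only absolute paths on the same site: skip external hosts,
        # mailto: links, bare fragments and relative paths.
        if parts.scheme or parts.netloc or not parts.path.startswith('/'):
            continue
        local.add(parts.path)
    return local


page = ('<a href="/blog/">Blog</a> '
        '<a href="http://example.com/">elsewhere</a> '
        '<a href="/tag/sql/#top">sql</a>')
print(sorted(extract_local_links(page)))
```

Returning a set makes the "new links minus known links" step in the loop above a plain set difference.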

For big projects we can end up with a huge list of URLs (for example, if you have 1000 news stories in the database, all of them can be included), so I also added the ability to profile only one URL per unique project view.

def remove_dublicated_views(hrefs):
    """Remove pages mapped to same view."""
    from django.core.urlresolvers import resolve, Resolver404

    resolvers = []
    unique_urls = []
    for href in hrefs:
        try:
            r = resolve(href)
            if not r:
                continue
        except Resolver404:
            continue
        view, args, kwargs = r[0], list(r[1]), r[2]
        # resolve() doesn't return the URL mapping name, and when
        # generic views are used that's a problem, so we do this trick
        # to find "really" different generic views
        args = [arg for arg in args if arg not in href]
        kwargs = dict([k, v] for k, v in kwargs.items()
                      if not isinstance(v, basestring) or v not in href)
        r = (view, args, kwargs)
        if r not in resolvers:
            resolvers += [r]
            unique_urls += [href]
    return unique_urls
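
To see why the filtering trick helps, here is a self-contained sketch that does not need Django. It rests on assumptions: the PATTERNS table and the resolve_path() helper are hypothetical stand-ins for the project's URLconf and Django's resolve(), and the code targets Python 3. Captured values that occur in the URL itself are dropped from the comparison key, so /blog/2007/ and /blog/2006/ collapse into one entry, while extra arguments passed from the URLconf (which are what tell generic views apart) are kept.

```python
import re

# Hypothetical URLconf: (regex, view name, extra kwargs from the URLconf).
PATTERNS = [
    (re.compile(r'^/blog/(?P<year>\d{4})/$'), 'archive_year', {'model': 'Post'}),
    (re.compile(r'^/news/(?P<year>\d{4})/$'), 'archive_year', {'model': 'News'}),
    (re.compile(r'^/about/$'), 'flatpage', {}),
]


def resolve_path(path):
    """Stand-in for Django's resolve(): return (view, kwargs) or None."""
    for regex, view, extra in PATTERNS:
        match = regex.match(path)
        if match:
            kwargs = dict(extra)
            kwargs.update(match.groupdict())
            return view, kwargs
    return None


def unique_view_urls(paths):
    """Keep the first URL seen for each distinct (view, filtered kwargs)."""
    seen = []
    unique = []
    for path in paths:
        resolved = resolve_path(path)
        if resolved is None:
            continue
        view, kwargs = resolved
        # Drop string values that appear in the URL itself: they were
        # captured from the path and only identify one object or page.
        key_kwargs = {k: v for k, v in kwargs.items()
                      if not isinstance(v, str) or v not in path}
        key = (view, key_kwargs)
        if key not in seen:
            seen.append(key)
            unique.append(path)
    return unique


sample = ['/blog/2007/', '/blog/2006/', '/news/2007/', '/about/', '/missing/']
print(unique_view_urls(sample))
```

Both archive pages use the same view function here, yet they stay separate because their 'model' argument differs and does not occur in the URL.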

Once all the steps above have been applied, we have the final list of URLs and can profile each one, which is the simplest part of the script. For example, below is how the number of SQL queries used per page is detected.

@props(name='SQL queries usage', sort_key='SQL', reverse=True)
def profile_sql(url):
    """Find SQL queries usage for each page"""
    if options.verbosity:
        print "profile_sql", url,
    from django.conf import settings

    # Django records executed queries only when DEBUG is True
    old_debug = settings.DEBUG
    settings.DEBUG = True
    from django.db import connection

    connection.queries = []
    response = _internal_request(url)
    if options.verbosity:
        print "%d SQL queries, status code: %s " % (
            len(connection.queries), response.status_code)
    if options.verbosity > 1:
        for query in connection.queries:
            print query['sql'], query['time']
    settings.DEBUG = old_debug

    return {'SQL': len(connection.queries), 'Status': response.status_code}
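
One caveat about the pattern above: if the request raises an exception, settings.DEBUG is never set back. A hedged sketch of a safer variant follows, using try/finally wrapped in a context manager. The override_attr name and the FakeSettings class are hypothetical and exist only to make the example runnable without Django (Python 3); in the script the object would be django.conf.settings.

```python
from contextlib import contextmanager


@contextmanager
def override_attr(obj, name, value):
    """Temporarily set obj.name to value, restoring it even on errors."""
    old_value = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        setattr(obj, name, old_value)


class FakeSettings(object):
    """Stand-in for django.conf.settings (assumption for this demo)."""
    DEBUG = False


settings = FakeSettings()

try:
    with override_attr(settings, 'DEBUG', True):
        assert settings.DEBUG is True
        raise RuntimeError('simulated failure while rendering a page')
except RuntimeError:
    pass

print(settings.DEBUG)  # the old value is back despite the exception
```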

You can review a colourised version of the source code (django-profile.py.html) or download it (django-profile.py).

Profile results are printed to stdout in tabular form. Below is the profile for mysoftparade.com:

$ django-profile.py
Found 12 urls.
**********************************************
Page size
**********************************************
              Path Size, b Status <img> <link>

           /about/    6321    200     1      4
                 /    5508    200     1      4
/blog/hello-world/    5350    200     1      4
/projects/concept/    3916    200     1      4
         /tag/sql/    3381    200     1      4
       /blog/2007/    3369    200     1      4
    /blog/2007/11/    3055    200     1      4
            /blog/    3049    200     1      4
        /projects/    2830    200     1      4
             /tag/    2361    200     1      4
            /feed/     935    200     0      2
/feed/tag/concept/     915    200     0      2

*****************************
SQL queries usage
*****************************
              Path Status SQL
/blog/hello-world/    200   5
                 /    200   5
/feed/tag/concept/    200   3
/projects/concept/    200   3
         /tag/sql/    200   3
           /about/    200   3
            /feed/    200   2
       /blog/2007/    200   2
            /blog/    200   1
        /projects/    200   1
             /tag/    200   1
    /blog/2007/11/    200   1
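
The tables above are plain right-aligned columns, sorted by the key named in the decorator. How django-profile.py builds them is not shown in this post, so the following is only a sketch of one way to do it in Python 3; format_table and its parameters are hypothetical names.

```python
def format_table(rows, columns, sort_key, reverse=False):
    """Render a list of dicts as right-aligned text columns."""
    rows = sorted(rows, key=lambda row: row[sort_key], reverse=reverse)
    # Each column is as wide as its longest cell, header included.
    widths = [max([len(str(col))] + [len(str(row[col])) for row in rows])
              for col in columns]
    lines = [' '.join(str(col).rjust(width)
                      for col, width in zip(columns, widths))]
    for row in rows:
        lines.append(' '.join(str(row[col]).rjust(width)
                              for col, width in zip(columns, widths)))
    return '\n'.join(lines)


results = [
    {'Path': '/blog/', 'Status': 200, 'SQL': 1},
    {'Path': '/', 'Status': 200, 'SQL': 5},
    {'Path': '/feed/', 'Status': 200, 'SQL': 2},
]
table = format_table(results, ['Path', 'Status', 'SQL'],
                     sort_key='SQL', reverse=True)
print(table)
```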

Enjoy it!

Tags: concept  django  performance  sql 
Submitted 4 comments: accepted - 4, in moderation queue - 0.
  • Anonymous on November 19, 2007
    Be careful, this changes your DEBUG settings so your pages will always generate a traceback and not email you on error. This is unacceptable on production sites.
  • Dima Dogadaylo on November 19, 2007
    django-profile.py doesn't affect the tested project in any way. After using it, settings.DEBUG holds whatever it held before testing.
  • Anonymous on November 19, 2007
    This is fantastic, thank you!
  • Anonymous on November 19, 2007
    Ahem... %s/responce/response/g


   
Web log, research lab and soft parade of Dima Dogadaylo.
Email: entropyhacker at gmail dot com