WebAlchemy accelerates Django 100 times

I host my Django blog in a shared hosting environment at Dreamhost, where hundreds of other sites run on the same server. In general I enjoy Dreamhost hosting (especially considering the price of their unlimited-domains plan), but sometimes I need more speed. When I wrote WebAlchemy and configured www.mysoftparade.com to use it, my web site began to respond almost immediately. With WebAlchemy, only pages involved in form processing are served directly by Django; the rest of the pages are, most of the time, served directly by Apache as static content, at static-content speed. In other words, a Django-powered site can reach about 2000 requests/sec, against about 500 requests/sec with memcached and about 20 requests/sec for a "typical" page (10 fast SQL queries) without any caching. Actual performance will of course vary with the server, application and configuration. As it was said: The magic is done inside
For example, let's analyze the life cycle of the current page, http://www.mysoftparade.com/blog/webalchemy-django-apache/.
WebAlchemy uses the Django signal framework to monitor changes in models. The mapping between models and views is defined in the file webalchemy_settings.py. Below is the content of this file for www.mysoftparade.com:

# -*- coding:utf-8 -*-
# $Id: webalchemy_settings.py 289 2007-11-18 17:03:13Z dogada $
# Copyright (c) 2007, Dima Dogadaylo (www.mysoftparade.com)

"""WebAlchemy configuration for the www.mysoftparade.com.

Mapping between models and views that models affect. WebAlchemy uses this
information to switch between dynamic and static versions of an url.
"""

import sys
from os import path

from django.core.urlresolvers import reverse
from django.conf import settings

from concept.views import home, about, latest_feed, latest_feed_clone, tag_feed
from concept.tagging.models import TaggedItem, Tag
from concept.tagging.views import tag_list, tag_detail
from concept.blog.views import *
from concept.blog.models import Entry
from concept.comments.models import Comment
from concept.projects.models import Project
from concept.projects.views import *
from concept.webalchemy.core import Site
from concept.webalchemy.utils import content_object


class TagHandler(object):
    """Produce list of urls affected by a changed Tag, TaggedItem or
    object with tags."""

    def get_affected_paths(self, obj):
        """Return list of paths affected by this object.

        It handles only pages that depend on the tag directly; content
        object detail and list pages are handled when the content_object
        is saved. It is also possible to rebuild content object related
        pages here, but this is not needed for concept.
        """
        tags = []
        if type(obj) == TaggedItem:
            tags = [obj.tag.name]
        elif type(obj) == Tag:
            tags = [obj.name]
        else:  # a tagged object
            tags = [tag.name for tag in Tag.objects.get_for_object(obj)]
        paths = [reverse(tag_list, None, (), {})]
        for tag_name in tags:
            paths.append(reverse(tag_detail, None, (), {'object_id': tag_name}))
            paths.append(reverse(tag_feed, None, (), {'url': 'tag/%s' % tag_name}))
        return paths


class EntryArchiveHandler(object):
    """Archive urls affected by the changed Entry."""

    def get_affected_paths(self, obj):
        paths = [reverse(entry_archive_year, None, (str(obj.created.year),))]
        paths.append(reverse(entry_archive_month, None,
                             (str(obj.created.year), str(obj.created.month),)))
        return paths


site = Site(settings.WA_PUBLISHER)
site.bind(home, [Entry, Project])
site.bind_static(about)
site.bind([entry_detail, entry_queue], Entry, None, slug="slug")
site.bind([entry_detail, entry_queue], Comment, content_object(Entry), slug="slug")
site.bind([entry_list, latest_feed, latest_feed_clone], Entry)
site.bind(project_detail, Project, None, slug="slug")
site.bind(project_list, Project)
site.bind_custom([tag_feed, tag_detail, tag_list], [TaggedItem, Tag], TagHandler())
# even if the tags themselves aren't changed we need to rebuild tag views
# because they may use changed parts of content objects (name, summary, etc.)
site.bind_custom([tag_feed, tag_detail, tag_list], [Entry], TagHandler())
site.bind_custom([entry_archive_year, entry_archive_month], [Entry], EntryArchiveHandler())

This quite short file defines the mapping between all models used on the web site and all of its pages. As you can see, it supports custom binders, which makes it possible to define rules of any level of complexity. So if your requirements aren't satisfied by the standard model-to-view binders, you can define your own binder by providing an object with a single method, get_affected_paths.
To test WebAlchemy with your project, download webalchemy.tar.gz, which has the following content:

webalchemy/core.py
webalchemy/htaccess.py
webalchemy/__init__.py
webalchemy/models.py
webalchemy/publishers.py
webalchemy/rewriters.py
webalchemy/utils.py
LICENSE.txt

Place the content of the archive in your project directory or anywhere in PYTHONPATH. Ensure that WebAlchemy is visible to your project:

$ ./manage.py shell
>>> import concept.webalchemy

Then activate WebAlchemy in your settings.py. Here at Dreamhost, in a FastCGI environment, I use the following configuration:

WEBALCHEMY_SETTINGS = "concept.webalchemy_settings"
HTDOCS_ROOT = '/root/dogada/mysoftparade.com/'
from webalchemy.rewriters import FastCGIRewriter
from webalchemy.publishers import ApacheHtaccessPublisher
WA_PUBLISHER = ApacheHtaccessPublisher(FastCGIRewriter, HTDOCS_ROOT, '', delete_files=True)

The configuration is a bit different on my local workstation, where I use Apache with mod_python:

WEBALCHEMY_SETTINGS = "concept.webalchemy_settings"
HTDOCS_ROOT = '/var/www/concept/public/'
from webalchemy.rewriters import ModPythonRewriter
from webalchemy.publishers import ApacheHtaccessPublisher
WA_PUBLISHER = ApacheHtaccessPublisher(ModPythonRewriter, HTDOCS_ROOT, '/cache', delete_files=True)

There is no common answer to the question of which rewriter to use. It depends... For my purposes I created two rewriters, ModPythonRewriter and FastCGIRewriter; they produce different .htaccess files for my local environment and for the Dreamhost environment where the blog is hosted. With mod_python I store the files in the /cache/ directory, which doesn't have an associated PythonHandler:
<Location "/">
SetHandler python-program
PythonHandler django.core.handlers.modpython
SetEnv DJANGO_SETTINGS_MODULE concept.settings
SetEnv PYTHON_EGG_CACHE /var/www/concept/.cache
</Location>
<Location "/cache">
SetHandler None
</Location>
So it's enough to rewrite the request path to /cache/* and Apache will handle it as a static file. The .htaccess files in this case look like:

#webalchemy_begin
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/webalchemy-django-apache/
RewriteRule ^$ /cache/blog/webalchemy-django-apache/index.html [L]
</IfModule>
#webalchemy_end

On Dreamhost I don't use a special cache prefix because its configuration is slightly different. An example of .htaccess for the FastCGI environment here at Dreamhost was shown at the beginning of the article. Also you need to add
from django.conf import settings
from concept.webalchemy.core import Site
from myproject.views import homepage
from myproject.blog.models import Entry

site = Site(settings.WA_PUBLISHER)
site.bind(homepage, Entry)

With such a configuration your home page becomes static and is updated whenever an Entry instance changes. Then you can add more advanced bindings that take into account the parameters of the changed instances and update only the pages affected by a particular instance.

During debugging, pay attention to the page response headers. When WebAlchemy is activated for a site, it will add
If for an url you see
If instead of the page you see an error with code 500 and in the
Apache error.log something like

When Apache treats the url as a static file, it not only responds much faster but also sends different headers, which is useful for testing. For example, here are the headers for a WebAlchemy-powered Django page when it was served as a static file:

HTTP/1.1 200 OK
Date: Sun, 18 Nov 2007 17:33:26 GMT
Server: Apache/2.2.3 (Ubuntu) DAV/2 SVN/1.4.3 mod_python/3.2.10 Python/2.5.1
Last-Modified: Sun, 18 Nov 2007 16:58:22 GMT
Accept-Ranges: bytes
Content-Length: 17810
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8

When the same page is served by Django as dynamic content, the headers are different:

HTTP/1.1 200 OK
Date: Sun, 18 Nov 2007 17:36:22 GMT
Server: Apache/2.2.3 (Ubuntu) DAV/2 SVN/1.4.3 mod_python/3.2.10 Python/2.5.1
vary: Cookie
webalchemy: saved
Connection: close
Content-Type: text/html; charset=utf-8

You can create your own rewriter that best suits your specific needs. See concept/webalchemy/rewriters.py for guidelines. You can define your rewriter in your project code and tell WebAlchemy to use it in the standard Django settings.py:

from myproject.utils import MyRewriter
from webalchemy.publishers import ApacheHtaccessPublisher
WA_PUBLISHER = ApacheHtaccessPublisher(MyRewriter, HTDOCS_ROOT)

For example it's possible not to deal with file extensions at all (extensions are currently used to preserve the original mime-type of a url even after it becomes a static file) and use

If you are so big that you need several app servers, you will need an htdocs filesystem shared between all app servers, to which all of them write. For reading, however, each app server can use a local htdocs filesystem replicated from the shared one. On the other hand, do you really need many app servers, and to deal with replication/sharding, if WebAlchemy can greatly decrease the database load and accelerate "reader" pages 100 times?

Behind the scenes I'm leaving for now other interesting issues like setting the correct content-type for urls, merging manual and autogenerated content of .htaccess files, creating non-conflicting rewrite rules, etc.

Enjoy the code and have fun!
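As an illustration of the "merging manual and autogenerated content of .htaccess files" issue mentioned above, here is a minimal sketch under stated assumptions. It renders a rewrite block in the same shape as the mod_python example shown in this article and swaps it in between the #webalchemy_begin and #webalchemy_end markers, leaving hand-written rules untouched. The function names render_block and merge_htaccess are hypothetical, and this is not how webalchemy/htaccess.py is known to work.

```python
import re

BEGIN = "#webalchemy_begin"
END = "#webalchemy_end"

# Matches one generated block, markers included (non-greedy, spans lines).
_BLOCK_RE = re.compile(re.escape(BEGIN) + r".*?" + re.escape(END), re.DOTALL)


def render_block(url_path, cache_prefix="/cache"):
    """Render a rewrite block for `url_path` in the format shown in the article.

    `cache_prefix` is the directory prefix that Apache serves statically.
    """
    stripped = url_path.strip("/")
    url_path = "/%s/" % stripped if stripped else "/"
    target = "%s%sindex.html" % (cache_prefix, url_path)
    return "\n".join([
        BEGIN,
        "<IfModule mod_rewrite.c>",
        "RewriteEngine On",
        "RewriteBase %s" % url_path,
        "RewriteRule ^$ %s [L]" % target,
        "</IfModule>",
        END,
    ])


def merge_htaccess(existing, block):
    """Return `existing` with its generated block replaced by `block`.

    If no generated block is present yet, `block` is appended after the
    manual rules. Text outside the markers is never modified.
    """
    if _BLOCK_RE.search(existing):
        # A function replacement avoids backslash processing in `block`.
        return _BLOCK_RE.sub(lambda match: block, existing, count=1)
    if existing and not existing.endswith("\n"):
        existing += "\n"
    return existing + block + "\n"
```

A caller would read the current .htaccess text, pass it through merge_htaccess together with a freshly rendered block, and write the result back. Keeping the generated rules between fixed markers is what lets manual rules survive regeneration.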
Web log, research lab and soft parade of Dima Dogadaylo.
Email: entropyhacker at gmail dot com
© Dima Dogadaylo
I hope I'm wrong :-).
I could also rebuild all detail pages when an entry is changed, but I don't do this, because for such things it's better to use a kind of server-side include.
BTW, I fixed issue. Thanks.
Or you can use .htaccess directly with lighttpd if it supports it :)
It'll keep things much simpler and more transparent than this hack.
All that's required is the proper Django code to create the flat files when pages are edited. The editing pages become the only dynamic part of the site.
You can even deliver pages directly from memcached using Nginx's memcached engine.
Actually, on this site only /search/ and comment posting are always served by Django; all the other pages are served by Apache.
Maybe I did something wrong in webalchemy_settings.py? I left it empty.
Dima, is it necessary to put something in webalchemy_settings? Can I leave it empty?
I thought it was only there to tell WebAlchemy that it must update a specified page.
I will try again with Django trunk.
I can't understand why you need any rewrite rules in .htaccess if you already have index.html. Apache will send index.html back to the client by default. When index.html is deleted, Django will kick in anyway.
Could you please explain?
thanks
Seriously, you don't need to make all your pages static, but the homepage, RSS feeds and common entry pages like /community/, /news/, /photos/latest/ IMHO MUST be static and should not use SQL queries.
You can always start with small changes, make your homepage fast as a rocket and then think about the other pages.
www.mysoftparade.com/blog/webalchemy-vs-staticgenerator/
Have you used the directive "AllowOverride all" somewhere in your httpd.conf
so that Apache evaluates every .htaccess it encounters in /cache?
SetHandler None
thanks
when there is no view involved? Surely there needs to be some way to bind_static in such cases.
Also you only need a single .htaccess file in your front Apache server to serve static media or proxy to the back Django server (Apache, Lighttpd, FCGI, mod_python, whatever).
DirectoryIndex index.html
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}/index.html !-f
RewriteRule ^(.*)$ http://127.0.0.1:1704/$1 [P,QSA,L]
Is there any way to use this for caching a page depending on the status of the user (logged in or not), something like what the Django built-in cache has?
I understand the best case is for static-like pages, but for example the home page of my site, which is a classic candidate for this kind of cache, has 'logged in as X' and different menus for logged-in users.
Date: Thu, 22 May 2008 14:05:25 GMT
Server: Apache/2.2.8 (Win32) DAV/2 mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.8 OpenSSL/0.9.8g mod_autoindex_color PHP/5.2.5
Vary: Cookie
Content-Type: text/html; charset=utf-8
WebAlchemy: saved
X-Transfer-Encoding: chunked
Content-length: 13095
I keep getting this kind of header output. I am using TrivialPublisher, not mod_rewrite, to deal with the .htaccess file.
It seems to work, in that it generates a static file.
This seems like a great piece of software and I'm trying to integrate it into one of my projects.
I've run into a problem that I don't quite understand (I'm still learning Python). When I look at the headers, it always comes up as "WebAlchemy: ignored". You mentioned that this probably means that I haven't bound the view to the model yet.
However, after looking at the source, I found that "view in self.get_binded_views()" in the process_response function inside the Site object always returns false, even when the view is in the binded_views set.
I did a small comparison of the view that is resolved from request.path and the actual view:
>>> resolve(request.path)[0]
>>> self.get_binded_views()
set([])
I'm not exactly sure why the comparison comes up as false when they are the same function; I'm guessing it has something to do with the 0x0 addresses not being the same. Do you know why this is happening? I cannot get WebAlchemy to generate static pages when the view is always ignored even though it is bound. I'm sure this is probably not happening to you or it would have been fixed.
I'm using Django from trunk running on Python 2.5 on Windows. I'm currently testing this on the development server that comes with Django. Is it possible that this happens only with this configuration?
>>> resolve(request.path)[0]
function popular at 0x019CC2F0
>>> self.get_binded_views()
set([function popular at 0x0199F470])
thanks in advance!