A week ago I published MathCaptchaForm, which I use on my blog to prevent spam. Malcolm Tredinnick and then other people noticed that the solution doesn't protect against replay attacks: a question, once solved, can be reused by spam bots on different websites. So I added protection against replay attacks and implemented it without using a database. Keeping MathCaptchaForm as lightweight as possible was an original requirement.
I removed captcha_question from the form and added a new field, captcha_token, that holds a hash of the question, the answer, settings.SECRET_KEY, settings.SITE_URL and the expiry time (1 hour by default).
def _make_token(self, q, a, expires):
    data = base64.urlsafe_b64encode(
        pickle.dumps({'q': q, 'expires': expires}))
    return self._sign(q, a, expires) + data

def _sign(self, q, a, expires):
    plain = [getattr(settings, 'SITE_URL', ''), settings.SECRET_KEY,
             q, a, expires]
    plain = "".join([str(p) for p in plain])
    return sha.new(plain).hexdigest()
As you can see, captcha_token contains the hash plus the question and expiry time in plain form, but it doesn't contain the answer. When the form is submitted, a new hash is built from those fields and the user's answer, and it is compared with the original hash. If the hashes aren't equal, the form is rejected.
def clean(self):
    """Check captcha answer."""
    cd = self.cleaned_data
    # don't check captcha if no answer
    if 'captcha_answer' not in cd:
        return cd
    t = cd.get('captcha_token')
    if t:
        form_sign = self._sign(t['q'], cd['captcha_answer'],
                               t['expires'])
        if form_sign != t['sign']:
            self._errors['captcha_answer'] = ["Are you human?"]
    else:
        # no valid token (for example, it has expired): issue a new captcha
        self.reset_captcha()
    return super(MathCaptchaForm, self).clean()
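The sign-and-compare round trip can also be tried outside Django. What follows is only a minimal sketch in modern Python 3, not the form's actual code: it swaps sha for HMAC-SHA256 from the standard library and swaps pickle for JSON, so that data sent back by the client is never unpickled. The names sign, make_token and check_answer and the constants SECRET_KEY and SITE_URL are hypothetical stand-ins for the form methods and Django settings.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = "change-me"          # assumption: stand-in for settings.SECRET_KEY
SITE_URL = "http://example.com"   # assumption: stand-in for settings.SITE_URL
SIG_LEN = 64                      # length of a SHA-256 hex digest


def sign(question, answer, expires):
    """Keyed hash over the site URL, question, answer and expiry time."""
    message = "|".join([SITE_URL, question, str(answer), repr(float(expires))])
    return hmac.new(SECRET_KEY.encode(), message.encode(),
                    hashlib.sha256).hexdigest()


def make_token(question, answer, expires):
    """Signature followed by the question and expiry time, base64-encoded."""
    payload = json.dumps({"q": question, "expires": expires}).encode()
    return sign(question, answer, expires) + base64.urlsafe_b64encode(payload).decode()


def check_answer(token, answer, now=None):
    """Return True only if the token is well formed, unexpired and matches."""
    now = time.time() if now is None else now
    sig, data = token[:SIG_LEN], token[SIG_LEN:]
    try:
        payload = json.loads(base64.urlsafe_b64decode(data.encode()))
        question, expires = str(payload["q"]), float(payload["expires"])
    except (ValueError, KeyError, TypeError):
        return False  # malformed token
    if now > expires:
        return False  # expired
    expected = sign(question, answer, expires)
    # constant-time comparison of the submitted and recomputed signatures
    return hmac.compare_digest(sig.encode(), expected.encode())


token = make_token("3+5", 8, time.time() + 60 * 60)
print(check_answer(token, 8))   # right answer, token still fresh
print(check_answer(token, 9))   # wrong answer
```

Because the answer only enters the token through the signature, a client that changes the question or the expiry time in the plain part invalidates the signature.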
If the captcha has expired, we reset it and generate a new one. There is a tricky bit here: we need to change field values of an already bound form, but django.newforms doesn't seem to provide a way to do this (initial values only affect unbound forms). Form field values are stored in a django.http.QueryDict, which is immutable, so I had to make it temporarily mutable in order to reset only the captcha fields without touching the other fields of the form (the ones added by a form that extends MathCaptchaForm).
def reset_captcha(self):
    """Generate new question and valid token
    for it, reset previous answer if any."""
    q, a = self._generate_captcha()
    expires = time.time() + \
        getattr(settings, 'CAPTCHA_EXPIRES_SECONDS', 60*60)
    token = self._make_token(q, a, expires)
    self.initial['captcha_token'] = token
    self._plain_question = q
    # reset captcha fields for bound form
    if self.data:
        def _reset():
            self.data['captcha_token'] = token
            self.data['captcha_answer'] = ''
        if hasattr(self.data, '_mutable') and not self.data._mutable:
            self.data._mutable = True
            _reset()
            self.data._mutable = False
        else:
            _reset()
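The flag flipping above can be wrapped in a context manager, so that the immutability flag is restored even if the reset raises. This is a sketch in Python 3 under stated assumptions: temporarily_mutable is a hypothetical helper, and FakeQueryDict is a hypothetical stand-in that only imitates the _mutable flag of QueryDict so the example runs without Django.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_mutable(data):
    """Lift the immutability flag on `data` for the duration of the block."""
    was_mutable = getattr(data, "_mutable", True)
    if not was_mutable:
        data._mutable = True
    try:
        yield data
    finally:
        # restore the original state, even after an exception
        if not was_mutable:
            data._mutable = False


class FakeQueryDict(dict):
    """Hypothetical stand-in for QueryDict: refuses writes unless _mutable."""
    _mutable = False

    def __setitem__(self, key, value):
        if not self._mutable:
            raise AttributeError("This QueryDict instance is immutable")
        super().__setitem__(key, value)


data = FakeQueryDict(author="Ann", captcha_answer="7")
with temporarily_mutable(data) as d:
    d["captcha_answer"] = ""   # only the captcha field is touched
```

After the with block, data is immutable again and the author field is unchanged.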
The usage of the form has not changed: you just need to extend MathCaptchaForm, as in the example below.
class CommentForm(MathCaptchaForm):
    """Form for editing a comment."""
    author = forms.CharField(label='Name', required=True,
        max_length=Comment._meta.get_field('author').maxlength,
        widget=TextInput(attrs={'size':20}))
    url = forms.URLField(label='URL', required=False,
        max_length=Comment._meta.get_field('url').maxlength,
        widget=TextInput(attrs={'size':24}))
    content = forms.CharField(label="Your comment",
        max_length=Comment._meta.get_field('content').maxlength,
        widget=Textarea(attrs={'cols':80, 'rows': 10}))
Then add the captcha_token and captcha_answer fields to the template for your form.
{{ comment_form.captcha_token }}
<label for="id_captcha_answer"
{% if comment_form.captcha_answer.errors %}
title="{{ comment_form.captcha_answer.errors|join:", " }}">
<em class="error">{{ comment_form.knotty_question }}=</em>
{% else %}
title="Human? Enter answer!">{{ comment_form.knotty_question }}=
{% endif %}
</label>
{{ comment_form.captcha_answer }}
The full source code of MathCaptchaForm is below.
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Copyright (c) 2007, Dima Dogadaylo (www.mysoftparade.com)
import re
import sha
import sys
import pickle
import base64
import time
from random import randint
from django import newforms as forms
from django.conf import settings


class MathCaptchaForm(forms.Form):
    """Lightweight mathematical captcha where a human is asked to solve
    a simple mathematical calculation like 3+5=?. It doesn't use a database
    and doesn't require external libraries.

    A hash is built from the concatenation of time, question, answer,
    settings.SITE_URL and settings.SECRET_KEY, and it is validated on each
    form submission. This makes it impossible to "record" a valid captcha
    form submission and "replay" it later: the form will not validate
    because the captcha will have expired.

    For more info see:
    http://www.mysoftparade.com/blog/improved-mathematical-captcha/
    """
    A_RE = re.compile(r"^(\d+)$")

    captcha_answer = forms.CharField(max_length=2, required=True,
        widget=forms.TextInput(attrs={'size':'2'}))
    captcha_token = forms.CharField(max_length=200, required=True,
        widget=forms.HiddenInput())

    def __init__(self, *args, **kwargs):
        """Initialise captcha_question and captcha_token for the form."""
        super(MathCaptchaForm, self).__init__(*args, **kwargs)
        # reset captcha for unbound forms
        if not self.data:
            self.reset_captcha()

    def reset_captcha(self):
        """Generate new question and valid token
        for it, reset previous answer if any."""
        q, a = self._generate_captcha()
        expires = time.time() + \
            getattr(settings, 'CAPTCHA_EXPIRES_SECONDS', 60*60)
        token = self._make_token(q, a, expires)
        self.initial['captcha_token'] = token
        self._plain_question = q
        # reset captcha fields for bound form
        if self.data:
            def _reset():
                self.data['captcha_token'] = token
                self.data['captcha_answer'] = ''
            if hasattr(self.data, '_mutable') and not self.data._mutable:
                self.data._mutable = True
                _reset()
                self.data._mutable = False
            else:
                _reset()

    def _generate_captcha(self):
        """Generate question and return it along with correct answer."""
        a, b = randint(1,9), randint(1,9)
        return ("%s+%s" % (a,b), a+b)

    def _make_token(self, q, a, expires):
        data = base64.urlsafe_b64encode(
            pickle.dumps({'q': q, 'expires': expires}))
        return self._sign(q, a, expires) + data

    def _sign(self, q, a, expires):
        plain = [getattr(settings, 'SITE_URL', ''), settings.SECRET_KEY,
                 q, a, expires]
        plain = "".join([str(p) for p in plain])
        return sha.new(plain).hexdigest()

    @property
    def plain_question(self):
        return self._plain_question

    @property
    def knotty_question(self):
        """Wrap plain_question in markup that is invisible to humans, with
        random nonexistent classes. It makes life a bit harder for spambots
        because the form of the question varies from request to request."""
        digits = self._plain_question.split('+')
        return "+".join(['<span class="captcha-random-%s">%s</span>' %
                         (randint(1,9), d) for d in digits])

    def clean_captcha_token(self):
        t = self._parse_token(self.cleaned_data['captcha_token'])
        if time.time() > t['expires']:
            raise forms.ValidationError("Captcha is expired.")
        self._plain_question = t['q']
        return t

    def _parse_token(self, t):
        try:
            # the first 40 characters are the SHA-1 hex digest
            sign, data = t[:40], t[40:]
            data = pickle.loads(base64.urlsafe_b64decode(str(data)))
            return {'q': data['q'],
                    'expires': float(data['expires']),
                    'sign': sign}
        except Exception, e:
            sys.stderr.write("Captcha error: %r\n" % e)
            raise forms.ValidationError("Invalid captcha!")

    def clean_captcha_answer(self):
        a = self.A_RE.match(self.cleaned_data.get('captcha_answer'))
        if not a:
            raise forms.ValidationError("Number is expected!")
        return int(a.group(0))

    def clean(self):
        """Check captcha answer."""
        cd = self.cleaned_data
        # don't check captcha if no answer
        if 'captcha_answer' not in cd:
            return cd
        t = cd.get('captcha_token')
        if t:
            form_sign = self._sign(t['q'], cd['captcha_answer'],
                                   t['expires'])
            if form_sign != t['sign']:
                self._errors['captcha_answer'] = ["Are you human?"]
        else:
            # no valid token (for example, it has expired): issue a new captcha
            self.reset_captcha()
        return super(MathCaptchaForm, self).clean()
And a final note about replay attacks. They are still possible on the same site within the expiry time: a bot can't generate a captcha_token, but if it has solved a CAPTCHA question it can reuse the solution for an hour on other forms on the same site. The key phrase here is «if it has solved a CAPTCHA question». If a bot can answer CAPTCHA questions, protection against replay attacks isn't needed at all, because the bot will act like a human. But the implemented protection makes it impossible to record a solved CAPTCHA form and reuse it on other sites at any time.
It's also possible to add the URL of the form's page to the hash, so that recorded CAPTCHA fields become invalid for other pages of the same site even within the expiry time, but I think spam bot owners would find it simpler to teach a bot arithmetic than to deal with CAPTCHA hashes that expire every hour.
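For illustration, binding the signature to a page could look roughly like the following. It is a sketch in Python 3 and not part of MathCaptchaForm: sign_for_page and its page_url parameter are hypothetical, SECRET_KEY is a stand-in for settings.SECRET_KEY, and HMAC-SHA256 is used instead of the sha module.

```python
import hashlib
import hmac

SECRET_KEY = "change-me"   # assumption: stand-in for settings.SECRET_KEY


def sign_for_page(page_url, question, answer, expires):
    """Signature that is only valid for the page the form was rendered on."""
    message = "|".join([page_url, question, str(answer), repr(float(expires))])
    return hmac.new(SECRET_KEY.encode(), message.encode(),
                    hashlib.sha256).hexdigest()


recorded = sign_for_page("/blog/post-1/", "3+5", 8, 2000.0)

# The same solved captcha replayed against another page of the same site:
replayed_ok = hmac.compare_digest(
    recorded, sign_for_page("/blog/post-2/", "3+5", 8, 2000.0))

# The same solved captcha submitted to the page it was issued for:
same_page_ok = hmac.compare_digest(
    recorded, sign_for_page("/blog/post-1/", "3+5", 8, 2000.0))
```

Here replayed_ok is False and same_page_ok is True, so a recorded submission stays usable only on the page it came from until it expires.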