The audio challenges used by django-simple-captcha are generated by the flite program.
However, flite is deterministic: a given input always produces the same output.
So, for example, the command:
$ flite -t "B, C, H, U" -o BCHU.wav
will produce the same output file every time it is run.
By default, the challenges are only four letters long, so there are only 26^4 = 456976 possible audio files that can be used for captcha challenges. It is therefore feasible to generate every possible challenge in advance and store it in a database. Then, when presented with a challenge, we can simply look it up in the database to find the solution. (To keep the database small, we store a hash of each file rather than the file itself.)
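The lookup idea described above can be sketched roughly as follows. This is an illustration only, not the code in captcha_cracker.py: the function names, the use of SHA-256, and the in-memory dict are all assumptions, and fake_synthesize() is a hypothetical stand-in for running flite so that the example stays self-contained.

```python
import hashlib
import itertools
import string

ALPHABET = string.ascii_uppercase  # 26 letters
LENGTH = 4                         # default challenge length


def all_challenges(length=LENGTH, alphabet=ALPHABET):
    """Yield every possible challenge string: 'AAAA', 'AAAB', ..."""
    for letters in itertools.product(alphabet, repeat=length):
        yield "".join(letters)


def fake_synthesize(text):
    """Hypothetical stand-in for flite: deterministic bytes for a given text."""
    return ("placeholder-audio:" + ", ".join(text)).encode("ascii")


def build_table(challenges, synthesize):
    """Map the hash of each generated audio file to its solution."""
    return dict(
        (hashlib.sha256(synthesize(c)).hexdigest(), c) for c in challenges
    )


def solve(audio_bytes, table):
    """Return the solution for a received audio file, or None if unknown."""
    return table.get(hashlib.sha256(audio_bytes).hexdigest())
```

Because the synthesizer is deterministic, hashing a received challenge and looking the hash up in the table is enough to recover the letters; no audio analysis is needed.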
To prevent this type of attack, one solution should ideally be representable by many different audio challenges. It may be possible to do this by using different voice files, randomly varying the speed and pitch, or adding random sounds between letters (though care must be taken that whatever operation is applied is difficult to reverse).
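As a rough sketch of the "randomly varying speed and pitch" idea, the snippet below builds a flite command line with randomised parameters. It assumes that the installed flite accepts the duration_stretch and int_f0_target_mean features through --setf, which should be checked against your flite version; the function name and the value ranges are made up for illustration. Note also that a global change of speed or pitch may be easy for an attacker to normalise away, so this alone is unlikely to be a complete defence.

```python
import random


def randomised_flite_command(text, out_path, rng=random):
    """Build (but do not run) a flite command with random speed and pitch."""
    stretch = rng.uniform(0.8, 1.4)   # assumed range: <1 is faster, >1 is slower
    pitch = rng.uniform(90.0, 160.0)  # assumed range for the mean pitch in Hz
    return [
        "flite",
        "--setf", "duration_stretch=%.3f" % stretch,
        "--setf", "int_f0_target_mean=%.1f" % pitch,
        "-t", text,
        "-o", out_path,
    ]
```

The resulting list could be passed to subprocess.call() on the server that generates the challenges, so that the same letters no longer map to a single audio file.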
Until this is done, a temporary mitigation may be to increase the length of the challenges offered to your users. You should not disable audio challenges altogether, since doing so would prevent users with vision problems from using your website.
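To illustrate the effect of longer challenges, the fragment below sets the challenge length in a Django settings file and computes how many audio files an attacker would have to precompute. CAPTCHA_LENGTH is, as far as I know, the django-simple-captcha setting for this; the value 8 and the helper search_space() are only examples, and the calculation assumes challenges drawn from 26 letters.

```python
# settings.py (fragment): ask django-simple-captcha for longer challenges.
CAPTCHA_LENGTH = 8  # example value; the default is 4


def search_space(length, alphabet_size=26):
    """Number of distinct challenges an attacker would need to precompute."""
    return alphabet_size ** length


default_space = search_space(4)              # 456976
longer_space = search_space(CAPTCHA_LENGTH)  # 208827064576
```

Each extra letter multiplies the size of the attacker's table by 26, but also makes the challenge harder for legitimate users, so this only buys time until the audio itself is randomised.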
It is possible that flite produces different output depending on the software version or operating system, and it definitely produces different output if a voice other than the default is selected. To get around this, an attacker would need to determine the flite setup of the target server, though this could be done simply by analysing examples of received challenges and trying to generate a match.
- Install dependencies. Note that this tool uses Python 2, not Python 3.
Using pip:
$ pip install -r requirements.txt
Using apt (on Debian-based systems):
$ apt install python-bs4
- Run the testproject created by the developers:
$ git clone https://github.com/mbi/django-simple-captcha
$ cd django-simple-captcha/testproject
$ python manage.py runserver
If all has gone well, the server should be running on http://localhost:8000. If this is not the case, you will need to modify the argument to the getCaptcha() function in captcha_cracker.py before running it.
- Run the script! It should download a challenge from the server, and print the solution to the terminal.
$ python captcha_cracker.py
- If this does not work, you may have a different flite setup to the one I had when I generated the database, and you will have to generate your own. This can take a long time, so be patient, and do not interrupt it while it is running. To do this, you must:
4.1. Remove the non-working database
$ rm checksums.db
4.2. Uncomment genDatabase() in captcha_cracker.py. That is, replace the line:
#genDatabase()
with
genDatabase()
4.3. Run the modified script!
$ python captcha_cracker.py
Copyright 2017 Riley Baird. GNU General Public License 3.0 or (at your option) any later version. For full details, see the file named LICENSE.