django-dilla: Uber-cool DB spammer for Django

I remember a Django project where I was asked to populate the database with 30k dummy rows. I found a dictionary of words for the string-based fields, used it to fill in the objects of that model, and pushed them to the database. Unsurprisingly, the code took hours to write and hours to run. Had I known there was a tool that generates random data, I would have saved all the time I spent on that extra work.

Django-dilla does just that. It lets you spam your database with dummy data using a single command.

Consider the model below, which is designed to cover a variety of field types.

from django.db import models

class Contact(models.Model):

    GENDER_CHOICES = (
        ('M', 'Male'),
        ('F', 'Female'),
    )

    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    gender = models.CharField(max_length=1, choices=GENDER_CHOICES)
    birthday = models.DateField()
    phone = models.CharField(max_length=10)
    height = models.DecimalField(max_digits=5, decimal_places=2)
    weight = models.DecimalField(max_digits=5, decimal_places=2)
    ip = models.IPAddressField(null=True, blank=True)
    website = models.URLField(null=True, blank=True)
    last_updated = models.DateField(auto_now=True)
    activate = models.BooleanField(default=True)

    def return_values(self):
        return [(field.name, field.value_to_string(self)) for field in Contact._meta.fields]

Running Dilla with cycles set to 1000, i.e. generating 1000 records, took only a few seconds:
python manage.py run_dilla --cycles=1000
Dilla is going to spam your database.                 Do you wish to proceed? (Y/N)y
Dilla finished!
        1 app(s) spammed 1000 row(s) affected,         9026 field(s) filled, 1974 field(s) ommited.

Showing the first two results:

>>> for c in Contact.objects.all()[:2]:
...     for k,v in c.return_values():
...         print '%s: %s' % (k,v)
... 
id: 1
first_name: reapplied hat tinting
last_name: puppy tim's nationalism's
gender: F
birthday: 2011-02-16
phone: juveniles 
height: 14.92
weight: 2.93
ip: 40.208.225.246
website: None
last_updated: 2011-03-10
activate: True
id: 2
first_name: perimeter's continued stealing
last_name: waters prioresses afghanistan
gender: F
birthday: 2011-03-18
phone: intention'
height: 4.59
weight: 4.61
ip: 14.141.132.128
website: http://compactness.com/cypriots/?packet's=trifles
last_updated: 2011-03-10
activate: True

And now there’s dummy data! It’s fast and easy! There’s just one problem, though: it’s very ugly. What kind of person is named “reapplied hat tinting puppy tim’s nationalism’s” and has a random word for a phone number?

To get around this problem, Dilla provides a way to customize the handler for a field. It can be a global handler, which returns data for a particular Field type, or a strict handler, which handles one specific field of a model. Here is a custom strict handler that generates the data for the phone field.

from dilla import spam
import random
import string

@spam.strict_handler('contacts.Contact.phone')
def get_phone(field):
    # Build a string of 9 random digits (each 0-9); this fits max_length=10.
    return ''.join(random.choice(string.digits) for _ in range(9))

The improved results are as follows:

>>> for c in Contact.objects.all()[:2]:
...     for k,v in c.return_values():
...         print '%s: %s' % (k,v)
... 
id: 1
first_name: furby domina kaz
last_name: sandie stubborn nuisance
gender: M
birthday: 2011-02-19
phone: 746324764
height: 13.84
weight: 6.41
ip: 206.62.153.72
website: None
last_updated: 2011-03-10
activate: True
id: 2
first_name: frazzles joker puffingbilly
last_name: petty midnightsmedley minni
gender: F
birthday: 2011-02-12
phone: 868315646
height: 14.61
weight: 17.78
ip: None
website: http://oldescratch.com/sitka/?floella=ricka
last_updated: 2011-03-10
activate: True
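
The name fields could probably be tidied up the same way. The sketch below is only a guess at how that might look, not code from the Dilla project. The name lists and the helpers `random_first_name` and `random_last_name` are made up for illustration. The registration reuses the `spam.strict_handler` decorator and the `(field)` signature from the phone example, on the assumption that they behave the same for other fields. The import is guarded so the helpers can be tried without Dilla installed.

```python
import random

# Hypothetical sample data; a real project would load a much longer list.
FIRST_NAMES = ['Alice', 'Bruno', 'Carmen', 'Dmitri', 'Elena']
LAST_NAMES = ['Garcia', 'Ivanov', 'Kim', 'Okafor', 'Smith']


def random_first_name(max_length=50):
    """Pick a first name, trimmed to the model field's max_length."""
    return random.choice(FIRST_NAMES)[:max_length]


def random_last_name(max_length=50):
    """Pick a last name, trimmed to the model field's max_length."""
    return random.choice(LAST_NAMES)[:max_length]


try:
    from dilla import spam
except ImportError:
    spam = None  # Dilla is not installed; the helpers above still work.

if spam is not None:
    # Assumption: strict handlers for other fields are registered exactly
    # like the phone handler shown earlier.
    @spam.strict_handler('contacts.Contact.first_name')
    def get_first_name(field):
        return random_first_name()

    @spam.strict_handler('contacts.Contact.last_name')
    def get_last_name(field):
        return random_last_name()
```

Keeping the random pickers as plain functions means they can be checked in a shell on their own before Dilla ever calls them.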

References: https://github.com/aerosol/django-dilla