On one of my Django projects, I was asked to populate the database with 30k rows of dummy data. I found a dictionary of words for the string-based fields, used it to fill in the objects of that specific model, and pushed them to the database. Unsurprisingly, the code took hours to write and hours to run. Had I known there was a tool that generates random data, I would have saved all the time I spent on that extra work.
Django-dilla does just that. It lets you spam your database with dummy information with just a single command.
Consider the model below, which is designed for the sake of having a variety of Field types.
    class Contact(models.Model):
        GENDER_CHOICES = (
            ('M', 'Male'),
            ('F', 'Female'),
        )
        first_name = models.CharField(max_length=50)
        last_name = models.CharField(max_length=50)
        gender = models.CharField(max_length=1,choices=GENDER_CHOICES)
        birthday = models.DateField()
        phone = models.CharField(max_length=10)
        height = models.DecimalField(max_digits=5,decimal_places=2)
        weight = models.DecimalField(max_digits=5,decimal_places=2)
        ip = models.IPAddressField(null=True,blank=True)
        website = models.URLField(null=True,blank=True)
        last_updated = models.DateField(auto_now=True)
        activate = models.BooleanField(default=True)

        def return_values(self):
            return [(field.name, field.value_to_string(self)) for field in Contact._meta.fields]
Running Dilla with cycles set to 1000, i.e. generating 1000 records, took only a few seconds:
    python manage.py run_dilla --cycles=1000
    Dilla is going to spam your database. Do you wish to proceed? (Y/N)y
    Dilla finished!
    1 app(s) spammed 1000 row(s) affected, 9026 field(s) filled, 1974 field(s) ommited.
Showing the first two results:
    >>> for c in Contact.objects.all()[:2]:
    ...     for k,v in c.return_values():
    ...         print '%s: %s' % (k,v)
    ...
    id: 1
    first_name: reapplied hat tinting
    last_name: puppy tim's nationalism's
    gender: F
    birthday: 2011-02-16
    phone: juveniles
    height: 14.92
    weight: 2.93
    ip: 126.96.36.199
    website: None
    last_updated: 2011-03-10
    activate: True
    id: 2
    first_name: perimeter's continued stealing
    last_name: waters prioresses afghanistan
    gender: F
    birthday: 2011-03-18
    phone: intention'
    height: 4.59
    weight: 4.61
    ip: 188.8.131.52
    website: http://compactness.com/cypriots/?packet's=trifles
    last_updated: 2011-03-10
    activate: True
And now there’s dummy data! It’s fast and easy! There’s just one problem, though: it’s very ugly. What kind of person is named “reapplied hat tinting puppy tim’s nationalism’s” and has a random word for a phone number?
To get around this problem, Dilla provides a way to customize the handler for a field. It can be a global handler which returns data for a particular Field type, or a strict handler which handles specific attributes in a model. Here is a custom spammer that generates the data that goes in the phone field.
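    from dilla import spam
    import random

    @spam.strict_handler('contacts.Contact.phone')
    def get_phone(field):
        # Build the phone number one random digit at a time.
        # range(0, 9) stops before 9, so this yields nine digits, each from 0 to 8.
        phone = ''
        for i in range(0,9):
            phone += str(random.choice(range(0,9)))
        return phone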
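The handler above produces nine digits and never picks a 9, because range(0,9) stops at 8 in both places. If you would rather fill all ten characters that the phone field allows, here is a minimal sketch of a digit generator in plain Python. The name random_phone and its parameters are my own, not part of Dilla, and it is only one way to do it.

```python
import random
import string


def random_phone(length=10, rng=random):
    """Return a string of `length` random decimal digits (hypothetical helper)."""
    # string.digits is '0123456789', so every digit, including 9, can appear.
    return ''.join(rng.choice(string.digits) for _ in range(length))


if __name__ == '__main__':
    # Prints a different ten-digit string on each run.
    print(random_phone())
```

Assuming the strict handler only needs to return the value for the field, as get_phone does, its body could simply return random_phone().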
The improved results are as follows:
    >>> for c in Contact.objects.all()[:2]:
    ...     for k,v in c.return_values():
    ...         print '%s: %s' % (k,v)
    ...
    id: 1
    first_name: furby domina kaz
    last_name: sandie stubborn nuisance
    gender: M
    birthday: 2011-02-19
    phone: 746324764
    height: 13.84
    weight: 6.41
    ip: 184.108.40.206
    website: None
    last_updated: 2011-03-10
    activate: True
    id: 2
    first_name: frazzles joker puffingbilly
    last_name: petty midnightsmedley minni
    gender: F
    birthday: 2011-02-12
    phone: 868315646
    height: 14.61
    weight: 17.78
    ip: None
    website: http://oldescratch.com/sitka/?floella=ricka
    last_updated: 2011-03-10
    activate: True