Retrieving Data Across Apps In Django

Nov. 9, 2008

3:48 pm

The last month's been pretty quiet here at TNF. I'm blaming part of that on the bump, but in all honesty, making ice cream and Gatorade runs for your wife doesn't take up too terribly much of one's time. But learning a new web development framework does.

I started picking at Django, a web framework written in Python, shortly after Agregado was released. I've been fascinated by the lifestream concept since first seeing it in the wild, circa 2006, but what I've always wanted is a site that displays activity and saves stream items to a database for posterity. So in anticipation of the many upcoming baby photos posted to Flickr, videos posted to YouTube, and tweets twit, that's what I'm building: a multi-user lifestream site with a blog application baked in.

It turns out Django is almost perfectly suited for the job.

Almost?

Yeah, almost. A Django project is made up of one or more applications. Each application contains one or more model classes that define the data types for that app. But what about combining data across applications – as you'll likely need to do when building a lifestream site? That's where things get tricky.

Luckily, Django comes with a built-in contenttypes framework. Among other things, it lets you define a generic (or meta) content type that's essentially a collection of pointers to objects in other applications. It also, as you'd expect, provides tools for getting at those objects for easy use in your templates. I've used a method described at Webmonkey, though I should note that certain parts of their tutorial are based on a pre-1.0 release of Django.
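
If the "collection of pointers" idea feels abstract, here's a rough plain-Python analogy. To be clear, this is not Django's API: Photo, BlogPost, and REGISTRY are made-up stand-ins for models and tables in other apps, and only the StreamItem and content_object names are borrowed from my own code. It just shows how a stream item can store a type label plus an id and resolve that to the real object on demand.

```python
# Plain-Python analogy of a generic relation. All names besides StreamItem
# and content_object are hypothetical, chosen for illustration only.
from dataclasses import dataclass


@dataclass
class Photo:
    id: int
    title: str


@dataclass
class BlogPost:
    id: int
    headline: str


# Stand-in for the per-app database tables, keyed by a content type label.
REGISTRY = {
    "photos.photo": {1: Photo(1, "First steps")},
    "blog.post": {7: BlogPost(7, "Hello, world")},
}


@dataclass
class StreamItem:
    content_type: str  # which kind of object this item points at
    object_id: int     # primary key of that object

    @property
    def content_object(self):
        # Resolve the pointer to the real object, whatever its type.
        return REGISTRY[self.content_type][self.object_id]


stream = [StreamItem("blog.post", 7), StreamItem("photos.photo", 1)]
resolved = [item.content_object for item in stream]
```

The point is that the stream itself never needs to know what a photo or a blog post looks like; it only knows where to find one.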

So here's the coolest part about using generic content types. When you're dealing with an app that provides tools for accessing only a single app at a time – like the uber-popular Django Tagging application – you've already got a way to bridge the gap.

What I wanted for my site was a single group of tags referenced across all content types. That is, visiting http://mysite.com/tags/apple/ should retrieve photos and blog posts and videos tagged apple. But the tagging app (at least at the time of this post) only natively allows per-model tag retrieval. So: http://mysite.com/photos/tags/apple/ and http://mysite.com/videos/tags/apple/.

I got around this problem by delegating tag retrieval and display to my lifestream meta content type. Here's the view function I'm using to retrieve items for a single tag, where "tag" is "apple" in the URL http://mysite.com/tags/apple/:

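    from django.shortcuts import render_to_response
    from django.template import RequestContext
    from tagging.models import Tag

    # Assumption: StreamItem lives in an app called "lifestream";
    # adjust this import to wherever your model is defined.
    from lifestream.models import StreamItem


    def tag_archive(request, tag):
        """Show every stream object that carries the given tag."""
        tag = tag.lower()
        tagged_items = []
        for item in StreamItem.objects.all():
            obj = item.content_object
            # Collect this object's tag names once, lowercased for matching.
            object_tag_names = [
                object_tag.name.lower()
                for object_tag in Tag.objects.get_for_object(obj)
            ]
            # Checking the full list means each object is added at most once.
            if tag in object_tag_names:
                tagged_items.append(obj)

        context = {
            'tag': tag,
            'tagged_items': tagged_items,
        }
        return render_to_response(
            'tags/item_list.html',
            context,
            context_instance=RequestContext(request)
        )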

Granted, since the site still has a relatively small number of site-wide objects, it's hard to tell how well this solution will hold up when we're retrieving hundreds (or thousands) of records per tag. It's a brutish way of getting at the data. But for the time being, it works. And well.
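
If the per-request loop ever does get slow, one possible direction (a sketch, not something I've measured) is to walk the stream once and keep a tag-to-objects lookup around, so each tag page becomes a single dictionary lookup. The example below is plain Python with made-up data: build_tag_index and the sample pairs are hypothetical, standing in for (content_object, tag names) pairs pulled from the stream.

```python
from collections import defaultdict


def build_tag_index(tagged_objects):
    """Map each lowercased tag name to the objects carrying it."""
    index = defaultdict(list)
    for obj, tag_names in tagged_objects:
        # The set keeps an object from being listed twice under one tag
        # when its tags differ only by case ("Apple" and "apple").
        for name in {n.lower() for n in tag_names}:
            index[name].append(obj)
    return index


# Hypothetical stand-ins for (content_object, tag names) pairs.
sample = [
    ("photo:1", ["Apple", "baby"]),
    ("post:7", ["apple"]),
    ("video:3", ["Baby"]),
]
index = build_tag_index(sample)
```

The trade-off is that the index has to be rebuilt or updated whenever items or tags change, so it only pays off if reads far outnumber writes.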

Listing all tags from the site, complete with usage counts, involves just a bit more work:

   1. def tag_list(request):
   2.     all_tags = []
   3.     items = StreamItem.objects.all()
   4.     for item in items:
   5.         object = item.content_object
   6.         object_tags = Tag.objects.get_for_object(object)
   7.         for object_tag in object_tags:
   8.             tag = object_tag.name.lower()
   9.             all_tags.append(tag)
  10.     def count_dupes(dupe_list):
  11.         unique_set = set(dupe_list)
  12.         return [{'name': item, 'count': dupe_list.count(item)} for item in unique_set]
  13.     tag_list = count_dupes(all_tags)
  14.     tag_list.sort(key=lambda k: k['count'], reverse=True)
  15.
  16.     context = {
  17.         'tag_list': tag_list,
  18.     }
  19.     return render_to_response(
  20.         'tags/tag_list.html',
  21.         context,
  22.         context_instance=RequestContext(request)
  23.     )

Counting the tag instances is what stumped me – but that's only because I'm new to Python (and, honestly, programming in general). My solution to that problem starts on line 10. The count_dupes helper uses a list comprehension to build one dictionary per unique tag, each holding the tag's name and the number of times that tag's been used site-wide. Then on line 14, we're using a lambda function as the sort key to order the list by usage count, most-used first.
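
For comparison, here's a minimal sketch of the same counting step using collections.Counter from the standard library (available in Python 2.7 and later). The count_tags name and the sample tags are hypothetical. Counter tallies the list in one pass, where calling dupe_list.count() for every unique tag rescans the whole list each time, and most_common() hands back the pairs already sorted by count.

```python
from collections import Counter


def count_tags(all_tags):
    """Return [{'name': ..., 'count': ...}] with the most-used tags first."""
    counts = Counter(all_tags)  # one pass over the list
    return [
        {'name': name, 'count': count}
        for name, count in counts.most_common()
    ]


tag_list = count_tags(['apple', 'baby', 'apple', 'django', 'baby', 'apple'])
```

The result has the same shape as what count_dupes plus the sort produces, so a template that loops over name and count wouldn't need to change.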

Wait – no links to the finished product?

Nope, not yet. Because I wasn't sure if what I wanted to do was possible given my modest skills, I've completed all the programming and still have no design to wrap it in. It's been an interesting experiment – and somewhat freeing. This design pattern might just stick.

Whaddya think?