Retrieving Data Across Apps In Django

Nov. 9, 2008

11:48 am

The last month's been pretty quiet here at TNF. I'm blaming part of that on the bump, but in all honesty, making ice cream and Gatorade runs for your wife doesn't take up too terribly much of one's time. But learning a new web development framework does.

I started picking at Django, a web framework written in Python, shortly after Agregado was released. I've been fascinated by the lifestream concept since first seeing it in the wild, circa 2006, but what I've always wanted is a site that displays activity and saves stream items to a database for posterity. So in anticipation of the many upcoming baby photos posted to Flickr, videos posted to YouTube, and tweets twit, that's what I'm building: a multi-user lifestream site with a blog application baked in.

It turns out Django is almost perfectly suited for the job.

Almost?

Yeah, almost. A Django project is made up of one or many applications. Each application contains one or many model classes that define data types for that app. But what about combining data across applications – as you'll likely need to do when building a lifestream site? That's where things get tricky.

Luckily, Django comes with a built-in contenttypes framework. Among other things, it lets you define a generic (or meta) content type: a model whose rows are essentially pointers to objects in other applications. It also, as you'd expect, provides tools for getting at those objects for easy use in templates. I've used a method described at Webmonkey, though I should note that parts of their tutorial are based on pre-1.0 releases of Django.
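To make the "collection of pointers" idea concrete, here is a plain-Python sketch. It is not Django's API: the Photo and BlogPost classes, the TABLES dict, and the string type labels are all made up for illustration. Only the names StreamItem and content_object mirror the view code further down.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for models that live in two different apps.
@dataclass
class Photo:
    id: int
    title: str


@dataclass
class BlogPost:
    id: int
    headline: str


# Plays the role of the database: one "table" per content type.
TABLES = {
    "photo": {1: Photo(1, "First steps")},
    "blogpost": {7: BlogPost(7, "Hello, Django")},
}


@dataclass
class StreamItem:
    """A pointer to an object in another app: which type, and which row."""
    content_type: str
    object_id: int

    @property
    def content_object(self):
        # Resolve the pointer to the real object it refers to.
        return TABLES[self.content_type][self.object_id]


stream = [StreamItem("photo", 1), StreamItem("blogpost", 7)]
objects = [item.content_object for item in stream]
```

The stream itself stores nothing but type-and-id pairs, so a new kind of content only needs a new table, not a change to StreamItem.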

So here's the coolest part about using generic content types. When you're dealing with an app whose tools only reach one model at a time – like the uber-popular Django Tagging application – you've already got a way to bridge the gap.

What I wanted for my site was a single group of tags referenced across all content types. That is, visiting http://mysite.com/tags/apple/ should retrieve photos and blog posts and videos tagged apple. But the tagging app (at least at the time of this post) only natively allows per-model tag retrieval. So: http://mysite.com/photos/tags/apple/ and http://mysite.com/videos/tags/apple/.

I got around this problem by delegating tag retrieval and display to my lifestream meta content type. Here's the view function I'm using to retrieve items for a single tag, where "tag" is apple in the url http://mysite.com/tags/apple/:

    from django.shortcuts import render_to_response
    from django.template import RequestContext
    from tagging.models import Tag

    # StreamItem is the lifestream meta content type; import it from
    # wherever your own app defines it (the path isn't shown in this post).


    def tag_archive(request, tag):
        tag = tag.lower()
        tagged_items = []
        # Walk every stream item and follow its generic relation to the real object.
        for item in StreamItem.objects.all():
            obj = item.content_object
            # Compare case-insensitively against the tags attached to that object.
            tag_names = [t.name.lower() for t in Tag.objects.get_for_object(obj)]
            if tag in tag_names:
                tagged_items.append(obj)

        context = {
            'tag': tag,
            'tagged_items': tagged_items,
        }
        return render_to_response(
            'tags/item_list.html',
            context,
            context_instance=RequestContext(request)
        )
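
The tag argument arrives from the URL. Here is a rough sketch of that capture, using only Python's re module rather than a real URLconf. The pattern and the extract_tag helper are assumptions for illustration, not this site's actual configuration, and the example paths leave off the leading slash.

```python
import re

# Assumed pattern: one path segment after tags/, captured as "tag".
TAG_URL = re.compile(r'^tags/(?P<tag>[^/]+)/$')


def extract_tag(path):
    """Return the lowercased tag captured from a path, or None if it doesn't match."""
    match = TAG_URL.match(path)
    return match.group('tag').lower() if match else None


tag = extract_tag('tags/Apple/')
```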

Granted, since the site still has a relatively small number of site-wide objects, it's hard to tell how well this solution will hold up when we're retrieving hundreds (or thousands) of records per tag. It's a brutish way of getting at the data. But for the time being, it works. And well.
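
If the per-request scan ever gets slow, one option is to do the scan once and keep a lookup table from tag to objects. This is a minimal plain-Python sketch, assuming the tags for each object have already been fetched. build_tag_index and the sample data are hypothetical.

```python
from collections import defaultdict


def build_tag_index(tagged_objects):
    """Map each lowercased tag name to the objects carrying it, in one pass."""
    index = defaultdict(list)
    for obj, tag_names in tagged_objects:
        # A set, so an object tagged both "Apple" and "apple" is listed once.
        for name in {n.lower() for n in tag_names}:
            index[name].append(obj)
    return index


# Hypothetical sample data: (object, tag names) pairs standing in for stream items.
sample = [
    ("photo-1", ["Apple", "fruit"]),
    ("post-7", ["apple"]),
    ("video-3", ["baby"]),
]
index = build_tag_index(sample)
```

Where the index would live (a cache, a denormalized table) is left open here. The point is only that a tag page becomes a single lookup instead of a walk over every item.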

Listing all tags from the site, complete with usage counts, involves just a bit more work:

   1. def tag_list(request):
   2.     all_tags = []
   3.     items = StreamItem.objects.all()
   4.     for item in items:
   5.         object = item.content_object
   6.         object_tags = Tag.objects.get_for_object(object)
   7.         for object_tag in object_tags:
   8.             tag = object_tag.name.lower()
   9.             all_tags.append(tag)
  10.     def count_dupes(dupe_list):
  11.         unique_set = set(dupe_list)
  12.         return [{'name': item, 'count': dupe_list.count(item)} for item in unique_set]
  13.     tag_list = count_dupes(all_tags)
  14.     tag_list.sort(key = lambda k: k['count'], reverse=True)
  15.
  16.     context = {
  17.         'tag_list': tag_list,
  18.     }
  19.     return render_to_response(
  20.         'tags/tag_list.html',
  21.         context,
  22.         context_instance = RequestContext(request)
  23.     )

Counting the tag instances is what stumped me – but that's only because I'm new to Python (and, honestly, programming in general). My solution to that problem starts on line 10: a helper function that uses a list comprehension to build one dictionary per unique tag, holding the tag's name and the number of times it's been used site-wide. Then on line 14, a lambda function sorts the list by usage count, most-used first.
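
For comparison, here is a sketch of the same counting step using collections.Counter from the standard library (available since Python 2.7, which postdates this post). count_tags is a hypothetical name. It counts every tag in one pass, where list.count() rescans the whole list once per unique tag.

```python
from collections import Counter


def count_tags(all_tags):
    """Return [{'name': ..., 'count': ...}] with the most-used tag first."""
    counts = Counter(all_tags)
    # most_common() returns (name, count) pairs sorted by count, descending.
    return [{'name': name, 'count': count} for name, count in counts.most_common()]


tag_list = count_tags(['apple', 'baby', 'apple', 'django', 'baby', 'apple'])
```

The result has the same shape as the list built by count_dupes, so a template written against one should work with the other.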

Wait – no links to the finished product?

Nope, not yet. Because I wasn't sure if what I wanted to do was possible given my modest skills, I've completed all the programming and still have no design to wrap it in. It's been an interesting experiment – and somewhat freeing. This design pattern might just stick.
