One thing that's missing from most CF-powered, dynamically-generated syndication feeds is support for Conditional GET. Here's what you need to know.
Why Conditional GET?
When aggregators (both client- and server-side) retrieve your feed from the Web, they're looking for new items that have been posted since the last time they checked. If they spot something new, they add it to their databases and wait for the next update run.
Unfortunately, there's no way to look inside a feed without downloading it first. So if an aggregator's user has set the app to poll your feed once every hour, it will dutifully download the full content of that document 24 times each day... even if you haven't updated the feed in a week. If you've got a good-sized feed of ~100Kb or so, the process can start to look like a not-inconsiderable waste of bandwidth.
That's where conditional GET comes in. If a client application (like an RSS aggregator) supports conditional GET, it remembers the Last-Modified and ETag response headers your server sent with the feed, and sends those values back as If-Modified-Since and If-None-Match request headers on every subsequent GET. Your server then examines those headers to decide whether the document has been updated since the last request... if it hasn't, the server answers with a bodiless "304 Not Modified" response, the client stops right there, and you never take that 100Kb hit.
How Does It Work In ColdFusion?
You may be able to get by with using <cfcache>. But if you're like me and use custom caching routines, or just want precise control over things, that may not be an option. And 'sides, doing it manually isn't hard at all.
- Determine the Last-Modified date of the most recent item in your feed. Chances are, that will be the latest blog entry's pubdate, atom:issued, or dc:date.
- Generate an ETag. An ETag can be pretty much anything, as long as you can be sure that it will be reasonably unique over time. Many HTTP servers create an MD5 hash of the document and use that... personally, I just reuse the last-modified timestamp. It isn't ideal, but it works for RSS in most cases.
- Take your date (formatted as an HTTP date, which GetHTTPTimeString() will do for you) and etag and return them to the client with <cfheader>, like so:
<cfheader name="Last-Modified" value="#mydate#">
<cfheader name="ETag" value="""#myetag#""">
- When a user-agent requests the RSS page, use GetHttpRequestData() to see if it sends If-Modified-Since and/or If-None-Match HTTP request headers. If it does, that means the client has visited you before, and has stored the Last-Modified and ETag values that you provided... proceed to the next step. If it doesn't, deliver your content as usual.
- Compare the If-Modified-Since value against your current Last-Modified value. The ideal approach would be to use CF's date functions for this, but a simple string comparison will usually do the job.
- Compare the If-None-Match value against your current ETag.
- If both comparisons return true, you can immediately halt processing the page and issue something like:
<cfheader statuscode = "304" statustext = "Not Modified">
<cfabort>
- If either comparison returns false, deliver your content as usual.
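Pulled together, the steps above might look something like the sketch below. Treat it as a starting point rather than a finished implementation: the variable names (lastUpdated, quotedEtag, notModified and friends) are my own, the hard-coded date stands in for whatever query gives you your newest item's timestamp, and it assumes the client echoes back exactly the values you sent (it makes no attempt to handle weak ETags or a list of several ETags in If-None-Match).

```cfm
<!--- Hypothetical: in real code this would come from your newest feed item. --->
<cfset lastUpdated = CreateDateTime(2004, 6, 3, 22, 23, 52)>

<!--- Format the date as an HTTP date string in GMT. --->
<cfset mydate = GetHTTPTimeString(lastUpdated)>

<!--- Derive the ETag from the timestamp; hashing keeps it free of spaces and commas. --->
<cfset myetag = Hash(mydate)>
<!--- ETag values are sent wrapped in literal double quotes. --->
<cfset quotedEtag = Chr(34) & myetag & Chr(34)>

<!--- Grab the request headers the client sent. --->
<cfset req = GetHttpRequestData()>
<cfset reqHeaders = req.headers>
<cfset hasIMS = StructKeyExists(reqHeaders, "If-Modified-Since")>
<cfset hasINM = StructKeyExists(reqHeaders, "If-None-Match")>

<!--- Only consider a 304 if the client sent at least one validator. --->
<cfset notModified = (hasIMS or hasINM)>

<!--- Plain string comparisons; Compare() returns 0 when the strings match. --->
<cfif hasIMS>
  <cfif Compare(reqHeaders["If-Modified-Since"], mydate) neq 0>
    <cfset notModified = false>
  </cfif>
</cfif>
<cfif hasINM>
  <cfif Compare(reqHeaders["If-None-Match"], quotedEtag) neq 0>
    <cfset notModified = false>
  </cfif>
</cfif>

<!--- Send the current validators either way. --->
<cfheader name="Last-Modified" value="#mydate#">
<cfheader name="ETag" value="#quotedEtag#">

<cfif notModified>
  <!--- Nothing has changed: answer with an empty 304 and stop. --->
  <cfheader statuscode="304" statustext="Not Modified">
  <cfabort>
</cfif>

<!--- Otherwise fall through and generate the feed as usual. --->
```

This needs to run before any feed content is generated, so that the abort happens before your expensive queries do.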
That's pretty much it. If you'd like to test your code, I recommend using the LiveHTTPHeaders extension for Firefox. As you request your new conditional-GET-enabled feed, you should see the various headers being sent and received in real time.
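If you'd rather test from ColdFusion itself, a rough check with <cfhttp> could look like the following. The feed URL is a placeholder, the variable names are mine, and it assumes a CF version that supports <cfhttpparam type="header">. The idea is simply to fetch the feed once, then replay the validators it returned and look at the status of the second response.

```cfm
<!--- Hypothetical feed URL; substitute your own. --->
<cfset feedUrl = "http://www.example.com/rss.cfm">

<!--- First request: an ordinary GET, to collect the validators. --->
<cfhttp url="#feedUrl#" method="get"></cfhttp>
<cfset firstHeaders = cfhttp.responseHeader>

<cfif StructKeyExists(firstHeaders, "Last-Modified") and StructKeyExists(firstHeaders, "ETag")>
  <cfset lastModified = firstHeaders["Last-Modified"]>
  <cfset etag = firstHeaders["ETag"]>

  <!--- Second request: send the validators back, as an aggregator would. --->
  <cfhttp url="#feedUrl#" method="get">
    <cfhttpparam type="header" name="If-Modified-Since" value="#lastModified#">
    <cfhttpparam type="header" name="If-None-Match" value="#etag#">
  </cfhttp>

  <!--- If conditional GET is working, this status should start with 304. --->
  <cfoutput>Second response: #cfhttp.statusCode#</cfoutput>
<cfelse>
  <cfoutput>The feed did not return both Last-Modified and ETag headers.</cfoutput>
</cfif>
```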
UPDATED: Cleaned up some potentially misleading bits in the 4th step, and made the last step explicit.
06-03-2004 10:23:52PM
category: XML
related topics: (RSS) (ColdFusion) (CFMX) (CF) (Atom) (syndication)