You are here: Home > Dive Into Python > HTTP Web Services > Putting it all together | << >> | ||||
Dive Into PythonPython from novice to pro |
You've seen all the pieces for building an intelligent HTTP web services client. Now let's see how they all fit together.
Example 11.17. The openanything function
This function is defined in openanything.py.
def openAnything(source, etag=None, lastmodified=None, agent=USER_AGENT): # non-HTTP code omitted for brevity if urlparse.urlparse(source)[0] == 'http': # open URL with urllib2 request = urllib2.Request(source) request.add_header('User-Agent', agent) if etag: request.add_header('If-None-Match', etag) if lastmodified: request.add_header('If-Modified-Since', lastmodified) request.add_header('Accept-encoding', 'gzip') opener = urllib2.build_opener(SmartRedirectHandler(), DefaultErrorHandler()) return opener.open(request)
Example 11.18. The fetch function
This function is defined in openanything.py.
def fetch(source, etag=None, last_modified=None, agent=USER_AGENT): '''Fetch data and metadata from a URL, file, stream, or string''' result = {} f = openAnything(source, etag, last_modified, agent) result['data'] = f.read() if hasattr(f, 'headers'): # save ETag, if the server sent one result['etag'] = f.headers.get('ETag') # save Last-Modified header, if the server sent one result['lastmodified'] = f.headers.get('Last-Modified') if f.headers.get('content-encoding', '') == 'gzip': # data came back gzip-compressed, decompress it result['data'] = gzip.GzipFile(fileobj=StringIO(result['data']])).read() if hasattr(f, 'url'): result['url'] = f.url result['status'] = 200 if hasattr(f, 'status'): result['status'] = f.status f.close() return result
Example 11.19. Using openanything.py
>>> import openanything >>> useragent = 'MyHTTPWebServicesApp/1.0' >>> url = 'http://diveintopython.org/redir/example301.xml' >>> params = openanything.fetch(url, agent=useragent) >>> params {'url': 'http://diveintomark.org/xml/atom.xml', 'lastmodified': 'Thu, 15 Apr 2004 19:45:21 GMT', 'etag': '"e842a-3e53-55d97640"', 'status': 301, 'data': '<?xml version="1.0" encoding="iso-8859-1"?> <feed version="0.3" <-- rest of data omitted for brevity -->'} >>> if params['status'] == 301: ... url = params['url'] >>> newparams = openanything.fetch( ... url, params['etag'], params['lastmodified'], useragent) >>> newparams {'url': 'http://diveintomark.org/xml/atom.xml', 'lastmodified': None, 'etag': '"e842a-3e53-55d97640"', 'status': 304, 'data': ''}
The very first time you fetch a resource, you don't have an ETag hash or Last-Modified date, so you'll leave those out. (They're optional parameters.) | |
What you get back is a dictionary of several useful headers, the HTTP status code, and the actual data returned from the server. openanything handles the gzip compression internally; you don't care about that at this level. | |
If you ever get a 301 status code, that's a permanent redirect, and you need to update your URL to the new address. | |
The second time you fetch the same resource, you have all sorts of information to pass back: a (possibly updated) URL, the ETag from the last time, the Last-Modified date from the last time, and of course your User-Agent. | |
What you get back is again a dictionary, but the data hasn't changed, so all you got was a 304 status code and no data. |
<< Handling compressed data |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | |
Summary >> |