You are here: Home > Dive Into Python > Scripts and Streams > Putting it all together | << >> | ||||
Dive Into PythonPython from novice to pro |
You've covered a lot of ground. Let's step back and see how all the pieces fit together.
To start with, this is a script that takes its arguments on the command line, using the getopt module.
def main(argv): ... try: opts, args = getopt.getopt(argv, "hg:d", ["help", "grammar="]) except getopt.GetoptError: ... for opt, arg in opts: ...
You create a new instance of the KantGenerator class, and pass it the grammar file and source that may or may not have been specified on the command line.
k = KantGenerator(grammar, source)
The KantGenerator instance automatically loads the grammar, which is an XML file. You use your custom openAnything function to open the file (which could be stored in a local file or a remote web server), then use the built-in minidom parsing functions to parse the XML into a tree of Python objects.
def _load(self, source): sock = toolbox.openAnything(source) xmldoc = minidom.parse(sock).documentElement sock.close()
Oh, and along the way, you take advantage of your knowledge of the structure of the XML document to set up a little cache of references, which are just elements in the XML document.
def loadGrammar(self, grammar): for ref in self.grammar.getElementsByTagName("ref"): self.refs[ref.attributes["id"].value] = ref
If you specified some source material on the command line, you use that; otherwise you rip through the grammar looking for the "top-level" reference (that isn't referenced by anything else) and use that as a starting point.
def getDefaultSource(self): xrefs = {} for xref in self.grammar.getElementsByTagName("xref"): xrefs[xref.attributes["id"].value] = 1 xrefs = xrefs.keys() standaloneXrefs = [e for e in self.refs.keys() if e not in xrefs] return '<xref id="%s"/>' % random.choice(standaloneXrefs)
Now you rip through the source material. The source material is also XML, and you parse it one node at a time. To keep the code separated and more maintainable, you use separate handlers for each node type.
def parse_Element(self, node): handlerMethod = getattr(self, "do_%s" % node.tagName) handlerMethod(node)
You bounce through the grammar, parsing all the children of each p element,
def do_p(self, node): ... if doit: for child in node.childNodes: self.parse(child)
replacing choice elements with a random child,
def do_choice(self, node): self.parse(self.randomChildElement(node))
and replacing xref elements with a random child of the corresponding ref element, which you previously cached.
def do_xref(self, node): id = node.attributes["id"].value self.parse(self.randomChildElement(self.refs[id]))
Eventually, you parse your way down to plain text,
def parse_Text(self, node): text = node.data ... self.pieces.append(text)
which you print out.
def main(argv): ... k = KantGenerator(grammar, source) print k.output()
<< Handling command-line arguments |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | |
Summary >> |