IBM developerWorks: Charming Python: Tinkering with XML and PythonJun 25, 2000, 14:51 (1 Talkback[s])
(Other stories by David Mertz)
"A major element of getting started on working with XML in Python is sorting out the comparative capabilities of all the available modules. In this first installment of his new Python column, "Charming Python," David Mertz briefly describes the most popular and useful XML-related Python modules, and points you to resources for downloading individual modules and reading more about them. This article will help you determine which modules are most appropriate for your specific task."
"Python is in many ways an ideal language for working with XML documents. Like Perl, REBOL, REXX, and TCL, it is a flexible scripting language with powerful text manipulation capabilities. Moreover, more than most types of text files (or streams), XML documents typically encode rich and complex data structures. The familiar "read some lines and compare them to some regular expressions" style of text processing is generally not well suited to adequately parsing and processing XML. Python, fortunately (and more so than most other languages), has both straightforward ways of dealing with complex data structures (usually with classes and attributes), and a range of XML-related modules to aid in parsing, processing, and generating XML."
"One general concept to keep in mind about XML is that XML documents can be processed in either a validating or non-validating fashion. In the former type of processing, it is necessary to read a "Document Type Definition" (DTD) prior to reading an XML document it applies to. The processing in this case will evaluate not just the simple syntactic rules for XML documents in general, but also the specific grammatical constraints of the DTD. In many cases, non-validating processing is adequate (and generally both faster to run, and easier to program) -- we trust the document creator to follow the rules of the document domain. Most modules discussed below are non-validating; descriptions will indicate where validation options exist."