Python XML processing with lxml
Abstract
Describes the lxml package for reading and writing XML files with the Python programming language.
This publication is available in Web form and also as a PDF document. Please forward any comments to tcc-doc@nmt.edu.
Table of Contents
ElementTree represents XML
etree module
Comment() constructor
Element() constructor
ElementTree() constructor
fromstring() function: Create an element from a string
parse() function: Build an ElementTree from a file
ProcessingInstruction() constructor
QName() constructor
SubElement() constructor
tostring() function: Serialize as XML
XMLID() function: Convert text to XML with a dictionary of id values
class ElementTree: A complete XML document
ElementTree.find()
ElementTree.findall(): Find matching elements
ElementTree.findtext(): Retrieve the text content from an element
ElementTree.getiterator(): Make an iterator
ElementTree.getroot(): Find the root element
ElementTree.xpath(): Evaluate an XPath expression
ElementTree.write(): Translate back to XML
class Element: One element in the tree
Element instance
Element.append(): Add a new element child
Element.clear(): Make an element empty
Element.find(): Find a matching sub-element
Element.findall(): Find all matching sub-elements
Element.findtext(): Extract text content
Element.get(): Retrieve an attribute value with defaulting
Element.getchildren(): Get element children
Element.getiterator(): Make an iterator to walk a subtree
Element.getroottree(): Find the ElementTree containing this element
Element.insert(): Insert a new child element
Element.items(): Produce attribute names and values
Element.iterancestors(): Find an element's ancestors
Element.iterchildren(): Find all children
Element.iterdescendants(): Find all descendants
Element.itersiblings(): Find other children of the same parent
Element.keys(): Find all attribute names
Element.remove(): Remove a child element
Element.set(): Set an attribute value
Element.xpath(): Evaluate an XPath expression
etbuilder.py: A simplified XML builder module
etbuilder module
CLASS(): Adding class attributes
FOR(): Adding for attributes
subElement(): Adding a child element
addText(): Adding text content to an element
etbuilder
CLASS(): Helper function for adding CSS class attributes
FOR(): Helper function for adding XHTML for attributes
subElement(): Add a child element
addText(): Add text content to an element
class ElementMaker: The factory class
ElementMaker.__init__(): Constructor
ElementMaker.__call__(): Handle calls to the factory instance
ElementMaker.__handleArg(): Process one positional argument
ElementMaker.__getattr__(): Handle arbitrary method calls
testetbuilder: A test driver for etbuilder
rnc_validate: A module to validate XML against a Relax NG schema
rnc_validate module
rnc_validate module
rnc_validate.py: Prologue
RelaxException
class RelaxValidator
RelaxValidator.validate()
RelaxValidator.__init__(): Constructor
RelaxValidator.__makeRNG(): Find or create an .rng file
RelaxValidator.__getModTime(): When was this file last changed?
RelaxValidator.__trang(): Translate .rnc to .rng format
main()
checkArgs()
usage()
fatal()
message()
validateFile()
With the continued growth of both Python and XML, there is a plethora of packages that help you read, generate, and modify XML files from Python scripts. Compared to most of them, the lxml package has two big advantages:
Performance. Reading and writing even fairly large XML files takes an almost imperceptible amount of time.
Ease of programming. The lxml package is based on ElementTree, which Fredrik Lundh invented to simplify and streamline XML processing.
lxml is similar in many ways to two other, earlier packages:
Fredrik Lundh continues to maintain his original version of ElementTree.
xml.etree.ElementTree is now an official part of the Python library. There is a C-language version called cElementTree which may be even faster than lxml for some applications.
However, the author prefers lxml for providing a number of additional features that make life easier. In particular, support for XPath makes it considerably easier to manage more complex XML structures.
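As a rough illustration of the comparison above, the following sketch parses a small document and queries it twice: once with the path syntax that lxml and xml.etree.ElementTree share, and once with a query that needs a numeric comparison. The sample document and the helper name titles_after are invented for this illustration. The sketch assumes lxml may or may not be installed, so it falls back to the standard library module, where the same filter has to be written by hand in Python.

```python
# Prefer lxml when it is installed; otherwise fall back to the standard
# library module, which shares the core ElementTree API.
try:
    from lxml import etree
    HAVE_LXML = True
except ImportError:
    import xml.etree.ElementTree as etree
    HAVE_LXML = False

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<catalog>
  <book id="b1" year="1999"><title>First</title></book>
  <book id="b2" year="2005"><title>Second</title></book>
  <book id="b3" year="2011"><title>Third</title></book>
</catalog>"""

root = etree.fromstring(SAMPLE)

# Available in both packages: the limited path syntax of findall().
all_titles = [t.text for t in root.findall("./book/title")]


def titles_after(root, year):
    """Return the titles of books whose year attribute exceeds `year`."""
    if HAVE_LXML:
        # lxml only: a full XPath expression with a numeric comparison
        # in the predicate; `y` is passed in as an XPath variable.
        return list(root.xpath("./book[@year > $y]/title/text()", y=year))
    # Standard library fallback: do the comparison in Python instead.
    return [b.findtext("title") for b in root.findall("book")
            if int(b.get("year")) > year]


recent = titles_after(root, 2000)
print(all_titles)
print(recent)
```

With lxml the filter is a single XPath expression. Without it, the selection logic moves into a Python loop, which is the kind of extra work the XPath support is meant to save as documents grow more complex.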
ElementTree represents XML