Structured Markup Processing Tools
Python provides a variety of modules for working with structured data markup. These include modules for the Standard Generalized Markup Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces for the Extensible Markup Language (XML).
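As a brief illustration of the HTML and XML support mentioned above, here is a minimal sketch using `html.parser` and `xml.etree.ElementTree`. The names `LinkCollector`, `collect_links`, and `element_texts` are hypothetical helpers introduced for this example only; they are not part of the standard library.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Hypothetical helper: records the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def collect_links(html_text):
    """Feed HTML text to the parser and return the collected hrefs."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


def element_texts(xml_text, tag):
    """Parse an XML string and return the text of every element named tag."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]
```

The HTML side is event-driven (the parser calls handler methods as it encounters tags), while the ElementTree side builds a tree that can be searched afterwards. Which style fits better depends on the size of the input and on how it will be queried.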
html: HyperText Markup Language support
html.parser: Simple HTML and XHTML parser
- HTMLParser
- Example HTML Parser Application
- HTMLParser Methods
  - HTMLParser.feed()
  - HTMLParser.close()
  - HTMLParser.reset()
  - HTMLParser.getpos()
  - HTMLParser.get_starttag_text()
  - HTMLParser.handle_starttag()
  - HTMLParser.handle_endtag()
  - HTMLParser.handle_startendtag()
  - HTMLParser.handle_data()
  - HTMLParser.handle_entityref()
  - HTMLParser.handle_charref()
  - HTMLParser.handle_comment()
  - HTMLParser.handle_decl()
  - HTMLParser.handle_pi()
  - HTMLParser.unknown_decl()
- Examples
html.entities: Definitions of HTML general entities
- XML Processing Modules
xml.etree.ElementTree: The ElementTree XML API
- Tutorial
- XPath support
- Reference
- XInclude support
- Reference
- Functions
- Element Objects
  - Element
  - Element.tag
  - Element.text
  - Element.tail
  - Element.attrib
  - Element.clear()
  - Element.get()
  - Element.items()
  - Element.keys()
  - Element.set()
  - Element.append()
  - Element.extend()
  - Element.find()
  - Element.findall()
  - Element.findtext()
  - Element.insert()
  - Element.iter()
  - Element.iterfind()
  - Element.itertext()
  - Element.makeelement()
  - Element.remove()
- ElementTree Objects
- QName Objects
- TreeBuilder Objects
- XMLParser Objects
- XMLPullParser Objects
- Exceptions
xml.dom: The Document Object Model API
- Module Contents
- Objects in the DOM
- DOMImplementation Objects
- Node Objects
  - Node.nodeType
  - Node.parentNode
  - Node.attributes
  - Node.previousSibling
  - Node.nextSibling
  - Node.childNodes
  - Node.firstChild
  - Node.lastChild
  - Node.localName
  - Node.prefix
  - Node.namespaceURI
  - Node.nodeName
  - Node.nodeValue
  - Node.hasAttributes()
  - Node.hasChildNodes()
  - Node.isSameNode()
  - Node.appendChild()
  - Node.insertBefore()
  - Node.removeChild()
  - Node.replaceChild()
  - Node.normalize()
  - Node.cloneNode()
- NodeList Objects
- DocumentType Objects
- Document Objects
- Element Objects
  - Element.tagName
  - Element.getElementsByTagName()
  - Element.getElementsByTagNameNS()
  - Element.hasAttribute()
  - Element.hasAttributeNS()
  - Element.getAttribute()
  - Element.getAttributeNode()
  - Element.getAttributeNS()
  - Element.getAttributeNodeNS()
  - Element.removeAttribute()
  - Element.removeAttributeNode()
  - Element.removeAttributeNS()
  - Element.setAttribute()
  - Element.setAttributeNode()
  - Element.setAttributeNodeNS()
  - Element.setAttributeNS()
- Attr Objects
- NamedNodeMap Objects
- Comment Objects
- Text and CDATASection Objects
- ProcessingInstruction Objects
- Exceptions
- Conformance
xml.dom.minidom: Minimal DOM implementation
xml.dom.pulldom: Support for building partial DOM trees
xml.sax: Support for SAX2 parsers
xml.sax.handler: Base classes for SAX handlers
- ContentHandler
- DTDHandler
- EntityResolver
- ErrorHandler
- LexicalHandler
- feature_namespaces
- feature_namespace_prefixes
- feature_string_interning
- feature_validation
- feature_external_ges
- feature_external_pes
- all_features
- property_lexical_handler
- property_declaration_handler
- property_dom_node
- property_xml_string
- all_properties
- ContentHandler Objects
  - ContentHandler.setDocumentLocator()
  - ContentHandler.startDocument()
  - ContentHandler.endDocument()
  - ContentHandler.startPrefixMapping()
  - ContentHandler.endPrefixMapping()
  - ContentHandler.startElement()
  - ContentHandler.endElement()
  - ContentHandler.startElementNS()
  - ContentHandler.endElementNS()
  - ContentHandler.characters()
  - ContentHandler.ignorableWhitespace()
  - ContentHandler.processingInstruction()
  - ContentHandler.skippedEntity()
- DTDHandler Objects
- EntityResolver Objects
- ErrorHandler Objects
- LexicalHandler Objects
xml.sax.saxutils: SAX Utilities
xml.sax.xmlreader: Interface for XML parsers
- XMLReader
- IncrementalParser
- Locator
- InputSource
- AttributesImpl
- AttributesNSImpl
- XMLReader Objects
  - XMLReader.parse()
  - XMLReader.getContentHandler()
  - XMLReader.setContentHandler()
  - XMLReader.getDTDHandler()
  - XMLReader.setDTDHandler()
  - XMLReader.getEntityResolver()
  - XMLReader.setEntityResolver()
  - XMLReader.getErrorHandler()
  - XMLReader.setErrorHandler()
  - XMLReader.setLocale()
  - XMLReader.getFeature()
  - XMLReader.setFeature()
  - XMLReader.getProperty()
  - XMLReader.setProperty()
- IncrementalParser Objects
- Locator Objects
- InputSource Objects
- The Attributes Interface
- The AttributesNS Interface
xml.parsers.expat: Fast XML parsing using Expat
- ExpatError
- error
- XMLParserType
- ErrorString()
- ParserCreate()
- XMLParser Objects
  - xmlparser.Parse()
  - xmlparser.ParseFile()
  - xmlparser.SetBase()
  - xmlparser.GetBase()
  - xmlparser.GetInputContext()
  - xmlparser.ExternalEntityParserCreate()
  - xmlparser.SetParamEntityParsing()
  - xmlparser.UseForeignDTD()
  - xmlparser.buffer_size
  - xmlparser.buffer_text
  - xmlparser.buffer_used
  - xmlparser.ordered_attributes
  - xmlparser.specified_attributes
  - xmlparser.ErrorByteIndex
  - xmlparser.ErrorCode
  - xmlparser.ErrorColumnNumber
  - xmlparser.ErrorLineNumber
  - xmlparser.CurrentByteIndex
  - xmlparser.CurrentColumnNumber
  - xmlparser.CurrentLineNumber
  - xmlparser.XmlDeclHandler()
  - xmlparser.StartDoctypeDeclHandler()
  - xmlparser.EndDoctypeDeclHandler()
  - xmlparser.ElementDeclHandler()
  - xmlparser.AttlistDeclHandler()
  - xmlparser.StartElementHandler()
  - xmlparser.EndElementHandler()
  - xmlparser.ProcessingInstructionHandler()
  - xmlparser.CharacterDataHandler()
  - xmlparser.UnparsedEntityDeclHandler()
  - xmlparser.EntityDeclHandler()
  - xmlparser.NotationDeclHandler()
  - xmlparser.StartNamespaceDeclHandler()
  - xmlparser.EndNamespaceDeclHandler()
  - xmlparser.CommentHandler()
  - xmlparser.StartCdataSectionHandler()
  - xmlparser.EndCdataSectionHandler()
  - xmlparser.DefaultHandler()
  - xmlparser.DefaultHandlerExpand()
  - xmlparser.NotStandaloneHandler()
  - xmlparser.ExternalEntityRefHandler()
- ExpatError Exceptions
- Example
- Content Model Descriptions
- Expat error constants
  - codes
  - messages
  - XML_ERROR_ASYNC_ENTITY
  - XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
  - XML_ERROR_BAD_CHAR_REF
  - XML_ERROR_BINARY_ENTITY_REF
  - XML_ERROR_DUPLICATE_ATTRIBUTE
  - XML_ERROR_INCORRECT_ENCODING
  - XML_ERROR_INVALID_TOKEN
  - XML_ERROR_JUNK_AFTER_DOC_ELEMENT
  - XML_ERROR_MISPLACED_XML_PI
  - XML_ERROR_NO_ELEMENTS
  - XML_ERROR_NO_MEMORY
  - XML_ERROR_PARAM_ENTITY_REF
  - XML_ERROR_PARTIAL_CHAR
  - XML_ERROR_RECURSIVE_ENTITY_REF
  - XML_ERROR_SYNTAX
  - XML_ERROR_TAG_MISMATCH
  - XML_ERROR_UNCLOSED_TOKEN
  - XML_ERROR_UNDEFINED_ENTITY
  - XML_ERROR_UNKNOWN_ENCODING
  - XML_ERROR_UNCLOSED_CDATA_SECTION
  - XML_ERROR_EXTERNAL_ENTITY_HANDLING
  - XML_ERROR_NOT_STANDALONE
  - XML_ERROR_UNEXPECTED_STATE
  - XML_ERROR_ENTITY_DECLARED_IN_PE
  - XML_ERROR_FEATURE_REQUIRES_XML_DTD
  - XML_ERROR_CANT_CHANGE_FEATURE_ONCE_PARSING
  - XML_ERROR_UNBOUND_PREFIX
  - XML_ERROR_UNDECLARING_PREFIX
  - XML_ERROR_INCOMPLETE_PE
  - XML_ERROR_XML_DECL
  - XML_ERROR_TEXT_DECL
  - XML_ERROR_PUBLICID
  - XML_ERROR_SUSPENDED
  - XML_ERROR_NOT_SUSPENDED
  - XML_ERROR_ABORTED
  - XML_ERROR_FINISHED
  - XML_ERROR_SUSPEND_PE
  - XML_ERROR_RESERVED_PREFIX_XML
  - XML_ERROR_RESERVED_PREFIX_XMLNS
  - XML_ERROR_RESERVED_NAMESPACE_URI
  - XML_ERROR_INVALID_ARGUMENT
  - XML_ERROR_NO_BUFFER
  - XML_ERROR_AMPLIFICATION_LIMIT_BREACH