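When parsing with DOM, the whole document has to be read into memory and built into a DOM tree, and the XML is then processed through the interfaces that DOM provides. This is convenient when the file is small. When the XML file is large, however, memory use and parsing time grow with the size of the document, and that is where SAX parsing becomes necessary. SAX is event-driven: it turns the XML document into a series of events and leaves it to a separate event handler to decide what to do with each one. Parsing happens while the document is being read. The rest of this post briefly walks through how SAX parsing is implemented.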
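1. Main objects
SAXParserFactory: the parser factory
SAXParser: the parser, obtained from the parser factory
ContentHandler, DTDHandler, ErrorHandler, EntityResolver: the event handler interfaces
DefaultHandler: implements the four event interfaces above with default methods; in practice you simply extend DefaultHandler and override the callbacks you need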
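To make the relationship between these objects concrete, here is a minimal sketch of the factory, parser and handler working together. The class name `SaxObjectsSketch` and the helper `countElements` are hypothetical names chosen for illustration; only the JAXP and SAX types come from the list above.

```java
import java.io.StringReader;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxObjectsSketch {

    /** Counts the startElement events the parser reports for the given XML text. */
    static int countElements(String xml) {
        final int[] count = {0};
        // DefaultHandler supplies default callbacks; override only what is needed.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance(); // factory
            SAXParser parser = factory.newSAXParser();                 // parser from factory
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Illustration only: collapse checked exceptions into an unchecked one.
            throw new IllegalStateException("parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<world><comuntry id=\"1\"><name>China</name></comuntry></world>";
        System.out.println("elements seen: " + countElements(xml));
    }
}
```

Nothing is kept in memory here except the counter, which is the practical difference from building a DOM tree.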
2. The XML document
This is the same XML file that was used in the earlier DOM parsing example:
<?xml version="1.0" encoding="UTF-8"?>
<world>
	<comuntry id="1">
		<name>China</name>
		<capital>Beijing</capital>
		<population>1234</population>
		<area>960</area>
	</comuntry>
	<comuntry id="2">
		<name id="">America</name>
		<capital>Washington</capital>
		<population>234</population>
		<area>900</area>
	</comuntry>
	<comuntry id="3">
		<name >Japan</name>
		<capital>Tokyo</capital>
		<population>234</population>
		<area>60</area>
	</comuntry>
	<comuntry id="4">
		<name >Russia</name>
		<capital>Moscow</capital>
		<population>34</population>
		<area>1960</area>
	</comuntry>
</world>
3. A look at the main interfaces
EntityResolver:
package org.xml.sax;

import java.io.IOException;

public interface EntityResolver {

    /**
     * Allow the application to resolve external entities.
     *
     * <p>The parser will call this method before opening any external
     * entity except the top-level document entity. Such entities include
     * the external DTD subset and external parameter entities referenced
     * within the DTD (in either case, only if the parser reads external
     * parameter entities), and external general entities referenced
     * within the document element (if the parser reads external general
     * entities). The application may request that the parser locate
     * the entity itself, that it use an alternative URI, or that it
     * use data provided by the application (as a character or byte
     * input stream).</p>
     *
     * <p>Application writers can use this method to redirect external
     * system identifiers to secure and/or local URIs, to look up
     * public identifiers in a catalogue, or to read an entity from a
     * database or other input source (including, for example, a dialog
     * box). Neither XML nor SAX specifies a preferred policy for using
     * public or system IDs to resolve resources. However, SAX specifies
     * how to interpret any InputSource returned by this method, and that
     * if none is returned, then the system ID will be dereferenced as
     * a URL. </p>
     *
     * <p>If the system identifier is a URL, the SAX parser must
     * resolve it fully before reporting it to the application.</p>
     *
     * @param publicId The public identifier of the external entity
     *        being referenced, or null if none was supplied.
     * @param systemId The system identifier of the external entity
     *        being referenced.
     * @return An InputSource object describing the new input source,
     *         or null to request that the parser open a regular
     *         URI connection to the system identifier.
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @exception java.io.IOException A Java-specific IO exception,
     *            possibly the result of creating a new InputStream
     *            or Reader for the InputSource.
     * @see org.xml.sax.InputSource
     */
    public abstract InputSource resolveEntity (String publicId,
                                               String systemId)
        throws SAXException, IOException;

}
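As the Javadoc says, an application can use resolveEntity to redirect external identifiers or supply its own data. A minimal sketch of that idea follows: a resolver that records what the parser asks for and answers every request with an empty in-memory source, so no network or file access is attempted for the external DTD. The class name `OfflineResolverSketch`, the helper `parseOffline` and the URL `http://example.invalid/world.dtd` are assumptions made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class OfflineResolverSketch {

    /** Parses the XML and returns the system identifiers the parser asked to resolve. */
    static List<String> parseOffline(String xml) {
        List<String> requested = new ArrayList<>();
        // Returning a non-null InputSource tells the parser to read that instead
        // of opening a connection to the system identifier.
        EntityResolver resolver = (publicId, systemId) -> {
            requested.add(systemId);
            return new InputSource(new StringReader(""));
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setEntityResolver(resolver);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Illustration only: collapse checked exceptions into an unchecked one.
            throw new IllegalStateException("parse failed", e);
        }
        return requested;
    }

    public static void main(String[] args) {
        // Hypothetical document that points at an external DTD.
        String xml = "<!DOCTYPE world SYSTEM \"http://example.invalid/world.dtd\"><world/>";
        System.out.println("resolver was asked for: " + parseOffline(xml));
    }
}
```

Returning null instead would ask the parser to open the system identifier itself, which is the default behaviour described in the Javadoc.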
DTDHandler:
package org.xml.sax;

/**
 * Receive notification of basic DTD-related events.
 *
 * <blockquote>
 * <em>This module, both source code and documentation, is in the
 * Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
 * See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
 * for further information.
 * </blockquote>
 *
 * <p>If a SAX application needs information about notations and
 * unparsed entities, then the application implements this
 * interface and registers an instance with the SAX parser using
 * the parser's setDTDHandler method. The parser uses the
 * instance to report notation and unparsed entity declarations to
 * the application.</p>
 *
 * <p>Note that this interface includes only those DTD events that
 * the XML recommendation <em>requires</em> processors to report:
 * notation and unparsed entity declarations.</p>
 *
 * <p>The SAX parser may report these events in any order, regardless
 * of the order in which the notations and unparsed entities were
 * declared; however, all DTD events must be reported after the
 * document handler's startDocument event, and before the first
 * startElement event.
 * (If the {@link org.xml.sax.ext.LexicalHandler LexicalHandler} is
 * used, these events must also be reported before the endDTD event.)
 * </p>
 *
 * <p>It is up to the application to store the information for
 * future use (perhaps in a hash table or object tree).
 * If the application encounters attributes of type "NOTATION",
 * "ENTITY", or "ENTITIES", it can use the information that it
 * obtained through this interface to find the entity and/or
 * notation corresponding with the attribute value.</p>
 *
 * @since SAX 1.0
 * @author David Megginson
 * @see org.xml.sax.XMLReader#setDTDHandler
 */
public interface DTDHandler {

    /**
     * Receive notification of a notation declaration event.
     *
     * <p>It is up to the application to record the notation for later
     * reference, if necessary;
     * notations may appear as attribute values and in unparsed entity
     * declarations, and are sometime used with processing instruction
     * target names.</p>
     *
     * <p>At least one of publicId and systemId must be non-null.
     * If a system identifier is present, and it is a URL, the SAX
     * parser must resolve it fully before passing it to the
     * application through this event.</p>
     *
     * <p>There is no guarantee that the notation declaration will be
     * reported before any unparsed entities that use it.</p>
     *
     * @param name The notation name.
     * @param publicId The notation's public identifier, or null if
     *        none was given.
     * @param systemId The notation's system identifier, or null if
     *        none was given.
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @see #unparsedEntityDecl
     * @see org.xml.sax.Attributes
     */
    public abstract void notationDecl (String name,
                                       String publicId,
                                       String systemId)
        throws SAXException;

    /**
     * Receive notification of an unparsed entity declaration event.
     *
     * <p>Note that the notation name corresponds to a notation
     * reported by the {@link #notationDecl notationDecl} event.
     * It is up to the application to record the entity for later
     * reference, if necessary;
     * unparsed entities may appear as attribute values.
     * </p>
     *
     * <p>If the system identifier is a URL, the parser must resolve it
     * fully before passing it to the application.</p>
     *
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @param name The unparsed entity's name.
     * @param publicId The entity's public identifier, or null if none
     *        was given.
     * @param systemId The entity's system identifier.
     * @param notationName The name of the associated notation.
     * @see #notationDecl
     * @see org.xml.sax.Attributes
     */
    public abstract void unparsedEntityDecl (String name,
                                             String publicId,
                                             String systemId,
                                             String notationName)
        throws SAXException;

}
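The sample world.xml has no DTD, so these two callbacks never fire for it. To see them in action, here is a small sketch that parses a made-up document with an internal DTD subset declaring one notation and one unparsed entity. The class name `DtdEventsSketch`, the helper `collectDtdEvents` and the declarations `gif` and `logo` are all hypothetical. Only the names are recorded, because the parser may rewrite the system identifier into an absolute URL.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class DtdEventsSketch {

    /** Returns a short description of every DTD event reported for the XML text. */
    static List<String> collectDtdEvents(String xml) {
        List<String> events = new ArrayList<>();
        DTDHandler handler = new DTDHandler() {
            @Override
            public void notationDecl(String name, String publicId, String systemId) {
                events.add("notation:" + name);
            }

            @Override
            public void unparsedEntityDecl(String name, String publicId,
                                           String systemId, String notationName) {
                events.add("entity:" + name + "->" + notationName);
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setDTDHandler(handler); // register for the DTD role explicitly
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Illustration only: collapse checked exceptions into an unchecked one.
            throw new IllegalStateException("parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc [\n"
                + "  <!NOTATION gif PUBLIC \"image/gif\">\n"
                + "  <!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>\n"
                + "]>\n"
                + "<doc/>";
        // The order of the two events is not guaranteed by SAX.
        System.out.println(collectDtdEvents(xml));
    }
}
```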
ContentHandler:
package org.xml.sax;

/**
 * Receive notification of the logical content of a document.
 *
 * <blockquote>
 * <em>This module, both source code and documentation, is in the
 * Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
 * See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
 * for further information.
 * </blockquote>
 *
 * <p>This is the main interface that most SAX applications
 * implement: if the application needs to be informed of basic parsing
 * events, it implements this interface and registers an instance with
 * the SAX parser using the {@link org.xml.sax.XMLReader#setContentHandler
 * setContentHandler} method. The parser uses the instance to report
 * basic document-related events like the start and end of elements
 * and character data.</p>
 *
 * <p>The order of events in this interface is very important, and
 * mirrors the order of information in the document itself. For
 * example, all of an element's content (character data, processing
 * instructions, and/or subelements) will appear, in order, between
 * the startElement event and the corresponding endElement event.</p>
 *
 * <p>This interface is similar to the now-deprecated SAX 1.0
 * DocumentHandler interface, but it adds support for Namespaces
 * and for reporting skipped entities (in non-validating XML
 * processors).</p>
 *
 * <p>Implementors should note that there is also a
 * <code>ContentHandler</code> class in the <code>java.net</code>
 * package; that means that it's probably a bad idea to do</p>
 *
 * <pre>import java.net.*;
 * import org.xml.sax.*;
 * </pre>
 *
 * <p>In fact, "import ...*" is usually a sign of sloppy programming
 * anyway, so the user should consider this a feature rather than a
 * bug.</p>
 *
 * @since SAX 2.0
 * @author David Megginson
 * @see org.xml.sax.XMLReader
 * @see org.xml.sax.DTDHandler
 * @see org.xml.sax.ErrorHandler
 */
public interface ContentHandler {

    /**
     * Receive an object for locating the origin of SAX document events.
     *
     * <p>SAX parsers are strongly encouraged (though not absolutely
     * required) to supply a locator: if it does so, it must supply
     * the locator to the application by invoking this method before
     * invoking any of the other methods in the ContentHandler
     * interface.</p>
     *
     * <p>The locator allows the application to determine the end
     * position of any document-related event, even if the parser is
     * not reporting an error. Typically, the application will
     * use this information for reporting its own errors (such as
     * character content that does not match an application's
     * business rules). The information returned by the locator
     * is probably not sufficient for use with a search engine.</p>
     *
     * <p>Note that the locator will return correct information only
     * during the invocation SAX event callbacks after
     * {@link #startDocument startDocument} returns and before
     * {@link #endDocument endDocument} is called. The
     * application should not attempt to use it at any other time.</p>
     *
     * @param locator an object that can return the location of
     *                any SAX document event
     * @see org.xml.sax.Locator
     */
    public void setDocumentLocator (Locator locator);

    /**
     * Receive notification of the beginning of a document.
     *
     * <p>The SAX parser will invoke this method only once, before any
     * other event callbacks (except for {@link #setDocumentLocator
     * setDocumentLocator}).</p>
     *
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     * @see #endDocument
     */
    public void startDocument ()
        throws SAXException;

    /**
     * Receive notification of the end of a document.
     *
     * <p><strong>There is an apparent contradiction between the
     * documentation for this method and the documentation for {@link
     * org.xml.sax.ErrorHandler#fatalError}. Until this ambiguity is
     * resolved in a future major release, clients should make no
     * assumptions about whether endDocument() will or will not be
     * invoked when the parser has reported a fatalError() or thrown
     * an exception.</strong></p>
     *
     * <p>The SAX parser will invoke this method only once, and it will
     * be the last method invoked during the parse. The parser shall
     * not invoke this method until it has either abandoned parsing
     * (because of an unrecoverable error) or reached the end of
     * input.</p>
     *
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     * @see #startDocument
     */
    public void endDocument()
        throws SAXException;

    /**
     * Begin the scope of a prefix-URI Namespace mapping.
     *
     * <p>The information from this event is not necessary for
     * normal Namespace processing: the SAX XML reader will
     * automatically replace prefixes for element and attribute
     * names when the <code>http://xml.org/sax/features/namespaces</code>
     * feature is <var>true</var> (the default).</p>
     *
     * <p>There are cases, however, when applications need to
     * use prefixes in character data or in attribute values,
     * where they cannot safely be expanded automatically; the
     * start/endPrefixMapping event supplies the information
     * to the application to expand prefixes in those contexts
     * itself, if necessary.</p>
     *
     * <p>Note that start/endPrefixMapping events are not
     * guaranteed to be properly nested relative to each other:
     * all startPrefixMapping events will occur immediately before the
     * corresponding {@link #startElement startElement} event,
     * and all {@link #endPrefixMapping endPrefixMapping}
     * events will occur immediately after the corresponding
     * {@link #endElement endElement} event,
     * but their order is not otherwise
     * guaranteed.</p>
     *
     * <p>There should never be start/endPrefixMapping events for the
     * "xml" prefix, since it is predeclared and immutable.</p>
     *
     * @param prefix the Namespace prefix being declared.
     *  An empty string is used for the default element namespace,
     *  which has no prefix.
     * @param uri the Namespace URI the prefix is mapped to
     * @throws org.xml.sax.SAXException the client may throw
     *            an exception during processing
     * @see #endPrefixMapping
     * @see #startElement
     */
    public void startPrefixMapping (String prefix, String uri)
        throws SAXException;

    /**
     * End the scope of a prefix-URI mapping.
     *
     * <p>See {@link #startPrefixMapping startPrefixMapping} for
     * details. These events will always occur immediately after the
     * corresponding {@link #endElement endElement} event, but the order of
     * {@link #endPrefixMapping endPrefixMapping} events is not otherwise
     * guaranteed.</p>
     *
     * @param prefix the prefix that was being mapped.
     *  This is the empty string when a default mapping scope ends.
     * @throws org.xml.sax.SAXException the client may throw
     *            an exception during processing
     * @see #startPrefixMapping
     * @see #endElement
     */
    public void endPrefixMapping (String prefix)
        throws SAXException;

    /**
     * Receive notification of the beginning of an element.
     *
     * <p>The Parser will invoke this method at the beginning of every
     * element in the XML document; there will be a corresponding
     * {@link #endElement endElement} event for every startElement event
     * (even when the element is empty). All of the element's content will be
     * reported, in order, before the corresponding endElement
     * event.</p>
     *
     * <p>This event allows up to three name components for each
     * element:</p>
     *
     * <ol>
     * <li>the Namespace URI;</li>
     * <li>the local name; and</li>
     * <li>the qualified (prefixed) name.</li>
     * </ol>
     *
     * <p>Any or all of these may be provided, depending on the
     * values of the <var>http://xml.org/sax/features/namespaces</var>
     * and the <var>http://xml.org/sax/features/namespace-prefixes</var>
     * properties:</p>
     *
     * <ul>
     * <li>the Namespace URI and local name are required when
     * the namespaces property is <var>true</var> (the default), and are
     * optional when the namespaces property is <var>false</var> (if one is
     * specified, both must be);</li>
     * <li>the qualified name is required when the namespace-prefixes property
     * is <var>true</var>, and is optional when the namespace-prefixes property
     * is <var>false</var> (the default).</li>
     * </ul>
     *
     * <p>Note that the attribute list provided will contain only
     * attributes with explicit values (specified or defaulted):
     * #IMPLIED attributes will be omitted. The attribute list
     * will contain attributes used for Namespace declarations
     * (xmlns* attributes) only if the
     * <code>http://xml.org/sax/features/namespace-prefixes</code>
     * property is true (it is false by default, and support for a
     * true value is optional).</p>
     *
     * <p>Like {@link #characters characters()}, attribute values may have
     * characters that need more than one <code>char</code> value. </p>
     *
     * @param uri the Namespace URI, or the empty string if the
     *        element has no Namespace URI or if Namespace
     *        processing is not being performed
     * @param localName the local name (without prefix), or the
     *        empty string if Namespace processing is not being
     *        performed
     * @param qName the qualified name (with prefix), or the
     *        empty string if qualified names are not available
     * @param atts the attributes attached to the element. If
     *        there are no attributes, it shall be an empty
     *        Attributes object. The value of this object after
     *        startElement returns is undefined
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     * @see #endElement
     * @see org.xml.sax.Attributes
     * @see org.xml.sax.helpers.AttributesImpl
     */
    public void startElement (String uri, String localName,
                              String qName, Attributes atts)
        throws SAXException;

    /**
     * Receive notification of the end of an element.
     *
     * <p>The SAX parser will invoke this method at the end of every
     * element in the XML document; there will be a corresponding
     * {@link #startElement startElement} event for every endElement
     * event (even when the element is empty).</p>
     *
     * <p>For information on the names, see startElement.</p>
     *
     * @param uri the Namespace URI, or the empty string if the
     *        element has no Namespace URI or if Namespace
     *        processing is not being performed
     * @param localName the local name (without prefix), or the
     *        empty string if Namespace processing is not being
     *        performed
     * @param qName the qualified XML name (with prefix), or the
     *        empty string if qualified names are not available
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     */
    public void endElement (String uri, String localName,
                            String qName)
        throws SAXException;

    /**
     * Receive notification of character data.
     *
     * <p>The Parser will call this method to report each chunk of
     * character data. SAX parsers may return all contiguous character
     * data in a single chunk, or they may split it into several
     * chunks; however, all of the characters in any single event
     * must come from the same external entity so that the Locator
     * provides useful information.</p>
     *
     * <p>The application must not attempt to read from the array
     * outside of the specified range.</p>
     *
     * <p>Individual characters may consist of more than one Java
     * <code>char</code> value. There are two important cases where this
     * happens, because characters can't be represented in just sixteen bits.
     * In one case, characters are represented in a <em>Surrogate Pair</em>,
     * using two special Unicode values. Such characters are in the so-called
     * "Astral Planes", with a code point above U+FFFF. A second case involves
     * composite characters, such as a base character combining with one or
     * more accent characters. </p>
     *
     * <p> Your code should not assume that algorithms using
     * <code>char</code>-at-a-time idioms will be working in character
     * units; in some cases they will split characters. This is relevant
     * wherever XML permits arbitrary characters, such as attribute values,
     * processing instruction data, and comments as well as in data reported
     * from this method. It's also generally relevant whenever Java code
     * manipulates internationalized text; the issue isn't unique to XML.</p>
     *
     * <p>Note that some parsers will report whitespace in element
     * content using the {@link #ignorableWhitespace ignorableWhitespace}
     * method rather than this one (validating parsers <em>must</em>
     * do so).</p>
     *
     * @param ch the characters from the XML document
     * @param start the start position in the array
     * @param length the number of characters to read from the array
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     * @see #ignorableWhitespace
     * @see org.xml.sax.Locator
     */
    public void characters (char ch[], int start, int length)
        throws SAXException;

    /**
     * Receive notification of ignorable whitespace in element content.
     *
     * <p>Validating Parsers must use this method to report each chunk
     * of whitespace in element content (see the W3C XML 1.0
     * recommendation, section 2.10): non-validating parsers may also
     * use this method if they are capable of parsing and using
     * content models.</p>
     *
     * <p>SAX parsers may return all contiguous whitespace in a single
     * chunk, or they may split it into several chunks; however, all of
     * the characters in any single event must come from the same
     * external entity, so that the Locator provides useful
     * information.</p>
     *
     * <p>The application must not attempt to read from the array
     * outside of the specified range.</p>
     *
     * @param ch the characters from the XML document
     * @param start the start position in the array
     * @param length the number of characters to read from the array
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     * @see #characters
     */
    public void ignorableWhitespace (char ch[], int start, int length)
        throws SAXException;

    /**
     * Receive notification of a processing instruction.
     *
     * <p>The Parser will invoke this method once for each processing
     * instruction found: note that processing instructions may occur
     * before or after the main document element.</p>
     *
     * <p>A SAX parser must never report an XML declaration (XML 1.0,
     * section 2.8) or a text declaration (XML 1.0, section 4.3.1)
     * using this method.</p>
     *
     * <p>Like {@link #characters characters()}, processing instruction
     * data may have characters that need more than one <code>char</code>
     * value. </p>
     *
     * @param target the processing instruction target
     * @param data the processing instruction data, or null if
     *        none was supplied. The data does not include any
     *        whitespace separating it from the target
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     */
    public void processingInstruction (String target, String data)
        throws SAXException;

    /**
     * Receive notification of a skipped entity.
     * This is not called for entity references within markup constructs
     * such as element start tags or markup declarations. (The XML
     * recommendation requires reporting skipped external entities.
     * SAX also reports internal entity expansion/non-expansion, except
     * within markup constructs.)
     *
     * <p>The Parser will invoke this method each time the entity is
     * skipped. Non-validating processors may skip entities if they
     * have not seen the declarations (because, for example, the
     * entity was declared in an external DTD subset). All processors
     * may skip external entities, depending on the values of the
     * <code>http://xml.org/sax/features/external-general-entities</code>
     * and the
     * <code>http://xml.org/sax/features/external-parameter-entities</code>
     * properties.</p>
     *
     * @param name the name of the skipped entity. If it is a
     *        parameter entity, the name will begin with '%', and if
     *        it is the external DTD subset, it will be the string
     *        "[dtd]"
     * @throws org.xml.sax.SAXException any SAX exception, possibly
     *            wrapping another exception
     */
    public void skippedEntity (String name)
        throws SAXException;

}
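One practical consequence of the characters() contract above is that a parser may deliver the text of a single element in several chunks, so a handler should accumulate text and only use it when endElement arrives. The following is a minimal sketch of that pattern against the structure of the sample document. The class name `CapitalCollector` and the helper `capitalsOf` are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CapitalCollector extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private final List<String> capitals = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // drop whitespace that preceded this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // The default JAXP factory is not namespace-aware, so qName carries the name.
        if ("capital".equals(qName)) {
            capitals.add(text.toString().trim());
        }
        text.setLength(0);
    }

    /** Returns the text of every capital element in document order. */
    static List<String> capitalsOf(String xml) {
        CapitalCollector handler = new CapitalCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            // Illustration only: collapse checked exceptions into an unchecked one.
            throw new IllegalStateException("parse failed", e);
        }
        return handler.capitals;
    }

    public static void main(String[] args) {
        String xml = "<world>"
                + "<comuntry id=\"1\"><name>China</name><capital>Beijing</capital></comuntry>"
                + "<comuntry id=\"3\"><name>Japan</name><capital>Tokyo</capital></comuntry>"
                + "</world>";
        System.out.println(capitalsOf(xml));
    }
}
```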
ErrorHandler:
package org.xml.sax;

/**
 * Basic interface for SAX error handlers.
 *
 * <blockquote>
 * <em>This module, both source code and documentation, is in the
 * Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
 * See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
 * for further information.
 * </blockquote>
 *
 * <p>If a SAX application needs to implement customized error
 * handling, it must implement this interface and then register an
 * instance with the XML reader using the
 * {@link org.xml.sax.XMLReader#setErrorHandler setErrorHandler}
 * method. The parser will then report all errors and warnings
 * through this interface.</p>
 *
 * <p><strong>WARNING:</strong> If an application does <em>not</em>
 * register an ErrorHandler, XML parsing errors will go unreported,
 * except that <em>SAXParseException</em>s will be thrown for fatal errors.
 * In order to detect validity errors, an ErrorHandler that does something
 * with {@link #error error()} calls must be registered.</p>
 *
 * <p>For XML processing errors, a SAX driver must use this interface
 * in preference to throwing an exception: it is up to the application
 * to decide whether to throw an exception for different types of
 * errors and warnings. Note, however, that there is no requirement that
 * the parser continue to report additional errors after a call to
 * {@link #fatalError fatalError}. In other words, a SAX driver class
 * may throw an exception after reporting any fatalError.
 * Also parsers may throw appropriate exceptions for non-XML errors.
 * For example, {@link XMLReader#parse XMLReader.parse()} would throw
 * an IOException for errors accessing entities or the document.</p>
 *
 * @since SAX 1.0
 * @author David Megginson
 * @see org.xml.sax.XMLReader#setErrorHandler
 * @see org.xml.sax.SAXParseException
 */
public interface ErrorHandler {

    /**
     * Receive notification of a warning.
     *
     * <p>SAX parsers will use this method to report conditions that
     * are not errors or fatal errors as defined by the XML
     * recommendation. The default behaviour is to take no
     * action.</p>
     *
     * <p>The SAX parser must continue to provide normal parsing events
     * after invoking this method: it should still be possible for the
     * application to process the document through to the end.</p>
     *
     * <p>Filters may use this method to report other, non-XML warnings
     * as well.</p>
     *
     * @param exception The warning information encapsulated in a
     *                  SAX parse exception.
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @see org.xml.sax.SAXParseException
     */
    public abstract void warning (SAXParseException exception)
        throws SAXException;

    /**
     * Receive notification of a recoverable error.
     *
     * <p>This corresponds to the definition of "error" in section 1.2
     * of the W3C XML 1.0 Recommendation. For example, a validating
     * parser would use this callback to report the violation of a
     * validity constraint. The default behaviour is to take no
     * action.</p>
     *
     * <p>The SAX parser must continue to provide normal parsing
     * events after invoking this method: it should still be possible
     * for the application to process the document through to the end.
     * If the application cannot do so, then the parser should report
     * a fatal error even if the XML recommendation does not require
     * it to do so.</p>
     *
     * <p>Filters may use this method to report other, non-XML errors
     * as well.</p>
     *
     * @param exception The error information encapsulated in a
     *                  SAX parse exception.
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @see org.xml.sax.SAXParseException
     */
    public abstract void error (SAXParseException exception)
        throws SAXException;

    /**
     * Receive notification of a non-recoverable error.
     *
     * <p><strong>There is an apparent contradiction between the
     * documentation for this method and the documentation for {@link
     * org.xml.sax.ContentHandler#endDocument}. Until this ambiguity
     * is resolved in a future major release, clients should make no
     * assumptions about whether endDocument() will or will not be
     * invoked when the parser has reported a fatalError() or thrown
     * an exception.</strong></p>
     *
     * <p>This corresponds to the definition of "fatal error" in
     * section 1.2 of the W3C XML 1.0 Recommendation. For example, a
     * parser would use this callback to report the violation of a
     * well-formedness constraint.</p>
     *
     * <p>The application must assume that the document is unusable
     * after the parser has invoked this method, and should continue
     * (if at all) only for the sake of collecting additional error
     * messages: in fact, SAX parsers are free to stop reporting any
     * other events once this method has been invoked.</p>
     *
     * @param exception The error information encapsulated in a
     *                  SAX parse exception.
     * @exception org.xml.sax.SAXException Any SAX exception, possibly
     *            wrapping another exception.
     * @see org.xml.sax.SAXParseException
     */
    public abstract void fatalError (SAXParseException exception)
        throws SAXException;

}
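The warning in the Javadoc is worth trying out: errors only reach the application if an ErrorHandler has been registered with setErrorHandler. The sketch below registers one on the XMLReader and feeds it a document that is not well-formed. The class name `ErrorHandlerSketch` and the helper `fatalErrorsOf` are hypothetical, and the code assumes (as the Javadoc allows) that the parser may still throw after reporting the fatal error.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

public class ErrorHandlerSketch {

    /** Returns "line:column" for every fatal error reported while parsing. */
    static List<String> fatalErrorsOf(String xml) {
        List<String> fatal = new ArrayList<>();
        ErrorHandler handler = new ErrorHandler() {
            @Override
            public void warning(SAXParseException e) {
                // ignored in this sketch
            }

            @Override
            public void error(SAXParseException e) {
                // recoverable errors (for example validity errors) are ignored here
            }

            @Override
            public void fatalError(SAXParseException e) {
                fatal.add(e.getLineNumber() + ":" + e.getColumnNumber());
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setErrorHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // The parser is free to abort after fatalError; the position is already recorded.
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("could not run the parser", e);
        }
        return fatal;
    }

    public static void main(String[] args) {
        // The name element is never closed, so the document is not well-formed.
        System.out.println(fatalErrorsOf("<world><name>China</world>"));
    }
}
```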
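Above is the source of the four basic event-handling interfaces; reading the Javadoc tells you what each callback is responsible for.
4. Implementing SAX parsing. There are two parts: defining the handling rules (the event handler) and reading the file.
The event handler, MyHandler.java: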
import java.io.IOException; import org.xml.sax.Attributes; import org.xml.sax.InputSource; import org.xml.sax.Locator; import org.xml.sax.SAXException; import org.xml.sax.SAXParseException; import org.xml.sax.helpers.DefaultHandler; public class MyHandler extends DefaultHandler { /** * 開始前綴 URI 名稱空間范圍映射。 * 此事件的信息對於常規的命名空間處理並非必需: * 當 http://xml.org/sax/features/namespaces 功能為 true(默認)時, * SAX XML 讀取器將自動替換元素和屬性名稱的前綴。 * 參數意義如下: * prefix :前綴 * uri :命名空間 */ @Override public void startPrefixMapping(String prefix, String uri) throws SAXException { // TODO Auto-generated method stub System.out.println("(startPrefixMapping)start prefix_mapping : xmlns:"+prefix+" = " +"\""+uri+"\""); } /** * 結束前綴 URI 范圍的映射。 * @param prefix 前綴 */ @Override public void endPrefixMapping(String prefix) throws SAXException { // TODO Auto-generated method stub System.out.println("(endPrefixMapping)end prefix_mapping : "+prefix); } /** * 文檔結束 */ @Override public void endDocument() throws SAXException { // TODO Auto-generated method stub System.out.println("(endDocument)doument is ended"); } /** * 接收文檔的結尾的通知。 * 參數意義如下: * uri :元素的命名空間 * localName :元素的本地名稱(不帶前綴) * qName :元素的限定名(帶前綴) */ @Override public void endElement(String uri, String localName, String qName) throws SAXException { // TODO Auto-generated method stub System.out.println("(endElement)end element : "+qName+"("+uri+")"); } /** * 接收元素內容中可忽略的空白的通知。 * 參數意義如下: * ch : 來自 XML 文檔的字符 * start : 數組中的開始位置 * length : 從數組中讀取的字符的個數 */ @Override public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException { // TODO Auto-generated method stub StringBuffer buffer = new StringBuffer(); for(int i = start ; i < start+length ; i++){ switch(ch[i]){ case '\\':buffer.append("\\\\");break; case '\r':buffer.append("\\r");break; case '\n':buffer.append("\\n");break; case '\t':buffer.append("\\t");break; case '\"':buffer.append("\\\"");break; default : buffer.append(ch[i]); } } System.out.println("(ignorableWhitespace)ignorable 
whitespace("+length+"): "+buffer.toString()); } /** * 接收用來查找 SAX 文檔事件起源的對象。 * 參數意義如下: * locator : 可以返回任何 SAX 文檔事件位置的對象 */ @Override public void setDocumentLocator(Locator locator) { // TODO Auto-generated method stub System.out.println("(setDocumentLocator)set document_locator : (lineNumber = "+locator.getLineNumber() +",columnNumber = "+locator.getColumnNumber() +",systemId = "+locator.getSystemId() +",publicId = "+locator.getPublicId()+")"); } /** * 接收文檔的開始的通知。 */ @Override public void startDocument() throws SAXException { // TODO Auto-generated method stub System.out.println("(startDocument)document is startting"); } /** * 接收元素開始的通知。 * 參數意義如下: * uri :元素的命名空間 * localName :元素的本地名稱(不帶前綴) * qName :元素的限定名(帶前綴) * atts :元素的屬性集合 */ @Override public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException { // TODO Auto-generated method stub System.out.println("(startElement)start element : "+qName+"("+uri+")"); } /** * 接收注釋聲明事件的通知。 * 參數意義如下: * name - 注釋名稱。 * publicId - 注釋的公共標識符,如果未提供,則為 null。 * systemId - 注釋的系統標識符,如果未提供,則為 null。 */ @Override public void notationDecl(String name, String publicId, String systemId) throws SAXException { // TODO Auto-generated method stub System.out.println("(notationDecl)notation declare : (name = "+name +",systemId = "+publicId +",publicId = "+systemId+")"); } /** * 允許應用程序解析外部實體。 * 解析器將在打開任何外部實體(頂級文檔實體除外)前調用此方法 * 參數意義如下: * publicId : 被引用的外部實體的公共標識符,如果未提供,則為 null。 * systemId : 被引用的外部實體的系統標識符。 * 返回: * 一個描述新輸入源的 InputSource 對象,或者返回 null, * 以請求解析器打開到系統標識符的常規 URI 連接。 */ @Override public InputSource resolveEntity(String publicId, String systemId) throws IOException, SAXException { // TODO Auto-generated method stub return super.resolveEntity(publicId, systemId); } /** * 接收跳過的實體的通知。 * 參數意義如下: * name : 所跳過的實體的名稱。如果它是參數實體,則名稱將以 '%' 開頭, * 如果它是外部 DTD 子集,則將是字符串 "[dtd]" */ @Override public void skippedEntity(String name) throws SAXException { // TODO Auto-generated method stub 
        System.out.println("(skippedEntity)the name of the skipped entity : "+name);
    }

    /**
     * Receives notification of an unparsed entity declaration.
     * Parameters:
     * name - the name of the unparsed entity.
     * publicId - the entity's public identifier, or null if none was given.
     * systemId - the entity's system identifier.
     * notationName - the name of the associated notation.
     */
    @Override
    public void unparsedEntityDecl(String name, String publicId,
            String systemId, String notationName) throws SAXException {
        // Each label is paired with its own value (the original listing swapped publicId and systemId).
        System.out.println("(unparsedEntityDecl)unparsed entity declare : (name = "+name
                +",publicId = "+publicId
                +",systemId = "+systemId
                +",notationName = "+notationName+")");
    }

    /**
     * Receives notification of a processing instruction.
     * Parameters:
     * target : the processing instruction target.
     * data : the processing instruction data, or null if none was supplied.
     */
    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        System.out.println("(processingInstruction)process instruction : (target = \""
                +target+"\",data = \""+data+"\")");
    }

    /**
     * Receives notification of character data.
     * The range ch[start .. start+length) corresponds to the nodeValue of a Text node in DOM.
     * The parser may deliver one run of text in several calls.
     */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Escape control characters so that whitespace-only events are visible in the log.
        StringBuilder buffer = new StringBuilder();
        for(int i = start ; i < start+length ; i++){
            switch(ch[i]){
                case '\\':buffer.append("\\\\");break;
                case '\r':buffer.append("\\r");break;
                case '\n':buffer.append("\\n");break;
                case '\t':buffer.append("\\t");break;
                case '\"':buffer.append("\\\"");break;
                default : buffer.append(ch[i]);
            }
        }
        System.out.println("(characters)characters("+length+"): "+buffer.toString());
    }

    /**
     * Handles a recoverable error.
     */
    @Override
    public void error(SAXParseException e) throws SAXException {
        System.err.println("(error)Error ("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }

    /**
     * Handles a fatal (non-recoverable) error.
     */
    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println("(fatalError)FatalError ("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }

    /**
     * Handles a warning.
     */
    @Override
    public void warning(SAXParseException e) throws SAXException {
        System.err.println("(warning)("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }
}
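The characters callback above logs every chunk it receives, including the whitespace between elements. SAX does not promise that the text of an element arrives in a single call, so a handler that needs the element's value usually buffers the chunks and reads the buffer in endElement. The following is a minimal sketch of that pattern, not part of the original handler; the class name TextCollector and its collect method are made up for illustration, and it assumes elements contain either text or child elements, not both.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects "elementName=text" pairs for elements that hold text.
public class TextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // A new element begins: discard whitespace buffered since the previous tag.
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one run of text, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            values.add(qName + "=" + value);
        }
        text.setLength(0);
    }

    // Parses an XML string and returns the collected pairs in document order.
    public static List<String> collect(String xml) throws Exception {
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<world>\n\t<comuntry id=\"1\">\n\t\t<name>China</name>"
                + "\n\t\t<capital>Beijing</capital>\n\t</comuntry>\n</world>";
        System.out.println(collect(xml));
    }
}
```

Trimming in endElement is what drops the indentation-only events that appear as `\n\t` in the log output.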
Running the parse:
SAXParse.java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

/**
 * 1. Obtain an instance of the SAX parser factory.
 * 2. Get a SAX parser from the factory.
 * 3. Wrap the XML document in an input stream so that the SAX parser can read it.
 * 4. Parse the XML document.
 */
public class SAXParse {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // Obtain the SAX parser factory
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // try-with-resources closes the file stream once parsing finishes or fails
        try (FileInputStream in = new FileInputStream(new File("world.xml"))) {
            // Create the parser
            SAXParser parser = factory.newSAXParser();
            XMLReader xmlReader = parser.getXMLReader();
            InputSource input = new InputSource(in);
            // Only the content callbacks of MyHandler are registered here
            xmlReader.setContentHandler(new MyHandler());
            xmlReader.parse(input);
        } catch (ParserConfigurationException | SAXException e) {
            e.printStackTrace();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
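Note that the program above registers the handler only through setContentHandler, so the warning, error and fatalError callbacks would not be invoked by this XMLReader. To receive them, the handler also has to be registered with setErrorHandler. The sketch below illustrates this on an in-memory string; the class names ErrorAwareParse and Reporter are hypothetical and stand in for the handler from this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ErrorAwareParse {

    // Hypothetical handler that records problems instead of printing them.
    static class Reporter extends DefaultHandler {
        final List<String> problems = new ArrayList<>();

        @Override
        public void warning(SAXParseException e) {
            problems.add("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) {
            problems.add("error: " + e.getMessage());
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            problems.add("fatalError: " + e.getMessage());
            throw e; // parsing cannot continue after a fatal error
        }
    }

    // Parses the given XML text and returns every problem the parser reported.
    public static List<String> parse(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        Reporter reporter = new Reporter();
        reader.setContentHandler(reporter);
        reader.setErrorHandler(reporter); // without this call the three callbacks above are never used
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by fatalError; swallowed here so the caller gets the list.
        }
        return reporter.problems;
    }

    public static void main(String[] args) throws Exception {
        // The <comuntry> element is never closed, so the document is not well-formed.
        System.out.println(parse("<world><comuntry></world>"));
    }
}
```

An alternative is to call SAXParser.parse with the stream and a DefaultHandler, which passes a single object to the parser for all handler roles rather than registering each role separately.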
5. Output
(setDocumentLocator)set document_locator : (lineNumber = 1,columnNumber = 1,systemId = null,publicId = null)
(startDocument)document is startting
(startElement)start element : world()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(5): China
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(7): Beijing
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(4): 1234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(3): 960
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(7): America
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(10): Washington
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(3): 234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(3): 900
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(5): Japan
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(5): Tokyo
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(3): 234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(2): 60
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(6): Russia
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(6): Moscow
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(2): 34
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(4): 1960
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(1): \n
(endElement)end element : world()
(endDocument)doument is ended
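6. That completes the SAX parse. This is a very simple walkthrough that only reads the document and logs each event; a real application needs a handler tailored to the data it wants to extract.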
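As one illustration of such customization, the sketch below turns the world.xml structure into a list of objects instead of logging events. It is only a sketch under stated assumptions: the CountryHandler and Country classes are hypothetical, the element name comuntry is spelled exactly as in the sample file, and population and area are assumed to always be integers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CountryHandler extends DefaultHandler {

    // Hypothetical value class mirroring one <comuntry> element.
    public static class Country {
        public String id;
        public String name;
        public String capital;
        public int population;
        public int area;

        @Override
        public String toString() {
            return id + ":" + name + "/" + capital + " pop=" + population + " area=" + area;
        }
    }

    private final List<Country> countries = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private Country current; // the <comuntry> being read, or null outside one

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);
        if ("comuntry".equals(qName)) {
            current = new Country();
            current.id = atts.getValue("id");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // closing tag outside any <comuntry>, such as </world>
        }
        String value = text.toString().trim();
        switch (qName) {
            case "name":       current.name = value; break;
            case "capital":    current.capital = value; break;
            case "population": current.population = Integer.parseInt(value); break;
            case "area":       current.area = Integer.parseInt(value); break;
            case "comuntry":   countries.add(current); current = null; break;
            default:           break;
        }
        text.setLength(0);
    }

    // Parses XML text in the world.xml layout and returns the countries in document order.
    public static List<Country> parse(String xml) throws Exception {
        CountryHandler handler = new CountryHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.countries;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<world><comuntry id=\"1\"><name>China</name><capital>Beijing</capital>"
                + "<population>1234</population><area>960</area></comuntry></world>";
        for (Country c : parse(xml)) {
            System.out.println(c);
        }
    }
}
```

Because the handler keeps only the country currently being read plus the result list, memory use grows with the extracted data rather than with the full document tree, which is the advantage over DOM described at the start of this article.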