一、EXCEL的底層實現
能力有限,了解的比較淺,有不足之處望指正,首先看下圖:
一、 excel2007是使用xml格式來存儲的,把一個excel文件后綴改為.zip,打開之后就直接可以看到一個excel文件對應的xml格式的文件了。
1.1、docProps目錄
1.1.1、 core.xml文件

<?xml version="1.0" encoding="utf-8"?> <cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dcmitype="http://purl.org/dc/dcmitype/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <!-- 創建人 --> <dc:creator>Administrator</dc:creator> <!-- 最后修改人 --> <cp:lastModifiedBy>Administrator</cp:lastModifiedBy> <!-- 創建時間 --> <dcterms:created xsi:type="dcterms:W3CDTF">2008-09-11T17:22:52Z</dcterms:created> <!-- 修改時間 --> <dcterms:modified xsi:type="dcterms:W3CDTF">2017-03-31T15:29:39Z</dcterms:modified> </cp:coreProperties>
1.1.2、 app.xml文件

<?xml version="1.0" encoding="utf-8"?> <Properties xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties" xmlns:vt="http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"> <!-- 文檔類型 --> <Application>Microsoft Excel</Application> <!-- 是否加密 --> <DocSecurity>0</DocSecurity> <ScaleCrop>false</ScaleCrop> <HeadingPairs> <vt:vector size="2" baseType="variant"> <vt:variant> <vt:lpstr>工作表</vt:lpstr> </vt:variant> <vt:variant> <vt:i4>3</vt:i4> </vt:variant> </vt:vector> </HeadingPairs> <TitlesOfParts> <vt:vector size="3" baseType="lpstr"> <vt:lpstr>Sheet1</vt:lpstr> <vt:lpstr>Sheet2</vt:lpstr> <vt:lpstr>Sheet3</vt:lpstr> </vt:vector> </TitlesOfParts> <Company/> <LinksUpToDate>false</LinksUpToDate> <!-- 是否共享 --> <SharedDoc>false</SharedDoc> <HyperlinksChanged>false</HyperlinksChanged> <!-- 版本 --> <AppVersion>12.0000</AppVersion> </Properties>
1.2、 xl目錄,該目錄保存的就是excel中的詳細內容了
1.2.1、workbook.xml文件
文件包含一對<sheets>標簽,其中的每個<sheet>元素都代表Excel 2007文件中的一個sheet,工作表的名稱就是其name屬性的值,這里有三個sheet

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"> <fileVersion appName="xl" lastEdited="4" lowestEdited="4" rupBuild="4506"/> <workbookPr defaultThemeVersion="124226"/> <bookViews> <workbookView xWindow="0" yWindow="105" windowWidth="12765" windowHeight="5715"/> </bookViews> <sheets> <sheet name="Sheet1" sheetId="1" r:id="rId1"/> <sheet name="Sheet2" sheetId="2" r:id="rId2"/> <sheet name="Sheet3" sheetId="3" r:id="rId3"/> </sheets> <calcPr calcId="125725"/> </workbook>
1.2.2、 _rels/workbook.xml.rels
定義每個sheetid對應的sheet內容文件sheet1.xml,共享的單元格內容文件sharedstring.xml,樣式文件style.xml是當前單元格的樣式字體,顏色等樣式的xml配置。

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"> <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet3.xml"/> <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet2.xml"/> <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet1.xml"/> <Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings" Target="sharedStrings.xml"/> <Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/> <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/> </Relationships>
1.2.3、 worksheets文件夾,保存的就是excel中的sheet
1.2.4、sheet.xml文件

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"> <!-- 容積,即該sheet中以供占用了多大的地方,哪個單元格到哪個單元格,矩形的左上角單元格和右下角單元格 此屬性可以判斷列數,如下就是:(M-A+1) --> <dimension ref="A1:M8"/> <!-- sheet視圖的配置信息,如打開時默認選中的單元格是哪個 --> <sheetViews> <sheetView tabSelected="1" workbookViewId="0"> <selection activeCell="G12" sqref="G12"/> </sheetView> </sheetViews> <!-- sheet的格式參數 --> <sheetFormatPr defaultRowHeight="14.25"/> <!-- 行中每列的配置參數 --> <cols> <col min="1" max="1" width="7.875" customWidth="1"/> <col min="2" max="2" width="6.75" customWidth="1"/> <col min="3" max="3" width="6" customWidth="1"/> <col min="4" max="4" width="4.5" customWidth="1"/> <col min="5" max="5" width="8.5" customWidth="1"/> <col min="6" max="6" width="9.375" customWidth="1"/> <col min="7" max="7" width="13.5" customWidth="1"/> <col min="8" max="8" width="7" customWidth="1"/> <col min="9" max="9" width="7.625" customWidth="1"/> <col min="10" max="10" width="4.625" customWidth="1"/> <col min="11" max="11" width="9.5" bestFit="1" customWidth="1"/> </cols> <!-- 數據區域,這個里面就是我們的單元格的相關數據信息了,是以行(Row)的方式記錄的 --> <sheetData> <!-- excel中的行 --> <row r="1" spans="1:13"> <!-- excel中的列 r:單元格坐標 s:單元格的style,對應style.xml中cellXfs元素下的某個元素xf元素, 此處就是對應索引為12的元素,從0開始計算,即為第13個xf元素。 t:元素類型,s表示是字符串類型,對應到sharedStrings.xml中的sst元素 --> <c r="A1" s="12" t="s"> <!-- 如果是t="s" 則存儲的是sharedStrings.xml中sst元素下的某個si元素,此處對應索引為0的元素,即第一個si元素 否則,這里存儲的就是單元格的值 --> <v>0</v> </c><c r="B1" s="12" t="s"> <v>10</v> </c><c r="C1" s="12" t="s"> <v>1</v> </c><c r="D1" s="12" t="s"> <v>2</v> </c><c r="E1" s="12" t="s"> <v>3</v> </c><c r="F1" s="12" t="s"> <v>4</v> </c><c r="G1" s="12" t="s"> <v>5</v> </c><c r="H1" s="12" t="s"> <v>6</v> </c><c r="I1" s="12" t="s"> <v>7</v> </c><c r="J1" s="12" t="s"> <v>8</v> </c><c r="K1" s="12" t="s"> <v>9</v> </c> </row> <row r="2" spans="1:13"> <c r="A2" t="s"> <v>11</v> </c><c r="B2" s="10" t="s"> <v>12</v> </c><c r="C2" s="1"> <v>2</v> </c><c r="D2" s="2" t="s"> <v>13</v> </c><c r="E2" s="3" t="s"> <v>14</v> </c><c r="F2" s="4"> <v>89</v> </c><c r="G2" s="5"> <v>10</v> </c><c r="H2" s="6"> <v>12</v> </c><c r="I2" s="7"> <v>0.2</v> </c><c r="J2" s="8"> <v>0.5</v> </c><c r="K2" s="9"> <v>4564645</v> </c><c r="L2" s="10"/> <c r="M2" s="11"/> </row> <row r="3" spans="1:13"> <!-- 此處為什么沒有A3? 因為excel中常規列如果沒有值,此處就不記錄,因此在讀取excel是應該特別注意,否則可能導致列錯亂 --> <c r="B3" s="10"/> <c r="C3" s="1"/> <c r="D3" s="2"/> <c r="E3" s="3"/> <c r="F3" s="4"/> <c r="G3" s="5"/> <c r="H3" s="6"/> <c r="I3" s="7"/> <c r="J3" s="8"/> <c r="K3" s="9"/> </row> <!-- 該行每列都是常規列,因此沒有值的就不記錄在這里 --> <row r="4" spans="1:13"> <c r="A4" s="12" t="s"> <v>15</v> </c><c r="E4" t="s"> <v>16</v> </c> </row> <!-- 5,6,7 行都是合並再一起的單元格 --> <row r="5" spans="1:13"> <!-- 合並單元格的值記錄在起始單元格里,其余單元格置空 --> <c r="A5" s="14" t="s"> <v>17</v> </c><c r="B5" s="13"/> <c r="C5" s="13"/> <c r="D5" s="13"/> <c r="E5" s="13"/> <c r="F5" s="13"/> <c r="G5" s="13"/> <c r="H5" s="13"/> <c r="I5" s="13"/> <c r="J5" s="13"/> <c r="K5" s="13"/> </row> <row r="6" spans="1:13"> <c r="A6" s="13"/> <c r="B6" s="13"/> <c r="C6" s="13"/> <c r="D6" s="13"/> <c r="E6" s="13"/> <c r="F6" s="13"/> <c r="G6" s="13"/> <c r="H6" s="13"/> <c r="I6" s="13"/> <c r="J6" s="13"/> <c r="K6" s="13"/> </row> <row r="7" spans="1:13"> <c r="A7" s="13"/> <c r="B7" s="13"/> <c r="C7" s="13"/> <c r="D7" s="13"/> <c r="E7" s="13"/> <c r="F7" s="13"/> <c r="G7" s="13"/> <c r="H7" s="13"/> <c r="I7" s="13"/> <c r="J7" s="13"/> <c r="K7" s="13"/> </row> <row r="8" spans="1:13"> <c r="A8" s="13"/> <c r="B8" s="13"/> <c r="C8" s="13"/> <c r="D8" s="13"/> <c r="E8" s="13"/> <c r="F8" s="13"/> <c r="G8" s="13"/> <c r="H8" s="13"/> <c r="I8" s="13"/> <c r="J8" s="13"/> <c r="K8" s="13"/> </row> </sheetData> <!-- 這個是記錄哪個到哪個是合並單元格的信息的 --> <mergeCells count="1"> <!-- A5到K8被合並了 --> <mergeCell ref="A5:K8"/> </mergeCells> <!-- --> <phoneticPr fontId="1" type="noConversion"/> <!-- 頁的相關屬性設置 --> <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/> <!-- 分頁設置 --> <pageSetup paperSize="9" orientation="portrait" r:id="rId1"/> <!-- 圖形信息,由圖可以看出,這里引用了draw的文檔了--> <drawing/> </worksheet>
1.2.5、sharedStrings.xml文件

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="18" uniqueCount="18"> <!-- si存儲的是字符串的值 --> <!-- 此處存儲的是sheet.xml中c(列)中屬性帶有t="s"的列的值, 帶有t="s"字符串列的定位是通過c元素下的v元素中的值來定位的 此處對應的就是v的值為0的單元格的實際值 --> <si> <t>常規</t> <phoneticPr fontId="1" type="noConversion"/> </si> <!-- sheet.xml中c元素中屬性t="s"的子元素<v>1</v>的單元格的實際值 --> <si> <t>數值</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>貨幣</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>會計專用</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>短日期</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>長日期</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>時間</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>百分比</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>分數</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>科學計數</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>文本</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>common</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>String</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>$</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>D1</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>常規A</t> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <r> <rPr> <sz val="11"/> <color theme="1"/> <rFont val="宋體"/> <family val="3"/> <charset val="134"/> </rPr><t>常規</t> </r><r> <rPr> <sz val="11"/> <color theme="1"/> <rFont val="Tahoma"/> <family val="2"/> <charset val="134"/> </rPr><t>D</t> </r> <phoneticPr fontId="1" type="noConversion"/> </si> <si> <t>說明: 為了簡單,行數就設置的少點 第一行為標題行(本身是常規行),分別標注了每列的屬性。 第三行和第二行一樣的列格式 第四行為常規行</t> <phoneticPr fontId="1" type="noConversion"/> </si> </sst>
1.2.6、style.xml文件

<?xml version="1.0" encoding="UTF-8" standalone="yes"?> <styleSheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"> <!-- 數字格式化顯示的定義 --> <numFmts count="5"> <!-- 此處的numFmtId和cellXfs中的numFmtId對應 --> <!-- 會計專用 --> <!-- 該元素表示cellXfs中xf元素的numFmtId="44"對應的格式化樣式 --> <numFmt numFmtId="44" formatCode="_ "¥"* #,##0.00_ ;_ "¥"* \-#,##0.00_ ;_ "¥"* "-"??_ ;_ @_ "/> <!-- 數值格式 --> <numFmt numFmtId="176" formatCode="0.00_);[Red]\(0.00\)"/> <!-- 貨幣 --> <numFmt numFmtId="177" formatCode=""¥"#,##0.00_);[Red]\("¥"#,##0.00\)"/> <!-- 長日期 --> <numFmt numFmtId="178" formatCode="[$-F800]dddd\,\ mmmm\ dd\,\ yyyy"/> <!-- 時間 --> <numFmt numFmtId="179" formatCode="[$-F400]h:mm:ss\ AM/PM"/> </numFmts> <!-- 字體樣式 --> <fonts count="3"> <font> <sz val="11"/> <color theme="1"/> <name val="Tahoma"/> <family val="2"/> <charset val="134"/> </font> <font> <sz val="9"/> <name val="Tahoma"/> <family val="2"/> <charset val="134"/> </font> <font> <sz val="11"/> <color theme="1"/> <name val="宋體"/> <family val="3"/> <charset val="134"/> </font> </fonts> <!-- 填充樣式,包括背景色 --> <fills count="2"> <fill> <patternFill patternType="none"/> </fill> <fill> <patternFill patternType="gray125"/> </fill> </fills> <!-- 邊框 --> <borders count="1"> <border> <left/> <right/> <top/> <bottom/> <diagonal/> </border> </borders> <!-- 提供給cellXfs來調用 --> <cellStyleXfs count="1"> <!-- 此處的numFmtId和cellXfs中的numFmtId對應 --> <xf numFmtId="0" fontId="0" fillId="0" borderId="0"/> </cellStyleXfs> <!-- 真正的樣式定義 --> <cellXfs count="15"> <!-- 此處的numFmtId和cellStyleXfs、numFmts等中的numFmtId對應 --> <!-- sheet.xml中的c(列)元素屬性中s的屬性值都對應這里一個xf元素,s的屬性值存儲的就是xf的索引--> <xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0"/> <!-- 該元素表示sheet.xml中c元素中s="1"對應的樣式 --> <xf numFmtId="176" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <!-- 該元素表示sheet.xml中c元素中s="2"對應的樣式 --> <xf numFmtId="177" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="44" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="14" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="178" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="179" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="10" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="12" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="11" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="49" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="13" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/> <xf numFmtId="0" fontId="2" fillId="0" borderId="0" xfId="0" applyFont="1"/> <xf numFmtId="0" fontId="2" fillId="0" borderId="0" xfId="0" applyFont="1" applyAlignment="1"> <alignment horizontal="left" vertical="top"/> </xf> <xf numFmtId="0" fontId="2" fillId="0" borderId="0" xfId="0" applyFont="1" applyAlignment="1"> <alignment horizontal="left" vertical="top" wrapText="1"/> </xf> </cellXfs> <cellStyles count="1"> <cellStyle name="常規" xfId="0" builtinId="0"/> </cellStyles> <dxfs count="0"/> <tableStyles count="0" defaultTableStyle="TableStyleMedium9" defaultPivotStyle="PivotStyleLight16"/> </styleSheet>
二、使用java讀取EXCEL
1. 依賴的jar包

<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi</artifactId> <version>3.15</version> </dependency> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>3.15</version> </dependency> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml-schemas</artifactId> <version>3.15</version> </dependency> <dependency> <groupId>org.apache.xerces</groupId> <artifactId>com.springsource.org.apache.xerces</artifactId> <version>2.8.1</version> </dependency> <dependency> <groupId>commons-lang</groupId> <artifactId>commons-lang</artifactId> <version>2.6</version> </dependency>
2. IExcelRowReader.java

package com.wms.xml; import java.util.List; public interface IExcelRowReader { void getRows(int sheetIndex, int curRow, List<String> rowlist); }
3. ExcelReaderUtil.java

package com.wms.xml; import org.apache.poi.openxml4j.exceptions.OpenXML4JException; import org.xml.sax.SAXException; import java.io.IOException; public class ExcelReaderUtil { public static void read2007Excel(IExcelRowReader rowReader, String path) throws OpenXML4JException, SAXException, IOException { Excel2007Reader excel2007Reader = new Excel2007Reader(); excel2007Reader.setRowReader(rowReader); excel2007Reader.load(path); } }
4. Excel2007Reader.java

package com.wms.xml; import org.apache.commons.lang.StringUtils; import org.apache.poi.openxml4j.exceptions.OpenXML4JException; import org.apache.poi.openxml4j.opc.OPCPackage; import org.apache.poi.ss.usermodel.BuiltinFormats; import org.apache.poi.ss.usermodel.DataFormatter; import org.apache.poi.xssf.eventusermodel.XSSFReader; import org.apache.poi.xssf.model.SharedStringsTable; import org.apache.poi.xssf.model.StylesTable; import org.apache.poi.xssf.usermodel.XSSFCellStyle; import org.apache.poi.xssf.usermodel.XSSFRichTextString; import org.xml.sax.Attributes; import org.xml.sax.InputSource; import org.xml.sax.SAXException; import org.xml.sax.XMLReader; import org.xml.sax.helpers.DefaultHandler; import org.xml.sax.helpers.XMLReaderFactory; import java.io.IOException; import java.io.InputStream; import java.util.ArrayList; import java.util.Iterator; import java.util.List; public class Excel2007Reader extends DefaultHandler { //列元素 private static final String C_ELEMENT = "c"; //列中屬性r private static final String R_ATTR = "r"; //列中的v元素 private static final String V_ELEMENT = "v"; //列中的t元素 private static final String T_ELEMENT = "t"; //列中屬性值 private static final String S_ATTR_VALUE = "s"; //列中屬性值 private static final String T_ATTR_VALUE = "t"; //sheet r:Id前綴 private static final String RID_PREFIX = "rId"; //行元素 private static final String ROW_ELEMENT = "row"; //時間格式化字符串 private static final String DATE_FORMAT = "yyyy-MM-dd HH:mm:ss"; //saxParser private static final String CLASS_SAXPARSER = "org.apache.xerces.parsers.SAXParser"; //填充字符串 private static final String CELL_FILL_STR = "@"; //列的最大位數 private static final int MAX_CELL_BIT = 3; //excel 2007 的共享字符串表,對應sharedString.xml private SharedStringsTable sharedStringsTable; private final DataFormatter dataFormatter = new DataFormatter(); //當前行 private int curRow; //當前列 private int curCell; //上一次的內容 private String lastContent; //是否是String類型的 private boolean nextIsString; //單元數據類型 private CellDataType cellDataType; //當前列坐標, 如A1,B5 private String curCoordinate; //前一個列的坐標 private String preCoordinate; //行的最大列坐標 private String maxCellCoordinate; //單元格的格式表,對應style.xml private StylesTable stylesTable; //單元格存儲格式的索引,對應style.xml中的numFmts元素的子元素索引 private int numFmtIndex; //單元格存儲的格式化字符串,nmtFmt的formateCode屬性的值 private String numFmtString; //sheet的索引 private int sheetIndex = -1; //存儲每行的列元素 List<String> rowCellList = new ArrayList<String>(); //單元格的數據格式 enum CellDataType { BOOL("b"), ERROR("e"), FORMULA("str"), INLINESTR("inlineStr"), SSTINDEX("s"), NUMBER(""), DATE("m/d/yy"), NULL(""); private String name; CellDataType(String name) { this.name = name; } public String getName() { return name; } public void setName(String name) { this.name = name; } } private IExcelRowReader rowReader; public void setRowReader(IExcelRowReader rowReader) { this.rowReader = rowReader; } /** * 讀取excel中的制定索引sheet * * @param path * @param sheetIndex * @throws OpenXML4JException * @throws IOException * @throws SAXException */ public void load(String path, int sheetIndex) throws OpenXML4JException, IOException, SAXException { OPCPackage opcPackage = OPCPackage.open(path); XSSFReader xssfReader = new XSSFReader(opcPackage); //獲取styleTable stylesTable = xssfReader.getStylesTable(); //獲取共享字符串表 SharedStringsTable sharedStringsTable = xssfReader.getSharedStringsTable(); this.sharedStringsTable = sharedStringsTable; //獲取解析器 XMLReader xmlReader = fetchSheetReader(sharedStringsTable); // 根據 rId# 或 rSheet# 查找sheet InputStream sheetInputStream = xssfReader.getSheet(RID_PREFIX + sheetIndex); InputSource sheetInputSource = new InputSource(sheetInputStream); xmlReader.parse(sheetInputSource); //關閉流 closeStream(sheetInputStream); } /** * 遍歷所有的sheet * * @param path * @throws IOException * @throws OpenXML4JException * @throws SAXException */ public void load(String path) throws IOException, OpenXML4JException, SAXException { OPCPackage opcPackage = OPCPackage.open(path); XSSFReader xssfReader = new XSSFReader(opcPackage); stylesTable = xssfReader.getStylesTable(); SharedStringsTable sharedStringsTable = xssfReader.getSharedStringsTable(); this.sharedStringsTable = sharedStringsTable; XMLReader xmlReader = fetchSheetReader(sharedStringsTable); Iterator<InputStream> sheetsInputStream = xssfReader.getSheetsData(); while (sheetsInputStream.hasNext()) { curRow = 0; sheetIndex++; InputStream sheetInputStream = sheetsInputStream.next(); InputSource sheetInputSource = new InputSource(sheetInputStream); xmlReader.parse(sheetInputSource); closeStream(sheetInputStream); } } /** * 獲取sheet的解析器 * * @param sharedStringsTable * @return * @throws SAXException */ public XMLReader fetchSheetReader(SharedStringsTable sharedStringsTable) throws SAXException { XMLReader xmlReader = XMLReaderFactory.createXMLReader(CLASS_SAXPARSER); xmlReader.setContentHandler(this); return xmlReader; } @Override public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException { //c 表示列 if (C_ELEMENT.equals(qName)) { //獲取當前列坐標 String tempCurCoordinate = attributes.getValue(R_ATTR); //前一列為null,則將其設置為"@",A為第一列,ascii碼為65,前一列即為@,ascii碼64 if (preCoordinate == null) { preCoordinate = CELL_FILL_STR; } else { //存在,則前一列要設置為上一列的坐標 preCoordinate = curCoordinate; } //重置當前列 curCoordinate = tempCurCoordinate; //設置單元格類型 setCellType(attributes); //t 屬性值 String type = attributes.getValue(T_ATTR_VALUE); //s 表示該列為字符串 if (S_ATTR_VALUE.equals(type)) { nextIsString = true; } } lastContent = ""; } /** * 設置單元格的類型 * * @param attribute */ private void setCellType(Attributes attribute) { //默認是數字類型 cellDataType = CellDataType.NUMBER; //重置numFmtIndex,numFmtString的值 numFmtIndex = 0; numFmtString = ""; //單元格的格式類型 String cellType = attribute.getValue(T_ATTR_VALUE); //獲取單元格的xf索引,對應style.xml中cellXfs的子元素xf String xfIndexStr = attribute.getValue(S_ATTR_VALUE); if (CellDataType.BOOL.getName().equals(cellType)) { cellDataType = CellDataType.BOOL; } else if (CellDataType.ERROR.getName().equals(cellType)) { cellDataType = CellDataType.ERROR; } else if (CellDataType.INLINESTR.getName().equals(cellType)) { cellDataType = CellDataType.INLINESTR; } else if (CellDataType.FORMULA.getName().equals(cellType)) { cellDataType = CellDataType.FORMULA; } else if (CellDataType.SSTINDEX.getName().equals(cellType)) { cellDataType = CellDataType.SSTINDEX; } if (xfIndexStr != null) { int xfIndex = Integer.parseInt(xfIndexStr); XSSFCellStyle xssfCellStyle = stylesTable.getStyleAt(xfIndex); numFmtIndex = xssfCellStyle.getDataFormat(); numFmtString = xssfCellStyle.getDataFormatString(); if (CellDataType.DATE.getName().equals(numFmtString)) { cellDataType = CellDataType.DATE; numFmtString = DATE_FORMAT; } if (numFmtString == null) { cellDataType = CellDataType.NULL; numFmtString = BuiltinFormats.getBuiltinFormat(numFmtIndex); } } } @Override public void endElement(String uri, String localName, String qName) throws SAXException { String value = StringUtils.trim(lastContent); if (T_ELEMENT.equals(qName)) { rowCellList.add(curCell++, value); } else if (C_ELEMENT.equals(qName)) { value = getDataValue(value); //補全單元格之間的空格 fillBlackCell(curCoordinate, preCoordinate, false); rowCellList.add(curCell++, value); } else { //如果是row標簽,說明已經到了一行的結尾 if (ROW_ELEMENT.equals(qName)) { //最大列坐標以第一行的為准 if (curRow == 0) { maxCellCoordinate = curCoordinate; } //補全一行尾部可能缺失的單元格 if (maxCellCoordinate != null) { fillBlackCell(maxCellCoordinate, curCoordinate, true); } rowReader.getRows(sheetIndex, curRow, rowCellList); //一行結束 //清空rowCellList, rowCellList.clear(); //行數增加 curRow++; //當前列置0 curCell = 0; //置空當前列坐標和前一列坐標 curCoordinate = null; preCoordinate = null; } } } /** * 填充空白單元格 * * @param curCoordinate * @param preCoordinate */ private void fillBlackCell(String curCoordinate, String preCoordinate, boolean isEnd) { if (!curCoordinate.equals(preCoordinate)) { int len = calNullCellCount(curCoordinate, preCoordinate, isEnd); for (int i = 0; i < len; i++) { rowCellList.add(curCell++, ""); } } } /** * 計算當前單元格和前一個單元格之間的空白單元格數量 * 如果是尾部則不減1 * * @param curCoordinate * @param preCoordinate * @return */ private int calNullCellCount(String curCoordinate, String preCoordinate, boolean isEnd) { // excel2007最大行數是1048576,最大列數是16384,最后一列列名是XFD String curCellCoordinate = curCoordinate.replaceAll("\\d+", ""); String preCellCoordinate = preCoordinate.replaceAll("\\d+", ""); curCellCoordinate = fillChar(curCellCoordinate, MAX_CELL_BIT, CELL_FILL_STR); preCellCoordinate = fillChar(preCellCoordinate, MAX_CELL_BIT, CELL_FILL_STR); char[] cur = curCellCoordinate.toCharArray(); char[] pre = preCellCoordinate.toCharArray(); int len = (cur[0] - pre[0]) * 26 * 26 + (cur[1] - pre[1]) * 26 + (cur[2] - pre[2]); if (!isEnd) { len = len - 1; } return len; } /** * 將不足指定位數的字符串補全,高位補上指定字符串 * * @param cellCoordinate * @param maxLen * @param fillChar * @return */ private String fillChar(String cellCoordinate, int maxLen, String fillChar) { int coordinateLen = cellCoordinate.length(); if (coordinateLen < maxLen) { for (int i = 0; i < (maxLen - coordinateLen); i++) { cellCoordinate = fillChar + cellCoordinate; } } return cellCoordinate; } private String getDataValue(String lastContent) { String value = ""; XSSFRichTextString xssfRichTextString = null; switch (cellDataType) { case BOOL: char first = lastContent.charAt(0); value = first == '0' ? "FALSE" : "TRUE"; break; case ERROR: value = "\"ERROR:" + lastContent + '"'; break; case FORMULA: value = '"' + lastContent + '"'; break; case INLINESTR: xssfRichTextString = new XSSFRichTextString(lastContent); value = xssfRichTextString.getString(); xssfRichTextString = null; break; case SSTINDEX: try { int index = Integer.parseInt(lastContent); xssfRichTextString = new XSSFRichTextString(sharedStringsTable.getEntryAt (index)); value = xssfRichTextString.getString(); xssfRichTextString = null; } catch (NumberFormatException e) { value = lastContent; } break; case NUMBER: if (numFmtString != null) { value = dataFormatter.formatRawCellContents(Double.parseDouble(lastContent), numFmtIndex, numFmtString); } else { value = lastContent; } value = value.replace("_", ""); break; case DATE: value = dataFormatter.formatRawCellContents(Double.parseDouble(lastContent), numFmtIndex, numFmtString); break; default: value = ""; break; } return value; } @Override public void characters(char[] ch, int start, int length) throws SAXException { // 得到單元格內容的值 lastContent += new String(ch, start, length); } /** * 關閉流 * * @param inputStream */ public void closeStream(InputStream inputStream) { if (inputStream != null) { try { inputStream.close(); } catch (IOException e) { e.printStackTrace(); } } } public static void main(String[] args) { // System.out.println("m/d/yy" == CellDataType.DATE.getName()); // System.out.println('"' + "ss" + '"'); // char[] letter = "@@A".toCharArray(); // char[] letter_1 = "@@B".toCharArray(); // int res = (letter[0] - letter_1[0]) * 26 * 26 + (letter[1] - letter_1[1]) * 26 + // (letter[2] - letter_1[2]); // System.out.println(res); // System.out.println(Integer.valueOf('@')); IExcelRowReader rowReader = new ExcelRowReader(); try { // ExcelReaderUtil.readExcel(rowReader, // "E://2016-07-04-011940a.xls"); System.out.println("**********************************************"); ExcelReaderUtil.read2007Excel(rowReader, "F:\\studygit\\test.xlsx"); } catch (Exception e) { e.printStackTrace(); } } }