Java study notes: a simple way to read txt text data with JSP

Reposted. 2012-04-21 19:58. Tags: java / jsp / text reading

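This method needs neither a database nor an Excel plug-in; the program is short and quick to put together.

Goal: there are more than 200 txt files like the one below, each holding 20 irregularly formatted records, and every record in every file has to end up in Excel.

The txt files were saved from a website. Some sites check the session and IP address, which makes direct scraping awkward, so I process the downloaded text files instead and will look into scraping later.

Sample text fragment: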

HIGHLY CITED PAPERS FOR (PEOPLES R CHINA)

Sorted by: Citations Publication Year Journal Title

881 - 900 (of 6910) [ 41 | 42 | 43 | 44 | 45 | 46 | 47 |

48 | 49 | 50 ]

Page 45 of 346

881 Citations: 3

Title:GEVREY HYPOELLIPTICITY FOR A CLASS OF KINETIC EQUATIONS

Authors:CHEN H; LI WX; XU CJ

Source:COMMUN PART DIFF EQUAT 36 (4): 693-728 2011

Addresses:Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R

China.

Univ Rouen, CNRS, UMR 6085, St Etienne, France.

Field:MATHEMATICS

882 Citations: 3

Title:HIGHER AUSLANDER ALGEBRAS ADMITTING TRIVIAL MAXIMAL ORTHOGONAL

SUBCATEGORIES

Authors:HUANG ZY; ZHANG XJ

Source:J ALGEBRA 330 (1): 375-387 MAR 15 2011

Addresses:Nanjing Univ, Dept Math, Nanjing 210093, Jiangsu Prov,

Peoples R China.

Nanjing Univ Informat Sci & Technol, Coll Math & Phys, Nanjing

210044, Jiangsu Prov, Peoples R China.

Field:MATHEMATICS

883 Citations: 3

Title:PREDATOR-PREY SYSTEM WITH STRONG ALLEE EFFECT IN PREY

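In the original post the fields to extract were highlighted in red: Citations, Title, Authors, Source, Addresses, and Field. Two problems came up:

Each page holds several records; how do we tell them apart?
Each field spans a different number of lines; how do we know how many lines to read?

Solution:

Read the file twice.

1. First pass: read the text with readLine() and check how each line begins with startsWith(). If the line begins one of the fields to extract, record its line number in that field's array. (Citations: follows the record number rather than starting the line, so it is matched with a regular expression.)

2. Second pass: a for loop in which each iteration handles one record. When the current line number reaches the one stored in a field's array, start reading that field's text, and stop at the line where the next field begins.

Implementation: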
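The two passes can be sketched in plain Java, outside JSP, on a list of lines already held in memory. This is only an illustration under stated assumptions: the class and method names (TwoPassSketch, indexLabels, readRecord) are made up for this example, every record is assumed to contain all six labels in the same order, and continuation lines are joined with a single space, which is a small departure from plain concatenation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TwoPassSketch {
    // Field labels in the order they appear inside one record.
    static final String[] LABELS =
        {"Citations:", "Title:", "Authors:", "Source:", "Addresses:", "Field:"};

    // Pass 1: for each label, collect the 1-based line numbers where it occurs.
    static List<List<Integer>> indexLabels(List<String> lines) {
        List<List<Integer>> positions = new ArrayList<>();
        for (int f = 0; f < LABELS.length; f++) {
            positions.add(new ArrayList<Integer>());
        }
        for (int n = 1; n <= lines.size(); n++) {
            String line = lines.get(n - 1);
            String compact = line.replace(" ", "");
            for (int f = 0; f < LABELS.length; f++) {
                // "Citations:" follows the record number, so look for it anywhere in the line.
                boolean hit = (f == 0) ? line.contains(LABELS[f]) : compact.startsWith(LABELS[f]);
                if (hit) {
                    positions.get(f).add(n);
                }
            }
        }
        return positions;
    }

    // Pass 2: field f of record i runs from its own label line up to,
    // but not including, the next label line. "Field:" is a single line.
    static String[] readRecord(List<String> lines, List<List<Integer>> positions, int i) {
        String[] values = new String[LABELS.length];
        for (int f = 0; f < LABELS.length; f++) {
            int start = positions.get(f).get(i);
            int end = (f + 1 < LABELS.length) ? positions.get(f + 1).get(i) : start + 1;
            StringBuilder sb = new StringBuilder();
            for (int n = start; n < end; n++) {
                String part = lines.get(n - 1).trim();
                if (part.isEmpty()) {
                    continue; // skip blank lines between wrapped lines
                }
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(part);
            }
            values[f] = sb.toString();
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "882 Citations: 3",
            "Title:HIGHER AUSLANDER ALGEBRAS ADMITTING TRIVIAL MAXIMAL ORTHOGONAL",
            "SUBCATEGORIES",
            "Authors:HUANG ZY; ZHANG XJ",
            "Source:J ALGEBRA 330 (1): 375-387 MAR 15 2011",
            "Addresses:Nanjing Univ, Dept Math, Nanjing 210093, Jiangsu Prov,",
            "Peoples R China.",
            "Field:MATHEMATICS");
        List<List<Integer>> positions = indexLabels(lines);
        for (String value : readRecord(lines, positions, 0)) {
            System.out.println(value);
        }
    }
}
```

If a record were missing one of the labels, the per-field position lists would fall out of step with each other, so real input would need a check that all six lists have the same length.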

<%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8"%>
<%@ page import="java.util.*,java.io.*,java.util.regex.*" %>

<html>
<body>

<font size="4" face="arial" color="black">
<%

try {

  // dataNum is the number in the file name: 45.txt through 262.txt
  for (int dataNum = 45; dataNum <= 262; dataNum++) {

    String txtpath = this.getClass().getResource("/").getPath() + dataNum + ".txt";
    //out.println(txtpath);

    int lineNum = 0;  // current line number in the first pass

    // Line numbers of each field label; sized for the 20 records each file holds
    int[] citationsNum = new int[20];
    int[] titleNum     = new int[20];
    int[] authorNum    = new int[20];
    int[] sourceNum    = new int[20];
    int[] addressesNum = new int[20];
    int[] fieldNum     = new int[20];

    // Next free index in each of the arrays above
    int cNum = 0;
    int tNum = 0;
    int aNum = 0;
    int sNum = 0;
    int adNum = 0;
    int fNum = 0;

    // "Citations:" does not start its line, so it is matched with a regular expression
    Pattern p = Pattern.compile("Citations:");

    // First pass: read the file line by line
    BufferedReader br = new BufferedReader(new FileReader(txtpath));

    String record;
    while ((record = br.readLine()) != null) {

      lineNum++;  // one more line read

      // Strip spaces once so the startsWith() checks tolerate stray spaces
      String compact = record.replace(" ", "");

      // Citations line
      Matcher m = p.matcher(record);
      if (m.find()) {
        citationsNum[cNum++] = lineNum;
      }

      // Title line
      if (compact.startsWith("Title:")) {
        titleNum[tNum++] = lineNum;
      }

      // Authors line
      if (compact.startsWith("Authors:")) {
        authorNum[aNum++] = lineNum;
      }

      // Source line
      if (compact.startsWith("Source:")) {
        sourceNum[sNum++] = lineNum;
      }

      // Addresses line
      if (compact.startsWith("Addresses:")) {
        addressesNum[adNum++] = lineNum;
      }

      // Field line
      if (compact.startsWith("Field:")) {
        fieldNum[fNum++] = lineNum;
      }

    }
    br.close();  // first pass done; every label's line number is now recorded

    // Second pass: read the fields and print the table

    String CitationsValue = "";  // text collected for each field
    String TitleValue = "";
    String AuthorValue = "";
    String SourceValue = "";
    String AddressValue = "";
    String fieldValue = "";
    int lineNum2;  // current line number in the second pass

    BufferedReader br2 = null;
%>

    <table border=1 cellspacing="0">

<%
    // Each txt file holds 20 records
    for (int i = 0; i < 20; i++) {
      lineNum2 = 0;  // reset the counter and the field values for this record
      CitationsValue = "";
      TitleValue = "";
      AuthorValue = "";
      SourceValue = "";
      AddressValue = "";
      fieldValue = "";

      br2 = new BufferedReader(new FileReader(txtpath));
      while ((record = br2.readLine()) != null) {

        lineNum2++;

        // From the Citations line up to, but not including, the Title line
        if (lineNum2 >= citationsNum[i] && lineNum2 < titleNum[i]) {
          CitationsValue = CitationsValue + record;
        }

        if (lineNum2 >= titleNum[i] && lineNum2 < authorNum[i]) {
          TitleValue = TitleValue + record;
        }

        if (lineNum2 >= authorNum[i] && lineNum2 < sourceNum[i]) {
          AuthorValue = AuthorValue + record;
        }

        if (lineNum2 >= sourceNum[i] && lineNum2 < addressesNum[i]) {
          SourceValue = SourceValue + record;
        }

        if (lineNum2 >= addressesNum[i] && lineNum2 < fieldNum[i]) {
          AddressValue = AddressValue + record;
        }

        // The Field value occupies a single line
        if (lineNum2 == fieldNum[i]) {
          fieldValue = fieldValue + record;
        }

      }
      br2.close();  // close this record's reader before opening the next one

      // One table row per record
%>

      <tr>
        <td><%=CitationsValue%></td>
        <td><%=TitleValue%></td>
        <td><%=AuthorValue%></td>
        <td><%=SourceValue%></td>
        <td><%=AddressValue%></td>
        <td><%=fieldValue%></td>
      </tr>

<%

    } // all records of this file have been printed

%>
    </table>
<%

  } // end of the loop over all files

} catch (Exception e) {
  e.printStackTrace();
}

%>

</font>

</body>
</html>
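The JSP above reopens and rereads each file once per record. A single-pass variant is possible; the sketch below is a different technique from the two-pass method in the post, written in plain Java so it can be run on its own. The names (SinglePassSketch, parse, labelIndex) are invented for this illustration, and it assumes that a line containing "Citations:" always opens a new record and that the "Field:" value fits on one line, as in the sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SinglePassSketch {
    static final String[] LABELS =
        {"Citations:", "Title:", "Authors:", "Source:", "Addresses:", "Field:"};

    // Returns the index of the label this line carries, or -1 for a continuation line.
    static int labelIndex(String line) {
        if (line.contains(LABELS[0])) {
            return 0; // "Citations:" follows the record number, so it is not at the start
        }
        String compact = line.replace(" ", "");
        for (int f = 1; f < LABELS.length; f++) {
            if (compact.startsWith(LABELS[f])) {
                return f;
            }
        }
        return -1;
    }

    // Reads the lines once, appending each line to the field that is currently open.
    static List<String[]> parse(List<String> lines) {
        List<String[]> records = new ArrayList<>();
        String[] current = null; // the record being filled
        int field = -1;          // the open field, or -1 when no field is open
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            int label = labelIndex(line);
            if (label == 0) {
                current = new String[] {"", "", "", "", "", ""};
                records.add(current);
            }
            if (label >= 0) {
                field = label;
            }
            if (current == null || field < 0) {
                continue; // page header or other text outside any field
            }
            current[field] = current[field].isEmpty() ? line : current[field] + " " + line;
            if (field == LABELS.length - 1) {
                field = -1; // "Field:" is one line; ignore text until the next label
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Page 45 of 346",
            "881 Citations: 3",
            "Title:GEVREY HYPOELLIPTICITY FOR A CLASS OF KINETIC EQUATIONS",
            "Authors:CHEN H; LI WX; XU CJ",
            "Source:COMMUN PART DIFF EQUAT 36 (4): 693-728 2011",
            "Addresses:Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R",
            "China.",
            "Univ Rouen, CNRS, UMR 6085, St Etienne, France.",
            "Field:MATHEMATICS");
        for (String[] rec : parse(lines)) {
            System.out.println(String.join(" | ", rec));
        }
    }
}
```

Because the records are collected in a list, this variant does not depend on every file holding exactly 20 records.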

Result:

The page renders one table per txt file; copy the table contents into Excel.
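As an alternative to copying from the browser, the extracted fields could be written as CSV lines, which Excel can open directly. The sketch below only shows the quoting step and is not part of the original program; the class and method names (CsvSketch, escape, toCsvLine) are hypothetical. Quoting matters here because values such as the Addresses field contain commas.

```java
class CsvSketch {
    // Wraps a value in double quotes when it contains a comma, a quote or a line break,
    // doubling any quotes inside it (the usual CSV convention).
    static String escape(String value) {
        boolean needsQuotes = value.contains(",") || value.contains("\"")
            || value.contains("\n") || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Joins one record's fields into a single CSV line.
    static String toCsvLine(String[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(escape(fields[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] record = {
            "881 Citations: 3",
            "Title:GEVREY HYPOELLIPTICITY FOR A CLASS OF KINETIC EQUATIONS",
            "Authors:CHEN H; LI WX; XU CJ",
            "Field:MATHEMATICS"
        };
        System.out.println(toCsvLine(record));
    }
}
```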
