When using JDBC to insert data into a MySQL database, especially when inserting large batches of rows in one go (100,000+), how can we improve throughput?
In the JDBC API, two methods on PreparedStatement/Statement deserve particular attention:
void addBatch() throws SQLException
Adds a set of parameters to this PreparedStatement object's batch of commands.
int[] executeBatch() throws SQLException
Submits a batch of commands to the database for execution and if all commands execute successfully, returns an array of update counts. The int elements of the array that is returned are ordered to correspond to the commands in the batch, which are ordered according to the order in which they were added to the batch.
Using addBatch() and executeBatch() together gives us batch processing. Note, however, that the connection must first be switched to manual commit with connection.setAutoCommit(false), and connection.commit() must be called after the statement has been executed.
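In outline, the pattern looks like this (a minimal sketch with an illustrative table demo(k, v); this is not the program measured below):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.Arrays;
import java.util.List;

public class BatchInsertSketch {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/db_ip", "root", "***");
        conn.setAutoCommit(false);                         // switch to manual commit
        PreparedStatement ps = conn.prepareStatement("INSERT INTO demo (k, v) VALUES (?,?)");
        List<String[]> rows = Arrays.asList(new String[]{"1", "a"}, new String[]{"2", "b"});
        for (String[] row : rows) {
            ps.setString(1, row[0]);
            ps.setString(2, row[1]);
            ps.addBatch();                                 // queue one set of parameters
        }
        int[] counts = ps.executeBatch();                  // send the whole batch at once
        conn.commit();                                     // commit after the batch has executed
        System.out.println("batched statements executed: " + counts.length);
        conn.close();
    }
}

The complete program used for the measurements below is: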
package cyl.demo.ipsearcher;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class DbStoreHelper {

    private String insert_sql;
    private String charset;
    private boolean debug;
    private String connectStr;
    private String username;
    private String password;

    public DbStoreHelper() {
        connectStr = "jdbc:mysql://localhost:3306/db_ip";
        // connectStr += "?useServerPrepStmts=false&rewriteBatchedStatements=true";
        insert_sql = "INSERT INTO tb_ipinfos (iplong1,iplong2,ipstr1,ipstr2,ipdesc) VALUES (?,?,?,?,?)";
        charset = "gbk";
        debug = true;
        username = "root";
        password = "***";
    }

    public void storeToDb(String srcFile) throws IOException {
        BufferedReader bfr = new BufferedReader(new InputStreamReader(new FileInputStream(srcFile), charset));
        try {
            doStore(bfr);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            bfr.close();
        }
    }

    private void doStore(BufferedReader bfr) throws ClassNotFoundException, SQLException, IOException {
        Class.forName("com.mysql.jdbc.Driver");
        Connection conn = DriverManager.getConnection(connectStr, username, password);
        conn.setAutoCommit(false); // switch to manual commit
        int count = 0;
        PreparedStatement psts = conn.prepareStatement(insert_sql);
        String line = null;
        while (null != (line = bfr.readLine())) {
            String[] infos = line.split(";");
            if (infos.length < 5) continue;
            if (debug) {
                System.out.println(line);
            }
            psts.setLong(1, Long.valueOf(infos[0]));
            psts.setLong(2, Long.valueOf(infos[1]));
            psts.setString(3, infos[2]);
            psts.setString(4, infos[3]);
            psts.setString(5, infos[4]);
            psts.addBatch(); // add the row to the batch
            count++;
        }
        psts.executeBatch(); // execute the whole batch
        conn.commit();       // commit once at the end
        System.out.println("All down : " + count);
        conn.close();
    }
}
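For very large inputs it is also common to flush the batch every few thousand rows instead of accumulating everything in a single batch, which keeps the driver's memory use bounded. Below is a sketch of such a variant of doStore(); BATCH_SIZE is an assumed value, not something taken from the original code:

private void doStoreChunked(BufferedReader bfr) throws ClassNotFoundException, SQLException, IOException {
    final int BATCH_SIZE = 5000;                    // assumed chunk size
    Class.forName("com.mysql.jdbc.Driver");
    Connection conn = DriverManager.getConnection(connectStr, username, password);
    conn.setAutoCommit(false);
    PreparedStatement psts = conn.prepareStatement(insert_sql);
    int count = 0;
    String line = null;
    while (null != (line = bfr.readLine())) {
        String[] infos = line.split(";");
        if (infos.length < 5) continue;
        psts.setLong(1, Long.valueOf(infos[0]));
        psts.setLong(2, Long.valueOf(infos[1]));
        psts.setString(3, infos[2]);
        psts.setString(4, infos[3]);
        psts.setString(5, infos[4]);
        psts.addBatch();
        count++;
        if (count % BATCH_SIZE == 0) {
            psts.executeBatch();                    // send this chunk to the server
            conn.commit();                          // and commit it
        }
    }
    psts.executeBatch();                            // flush whatever is left
    conn.commit();
    System.out.println("All down : " + count);
    conn.close();
}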
After it finishes, the output is:
All down : 103498
Convert finished.
All spend time/s : 47
47 seconds is still not very efficient and falls short of what we are after, so further improvement is needed.
Additional parameters can be added to the MySQL JDBC connection string:
rewriteBatchedStatements=true: the MySQL JDBC driver does not rewrite batched statements by default; enabling this parameter lets the driver rewrite the SQL it submits to the database, collapsing a batch of single-row INSERTs into one multi-row INSERT. For details see: http://www.cnblogs.com/chenjianjx/archive/2012/08/14/2637914.html
useServerPrepStmts=false: with server-side prepared statements disabled, the driver uses com.mysql.jdbc.PreparedStatement to assemble the SQL locally, so the statement that finally reaches the database already has the ? placeholders replaced with their values.
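As an aside, the same two parameters can also be supplied programmatically through java.util.Properties instead of being appended to the URL. A minimal sketch, reusing the connectStr, username and password fields from the listing above (and requiring an extra import of java.util.Properties):

Properties props = new Properties();
props.setProperty("user", username);
props.setProperty("password", password);
props.setProperty("rewriteBatchedStatements", "true"); // rewrite the batch into multi-row INSERTs
props.setProperty("useServerPrepStmts", "false");      // assemble the final SQL on the client side
Connection conn = DriverManager.getConnection(connectStr, props);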
Here we make that small improvement and append the following to the connection string (i.e. uncomment the corresponding line in the constructor):
connectStr += "?useServerPrepStmts=false&rewriteBatchedStatements=true";
Running it again gives:
All down : 103498
Convert finished.
All spend time/s : 10
With the same amount of data, this run took only 10 seconds; throughput is greatly improved.