Inserting Large Volumes of Data into a Database in Practice

Reposted article, 2020-07-01 22:20. Tags: batch insert / mysql

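I. Background

In production we have run into all sorts of reporting and big-data analysis scenarios, and sooner or later they involve writing a fairly large volume of data to the database.

II. Approaches

1. Committing a transaction in chunks

Open a transaction, insert the rows one at a time, and commit once every fixed number of rows.

2. The MyBatis foreach tag

In essence, this concatenates the rows as strings into the VALUES clause of a single INSERT statement.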
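To make the "concatenate into VALUES" idea concrete, here is a minimal sketch of the kind of statement such a foreach expansion produces. It is plain Java string building, not MyBatis itself, and the class name `MultiRowInsertSql`, the table `customer`, and the columns `id` and `name` are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.List;

// Illustration only: builds the text of a multi-row INSERT with JDBC-style
// placeholders. Table and column names are hypothetical, and identifiers are
// not escaped, so they must come from trusted code and never from user input.
final class MultiRowInsertSql {

    static String build(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount <= 0) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // One "(?, ?, ...)" group per row, with one placeholder per column.
        String oneRow = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table
                + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, oneRow));
    }

    public static void main(String[] args) {
        // prints: INSERT INTO customer (id, name) VALUES (?, ?), (?, ?), (?, ?)
        System.out.println(build("customer", List.of("id", "name"), 3));
    }
}
```

With 86 columns and 2000 rows per statement, the generated text and its parameter list become large, which fits the extra memory use reported for this approach.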

III. Say nothing without code

1. The transactional approach first. Here is the code:

public int synCustomerByTrans(InputStream inputStreamFromSftp) throws Exception {
        String temp = null;
        int row = 0;
        // Open the session (transaction); it is closed in the finally block
        SqlSession sqlSession = transManager.openSession();
        try (InputStreamReader isr = new InputStreamReader(inputStreamFromSftp, "GBK");
             BufferedReader reader = new BufferedReader(isr)) {
            while ((temp = reader.readLine()) != null) {
                // Parse one line of the file into a customer record
                Customerinformation ci = utilForFillName.convertToCustomer(temp);
                if (ci != null) {
                    row++;
                    cbcMapper.addCustomer(ci, sqlSession);
                    // Commit once every 2000 inserted rows. The check sits inside
                    // this branch so that skipped lines do not trigger extra commits.
                    if (row % 2000 == 0) {
                        transManager.commit(sqlSession);
                    }
                }
            }
            return row;
        } catch (Exception e) {
            e.printStackTrace();
            // The message carries the number of rows processed; keep the cause too
            throw new Exception(row + "", e);
        } finally {
            // close() tries to commit first, so a trailing partial chunk is not lost
            transManager.close(sqlSession);
        }
    }

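The code is very simple: open a transaction, then loop over the file and insert each parsed object into the database, committing once every 2000 rows, and finally close the transaction. To keep the code short, closing the transaction attempts a commit first, so that any leftover rows do not stay uncommitted.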

The mapper code is also very simple, so it is not shown here.
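The commit-every-N pattern with an explicit final commit can be sketched independently of MyBatis. This is a minimal sketch under stated assumptions: `ChunkedCommit`, `insertOne` and `commit` are hypothetical names standing in for the mapper call and the transaction manager, and the tail commit is done explicitly here instead of relying on the close step.

```java
import java.util.function.Consumer;

// Hypothetical helper, not part of the original code: inserts rows one by one
// and commits after every full chunk, plus once more for a partial tail.
final class ChunkedCommit {

    /** Returns the number of commits issued. */
    static <T> int insertWithPeriodicCommit(Iterable<T> rows, int chunkSize,
                                            Consumer<? super T> insertOne, Runnable commit) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int pending = 0;   // rows inserted since the last commit
        int commits = 0;
        for (T row : rows) {
            insertOne.accept(row);
            pending++;
            if (pending == chunkSize) {
                commit.run();
                commits++;
                pending = 0;
            }
        }
        // Commit the trailing partial chunk, if there is one.
        if (pending > 0) {
            commit.run();
            commits++;
        }
        return commits;
    }
}
```

Tracking the number of pending rows, instead of testing the total row count, means a commit is issued only when there is something new to commit.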

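2. Results

This import had 60,000 rows, each object with 86 fields. It took a bit over 40 minutes, an average of roughly 25 rows per second (60,000 rows over about 2,400 seconds).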

3. The batch approach. Here is the code:

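public int synCustomerByBatch(InputStream inputStreamFromSftp) throws Exception {
        String temp = null;
        int row = 0;
        try (InputStreamReader isr = new InputStreamReader(inputStreamFromSftp, "GBK");
             BufferedReader reader = new BufferedReader(isr)) {
            // Pre-size the buffer to the chunk size to avoid resizing
            ArrayList<Customerinformation> list = new ArrayList<>(2000);
            while ((temp = reader.readLine()) != null) {
                // Parse one line of the file into a customer record
                Customerinformation ci = utilForFillName.convertToCustomer(temp);
                if (ci != null) {
                    row++;
                    list.add(ci);
                    // Flush a full chunk of 2000 rows in one batch insert. The check
                    // sits inside this branch so an empty list is never submitted.
                    if (row % 2000 == 0) {
                        cbcMapper.batchAddCustomer(list);
                        list.clear();
                    }
                }
            }
            // Flush the trailing partial chunk, if any
            if (!list.isEmpty()) {
                cbcMapper.batchAddCustomer(list);
            }
            return row;
        } catch (Exception e) {
            e.printStackTrace();
            // The message carries the number of rows processed; keep the cause too
            throw new Exception(row + "", e);
        }
    }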

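This code is simple as well: the parsed records are first collected in a list (giving the list an initial capacity also avoids time spent resizing), and the list is then inserted with the MyBatis foreach tag. That code is simple too, so it is not shown here.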

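4. Results

Same data as above: 60,000 rows with 86 fields each. It took a bit over 50 seconds. Yes, you read that right, a bit over 50 seconds. It felt like a hard knock on the head.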
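As a quick sanity check on the two measurements, the throughput can be worked out from the reported row count and durations. The sketch below takes the durations at face value, rounded to 40 minutes and 50 seconds; the class and method names are hypothetical.

```java
// Back-of-the-envelope throughput comparison using the figures reported above.
// The durations are rounded ("a bit over 40 minutes", "a bit over 50 seconds").
final class Throughput {

    static double rowsPerSecond(long rows, double seconds) {
        return rows / seconds;
    }

    public static void main(String[] args) {
        long rows = 60_000;
        double perRow = rowsPerSecond(rows, 40 * 60); // per-row inserts, chunked commits
        double batch = rowsPerSecond(rows, 50);       // multi-row inserts via foreach
        // prints: per-row: 25 rows/s, batch: 1200 rows/s, speedup: 48x
        System.out.printf("per-row: %.0f rows/s, batch: %.0f rows/s, speedup: %.0fx%n",
                perRow, batch, batch / perRow);
    }
}
```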

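IV. Summary

I had assumed the transactional approach would be fast and did not expect batching to be faster. On closer analysis, opening the transaction, setting rollback points and so on appear to be fairly expensive, and the per-row version most likely also pays the cost of sending each row to the database as its own statement. Concatenating the rows into VALUES works well; it just wastes some memory. The experiment bears this out.

The transactional approach used about 500 MB of memory, while the batch approach used about 800 MB, so keep that in mind when using it.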
