Reading very large Excel files in PHP


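Scenario and pain points

Description

Today an old classmate got in touch. His company's logistics business currently runs entirely on Excel, and because of the monthly data volume a single workbook holds roughly a million rows and is close to 100 MB, so opening and searching it is very slow.

That brought a related scenario to mind: importing data from an Excel file that may be very large. Commonly used approaches such as PHPExcel often hit execution-time and memory limits here, because the whole workbook is loaded into memory before any row can be used.

Below we use SpreadsheetReader, a library that reads the file as a stream, to read a large Excel file.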
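To make the idea of stream processing concrete, here is a minimal sketch of row-by-row reading with a PHP generator. It is not SpreadsheetReader's implementation: it reads a small CSV file that the script creates itself, and the helper name `streamCsvRows` is made up for illustration. The point is only that a consumer which sees one row at a time never needs the whole file in memory.

```php
<?php
// Illustration only: this is NOT SpreadsheetReader's code. It streams a CSV
// file with a generator to show why row-by-row reading keeps memory flat.

/**
 * Yield one row at a time from a CSV file (hypothetical helper).
 */
function streamCsvRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }
    try {
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            yield $row; // hand one row to the caller, then resume here
        }
    } finally {
        fclose($handle);
    }
}

// Build a small sample file so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'rows');
$out = fopen($path, 'w');
for ($i = 1; $i <= 1000; $i++) {
    fwrite($out, "{$i},order-{$i}\n");
}
fclose($out);

$count = 0;
$lastRow = null;
foreach (streamCsvRows($path) as $row) {
    $count++;        // only the current row is held at any moment
    $lastRow = $row;
}
unlink($path);

echo "rows read: {$count}" . PHP_EOL;
```

The same loop shape applies to any reader that exposes rows through an iterator, which is how the library is used in the code further down.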

Implementation

Description

The key points are explained in the code comments.

Code


<?php
/**
 * Created by PhpStorm.
 * User: qkl
 * Date: 2018/7/11
 * Time: 15:14
 */

set_time_limit(0);   // Remove the script execution time limit (0 = no limit)
//ini_set('memory_limit','200M');    // Optionally raise the memory limit for this run

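// Format a byte count as a human-readable string, e.g. 1536 -> "1.5 kb".
function convert($size)
{
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    if ($size <= 0) {
        return '0 b'; // log(0) is undefined, so handle it explicitly
    }
    $i = min((int) floor(log($size, 1024)), count($unit) - 1);
    return round($size / pow(1024, $i), 2) . ' ' . $unit[$i];
}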

require '../vendor/autoload.php';

$start = memory_get_usage();
echo convert($start) . PHP_EOL;
//$inputFileName = './11111111.xlsx';
$inputFileName = './example1.xlsx';

// If you need to parse XLS files, include php-excel-reader

$startTime = microtime(true);

$Reader = new SpreadsheetReader($inputFileName);

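// Get all worksheets in the current file
$sheets = $Reader->Sheets();
if (!$sheets) {
    die("No worksheets found");
}

// Switch to the worksheet to process
$Reader->ChangeSheet(0);

// Dump the current row of the current worksheet
var_dump($Reader->current());

// The reader class implements an iterator, so rows can be processed with foreach.
// Note: with a very large file this loop is slow, but it does not cause memory problems.
//$i = 0;
//foreach ($Reader as $Row)
//{
//    if ($i>=3) {
//        break;
//    }
//
//    echo $i . PHP_EOL;
//    print_r($Row);
//
//    $i++;
//}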

$endTime = microtime(true);
$memoryUse = memory_get_usage();

echo "Memory usage: " . convert($memoryUse) . "; time: " . ($endTime - $startTime) . PHP_EOL;
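For a real import, the commented-out foreach loop above would usually hand rows to a database or queue in groups rather than one by one. The following is a rough sketch of that pattern, under the assumption that the reader can be passed anywhere an `iterable` is accepted. The helper `processInBatches` and the sample rows are hypothetical, and an `ArrayIterator` stands in for `$Reader` so the snippet runs without the library or a spreadsheet.

```php
<?php
/**
 * Feed rows from any iterable to $handler in fixed-size batches.
 * Hypothetical helper; not part of SpreadsheetReader.
 */
function processInBatches(iterable $rows, int $batchSize, callable $handler): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('batchSize must be >= 1');
    }
    $batch = [];
    $total = 0;
    foreach ($rows as $row) {
        $batch[] = $row;
        $total++;
        if (count($batch) === $batchSize) {
            $handler($batch);
            $batch = [];   // drop processed rows so memory stays bounded
        }
    }
    if ($batch !== []) {
        $handler($batch);  // flush the final partial batch
    }
    return $total;
}

// Stand-in data; with the library you would pass $Reader instead (assumption).
$rows = new ArrayIterator([['a', 1], ['b', 2], ['c', 3], ['d', 4], ['e', 5]]);

$batchSizes = [];
$total = processInBatches($rows, 2, function (array $batch) use (&$batchSizes) {
    $batchSizes[] = count($batch); // e.g. one multi-row INSERT per batch
});

echo "rows: {$total}, batches: " . implode(',', $batchSizes) . PHP_EOL;
```

Memory use is then bounded by the batch size rather than by the size of the file, at the cost of one handler call per batch.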

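Results

Test notes

The example1.xlsx file read above is about 100 MB. Because reading it is slow, the test only reads the current row of the default worksheet.
The data is sensitive, so the values have been masked.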
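One caveat about the measurement: the script calls `memory_get_usage()` at the end, which reports memory held at that moment, not the highest point reached while reading. If the peak matters, `memory_get_peak_usage()` can be recorded as well. The sketch below wraps both timing and peak memory in a helper. The name `measure` and the toy workload are illustrative assumptions, not part of the original test.

```php
<?php
/**
 * Run $task and report elapsed seconds plus peak memory.
 * Hypothetical helper for illustration.
 */
function measure(callable $task): array
{
    $start = microtime(true);
    $result = $task();
    return [
        'result'    => $result,
        'seconds'   => microtime(true) - $start,
        'peakBytes' => memory_get_peak_usage(), // highest usage so far in this script
    ];
}

// Toy workload standing in for the spreadsheet read.
$stats = measure(function () {
    $sum = 0;
    foreach (range(1, 1000) as $n) {
        $sum += $n;
    }
    return $sum;
});

printf(
    "result=%d, %.4f s, peak=%d bytes\n",
    $stats['result'],
    $stats['seconds'],
    $stats['peakBytes']
);
```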

Logged memory usage


147.77 kb
array (size=50)
  0 => string 'xxxxxxxxxxxxxx' (length=25)
  1 => string 'xxxxxxxxxxxxxx' (length=15)
  2 => string 'xxxxxxxxxxxxxx' (length=18)
  3 => string 'xxxxxxxxxxxxxx' (length=12)
  4 => string 'xxxxxxxxxxxxxx' (length=12)
  5 => string 'xxxxxxxxxxxxxx' (length=12)
  6 => string 'xxxxxxxxxxxxxx' (length=24)
  7 => string 'xxxxxxxxxxxxxx' (length=12)
  8 => string 'xxxxxxxxxxxxxx' (length=27)
  9 => string 'xxxxxxxxxxxxxx' (length=12)
  10 => string 'xxxxxxxxxxxxxx' (length=15)
  11 => string 'xxxxxxxxxxxxxx' (length=28)
  12 => string 'xxxxxxxxxxxxxx' (length=9)
  13 => string 'xxxxxxxxxxxxxx' (length=12)
  14 => string 'xxxxxxxxxxxxxx' (length=9)
  15 => string 'xxxxxxxxxxxxxx' (length=6)
  16 => string 'xxxxxxxxxxxxxx' (length=9)
  17 => string 'xxxxxxxxxxxxxx' (length=3)
  18 => string 'xxxxxxxxxxxxxx' (length=6)
  19 => string 'xxxxxxxxxxxxxx' (length=3)
  20 => string 'xxxxxxxxxxxxxx' (length=15)
  21 => string 'xxxxxxxxxxxxxx' (length=15)
  22 => string 'xxxxxxxxxxxxxx' (length=19)
  23 => string 'xxxxxxxxxxxxxx' (length=13)
  24 => string 'xxxxxxxxxxxxxx' (length=19)
  25 => string 'xxxxxxxxxxxxxx' (length=12)
  26 => string 'xxxxxxxxxxxxxx' (length=12)
  27 => string 'xxxxxxxxxxxxxx' (length=12)
  28 => string 'xxxxxxxxxxxxxx' (length=6)
  29 => string 'xxxxxxxxxxxxxx' (length=12)
  30 => string 'xxxxxxxxxxxxxx' (length=6)
  31 => string 'xxxxxxxxxxxxxx' (length=15)
  32 => string 'xxxxxxxxxxxxxx' (length=24)
  33 => string 'xxxxxxxxxxxxxx' (length=18)
  34 => string 'xxxxxxxxxxxxxx' (length=18)
  35 => string 'xxxxxxxxxxxxxx' (length=24)
  36 => string 'xxxxxxxxxxxxxx' (length=12)
  37 => string 'xxxxxxxxxxxxxx' (length=18)
  38 => string 'xxxxxxxxxxxxxx' (length=21)
  39 => string 'xxxxxxxxxxxxxx' (length=9)
  40 => string 'xxxxxxxxxxxxxx' (length=9)
  41 => string 'xxxxxxxxxxxxxx' (length=18)
  42 => string 'xxxxxxxxxxxxxx' (length=21)
  43 => string 'xxxxxxxxxxxxxx' (length=15)
  44 => string 'xxxxxxxxxxxxxx' (length=12)
  45 => string 'xxxxxxxxxxxxxx' (length=6)
  46 => string 'xxxxxxxxxxxxxx' (length=12)
  47 => string 'xxxxxxxxxxxxxx' (length=22)
  48 => string 'xxxxxxxxxxxxxx' (length=22)
  49 => string '' (length=0)

Memory usage: 207.55 kb; time: 9.5835480690002

Original article: https://segmentfault.com/a/1190000015601758

