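Scenario and pain points
Description
Today an old classmate reached out to me. His company currently handles its logistics business in Excel. Because of the monthly data volume, a single Excel file holds roughly a million rows and is close to 100 MB, so opening and searching it is very slow. This brings to mind a common scenario: data has to be imported, and the Excel file may be very large. Commonly used approaches such as PHPExcel frequently run into execution time and memory limits here.
Below, we use SpreadsheetReader, a library built on stream processing, to read a large Excel file.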
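To make the idea of stream processing concrete, here is a minimal sketch that reads a file one row at a time with a PHP generator, so only the current row is held in memory. It uses a plain CSV file instead of xlsx, and `readRows()` and the temporary sample file are hypothetical names invented for this illustration. It is not how SpreadsheetReader works internally, only the same general pattern.

```php
<?php
// Illustrative only: stream a CSV file row by row with a generator.
// readRows() and the sample file are hypothetical, not part of SpreadsheetReader.

/**
 * Yield one parsed row at a time instead of building an array of all rows.
 */
function readRows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        // Length 0 means "no line length limit".
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            yield $row;
        }
    } finally {
        fclose($handle);
    }
}

// Build a small sample file so the sketch runs on its own.
$path = tempnam(sys_get_temp_dir(), 'rows');
$out = fopen($path, 'w');
for ($i = 1; $i <= 1000; $i++) {
    fputcsv($out, [$i, "order-$i", $i * 10], ',', '"', '\\');
}
fclose($out);

// Consume the rows one by one; the full file is never loaded at once.
$rowCount = 0;
$total = 0;
foreach (readRows($path) as $row) {
    $rowCount++;
    $total += (int) $row[2];
}
unlink($path);

echo "rows: $rowCount, total: $total" . PHP_EOL;
```

The trade-off is the same one this post runs into later: memory stays flat, but every row still has to be parsed in order, so a full pass over a very large file takes time.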
Implementation
Description
The key points are explained in the code comments.
Code
<?php
/**
* Created by PhpStorm.
* User: qkl
* Date: 2018/7/11
* Time: 15:14
*/
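set_time_limit(0); // Remove the script execution time limit (0 = no limit)
//ini_set('memory_limit','200M'); // Temporarily raise the memory limit if needed
// Format a byte count as a human-readable string such as "147.77 kb"
function convert($size)
{
    $unit = array('b', 'kb', 'mb', 'gb', 'tb', 'pb');
    if ($size <= 0) {
        return '0 b'; // log(0) is undefined, so handle it explicitly
    }
    $i = (int) floor(log($size, 1024));
    return round($size / pow(1024, $i), 2) . ' ' . $unit[$i];
}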
require '../vendor/autoload.php';
$start = memory_get_usage();
echo convert($start) . PHP_EOL;
//$inputFileName = './11111111.xlsx';
$inputFileName = './example1.xlsx';
// If you need to parse XLS files, include php-excel-reader
$startTime = microtime(true);
$Reader = new SpreadsheetReader($inputFileName);
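// Get all worksheets in the current file
$sheets = $Reader->Sheets();
if (!$sheets) {
    die("No worksheets found");
}
// Switch to the worksheet to process
$Reader->ChangeSheet(0);
// Dump the current row of the current worksheet
var_dump($Reader->current());
// The reader class implements an iterator interface, so rows can be processed with foreach
// Note: for very large files this loop is slow, but it does not cause memory problems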
//$i = 0;
//foreach ($Reader as $Row)
//{
// if ($i>=3) {
// break;
// }
//
// echo $i . PHP_EOL;
// print_r($Row);
//
// $i++;
//}
$endTime = microtime(true);
$memoryUse = memory_get_usage();
echo "Memory usage: " . convert($memoryUse) . "; elapsed: " . ($endTime - $startTime) . PHP_EOL;
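Since the motivating scenario is importing data, one natural next step is to group the iterated rows into batches, so that each batch can be written in a single operation instead of row by row. The sketch below assumes nothing about SpreadsheetReader beyond the fact that it can be used in `foreach`; `batchRows()` is a hypothetical helper, and an `ArrayIterator` with fake rows stands in for the reader so the example runs on its own.

```php
<?php
/**
 * Group rows from any iterable into fixed-size batches.
 * batchRows() is a hypothetical helper, not part of SpreadsheetReader.
 */
function batchRows(iterable $rows, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1');
    }
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch; // final, possibly smaller, batch
    }
}

// Stand-in for the reader: seven fake rows in an ArrayIterator.
$fakeRows = [];
for ($i = 1; $i <= 7; $i++) {
    $fakeRows[] = ["A$i", "B$i"];
}
$rows = new ArrayIterator($fakeRows);

$batchSizes = [];
foreach (batchRows($rows, 3) as $batch) {
    // In a real import, one multi-row write would happen here.
    $batchSizes[] = count($batch);
}

echo implode(',', $batchSizes) . PHP_EOL;
```

Only one batch is held in memory at a time, so the batch size becomes the knob that trades memory against the number of write operations.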
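Results
Test notes
The example1.xlsx file read above is about 100 MB. Because processing it is slow, the test only reads the current row of the default worksheet.
The data is sensitive, so it has been masked.
Logged memory usage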
147.77 kb
array (size=50)
0 => string 'xxxxxxxxxxxxxx' (length=25)
1 => string 'xxxxxxxxxxxxxx' (length=15)
2 => string 'xxxxxxxxxxxxxx' (length=18)
3 => string 'xxxxxxxxxxxxxx' (length=12)
4 => string 'xxxxxxxxxxxxxx' (length=12)
5 => string 'xxxxxxxxxxxxxx' (length=12)
6 => string 'xxxxxxxxxxxxxx' (length=24)
7 => string 'xxxxxxxxxxxxxx' (length=12)
8 => string 'xxxxxxxxxxxxxx' (length=27)
9 => string 'xxxxxxxxxxxxxx' (length=12)
10 => string 'xxxxxxxxxxxxxx' (length=15)
11 => string 'xxxxxxxxxxxxxx' (length=28)
12 => string 'xxxxxxxxxxxxxx' (length=9)
13 => string 'xxxxxxxxxxxxxx' (length=12)
14 => string 'xxxxxxxxxxxxxx' (length=9)
15 => string 'xxxxxxxxxxxxxx' (length=6)
16 => string 'xxxxxxxxxxxxxx' (length=9)
17 => string 'xxxxxxxxxxxxxx' (length=3)
18 => string 'xxxxxxxxxxxxxx' (length=6)
19 => string 'xxxxxxxxxxxxxx' (length=3)
20 => string 'xxxxxxxxxxxxxx' (length=15)
21 => string 'xxxxxxxxxxxxxx' (length=15)
22 => string 'xxxxxxxxxxxxxx' (length=19)
23 => string 'xxxxxxxxxxxxxx' (length=13)
24 => string 'xxxxxxxxxxxxxx' (length=19)
25 => string 'xxxxxxxxxxxxxx' (length=12)
26 => string 'xxxxxxxxxxxxxx' (length=12)
27 => string 'xxxxxxxxxxxxxx' (length=12)
28 => string 'xxxxxxxxxxxxxx' (length=6)
29 => string 'xxxxxxxxxxxxxx' (length=12)
30 => string 'xxxxxxxxxxxxxx' (length=6)
31 => string 'xxxxxxxxxxxxxx' (length=15)
32 => string 'xxxxxxxxxxxxxx' (length=24)
33 => string 'xxxxxxxxxxxxxx' (length=18)
34 => string 'xxxxxxxxxxxxxx' (length=18)
35 => string 'xxxxxxxxxxxxxx' (length=24)
36 => string 'xxxxxxxxxxxxxx' (length=12)
37 => string 'xxxxxxxxxxxxxx' (length=18)
38 => string 'xxxxxxxxxxxxxx' (length=21)
39 => string 'xxxxxxxxxxxxxx' (length=9)
40 => string 'xxxxxxxxxxxxxx' (length=9)
41 => string 'xxxxxxxxxxxxxx' (length=18)
42 => string 'xxxxxxxxxxxxxx' (length=21)
43 => string 'xxxxxxxxxxxxxx' (length=15)
44 => string 'xxxxxxxxxxxxxx' (length=12)
45 => string 'xxxxxxxxxxxxxx' (length=6)
46 => string 'xxxxxxxxxxxxxx' (length=12)
47 => string 'xxxxxxxxxxxxxx' (length=22)
48 => string 'xxxxxxxxxxxxxx' (length=22)
49 => string '' (length=0)
Memory usage: 207.55 kb; elapsed: 9.5835480690002
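One caveat when reading these numbers: `memory_get_usage()` reports the memory held at the moment it is called, not the highest point reached while the script was running. A rough sketch of a measurement wrapper that records both figures follows; `measure()` is a hypothetical helper written for this illustration, and the 1 MB string is only a stand-in workload.

```php
<?php
/**
 * Run a callable and report elapsed seconds plus memory figures.
 * measure() is a hypothetical helper written for this sketch.
 */
function measure(callable $task): array
{
    $startTime = microtime(true);
    $result = $task();
    return [
        'result'  => $result,
        'seconds' => microtime(true) - $startTime,
        'current' => memory_get_usage(),      // memory still held right now
        'peak'    => memory_get_peak_usage(), // highest point reached so far
    ];
}

$stats = measure(function () {
    // Allocate about 1 MB temporarily; it is released when the closure returns.
    $buffer = str_repeat('x', 1024 * 1024);
    return strlen($buffer);
});

printf(
    "current: %d bytes; peak: %d bytes; elapsed: %.4f s%s",
    $stats['current'],
    $stats['peak'],
    $stats['seconds'],
    PHP_EOL
);
```

For a streaming reader the two figures are expected to stay close to each other, whereas a library that loads the whole workbook would typically show a much higher peak.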