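Parsing Very Large JSON Files
1. Requirement
A recent project required exporting a JSON file larger than 800 MB to Excel. I tried reading the file line by line and reading it as a stream with JSONReader, but the file was so large that both attempts failed with an out-of-memory (OOM) error.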
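2. Solution
Each JSON array in the file contains so many JSON objects that both the stream-based and the line-by-line approach ended up loading too much into memory at once, which caused the overflow.
In the end I settled on a solution built around Jackson's JsonToken.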
package com.godfrey.poi.util;

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.MappingJsonFactory;

import java.io.File;

/**
 * @author godfrey
 * @since 2021-12-05
 */
public class ParseJsonUtil {
    public static void main(String[] args) throws Exception {
        // MappingJsonFactory attaches an ObjectMapper codec, which readValueAsTree() needs
        JsonFactory f = new MappingJsonFactory();
        // createParser replaces the deprecated createJsonParser;
        // try-with-resources closes the parser and the underlying file
        try (JsonParser jp = f.createParser(new File("F:/FeaturesToJSON.json"))) {
            JsonToken current = jp.nextToken();
            if (current != JsonToken.START_OBJECT) {
                System.out.println("Error: root should be object: quitting.");
                return;
            }
            while (jp.nextToken() != JsonToken.END_OBJECT) {
                String fieldName = jp.getCurrentName();
                // move from field name to field value
                current = jp.nextToken();
                if ("features".equals(fieldName)) {
                    if (current == JsonToken.START_ARRAY) {
                        // for each of the records in the array
                        while (jp.nextToken() != JsonToken.END_ARRAY) {
                            // read the record into a tree model;
                            // this moves the parsing position to the end of it
                            JsonNode node = jp.readValueAsTree();
                            // now we have random access to everything in the object;
                            // path() yields a "missing" node instead of null for absent fields
                            System.out.println("field1: " + node.path("field1").asText());
                            System.out.println("field2: " + node.path("field2").asText());
                        }
                    } else {
                        System.out.println("Error: features should be an array: skipping.");
                        jp.skipChildren();
                    }
                } else {
                    System.out.println("Unprocessed property: " + fieldName);
                    jp.skipChildren();
                }
            }
        }
    }
}
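The code reads the file with a combination of streaming and tree-model parsing. Each individual record is read into a tree structure, but the file as a whole is never loaded into memory, so the JVM heap does not blow up. This finally solved the problem of reading the very large file.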
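
The same idea applies on the output side: to keep memory flat, each record should be written out as soon as it has been read, not collected in a list first. The original post does not show its Excel-writing code, so the following is only a minimal sketch under stated assumptions. It uses plain CSV (which Excel can open) as a stand-in for the real Excel writer, and the class name RowSink and its methods are hypothetical, not part of Jackson or any other library.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

/** Hypothetical helper: writes each row immediately and keeps no record in memory. */
public final class RowSink implements AutoCloseable {
    private final PrintWriter out;
    private long rows = 0;

    /** Wraps the target writer and emits the header line right away. */
    public RowSink(Writer target, String... header) {
        this.out = new PrintWriter(new BufferedWriter(target));
        writeCells(header);
    }

    /** Writes one data row; only the counter is retained afterwards. */
    public void writeRow(String... cells) {
        writeCells(cells);
        rows++;
    }

    /** Number of data rows written so far (the header is not counted). */
    public long rowCount() {
        return rows;
    }

    private void writeCells(String[] cells) {
        for (int i = 0; i < cells.length; i++) {
            if (i > 0) {
                out.print(',');
            }
            out.print(escape(cells[i]));
        }
        out.print('\n');
    }

    /** Quotes a cell that contains a comma, a double quote, or a line break. */
    static String escape(String cell) {
        if (cell == null) {
            return "";
        }
        boolean needsQuotes = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        String escaped = cell.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    @Override
    public void close() {
        // PrintWriter swallows IOExceptions; checkError() flushes and reports them
        boolean failed = out.checkError();
        out.close();
        if (failed) {
            throw new UncheckedIOException(new IOException("writing rows failed"));
        }
    }

    public static void main(String[] args) {
        // A StringWriter keeps the demo self-contained; a FileWriter would be used for a real export
        StringWriter buffer = new StringWriter();
        try (RowSink sink = new RowSink(buffer, "field1", "field2")) {
            sink.writeRow("a", "b,c");
        }
        System.out.print(buffer);
    }
}
```

In the parsing loop, the two values that are currently printed for each record would instead be passed to a single writeRow call, so that at any moment only one record's tree and the writer's buffer are held in memory.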