The official documentation provides the following sample record for flattening nested JSON:
{ "timestamp": "2015-09-12T12:10:53.155Z", "dim1": "qwerty", "dim2": "asdf", "dim3": "zxcv", "ignore_me": "ignore this", "metrica": 9999, "foo": {"bar": "abc"}, "foo.bar": "def", "nestmet": {"val": 42}, "hello": [1.0, 2.0, 3.0, 4.0, 5.0], "mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}], "world": [{"hey": "there"}, {"tree": "apple"}], "thing": {"food": ["sandwich", "pizza"]} }
I generated this sample in bulk and sent it to Kafka. Here is a short excerpt:
{"timestamp": "2018-12-20T14:12:39","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:39","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
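For anyone who wants to reproduce test data like the excerpt above, here is a minimal sketch of one way to generate it. The helper name make_records and the per_second parameter are hypothetical, and the step that actually sends each line to Kafka is deliberately left out; the script only prints one JSON record per line.

```python
import json
from datetime import datetime, timedelta

# Template copied from the official sample; only the timestamp changes per record.
TEMPLATE = {
    "timestamp": None,
    "dim1": "qwerty", "dim2": "asdf", "dim3": "zxcv",
    "ignore_me": "ignore this",
    "metrica": 9999,
    "foo": {"bar": "abc"},
    "foo.bar": "def",
    "nestmet": {"val": 42},
    "hello": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],
    "world": [{"hey": "there"}, {"tree": "apple"}],
    "thing": {"food": ["sandwich", "pizza"]},
}


def make_records(start, count, per_second=3):
    """Yield `count` JSON lines; the timestamp advances one second every `per_second` records."""
    for i in range(count):
        record = dict(TEMPLATE)  # shallow copy is enough: only the timestamp is replaced
        stamp = start + timedelta(seconds=i // per_second)
        record["timestamp"] = stamp.strftime("%Y-%m-%dT%H:%M:%S")
        yield json.dumps(record)


if __name__ == "__main__":
    for line in make_records(datetime(2018, 12, 20, 14, 12, 39), 8):
        print(line)
```

The printed lines can then be fed to whichever Kafka producer is already in use.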
The parse spec given in the official documentation:
"parseSpec": { "format": "json", "flattenSpec": { "useFieldDiscovery": true, "fields": [ { "type": "root", "name": "dim1" }, "dim2", { "type": "path", "name": "foo.bar", "expr": "$.foo.bar" }, { "type": "root", "name": "foo.bar" }, { "type": "path", "name": "path-metric", "expr": "$.nestmet.val" }, { "type": "path", "name": "hello-0", "expr": "$.hello[0]" }, { "type": "path", "name": "hello-4", "expr": "$.hello[4]" }, { "type": "path", "name": "world-hey", "expr": "$.world[0].hey" }, { "type": "path", "name": "worldtree", "expr": "$.world[1].tree" }, { "type": "path", "name": "first-food", "expr": "$.thing.food[0]" }, { "type": "path", "name": "second-food", "expr": "$.thing.food[1]" }, { "type": "jq", "name": "first-food-by-jq", "expr": ".thing.food[1]" }, { "type": "jq", "name": "hello-total", "expr": ".hello | sum" } ] }, "dimensionsSpec" : { "dimensions" : [], "dimensionsExclusions": ["ignore_me"] }, "timestampSpec" : { "format" : "auto", "column" : "timestamp" } }
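After creating the datasource, I found that it was not reading any data from Kafka. On inspection, the task that pulls the data had failed because two fields in the flattenSpec had the same name: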
{ "type": "path", "name": "foo.bar", "expr": "$.foo.bar" }, { "type": "root", "name": "foo.bar" },
I changed it to the following and restarted:
{ "type": "path", "name": "foo-bar", "expr": "$.foo.bar" }, { "type": "root", "name": "foo.bar" },
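A duplicate name like this is easy to catch before submitting the spec. The following is a minimal sketch of such a check; the function name duplicate_field_names is hypothetical, and it assumes that a bare string entry such as "dim2" counts as a field with that name.

```python
from collections import Counter


def duplicate_field_names(fields):
    """Return the sorted output names that occur more than once in a flattenSpec 'fields' list."""
    # Assumption: a bare string entry stands for a field whose name is that string.
    names = [f if isinstance(f, str) else f["name"] for f in fields]
    return sorted(name for name, count in Counter(names).items() if count > 1)


if __name__ == "__main__":
    original = [
        {"type": "root", "name": "dim1"},
        "dim2",
        {"type": "path", "name": "foo.bar", "expr": "$.foo.bar"},
        {"type": "root", "name": "foo.bar"},
    ]
    print(duplicate_field_names(original))
```

Running it against the original fields list reports foo.bar; after the rename to foo-bar the list comes back empty.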
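After the restart the log still reported an error, this time because jq has no sum function:
{ "type": "jq", "name": "hello-total", "expr": ".hello | sum" }
After removing this field and restarting, ingestion returned to normal.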
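To make it concrete what the corrected spec extracts from one sample record, here is a rough emulation of the root and path fields. This is only a sketch, not Druid's implementation: resolve_path and flatten are hypothetical helpers, the resolver handles just dotted keys and [n] indexes, the jq fields are left out, and a bare string entry is assumed to behave like a root field.

```python
import re

# One record from the official sample (keys not referenced by the spec are omitted).
RECORD = {
    "dim1": "qwerty", "dim2": "asdf",
    "foo": {"bar": "abc"}, "foo.bar": "def",
    "nestmet": {"val": 42},
    "hello": [1.0, 2.0, 3.0, 4.0, 5.0],
    "world": [{"hey": "there"}, {"tree": "apple"}],
    "thing": {"food": ["sandwich", "pizza"]},
}

# The corrected fields list: "foo-bar" renamed, jq entries removed.
FIELDS = [
    {"type": "root", "name": "dim1"},
    "dim2",
    {"type": "path", "name": "foo-bar", "expr": "$.foo.bar"},
    {"type": "root", "name": "foo.bar"},
    {"type": "path", "name": "path-metric", "expr": "$.nestmet.val"},
    {"type": "path", "name": "hello-0", "expr": "$.hello[0]"},
    {"type": "path", "name": "hello-4", "expr": "$.hello[4]"},
    {"type": "path", "name": "world-hey", "expr": "$.world[0].hey"},
    {"type": "path", "name": "worldtree", "expr": "$.world[1].tree"},
    {"type": "path", "name": "first-food", "expr": "$.thing.food[0]"},
    {"type": "path", "name": "second-food", "expr": "$.thing.food[1]"},
]


def resolve_path(record, expr):
    """Resolve a small JsonPath subset: '.key' steps and '[n]' index steps after '$'."""
    value = record
    for key, index in re.findall(r"\.([^.\[\]]+)|\[(\d+)\]", expr):
        value = value[key] if key else value[int(index)]
    return value


def flatten(record, fields):
    """Build one flat row from the record, one column per field entry."""
    row = {}
    for field in fields:
        if isinstance(field, str):        # assumed shorthand for a root field
            row[field] = record[field]
        elif field["type"] == "root":     # literal top-level key, dots included
            row[field["name"]] = record[field["name"]]
        elif field["type"] == "path":     # walk into nested objects and arrays
            row[field["name"]] = resolve_path(record, field["expr"])
    return row


if __name__ == "__main__":
    for name, value in flatten(RECORD, FIELDS).items():
        print(name, "=", value)
```

Note how the path field foo-bar walks into the nested object and yields abc, while the root field foo.bar reads the literal top-level key and yields def.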
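Having found the cause, I also tested 3-level and 4-level nesting, and both flattened correctly. I did not find a way to add field keys for arrays of unknown length.
I also did not find a jq function for summing an array.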
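On the array sum: upstream jq has a built-in add filter that sums the elements of an array, so an expression such as .hello | add may be worth trying, although whether the jackson-jq version bundled with Druid accepts it has not been verified here. A fallback that does not depend on jq at all is to compute the total before the record is sent to Kafka. The sketch below assumes that kind of preprocessing step is available; add_hello_total is a hypothetical helper, and the new hello-total key would then be an ordinary top-level field.

```python
import json


def add_hello_total(line):
    """Parse one JSON record, add the sum of its 'hello' array, and re-serialize it."""
    record = json.loads(line)
    # A missing 'hello' array is treated as empty, giving a total of 0.
    record["hello-total"] = sum(record.get("hello", []))
    return json.dumps(record)


if __name__ == "__main__":
    sample = '{"timestamp": "2018-12-20T14:12:39", "hello": [1.0, 2.0, 3.0, 4.0, 5.0]}'
    print(add_hello_total(sample))
```

The same preprocessing idea could in principle be applied to arrays of unknown length, by deriving whatever fixed set of values is needed before ingestion.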
jackson-jq on GitHub: https://github.com/eiiches/jackson-jq
jq official site: https://stedolan.github.io/jq/
json-path on GitHub: https://github.com/json-path/JsonPath
Official Druid documentation: http://druid.io/docs/latest/ingestion/flatten-json.html