Druid JSON data source: testing the official multi-level nested example reveals two syntax bugs


The official documentation provides the following sample for flattening nested JSON:

{
 "timestamp": "2015-09-12T12:10:53.155Z",
 "dim1": "qwerty",
 "dim2": "asdf",
 "dim3": "zxcv",
 "ignore_me": "ignore this",
 "metrica": 9999,
 "foo": {"bar": "abc"},
 "foo.bar": "def",
 "nestmet": {"val": 42},
 "hello": [1.0, 2.0, 3.0, 4.0, 5.0],
 "mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],
 "world": [{"hey": "there"}, {"tree": "apple"}],
 "thing": {"food": ["sandwich", "pizza"]}
}
I generated this sample in bulk and sent it to Kafka. Here is a short excerpt:
{"timestamp": "2018-12-20T14:12:39","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:39","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:40","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
{"timestamp": "2018-12-20T14:12:41","dim1": "qwerty","dim2": "asdf","dim3": "zxcv","ignore_me": "ignore this","metrica": 9999,"foo": {"bar": "abc"},"foo.bar": "def","nestmet":{"val": 42},"hello": [1.0, 2.0, 3.0, 4.0, 5.0],"mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],"world": [{"hey": "there"}, {"tree": "apple"}],"thing": {"food": ["sandwich", "pizza"]}}
The parse spec given in the official documentation:
"parseSpec": {
  "format": "json",
  "flattenSpec": {
    "useFieldDiscovery": true,
    "fields": [
      {
        "type": "root",
        "name": "dim1"
      },
      "dim2",
      {
        "type": "path",
        "name": "foo.bar",
        "expr": "$.foo.bar"
      },
      {
        "type": "root",
        "name": "foo.bar"
      },
      {
        "type": "path",
        "name": "path-metric",
        "expr": "$.nestmet.val"
      },
      {
        "type": "path",
        "name": "hello-0",
        "expr": "$.hello[0]"
      },
      {
        "type": "path",
        "name": "hello-4",
        "expr": "$.hello[4]"
      },
      {
        "type": "path",
        "name": "world-hey",
        "expr": "$.world[0].hey"
      },
      {
        "type": "path",
        "name": "worldtree",
        "expr": "$.world[1].tree"
      },
      {
        "type": "path",
        "name": "first-food",
        "expr": "$.thing.food[0]"
      },
      {
        "type": "path",
        "name": "second-food",
        "expr": "$.thing.food[1]"
      },
      {
        "type": "jq",
        "name": "first-food-by-jq",
        "expr": ".thing.food[1]"
      },
      {
        "type": "jq",
        "name": "hello-total",
        "expr": ".hello | sum"
      }
    ]
  },
  "dimensionsSpec" : {
   "dimensions" : [],
   "dimensionsExclusions": ["ignore_me"]
  },
  "timestampSpec" : {
   "format" : "auto",
   "column" : "timestamp"
  }
}
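To make it easier to see what the "root" and "path" entries are supposed to extract, here is a rough Python emulation run against the sample record. It is not Druid's implementation: `resolve` is a hypothetical helper that only understands dotted keys and numeric indexes, and I am assuming that the bare string "dim2" is treated as a root field. The jq entries are left out.

```python
import re

# The sample record from the official documentation.
RECORD = {
    "timestamp": "2015-09-12T12:10:53.155Z",
    "dim1": "qwerty",
    "dim2": "asdf",
    "dim3": "zxcv",
    "ignore_me": "ignore this",
    "metrica": 9999,
    "foo": {"bar": "abc"},
    "foo.bar": "def",
    "nestmet": {"val": 42},
    "hello": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mixarray": [1.0, 2.0, 3.0, 4.0, {"last": 5}],
    "world": [{"hey": "there"}, {"tree": "apple"}],
    "thing": {"food": ["sandwich", "pizza"]},
}

# (type, name, expr) triples copied from the "root" and "path" entries of the spec.
FIELDS = [
    ("root", "dim1", None),
    ("root", "dim2", None),  # assumption: a bare string is a root field
    ("path", "foo.bar", "$.foo.bar"),
    ("root", "foo.bar", None),
    ("path", "path-metric", "$.nestmet.val"),
    ("path", "hello-0", "$.hello[0]"),
    ("path", "hello-4", "$.hello[4]"),
    ("path", "world-hey", "$.world[0].hey"),
    ("path", "worldtree", "$.world[1].tree"),
    ("path", "first-food", "$.thing.food[0]"),
    ("path", "second-food", "$.thing.food[1]"),
]

# Matches either ".key" or "[index]" in a simple path expression.
TOKEN = re.compile(r"\.(\w+)|\[(\d+)\]")


def resolve(expr, record):
    """Walk a simple `$.a.b[0]`-style path through nested dicts and lists."""
    value = record
    for key, index in TOKEN.findall(expr[1:]):  # expr[1:] drops the leading "$"
        value = value[key] if key else value[int(index)]
    return value


def flatten(record, fields):
    """Return (name, value) pairs in spec order, keeping repeated names."""
    out = []
    for kind, name, expr in fields:
        value = record[name] if kind == "root" else resolve(expr, record)
        out.append((name, value))
    return out


for name, value in flatten(RECORD, FIELDS):
    print(f"{name} = {value!r}")
```

In this emulation the name foo.bar is produced twice with different values: "abc" from the path `$.foo.bar` and "def" from the root key "foo.bar".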

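After creating the datasource, I found that it was not reading any data from Kafka. On inspection, the task that pulls the data had failed because two fields share the same name: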

      {
        "type": "path",
        "name": "foo.bar",
        "expr": "$.foo.bar"
      },
      {
        "type": "root",
        "name": "foo.bar"
      },

I changed it as follows and restarted:

      {
        "type": "path",
        "name": "foo-bar",
        "expr": "$.foo.bar"
      },
      {
        "type": "root",
        "name": "foo.bar"
      },
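A duplicate like this is easy to catch before submitting the spec. The following is a minimal sketch of such a check; `duplicate_field_names` is a hypothetical helper of my own, not part of Druid.

```python
from collections import Counter


def duplicate_field_names(fields):
    """Return the names used by more than one flattenSpec field, sorted."""
    # A field may be a bare string or an object with a "name" key.
    names = [f if isinstance(f, str) else f["name"] for f in fields]
    return sorted(name for name, n in Counter(names).items() if n > 1)


original = [
    {"type": "path", "name": "foo.bar", "expr": "$.foo.bar"},
    {"type": "root", "name": "foo.bar"},
]
fixed = [
    {"type": "path", "name": "foo-bar", "expr": "$.foo.bar"},
    {"type": "root", "name": "foo.bar"},
]

print(duplicate_field_names(original))  # ['foo.bar']
print(duplicate_field_names(fixed))     # []
```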

After the restart, the log still reported an error, this time because jq has no sum function:

      {
        "type": "jq",
        "name": "hello-total",
        "expr": ".hello | sum"
      }

After removing this field and restarting, ingestion returned to normal.

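With the cause identified, I also tested 3-level and 4-level nesting, and both flattened correctly. I did not find a way to define field keys for arrays of unknown length.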

I also did not find a jq function for summing an array.
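One candidate worth trying: jq itself has a builtin named `add` that sums the elements of an array. Whether the jackson-jq version bundled with a given Druid release supports it is an assumption I have not tested. The sketch below computes the total that the "hello-total" field was presumably meant to hold and prints a hypothetical replacement field using `add`.

```python
import json

hello = [1.0, 2.0, 3.0, 4.0, 5.0]

# The value ".hello | sum" was presumably meant to produce.
expected_total = sum(hello)

# Hypothetical replacement field (untested against Druid): `add` instead of `sum`.
hello_total_field = {
    "type": "jq",
    "name": "hello-total",
    "expr": ".hello | add",
}

print(expected_total)
print(json.dumps(hello_total_field, indent=2))
```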

jackson-jq GitHub: https://github.com/eiiches/jackson-jq

jq official site: https://stedolan.github.io/jq/

json-path GitHub: https://github.com/json-path/JsonPath

Official Druid documentation: http://druid.io/docs/latest/ingestion/flatten-json.html

 

 

  

