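Our company recently needed to extract data from Jushita (聚石塔), and some of it is in JSON format, so I spent some time learning how to parse JSON with Kettle. I ran into a few small problems along the way and am recording them here: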
(1)
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 -
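Cause: an object nested inside the JSON also contains a field named title, so the path $..title matches two values while the other paths match only one, and the step cannot parse the data.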
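To see why the step reports two values, here is a minimal sketch of what a recursive-descent path such as $..title does: it collects every member with that name at any depth. The sample trade object and the collectAll helper are hypothetical illustrations, not Kettle code and not the real Jushita payload.

```javascript
// Hypothetical sample: one title on the trade, one on a nested order.
var trade = {
  title: "shop title",
  buyer_alipay_no: "buyer@example.com",
  orders: { order: [{ title: "item title" }] }
};

// Collect every value stored under `key`, at any depth (similar to $..key).
function collectAll(node, key, found) {
  found = found || [];
  if (node === null || typeof node !== "object") {
    return found;
  }
  for (var name in node) {
    if (!Object.prototype.hasOwnProperty.call(node, name)) continue;
    if (name === key) found.push(node[name]);
    collectAll(node[name], key, found);
  }
  return found;
}

var titles = collectAll(trade, "title");           // matches the outer and the nested title
var buyers = collectAll(trade, "buyer_alipay_no"); // matches only the outer field
```

The two result lists have different lengths, which is the mismatch the error message complains about.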
Solution:
Parse the outer layer with JavaScript, rename the fields, and then continue parsing further down with the JSON Input step...
JavaScript code:
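// `trade` is the incoming field that holds the JSON string.
var data = JSON.parse(trade);
var alipay_id = data.alipay_id;
var buyer_alipay_no = data.buyer_alipay_no;
var seller_nick = data.seller_nick;
// Renamed with a "1" suffix so they do not clash with nested fields of the same name.
var orders1 = data.orders;
var title1 = data.title;
var logistics_company1 = data.logistics_company;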
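This only handles a JSON structure with two fields of the same name. When there are three (one at the outer level and two nested inside different objects), the error still occurs.
Since this is a path problem, let's see whether the path itself can be changed.
The official documentation lists paths like the following. I followed the same pattern, and it worked!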
XPath | JSONPath | Description |
/ | $ | the root object/element |
. | @ | the current object/element |
/ | . or [] | child operator |
.. | n/a | parent operator |
// | .. | recursive descent. JSONPath borrows this syntax from E4X. |
* | * | wildcard. All objects/elements regardless their names. |
@ | n/a | attribute access. JSON structures don't have attributes. |
[] | [] | subscript operator. XPath uses it to iterate over element collections and for predicates. In Javascript and JSON it is the native array operator. |
| | [,] | Union operator in XPath results in a combination of node sets. JSONPath allows alternate names or array indices as a set. |
n/a | [start:end:step] | array slice operator borrowed from ES4. |
[] | ?() | applies a filter (script) expression. |
n/a | () | script expression, using the underlying script engine. |
() | n/a | grouping in XPath |
XPath | JSONPath | Result |
/store/book/author | $.store.book[*].author | the authors of all books in the store |
//author | $..author | all authors |
/store/* | $.store.* | all things in store, which are some books and a red bicycle. |
/store//price | $.store..price | the price of everything in the store. |
//book[3] | $..book[2] | the third book |
//book[last()] | $..book[(@.length-1)] or $..book[-1:] | the last book in order. |
//book[position()<3] | $..book[0,1] or $..book[:2] | the first two books |
//book[isbn] | $..book[?(@.isbn)] | filter all books with isbn number |
//book[price<10] | $..book[?(@.price<10)] | filter all books cheaper than 10 |
//* | $..* | all Elements in XML document. All members of JSON structure. |
Official documentation on the path syntax: http://wiki.pentaho.com/display/EAI/JSON+Input
That page links to: http://goessner.net/articles/JsonPath/
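Following the table, the ambiguity goes away once each field uses an explicit child path instead of recursive descent. The sketch below mimics that idea with a tiny helper. The selectChild function, the sample object, and the paths named in the comments are assumptions for illustration (the real structure of the trade JSON is not shown here), and this is not how Kettle evaluates paths internally.

```javascript
// Hypothetical sample: a title on the trade and a title on each nested order.
var trade = {
  title: "shop title",
  orders: { order: [{ title: "item A" }, { title: "item B" }] }
};

// Walk a list of child steps; "*" fans out over all array elements or members.
function selectChild(root, steps) {
  var current = [root];
  for (var i = 0; i < steps.length; i++) {
    var next = [];
    for (var j = 0; j < current.length; j++) {
      var node = current[j];
      if (node === null || typeof node !== "object") continue;
      if (steps[i] === "*") {
        for (var k in node) {
          if (Object.prototype.hasOwnProperty.call(node, k)) next.push(node[k]);
        }
      } else if (Object.prototype.hasOwnProperty.call(node, steps[i])) {
        next.push(node[steps[i]]);
      }
    }
    current = next;
  }
  return current;
}

// Comparable to $.title: only the outer title.
var outerTitle = selectChild(trade, ["title"]);
// Comparable to $.orders.order[*].title: only the titles of the nested orders.
var orderTitles = selectChild(trade, ["orders", "order", "*", "title"]);
```

Each path now targets a single level, so the outer title and the order titles no longer collide. The error message above still says all paths must return the same number of values, so trade-level and order-level fields may need to be read in separate JSON Input steps.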
(2)
2015/07/15 15:30:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:30:48 - trade_detail.0 - org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -We can not find and data with path [$..logistics_company]
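Cause: some records in the JSON do not contain this field, so the path cannot be resolved.
Solutions:
1. Read the content with a JavaScript script.
2. In the JSON Input step, open the Content tab and check the option to ignore missing paths.
(However, this option does not work for structures nested several levels deep.)
JavaScript example: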
// Sample response; note that the first record has no "id" field.
var json = {"status":1,"info":"\u83b7\u53d6\u6210\u529f","data":[{"name":"dingding","information":"電視迷","Region_id":"3","type":"1"},{"id":"5","name":"kenwong","information":"you are ok","Region_id":"5","type":"100"}]};
var data = json.data;
var str1 = "";
var str2 = "";
var str3 = "";
// Build one comma-separated string per field.
for (var i = 0; i < data.length; i++)
{
str1 += data[i].name + ",";
str2 += data[i].information + ",";
str3 += data[i].id + ","; // a missing id is appended as "undefined"
}
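
In the loop above, the first record has no id, so str3 picks up the string "undefined". Below is a minimal sketch of one way to guard against missing fields, including nested ones. The helper name getOrDefault and the sample records (with their extra.level.code field) are hypothetical and only serve the illustration.

```javascript
// Return the value at the given chain of keys, or `fallback` if any step is missing.
function getOrDefault(obj, keys, fallback) {
  var node = obj;
  for (var i = 0; i < keys.length; i++) {
    if (node === null || typeof node !== "object" ||
        !Object.prototype.hasOwnProperty.call(node, keys[i])) {
      return fallback;
    }
    node = node[keys[i]];
  }
  return node;
}

// Hypothetical records: the first one lacks both "id" and the nested "extra" object.
var records = [
  { name: "dingding" },
  { id: "5", name: "kenwong", extra: { level: { code: "A" } } }
];

var ids = [];
var codes = [];
for (var r = 0; r < records.length; r++) {
  ids.push(getOrDefault(records[r], ["id"], ""));
  codes.push(getOrDefault(records[r], ["extra", "level", "code"], ""));
}
var idList = ids.join(",");     // a missing id becomes an empty string
var codeList = codes.join(","); // a missing nested code becomes an empty string
```

Using an empty string as the fallback keeps one slot per record, so the positions in each comma-separated list still line up across fields.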