https://blog.csdn.net/feiying0canglang/article/details/121562890
http://www.manongjc.com/detail/26-asnfhftlcafxjai.html
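After reading a lot online, I found that on the question of which metric names SkyWalking supports, both the official documentation and most blog posts only point to a path. Nobody explains in detail which metrics exist or what each one does, which makes defining custom metrics difficult.
So here is a summary; corrections are welcome:
SkyWalking's OAL metric definitions are stored under /apache-skywalking-apm-bin-es78/config/oal/*.oal
Let's look at the first OAL file: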
core.oal
// All scope metrics
all_percentile = from(All.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
all_heatmap = from(All.latency).histogram(100, 20);

// Service scope metrics
service_resp_time = from(Service.latency).longAvg(); // average response time of the service
service_sla = from(Service.*).percent(status == true); // request success rate of the service
service_cpm = from(Service.*).cpm(); // calls per minute of the service
service_percentile = from(Service.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_apdex = from(Service.latency).apdex(name, status); // Apdex score of the service: weighs satisfactory response times against unsatisfactory ones; the default satisfaction threshold is 500 ms

// Service relation scope metrics for topology: metrics for calls between services
service_relation_client_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm(); // calls per minute detected on the client side
service_relation_server_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm(); // calls per minute detected on the server side
service_relation_client_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true); // success rate detected on the client side
service_relation_server_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true); // success rate detected on the server side
service_relation_client_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg(); // average response time detected on the client side
service_relation_server_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg(); // average response time detected on the server side
service_relation_client_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_relation_server_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

// Service Instance relation scope metrics for topology: metrics for calls between service instances
service_instance_relation_client_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm(); // calls per minute detected on the client instance
service_instance_relation_server_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm(); // calls per minute detected on the server instance
service_instance_relation_client_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true); // success rate detected on the client instance
service_instance_relation_server_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true); // success rate detected on the server instance
service_instance_relation_client_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg(); // average response time detected on the client instance
service_instance_relation_server_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg(); // average response time detected on the server instance
service_instance_relation_client_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_instance_relation_server_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

// Service Instance Scope metrics
service_instance_sla = from(ServiceInstance.*).percent(status == true); // success rate of the service instance
service_instance_resp_time = from(ServiceInstance.latency).longAvg(); // average response time of the service instance
service_instance_cpm = from(ServiceInstance.*).cpm(); // calls per minute of the service instance

// Endpoint scope metrics
endpoint_cpm = from(Endpoint.*).cpm(); // calls per minute of the endpoint
endpoint_avg = from(Endpoint.latency).longAvg(); // average response time of the endpoint
endpoint_sla = from(Endpoint.*).percent(status == true); // success rate of the endpoint
endpoint_percentile = from(Endpoint.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99

// Endpoint relation scope metrics
endpoint_relation_cpm = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm(); // calls per minute detected on the server-side endpoint
endpoint_relation_resp_time = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).longAvg(); // average RPC latency detected on the server side
endpoint_relation_sla = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true); // request success rate detected on the server side
endpoint_relation_percentile = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99

database_access_resp_time = from(DatabaseAccess.latency).longAvg(); // average response time of database access
database_access_sla = from(DatabaseAccess.*).percent(status == true); // success rate of database requests
database_access_cpm = from(DatabaseAccess.*).cpm(); // database calls per minute
database_access_percentile = from(DatabaseAccess.latency).percentile(10);
java-agent.oal
// JVM instance metrics
instance_jvm_cpu = from(ServiceInstanceJVMCPU.usePercent).doubleAvg(); // average JVM CPU usage percentage
instance_jvm_memory_heap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg(); // average used heap memory
instance_jvm_memory_noheap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == false).longAvg(); // average used non-heap memory
instance_jvm_memory_heap_max = from(ServiceInstanceJVMMemory.max).filter(heapStatus == true).longAvg(); // average of the maximum heap memory
instance_jvm_memory_noheap_max = from(ServiceInstanceJVMMemory.max).filter(heapStatus == false).longAvg(); // average of the maximum non-heap memory
instance_jvm_young_gc_time = from(ServiceInstanceJVMGC.time).filter(phrase == GCPhrase.NEW).sum(); // time spent in young-generation GC
instance_jvm_old_gc_time = from(ServiceInstanceJVMGC.time).filter(phrase == GCPhrase.OLD).sum(); // time spent in old-generation GC
instance_jvm_young_gc_count = from(ServiceInstanceJVMGC.count).filter(phrase == GCPhrase.NEW).sum(); // number of young-generation GCs
instance_jvm_old_gc_count = from(ServiceInstanceJVMGC.count).filter(phrase == GCPhrase.OLD).sum(); // number of old-generation GCs
instance_jvm_thread_live_count = from(ServiceInstanceJVMThread.liveCount).longAvg(); // number of live threads
instance_jvm_thread_daemon_count = from(ServiceInstanceJVMThread.daemonCount).longAvg(); // number of daemon threads
instance_jvm_thread_peak_count = from(ServiceInstanceJVMThread.peakCount).longAvg(); // peak number of threads
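Alarm settings
rules:
  # Alarm rule. The name must be unique and must end with _rule.
  service_resp_time_rule:
    # Metric name; only int, long, and double metrics are supported.
    metrics-name: service_resp_time
    # Operator
    op: ">"
    # Threshold, in ms
    threshold: 1000
    # Length of the evaluation window, in minutes
    period: 10
    # How many times the metric must match the condition before the alarm fires
    count: 2
    # Silence period. By default it equals the period, so the alarm fires only once within one period.
    silence-period: 10
    message: Average response time of service [{name}] exceeded 1 second in 2 of the last 10 minutes
  service_sla_rule:
    metrics-name: service_sla
    op: "<"
    threshold: 8000
    period: 10
    count: 2
    silence-period: 10
    message: Success rate of service [{name}] was below 80% in 2 of the last 10 minutes
composite-rules:
  # Rule name. A unique name shown in the alarm message; it must end with _rule.
  comp_rule:
    # How the rules are combined; supports the &&, || and () operators.
    expression: service_resp_time_rule && service_sla_rule
    message: Service [{name}] had an average response time above 1 second and a success rate below 80% in 2 of the last 10 minutes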
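To make the interplay of op, threshold, period and count concrete, here is a minimal Python sketch of how such a rule could be evaluated. It is a simplified model, not SkyWalking's alarm implementation: the function name `rule_fires` is hypothetical, it assumes exactly one metric value per minute, it handles only a subset of comparison operators, and it ignores silence-period. The sample values assume that service_sla is scaled so that 10000 means 100%, which is what the threshold of 8000 for "80%" in the rule above implies.

```python
import operator

# Subset of comparison operators a rule may name in its `op` field (assumption).
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def rule_fires(values, op, threshold, period, count):
    """Return True if at least `count` of the last `period` per-minute
    values satisfy `value <op> threshold` (simplified model)."""
    window = values[-period:]
    matches = sum(1 for v in window if OPS[op](v, threshold))
    return matches >= count


# service_sla scaled so that 10000 == 100%; 8000 therefore means 80%.
sla_per_minute = [9900, 9800, 7500, 9900, 9900, 7900, 9900, 9900, 9900, 9900]
print(rule_fires(sla_per_minute, "<", 8000, period=10, count=2))
```

Two of the ten minutes fall below 8000, so the sketch reports that service_sla_rule would fire.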
This article describes how to use SkyWalking's OAL syntax.
Official documentation
Introduction to OAL
https://github.com/apache/skywalking/blob/master/docs/en/guides/backend-oal-scripts.md
OAL rule syntax: https://github.com/apache/skywalking/blob/master/docs/en/concepts-and-designs/oal.md
Scopes and fields: https://github.com/apache/skywalking/blob/master/docs/en/concepts-and-designs/scope-definitions.md
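Overview of OAL
Since SkyWalking 8.0.0, the OAL scripts are located at config/oal/*.oal. We can modify them, for example to add filter conditions or new metrics; the changes take effect after the OAP server is restarted.
Apache SkyWalking alarms are driven by a set of rules defined in config/alarm-settings.yml. The rules.xxx_rule.metrics-name value in alarm-settings.yml refers to a metric defined in the files under config/oal: core.oal, event.oal, java-agent.oal, and browser.oal.
Endpoint rules consume more memory and resources than service and instance rules.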
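Because a metrics-name in an alarm rule has to match a metric declared in an OAL file, a quick cross-check can catch typos before the OAP server is restarted. The following Python sketch is only an illustration: the helper names `defined_metrics` and `unknown_metrics` are hypothetical, the regular expression assumes each declaration starts a line in the form `name = from(`, and it only looks at OAL text (metrics defined elsewhere are out of scope).

```python
import re

# Matches the metric name on the left of `= from(` at the start of a line.
METRIC_DEF = re.compile(r"^[ \t]*([A-Za-z_][A-Za-z0-9_]*)[ \t]*=[ \t]*from\(", re.MULTILINE)


def defined_metrics(oal_text):
    """Collect the metric names declared in OAL source text."""
    return set(METRIC_DEF.findall(oal_text))


def unknown_metrics(rule_metrics, oal_text):
    """Return the rule metric names that no OAL statement declares."""
    return sorted(set(rule_metrics) - defined_metrics(oal_text))


oal = """
// Service scope metrics
service_resp_time = from(Service.latency).longAvg();
service_sla = from(Service.*).percent(status == true);
//disabled_metric = from(Service.*).cpm();
"""
print(unknown_metrics(["service_resp_time", "service_sla", "service_cpm"], oal))
```

In this sample, service_cpm is reported because the OAL text does not declare it; the commented-out line is ignored because it does not start with an identifier.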
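OAL stands for Observability Analysis Language.
In streaming mode, SkyWalking provides OAL to analyze incoming data. OAL focuses on metrics of services, service instances, and endpoints, which makes it easy to learn and use.
Since version 6.3, the OAL engine is embedded in the OAP server runtime as oal-rt (OAL Runtime). OAL scripts now live in the /config folder, so users can simply change them and restart the server for the changes to take effect.
However, OAL is still a compiled language: the OAL runtime generates Java code dynamically. Set SW_OAL_ENGINE_DEBUG=Y in the system environment to see which classes are generated.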
Configuration examples
// Calculate the p99 of Endpoint1 and Endpoint2.
endpoint_p99 = from(Endpoint.latency).filter(name in ("Endpoint1", "Endpoint2")).summary(0.99)
// Calculate the p99 of endpoints whose names start with "serv".
serv_Endpoint_p99 = from(Endpoint.latency).filter(name like "serv%").summary(0.99)
// Calculate the average response time of each endpoint.
endpoint_avg = from(Endpoint.latency).avg()
// Calculate the p50, p75, p90, p95 and p99 of each endpoint; the parameter 10 is the precision in ms.
endpoint_percentile = from(Endpoint.latency).percentile(10)
// Calculate the percentage of responses with status true for each endpoint.
endpoint_success = from(Endpoint.*).filter(status == true).percent()
// Count the responses whose response code is 404, 500 or 503 for each endpoint.
endpoint_abnormal = from(Endpoint.*).filter(responseCode in [404, 500, 503]).count()
// Count the calls whose request type is RPC or gRPC for each endpoint.
endpoint_rpc_calls_sum = from(Endpoint.*).filter(type in [RequestType.RPC, RequestType.gRPC]).sum()
// Count the calls whose endpoint name is "/v1" or "/v2" for each endpoint.
endpoint_url_sum = from(Endpoint.*).filter(endpointName in ["/v1", "/v2"]).sum()
// Count the total calls of each endpoint.
endpoint_calls = from(Endpoint.*).count()
// Calculate the CPM of the GET method for each service. The value is made up of `tagKey:tagValue`.
// Option 1: use `tags contain`.
service_cpm_http_get = from(Service.*).filter(tags contain "http.method:GET").cpm()
// Option 2: use `tag[key]`.
service_cpm_http_get = from(Service.*).filter(tag["http.method"] == "GET").cpm();
// Calculate the CPM of all methods other than GET for each service. The value is made up of `tagKey:tagValue`.
service_cpm_http_other = from(Service.*).filter(tags not contain "http.method:GET").cpm()
// Calculate the error rate of the browser app. The numerator is FIRST_ERROR and the denominator is NORMAL.
browser_app_error_rate = from(BrowserAppTraffic.*).rate(trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR, trafficCategory == BrowserAppTrafficCategory.NORMAL);
disable(segment);
disable(endpoint_relation_server_side);
disable(top_n_database_statement);
Default configuration
config/oal/core.oal
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*
*/
// For services using protocols HTTP 1/2, gRPC, RPC, etc., the cpm metrics means "calls per minute",
// for services that are built on top of TCP, the cpm means "packages per minute".
// All scope metrics
all_percentile = from(All.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
all_heatmap = from(All.latency).histogram(100, 20);
// Service scope metrics
service_resp_time = from(Service.latency).longAvg();
service_sla = from(Service.*).percent(status == true);
service_cpm = from(Service.*).cpm();
service_percentile = from(Service.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_apdex = from(Service.latency).apdex(name, status);
// Service relation scope metrics for topology
service_relation_client_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_relation_server_cpm = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_relation_client_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_relation_server_call_sla = from(ServiceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_relation_client_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_relation_server_resp_time = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_relation_client_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_relation_server_percentile = from(ServiceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
// Service Instance relation scope metrics for topology
service_instance_relation_client_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).cpm();
service_instance_relation_server_cpm = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
service_instance_relation_client_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.CLIENT).percent(status == true);
service_instance_relation_server_call_sla = from(ServiceInstanceRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
service_instance_relation_client_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).longAvg();
service_instance_relation_server_resp_time = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).longAvg();
service_instance_relation_client_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.CLIENT).percentile(10); // Multiple values including p50, p75, p90, p95, p99
service_instance_relation_server_percentile = from(ServiceInstanceRelation.latency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
// Service Instance Scope metrics
service_instance_sla = from(ServiceInstance.*).percent(status == true);
service_instance_resp_time= from(ServiceInstance.latency).longAvg();
service_instance_cpm = from(ServiceInstance.*).cpm();
// Endpoint scope metrics
endpoint_cpm = from(Endpoint.*).cpm();
endpoint_avg = from(Endpoint.latency).longAvg();
endpoint_sla = from(Endpoint.*).percent(status == true);
endpoint_percentile = from(Endpoint.latency).percentile(10); // Multiple values including p50, p75, p90, p95, p99
// Endpoint relation scope metrics
endpoint_relation_cpm = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).cpm();
endpoint_relation_resp_time = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).longAvg();
endpoint_relation_sla = from(EndpointRelation.*).filter(detectPoint == DetectPoint.SERVER).percent(status == true);
endpoint_relation_percentile = from(EndpointRelation.rpcLatency).filter(detectPoint == DetectPoint.SERVER).percentile(10); // Multiple values including p50, p75, p90, p95, p99
database_access_resp_time = from(DatabaseAccess.latency).longAvg();
database_access_sla = from(DatabaseAccess.*).percent(status == true);
database_access_cpm = from(DatabaseAccess.*).cpm();
database_access_percentile = from(DatabaseAccess.latency).percentile(10);
config/oal/event.oal
event_total = from(Event.*).count();
event_normal_count = from(Event.*).filter(type == "Normal").count();
event_error_count = from(Event.*).filter(type == "Error").count();
event_start_count = from(Event.*).filter(name == "Start").count();
event_shutdown_count = from(Event.*).filter(name == "Shutdown").count();
config/oal/java-agent.oal
// JVM instance metrics
instance_jvm_cpu = from(ServiceInstanceJVMCPU.usePercent).doubleAvg();
instance_jvm_memory_heap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == true).longAvg();
instance_jvm_memory_noheap = from(ServiceInstanceJVMMemory.used).filter(heapStatus == false).longAvg();
instance_jvm_memory_heap_max = from(ServiceInstanceJVMMemory.max).filter(heapStatus == true).longAvg();
instance_jvm_memory_noheap_max = from(ServiceInstanceJVMMemory.max).filter(heapStatus == false).longAvg();
instance_jvm_young_gc_time = from(ServiceInstanceJVMGC.time).filter(phrase == GCPhrase.NEW).sum();
instance_jvm_old_gc_time = from(ServiceInstanceJVMGC.time).filter(phrase == GCPhrase.OLD).sum();
instance_jvm_young_gc_count = from(ServiceInstanceJVMGC.count).filter(phrase == GCPhrase.NEW).sum();
instance_jvm_old_gc_count = from(ServiceInstanceJVMGC.count).filter(phrase == GCPhrase.OLD).sum();
instance_jvm_thread_live_count = from(ServiceInstanceJVMThread.liveCount).longAvg();
instance_jvm_thread_daemon_count = from(ServiceInstanceJVMThread.daemonCount).longAvg();
instance_jvm_thread_peak_count = from(ServiceInstanceJVMThread.peakCount).longAvg();
instance_jvm_thread_runnable_state_thread_count = from(ServiceInstanceJVMThread.runnableStateThreadCount).longAvg();
instance_jvm_thread_blocked_state_thread_count = from(ServiceInstanceJVMThread.blockedStateThreadCount).longAvg();
instance_jvm_thread_waiting_state_thread_count = from(ServiceInstanceJVMThread.waitingStateThreadCount).longAvg();
instance_jvm_thread_timed_waiting_state_thread_count = from(ServiceInstanceJVMThread.timedWaitingStateThreadCount).longAvg();
instance_jvm_class_loaded_class_count = from(ServiceInstanceJVMClass.loadedClassCount).longAvg();
instance_jvm_class_total_unloaded_class_count = from(ServiceInstanceJVMClass.totalUnloadedClassCount).longAvg();
instance_jvm_class_total_loaded_class_count = from(ServiceInstanceJVMClass.totalLoadedClassCount).longAvg();
config/oal/browser.oal
// browser app
browser_app_pv = from(BrowserAppTraffic.count).filter(trafficCategory == BrowserAppTrafficCategory.NORMAL).sum();
browser_app_error_rate = from(BrowserAppTraffic.*).rate(trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR,trafficCategory == BrowserAppTrafficCategory.NORMAL);
browser_app_error_sum = from(BrowserAppTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).sum();
// browser app single version
browser_app_single_version_pv = from(BrowserAppSingleVersionTraffic.count).filter(trafficCategory == BrowserAppTrafficCategory.NORMAL).sum();
browser_app_single_version_error_rate = from(BrowserAppSingleVersionTraffic.trafficCategory).rate(trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR,trafficCategory == BrowserAppTrafficCategory.NORMAL);
browser_app_single_version_error_sum = from(BrowserAppSingleVersionTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).sum();
// browser app page
browser_app_page_pv = from(BrowserAppPageTraffic.count).filter(trafficCategory == BrowserAppTrafficCategory.NORMAL).sum();
browser_app_page_error_rate = from(BrowserAppPageTraffic.*).rate(trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR,trafficCategory == BrowserAppTrafficCategory.NORMAL);
browser_app_page_error_sum = from(BrowserAppPageTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).sum();
browser_app_page_ajax_error_sum = from(BrowserAppPageTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).filter(errorCategory == BrowserErrorCategory.AJAX).sum();
browser_app_page_resource_error_sum = from(BrowserAppPageTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).filter(errorCategory == BrowserErrorCategory.RESOURCE).sum();
browser_app_page_js_error_sum = from(BrowserAppPageTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).filter(errorCategory in [BrowserErrorCategory.JS,BrowserErrorCategory.VUE,BrowserErrorCategory.PROMISE]).sum();
browser_app_page_unknown_error_sum = from(BrowserAppPageTraffic.count).filter(trafficCategory != BrowserAppTrafficCategory.NORMAL).filter(errorCategory == BrowserErrorCategory.UNKNOWN).sum();
// browser performance metrics
browser_app_page_redirect_avg = from(BrowserAppPagePerf.redirectTime).longAvg();
browser_app_page_dns_avg = from(BrowserAppPagePerf.dnsTime).longAvg();
browser_app_page_ttfb_avg = from(BrowserAppPagePerf.ttfbTime).longAvg();
browser_app_page_tcp_avg = from(BrowserAppPagePerf.tcpTime).longAvg();
browser_app_page_trans_avg = from(BrowserAppPagePerf.transTime).longAvg();
browser_app_page_dom_analysis_avg = from(BrowserAppPagePerf.domAnalysisTime).longAvg();
browser_app_page_fpt_avg = from(BrowserAppPagePerf.fptTime).longAvg();
browser_app_page_dom_ready_avg = from(BrowserAppPagePerf.domReadyTime).longAvg();
browser_app_page_load_page_avg = from(BrowserAppPagePerf.loadPageTime).longAvg();
browser_app_page_res_avg = from(BrowserAppPagePerf.resTime).longAvg();
browser_app_page_ssl_avg = from(BrowserAppPagePerf.sslTime).longAvg();
browser_app_page_ttl_avg = from(BrowserAppPagePerf.ttlTime).longAvg();
browser_app_page_first_pack_avg = from(BrowserAppPagePerf.firstPackTime).longAvg();
browser_app_page_fmp_avg = from(BrowserAppPagePerf.fmpTime).longAvg();
browser_app_page_fpt_percentile = from(BrowserAppPagePerf.fptTime).percentile(10);
browser_app_page_ttl_percentile = from(BrowserAppPagePerf.ttlTime).percentile(10);
browser_app_page_dom_ready_percentile = from(BrowserAppPagePerf.domReadyTime).percentile(10);
browser_app_page_load_page_percentile = from(BrowserAppPagePerf.loadPageTime).percentile(10);
browser_app_page_first_pack_percentile = from(BrowserAppPagePerf.firstPackTime).percentile(10);
browser_app_page_fmp_percentile = from(BrowserAppPagePerf.fmpTime).percentile(10);
// Disable unnecessary hard core stream, targeting @Stream#name
/////////
//disable(browser_error_log);
OAL syntax
OAL script files should use the .oal suffix.
// Declare the metrics.
METRICS_NAME = from(SCOPE.(* | [FIELD][,FIELD ...]))
[.filter(FIELD OP [INT | STRING])]
.FUNCTION([PARAM][, PARAM ...])
// Disable hard code
disable(METRICS_NAME);
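Scope
Scopes include All, Service, Service Instance, Endpoint, Service Relation, Service Instance Relation, and Endpoint Relation.
There are also a number of fields, each belonging to one of the scopes above.
Filter
When using a filter, build the conditions on field values from field names and expressions.
Expressions can be combined with and, or, and ().
Operators include ==, !=, >, <, >=, <=, in [...], like %..., like ...%, and like %...%. They are type-checked against the field type,
and an incompatible type is reported as an error during compilation/code generation.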
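The three like forms listed above correspond to suffix, prefix and substring matching. The Python sketch below only illustrates that reading; the function name `like` is hypothetical, and treating a pattern without any % as an exact match is an assumption rather than documented behavior.

```python
def like(value, pattern):
    """Evaluate the three `like` forms: `%x` (suffix), `x%` (prefix), `%x%` (contains)."""
    starts_with_wildcard = pattern.startswith("%")
    ends_with_wildcard = pattern.endswith("%")
    core = pattern.strip("%")
    if starts_with_wildcard and ends_with_wildcard:
        return core in value
    if starts_with_wildcard:
        return value.endswith(core)
    if ends_with_wildcard:
        return value.startswith(core)
    return value == core  # assumption: no wildcard means exact match


endpoints = ["service/list", "serv", "user/serv", "health"]
print([e for e in endpoints if like(e, "serv%")])
```

With the pattern "serv%", only the names that begin with "serv" pass the filter, which mirrors the `name like "serv%"` example earlier in the article.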
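Aggregation Function
The default aggregation functions are implemented by the SkyWalking OAP core, and more functions can be added freely.
Provided functions:
longAvg: the average of all inputs per scope entity. The input field must be of type long.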
instance_jvm_memory_max = from(ServiceInstanceJVMMemory.max).longAvg();
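In the example above, the input is every request of the ServiceInstanceJVMMemory scope, and the average is computed over the max field.
doubleAvg: the average of all inputs per scope entity. The input field must be of type double.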
instance_jvm_cpu = from(ServiceInstanceJVMCPU.usePercent).doubleAvg();
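In the example above, the input is every request of the ServiceInstanceJVMCPU scope, and the average is computed over the usePercent field.
percent: the percentage of inputs that match the given condition.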
endpoint_percent = from(Endpoint.*).percent(status == true);
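In the example above, the input is every request of each endpoint, and the condition is endpoint.status == true.
rate: the ratio, expressed as a fraction of 100, between the inputs matching two conditions.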
browser_app_error_rate = from(BrowserAppTraffic.*).rate(trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR, trafficCategory == BrowserAppTrafficCategory.NORMAL);
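In the example above, the input is every request of each browser app's traffic. The numerator condition is trafficCategory == BrowserAppTrafficCategory.FIRST_ERROR and the denominator condition is trafficCategory == BrowserAppTrafficCategory.NORMAL.
In general, the first parameter is the numerator condition and the second parameter is the denominator condition.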
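The difference between percent and rate is easiest to see side by side: percent divides the matching inputs by all inputs, while rate divides one matching count by another. The Python sketch below illustrates this under stated assumptions: the function names are hypothetical, the results are scaled so that 10000 means 100% (chosen to agree with the alarm example, where a threshold of 8000 stands for 80%), and returning 0 for an empty input is an assumption.

```python
def percent(records, condition):
    """Share of records matching `condition`, scaled so that 10000 == 100%."""
    total = len(records)
    if total == 0:
        return 0  # assumption: no input yields 0
    matched = sum(1 for r in records if condition(r))
    return matched * 10000 // total


def rate(records, numerator_cond, denominator_cond):
    """Ratio of two matching counts, scaled the same way as percent()."""
    numerator = sum(1 for r in records if numerator_cond(r))
    denominator = sum(1 for r in records if denominator_cond(r))
    if denominator == 0:
        return 0  # assumption: an empty denominator yields 0
    return numerator * 10000 // denominator


calls = [{"status": True}] * 9 + [{"status": False}]
print(percent(calls, lambda r: r["status"]))

traffic = ["FIRST_ERROR"] * 2 + ["NORMAL"] * 8
print(rate(traffic, lambda c: c == "FIRST_ERROR", lambda c: c == "NORMAL"))
```

Nine successful calls out of ten give 9000 (90%); two first errors against eight normal page views give 2500 (25%), because the denominator counts only the NORMAL traffic rather than all traffic.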
sum: the total number of calls per scope entity.
service_calls_sum = from(Service.*).sum();
The example above counts the calls of each service.
histogram: heatmap. See Heatmap in WIKI for details.
all_heatmap = from(All.latency).histogram(100, 20);
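The example above computes the heatmap of all incoming requests.
The first parameter is the precision of the latency calculation: in the example above, 113 ms and 193 ms both fall into the 101-200 ms group and are treated as the same.
The second parameter is the number of groups: the example above has 21 groups in total, namely 0-100 ms, 101-200 ms, ..., 1901-2000 ms, and over 2000 ms.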
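The grouping described above can be sketched in a few lines of Python. This is an illustration, not the OAP implementation: the function name `histogram` is hypothetical, and using integer division means a latency of exactly 100 ms lands in the second group, so the group boundaries may differ by one millisecond from the ranges quoted in the text.

```python
from collections import Counter


def histogram(latencies, step, max_steps):
    """Count latencies per group; everything past the last step shares one group."""
    buckets = Counter()
    for latency in latencies:
        index = min(latency // step, max_steps)
        buckets[index] += 1
    return buckets


heat = histogram([113, 193, 45, 2500, 9000], step=100, max_steps=20)
print(sorted(heat.items()))
```

Here 113 ms and 193 ms share group 1, while 2500 ms and 9000 ms both end up in the overflow group 20, so histogram(100, 20) produces at most 21 groups (indexes 0 through 20).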
apdex: Application Performance Index (Apdex)
service_apdex = from(Service.latency).apdex(name, status);
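The example above computes the Apdex of every service.
The first parameter is the service name; the Apdex threshold for that name is defined in the configuration file service-apdex-threshold.yml.
The second parameter is the request status; the status (success or failure) affects the Apdex calculation.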
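For readers unfamiliar with Apdex, the standard formula is (satisfied + tolerating / 2) / total, where a request is satisfied when its latency is within the threshold T, tolerating when it is within 4T, and frustrated otherwise. The Python sketch below applies that formula and treats failed requests as frustrated; the function name and the sample data are hypothetical, and any scaling that SkyWalking applies to the stored value is not modeled here.

```python
def apdex(requests, threshold_ms):
    """Apdex = (satisfied + tolerating / 2) / total.

    `requests` is a list of (latency_ms, status) pairs. A failed request, or
    one slower than 4x the threshold, counts as frustrated.
    """
    if not requests:
        return 0.0  # assumption: no input yields 0
    satisfied = tolerating = 0
    for latency, ok in requests:
        if not ok or latency > 4 * threshold_ms:
            continue  # frustrated: contributes only to the total
        if latency > threshold_ms:
            tolerating += 1
        else:
            satisfied += 1
    return (satisfied + tolerating / 2) / len(requests)


sample = [(120, True), (480, True), (900, True), (2500, True), (100, False)]
print(apdex(sample, threshold_ms=500))
```

With the 500 ms default threshold mentioned earlier, two requests are satisfied, one is tolerating, and two are frustrated (one too slow, one failed), which gives (2 + 0.5) / 5.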
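P99, P95, P90, P75, P50: percentiles. See Percentile in WIKI for details.
Percentile is the first multi-value metric, introduced in version 7.0. Because it holds multiple values, it can be queried through the getMultipleLinearIntValues GraphQL query.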
all_percentile = from(All.latency).percentile(10);
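The example above computes the P99, P95, P90, P75 and P50 of all incoming requests. The parameter is the precision of the percentile calculation: in the example above, 120 ms and 124 ms are treated as the same.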
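The role of the precision parameter can be illustrated with a small Python sketch: latencies are first reduced to buckets of `precision` milliseconds, and each percentile is then read off the cumulative bucket counts. This is an assumption-laden illustration, not the OAP algorithm; the function name `percentiles` is hypothetical, and reporting the lower edge of the bucket is one possible convention.

```python
from collections import Counter

RANKS = (50, 75, 90, 95, 99)


def percentiles(latencies, precision):
    """Return {rank: latency} using buckets of `precision` milliseconds."""
    buckets = Counter(latency // precision for latency in latencies)
    total = len(latencies)
    result = {}
    for rank in RANKS:
        needed = total * rank / 100  # samples that must lie at or below the answer
        seen = 0
        for key in sorted(buckets):
            seen += buckets[key]
            if seen >= needed:
                result[rank] = key * precision  # lower edge of the bucket
                break
    return result


print(percentiles([120, 124], precision=10))
print(percentiles(list(range(1, 101)), precision=10))
```

With a precision of 10, the values 120 and 124 fall into the same bucket and are both reported as 120, which is the effect the paragraph above describes.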
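Metrics Name
The metric name is used by the storage implementation, the alarm module, and the query module. The SkyWalking core supports automatic type inference.
Group
All metric data is grouped by Scope.ID and the min-level time bucket.
In the Endpoint scope, Scope.ID is the endpoint ID (a unique identifier based on the service and its endpoint).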
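Grouping by entity ID and time bucket means that every aggregation function runs once per (ID, bucket) pair. The Python sketch below shows this for a longAvg-style average. Everything in it is illustrative: the function names and the sample entity ID are hypothetical, the smallest bucket is assumed to be one minute, and the yyyyMMddHHmm bucket format in UTC is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timezone


def minute_bucket(epoch_seconds):
    """Format a timestamp as a yyyyMMddHHmm minute bucket (UTC in this sketch)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return int(moment.strftime("%Y%m%d%H%M"))


def long_avg_by_group(records):
    """Average latency per (entity_id, minute bucket) group."""
    sums = defaultdict(lambda: [0, 0])  # key -> [total latency, count]
    for entity_id, epoch_seconds, latency in records:
        key = (entity_id, minute_bucket(epoch_seconds))
        sums[key][0] += latency
        sums[key][1] += 1
    return {key: total // count for key, (total, count) in sums.items()}


records = [
    ("svc-a/GET:/users", 0, 100),
    ("svc-a/GET:/users", 30, 300),
    ("svc-a/GET:/users", 60, 50),
]
print(long_avg_by_group(records))
```

The first two records share a minute and are averaged together, while the third starts a new bucket for the same endpoint.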
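Cast
Source fields are statically typed. In some cases, the type required by a filter expression or an aggregation function does not match the type of the source field. For example, the value of a source tag is a String, while most aggregation calculations require a number. The cast expression exists to handle this.
Usage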
(str->long) or (long), cast string type into long.
(str->int) or (int), cast string type into int.
Example:
mq_consume_latency = from((str->long)Service.tag["transmission.latency"]).longAvg(); // the value of tag is string type.
Cast expressions are supported in the following positions:
From statement. from((cast)source.attre).
Filter expression. .filter((cast)tag["transmission.latency"] > 0)
Aggregation function parameter. .longAvg((cast)strField1== 1, (cast)strField2)
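Disable
Disable is an advanced statement in OAL, used only in specific cases.
Some aggregations and metrics are defined in the core as hard code; the disable statement is designed to deactivate them,
for example segment and top_n_database_statement.
By default, none of them are disabled.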
————————————————
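Copyright notice: this is an original article by CSDN blogger 「IT利刃出鞘」, licensed under CC 4.0 BY-SA. When reposting, include a link to the original source and this notice.
Original link: https://blog.csdn.net/feiying0canglang/article/details/121562890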