0. Title & Abstract
1. Title:
Real-Time Human Objects Tracking for Smart Surveillance at the Edge
2. Abstract:
Abstract: By allowing computation to be performed at the edge of a network, edge computing has been recognized as a promising approach to address some challenges in the cloud computing paradigm, particularly for delay-sensitive and mission-critical applications like real-time surveillance. The prevalence of networked cameras and smart mobile devices enables video analytics at the network edge. However, human object detection and tracking are still conducted at cloud centers, because real-time, online tracking is computationally expensive. In this paper, we investigated the feasibility of processing surveillance video streams at the network edge for real-time, uninterrupted tracking of moving human objects. Moving human detection based on Histogram of Oriented Gradients (HOG) feature extraction and a linear Support Vector Machine (SVM) is illustrated, and an efficient multi-object tracking algorithm based on Kernelized Correlation Filters (KCF) is proposed. Implemented and tested on a Raspberry Pi 3, the approach yields very encouraging experimental results, which validate its feasibility as a real-time surveillance solution at the edge of networks.
3. Terminology:
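Cloud computing paradigm: Cloud computing is a form of distributed computing in which a large data-processing job is split, via the network "cloud", into many small tasks. A system made up of multiple servers then processes and analyzes these tasks and returns the results to the user.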
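Histogram of Oriented Gradients (HOG): A HOG feature is a descriptor that quickly captures the local gradient characteristics of an object. The detection window is first divided into blocks, and each block is divided into cells. A histogram of gradient orientations is computed inside each cell and serves as that cell's feature vector. The cell vectors of a block are concatenated into the block's feature vector, and the block vectors are concatenated to form the HOG descriptor of the window.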
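The window, block, and cell steps described in the HOG entry can be sketched in plain Python. This is a minimal illustration, not the implementation used in the paper: the 9 unsigned orientation bins, the 4-pixel cells, the non-overlapping 2x2-cell blocks, the per-block L2 normalization, and the function names `cell_histogram` and `hog_descriptor` are all assumptions chosen for brevity.

```python
import math


def cell_histogram(gray, top, left, cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    h, w = len(gray), len(gray[0])
    hist = [0.0] * bins
    for y in range(top, top + cell):
        for x in range(left, left + cell):
            # Central differences, clamped at the image border.
            gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / (180.0 / bins)) % bins] += magnitude
    return hist


def hog_descriptor(gray, cell=4, block=2, bins=9):
    """Concatenate cell histograms into block vectors, then blocks into one descriptor."""
    cells_y, cells_x = len(gray) // cell, len(gray[0]) // cell
    cell_hists = [
        [cell_histogram(gray, cy * cell, cx * cell, cell, bins) for cx in range(cells_x)]
        for cy in range(cells_y)
    ]
    descriptor = []
    # Assumption: non-overlapping blocks of block x block cells.
    for by in range(0, cells_y - block + 1, block):
        for bx in range(0, cells_x - block + 1, block):
            vec = []
            for dy in range(block):
                for dx in range(block):
                    vec.extend(cell_hists[by + dy][bx + dx])
            # Assumption: L2-normalize each block vector before concatenation.
            norm = math.sqrt(sum(v * v for v in vec)) + 1e-6
            descriptor.extend(v / norm for v in vec)
    return descriptor
```

For a 16x16 grayscale window with these settings, the descriptor has 4 blocks x 4 cells x 9 bins = 144 values. Widely used HOG implementations typically use overlapping blocks and interpolate between neighboring bins, which this sketch omits.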
Linear Support Vector Machine (SVM): A support vector machine is a generalized linear classifier that performs binary classification through supervised learning. Its decision boundary is the maximum-margin hyperplane computed from the training samples.
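To make the SVM entry concrete, here is a small sketch of a linear SVM: a hyperplane `w . x + b = 0` fitted by subgradient descent on the regularized hinge loss. The training procedure, the hyperparameters (`lr`, `lam`, `epochs`), and the toy 2-D data in the usage note are assumptions for illustration. The notes do not say how the detector's SVM is trained, and in a HOG-based detector the inputs would be HOG descriptors rather than 2-D points.

```python
def decision(w, b, x):
    """Signed score of x relative to the hyperplane w . x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Binary label in {+1, -1} from the sign of the decision score."""
    return 1 if decision(w, b, x) >= 0 else -1


def train_linear_svm(samples, labels, lr=0.1, lam=0.01, epochs=50):
    """Subgradient descent on lam/2 * |w|^2 + hinge loss (labels must be +1 or -1)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            # A sample violates the margin when y * score < 1.
            violated = y * decision(w, b, x) < 1
            # The regularization term shrinks w on every step.
            w = [wi * (1 - lr * lam) for wi in w]
            if violated:
                # The hinge term pushes the hyperplane away from the sample.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b
```

A possible use: `w, b = train_linear_svm([(2, 2), (3, 3), (-2, -2), (-3, -3)], [1, 1, -1, -1])`, followed by `predict(w, b, (4, 1))` to classify a new point.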
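Kernelized Correlation Filters (KCF): KCF essentially extends the MOSSE (Minimum Output Sum of Squared Error) algorithm with the kernel trick and multi-channel feature handling; its core idea is the same as that of MOSSE. KCF uses a circulant matrix to generate many samples from one base sample and then trains a regression over all of them. On closer inspection, this is the same process as in MOSSE, where the filter is correlated directly with the base sample. These are two views of a single operation, and the underlying principle is the same: convolution in the time domain can be expressed as element-wise multiplication in the frequency domain.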
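The principle stated at the end of the KCF entry can be checked numerically. The sketch below compares a circular convolution computed directly (which is what multiplying by a circulant matrix of cyclic shifts does) with the same result obtained by multiplying discrete Fourier transforms element by element. It uses a naive O(n^2) DFT for self-containment, and the function names are illustrative. It demonstrates only the convolution theorem, not a KCF tracker.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def circular_convolution(a, b):
    """Direct circular convolution: every output mixes a with a cyclic shift of b."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def circular_convolution_dft(a, b):
    """Same result via the frequency domain: transform, multiply, transform back."""
    product = [p * q for p, q in zip(dft(a), dft(b))]
    return [v.real for v in dft(product, inverse=True)]
```

For correlation, which is the operation a correlation filter actually applies, the frequency-domain product uses the complex conjugate of one of the two spectra. With a fast Fourier transform in place of the naive DFT, the cost drops from O(n^2) to O(n log n), which is the source of the speed of this family of trackers.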
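Raspberry Pi 3: A Raspberry Pi board well suited to learning, coding, and building projects. It integrates 802.11 b/g/n wireless LAN, classic Bluetooth, and Bluetooth Low Energy (BLE), and has a faster quad-core Cortex-A53 processor running at 1.2 GHz. The Raspberry Pi 3 is fully compatible with the Raspberry Pi 2, so almost all earlier Raspberry Pi 2 accessories also work with the Raspberry Pi 3.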
4. Keywords:
Keywords: Edge Computing, Human Detection, Object Tracking, Smart Surveillance.
I. Introduction
The concept of Smart Cities has become feasible thanks to advanced information and communication technologies (ICT) that link cyber-physical systems and social objects. Smart cities provide high-value services that improve residents' quality of life. One of the most actively researched smart city topics is intelligent surveillance [22]. It enables a broad spectrum of promising applications, including access control in areas of interest, human identity or behavior recognition, crowd flux statistics and congestion analysis, detection of anomalous behaviors, and interactive surveillance using multiple cameras [10]. Because big-data contextual tasks are computationally onerous, many of these smart surveillance applications are designed to use a cloud computing framework, which offers abundant computation power, excellent flexibility, and scalability.
However, in practice, cloud computing based smart surveillance applications face significant challenges. They require real-time object detection and tracking over video streams collected from widely distributed data sources, such as networked cameras and smart mobile devices, yet transferring the massive amount of raw frame data to cloud centers not only introduces timing uncertainty but also places extra workload on the communication networks. In addition, remote data transmission may raise data security and privacy issues by giving attackers more opportunities to exploit. Consequently, surveillance video streams are often treated as a resource for after-the-fact forensic analysis rather than a proactive tool for deterring suspicious activities before damage is done. Hence, technologies for devolving time-critical and security-sensitive tasks to local processing are being actively researched [1].
Recent Internet of Things (IoT) technology is leading us into the post-cloud era. Thousands of connected smart "things" immersed in our daily life generate a huge amount of data and also perform data processing at the edge of the network [17]. Hence, edge computing over IoT has been widely considered a promising solution for addressing the cloud computing challenges [11], [17]. The potential advantages of edge computing over cloud computing are summarized as follows:
• Real-time response: Because applications or services run directly on edge computing devices that are close to the data sources, information extraction and data analysis are executed "on-site", which meets the fast-response requirement of delay-sensitive tasks;
• Lower network workload: Raw data generated by sensors and monitors is consumed at the edge of the network instead of being outsourced to a remote cloud server for processing and analysis. Since only the extracted information is sent to the cloud server, the workload of the communication network is significantly reduced;
• Lower energy consumption: Most edge devices are energy-constrained systems, so producing and consuming data locally at the edge also effectively reduces the energy spent on data transmission;
• Data security and privacy: The less data is sent, the fewer opportunities are available to attackers who have access to the communication networks. In addition, it is easier to enforce security and privacy policies locally than to request collaboration among multiple network domains under different administrations.
In this paper, we validated the feasibility of conducting real-time, uninterrupted tracking of moving human objects by leveraging the edge computing paradigm. Selected algorithms are implemented at the edge of the network to process raw video streams, both to identify human objects and to automatically track human movement patterns. Our major contributions are twofold: 1) a three-layer automatic surveillance system architecture is proposed, which pushes the computing tasks even closer to the data source so that the detection and tracking tasks are executed on embedded edge devices; and 2) a proof-of-concept prototype is implemented and tested using a Raspberry Pi as the edge computing engine. An experimental study has been conducted using real-world surveillance video streams. In our experiments, the edge device processed 12.2 frames per second, which met the requirements of real-time performance. The detection rate varied from 60% to 83.3%, depending on the number of human objects in a single frame and the complexity of the background. These results lay a solid foundation for detecting suspicious behaviors or activities and proactively generating earlier alerts.
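As a quick worked figure for the reported throughput: 12.2 frames per second leaves about 1000 / 12.2, or roughly 82 ms, for the whole per-frame pipeline. The helper below is only a sketch of that arithmetic; the function names are hypothetical, and the notes do not state which frame rate the authors take as the real-time threshold.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def meets_rate(per_frame_ms, target_fps):
    """True if a measured per-frame processing time sustains the target frame rate."""
    return per_frame_ms <= frame_budget_ms(target_fps)
```

For example, `frame_budget_ms(12.2)` is about 81.97 ms, so a pipeline measured at 90 ms per frame would not sustain 12.2 frames per second.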
The rest of the paper is organized as follows: Section II discusses closely related work on video surveillance. Section III introduces the architecture of our proposed smart surveillance system. Section IV briefly describes the Histogram of Oriented Gradients (HOG) and linear Support Vector Machine (SVM) algorithms that are adapted for human object detection, along with a multi-object tracking scheme based on the Kernelized Correlation Filters (KCF) algorithm. Section V reports the experimental results with discussion. Finally, Section VI concludes the paper and outlines our ongoing efforts.