Paper reading notes: Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware

Reposted article, originally posted 2020-10-22 19:00. Tag: paper notes

This week I read a recent CCS 2020 paper, Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware, and took some notes.

Enhancing State-of-the-art Classifiers with API Semantics to Detect Evolved Android Malware

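Classifier performance degrades as malware evolves.

Techniques such as online learning, retraining, and active learning need a large number of newly labeled malware samples and consume a lot of human effort.

The paper proposes APIGraph (a framework).

It uses the similarity of API semantics (i.e. similarity information among evolved Android malware) to slow the performance degradation.

similarity information: semantically equivalent or similar API usage

The goal is to find semantic similarity despite the different implementation.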

Build a relation graph:

Then extract the API semantics from the graph (convert each API entity into an embedding) and group APIs with similar semantics into API clusters (the result of APIGraph).

The evaluation metric is AUT (Area Under Time), proposed in TESSERACT. (TESSERACT uses active learning to select a small set of representative evolved Android malware samples.)
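To make the metric concrete, here is a minimal sketch of AUT as I recall it from the TESSERACT paper: the trapezoidal area under a per-period performance curve, normalized so the result stays in [0, 1]. The notes do not spell out the formula, so treat this as an assumption. The function name `aut` and the monthly F1 values are made up for illustration.

```python
def aut(scores):
    """Area Under Time: normalized trapezoidal area under a metric curve.

    scores[k] is the metric (e.g. F1) measured in the k-th test period.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two test periods")
    # Sum the trapezoids between consecutive periods, then divide by (n - 1)
    area = sum((scores[k] + scores[k + 1]) / 2 for k in range(n - 1))
    return area / (n - 1)


# Hypothetical monthly F1 scores of a classifier that ages over time
monthly_f1 = [0.9, 0.8, 0.6, 0.5]
print(round(aut(monthly_f1), 4))
```

A classifier that stays perfect over every period gets AUT = 1, and a faster decay gives a lower value.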

Benefits: 1. Reduces the manual labeling effort required

2. Slows model aging, i.e. performance degradation

APIGraph is applied to and tested on four Android malware detectors: MAMADROID, DROIDEVOLVER (which keeps incorporating new malware samples through online learning), DREBIN, and DREBIN-DL

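APIGraph has two parts:

Build the API Relation Graph: collect the Android API documents related to a certain API level and extract entities and relations
Use the API Relation Graph to enhance existing malware detection techniques
- Convert all entities to vectors (using a graph embedding algorithm)
- The vector difference between two entities in the embedding space represents the semantics between the two entities
- Optimize so that the vectors of two entities with the same relation become similar
- Cluster APIs with similar semantics to produce clusters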

G = <E, R> (E: entities, R: relations)

entity types: method, class, package, permission

relation types: ten types
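A minimal sketch of how G = <E, R> could be held in memory, assuming relations are stored as (head, relation, tail) triples. The class `RelationGraph` and the relation labels `class_of`, `function_of`, and `uses_permission` are placeholders I chose for illustration; these notes only say there are ten relation types without listing them.

```python
from dataclasses import dataclass, field

# The four entity types listed in the notes
ENTITY_TYPES = {"method", "class", "package", "permission"}


@dataclass
class RelationGraph:
    entities: dict = field(default_factory=dict)   # entity name -> entity type
    relations: set = field(default_factory=set)    # (head, relation, tail) triples

    def add_entity(self, name, etype):
        if etype not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {etype}")
        self.entities[name] = etype

    def add_relation(self, head, relation, tail):
        # Both endpoints must already be entities of the graph
        if head not in self.entities or tail not in self.entities:
            raise KeyError("add both endpoints as entities first")
        self.relations.add((head, relation, tail))


g = RelationGraph()
g.add_entity("android.telephony", "package")
g.add_entity("android.telephony.SmsManager", "class")
g.add_entity("android.telephony.SmsManager.sendTextMessage", "method")
g.add_entity("android.permission.SEND_SMS", "permission")

# Placeholder relation labels (hypothetical, see the note above)
g.add_relation("android.telephony.SmsManager", "class_of", "android.telephony")
g.add_relation("android.telephony.SmsManager.sendTextMessage", "function_of",
               "android.telephony.SmsManager")
g.add_relation("android.telephony.SmsManager.sendTextMessage", "uses_permission",
               "android.permission.SEND_SMS")
print(len(g.entities), len(g.relations))
```

Storing relations as triples keeps the graph directly usable by triple-based embedding methods such as TransE.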

The graph is built from the API reference documentation (which has a clear hierarchical structure).

Entities are extracted from the API documents, which are organized by class.

1. Extract the class entity from each per-class document file

2. Split the package name (i.e. the package entity) off the full class name

3. Parse the per-class document files into a Document Object Model (DOM) and extract the method entities that belong to each class

4. Parse all permissions in the manifest file and extract the permission entities
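Steps 1 to 3 above can be sketched with the standard-library HTML parser. The page layout below (an `h1` with class `api-title`, `div` elements with class `api-method` and a `data-name` attribute) is a hypothetical, heavily simplified stand-in; the real reference pages are structured differently, and the helper names are mine.

```python
from html.parser import HTMLParser

# Hypothetical, simplified per-class page; real reference pages differ.
SAMPLE_PAGE = """
<html><body>
  <h1 class="api-title">android.telephony.SmsManager</h1>
  <div class="api-method" data-name="sendTextMessage"></div>
  <div class="api-method" data-name="getDefault"></div>
</body></html>
"""


class ClassPageParser(HTMLParser):
    """Collects the full class name and the method names from one page."""

    def __init__(self):
        super().__init__()
        self.class_name = None
        self.methods = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1" and attrs.get("class") == "api-title":
            self._in_title = True
        elif tag == "div" and attrs.get("class") == "api-method":
            self.methods.append(attrs["data-name"])

    def handle_data(self, data):
        if self._in_title:
            self.class_name = data.strip()

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_title = False


def extract_entities(page):
    parser = ClassPageParser()
    parser.feed(page)
    parser.close()
    class_name = parser.class_name                    # step 1: class entity
    package_name = class_name.rsplit(".", 1)[0]       # step 2: package entity
    methods = [f"{class_name}.{m}" for m in parser.methods]  # step 3: methods
    return class_name, package_name, methods


print(extract_entities(SAMPLE_PAGE))
```

Splitting on the last dot works for top-level classes; nested classes would need extra handling that this sketch leaves out.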

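Convert the APIs in the graph to an embedding representation (i.e. vectors), then group these embeddings into clusters.

TransE is used for the conversion.

First extract the permission entities and add new permission-based relations
Represent each entity e and each relation r in the graph by vectors \(L_e\) and \(L_r\)
Use the TransE algorithm to minimize \(\lVert L_h + L_r - L_t \rVert_2^2\) for each triple (h, r, t), where h and t are entities and r is a relation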
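As a rough sketch of the objective above, the code below computes the squared distance for a single triple and applies plain gradient steps to shrink it. This is a simplification on my part: full TransE training also uses corrupted (negative) triples and a margin-based loss, which are left out here. The function names, dimension, and learning rate are arbitrary choices.

```python
import random


def score(h, r, t):
    """Squared L2 distance ||h + r - t||^2 for one triple."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t))


def sgd_step(h, r, t, lr=0.1):
    """One in-place gradient step that lowers the score of a true triple."""
    for i in range(len(h)):
        # Gradient of the score w.r.t. h_i; it is the same for r_i
        # and has the opposite sign for t_i.
        grad = 2 * (h[i] + r[i] - t[i])
        h[i] -= lr * grad
        r[i] -= lr * grad
        t[i] += lr * grad


rng = random.Random(0)
dim = 4
L_h = [rng.uniform(-1, 1) for _ in range(dim)]  # head entity embedding
L_r = [rng.uniform(-1, 1) for _ in range(dim)]  # relation embedding
L_t = [rng.uniform(-1, 1) for _ in range(dim)]  # tail entity embedding

before = score(L_h, L_r, L_t)
for _ in range(20):
    sgd_step(L_h, L_r, L_t)
after = score(L_h, L_r, L_t)
print(before > after)
```

With only positive triples every embedding could collapse to the same point, which is why the real algorithm contrasts true triples with corrupted ones.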

K-Means groups the embeddings into clusters, the embedding at each cluster's center represents that cluster, and the Elbow method determines the number of clusters.
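The clustering step can be illustrated with a small pure-Python K-Means plus an automated elbow pick. Two things here are my own assumptions rather than the paper's setup: the deterministic farthest-point initialization, and choosing the elbow as the k with the largest second difference of the inertia curve (the Elbow method is often applied by eye). The 2-D points stand in for API embeddings.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    assign = None
    for _ in range(max_iter):
        new_assign = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    inertia = sum(sq_dist(p, centers[a]) for p, a in zip(points, assign))
    return centers, assign, inertia


def elbow(points, k_max):
    """Pick k at the sharpest bend of the inertia curve for k = 1..k_max."""
    inertias = [kmeans(points, k)[2] for k in range(1, k_max + 1)]
    # inertias[i] belongs to k = i + 1; second difference around each inner k
    bends = {k: inertias[k - 2] - 2 * inertias[k - 1] + inertias[k]
             for k in range(2, k_max)}
    return max(bends, key=bends.get), inertias


# Toy "embeddings": three tight groups of four points each
offsets = [(0.1, 0.1), (-0.1, 0.1), (0.1, -0.1), (-0.1, -0.1)]
blobs = [(0, 0), (5, 5), (10, 0)]
embeddings = [(cx + dx, cy + dy) for cx, cy in blobs for dx, dy in offsets]

best_k, inertias = elbow(embeddings, k_max=5)
centers, assign, _ = kmeans(embeddings, best_k)
print(best_k, assign)
```

Each entry of `centers` plays the role of the cluster-center embedding that represents its cluster.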

For each of these four classifiers, the enhancement replaces the API feature format with clusters.
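For a detector with binary per-API features, the replacement could look like the sketch below: each feature records whether the app calls any API in a given cluster, rather than one specific API. The cluster assignment in `api_to_cluster` is invented for illustration, and the other detectors use different feature formats, so this covers only the simplest case.

```python
def to_cluster_features(api_calls, api_to_cluster, n_clusters):
    """Binary vector: entry c is 1 if the app calls any API in cluster c."""
    features = [0] * n_clusters
    for api in api_calls:
        cluster = api_to_cluster.get(api)
        if cluster is not None:  # APIs missing from the mapping are skipped
            features[cluster] = 1
    return features


# Hypothetical cluster assignment (not taken from the paper)
api_to_cluster = {
    "android.telephony.SmsManager.sendTextMessage": 0,
    "android.telephony.SmsManager.sendMultipartTextMessage": 0,
    "android.location.LocationManager.getLastKnownLocation": 1,
}

old_sample = ["android.telephony.SmsManager.sendTextMessage"]
new_sample = ["android.telephony.SmsManager.sendMultipartTextMessage"]

print(to_cluster_features(old_sample, api_to_cluster, 2))
print(to_cluster_features(new_sample, api_to_cluster, 2))
```

The two samples call different APIs but map to the same feature vector, which is the effect the notes describe: an evolved sample that switches to a semantically similar API still looks familiar to the classifier.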
