C/C++源碼掃描系列- Fortify 篇

本文轉載自查看原文 2021-03-28 16:38 1351

首發於

https://xz.aliyun.com/t/9276

概述

Fortify是一款商業級的源碼掃描工具，其工作原理和codeql類似，甚至一些規則編寫的語法都很相似，其工作示意圖如下：

首先Fortify對源碼進行分析（需要編譯），然后提取出相關信息保存到某個位置，然后加載規則進行掃描，掃描的結果保存為 .fpr 文件，然后用戶使用 GUI 程序對結果進行分析，排查漏洞。

環境搭建

本文的分析方式是在 Linux 上對源碼進行編譯、掃描，然后在 Windows 平台對掃描結果進行分析，所以涉及 Windows 和 Linux 兩個平台的環境搭建。

Windows搭建

首先雙擊 Fortify_SCA_and_Apps_20.1.1_windows_x64.exe 安裝

安裝完成后，把 fortify-common-20.1.1.0007.jar 拷貝 Core\lib 進行破解，然后需要把 rules 目錄的規則文件拷貝到安裝目錄下的 Core\config\rules 的路徑下，該路徑下保存的是Fortify的默認規則庫。

ExternalMetadata 下的文件也拷貝到 Core\config\ExternalMetadata 目錄即可

最后執行 auditworkbench.cmd 即可進入分析源碼掃描結果的IDE.

Linux搭建

解壓下載的壓縮包，然后執行 ./Fortify_SCA_and_Apps_19.2.1_linux_x64.run 按照引導完成安裝即可，安裝完成后進入目錄執行sourceanalyzer來查看是否安裝完成

$ ./bin/sourceanalyzer -version
Fortify Static Code Analyzer 19.2.1.0008 (using JRE 1.8.0_181)

然后將 rules 和 ExternalMetadata 拷貝到對應的目錄中完成規則的安裝。

工具使用

本節涉及代碼

https://github.com/hac425xxx/sca-workshop/tree/master/fortify-example

Fortify的工作原理和codeql類似，首先會需要使用Fortify對目標源碼進行分析提取源代碼中的信息，然后使用規則從源碼信息中查詢出匹配的代碼。

首先下載代碼然后使用 sourceanalyzer 來分析源碼

/home/hac425/sca/fortify/bin/sourceanalyzer -b fortify-example make

其中

-b 指定這次分析的 id
后面是編譯代碼時使用的命令，這里是 make

分析完代碼后再次執行 sourceanalyzer 對源碼進行掃描

/home/hac425/sca/fortify/bin/sourceanalyzer -b fortify-example -scan -f fortify-example.fpr

其中

-b 指定掃描的 id 和之前分析源碼時的 id 對應
-scan 表示這次是采用規則對源碼進行掃描
-f 指定掃描結果輸出路徑，掃描結果可以使用 auditworkbench.cmd 進行可視化的分析。

生成 .fpr 結果后可以使用 auditworkbench 加載分析

system命令執行檢測

本節涉及代碼

https://github.com/hac425xxx/sca-workshop/tree/master/fortify-example/system_rules

漏洞代碼如下

int call_system_example()
{
    char *user = get_user_input_str();
    char *xx = user;
    system(xx);
    return 1;
}

首先通過 get_user_input_str 獲取外部輸入，然后傳入 system 執行。

下面介紹如何編寫 Fortify 規則來識別這個漏洞，規則文件是一個xml文件，其主要結構如下

<?xml version="1.0" encoding="UTF-8"?>
<RulePack xmlns="xmlns://www.fortifysoftware.com/schema/rules">
    <RulePackID>EA6AEBB1-F11A-44AD-B5DD-F4F66907184E</RulePackID>
    <Version>1.0</Version>
    <Description><![CDATA[]]></Description>
    <Rules version="20.1">
        <RuleDefinitions>
        
            <DataflowSourceRule formatVersion="3.2" language="cpp">
                ....
            </DataflowSourceRule>
            
        </RuleDefinitions>
    </Rules>
</RulePack>

RulePackID 表示這個規則文件的 ID，設置符合格式的唯一字符串即可
RuleDefinitions 里面是這個xml文件中的所有規則，每個規則作為RuleDefinitions的子節點存在，比如示例中的 DataflowSourceRule 節點，表示這是一個 DataflowSource 規則，用於指定數據流跟蹤的 source

我們開發規則實際也只需要在 RuleDefinitions 中新增對應的規則節點即可。

Fortify 也支持污點跟蹤功能，下面就介紹如何定義 Fortify 的污點跟蹤規則，首先我們需要定義 source ，DataflowSourceRule 規則用於定義污點源，不過這個只支持定義函數的一些屬性作為污點源，比如返回值、參數等，代碼如下

<DataflowSourceRule formatVersion="3.2" language="cpp">
    <RuleID>AEFA1FBF-3137-4DD8-A65F-774350C97427</RuleID>
    <FunctionIdentifier>
        <FunctionName>
            <Value>get_user_input_str</Value>
        </FunctionName>
    </FunctionIdentifier>
    <OutArguments>return</OutArguments>
</DataflowSourceRule>

這條規則的作用是告知Fortify的數據流分析引擎 get_user_input_str 的返回值是污點數據，規則的解釋如下：

首先 RuleID 用於唯一標識一條規則
FunctionIdentifier 用於匹配一個函數，其中包含一個 FunctionName 子節點，表示通過函數名進行匹配，這里就是匹配 get_user_input_str 函數
然后 OutArguments 用於定義污點源， return 表示該函數的返回值是污點數據，如果該節點的值為數字 n ，則表示第 n 個參數為污點數據，n 從0開始計數。

定義好 source 點后，還需要定義 sink 點，DataflowSinkRule 規則用於定義 sink 點

<DataflowSinkRule formatVersion="3.2" language="cpp">
    <RuleID>AA212456-92CD-48E0-A5D5-E74CC26ADDF</RuleID>
    <Description><![CDATA[]]></Description>
    <VulnCategory>Command Injection</VulnCategory>
    <DefaultSeverity>4.0</DefaultSeverity>
    <Sink>
        <InArguments>0</InArguments>
    </Sink>
    <FunctionIdentifier>
        <FunctionName>
            <Value>system</Value>
        </FunctionName>
    </FunctionIdentifier>
</DataflowSinkRule>

這條規則的作用是設置 system 的第 0 個參數為 sink 點，規則解釋如下：

VulnCategory 是一個字符串，會在掃描結果中呈現
FunctionIdentifier 用於匹配一個函數，這里就是匹配 system 函數
Sink 和 InArguments 用於表示函數的第 0 個參數為 sink 點

規則編寫完后，保存成一個 xml 文件，然后在對源碼進行掃描時通過 -rules 指定自定義的規則文件即可

/home/hac425/sca/fortify/bin/sourceanalyzer -rules system.xml -b fortify-example -scan -f fortify-example.fpr -no-default-rules

ps: -no-default-rules 表示不使用Fortify的默認規則，這里主要是在自己開發規則時避免干擾。

掃描的結果如下

由於我們沒有考慮 clean_data 函數對外部輸入的過濾，所以會導致誤報

int call_system_safe_example()
{
    char *user = get_user_input_str();
    char *xx = user;
    if (!clean_data(xx))
        return 1;
    system(xx);
    return 1;
}

可以使用 DataflowCleanseRule 規則來定義這類會對輸入進行過濾的函數

<DataflowCleanseRule formatVersion="3.2" language="cpp">
    <RuleID>3EC057A4-AE7A-42C4-BAA0-3ACB36C8AB4B</RuleID>
    <FunctionIdentifier>
        <FunctionName>
            <Value>clean_data</Value>
        </FunctionName>
    </FunctionIdentifier>
    <OutArguments>0</OutArguments>
</DataflowCleanseRule>

規則表示 clean_data 函數執行后其第 0 個參數就是干凈的（不再是污點值），此時就可以把外部數據被過濾的場景從查詢結果中剔除掉了。

此時的掃描還會漏報 call_our_wrapper_system_custom_memcpy_example ，因為其中使用了custom_memcpy這個外部函數來進行內存拷貝，這樣Fortify在進行污點跟蹤的時候就會導致污點數據丟失，從而漏報。

int custom_memcpy(char *dst, char *src, int sz);

int call_our_wrapper_system_custom_memcpy_example()
{

    char *user = get_user_input_str();

    char *tmp = malloc(strlen(user) + 1);

    custom_memcpy(tmp, user, strlen(user));

    our_wrapper_system(tmp);
    return 1;
}

我們可以使用 DataflowPassthroughRule 規則來對這個函數進行建模

<DataflowPassthroughRule formatVersion="3.2" language="cpp">
    <RuleID>C929ED5F-9E6A-4CB5-B8AE-AAAAD3C20BDC</RuleID>
    <FunctionIdentifier>
        <FunctionName>
            <Pattern>custom_memcpy</Pattern>
        </FunctionName>
    </FunctionIdentifier>
    <InArguments>1</InArguments>
    <OutArguments>0</OutArguments>
</DataflowPassthroughRule>

規則作用是告知 Fortify 調用 custom_memcpy 函數時，第 1 個參數的污點數據會傳播到第 0 個參數，結果如下

system命令執行檢測 # 2

除了使用 DataflowSourceRule 、DataflowSinkRule 等規則來定義污點跟蹤相關的屬性外，Fortify還支持使用 CharacterizationRule 來定義污點跟蹤相關的特性。

其中對應關系如下圖所示：

根據文檔的使用示例，修修改改很快就可以使用 CharacterizationRule 來搜索出涉及 system 命令執行的代碼，代碼路徑如下

https://github.com/hac425xxx/sca-workshop/blob/master/fortify-example/system_rules/system_CharacterizationRule.xml

介紹具體的 CharacterizationRule 規則實現之前，先介紹一下 StructuralRule 規則，因為 CharacterizationRule 就是通過 StructuralRule 的語法來匹配代碼中的語法結構。

StructuralRule 官方文檔中的內容如下

The Structural Analyzer operates on a model of the program source code called the structural tree. The structural tree is made up of a set of nodes that represent program constructs such as classes, functions, fields, code blocks, statements, and expressions.

Fortify在編譯/分析代碼過程中會把代碼中的元素（代碼塊、類、表達式、語句等）通過樹狀結構體組裝起來形成一顆 structural tree，然后掃描的時候使用 Structural Analyzer 來解析 StructuralRule ，最后輸出匹配的代碼。

下面以一個簡單的示例看看 structural tree 的結構，示例代碼如下

class C { 
	private int f; 
    void func() {
    }
}

代碼對應的樹結構如下

搜索上述代碼的 StructuralRule 的代碼如下

Field field: field.name == "f" and field.enclosingClass is [Class class: class.name == "C"]

其中 Field field: 類似於聲明變量， : 后面試前面變量需要滿足的條件，比如

field.name == "f"

這個就表示 field 的 name 為 f ，規則后續使用 and 表示與條件，然后通過 field.enclosingClass 獲取到這個字段位於的class，[...] 類似於定義一個變量，其返回值為滿足條件的對象

[Class class: class.name == "C"]

上面的語句表示 [] 會返回類名為 C 的 Class 對象

field.enclosingClass is [Class class: class.name == "C"]

這條語句的作用就是限制 field 所在的類的類名為 C ，其實 StructuralRule 的作用和使用方式和Codeql非常相似，主要就是利用邏輯表達式（and, or...）來匹配代碼的特定元素。

下面介紹CharacterizationRule的使用，首先定義 source 點

<CharacterizationRule formatVersion="19.10" language="cpp">
    <RuleID>EE5D-4B1D-A798-4D1B5E081112</RuleID>
    <StructuralMatch>
        <![CDATA[
            FunctionCall fc:
                fc.name contains "get_user_input_str" 
        ]]>                
    </StructuralMatch>
    <Definition>
        <![CDATA[
            TaintSource(fc, {+PRIVATE})
        ]]>                
    </Definition>
</CharacterizationRule>

其中 StructuralMatch 使用 StructuralRule 的語法來匹配代碼，然后在 Definition 里面可以使用一些API（比如TaintSource）和匹配到的代碼元素來標記污點跟蹤相關的熟悉，比如污點源、Sink點等，這里要注意一點：Definition 中可以訪問到 StructuralMatch 中聲明的所有變量，不論是通過 : 還是通過 [] 聲明。

上述規則的作用就是

首先通過 StructuralMatch 匹配到 get_user_input_str 的函數調用對象 fc.
然后在 Definition ，使用 TaintSource 設置 fc 為污點源，污點標記為 PRIVATE.

sink 點的設置

<CharacterizationRule formatVersion="3.12" language="cpp">
    <RuleID>EE905D4B-A03D-49B2-83E4-4EE043411223</RuleID>
    <VulnKingdom>Input Validation and Representation</VulnKingdom>
    <VulnCategory>System RCE</VulnCategory>
    <DefaultSeverity>4.0</DefaultSeverity>
    <Description><![CDATA[]]></Description>
    <StructuralMatch>
        <![CDATA[
            FunctionCall fc:
                fc.name contains "system" and  fc.arguments[0] is [Expression e:]
        ]]>                
    </StructuralMatch>
    <Definition>
        <![CDATA[
            TaintSink(e, [PRIVATE])
        ]]>                
    </Definition>
</CharacterizationRule>

規則解釋如下：

首先使用 StructuralMatch 匹配 fc 為 system 的函數調用， e 為 fc 的第0個參數
然后在 Definition 使用 TaintSink 設置 e 為sink點，污點標記為 PRIVATE.

這樣就表示如果 system 函數調用的第 0 個參數為污點數據且污點數據中包含 PRIVATE 標記，就會把這段代碼爆出來。

其他的規則（函數建模，clean_data函數）也是類似這里不再介紹，最終掃描結果如下圖：

在開發 Structural 相關規則時可以在分析時使用 -Ddebug.dump-structural-tree 把代碼的 structural tree 打印出來，然后我們根據樹的結構就可以比較方便的開發規則，完整命令行如下

/home/hac425/sca/fortify/bin/sourceanalyzer -no-default-rules -rules hello_array.xml -b fortify-example -scan -f fortify-example.fpr -D com.fortify.sca.MultithreadedAnalysis=false  -Ddebug.dump-structural-tree 2> tree.tree

打印出來的示例如下

根據樹狀結構就可以寫出匹配 global_array[user] 的代碼如下：

ArrayAccess aa: aa.index  is [VariableAccess va:va.name == "user"]

引用計數漏洞

本節相關代碼

https://github.com/hac425xxx/sca-workshop/blob/master/fortify-example/ref_rules/

漏洞代碼

int ref_leak(int *ref, int a, int b)
{
    ref_get(ref);
    if (a == 2)
    {
        puts("error 2");
        return -1;
    }
    ref_put(ref);
    return 0;
}

int ref_dec_error(int *ref, int a, int b)
{
    ref_get(ref);
    if (a == 2)
    {
        puts("ref_dec_error 2");
        ref_put(ref);
    }
    ref_put(ref);
    return 0;
}

ref_leak 的漏洞是當 a=2 時會直接返回沒有調用 ref_put 對引用計數減一，漏洞模型：在某些存在 return 的條件分支中沒有調用 ref_put 釋放引用計數。

ref_dec_error 的漏洞是在特定條件下會對引用計數減多次。

這種類型漏洞適合使用 ControlflowRule 來查詢對應的漏洞，對於規則如下

<ControlflowRule formatVersion="3.4" language="cpp">
    <RuleID>1650899A-908F-4301-B67A-D08E8E331122</RuleID>
    <VulnKingdom>API Abuse</VulnKingdom>
    <VulnCategory>Ref Leak</VulnCategory>
    <DefaultSeverity>3.0</DefaultSeverity>
    <Description><![CDATA[]]></Description>
    
    <FunctionIdentifier id="ref_put">
        <FunctionName>
            <Pattern>ref_put</Pattern>
        </FunctionName>
    </FunctionIdentifier>
    
    <FunctionIdentifier id="ref_get">
        <FunctionName>
            <Pattern>ref_get</Pattern>
        </FunctionName>
    </FunctionIdentifier>
    
    <PrimaryState>ref_add</PrimaryState>
    <Definition>
        <![CDATA[
            state start (start);
            state ref_add;
            state ref_dec;
            state no_leak;
            state checked;
            state leak (error) : "ref.leak";
            state double_dec (error): "ref dec 2";

            var p;
            start -> ref_add { $ref_get(p) }

            ref_add -> ref_dec { $ref_put(p) }
            ref_dec -> double_dec { $ref_put(p) }
            ref_dec -> checked { #return() }
            ref_add -> leak { #return() }
        ]]>
    </Definition>
</ControlflowRule>

首先 FunctionIdentifier 匹配 ref_put 和 ref_get 兩個函數，然后通過 Definition 定義規則

state start (start);
state ref_add;
state ref_dec;
state no_leak;
state checked;
state leak (error) : "ref.leak";
state double_dec (error): "ref dec 2";

var p;
start -> ref_add { $ref_get(p) }

ref_add -> ref_dec { $ref_put(p) }
ref_dec -> double_dec { $ref_put(p) }
ref_dec -> checked { #return() }
ref_add -> leak { #return() }

規則的解釋如下：

首先通過 state xxx 定義一些狀態，其中 (start) 表示狀態時初始狀態，(error) 表示對應狀態為錯誤狀態，只要代碼進入了錯誤狀態就會在掃描結果中呈現， var 用於定義一個臨時變量。
在規則中使用 $func_id 來引用之前使用 FunctionIdentifier 匹配到的函數。
start -> ref_add { $ref_get(p) } 表示從 start 狀態進入 ref_add 狀態的條件是調用了 ref_get 函數，入參為 p
ref_add -> leak { #return() } 表示從 ref_add 狀態進入 leak 狀態的條件是函數直接 return 返回了。
ref_add -> ref_dec { $ref_put(p) } 表示代碼在 ref_add 狀態情況下對 p 調用了 ref_put 后就會進入 ref_dec，即對引用計數減1.
如果在 ref_dec 狀態從函數返回，就表示函數沒有問題。
如果在 ref_dec 狀態下再次調用 ref_put(p) 則會進入 double_dec，會在掃描結果中呈現。

其他的tips

Fortify自帶的規則是加密過的的，我們可以根據已有的一些研究對其解密，然后參考官方的規則來開發新的規則

https://www.52pojie.cn/thread-783946-1-1.html

可以查看 fortify-sca-20.1.1.0007.jar 里面的 com.fortify.sca.nst.nodes 包里面的類，這些類表示的是fortify語法樹的各個節點，可以通過對應類的方法知道在結構化規則中可以訪問的方法和函數。

總結

Fortify相比codeql的優勢在於：

商用工具，擁有許多預設規則，比較成熟。
規則開發模式比較局限，但是對於某些特定場景的規則開發相對簡單。
適合大規模規則的掃描。

codeql 的語法非常靈活，可以靈活運用匹配出各種代碼片段，支持對大部分語法元素應用污點分析，比如支持設置數組索引位置為Sink點，經過各種嘗試，發現fortify不支持。

參考

https://bbs.huaweicloud.com/blogs/104311

https://tech.esvali.com/mf_manuals/html/sca_ssc/

https://paper.seebug.org/papers/old_sebug_paper/books/Fortify/rules-schema/

免責聲明！

本站轉載的文章為個人學習借鑒使用，本站對版權不負任何法律責任。如果侵犯了您的隱私權益，請聯系本站郵箱yoyou2525@163.com刪除。

猜您在找 C/C++源碼掃描系列- codeql 篇 C/C++源碼掃描系列- Joern 篇 C++源碼學習 C和C++從零開始系列(一) C++核心篇 c++ 端口掃描程序 c++實現"掃描檢測硬件改動" 代碼掃描工具fortify C++/CLI入門系列第二篇：封裝C++ dll庫，提供接口給C#調用 C++ 系列：C++ 內存布局