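Clang 9 Adaptation: Phase 1 Summary

Reposted article. 2021-12-06 16:11. Tags: c++ / technology / clang

1. Overview

As of November 25, 2021, the sdk, gtest, and dsopt modules compile with clang9.

Using the script referenced below, I downloaded every change related to [TR-16607] "clang9 cross-compilation toolchain creation and verification" (Enflame Company JIRA), including both merged changes and changes that are still open:

How to batch-export detailed patches from gerrit - 周榮華_Ronghua - enflame wiki

One caveat: the gerrit query command cannot contain parentheses. When several conditions are combined, they are joined with AND by default; to get OR semantics, the alternative expressions must be connected explicitly with OR.

A rough count shows 3924 lines added and 4164 lines deleted: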

PS D:\code> grep "^+[^+]" .\diffrecord.txt |wc
   3924   24785  152346
PS D:\code> grep "^-[^-]" .\diffrecord.txt |wc
   4164   23159  147430

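In the early stage -Werror was enabled, so a number of the fixes address fairly minor warnings. Because there were far too many warnings, -Werror was later turned off temporarily and only a selected set of -Werror=<warning> options was kept.

In addition, the code under tops contains a great many implicit conversions from wider to narrower integer types, so -Wno-c++11-narrowing was added later to suppress the related diagnostics temporarily.

2. How problems were found and fixed

Running a full build after every single fix is usually very time-consuming. The methods below extract an individual compile or link command so that a fix can be verified in isolation.

2.1. Getting compile commands from cmake

cmake can emit a compilation database: a "compile_commands.json" file is generated in the cmake_build directory (the directory where the cmake command is run, which may be a different directory) when CMAKE_EXPORT_COMPILE_COMMANDS is enabled. It records the working directory and the complete command used to turn every .c/.cc/.cpp file into a .o file. For example, the compile command for hlir_utils_test.cc can be looked up as follows: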

grep hlir_utils_test.cc compile_commands.json
  "command": "/opt/efb/clang9/bin/clang++  -DLLVM_DISABLE_ABI_BREAKING_CHECKS_ENFORCING -D_GLIBCXX_USE_CXX11_ABI=0 -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/sdksrc/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/sdksrc/include/_virtual_includes/include/dtu -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/lib/umd/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/ef_log/include/_virtual_includes/include -I/home/ronghua.zhou/clang1_build/tops/sdk -I/home/ronghua.zhou/clang1_build/tops/sdk/lib -I/home/ronghua.zhou/clang1_build/tops/sdk/lib/cpu_ops -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/llvm-project/llvm/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/llvm-project/mlir/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/org_tensorflow -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/eigen_archive -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/com_google_absl -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/external/com_google_protobuf/src -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/dtu_sdk/bazel-bin -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/llvm-project/llvm/utils/unittest/googlemock/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/external/com_googlesource_code_re2 -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/lib 
-I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/org_tensorflow -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/llvm-project/llvm/include -I/home/ronghua.zhou/clang1_build/tops/../test_sdk_build/execroot/dtu_sdk/bazel-out/k8-opt/bin/external/llvm-project/mlir/include -isystem /home/ronghua.zhou/clang1_build/tops/3rdparty/googletest/include -isystem /home/ronghua.zhou/clang1_build/tops/3rdparty/googletest  -O3 -g0 -DNDEBUG -fPIE   -m64 -march=x86-64 -mtune=generic -Werror=array-bounds -Werror=empty-body -Werror=format-extra-args -Werror=incompatible-pointer-types -Werror=array-bounds-pointer-arithmetic -Werror=c++-compat -Werror=shift-count-overflow -Werror=sizeof-pointer-memaccess -Werror=for-loop-analysis -Werror=unused-label -Werror=delete-incomplete -Werror=empty-translation-unit -Werror=unused-local-typedef -Werror=gnu-case-range -Werror=mismatched-new-delete -Werror=infinite-recursion -Werror=unreachable-code -Werror=sometimes-uninitialized -Werror=c++14-binary-literal -Werror=implicit-fallthrough -Werror=constant-logical-operand -Werror=exceptions -fcxx-exceptions -Werror=extra-tokens -Werror=format -Werror=format-security -Werror=header-guard -Werror=literal-conversion -Werror=null-conversion -Werror=pointer-bool-conversion -Werror=shift-overflow -Werror=tautological-constant-out-of-range-compare -Werror=tautological-pointer-compare -Werror=varargs -Wdouble-promotion -Wno-error=extern-c-compat -Wall -Wno-c++11-narrowing -Wextra -fsanitize=address -fno-omit-frame-pointer -std=gnu++14 -std=gnu++14 -o sdk/tests/hlir/cc_tests/CMakeFiles/hlir_utils_test.dir
hlir_utils_test.cc.o -c /home/ronghua.zhou/clang1_build/tops/sdk/tests/hlir/cc_tests/hlir_utils_test.cc",
  "file": "/home/ronghua.zhou/clang1_build/tops/sdk/tests/hlir/cc_tests/hlir_utils_test.cc"

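2.2. Getting compile commands from bazel

•https://github.com/vincent-picaud/Bazel_and_CompileCommands

The open-source project above suggests adding --experimental_action_listener=//tools/actions:generate_compile_commands_listener to the bazel command line to capture compile commands, but I tried it several times without success. In the end I fell back to capturing the command with plain ps while the build was running. For example, the compile command for hlir_utils_test.cc can be captured with:

ps -elf |grep hlir_utils_test.cc

Alternatively, passing -s to bazel makes it print the commands it subsequently executes.

2.3. Getting link commands

If the link target is known, the ps approach from 2.2 works as well. For example, the command that links libdtu_sdk.so can be captured with:

ps -elf |grep libdtu_sdk.so

If the link target is not known and there are not many link steps, "ps -elf" can be used to list all processes. The list will contain many processes of the form "ld @/tmp/response-xxx.txt". Copy all current /tmp/response* files to another directory and inspect them to find out which target each one links. Each file contains the complete link command and its arguments, so the link command can be reconstructed from it.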

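3. Categories of actual changes

3.1. Compiler option changes

3.1.1. Options added

-fcxx-exceptions: dsopt uses exceptions, and exception handling was off by default with clang in this build, so it has to be enabled explicitly.

-Wno-c++11-narrowing: the code under tops contains a great many implicit conversions from wider to narrower integer types. clang treats such narrowing conversions (inside braced initializers) as errors, so the check is disabled temporarily and will be re-enabled once each component has cleaned up its code.

3.1.2. Options removed

-Werror: there are simply too many warnings, and eliminating all of them is not realistic for now, so this option was removed temporarily.

3.1.3. Options modified

set (CMAKE_CXX_STANDARD 14): the previous default standard was 17, which conflicts with TensorFlow's default of 14 and with gcc's default of 14, so it was changed to C++14.

-fno-canonical-system-headers: only gcc supports this flag; clang does not. It used to be enabled for all compilers and is now enabled for gcc only.

3.1.4. Notes on bazel options

bazel has three kinds of compile options: copt, cxxopt, and conlyopt. copt applies to both C and C++, cxxopt applies to C++ only, and conlyopt applies to C only. Using the wrong one produces many warnings.

3.1.5. On a rerun, CMake's CMAKE_TOOLCHAIN_FILE variable sometimes has the full path prepended to the toolchain file found on the search path, so a direct STREQUAL comparison fails

The fix is to use MATCHES instead of STREQUAL, so that the comparison succeeds whether or not the full path has been prepended. The change is in CMakeLists.txt.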

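3.2. Template-related errors

3.2.1. use 'template' keyword to treat 'cast' as a dependent template name

When a member function template is called with explicit template arguments on an object whose type depends on a template parameter, clang requires the template keyword in front of the member name to state explicitly that it is a template; otherwise it reports an error. See ISO C++03 14.2/4: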

When the name of a member template specialization appears after . or -> in a postfix-expression, or after nested-name-specifier in a qualified-id, and the postfix-expression or qualified-id explicitly depends on a template-parameter (14.6.2), the member template name must be prefixed by the keyword template. Otherwise the name is assumed to name a non-template.

For example, the SinkTransposeWithScalarBroadcast class in hlir calls the cast member template with mlir::RankedTensorType and mlir::ShapedType:

diff --git a/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc b/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
index c82fa217a21..9952ddbc470 100644
--- a/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
+++ b/sdk/lib/hlir/transforms/TopsInferenceHlirPass/HlirTransposeMoverExt.cc
@@ -237,11 +237,14 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern<T> {
     }
     llvm::SmallVector<mlir::Value, 4> new_operands(root->getNumOperands(), {});
     for (auto& it : broadcast_ops) {
-      auto transposedTy = getTransposedType(std::get<1>(it)
-                                                ->getResult(0)
-                                                .getType()
-                                                .cast<mlir::RankedTensorType>(),
-                                            prePermutation);
+      // fix error:
+      // use 'template' keyword to treat 'cast' as a dependent template name
+      auto transposedTy =
+          getTransposedType(std::get<1>(it)
+                                ->getResult(0)
+                                .getType()
+                                .template cast<mlir::RankedTensorType>(),
+                            prePermutation);
       auto new_attr = llvm::cast<HlirOp::BroadcastInDimOp>(std::get<1>(it))
                           .broadcast_dimensionsAttr();
       if (new_attr) {
@@ -251,7 +254,7 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern<T> {
           new_data[i] = layout[data[i]];
         }
         new_attr = mlir::DenseIntElementsAttr::get(
-            new_attr.getType().cast<mlir::RankedTensorType>(),
+            new_attr.getType().template cast<mlir::RankedTensorType>(),
             llvm::makeArrayRef(new_data));
       }
       mlir::Operation* transpose_bs_op =
@@ -274,7 +277,7 @@ struct SinkTransposeWithScalarBroadcast : public mlir::OpRewritePattern<T> {
     mlir::Operation* ret_transpose = rewriter.create<HlirOp::TransposeOp>(
         root->getLoc(), root->getResult(0).getType(), new_root->getResult(0),
         mlir::DenseIntElementsAttr::get(
-            permutation.getType().cast<mlir::ShapedType>(), layout));
+            permutation.getType().template cast<mlir::ShapedType>(), layout));
     root->replaceAllUsesWith(ret_transpose);
   }

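Note that template is not needed when the object's type does not depend on a template parameter. The same class also contains calls that need no change. For example, ss in the same file has the fixed type mlir::Operation*, so calling the cast member template through ss does not need the template disambiguator: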

mlir::Operation* ss = op.getOperation();
auto new_operand_ty = getTransposedType(operand_ty, prePermutation);
auto new_source_ty = getTransposedType(source_ty, prePermutation);
auto new_result_ty = getTransposedType(
    ss->getResult(0).getType().cast<mlir::RankedTensorType>(),
    prePermutation);
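
The rule can be reduced to a minimal sketch. Value, widenDependent, and widenFixed are hypothetical names made up for illustration and are not part of mlir or the SDK; the only point is where the template keyword is required.

```cpp
#include <cassert>

// Hypothetical stand-in for a type that offers a member template like cast<U>().
struct Value {
  int raw;
  template <typename U>
  U cast() const {
    return static_cast<U>(raw);
  }
};

// v has the dependent type T, so the member template must be introduced
// with `template`; without it clang parses `<` as a less-than operator.
template <typename T>
long widenDependent(const T& v) {
  return v.template cast<long>();
}

// v has a fixed, non-dependent type, so no disambiguator is needed.
inline long widenFixed(const Value& v) {
  return v.cast<long>();
}
```

Calling widenDependent(Value{7}) and widenFixed(Value{7}) gives the same result; only the spelling inside the template differs.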

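A related problem exists in factor_profiler_pass.cc of the factor module, in the opposite direction: template had been written in front of ordinary (non-template) member functions, which clang rejects, so the keyword was removed: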

diff --git a/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc b/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
index 43419fd305a..ad23a709f20 100644
--- a/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
+++ b/sdk/lib/factor/codegen/passes/factor_profiler_pass.cc
@@ -55,11 +55,11 @@ mlir::Value getFirstOperand<mlir::Value>(mlir::Value op) {
  
 template <typename T>
 int getSrcCompressed(T op) {
-  return op.template dma_src_compressedAttr().getInt();
+  return op.dma_src_compressedAttr().getInt();
 }
 template <typename T>
 int getDstDecompressed(T op) {
-  return op.template dma_dst_decompressAttr().getInt();
+  return op.dma_dst_decompressAttr().getInt();
 }
  
 #define DISABLE_DMA_COMPRESS_ATTR_GETTER(OP) \
@@ -84,11 +84,11 @@ DISABLE_DMA_COMPRESS_ATTR_GETTER(mlir::factor::FactorDeSliceOp)
  
 template <typename T>
 int getReverseLr(T op) {
-  return op.template dma_reverse_lrAttr().getInt();
+  return op.dma_reverse_lrAttr().getInt();
 }
 template <typename T>
 int getReverseTb(T op) {
-  return op.template dma_reverse_tbAttr().getInt();
+  return op.dma_reverse_tbAttr().getInt();
 }
  
 #define DISABLE_REVERSE_ATTR_GETTER(OP) \
@@ -114,7 +114,7 @@ DISABLE_REVERSE_ATTR_GETTER(mlir::factor::FactorDeSliceOp)
  
 template <typename T>
 int getDmaType(T op) {
-  return op.template dma_typeAttr().getInt();
+  return op.dma_typeAttr().getInt();
 }
  
 #define DISABLE_DMA_TYPE_GETTER(OP) \
@@ -142,8 +142,8 @@ std::string formatDmaAttrs(int direction, int src_compressed,
 template <typename T>
 void extractDmaMetaInfoTo(T op, dtu_activity_data &data) {
   auto &args = data.args;
-  mlir::Value from = getFirstOperand(op.template from());
-  mlir::Value to = getFirstOperand(op.template to());
+  mlir::Value from = getFirstOperand(op.from());
+  mlir::Value to = getFirstOperand(op.to());
   auto engine_type = getDmaType(op);
   auto direction = op.dma_directionAttr().getInt();

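3.2.2. Ambiguity

In some template instantiations, when the same call matches both function template A and function template B, clang reports an ambiguity error while gcc does not.

Take EraseHelp below. The original version defined two overloads. When several template types have to be expressed through TypeSequence, the compiler cannot tell from these declarations whether to peel off Last first or Inner first. Even if the two functions were implemented differently, gcc would accept the code without a diagnostic. Overload resolution normally prefers the more specialized template under partial ordering, which is presumably what gcc does here, but this was not verified.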

constexpr static auto EraseHelp(TypeSequence<Left...>, TypeSequence<Last>);

constexpr static auto EraseHelp(TypeSequence<Left...>, TypeSequence<Inner, Right...>);

diff --git a/sdk/lib/hlir/ir/type_utils.h b/sdk/lib/hlir/ir/type_utils.h
index 3cf2bc7994a..0e645fd1e7e 100644
--- a/sdk/lib/hlir/ir/type_utils.h
+++ b/sdk/lib/hlir/ir/type_utils.h
@@ -157,12 +157,9 @@ struct EraseSeqIf {
     using type = decltype(EraseHelp(LeftSeq(), TypeSequence<Right...>()));
     return type();
   }
-  template <typename... Left, typename Last>
-  constexpr static auto EraseHelp(TypeSequence<Left...>, TypeSequence<Last>) {
-    using type = typename std::conditional<!Pred<Last>::value,
-                                           TypeSequence<Left..., Last>,
-                                           TypeSequence<Left...>>::type;
-    return type();
+  template <typename... Left>
+  constexpr static auto EraseHelp(TypeSequence<Left...>, TypeSequence<>) {
+    return TypeSequence<Left...>();
   }
   using type = decltype(EraseHelp(TypeSequence<>(), TypeSequence<T...>()));
 };
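
The idea behind the fix, making the terminating case and the recursive case mutually exclusive, can also be sketched with class template partial specialization instead of overloaded functions. This is a different technique from the one in the diff, shown only as an illustration; EraseImpl and EraseIf are hypothetical names, and TypeSequence is redefined here so the sketch is self-contained.

```cpp
#include <cassert>
#include <type_traits>

template <typename... T>
struct TypeSequence {};

// Primary template: Kept holds the types accepted so far, Rest the ones left.
template <template <typename> class Pred, typename Kept, typename Rest>
struct EraseImpl;

// Terminating case: Rest is empty. It can never match a non-empty Rest,
// so it cannot compete with the recursive case below.
template <template <typename> class Pred, typename... Kept>
struct EraseImpl<Pred, TypeSequence<Kept...>, TypeSequence<>> {
  using type = TypeSequence<Kept...>;
};

// Recursive case: Rest has at least one element. Head is dropped when
// Pred<Head>::value is true and appended to Kept otherwise.
template <template <typename> class Pred, typename... Kept, typename Head,
          typename... Tail>
struct EraseImpl<Pred, TypeSequence<Kept...>, TypeSequence<Head, Tail...>> {
  using Next = typename std::conditional<Pred<Head>::value,
                                         TypeSequence<Kept...>,
                                         TypeSequence<Kept..., Head>>::type;
  using type = typename EraseImpl<Pred, Next, TypeSequence<Tail...>>::type;
};

template <template <typename> class Pred, typename... T>
using EraseIf =
    typename EraseImpl<Pred, TypeSequence<>, TypeSequence<T...>>::type;
```

With this sketch, EraseIf<std::is_integral, int, float, long, double> yields TypeSequence<float, double>.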

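3.3. Type mismatches

3.3.1. Implicit conversion from a wider to a narrower integer type

For example, func_entry in sdk/tests/llir/dataflow1_pingpang_buffer_test.cc was declared as int64_t, but the function it is passed to expects a uint32_t parameter, which triggers an implicit int64_t → uint32_t conversion: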

diff --git a/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc b/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
index fa824f03d9a..70298b1fb59 100644
--- a/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
+++ b/sdk/tests/llir/dataflow1_pingpang_buffer_test.cc
@@ -522,7 +522,7 @@ TEST(Pavo2xCDMAPattern1Test, Pavo2xCDMAPattern1WithPingpangTest) {
                              {{0}, {1}, {2}, {3}, {4}, {5}}, 1, 1, 1, -1, -1,
                              output_queues_l1);
  
-    int64_t func_entry = 0;
+    uint32_t func_entry = 0;
     // trigger sip
     for (uint64_t idx = 0; idx < SIP_COUNT; ++idx) {
       std::string sip_name = std::string("sip") + std::to_string(idx);

Other files with similar changes:

sdk/tests/llir/dataflow1_test.cc

sdk/tests/llir/dataflow2_test.cc

sdk/tests/llir/dataflow3_test.cc

sdk/tests/llir/dataflow5_test.cc

sdk/tests/llir/dataflow5_test_1xcdma.cc

sdk/tests/llir/dataflow7_test.cc

sdk/tests/llir/llir2assembler_leo_test.cc

sdk/tests/llir/utils/llir_test_util.cc

sdk/tests/llir/utils/llir_test_util.h
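
A minimal sketch of the narrowing rule follows. It assumes the value ends up in a braced initializer, which is where clang enforces the C++11 narrowing rule as a hard error; LaunchParam and makeParam are hypothetical names, not SDK types. Declaring the variable with the target type, as the diff above does, avoids the conversion altogether. An explicit cast is the alternative when the wide type has to stay.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameter block with a 32-bit entry field.
struct LaunchParam {
  std::uint32_t func_entry;
};

inline LaunchParam makeParam(std::int64_t wide_entry) {
  // LaunchParam p{wide_entry};
  //   ^ non-constant int64_t -> uint32_t inside {} is a narrowing
  //     conversion; clang rejects it unless -Wno-c++11-narrowing is given.
  // Making the conversion explicit keeps the braced initializer legal.
  return LaunchParam{static_cast<std::uint32_t>(wide_entry)};
}
```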

3.3.2. Implicit conversion from signed to unsigned

-1 converted to an unsigned integer type:

diff --git a/sdk/lib/hlir/ir/type_utils.h b/sdk/lib/hlir/ir/type_utils.h
index 0e645fd1e7e..f84360269f3 100644
--- a/sdk/lib/hlir/ir/type_utils.h
+++ b/sdk/lib/hlir/ir/type_utils.h
@@ -122,10 +122,9 @@ struct FindIf<Pred, T, R...> {
  
 template <template <typename N> typename Pred, typename T>
 struct FindIf<Pred, T> {
-  using type =
-      typename std::conditional<Pred<T>::value,
-                                std::integral_constant<size_t, 0>,
-                                std::integral_constant<size_t, -1>>::type;
+  using type = typename std::conditional<
+      Pred<T>::value, std::integral_constant<size_t, 0>,
+      std::integral_constant<size_t, static_cast<size_t>(-1)>>::type;
 };
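
The effect of the explicit cast can be checked at compile time. In the sketch below NotFound is a hypothetical alias, not a name from type_utils.h; the cast produces the largest size_t value, which is the usual "not found" marker.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// -1 is not representable in size_t, so using it directly as a template
// argument is a narrowing conversion. The explicit cast wraps it modulo
// 2^N and documents the intent.
using NotFound =
    std::integral_constant<std::size_t, static_cast<std::size_t>(-1)>;

static_assert(NotFound::value == std::numeric_limits<std::size_t>::max(),
              "the cast yields the largest size_t value");
```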

Most other cases involve a loop index declared as int that is compared against uint32_t values, which causes an implicit int → uint32_t conversion:

diff --git a/sdk/lib/umd/tests/sample/launch_code.cc b/sdk/lib/umd/tests/sample/launch_code.cc
index 1152a283052..708b1f44e7d 100644
--- a/sdk/lib/umd/tests/sample/launch_code.cc
+++ b/sdk/lib/umd/tests/sample/launch_code.cc
@@ -719,11 +716,10 @@ static void _launch_code_for_eight_sip(int cid, bool check_result) {
   dtu_mem_handle param = cluster_mem[cid];
   u64 param_off = A_B_SIZE + EIGHT_C_SIZE;
   u64 param_size = PARAM_TRUE_SIZE;
-  u16 launch_entry = 0;
   dtu_sip_mode_cfg_st mode;
   mode.mode_dw = 0x5070f10;
   LaunchKernelParameter parameter[8];
-  for (int i = 0; i < run_sip_count; i++) {
+  for (u32 i = 0; i < run_sip_count; i++) {
     parameter[i] =
         LaunchKernelParameter(sip[i], param, param_off + i * ONE_PARAM_SIZE,
                               param_size, 0, mode, 0, false, false, "op_0");

Other files:

sdk/lib/spm/src/buddy_policy.c

system_test/tools/vpd_cycle/vpd_cycle.c

sdk/lib/spm/include/spm.h

sdk/tests/llir/llir2assembler_leo_test.cc

sdk/tests/llir/dataflow5_test_1xcdma.cc

sdk/tests/llir/dataflow5_test.cc

sdk/tests/llir/utils/llir_test_util.cc

sdk/tests/llir/utils/llir_test_util.h

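The change to sdk/lib/umd/tools/kernel_code_processor/dturt.inc was a bit more involved. ___leo_runtime___ and ___x_runtime___ were defined as char[], but some initializer values are greater than 127 and overflow a signed char. The function that uses these arrays, and the functions it calls in turn, all require char[]. The final fix changes the definitions to unsigned char[] and adds a cast in the function that references them directly.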

diff --git a/sdk/lib/umd/tools/kernel_code_processor/dturt.inc b/sdk/lib/umd/tools/kernel_code_processor/dturt.inc
index 1f22b52d8af..d1ed30a049d 100644
--- a/sdk/lib/umd/tools/kernel_code_processor/dturt.inc
+++ b/sdk/lib/umd/tools/kernel_code_processor/dturt.inc
@@ -1,4 +1,4 @@
-static const char ___leo_runtime___[] = {
+static const unsigned char ___leo_runtime___[] = {
     0x21, 0x3C, 0x61, 0x72, 0x63, 0x68, 0x3E, 0x0A, 0x2F, 0x20, 0x20, 0x20,
     0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
     0x30, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
@@ -2885,7 +2885,7 @@ static const char ___leo_runtime___[] = {
     0x00, 0x00,
 };
 static const int ___leo_runtime_size___ = sizeof(___leo_runtime___);
-static const char ___x_runtime___[] = {
+static const unsigned char ___x_runtime___[] = {
     0x21, 0x3C, 0x61, 0x72, 0x63, 0x68, 0x3E, 0x0A, 0x2F, 0x20, 0x20, 0x20,
     0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
     0x30, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,

diff --git a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.h b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.h
index d193a8823ac..f61d048cfd6 100644
--- a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.h
+++ b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.h
@@ -128,7 +128,10 @@ class Kernel {
   struct __target__ : public KernelCode<__target__>, public Kernel {         \
     using KernelCode<__target__>::KernelCode;                                \
     static const llvm::StringRef GetArch() { return #__arch__; }             \
-    static const char* GetRTBuffer() { return ___##__arch__##_runtime___; }  \
+    static const char* GetRTBuffer() {                                       \
+      return static_cast<char*>(static_cast<void*>(                          \
+          const_cast<unsigned char*>(___##__arch__##_runtime___)));          \
+    }                                                                        \
     static int GetRTBufferSize() { return ___##__arch__##_runtime_size___; } \
   };                                                                         \
   template class KernelCode<__target__>
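
As an alternative to the cast chain in the diff above, a single reinterpret_cast that keeps the const qualifier may be enough, assuming callers only read the buffer. The sketch below uses hypothetical names (kBlob, blobAsChars, blobSize) and a four-byte array in place of the real runtime image.

```cpp
#include <cassert>
#include <cstddef>

// Bytes above 0x7F do not fit in a signed char, so the data is stored as
// unsigned char.
static const unsigned char kBlob[] = {0x21, 0x3C, 0xFF, 0x80};

// Hypothetical accessor for callers that still expect const char*.
// The cast changes only the pointee type; constness is preserved.
inline const char* blobAsChars() {
  return reinterpret_cast<const char*>(kBlob);
}

inline std::size_t blobSize() { return sizeof(kBlob); }
```

Reading a byte back through the const char* and converting it to unsigned char recovers the original value.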

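3.3.3. Implicit conversion from floating point to integer

The fractional part is simply dropped, so a non-zero value immediately becomes 0: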

diff --git a/sdk/tests/tops/tops_bnForwardTrainingEx_integration_test.cc b/sdk/tests/tops/tops_bnForwardTrainingEx_integration_test.cc
index 73712aba4ad..df82dadfa65 100644
--- a/sdk/tests/tops/tops_bnForwardTrainingEx_integration_test.cc
+++ b/sdk/tests/tops/tops_bnForwardTrainingEx_integration_test.cc
@@ -1890,7 +1890,7 @@ TEST_F(TopsTest, topsConvolutionForward_BatchNorm_RELU_UV) {
   int k_c = 1;
   int k_h = 3;
   int k_w = 3;
-  int epsilon = 0.01;
+  float epsilon = 0.01;
  
   int input_size = n * c * h * w;
   int kernel_size = k_n * k_c * k_h * k_w;
@@ -2181,7 +2181,7 @@ TEST_F(TopsTest, topsConvolutionForward_BatchNorm_RELU_SV) {
   int k_c = 1;
   int k_h = 3;
   int k_w = 3;
-  int epsilon = 0.01;
+  float epsilon = 0.01;
  
   int input_size = n * c * h * w;
   int kernel_size = k_n * k_c * k_h * k_w;

Other similar changes:

sdk/tests/op/hlir/pavo/bert/hlir_div_test.cc
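
The truncation is easy to reproduce. In this sketch truncatedEpsilon and keptEpsilon are hypothetical helper names; the explicit cast yields the same value that the implicit conversion in "int epsilon = 0.01;" produced.

```cpp
#include <cassert>

// Same result as `int epsilon = 0.01;`: the fraction is discarded.
inline int truncatedEpsilon() {
  int epsilon = static_cast<int>(0.01);
  return epsilon;
}

// Declaring the variable with a floating-point type keeps the value.
inline float keptEpsilon() {
  float epsilon = 0.01f;
  return epsilon;
}
```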

3.3.4. Implicit conversion from double to float

diff --git a/sdk/lib/umd/tests/sample/launch_code.cc b/sdk/lib/umd/tests/sample/launch_code.cc
index 1152a283052..708b1f44e7d 100644
--- a/sdk/lib/umd/tests/sample/launch_code.cc
+++ b/sdk/lib/umd/tests/sample/launch_code.cc
@@ -783,8 +779,8 @@ static void _launch_code_for_eight_sip(int cid, bool check_result) {
     float *result = (float *)((u64)dtu_mem_get_cpu_ptr(host_mem) + A_B_SIZE);
     for (u32 i = 0; i < (run_sip_count * DTU_ALIGN(DATA_BUFF_SIZE, 128)) / 4;
          i++) {
-      if (result[i] - ((2 * i) % (DATA_BUFF_SIZE / 2)) > 0.01 ||
-          ((2 * i) % (DATA_BUFF_SIZE / 2)) - result[i] > 0.01) {
+      if (result[i] - ((2 * i) % (DATA_BUFF_SIZE / 2)) > 0.01f ||
+          ((2 * i) % (DATA_BUFF_SIZE / 2)) - result[i] > 0.01f) {
         dtu_command_queue_destroy(queue);
         dtu_mem_free_hbm(hbm_mem);
         dtu_mem_free_host(host_mem);
@@ -425,7 +425,7 @@ static void launch_code_for_one_sip(void) {
  
   float *result = (float *)((u64)dtu_mem_get_cpu_ptr(host_mem) + A_B_SIZE);
   for (int i = 0; i < DATA_BUFF_SIZE / 4; i++) {
-    if (result[i] - (2 * i) > 0.01 || (2 * i) - result[i] > 0.01) {
+    if (result[i] - (2 * i) > 0.01f || (2 * i) - result[i] > 0.01f) {
       dtu_command_queue_destroy(queue);
       dtu_mem_free_hbm(hbm_mem);
       dtu_mem_free_host(host_mem);
@@ -605,8 +605,8 @@ static void launch_one_sip_twice(void) {
  
   float *result = (float *)((u64)dtu_mem_get_cpu_ptr(host_mem) + A_B_SIZE);
   for (int i = 0; i < 2 * DATA_BUFF_SIZE / 4; i++) {
-    if (result[i] - ((2 * i) % (DATA_BUFF_SIZE / 2)) > 0.01 ||
-        ((2 * i) % (DATA_BUFF_SIZE / 2)) - result[i] > 0.01) {
+    if (result[i] - ((2 * i) % (DATA_BUFF_SIZE / 2)) > 0.01f ||
+        ((2 * i) % (DATA_BUFF_SIZE / 2)) - result[i] > 0.01f) {
       dtu_command_queue_destroy(queue);
       dtu_mem_free_hbm(hbm_mem);
       dtu_mem_free_host(host_mem);

Other similar changes:

sdk/tests/op/hlir/pavo/resnet50/hlir_general_resize_test.cc

3.3.5. Implicit conversion from pointer to bool

diff --git a/system_test/tools/vpd_cycle/vpd_cycle.c b/system_test/tools/vpd_cycle/vpd_cycle.c
index 31d57fa0f9c..ccc9f71b827 100644
--- a/system_test/tools/vpd_cycle/vpd_cycle.c
+++ b/system_test/tools/vpd_cycle/vpd_cycle.c
@@ -75,14 +83,14 @@ static int ProcessDB(const char *path) {
   char *name = strdup(path);
   char *base = basename(name);
   char *p;
-  if (p = strrchr(base, '.')) *p = '\0';
+  if ((p = strrchr(base, '.')) != NULL) *p = '\0';
   fprintf(output_fp, "%s,%lu\n", base, end - start);
   free(name);

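3.3.6. Implicit conversion between unrelated types

fixed_size_mem_pool.h assigned between dtu_status and int directly. dtu_status is an enum and behaves much like an int, but clang enforces strict type checking here and reports an error.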

diff --git a/sdk/runtime/lib/top_scheduler/fixed_size_mem_pool.h b/sdk/runtime/lib/top_scheduler/fixed_size_mem_pool.h
index 73d02f3b1f4..b7be6ee39c4 100644
--- a/sdk/runtime/lib/top_scheduler/fixed_size_mem_pool.h
+++ b/sdk/runtime/lib/top_scheduler/fixed_size_mem_pool.h
@@ -118,7 +118,7 @@ class DeviceFixedSizeMemPool final
   ~DeviceFixedSizeMemPool() {}
  
   Status Init(dtu_umd::MemoryMgr *mgr, uint32_t mc, uint32_t flags) override {
-    dtu_status status = 0;
+    dtu_status status = DTU_SUCCESS;
     status =
         mgr->AllocDevice(NODE_NUMBER * NODE_SIZE, mc, flags, &(this->mem_));
     if (status) {

Definition of dtu_status:

typedef enum dtu_status_code {
  DTU_SUCCESS = 0,
  DTU_ERROR_INVALID_PARAMETER = -100,
  DTU_ERROR_INVALID_MEM_TYPE = -101,
  DTU_ERROR_OUT_OF_MEMORY = -102,
  DTU_ERROR_OUT_OF_RESOURCES = -103,
  DTU_ERROR_NOT_INITIALIZED = -104,
  DTU_ERROR_INVALID_CTX_OBJ = -105,
  DTU_ERROR_INVALID_CLUSTER_OBJ = -106,
  DTU_ERROR_INVALID_SIP_OBJ = -107,
  DTU_ERROR_INVALID_MEM_OBJ = -108,
  DTU_ERROR_INVALID_CMD_QUEUE_OBJ = -109,
  DTU_ERROR_INVALID_CMD_DESC_OBJ = -110,
  DTU_ERROR_INVALID_PROGRAM_OBJ = -111,
  DTU_ERROR_INVALID_FUNCTION_OBJ = -112,
  DTU_ERROR_INVALID_EVENT_OBJ = -113,
  DTU_ERROR_CLUSTER_BUSY = -114,
  DTU_ERROR_SIP_BUSY = -115,
  DTU_ERROR_IN_DRM = -116,
  DTU_ERROR_IN_IOCTRL = -117,
  DTU_ERROR_GEM_CREATE = -118,
  DTU_ERROR_GEM_CLOSE = -119,
  DTU_ERROR_GEM_MMAP = -120,
  DTU_ERROR_GEM_UNMMAP = -121,
  DTU_ERROR_CMD_QUEUE_SYNC = -122,
  DTU_ERROR_CMD_QUEUE_EMIT = -123,
  DTU_ERROR_CLUSTER_ACQUIRE = -124,
  DTU_ERROR_CLUSTER_RELEASE = -125,
  DTU_ERROR_NOT_MATCH = -126,
  DTU_ERROR_NOT_RELEASE_REF = -127,
  DTU_ERROR_GET_DEVICE_HDL = -128,
  DTU_ERROR_ALLOC_HOST = -129,
  DTU_ERROR_ALLOC_HBM = -130,
  DTU_ERROR_ALLOC_CLUSTER = -131,
  DTU_ERROR_FREE_HOST = -132,
  DTU_ERROR_FREE_HBM = -133,
  DTU_ERROR_FREE_CLUSTER = -134,
  DTU_ERROR_CMD_QUEUE_EMITED = -135,
  DTU_ERROR_OPEN_FILE = -136,
  DTU_ERROR_READ_FILE = -137,
  DTU_ERROR_WRITE_FILE = -138,
  DTU_ERROR_INVALID_BIN_TYPE = -139,
  DTU_ERROR_LOAD_BIN_FILE = -140,
  DTU_ERROR_LOAD_BIN_IMAGE = -141,
  DTU_ERROR_FUNCTION_NOT_FOUND = -142,
  DTU_ERROR_INVALID_OPERATION = -143,
  DTU_ERROR_EVENT_GET_ID = -144,
  DTU_ERROR_EVENT_WAIT_STATUS = -145,
  DTU_ERROR_EVENT_SIGNAL_STATUS = -146,
  DTU_ERROR_EVENT_TYPE = -147,
  DTU_ERROR_EVENT_NOT_SUBMIT = -148,
  DTU_ERROR_EVENT_DESTROYED = -149,
  DTU_ERROR_EVENT_SIGNAL_TWICE = -150,
  DTU_ERROR_MEMORY_OVERLAP = -151,
  DTU_ERROR_THREAD_POOL_QUEUE_OVERFLOW = -152,
  DTU_ERROR_PCI_BUS_SCAN = -153,
  DTU_ERROR_ALLOC_USERPTR = -154,
  DTU_ERROR_FREE_USERPTR = -155,
  DTU_ERROR_DUMP_CMEM = -156,
  DTU_ERROR_LOAD_CMEM = -157,
  DTU_ERROR_DUMP_SMEM = -158,
  DTU_ERROR_LOAD_SMEM = -159,
  DTU_ERROR_READ_REGISTERS = -160,
  DTU_ERROR_WRITE_REGISTERS = -161,
  DTU_ERROR_ALLOC_SIP = -162,
  DTU_ERROR_FREE_SIP = -163,
  DTU_ERROR_UNKNOWN = -164,
  DTU_ERROR_ALLOC_HUGE = -165,
  DTU_ERROR_INVALID_USR_IRQ_OBJ = -166,
  DTU_ERROR_LINK_CCIX_IO = -167,
  DTU_ERROR_PLACEHOLDER_NOT_FEED = -168,
  DTU_ERROR_LAUNCH_DMA = -169,
  DTU_ERROR_INVALID_PROFILE_MAGIC = -170,
  DTU_ERROR_INVALID_TIMESTAMP = -180,
  DTU_ERROR_INVALID_CONFIG = -181,
  DTU_ERROR_CHILD_NOT_SUBMIT = -182,
  DTU_ERROR_ALREADY_FORKED = -183,
  DTU_ERROR_LABEL_USED = -184,
  DTU_ERROR_LABEL_NOT_VALIDATED = -185,
  DTU_ERROR_COMMAND_TYPE_MISMATCH = -186,
  DTU_ERROR_VECTOR_NUMBER = -187,
  DTU_ERROR_VECTOR_FLAG_MISMATCH = -188,
  DTU_ERROR_DEVICE_RESET = -189,
  DTU_ERROR_EXECUTABLE_CRC_VERIFY = -190,
  DTU_ERROR_EXECUTABLE_DEVICE_VERIFY = -191,
  DTU_ERROR_INVALID_TS_OBJ = -192,
  DTU_ERROR_ALLOC_VDEV = -193,
  DTU_ERROR_FREE_VDEV = -194,
  DTU_ERROR_VDEV_BUSY = -195,
} dtu_status;
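
A reduced sketch of the rule: in C++ an int does not convert implicitly to an enum, while the enum still converts to int or bool. demo_status and its enumerators are hypothetical stand-ins for the much longer dtu_status, and allocate and failed are made-up helpers.

```cpp
#include <cassert>

// Reduced, hypothetical version of the status enum with two enumerators.
typedef enum demo_status_code {
  DEMO_SUCCESS = 0,
  DEMO_ERROR_OUT_OF_MEMORY = -102,
} demo_status;

inline demo_status allocate(bool ok) {
  // demo_status status = 0;  // ill-formed in C++: no implicit int -> enum
  demo_status status = DEMO_SUCCESS;
  if (!ok) status = DEMO_ERROR_OUT_OF_MEMORY;
  return status;
}

inline bool failed(demo_status s) {
  // The enum -> integer direction remains implicit, so `if (status)` in
  // the original code keeps compiling.
  return s != DEMO_SUCCESS;
}
```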

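NULL and 0 compare equal but are not interchangeable: NULL is a null pointer constant (in C typically (void*)0), whereas 0 is an int, so NULL should not be used to initialize an integer array.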

diff --git a/sdk/lib/umd/tests/sample/sample_run.cc b/sdk/lib/umd/tests/sample/sample_run.cc
index c5a3557c2a5..23e9563859b 100644
--- a/sdk/lib/umd/tests/sample/sample_run.cc
+++ b/sdk/lib/umd/tests/sample/sample_run.cc
@@ -35,7 +35,7 @@ void usage() {
  
 dtu_context ctx;
 dtu_cluster cluster[4] = {NULL};
-u32 cluster_id[4] = {NULL};
+u32 cluster_id[4] = {0};
 dtu_mem_handle cluster_mem[4] = {NULL};
 dtu_sip sip[32] = {NULL};

3.3.7. Implicit const conversion in a function prototype

diff --git a/sdk/lib/cpu/cpu_func_manager.cc b/sdk/lib/cpu/cpu_func_manager.cc
index 940bde5d91a..ec8967203c2 100644
--- a/sdk/lib/cpu/cpu_func_manager.cc
+++ b/sdk/lib/cpu/cpu_func_manager.cc
@@ -31,7 +31,7 @@ struct FunctionInvoker {
   }
   template <size_t... idx>
   void unpack(std::index_sequence<idx...> seq, const void* func, char** argvs) {
-    (*reinterpret_cast<void (*)(...)>(func))(argvs[idx]...);
+    (*reinterpret_cast<void (*)(...)>(const_cast<void*>(func)))(argvs[idx]...);
   }
 };
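
The pattern in the diff can be sketched in isolation. reinterpret_cast is not allowed to drop a const qualifier, so the qualifier is removed with const_cast first. UnaryFn, twice, erase, and invoke are hypothetical names, and the sketch assumes a platform where object pointers and function pointers can be converted into each other (the conversion is only conditionally supported by the standard, though common on mainstream toolchains).

```cpp
#include <cassert>

using UnaryFn = int (*)(int);

inline int twice(int x) { return 2 * x; }

// Hypothetical type-erased handle, as a function registry might store it.
inline const void* erase(UnaryFn fn) {
  return reinterpret_cast<const void*>(fn);
}

inline int invoke(const void* func, int arg) {
  // Step 1: const_cast strips the qualifier.
  // Step 2: reinterpret_cast restores the function pointer type.
  UnaryFn fn = reinterpret_cast<UnaryFn>(const_cast<void*>(func));
  return fn(arg);
}
```

Unlike the variadic void (*)(...) signature in the original code, the sketch casts back to the exact function type, which is the safer choice when the signature is known.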

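3.3.8. Implicit conversion from void* to char*

Many modules do arithmetic directly on void* pointers. The size of the object a void* points to is unknown, so adding to or subtracting from it as an address relies on an implicit void* → char* conversion (a compiler extension). clang does not allow this: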

diff --git a/sdk/tests/hlir/cc_tests/hlir_4c_add_test.cc b/sdk/tests/hlir/cc_tests/hlir_4c_add_test.cc
index 5b9c2dcc98f..7b568f934bb 100644
--- a/sdk/tests/hlir/cc_tests/hlir_4c_add_test.cc
+++ b/sdk/tests/hlir/cc_tests/hlir_4c_add_test.cc
@@ -66,8 +66,8 @@ static void Add4CTest(SimpleModuleOpBuilder::ShapeType &shape,
   executor.run(false);
  
   auto output_hanlde = executor.get_output(0);
-  T* result =
-      static_cast<T*>(output_hanlde->CPUPtr() + output_hanlde->offset());
+  T* result = static_cast<T*>(static_cast<void*>(
+      static_cast<char*>(output_hanlde->CPUPtr()) + output_hanlde->offset()));
   for (size_t i = 0; i < l_data.size(); ++i) {
     EXPECT_EQ(result[i], out_data[i]);
   }

Other similar changes:

sdk/tests/hlir/cc_tests/hlir_corner_test.cc

sdk/tests/hlir/cc_tests/hlir_press_test.cc

sdk/tests/tops/tops_dot_test.cc

sdk/tests/op/hlir/pavo/bert/hlir_broadcast_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_transpose_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_test_header.h

sdk/tests/op/hlir/pavo/resnet50/hlir_slice_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_select_and_scatter_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_select_and_scatter_non4c_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_pad_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_dynamic_update_slice_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_dynamic_slice_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_concat_test.cc

sdk/tests/op/hlir/pavo/resnet50/hlir_broadcast_test.cc

sdk/tests/op/hlir/pavo/dnn/hlir_test_header.h

sdk/tests/op/hlir/hlir_test_header.h

sdk/tests/runtime/executable_test.cc
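
Because the same cast sequence recurs in so many test files, a small helper might reduce the repetition. elementAt is a hypothetical name and is not part of the SDK; the sketch assumes the byte offset leaves the result suitably aligned for T.

```cpp
#include <cassert>
#include <cstddef>

// Advance a type-erased base pointer by a byte offset and view the result
// as T*. The arithmetic is done on char*, where the element size is 1.
template <typename T>
T* elementAt(void* base, std::size_t byte_offset) {
  char* bytes = static_cast<char*>(base) + byte_offset;
  return static_cast<T*>(static_cast<void*>(bytes));
}
```

With such a helper, the line in the diff would read roughly as elementAt<T>(handle->CPUPtr(), handle->offset()), assuming CPUPtr() returns a non-const void*.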

3.3.9. Implicit conversion from string to char*

diff --git a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
index 7e366337561..41fb573a562 100644
--- a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
+++ b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
@@ -23,7 +23,7 @@ KernelCode<T>::KernelCode(StringRef file)
     : compiled_(false), name_(file), module_("KernelModule", context_) {
   auto mb_or_err = MemoryBuffer::getFile(file);
   if (auto ec = mb_or_err.getError()) {
-    EF_PRINT(UmdMsg::UMD_CANNOT_OPEN_MSG, file.data(), ec.message());
+    EF_PRINT(UmdMsg::UMD_CANNOT_OPEN_MSG, file.data(), ec.message().c_str());
     EF_THROW_WITH << -1 << std::endl;
   }
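
The same rule applies to any printf-style variadic call: %s expects a const char*, and a std::string object cannot be passed through the ellipsis. The sketch below uses std::snprintf in place of the project's EF_PRINT macro, whose definition is not shown in the source; describeError is a hypothetical name.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

inline std::string describeError(const std::string& file,
                                 const std::string& message) {
  char buf[128];
  // c_str() hands the variadic function the const char* that %s expects.
  std::snprintf(buf, sizeof(buf), "cannot open %s: %s", file.c_str(),
                message.c_str());
  return buf;
}
```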

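3.4. Missing break in switch

3.4.1. break is semantically required: add the break

For example, in parser.hpp the case just before the final default branch had no break. Because the default branch is currently empty, behavior is unaffected, but any code added to the default branch later would cause a bug: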

diff --git a/3rdparty/inja/include/inja/parser.hpp b/3rdparty/inja/include/inja/parser.hpp
index 6266c4a0f74..466499ecc8b 100644
--- a/3rdparty/inja/include/inja/parser.hpp
+++ b/3rdparty/inja/include/inja/parser.hpp
@@ -296,7 +296,7 @@ class Parser {
           operator_stack.pop();
           function_stack.pop();
         }
-      }
+      } break;
       default:
         break;
       }

Other similar changes:

sdk/sdk.bzl

sdk/third_party/inja.patch

3.4.2. Fallthrough is intended: add a pragma so the compiler skips the check

This situation is quite common.

diff --git a/sdk/tests/runtime/chunk_allocator_test.cc b/sdk/tests/runtime/chunk_allocator_test.cc
index e63568ddc63..78896778a21 100644
--- a/sdk/tests/runtime/chunk_allocator_test.cc
+++ b/sdk/tests/runtime/chunk_allocator_test.cc
@@ -552,6 +552,10 @@ TEST_F(ChunkAllocatorTest, copy_constructor_test) {
     uint64_t offset0 = 0;
     uint64_t offset1 = 0;
  
+#if defined(__clang__)
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wimplicit-fallthrough"
+#endif
     switch (op) {
       case TestOpAllocTopDown:
       case TestOpAllocDownTop: {
@@ -590,6 +594,9 @@ TEST_F(ChunkAllocatorTest, copy_constructor_test) {
         }
       } break;
     }
+#if defined(__clang__)
+#pragma clang diagnostic pop
+#endif
   }
 }
 }  // namespace

Other similar changes:

sdk/tests/runtime/mem_manager_test.cc

sdk/tests/runtime/mem_pool_test.cc

sdk/tools/dtu_compiler/dtu_compiler.cc

sdk/lib/umd/tests/sample/tinyxmlparser.cc

In addition, C++17 introduced the fallthrough attribute, which is a simpler way to tell the compiler that a fallthrough is intentional: C++ attribute: fallthrough (since C++17) - cppreference.com
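
A minimal sketch of the attribute, using a hypothetical function countSteps. The standard spelling requires C++17; since this project builds as C++14, the vendor spelling [[clang::fallthrough]] may be the practical option under clang until the standard is raised.

```cpp
#include <cassert>

inline int countSteps(int start) {
  int steps = 0;
  switch (start) {
    case 0:
      ++steps;
      [[fallthrough]];  // intentional: case 0 also performs the case 1 step
    case 1:
      ++steps;
      break;
    default:
      break;
  }
  return steps;
}
```

Compared with the pragma push/pop pair, the attribute marks exactly one fallthrough and leaves the warning active for the rest of the switch.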

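3.5. Format string mismatches

3.5.1. Mismatch that does not affect behavior

A format string that disagrees with the arguments actually passed can cause serious problems. In the tops code, however, most cases use an ll conversion for a 64-bit argument, which has little practical effect today because both are 64 bits wide. If long long were ever wider than 64 bits (for instance 128 bits on some future platform), the mismatch could corrupt the stack.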

diff --git a/sdk/tests/runtime/chunk_allocator_test.cc b/sdk/tests/runtime/chunk_allocator_test.cc
index e63568ddc63..78896778a21 100644
--- a/sdk/tests/runtime/chunk_allocator_test.cc
+++ b/sdk/tests/runtime/chunk_allocator_test.cc
@@ -426,7 +426,7 @@ TEST_F(ChunkAllocatorTest, basic_stress_test) {
           if (allocated_size < allocated_size_pass) {
             char str_buf[256];
             snprintf(str_buf, sizeof(str_buf),
-                     "allocated_size: %llx, allocated_chunks.size(): %lu",
+                     "allocated_size: %lx, allocated_chunks.size(): %lu",
                      allocated_size, allocated_chunks.size());
             EXPECT_TRUE(false) << str_buf;
             break;

Other files:

sdk/lib/umd/tests/sample/mm_test.cc

sdk/include/driver/mem_handle.h

sdk/include/runtime/command_packet.h

sdk/tests/runtime/mem_pool_test.cc

sdk/lib/umd/tests/sample/performance_test.cc

sdk/tests/profile/test_zebu.cc

sdk/runtime/tests/top_scheduler/loop_task_utils.h

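3.5.2. Mismatched, and functionality is affected

The code below was meant to print the value that a uint16_t* points to, but it passed the pointer itself, so it printed an address rather than the value. Fortunately it is only a print statement. Even so, %hu makes snprintf read a 32-bit argument (an unsigned short promoted to int), while the pointer is 64 bits on a 64-bit machine, so the call is still undefined behavior and can corrupt the stack: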

diff --git a/sdk/include/runtime/command_packet.h b/sdk/include/runtime/command_packet.h
index a2d061e9117..5006a601cc0 100644
--- a/sdk/include/runtime/command_packet.h
+++ b/sdk/include/runtime/command_packet.h
@@ -362,7 +362,7 @@ struct CommandPacket {
    */
   static std::string MemberToString(uint16_t* p, std::string tab = "    ") {
     char buf[256];
-    snprintf(buf, sizeof(buf), "%hu", p);
+    snprintf(buf, sizeof(buf), "%hu", *p);
     return buf;
   }

3.6. Defined but unused

3.6.1. Unused variables

diff --git a/sdk/lib/umd/tests/sample/launch_code.cc b/sdk/lib/umd/tests/sample/launch_code.cc
index 1152a283052..708b1f44e7d 100644
--- a/sdk/lib/umd/tests/sample/launch_code.cc
+++ b/sdk/lib/umd/tests/sample/launch_code.cc
@@ -180,7 +180,6 @@ static void launch_code_with_cluster_check(void) {
   dtu_mem_handle param = cluster_mem[0];
   u64 param_off = A_B_SIZE + ONE_C_SIZE;
   u64 param_size = PARAM_TRUE_SIZE;
-  u16 launch_entry = 0;
   dtu_sip_mode_cfg_st mode;
   mode.mode_dw = 0x5070f10;
   LaunchKernelParameter parameter(sip[0], param, param_off, param_size, 0, mode,
@@ -363,7 +362,6 @@ static void launch_code_for_one_sip(void) {
   dtu_mem_handle param = cluster_mem[0];
   u64 param_off = A_B_SIZE + ONE_C_SIZE;
   u64 param_size = PARAM_TRUE_SIZE;
-  u16 launch_entry = 0;
   dtu_sip_mode_cfg_st mode;
   mode.mode_dw = 0x5070f10;
   LaunchKernelParameter parameter(sip[0], param, param_off, param_size, 0, mode,
@@ -537,7 +535,6 @@ static void launch_one_sip_twice(void) {
   dtu_mem_handle param = cluster_mem[0];
   u64 param_off = A_B_SIZE + TWO_C_SIZE;
   u64 param_size = PARAM_TRUE_SIZE;
-  u16 launch_entry = 0;
   dtu_sip_mode_cfg_st mode;
   mode.mode_dw = 0x5070f10;
   LaunchKernelParameter parameter[2];
@@ -719,11 +716,10 @@ static void _launch_code_for_eight_sip(int cid, bool check_result) {
   dtu_mem_handle param = cluster_mem[cid];
   u64 param_off = A_B_SIZE + EIGHT_C_SIZE;
   u64 param_size = PARAM_TRUE_SIZE;
-  u16 launch_entry = 0;
   dtu_sip_mode_cfg_st mode;
   mode.mode_dw = 0x5070f10;
   LaunchKernelParameter parameter[8];

Other files:

sdk/tests/spm/basic.cc

sdk/lib/spm/src/best_fit_policy.c

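3.6.2. Unused parameters

There are a great many of these, especially in third-party components, which additionally have to be changed by preparing dedicated patches. This warning is the main reason I eventually gave in and turned -Werror off: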

diff --git a/sdk/lib/umd/tests/sample/callback_multi_ctx_test.cc b/sdk/lib/umd/tests/sample/callback_multi_ctx_test.cc
index abd50ad4e81..1b39f594406 100644
--- a/sdk/lib/umd/tests/sample/callback_multi_ctx_test.cc
+++ b/sdk/lib/umd/tests/sample/callback_multi_ctx_test.cc
@@ -16,6 +16,7 @@
 #include "dtu/umd/dtu.h"
 #include "dtu/umd/dtu_base_obj.h"
 #include "dtu/umd/dtu_log.h"
+#include "dtu/umd/dtu_utils.h"
 #include "lib/umd/src/dtu_memory.h"
 #include "lib/umd/tests/sample/sample.h"
 #include "lib/umd/tests/sample/sample_assert.h"
@@ -26,6 +27,7 @@ std::mutex mtx;
  
 void event_callback_func_1(dtu_callback callback, void *user_data,
                            u32 engine_id) {
+  MAYBE_UNUSED(callback);
   std::unique_lock<std::mutex> lock(mtx);
   *(int *)user_data = 1;
   DTU_ERROR_LOG(TEST, "event callback_1 call[%d]\n", engine_id);
@@ -33,6 +35,7 @@ void event_callback_func_1(dtu_callback callback, void *user_data,
  
 void event_callback_func_2(dtu_callback callback, void *user_data,
                            u32 engine_id) {
+  MAYBE_UNUSED(callback);
   std::unique_lock<std::mutex> lock(mtx);
   *(int *)user_data = 1;
   DTU_ERROR_LOG(TEST, "event callback_2 call[%d]\n", engine_id);
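The definition of MAYBE_UNUSED is not shown in the diff. A common way to write such a macro, given here only as an assumed sketch (the real one in dtu/umd/dtu_utils.h may differ), is a cast to void. The callback on_event is hypothetical:

```cpp
// Assumed definition; the real macro lives in dtu/umd/dtu_utils.h and may
// be implemented differently.
#define MAYBE_UNUSED(x) (void)(x)

// Hypothetical callback: the signature is fixed by the caller, but this
// implementation only needs user_data.
void on_event(int callback_id, void* user_data, unsigned engine_id) {
  MAYBE_UNUSED(callback_id);
  MAYBE_UNUSED(engine_id);
  *static_cast<int*>(user_data) = 1;
}
```

Where the code can be built as C++17, the [[maybe_unused]] attribute on the parameter, or simply leaving the parameter unnamed, achieves the same thing without a macro.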

Other files:

tools/logging/lib/logging/log.cc

tools/logging/lib/logging/to/file.cc

tools/logging/lib/logging/to/std_err.cc

tools/logging/lib/util/signal_handler.cc

tools/logging/tests/logging/log_to_test.h

sdk/lib/umd/include/dtu_utils.h

sdk/lib/umd/include/reference_obj.h

3rdparty/protobuf-3.8.0/src/google/protobuf/arena.h

sdk/lib/umd/tests/sample/device_reset.cc

sdk/lib/umd/tests/sample/usr_irq.cc

sdk/lib/umd/tests/sample/callback_test.cc

3rdparty/protobuf-3.8.0/src/google/protobuf/map_type_handler.h

3rdparty/protobuf-3.8.0/src/google/protobuf/parse_context.h

kmd/utils/ktest/kmd-test.cpp

sdk/lib/spm/src/buddy_policy.c

sdk/lib/umd/include/dtu_command_obj.h

sdk/lib/umd/include/dtu_context_obj.h

sdk/lib/umd/include/dtu_dqm_obj.h

sdk/lib/umd/include/dtu_driver.h

system_test/tools/vpd_cycle/vpd_cycle.c

sdk/lib/spm/src/buddy_policy.c

sdk/lib/spm/src/interface.c

sdk/lib/spm/src/rbtree.c

sdk/lib/umd/include/dtu_device.h

sdk/lib/umd/include/dtu_driver.h

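In addition, the unused parameters in tools/logging/include/logging/check.h are a special case: they are actually meant to be used, but the wrong interface was called, so the information was lost along the way: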

diff --git a/tools/logging/include/logging/check.h b/tools/logging/include/logging/check.h
index eb856b7df85..67a667477f1 100644
--- a/tools/logging/include/logging/check.h
+++ b/tools/logging/include/logging/check.h
@@ -47,16 +47,16 @@
 #define EFCHECK_STRCASENE(s1, s2) EF_DTU_CHECK_STROP(strcasecmp, !=, false, s1, s2)
  
 #undef EFCHECK_NOTNULL
-#define EFCHECK_NOTNULL(val) \
-  ::ef_log::CheckNotNull(__FILE__, __LINE__, "'" #val "' Must be non NULL", (val))
-
+#define EFCHECK_NOTNULL(val)                                                \
+  ::ef_log::CheckNotNull(__FILE__, __LINE__, "'" #val "' Must be non NULL", \
+                         (val))
  
 namespace ef_log {
  
 template <typename T>
 T&& CheckNotNull(const char* file, int line, const char* exprtext, T&& t) {
   if (t == nullptr) {
-    EFLOG(FATAL) << std::string(exprtext);
+    ::ef_log::FatalLog(file, line) << std::string(exprtext);
   }
   return std::forward<T>(t);
 }

3.6.3. Unused labels

diff --git a/sdk/include/scheduler/cmd_packet_pass_util.h b/sdk/include/scheduler/cmd_packet_pass_util.h
index d56b5f362e8..1cc18ddc603 100644
--- a/sdk/include/scheduler/cmd_packet_pass_util.h
+++ b/sdk/include/scheduler/cmd_packet_pass_util.h
@@ -457,7 +457,6 @@ void MultiThreadDo(PacketGraph* graph, InitFuncS initf, ThreadFunc f,
   uninif(core_count);
  
   delete[] ptl;
-Exit0:
   return;
 }

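3.6.4. Unreachable code

The developers explained that the code below is not supported at present but that they do not want to delete it, so it is commented out for now: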

diff --git a/sdk/runtime/tests/top_scheduler/TimerTest.cc b/sdk/runtime/tests/top_scheduler/TimerTest.cc
index cb1e2269dd4..5aea6c1f956 100644
--- a/sdk/runtime/tests/top_scheduler/TimerTest.cc
+++ b/sdk/runtime/tests/top_scheduler/TimerTest.cc
@@ -127,12 +127,12 @@ TEST_F(TimerTest, Timer) {
     L3DMA = EngineType::Type::ODMA;
   } else if (IsPavoT20() || IsPavoT21()) {
     return;  // Need TS FW;
-    assembler = new ExecutableAssembler(TargetType::PAVO);
-    L3DMA = EngineType::Type::CDMA_LITE;
+    // assembler = new ExecutableAssembler(TargetType::PAVO);
+    // L3DMA = EngineType::Type::CDMA_LITE;
   } else if (IsDoradoI20() || IsDoradoI21()) {
     return;  // Need TS FW;
-    assembler = new ExecutableAssembler(TargetType::DORADO);
-    L3DMA = EngineType::Type::CDMA;
+    // assembler = new ExecutableAssembler(TargetType::DORADO);
+    // L3DMA = EngineType::Type::CDMA;
   } else {
     return;
   }

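sdk/tests/tops/tops_customop_upsample_nearest_test.cc also reports unreachable code, mainly because Co is currently a fixed value, so the outer if condition is always false. The loop in the else branch also handles the Co == 1 case, so the if branch can be removed entirely: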

diff --git a/sdk/tests/tops/tops_customop_upsample_nearest_test.cc b/sdk/tests/tops/tops_customop_upsample_nearest_test.cc
index 2b59c3fc0fc..cf19502b9b4 100644
--- a/sdk/tests/tops/tops_customop_upsample_nearest_test.cc
+++ b/sdk/tests/tops/tops_customop_upsample_nearest_test.cc
@@ -175,32 +175,17 @@ TEST_F(TopsTest, CustomCall_UpSample_Nearest_1) {
   int n_offset = Ho * Wo * Co;
   int h_offset = Wo * Co;
  
-  if (Co == 1) {
-    for (int n = 0; n < N; ++n) {
-      int n_offset = n * n_offset;
-      for (int h = 0; h < Ho; ++h) {
-        int h_index = h / scale_H;
-        for (int w = 0; w < Wo; ++w) {
-          int w_index = w / scale_W;
-          output_ref[n_offset + h * h_offset + w] =
-              image_data[n * Hi * Wi * Ci + h_index * Wi * Ci + w_index * Ci];
-        }
-      }
-    }
-
-  } else {
-    for (int n = 0; n < N; ++n) {
-      int n_offset = n * Ho * Wo * Co;
-      for (int h = 0; h < Ho; ++h) {
-        int h_index = h / scale_H;
-        for (int w = 0; w < Wo; ++w) {
-          int w_index = w / scale_W;
-          for (int c = 0; c < Co; ++c) {
-            int c_index = c / scale_C;
-            output_ref[n_offset + h * h_offset + w * Co + c] =
-                image_data[n * Hi * Wi * Ci + h_index * Wi * Ci + w_index * Ci +
-                           c_index];
-          }
+  for (int n = 0; n < N; ++n) {
+    int n_offset = n * Ho * Wo * Co;
+    for (int h = 0; h < Ho; ++h) {
+      int h_index = h / scale_H;
+      for (int w = 0; w < Wo; ++w) {
+        int w_index = w / scale_W;
+        for (int c = 0; c < Co; ++c) {
+          int c_index = c / scale_C;
+          output_ref[n_offset + h * h_offset + w * Co + c] =
+              image_data[n * Hi * Wi * Ci + h_index * Wi * Ci + w_index * Ci +
+                         c_index];
         }
       }
     }

3.6.5. Uncalled inline functions

diff --git a/sdk/lib/umd/tests/sample/memcpy_odma.cc b/sdk/lib/umd/tests/sample/memcpy_odma.cc
index 3cfc9777934..2b565c5e55e 100644
--- a/sdk/lib/umd/tests/sample/memcpy_odma.cc
+++ b/sdk/lib/umd/tests/sample/memcpy_odma.cc
@@ -9,6 +9,7 @@
  
 #include "dtu/umd/dtu.h"
 #include "dtu/umd/dtu_interface.h"
+#include "dtu/umd/dtu_utils.h"
 #include "lib/umd/tests/sample/sample.h"
 #include "lib/umd/tests/sample/sample_assert.h"
  
@@ -991,6 +992,7 @@ static void memcpy_host_to_hbm_mc_scan_sync(void) {
 }
 MAKE_SAMPLE_FROM_FUNCTION(memcpy_host_to_hbm_mc_scan_sync);
  
+#if 0
 static int odma_copy(dtu_mem_handle dst_hdl, u64 dst_offset,
                      dtu_mem_handle src_hdl, u64 src_offset, u64 size,
                      u32 engine_id) {
@@ -1034,6 +1036,7 @@ static int odma_copy(dtu_mem_handle dst_hdl, u64 dst_offset,
   dtu_command_queue_destroy(queue);
   return 0;
 }
+#endif
  
 #define MB (1 * 1024 * 1024)
 #if 0

Other files:

sdk/lib/spm/src/buddy_policy.c

sdk/lib/umd/tests/sample/mm_test.cc

3.6.6. Unused class declarations

diff --git a/dtu_backend/dtu_executor.h b/dtu_backend/dtu_executor.h
index 5149656537f..c361bb15e9d 100644
--- a/dtu_backend/dtu_executor.h
+++ b/dtu_backend/dtu_executor.h
@@ -50,7 +50,6 @@ class ClusterAllocation;
 }
  
 class DTUObject;
-class sr::TaskContext;
 class DTUExecutor : public ::xla::dtu::DTUExecutorInterface {
  public:
   typedef typename sr::TaskContext context_type;

3.6.7. Unused type definitions

diff --git a/sdk/tests/tops/tops_transform_parameter_test.cc b/sdk/tests/tops/tops_transform_parameter_test.cc
index ee7718e40f1..c7b99fb323d 100644
--- a/sdk/tests/tops/tops_transform_parameter_test.cc
+++ b/sdk/tests/tops/tops_transform_parameter_test.cc
@@ -483,7 +483,6 @@ TEST_P(TopsGraphTransformParameterTest, TopsConv) {
       break;
   }
  
-  typedef float D_TYPE;
   int inputdata_size = input_length * (sizeof(input_data[0]));
  
   topsMemory_t output_mem;

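3.7. Duplicate definitions

In the tops code base every module defines a great many macros of its own, so once the modules start including one another there are many duplicate definitions. The fundamental fix is to extract shared headers, but the modules currently do not want to depend on one another, so for now the definitions are wrapped in #ifndef as a temporary workaround: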

diff --git a/sdk/lib/umd/tests/sample/loop_task.cc b/sdk/lib/umd/tests/sample/loop_task.cc
index 2869f571029..099dcaf53f1 100644
--- a/sdk/lib/umd/tests/sample/loop_task.cc
+++ b/sdk/lib/umd/tests/sample/loop_task.cc
@@ -22,12 +22,14 @@
  
 using namespace std;
  
+#ifndef EFCHECK
 #define EFCHECK(__statement__)                                       \
   do {                                                               \
     sts = __statement__;                                             \
     if (sts != DTU_SUCCESS)                                          \
       failed_assertion("Failed:", __FILE__, __FUNCTION__, __LINE__); \
   } while (0)
+#endif
  
 template <int N>
 struct DataLayout {

Other files:

sdk/lib/umd/tests/sample/sample_assert.h

system_test/tools/vpd_cycle/vpd_cycle.c

3.8. Initializer order does not match declaration order

This kind of problem showed up only once:

diff --git a/sdk/include/factor/func.h b/sdk/include/factor/func.h
index af24b782ed1..50137f7c5dd 100644
--- a/sdk/include/factor/func.h
+++ b/sdk/include/factor/func.h
@@ -4163,10 +4163,10 @@ struct FACTOR_EXPORT ConvGenDescParams {
                     int64_t Co, int64_t R, int64_t S)
       : conv_type(conv_type),
         data_format(data_format),
-        stride(stride),
-        dailations(dailations),
         opt_level(opt_level),
         padding(padding),
+        stride(stride),
+        dailations(dailations),
         N(N),
         Hi(Hi),
         Wi(Wi),
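The reason the order matters: members are always initialized in the order they are declared in the class, whatever order the initializer list uses, and clang's -Wreorder flags a list that suggests otherwise. A minimal sketch with a hypothetical Buffer struct:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical struct used only to illustrate the rule.
struct Buffer {
  // Declaration order: size first, then data.
  std::size_t size;
  std::vector<int> data;

  // The initializer list follows the declaration order, so `data` can
  // safely depend on the already-initialized `size`.
  explicit Buffer(std::size_t n) : size(n), data(size, 0) {}
};
```

If data were declared before size, data(size, 0) would read size before it is initialized, which is exactly the kind of bug the warning is meant to surface.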

Other modified files:

sdk/tests/tops/tops_convert_parameter_test.cc

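3.9. Incomplete type declarations

clang reports an error when a class is only forward-declared and no complete definition can be found in the included headers at a point where the complete type is needed.

Working out the include order of the TensorFlow headers is very tedious. Fortunately clang can locate the header on its include path by itself, so the include is wrapped in a clang-only macro guard.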

diff --git a/sdk/lib/cpu/cpu_func_runtime_context.h b/sdk/lib/cpu/cpu_func_runtime_context.h
index 530b4def8ad..62ba4099e6e 100644
--- a/sdk/lib/cpu/cpu_func_runtime_context.h
+++ b/sdk/lib/cpu/cpu_func_runtime_context.h
@@ -23,6 +23,10 @@
 #include <tuple>
 #include <vector>
  
+#if defined(__clang__)
+#include "tensorflow/compiler/xla/service/cpu/simple_orc_jit.h"
+#endif
+
 namespace xla {
 namespace cpu {
 class SimpleOrcJIT;

3.10. Array initialization

3.10.1. Where a variable-length array is really required, use new[]() and delete[] to allocate and release the memory

diff --git a/sdk/lib/cpu_ops/naive/dot.cc b/sdk/lib/cpu_ops/naive/dot.cc
index f4bb6b7d877..be7ddb0ab23 100755
--- a/sdk/lib/cpu_ops/naive/dot.cc
+++ b/sdk/lib/cpu_ops/naive/dot.cc
@@ -31,9 +31,9 @@ void vectorMul_4_4(const int64_t M, const int64_t N, const int64_t K, outT* out,
     int64_t m_stride = (M - m) >= stride ? stride : (M - m);
     for (int64_t n = 0; n < N;) {
       int64_t n_stride = (N - n) >= stride ? stride : (N - n);
-      register outT out_reg[m_stride * n_stride] = {0};
-      register lhsT lhs_reg[m_stride];
-      register rhsT rhs_reg[n_stride];
+      register outT* out_reg = new outT[m_stride * n_stride]();
+      register outT* lhs_reg = new outT[m_stride]();
+      register outT* rhs_reg = new outT[n_stride]();
       for (int64_t i = 0; i < K; i++) {
         for (auto idx = 0; idx < m_stride; idx++) {
           lhs_reg[idx] = ELEMENT(lhs, m + idx, i, K);
@@ -53,6 +53,9 @@ void vectorMul_4_4(const int64_t M, const int64_t N, const int64_t K, outT* out,
         }
       }
       n += n_stride;
+      delete[] rhs_reg;
+      delete[] lhs_reg;
+      delete[] out_reg;
     }
     m += m_stride;
   }
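The patch uses new[]() and delete[]. As an alternative sketch (not what the patch does), std::vector gives the same run-time-sized, zero-initialized storage and releases it automatically, so no delete[] can be forgotten on an early exit. The helper SumTile is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: copies a tile whose size is only known at run time
// into scratch storage and sums it.
long SumTile(const int* src, std::size_t rows, std::size_t cols) {
  // Zero-initialized like `new long[rows * cols]()`, freed on scope exit.
  std::vector<long> scratch(rows * cols, 0);
  for (std::size_t i = 0; i < scratch.size(); ++i) {
    scratch[i] = src[i];
  }
  long total = 0;
  for (long v : scratch) {
    total += v;
  }
  return total;
}
```

In a hot inner loop like the one in dot.cc, either form allocates on every iteration, so hoisting the buffer out of the loop may be worth measuring.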

Other similar changes:

sdk/lib/umd/tests/sample/mm_test.cc

sdk/lib/cpu_ops/naive/dot.cc

sdk/lib/factor/codegen/macro_instruction/minst_conv2d_bpi.cc

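3.10.2. Where the array is semantically fixed-length, fix it by adding a const qualifier

This is very common in the tests: people are not in the habit of adding the const qualifier to array length definitions. Adding it not only can improve execution efficiency but also makes mistakes less likely.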

diff --git a/sdk/sample/batchnormalInference/tops_batchnormalInference.cc b/sdk/sample/batchnormalInference/tops_batchnormalInference.cc
index 8df67784dec..1db6440e22c 100644
--- a/sdk/sample/batchnormalInference/tops_batchnormalInference.cc
+++ b/sdk/sample/batchnormalInference/tops_batchnormalInference.cc
@@ -67,16 +67,16 @@ void topsBatchNormalInferenceNHWC() {
   topsTensorDescriptor_t xDesc;
   topsTensorDescriptor_t yDesc;
  
-  int x_c = 4;
-  int x_h = 2;
-  int x_n = 3;
-  int x_w = 2;
+  const int x_c = 4;
+  const int x_h = 2;
+  const int x_n = 3;
+  const int x_w = 2;
  
-  int scaleNums = x_c;
-  int y_c = x_c;
-  int y_h = x_h;
-  int y_n = x_n;
-  int y_w = x_w;
+  const int scaleNums = x_c;
+  const int y_c = x_c;
+  const int y_h = x_h;
+  const int y_n = x_n;
+  const int y_w = x_w;
  
   topsContext_t context;
   int clusters[] = {0};
@@ -90,7 +90,7 @@ void topsBatchNormalInferenceNHWC() {
   topsSetTensorDescriptor(yDesc, TOPS_TENSOR_NHWC, TOPS_DATA_FLOAT, y_n, y_c,
                           y_h, y_w);
  
-  int inputdata_num = x_c * x_h * x_n * x_w;
+  const int inputdata_num = x_c * x_h * x_n * x_w;
  
   D_TYPE InputData[inputdata_num] = {
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
@@ -170,16 +170,16 @@ void topsBatchNormalInferenceCHNW() {
   topsTensorDescriptor_t xDesc;
   topsTensorDescriptor_t yDesc;
  
-  int x_c = 4;
-  int x_h = 2;
-  int x_n = 3;
-  int x_w = 2;
+  const int x_c = 4;
+  const int x_h = 2;
+  const int x_n = 3;
+  const int x_w = 2;
  
-  int scaleNums = x_c;
-  int y_c = x_c;
-  int y_h = x_h;
-  int y_n = x_n;
-  int y_w = x_w;
+  const int scaleNums = x_c;
+  const int y_c = x_c;
+  const int y_h = x_h;
+  const int y_n = x_n;
+  const int y_w = x_w;
  
   topsContext_t context;
   int clusters[] = {0};
@@ -193,7 +193,7 @@ void topsBatchNormalInferenceCHNW() {
   topsSetTensorDescriptor(yDesc, TOPS_TENSOR_CHNW, TOPS_DATA_FLOAT, y_n, y_c,
                           y_h, y_w);
  
-  int inputdata_num = x_c * x_h * x_n * x_w;
+  const int inputdata_num = x_c * x_h * x_n * x_w;
  
   D_TYPE InputData[inputdata_num] = {
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
@@ -275,16 +275,16 @@ void topsBatchNormalInferenceBoundary() {
   topsTensorDescriptor_t xDesc;
   topsTensorDescriptor_t yDesc;
  
-  int x_c = 4;
-  int x_h = 2;
-  int x_n = 50;
-  int x_w = 2;
+  const int x_c = 4;
+  const int x_h = 2;
+  const int x_n = 50;
+  const int x_w = 2;
  
-  int scaleNums = x_c;
-  int y_c = x_c;
-  int y_h = x_h;
-  int y_n = x_n;
-  int y_w = x_w;
+  const int scaleNums = x_c;
+  const int y_c = x_c;
+  const int y_h = x_h;
+  const int y_n = x_n;
+  const int y_w = x_w;
  
   topsContext_t context;
   int clusters[] = {0};
@@ -298,7 +298,7 @@ void topsBatchNormalInferenceBoundary() {
   topsSetTensorDescriptor(yDesc, TOPS_TENSOR_CHNW, TOPS_DATA_FLOAT, y_n, y_c,
                           y_h, y_w);
  
-  int inputdata_num = x_c * x_h * x_n * x_w;
+  const int inputdata_num = x_c * x_h * x_n * x_w;
  
   D_TYPE InputData[inputdata_num];
   for (int i = 0; i < inputdata_num; i++) {
@@ -380,16 +380,16 @@ void topsBatchNormalInferenceScaleOffset() {
   topsTensorDescriptor_t xDesc;
   topsTensorDescriptor_t yDesc;
  
-  int x_c = 4;
-  int x_h = 2;
-  int x_n = 3;
-  int x_w = 2;
+  const int x_c = 4;
+  const int x_h = 2;
+  const int x_n = 3;
+  const int x_w = 2;
  
-  int scaleNums = x_c;
-  int y_c = x_c;
-  int y_h = x_h;
-  int y_n = x_n;
-  int y_w = x_w;
+  const int scaleNums = x_c;
+  const int y_c = x_c;
+  const int y_h = x_h;
+  const int y_n = x_n;
+  const int y_w = x_w;
  
   topsContext_t context;
   int clusters[] = {0};
@@ -403,7 +403,7 @@ void topsBatchNormalInferenceScaleOffset() {
   topsSetTensorDescriptor(yDesc, TOPS_TENSOR_NHWC, TOPS_DATA_FLOAT, y_n, y_c,
                           y_h, y_w);
  
-  int inputdata_num = x_c * x_h * x_n * x_w;
+  const int inputdata_num = x_c * x_h * x_n * x_w;
  
   D_TYPE InputData[inputdata_num] = {
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
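Why const is enough: a const int initialized from a constant expression can itself be used in constant expressions, so the array bound becomes a compile-time constant and the array is an ordinary fixed-length array that can be brace-initialized. The sketch below uses constexpr, which states that intent more explicitly; the names are hypothetical and only mirror the x_c / x_h / x_n / x_w pattern above:

```cpp
// Hypothetical dimensions mirroring the pattern in the diff above.
constexpr int kC = 4;
constexpr int kH = 2;
constexpr int kN = 3;
constexpr int kW = 2;
constexpr int kInputNum = kC * kH * kN * kW;

// With a compile-time bound this is a fixed-length array, so it can be
// brace-initialized; elements without an initializer are zero.
float g_input[kInputNum] = {2.0f, 2.0f};

static_assert(sizeof(g_input) / sizeof(g_input[0]) == 48,
              "array bound is a compile-time constant");
```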

There are many more; only the file names are listed:

sdk/sample/batchnormalTraining/tops_batchnormalTraining.cc

sdk/sample/broadcast/tops_broadcast.cc

sdk/sample/resnet50/TopsOpApi.cc

sdk/tests/tops/tops_batchnormalBackward_test.cc

sdk/tests/tops/tops_batchnormalTraining_test.cc

sdk/tests/tops/tops_concat_test.cc

sdk/tests/tops/tops_convert_test.cc

sdk/tests/tops/tops_customop_test.cc

sdk/tests/tops/tops_scatter_test.cc

sdk/tests/tops/tops_bnForwardTrainingEx_unit_test.cc (more than 1800 lines were changed in this file, which forced me to make it a separate patch)

sdk/tests/tops/tops_broadcast_test.cc

sdk/tests/tops/tops_concat_test.cc

sdk/tests/tops/tops_convert_test.cc

sdk/tests/tops/tops_descriptor_test.cc

sdk/tests/tops/tops_pad_test.cc

sdk/tests/tops/tops_scatter_test.cc

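3.11. auto in function prototypes

clang 9 rejects auto parameters in function prototypes (before C++20, GCC accepts them only as an extension). As I understand it, the main considerations are:

1. If the function is exposed as an interface, what argument types should the caller use?

2. If several call sites pass arguments of different types, will the function body trigger implicit type conversions when it processes the parameter? clang is strict about implicit conversions that lose information.

3. If the arguments at different call sites have different storage sizes, could the stack be corrupted? For example, if some callers pass int and others long, should the compiler instantiate two functions or just one?

4. How should the symbol name be generated when the function is lowered to a C-level function? The rules for mapping C++ function names to C names do not take auto parameters into account.

The auto parameter problem shows up mainly in the sdk/lib/tuner/pavo/ and sdk/tests/factor/targets/pavo/dnn/conv/ directories: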

diff --git a/sdk/lib/tuner/pavo/pavo_conv_dataflow3_bpi_non4c_impl.cc b/sdk/lib/tuner/pavo/pavo_conv_dataflow3_bpi_non4c_impl.cc
index dba48418fc2..2c59bc4eda6 100644
--- a/sdk/lib/tuner/pavo/pavo_conv_dataflow3_bpi_non4c_impl.cc
+++ b/sdk/lib/tuner/pavo/pavo_conv_dataflow3_bpi_non4c_impl.cc
@@ -31,8 +31,8 @@ namespace factor {
 using namespace hlir;
  
 static std::vector<std::vector<int64_t>> build_dim(
-    std::vector<int64_t> dim_count, auto cores_on_dim, auto sip_cord,
-    int64_t sip_num) {
+    std::vector<int64_t> dim_count, std::vector<int64_t> cores_on_dim,
+    std::vector<std::vector<int64_t>> sip_cord, int sip_num) {
   std::vector<int64_t> dim_count1 = {
       dim_count[0] / cores_on_dim[0], dim_count[1] / cores_on_dim[1],
       dim_count[2] / cores_on_dim[2], dim_count[3] / cores_on_dim[3]};
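The patch replaces auto with concrete types. Where a parameter really needs to stay generic, an explicit function template expresses the same thing in standard C++11/14 and answers the questions above directly: each distinct argument type gets its own instantiation and its own mangled name. The sketch below uses a hypothetical helper DivideDims, not a function from the code base:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a helper that previously took
// `auto cores_on_dim`. Cores can be any type indexable with operator[].
template <typename Cores>
std::vector<std::int64_t> DivideDims(
    const std::vector<std::int64_t>& dim_count, const Cores& cores_on_dim) {
  std::vector<std::int64_t> result;
  result.reserve(dim_count.size());
  for (std::size_t i = 0; i < dim_count.size(); ++i) {
    result.push_back(dim_count[i] / cores_on_dim[i]);
  }
  return result;
}
```

Taking the containers by const reference also avoids the copies that the by-value std::vector parameters in the patched signature make on every call.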

The other functions were changed in a similar way; only the file names are listed:

sdk/lib/tuner/pavo/pavo_conv_dataflow5_bpi_non4c_impl.cc
sdk/lib/tuner/pavo/pavo_conv_dataflow7_bpi_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow1_bpi_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow2_bpi_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow3_1_forward_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow5_1_forward_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow6_bpk_non4c_impl.cc

sdk/lib/tuner/pavo/pavo_conv_dataflow7_1_forward_non4c_impl.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_1c6s_bpi_dataflow1_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_bpk_1c4s_dataflow7_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_bpk_1c6s_dataflow6_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_ff_dataflow3_1_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_ff_dataflow5_1_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_ff_dataflow7_1_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_1c4s_bpi_dataflow2_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_1c4s_bpi_dataflow3_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_1c4s_bpi_dataflow5_template_test.cc

sdk/tests/factor/targets/pavo/dnn/conv/dnn_conv_gen_1c4s_bpi_dataflow7_template_test.cc

sdk/lib/ops/common/dtu_elementwise_fusion_impl.cc

sdk/tests/llir/dma_test/slice_dma_test.cc

sdk/tests/llir/dma_test/broadcast_dma_test.cc

sdk/tests/llir/dma_test/deslice_dma_test.cc

sdk/tests/llir/dma_test/mirror_dma_test.cc

sdk/tests/llir/dma_test/padding_dma_test.cc

sdk/tests/llir/dma_test/subsampling_dma_test.cc

sdk/tests/llir/dma_test/transpose_dma_test.cc

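3.12. The return value of strlen is not treated as a constant

clang 9 does not treat a call to strlen as a constant expression (GCC accepts it because it evaluates its strlen builtin at compile time). To use the length in a constexpr context you need to define your own constexpr function: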

diff --git a/sdk/lib/profile/topspti/reader/helper.h b/sdk/lib/profile/topspti/reader/helper.h
index c56502bef86..1d642e2431e 100644
--- a/sdk/lib/profile/topspti/reader/helper.h
+++ b/sdk/lib/profile/topspti/reader/helper.h
@@ -28,7 +28,6 @@
  
 #include <cstring>
 #include <string>
-
 #include "utils/utils.h"
  
 namespace topspti2 {
@@ -36,6 +35,10 @@ namespace topspti2 {
 #define TENSOR_MARK "!dtu.tensor<"
 #define TENSOR_MARK_SZ (sizeof(TENSOR_MARK) - 1)
  
+int constexpr CONSTEXPR_STRLEN(const char *str) {
+  return *str ? 1 + CONSTEXPR_STRLEN(str + 1) : 0;
+}
+
 static inline bool HasDPF(const std::string &product) {
   return (product != "" && product != "unknown" && product != "T10" &&
           product != "T11" && product != "T10s" && product != "I10");
@@ -123,7 +126,7 @@ static inline bool FastParseSizeFromTensor(const std::string &tensor,
   if (std::string::npos == pos) {
     return false;
   }
-  constexpr int tensor_mark_sz = strlen(TENSOR_MARK);
+  constexpr int tensor_mark_sz = CONSTEXPR_STRLEN(TENSOR_MARK);
   const char *data = tensor.c_str();
   while (pos != std::string::npos) {
     pos += tensor_mark_sz;
@@ -202,7 +205,7 @@ static inline bool FastParseSizeFromMemref(const std::string &memref,
   if (0 != pos) {
     return false;
   }
-  constexpr auto memref_mark_sz = strlen(MEMREF_MARK);
+  constexpr auto memref_mark_sz = CONSTEXPR_STRLEN(MEMREF_MARK);
   pos += memref_mark_sz;
   int64_t prod = 1;
   size_t lz = memref.size();
@@ -261,7 +264,7 @@ static inline bool ParseTensorInfoFromString(const std::string &input,
                                              TensorInfoValue &tiv) {
   tiv = TensorInfoValue();
   constexpr const char *const szstr = "size:";
-  constexpr int sz = strlen(szstr);
+  constexpr int sz = CONSTEXPR_STRLEN(szstr);
  
   if (input.size() > sz && !strncmp(input.c_str(), szstr, sz)) {
     tiv.size = stoll(input.substr(sz));
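The recursive CONSTEXPR_STRLEN in the patch works for both literals and constexpr pointers. For string literals and macros that expand to them, the length can also be read off the array type without any recursion. This is a sketch of that alternative, with a hypothetical helper name LiteralLength:

```cpp
#include <cstddef>

// Length of a string literal, taken from the array type. N counts the
// terminating '\0', hence the -1. Hypothetical helper, not part of the patch.
template <std::size_t N>
constexpr std::size_t LiteralLength(const char (&)[N]) {
  return N - 1;
}

#define TENSOR_MARK "!dtu.tensor<"
static_assert(LiteralLength(TENSOR_MARK) == 12, "evaluated at compile time");
```

This form does not accept a const char* such as szstr in ParseTensorInfoFromString, because the array type is lost once the literal is stored in a pointer; that case still needs the recursive version.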

Other similar changes:

sdk/lib/profile/libprofile_defs.h

3.13. Other syntax problems

3.13.1. Lambda syntax problems

See Lambda expressions (since C++11) - cppreference.com. The capture list of a lambda expression is described as follows:

a comma-separated list of zero or more captures, optionally beginning with a capture-default.

See below for the detailed description of captures.

A lambda expression can use a variable without capturing it if the variable

is a non-local variable or has static or thread local storage duration (in which case the variable cannot be captured), or
is a reference that has been initialized with a constant expression.

A lambda expression can read the value of a variable without capturing it if the variable

has const non-volatile integral or enumeration type and has been initialized with a constant expression, or
is constexpr and has no mutable members.

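In other words, no capture needs to be specified in the following cases:

1) Non-local variables (global variables)

2) static variables

3) thread-local variables (here it is not that a capture is unnecessary; the variable cannot be captured even if you try)

4) References initialized with a constant expression

5) const non-volatile integral or enumeration variables initialized with a constant expression (read-only access)

6) constexpr variables with no mutable members (read-only access)

The module_str used in sdk/tests/hlir/cc_tests/hlir_pass_manager_test.cc is a global variable and does not need to be captured. The original code compiles with gcc5, but gcc7 and clang reject it outright: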

diff --git a/sdk/tests/hlir/cc_tests/hlir_pass_manager_test.cc b/sdk/tests/hlir/cc_tests/hlir_pass_manager_test.cc
index 6f17f27ca4d..70506e38506 100644
--- a/sdk/tests/hlir/cc_tests/hlir_pass_manager_test.cc
+++ b/sdk/tests/hlir/cc_tests/hlir_pass_manager_test.cc
@@ -38,7 +38,7 @@ TEST(MTTest, PassMgr) {
   std::vector<std::thread> th_vec;
   th_vec.reserve(thread_count);
   for (size_t i = 0; i < thread_count; ++i) {
-    th_vec.emplace_back([&module_str]() {
+    th_vec.emplace_back([]() {
       mlir::MLIRContext context;
       mlir::OwningModuleRef module =
           mlir::parseSourceString(module_str, &context);
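A minimal sketch of the rule, with a hypothetical global standing in for module_str: the global is used inside the lambda directly, and only the local variable appears in the capture list.

```cpp
#include <cstddef>
#include <string>

// Hypothetical global, standing in for module_str in the test above.
const std::string g_module_str = "module {}";

std::size_t ModuleLengthTimes(std::size_t factor) {
  // `factor` is a local variable and must be captured. g_module_str has
  // static storage duration: it is used without a capture, and naming it
  // in the capture list would be an error.
  auto scaled = [factor]() { return g_module_str.size() * factor; };
  return scaled();
}
```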

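In the code below, the this pointer is listed in the capture but never used, which produces the warning “expression result unused [-Wunused-value]”. Specifying a capture is equivalent to declaring a variable inside the lambda, so an unused capture triggers a warning: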

diff --git a/tools/logging/tests/logging/test_log_old_api.cc b/tools/logging/tests/logging/test_log_old_api.cc
index 638a84cac6b..91cbf6b2676 100644
--- a/tools/logging/tests/logging/test_log_old_api.cc
+++ b/tools/logging/tests/logging/test_log_old_api.cc
@@ -18,7 +18,7 @@ class OldLogTest : public testing::Test {
     Test::SetUp();
     RegisterLogTo(this->pLog);
     pLog->setCallback(
-        [this](const std::string &msg) { std::cerr << msg << std::endl; });
+        [](const std::string &msg) { std::cerr << msg << std::endl; });
     pLog->SetAutoClear(true);
   }

Similarly, in sdk/lib/ops/common/dtu_scatter_impl.cc, listing the namespace-scope constant alignment in the capture list is also an error:

diff --git a/sdk/lib/ops/common/dtu_scatter_impl.cc b/sdk/lib/ops/common/dtu_scatter_impl.cc
index 032c19e6ef9..99058631d4c 100644
--- a/sdk/lib/ops/common/dtu_scatter_impl.cc
+++ b/sdk/lib/ops/common/dtu_scatter_impl.cc
@@ -92,7 +92,7 @@ bool predicate_func(int64_t i) {
  
 // alloc_ memory with alignment of 128 byte.
 const uint32_t alignment = 128;
-auto GetAlignedSize = [alignment](uint64_t size) {
+auto GetAlignedSize = [](uint64_t size) {
   return (size + alignment - 1) / alignment * alignment;
 };

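3.13.2. std::move calls in return statements

Using std::move in a return statement disables the compiler's copy elision. For the code before the change below, clang reports the warning “moving a local object in a return statement prevents copy elision [-Wpessimizing-move]”. What is copy elision?

Copy elision - cppreference.com defines it as follows: Omits copy and move (since C++11) constructors, resulting in zero-copy pass-by-value semantics.

In other words, without std::move the compiler tries to omit the copy or move of the object during the return, giving a zero-copy result; with std::move the compiler is forced to call the object's move constructor. The latter is clearly more expensive.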

diff --git a/tools/logging/tests/logging/log_to_test.h b/tools/logging/tests/logging/log_to_test.h
index de91f49b34d..0c92e2acd8a 100644
--- a/tools/logging/tests/logging/log_to_test.h
+++ b/tools/logging/tests/logging/log_to_test.h
@@ -21,7 +21,7 @@ class LogToString : public LogDestination {
     if (autoClear_) {
       Clear();
     }
-    return std::move(ret);
+    return ret;
   }
   void SetAutoClear(bool autoClear) { autoClear_ = autoClear; }
   void Clear() { str_.clear(); }
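The difference can be observed with a type that counts its constructor calls. This is only a sketch with a hypothetical Tracked type; whether the plain return is fully elided depends on the compiler and flags, so the comments describe what is permitted rather than a guaranteed count:

```cpp
#include <utility>

// Hypothetical type that counts how it is copied or moved.
struct Tracked {
  static int copies;
  static int moves;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Plain return: the compiler may construct `t` directly in the caller's
// storage (NRVO). If it does not, `t` is still moved, never copied.
Tracked MakePlain() {
  Tracked t;
  return t;
}

// std::move turns the operand into an xvalue, which rules out NRVO and
// forces at least one call to the move constructor.
Tracked MakeMoved() {
  Tracked t;
  return std::move(t);
}
```

Comparing Tracked::moves after calling each function shows that the std::move version never does better than the plain one.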

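3.13.3. Using an uninitialized object

In the version of sdk/tests/runtime/device_manager_test.cc before the change, if result.ok() is false, cluster is never initialized but is still passed to device->ClusterMemoryHandle() afterwards, which can have very bad consequences: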

diff --git a/sdk/tests/runtime/device_manager_test.cc b/sdk/tests/runtime/device_manager_test.cc
index cf9075367a7..0adf469da8b 100644
--- a/sdk/tests/runtime/device_manager_test.cc
+++ b/sdk/tests/runtime/device_manager_test.cc
@@ -109,14 +109,13 @@ TEST_F(DeviceManagerTest, ClusterMemoryHandle_SuccessFail) {
   dtu::driver::DeviceManager* device = dtu::driver::DeviceManager::instance();
   device->AcquireDevice(0);
   dtu::StatusOr<dtu_cluster> result = device->Cluster(0, 0);
-  dtu_cluster cluster;
   if (result.ok()) {
-    cluster = std::move(result.ValueOrDie());
-    EXPECT_NE(cluster, nullptr);
+    dtu_cluster cluster = std::move(result.ValueOrDie());
+    dtu::StatusOr<dtu_mem_handle> result1 =
+        device->ClusterMemoryHandle(cluster);
   } else {
     EFLOG(FATAL) << "Get ClusterIds error: " << result.status();
   }
-  dtu::StatusOr<dtu_mem_handle> result1 = device->ClusterMemoryHandle(cluster);
   EXPECT_EQ(result.ok(), true);
   EXPECT_NE(result.ValueOrDie(), nullptr);
   device->ReleaseCluster(0, 0);

3.13.4. clang disallows initializing arrays with a parenthesized expression

For the code before the change below, clang reports "parenthesized initialization of a member array is a GNU extension [-Wgnu-array-member-paren-init]", and gcc reports the warning "list-initializer for non-class type must not be parenthesized":

diff --git a/sdk/tests/tops/tops_broadcast_parameter_test.cc b/sdk/tests/tops/tops_broadcast_parameter_test.cc
index 831d9d23791..321c56171f3 100644
--- a/sdk/tests/tops/tops_broadcast_parameter_test.cc
+++ b/sdk/tests/tops/tops_broadcast_parameter_test.cc
@@ -139,14 +139,18 @@ class TopsBroadcastParameterTest
 };
  
 TopsBroadcastParameterTest::TopsBroadcastParameterTest()
-    : x_desc_dim({GetParam().x.h, GetParam().x.w}),
-      y_desc_dim(
-          {GetParam().y.n, GetParam().y.c, GetParam().y.h, GetParam().y.w}),
-      broadcast_dims(
-          {GetParam().broadcast_dim.dim_1, GetParam().broadcast_dim.dim_2}),
-      input_length(GetParam().x.h * GetParam().x.w),
+    : input_length(GetParam().x.h * GetParam().x.w),
       output_length(GetParam().y.n * GetParam().y.c * GetParam().y.h *
-                    GetParam().y.w) {}
+                    GetParam().y.w) {
+  x_desc_dim[0] = GetParam().x.h;
+  x_desc_dim[1] = GetParam().x.w;
+  y_desc_dim[0] = GetParam().y.n;
+  y_desc_dim[1] = GetParam().y.c;
+  y_desc_dim[2] = GetParam().y.h;
+  y_desc_dim[3] = GetParam().y.w;
+  broadcast_dims[0] = GetParam().broadcast_dim.dim_1;
+  broadcast_dims[1] = GetParam().broadcast_dim.dim_2;
+}
  
 void TopsBroadcastParameterTest::freeDebugInfo() {
   if (input_mem == nullptr) {
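The patch moves the assignments into the constructor body. Another option, sketched here as an assumption rather than what the patch does, is to keep the initializer list and use braces instead of parentheses, which is standard list-initialization for member arrays since C++11. The Dims struct is hypothetical:

```cpp
// Hypothetical struct with array members, mirroring x_desc_dim / y_desc_dim.
struct Dims {
  int x_desc_dim[2];
  int y_desc_dim[4];

  // Braces instead of parentheses: standard list-initialization of member
  // arrays in the mem-initializer list.
  Dims(int h, int w, int n, int c)
      : x_desc_dim{h, w}, y_desc_dim{n, c, h, w} {}
};
```

One caveat: list-initialization rejects narrowing conversions, so if the values returned by GetParam() are wider than the array element type, the braces would need explicit casts.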

Similar changes were also made in:

sdk/tests/tops/tops_dot_parameter_test.cc

sdk/tests/tops/tops_pad_parameter_test.cc

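3.13.5. clang instantiates template functions only when a corresponding call exists

Because the constructor is never called in the sdk's own code, libdtu_sdk.so contains no symbol for it. The test functions need it, however, so a stub function had to be added to force the constructor to be instantiated.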

diff --git a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
index 7e366337561..41fb573a562 100644
--- a/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
+++ b/sdk/lib/umd/tools/kernel_code_processor/kernel_code.cc
@@ -248,4 +248,13 @@ vector<string> KernelCode<T>::LinkArgs() {
   return args;
 }
  
+// stab function for undefined reference to
+// 'dtu_umd::KernelCode<dtu_umd::PavoKernel>::KernelCode(llvm::StringRef)'
+void kernel_code_stab() {
+  StringRef file_name = "stab_file";
+  KernelCode<PavoKernel> k_stab1(file_name);
+  KernelCode<DoradoKernel> k_stab2(file_name);
+  KernelCode<LeoKernel> k_stab3(file_name);
+}
+
 }  // namespace dtu_umd
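The language also has a direct way to request this without a stub function: an explicit instantiation definition. The sketch below uses simplified, hypothetical stand-ins for KernelCode and the kernel tag types (std::string instead of llvm::StringRef), so it shows the mechanism rather than the real class:

```cpp
#include <string>

// Hypothetical stand-ins for the kernel tag types and the class template.
struct PavoKernel {};
struct DoradoKernel {};

template <typename T>
class KernelCode {
 public:
  explicit KernelCode(const std::string& file_name);
  const std::string& file_name() const { return file_name_; }

 private:
  std::string file_name_;
};

// The member definition lives in the .cc file, so other translation units
// cannot instantiate it implicitly.
template <typename T>
KernelCode<T>::KernelCode(const std::string& file_name)
    : file_name_(file_name) {}

// Explicit instantiation definitions: emit all members for these types into
// this object file, with no stub function needed to trigger them.
template class KernelCode<PavoKernel>;
template class KernelCode<DoradoKernel>;
```

Compared with the stub, this instantiates every member of the class rather than only the constructor, and leaves no dead function in the library.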

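3.13.6. clang does not allow complex objects that need memory management inside constexpr functions

The template below creates a new vector object whose constructor has to manage memory. Left unchanged, it fails with “variable of non-literal type 'std::vector<size_t>' (aka 'vector<unsigned long>') cannot be defined in a constexpr function”. Removing the constexpr specifier from the template fixes it.

Section 3.9/10 of the C++ standard gives the definition of a literal type (roughly, a constant or a simple variable). The vector in unpack_seq_to_vector is neither a simple variable nor an array of simple variables. Switching to array should work, but every caller of the function would have to change: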

A type is a literal type if it is:

void; or
a scalar type; or
a reference type; or
an array of literal type; or
a class type (Clause 9) that has all of the following properties:
- it has a trivial destructor,
- it is an aggregate type (8.5.1) or has at least one constexpr constructor or constructor template that is not a copy or move constructor, and
- all of its non-static data members and base classes are of non-volatile literal types

diff --git a/sdk/tests/hlir/cc_tests/hlir_utils_test.cc b/sdk/tests/hlir/cc_tests/hlir_utils_test.cc
index bdc21e3f317..24e37191fc4 100644
--- a/sdk/tests/hlir/cc_tests/hlir_utils_test.cc
+++ b/sdk/tests/hlir/cc_tests/hlir_utils_test.cc
@@ -152,7 +152,7 @@ TEST(HlirUtilTest, ConstSplatValue) {
 }
  
 template <size_t... Idx>
-constexpr static auto unpack_seq_to_vector(hlir::IndexSeq<Idx...>) {
+static auto unpack_seq_to_vector(hlir::IndexSeq<Idx...>) {
   std::vector<size_t> ret = {Idx...};
   return ret;
 }
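
The array-based alternative mentioned above could look roughly like this. It is a sketch, not the change that was merged: `std::index_sequence` (C++14) is assumed here as a stand-in for `hlir::IndexSeq`, and `unpack_seq_to_array` is a hypothetical name.

```cpp
#include <array>
#include <cstddef>
#include <utility>

// std::array of a scalar type is a literal type, so the function can stay
// constexpr. std::index_sequence stands in for hlir::IndexSeq (assumption).
template <std::size_t... Idx>
constexpr std::array<std::size_t, sizeof...(Idx)> unpack_seq_to_array(
    std::index_sequence<Idx...>) {
  return {{Idx...}};
}

// Evaluated entirely at compile time.
static_assert(unpack_seq_to_array(std::index_sequence<2, 3, 5>{})[1] == 3, "");
```

The return type now encodes the element count, which is why callers that expect a `std::vector<size_t>` would need to be adapted.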

3.13.7. clang wants an explicit override keyword on functions that override a virtual function

diff --git a/tools/logging/include/logging/to/file.h b/tools/logging/include/logging/to/file.h
index bdda687afdc..c6d39779bde 100644
--- a/tools/logging/include/logging/to/file.h
+++ b/tools/logging/include/logging/to/file.h
@@ -18,7 +18,7 @@ class LogToFile : public LogDestination {
   DISALLOW_COPY_AND_ASSIGN(LogToFile);
  
   static pointer Create(const std::string &file_name);
-  void Message(int level, const std::string &message);
+  void Message(int level, const std::string &message) override;
   void Flush() override;
  
  private:

Other similar changes:

tools/logging/include/logging/to/std_err.h
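
In the diff above, Flush() was already marked override while Message() was not. That mix is what clang's -Winconsistent-missing-override diagnostic reports, which is most likely the warning involved here. A minimal sketch of the consistent form, using hypothetical classes `LogDestinationSketch` and `LogToMemorySketch` rather than the real logging headers:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical base class with the same two virtual functions.
class LogDestinationSketch {
 public:
  virtual ~LogDestinationSketch() = default;
  virtual void Message(int level, const std::string &message) = 0;
  virtual void Flush() = 0;
};

// Every overriding function carries 'override', so a signature mismatch
// becomes a compile error instead of silently declaring a new function.
class LogToMemorySketch : public LogDestinationSketch {
 public:
  void Message(int level, const std::string &message) override {
    pending_.push_back(std::to_string(level) + ":" + message);
  }
  void Flush() override {
    flushed_ += pending_.size();
    pending_.clear();
  }
  std::size_t flushed() const { return flushed_; }

 private:
  std::vector<std::string> pending_;
  std::size_t flushed_ = 0;
};
```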

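3.13.8. alignas misuse

alignas exists to improve access efficiency when defining a struct by placing it on a larger address boundary; it is not the same concept as pack in C. pack(1) can be forced on any object to guarantee that no padding shifts the members, whereas the value given to alignas must not be smaller than the largest alignment required by any member of the struct: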

The object or the type declared by such a declaration will have its alignment requirement equal to the strictest (largest) non-zero expression of all alignas specifiers used in the declaration, unless it would weaken the natural alignment of the type.

The structs defined below contain a uint16_t member, so the smallest valid alignas value is 2 and alignas(1) cannot be used:

diff --git a/sdk/lib/hlir/utils/types.h b/sdk/lib/hlir/utils/types.h
index 87aee25fe31..90cabe7bdb6 100644
--- a/sdk/lib/hlir/utils/types.h
+++ b/sdk/lib/hlir/utils/types.h
@@ -151,13 +151,13 @@ enum class CompareType {
  
 // define raw data type
 // lower to factor need raw data
-struct alignas(1) raw_bf16_ty {
+struct alignas(2) raw_bf16_ty {
   uint16_t data;
 };
 static_assert(sizeof(raw_bf16_ty) == 2, "");
  
 // half
-struct alignas(1) raw_fp16_ty {
+struct alignas(2) raw_fp16_ty {
   uint16_t data;
 };
 static_assert(sizeof(raw_fp16_ty) == 2, "");
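
The difference between alignas and pack can be checked with a few static assertions. This is a sketch with hypothetical struct names; the numbers assume a mainstream ABI where `alignof(std::uint16_t)` is 2, and `#pragma pack` is a compiler extension supported by clang, GCC and MSVC rather than standard C++.

```cpp
#include <cstdint>

// alignas may keep or raise the natural alignment, never lower it.
struct alignas(2) raw16_sketch {
  std::uint16_t data;
};

// Raising the alignment also grows the size to a multiple of it.
struct alignas(8) raw16_wide_sketch {
  std::uint16_t data;
};

// pack(1) is the tool for removing padding and lowering alignment.
#pragma pack(push, 1)
struct packed_sketch {
  std::uint8_t tag;
  std::uint16_t data;
};
#pragma pack(pop)

static_assert(alignof(raw16_sketch) == 2, "natural alignment kept");
static_assert(sizeof(raw16_wide_sketch) == 8, "size padded to alignment");
static_assert(sizeof(packed_sketch) == 3, "no padding between members");
```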

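3.14. Cleanups made along the way while fixing warnings

3.14.1. Redundant computation

tools/logging/lib/logging/log_message.cc was originally touched to fix the initialization of a variable-length array. Reading the code showed that the seconds and microseconds of the timeval were first combined into a total microsecond count that was never used as such, and were then split straight back into seconds and microseconds. The conversion therefore served no purpose, and the redundant computation was removed after confirming with the code owner.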

diff --git a/tools/logging/lib/logging/log_message.cc b/tools/logging/lib/logging/log_message.cc
index 77fa33fe129..8d4cd48b348 100644
--- a/tools/logging/lib/logging/log_message.cc
+++ b/tools/logging/lib/logging/log_message.cc
@@ -25,18 +25,15 @@ std::string LogMessage::GenerateMessage() {
   std::stringstream os;
   struct timeval tv;
   gettimeofday(&tv, nullptr);
-  uint64_t now_micros = static_cast<uint64_t>(tv.tv_sec) * 1000000 + tv.tv_usec;
-  time_t now_seconds = static_cast<time_t>(now_micros / 1000000);
-  int32_t micros_remainder = static_cast<int32_t>(now_micros % 1000000);
   const size_t time_buffer_size = 50;
-  struct tm now_time = {0};
-  char time_buffer[time_buffer_size];
-  localtime_r(&now_seconds, &now_time);
+  struct tm now_time = tm();
+  char time_buffer[time_buffer_size]={0};
+  localtime_r(&tv.tv_sec, &now_time);
   strftime(time_buffer, time_buffer_size, "%Y-%m-%d %H:%M:%S", &now_time);
  
   os << time_buffer << ".";
   os.width(6);
-  os << micros_remainder << ": ";
+  os << tv.tv_usec << ": ";
   os << "DIWEF"[severity_];
   if(msg_code_) {
     os << msg_code_;
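
The simplified timestamp logic can be isolated into a small function for experimentation. This is a sketch, not the project's code: `FormatTimestampSketch` is a hypothetical name, it takes the timeval as a parameter so it can be tested, and it zero-pads the microseconds, which is an assumption about the intended output (the patched code sets a width of 6 but leaves the default space fill).

```cpp
#include <sys/time.h>
#include <time.h>

#include <iomanip>
#include <sstream>
#include <string>

// Formats a timeval as "YYYY-MM-DD HH:MM:SS.uuuuuu" in local time.
std::string FormatTimestampSketch(const timeval &tv) {
  const time_t seconds = tv.tv_sec;  // use the seconds field directly
  struct tm now_time = {};
  char time_buffer[50] = {0};
  localtime_r(&seconds, &now_time);
  strftime(time_buffer, sizeof(time_buffer), "%Y-%m-%d %H:%M:%S", &now_time);

  std::ostringstream os;
  // Assumption: zero fill, so 42 microseconds prints as "000042".
  os << time_buffer << "." << std::setw(6) << std::setfill('0') << tv.tv_usec;
  return os.str();
}
```

A caller would fill the timeval with `gettimeofday(&tv, nullptr)` as the original code does.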

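3.14.2. Redundant comparison of a reference's address with a null pointer

A reference always refers to an existing object, so in well-defined code its address is never null, and comparing that address with nullptr is meaningless: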

diff --git a/tools/logging/lib/logging/log_module.cc b/tools/logging/lib/logging/log_module.cc
index f40e13d6fea..3ea150b37a0 100644
--- a/tools/logging/lib/logging/log_module.cc
+++ b/tools/logging/lib/logging/log_module.cc
@@ -27,10 +27,6 @@ LogModuleMgr &LogModuleMgr::Instance() {
 }
  
 void LogModuleMgr::UpdateModuleMaskFromEnv(const std::string &env) {
-  if (&env == nullptr) {
-    return;
-  }
-
   EFLOG(DBG) << "Init Logging Module" << std::endl;
   EFLOG(DBG) << "ENFLAME_LOG_DEBUG_MOD = " << env << std::endl;
   auto tokens = strutil::split(env, ',');
@@ -91,4 +87,4 @@ void LogModuleMgr::SetModuleOff(EF_LOG_MOD module) {
   mod_status_[static_cast<int>(module)] = false;
 }
  
-} // namespace dtu
\ No newline at end of file
+} // namespace dtu
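
If a value really can be absent, the null check belongs where a pointer is still in play, before it is turned into a reference. A minimal sketch, assuming the string comes from an environment variable; `EnvOrEmptySketch` is a hypothetical helper, not part of the logging library:

```cpp
#include <cstdlib>
#include <string>

// std::getenv returns a pointer that may be null, so this is the place
// to test it. Callers then receive a real std::string and can pass it
// by const reference without any further null check.
std::string EnvOrEmptySketch(const char *name) {
  const char *value = std::getenv(name);
  return value == nullptr ? std::string() : std::string(value);
}
```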
