Quick tip: TFA log collection fails with an insufficient-space error


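Today, while analyzing a node eviction at a customer site, I found that TFA was installed, so I used its one-step collection to gather the logs covering the time of the failure: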

tfactl diagcollect -from "2020-08-14 03:00:00" -to "2020-08-14 05:00:00" -all

The collection failed with an insufficient-space error:

Not enough space in Repository or TFA_BASE to run collections

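df showed plenty of free space in the corresponding directory. The error actually comes from the Maximum Size (MB) setting of the TFA repository, which usually defaults to 10 GB. In this customer's environment osw data had been retained for too long, so the repository had already grown past that limit, and the collection was refused for lack of space.
The MOS note TFA Diagcollection Reports "Not enough space in Repository or TFA_BASE to run collections" (Doc ID 2300038.1)
gives a clear solution: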

  1. tfactl set reposizeMB=10240
  2. tfactl print repository

Notably, the repository location can be changed using tfactl set repositorydir=

Following the MOS note, check the current value first, then set one that suits the environment. Note that these commands must be run as root:

tfactl print repository
tfactl set reposizeMB=20480
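
To pick a suitable value rather than guess, it helps to know how much the repository directory already holds. The following is only a rough sketch: the helper names dir_usage_mb and suggest_reposize_mb are hypothetical, not part of TFA, and du is used as an approximation that may not match how TFA itself accounts for repository usage. The path in the example is also an assumption; take the real location from tfactl print repository.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TFA.

# Print the disk usage of a directory in whole MB (rounded up).
# du -sk is POSIX and reports kilobytes.
dir_usage_mb() {
    kb=$(du -sk "$1" | awk '{ print $1 }')
    echo $(( (kb + 1023) / 1024 ))
}

# Print a suggested limit: current usage plus headroom,
# rounded up to the next multiple of 1024 MB.
suggest_reposize_mb() {
    used_mb=$1
    headroom_mb=$2
    echo $(( (used_mb + headroom_mb + 1023) / 1024 * 1024 ))
}

# Example (the repository path below is an assumption):
#   used=$(dir_usage_mb /u01/app/grid/tfa/repository)
#   tfactl set reposizeMB=$(suggest_reposize_mb "$used" 5120)
```

Purging old osw data so that the repository fits under the existing limit is the other option; raising the limit only makes sense if the filesystem really has the room.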

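In an extreme case, where the filesystem holding the repository really is short of space, you can point the repository at another directory that still has room: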

mkdir /tmp/repository
tfactl set repositorydir=/tmp/repository
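
Before moving the repository, it is worth confirming that the target filesystem has enough free space for the limit you intend to set. A minimal sketch follows; free_mb and has_room are hypothetical helper names of my own, and the check relies only on the POSIX df -Pk output format.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TFA.

# Print the free space, in whole MB, of the filesystem holding the given path.
# df -P guarantees one line per filesystem; column 4 is "Available" in KB.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed (exit status 0) if the filesystem has at least the required MB free.
has_room() {
    [ "$(free_mb "$1")" -ge "$2" ]
}

# Example: only relocate if /tmp can hold a 20480 MB repository.
#   if has_room /tmp 20480; then
#       mkdir /tmp/repository
#       tfactl set repositorydir=/tmp/repository
#   fi
```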

Then run the TFA collection again:

tfactl diagcollect -from "2020-08-14 03:00:00" -to "2020-08-14 05:00:00" -all

This time the required logs are collected successfully:

[root@db01 grid]# tfactl diagcollect -from "2020-08-14 03:00:00" -to "2020-08-14 05:00:00" -all
The -all switch is being deprecated as collection of all components is the default behavior. TFA will continue to collect all components.
Collecting data for all nodes
Scanning files from aug/14/2020 03:00:00 to aug/14/2020 05:00:00

Collection Id : 20200814235440db01

Detailed Logging at : /tmp/repository/collection_Fri_Aug_14_23_54_41_CST_2020_node_all/diagcollect_20200814235440_db01.log
2020/08/14 23:54:51 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/08/14 23:54:51 CST : Collection Name : tfa_Fri_Aug_14_23_54_41_CST_2020.zip
2020/08/14 23:54:51 CST : Collecting diagnostics from hosts : [db01, db02]
2020/08/14 23:54:52 CST : Scanning of files for Collection in progress...
2020/08/14 23:54:52 CST : Collecting additional diagnostic information...
2020/08/14 23:55:37 CST : Getting list of files satisfying time range [08/14/2020 03:00:00 CST, 08/14/2020 05:00:00 CST]
2020/08/14 23:55:50 CST : Collecting ADR incident files...
2020/08/14 23:56:49 CST : Completed collection of additional diagnostic information...
2020/08/14 23:56:50 CST : Completed Local Collection
2020/08/14 23:56:50 CST : Remote Collection in Progress...
.---------------------------------.
|        Collection Summary       |
+------+-----------+-------+------+
| Host | Status    | Size  | Time |
+------+-----------+-------+------+
| db02 | Completed | 803kB | 128s |
| db01 | Completed | 1.2MB | 118s |
'------+-----------+-------+------'

Logs are being collected to: /tmp/repository/collection_Fri_Aug_14_23_54_41_CST_2020_node_all
/tmp/repository/collection_Fri_Aug_14_23_54_41_CST_2020_node_all/db01.tfa_Fri_Aug_14_23_54_41_CST_2020.zip
/tmp/repository/collection_Fri_Aug_14_23_54_41_CST_2020_node_all/db02.tfa_Fri_Aug_14_23_54_41_CST_2020.zip

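This run was a demonstration in a test environment with little data, so the archives are small; in a real production environment the zip files are usually larger.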

