RankLib Parameter Translation


Before the parameter list, the metrics accepted by -metric2t, explained:

NDCG (Normalized Discounted Cumulative Gain): DCG divided by IDCG, the DCG of the ideal ranking
CG (Cumulative Gain)
DCG (Discounted Cumulative Gain)
MAP (Mean Average Precision)
MRR (Mean Reciprocal Rank)
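
As a rough illustration of the metrics listed above, here is a minimal Python sketch. It assumes the exponential gain 2^rel - 1 with a log2(rank + 1) discount for DCG, and treats any label greater than 0 as relevant for AP and RR. These are common conventions and are not guaranteed to match RankLib's implementation in every detail. All function names are hypothetical.

```python
import math

def cumulative_gain(labels, k):
    """CG@k: plain sum of the top-k relevance labels (no position discount)."""
    return sum(labels[:k])

def dcg_at_k(labels, k):
    """DCG@k over labels given in ranked order (assumed gain: 2^rel - 1)."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(labels[:k]))

def ndcg_at_k(labels, k):
    """NDCG@k = DCG@k / IDCG@k, where IDCG is the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0

def average_precision(labels):
    """AP for one ranked list; MAP is the mean of AP over all queries."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(labels, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def reciprocal_rank(labels):
    """RR for one ranked list; MRR is the mean of RR over all queries."""
    for rank, rel in enumerate(labels, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0
```

Each function takes the relevance labels of one query's documents in the order the ranker returned them, so a perfectly sorted list such as [3, 2, 0] yields an NDCG of 1.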

Usage: java -jar RankLib.jar <Params>
Params:
[+] Training (+ tuning and evaluation)
-train <file> Training data
-ranker <type> Ranking algorithm to use
0: MART (gradient boosted regression tree)
1: RankNet
2: RankBoost
3: AdaRank
4: Coordinate Ascent
6: LambdaMART
7: ListNet
8: Random Forests
[ -feature <file> ] Feature description file: lists the features to learn from. If not specified, all features are used by default.
[ -metric2t <metric> ] Metric to optimize on the training data. Supported: MAP, NDCG@k, DCG@k, P@k, RR@k, ERR@k (default=ERR@10)
[ -gmax <label> ] Highest judged relevance label. It affects the calculation of ERR (default=4, i.e. 5-point scale {0,1,2,3,4})
[ -silent ] Do not print progress messages (which are printed by default)
[ -validate <file> ] Specify if you want to tune your system on the validation data (default=unspecified). If specified, the final model will be the one that performs best on the validation data
[ -tvs <x \in [0..1]> ] Set train-validation split to be (x)(1.0-x): a fraction x of the data is used for training and the remaining 1.0-x for validation
[ -save <model> ] Save the learned model to the specified file (default=not-save)
[ -test <file> ] Specify if you want to evaluate the trained model on this data (default=unspecified)
[ -tts <x \in [0..1]> ] Set train-test split to be (x)(1.0-x): a fraction x of the data is used for training and the remaining 1.0-x for testing. -tts will override -tvs
[ -metric2T <metric> ] Metric to evaluate on the test data (defaults to the metric specified for -metric2t)
[ -norm <method> ] Normalize feature vectors (default=no-normalization). Method can be:
sum: normalize each feature by the sum of all its values
zscore: normalize each feature by its mean/standard deviation
linear: normalize each feature by its min/max values
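
The three normalization methods above can be sketched in Python as follows. Each function takes the values of a single feature across a set of documents. Whether RankLib applies this per query or over the whole data set, how it handles degenerate cases (zero sum, zero variance, min equal to max), and whether it uses the population or the sample standard deviation are all assumptions in this sketch; the function names are hypothetical.

```python
import statistics

def normalize_sum(values):
    """sum: divide each value by the sum of all values of the feature."""
    total = sum(values)
    # Assumption: leave the values unchanged when the sum is zero.
    return [v / total for v in values] if total != 0 else list(values)

def normalize_zscore(values):
    """zscore: subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std, an assumption
    # Assumption: a constant feature maps to all zeros.
    return [(v - mean) / std for v in values] if std > 0 else [0.0] * len(values)

def normalize_linear(values):
    """linear: rescale the feature to [0, 1] using its min and max."""
    lo, hi = min(values), max(values)
    # Assumption: a constant feature maps to all zeros.
    return [(v - lo) / (hi - lo) for v in values] if hi > lo else [0.0] * len(values)
```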
[ -kcv <k> ] Specify if you want to perform k-fold cross validation using ONLY the specified training data (default=NoCV)
-tvs can be used to further reserve a portion of the training data in each fold for validation
[ -kcvmd <dir> ] Directory for models trained via cross-validation (default=not-save)
[ -kcvmn <model> ] Name for model learned in each fold. It will be prefixed with the fold number (default=empty)
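
To show how the training options above combine on one command line, here is a small Python sketch that assembles the argument list for `java -jar RankLib.jar`. The helper name `build_train_command`, its defaults, and the file names train.txt, vali.txt and model.txt are hypothetical; only the flags themselves come from the listing above.

```python
def build_train_command(train_file, ranker=6, metric="NDCG@10",
                        validate_file=None, test_file=None, save_path=None,
                        jar="RankLib.jar"):
    """Assemble an argv list for `java -jar RankLib.jar <Params>` (training)."""
    cmd = ["java", "-jar", jar,
           "-train", train_file,
           "-ranker", str(ranker),   # 6 = LambdaMART in the list above
           "-metric2t", metric]
    if validate_file:
        cmd += ["-validate", validate_file]
    if test_file:
        cmd += ["-test", test_file]
    if save_path:
        cmd += ["-save", save_path]
    return cmd

# Hypothetical file names, for illustration only.
cmd = build_train_command("train.txt", validate_file="vali.txt",
                          save_path="model.txt")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the Python standard library, provided Java and the RankLib jar are available on the machine.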
[-] RankNet-specific parameters
[ -epoch <T> ] The number of epochs to train (default=100)
[ -layer <layer> ] The number of hidden layers (default=1)
[ -node <node> ] The number of hidden nodes per layer (default=10)
[ -lr <rate> ] Learning rate (default=0.00005)
[-] RankBoost-specific parameters
[ -round <T> ] The number of rounds to train (default=300)
[ -tc <k> ] Number of threshold candidates to search. -1 to use all feature values (default=10)
[-] AdaRank-specific parameters
[ -round <T> ] The number of rounds to train (default=500)
[ -noeq ] Train without enqueuing too-strong features (default=unspecified)
[ -tolerance <t> ] Tolerance between two consecutive rounds of learning (default=0.002)
[ -max <times> ] The maximum number of times a feature can be consecutively selected without changing performance (default=5)
[-] Coordinate Ascent-specific parameters
[ -r <k> ] The number of random restarts (default=5)
[ -i <iteration> ] The number of iterations to search in each dimension (default=25)
[ -tolerance <t> ] Performance tolerance between two solutions (default=0.001)
[ -reg <slack> ] Regularization parameter (default=no-regularization)
[-] {MART, LambdaMART}-specific parameters
[ -tree <t> ] Number of trees (default=1000)
[ -leaf <l> ] Number of leaves for each tree (default=10)
[ -shrinkage <factor> ] Shrinkage, or learning rate (default=0.1)
[ -tc <k> ] Number of threshold candidates for tree splitting. -1 to use all feature values (default=256)
[ -mls <n> ] Min leaf support -- minimum #samples each leaf has to contain (default=1)
[ -estop <e> ] Stop early when no improvement is observed on validation data in e consecutive rounds (default=100)
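
The early-stopping rule described by -estop can be sketched as below. This is a simplified illustration and not RankLib's code: it assumes a higher validation score is better, that one score is produced per boosting round, and that "no improvement" means the score did not exceed the best seen so far. The function name is hypothetical.

```python
def best_round_with_early_stop(validation_scores, estop):
    """Return (best_round, rounds_run) for a sequence of per-round scores.

    Training stops once `estop` consecutive rounds have passed without
    the validation score exceeding the best score seen so far.
    """
    best_score = float("-inf")
    best_round = -1
    rounds_run = 0
    for round_idx, score in enumerate(validation_scores):
        rounds_run += 1
        if score > best_score:
            best_score, best_round = score, round_idx
        elif round_idx - best_round >= estop:
            break  # estop rounds in a row without improvement
    return best_round, rounds_run
```

With scores [0.1, 0.3, 0.2, 0.25, 0.29, 0.5] and estop=3, the loop stops after the fifth round and never reaches the final 0.5, which shows why a very small -estop value can cut training short.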
[-] Random Forests-specific parameters
[ -bag <r> ] Number of bags (default=300)
[ -srate <r> ] Sub-sampling rate (default=1.0)
[ -frate <r> ] Feature sampling rate (default=0.3)
[ -rtype <type> ] Ranker to bag (default=0, i.e. MART)
[ -tree <t> ] Number of trees in each bag (default=1)
[ -leaf <l> ] Number of leaves for each tree (default=100)
[ -shrinkage <factor> ] Shrinkage, or learning rate (default=0.1)
[ -tc <k> ] Number of threshold candidates for tree splitting. -1 to use all feature values (default=256)
[ -mls <n> ] Min leaf support -- minimum #samples each leaf has to contain (default=1)
[ -estop <e> ] Stop early when no improvement is observed on validation data in e consecutive rounds (default=100)
[+] Testing previously saved models
-load <model> The model to load
-test <file> Test data to evaluate the model (specify either this or -rank but not both)
-rank <file> Rank the samples in the specified file (specify either this or -test but not both)
[ -metric2T <metric> ] Metric to evaluate on the test data (default=ERR@10)
[ -gmax <label> ] Highest judged relevance label. It affects the calculation of ERR (default=4, i.e. 5-point scale {0,1,2,3,4})
[ -score <file> ] Store ranker's score for each object being ranked (has to be used with -rank)
[ -idv ] Print model performance (in test metric) on individual ranked lists (has to be used with -test)
[ -norm ] Normalize feature vectors (similar to -norm for training/tuning)
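
The -gmax option matters because ERR (Expected Reciprocal Rank) converts each relevance label into a probability that the user is satisfied, and that conversion depends on the highest possible label. The sketch below follows the commonly cited definition, R = (2^g - 1) / 2^gmax; treating this as RankLib's exact computation is an assumption, and the function name is hypothetical.

```python
def err_at_k(labels, k, gmax=4):
    """ERR@k for relevance labels given in ranked order.

    Assumes the satisfaction probability R = (2^g - 1) / 2^gmax.
    """
    err = 0.0
    p_continue = 1.0  # probability the user was not satisfied by earlier results
    for rank, g in enumerate(labels[:k], start=1):
        r = (2 ** g - 1) / (2 ** gmax)
        err += p_continue * r / rank
        p_continue *= 1.0 - r
    return err
```

For a single document with label 4, this gives 15/16 when gmax is 4 but only 15/32 when gmax is 5, so a -gmax value that does not match the label scale of the data changes the reported ERR.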

