References:
https://en.wikipedia.org/wiki/Inductive_bias
http://blog.sina.com.cn/s/blog_616684a90100emkd.html
Tom M. Mitchell. Machine Learning. McGraw-Hill, 1997.
mokuram (mokuram) wrote on Tue Jan 4 05:22:24 2005:

It is the bias a learner carries with it while learning.
(That wording is not very precise.)
Take decision tree classifiers: many decision tree learners adopt an inductive bias such as Occam's razor,
that is, among all the decision trees that can solve the problem, choose the simplest one.
For a detailed discussion of this question, see Tom Mitchell's MACHINE LEARNING
(the Chinese edition is sold domestically).
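To make "choose the simplest" concrete: a minimal sketch, assuming scikit-learn is available; the toy dataset and the depth-by-depth search are illustrative choices, not Mitchell's construction:

    from sklearn.tree import DecisionTreeClassifier

    X = [[0, 0], [0, 1], [1, 0], [1, 1]]
    y = [0, 1, 1, 1]                     # a toy OR-style concept

    # Grow trees of increasing depth and keep the first one that is
    # consistent with (classifies correctly) all training examples.
    for depth in range(1, 5):
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
        tree.fit(X, y)
        if tree.score(X, y) == 1.0:
            print("simplest consistent tree has depth", tree.get_depth())
            break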
faiut (繁星滿天) wrote on Tue Jan 4 10:25:09 2005:

My grasp of this concept is always fuzzy;
I still cannot describe it in my own words.
jueww (覺·Hayek) wrote on Tue Jan 4 13:02:01 2005:

I like to use the word "preference" for it.
Roughly on a par with notions like model complexity.
mokuram (mokuram) wrote on Tue Jan 4 13:53:03 2005:

歸納偏置 is the standard term for it; the English is inductive bias.
jueww (覺·Hayek) wrote on Tue Jan 4 20:16:58 2005:

But rendering it in Chinese as 偏置 seems wrong...
It is in bias-and-variance analysis that translations like 偏置 or 偏離 would be about right.
mokuram (mokuram) wrote on Wed Jan 5 00:46:37 2005:

That is how Mr. Zeng Huajun (曾華軍) translated it in his Chinese edition of Tom's MACHINE LEARNING.
MACHINE LEARNING is a very well-known textbook abroad,
and Mr. Zeng's translation is decent.
jueww (覺·Hayek) wrote on Wed Jan 5 10:05:22 2005:

However you put it, this comes down to personal preference; nobody takes papers written in Chinese seriously anyway.
Translating a term properly really requires a thorough command of the field's vocabulary in both Chinese and English.
偏置 is a term from electronics, so it invites misunderstanding.
bias has more than one meaning in ML, which is muddled enough in English; otherwise you would not have this question.
If Chinese could express the two meanings with different characters, wouldn't that be better?
Back to the topic: I have not studied bias in the textbooks as carefully as you have. I never found Tom's book all that useful, so I never read it closely; I have guessed the word's meaning entirely from the contexts in which it appears in the literature.
I feel that substituting "model complexity" or "representation ability" for bias generally causes no problem, but now that you press the question, I find I don't actually know what the thing is...
I just searched the web, and everything suddenly became clear, heh:
Informally speaking, the inductive bias of a machine learning algorithm refers to additional assumptions that the learner will use to predict correct outputs for situations that have not been encountered so far.

In machine learning one aims at the construction of algorithms that are able to learn to predict a certain target output. For this the learner will be presented a limited number of training examples that demonstrate the intended relation of input and output values. After successful learning, the learner is supposed to approximate the correct output, even for examples that have not been shown during training. Without any additional assumptions, this task cannot be solved, since unseen situations might have an arbitrary output value. The kind of necessary assumptions about the nature of the target function are subsumed in the term inductive bias. A classical example of an inductive bias is Occam's razor, assuming that the simplest consistent hypothesis about the target function is actually the best. Here consistent means that the hypothesis of the learner yields correct outputs for all of the examples that have been given to the algorithm.
Approaches to a more formal definition of inductive bias are based on mathematical logic. Here, the inductive bias is a logical formula that, together with the training data, logically entails the hypothesis generated by the learner. Unfortunately, this strict formalism fails in many practical cases, where the inductive bias can only be given as a rough description (e.g. in the case of neural networks).
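For reference, this logical reading is roughly the one Mitchell's MACHINE LEARNING (ch. 2) makes precise; paraphrasing his definition: the inductive bias of a learner L is any minimal set of assertions B such that, for every target concept c and training examples D_c,

    \forall x_i \in X:\ (B \wedge D_c \wedge x_i) \vdash L(x_i, D_c)

where L(x_i, D_c) denotes the classification L assigns to instance x_i after training on D_c.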
Basically the same as what I had guessed...
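The "arbitrary output value" point above is easy to check by brute force. A minimal self-contained sketch in Python (the three training points and the truth-table encoding of hypotheses are illustrative choices, not anything from the thread):

    from itertools import product

    inputs = list(product([0, 1], repeat=2))      # all 4 possible inputs
    train = {(0, 0): 0, (0, 1): 1, (1, 0): 1}     # (1, 1) is never seen

    # With no assumptions, a "hypothesis" is any truth table over the inputs.
    consistent = [h for h in product([0, 1], repeat=4)
                  if all(h[inputs.index(x)] == y for x, y in train.items())]

    unseen = inputs.index((1, 1))
    print(len(consistent))                  # 2 hypotheses fit the data
    print([h[unseen] for h in consistent])  # [0, 1]: the unseen output is arbitrary

    # Occam's razor as a bias: prefer the "simplest" consistent hypothesis,
    # here crudely measured by the number of 1s in its truth table.
    print(min(consistent, key=sum))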
NeuroNetwork (刮開有獎:=>███████) wrote on Wed Jan 5 13:30:58 2005:

These two biases are not the same thing at all.
NeuroNetwork (刮開有獎:=>███████) wrote on Wed Jan 5 14:26:11 2005:

A DT's bias is first disjunctive probability similarity, and only then "the shorter, the better".
ihappy (人似秋鴻來有信) wrote on Thu Jan 6 10:13:19 2005:

This actually got marked?
Isn't that misleading people?
The English passage is indeed correct, and "translating a term properly really requires a thorough command of the field's vocabulary in both Chinese and English" is also right; everything else is wrong.
bias and model complexity / representation ability are entirely different things.
jueww (覺·Hayek) wrote on Thu Jan 6 13:06:18 2005:

They are different, sure. But I feel they are much the same kind of thing.
What both are ultimately getting at is a model's ability to generalize;
they are the same thing conceptualized from different angles.
It is just that bias can be stated more concretely when tied to a specific classification algorithm.
But when bias is discussed in the abstract,
I really do not see what bias adds beyond model representation ability.
Please enlighten me.

the inductive bias of a machine learning algorithm refers to additional assumptions, that the learner will use to predict correct outputs for situations that have not been encountered so far.

This "additional assumption", as I read it, is just the model's representational capacity; the difference is that
bias is phrased from the standpoint of the learning algorithm,
while representation is phrased from the standpoint of the classification model.
Mitchell and Dietterich like to say bias, while Vapnik prefers model complexity.
faiut (繁星滿天) wrote on Thu Jan 6 22:21:10 2005:

The concept used to be hazy for me; after reading your exposition everything suddenly cleared up.
Thanks.
jueww (覺·Hayek) wrote on Thu Jan 6 22:28:13 2005:

Heh. Helping each other out, why not? Besides, anyone who has really worked on something
runs into the same things, many of which are in no book... you can only work them out yourself.
That is how development goes, and I suspect so-called research is the same.
ihappy (人似秋鴻來有信) wrote on Fri Jan 7 01:04:02 2005:

Actually Mitchell's book covers this part very well.
First, he gives an example showing that any bias-free learner is fruitless and cannot classify any unseen sample. In other words, a learner without bias has no generalizability whatsoever. This is different from model complexity: choosing an unsuitable model complexity may merely weaken the generalization ability, which still exists.
So this so-called inductive bias is your PRIOR assumption about the learner.
The English word bias is apt here; as for what the Chinese should be, I have not found anything suitable myself; of the renderings I know so far, 偏置 is usable.
Second, inductive bias is closely related to Occam's razor, because the usual prior assumption, i.e. the inductive bias, is to adopt Occam's razor, or in other words to choose a model of suitably small complexity; but the two are not equivalent. For example, candidate elimination's inductive bias is that a solution exists (i.e. the version space is non-empty); a decision tree's inductive bias is shorter trees (this roughly corresponds to model complexity) plus placing attributes with high information gain closer to the root (this is not model complexity).
Third, inductive bias is mainly a concept, and of little practical use: beyond a handful of simple learners, one can hardly state what any other learner's inductive bias is, and it offers little guidance for real applications. But for machine learning researchers this concept must be understood clearly, along with its difference from model complexity.
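The contrast drawn in the first and second points can be made concrete. A rough sketch in Python (the conjunctive hypothesis space and the three training points are a toy construction, not from the book): a rote, bias-free learner can answer nothing unseen, while candidate elimination, whose bias is that the target concept lies in H, classifies an unseen point whenever every hypothesis in its version space agrees:

    from itertools import product

    inputs = list(product([0, 1], repeat=3))
    train = {(0, 0, 0): 0, (1, 1, 1): 1, (1, 0, 1): 1}

    # Bias-free rote learner: it can only repeat what it has memorized.
    def rote(x):
        return train.get(x)               # None on every unseen input

    # H: conjunctive hypotheses "label 1 iff x >= h componentwise".
    H = list(product([0, 1], repeat=3))
    def predict(h, x):
        return int(all(xi >= hi for xi, hi in zip(x, h)))

    # Candidate elimination keeps every hypothesis consistent with the data.
    version_space = [h for h in H
                     if all(predict(h, x) == y for x, y in train.items())]

    # E.g. the unseen point (0, 1, 0) is classified 0 unanimously,
    # while the rote learner can say nothing about it.
    for x in inputs:
        votes = {predict(h, x) for h in version_space}
        vs = votes.pop() if len(votes) == 1 else "undecided"
        print(x, "rote:", rote(x), " version space:", vs)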
jueww (覺·Hayek) wrote on Fri Jan 7 01:23:55 2005:

Duly noted. But I still do not get it, and I do not feel I need to...
When the literature uses bias to refer to different classifiers and compares them, my reading is that the intent is to compare their complexity, representation ability, and generalization ability, whereas you hold that such examples are not about comparisons across classifiers. The reality, though, is that the literature does use bias as a blanket term for all kinds of classifiers.
Below are the title and abstract of one paper. If it were a prior, could it still be "controlled"? Substitute model complexity instead and it reads perfectly naturally.
Control of inductive bias in supervised learning using evolutionary computation: a wrapper-based approach
William H. Hsu, Kansas State University
In: Data Mining: Opportunities and Challenges, pp. 27-54, 2003. ISBN 1-59140-051-1.
In this chapter, I discuss the problem of feature subset selection for supervised inductive learning approaches to knowledge discovery in databases (KDD), and examine this and related problems in the context of controlling inductive bias. I survey several combinatorial search and optimization approaches to this problem, focusing on data-driven, validation-based techniques. In particular, I present a wrapper approach that uses genetic algorithms for the search component, using a validation criterion based upon model accuracy and problem complexity, as the fitness measure. Next, I focus on design and configuration of high-level optimization systems (wrappers) for relevance determination and constructive induction, and on integrating these wrappers with elicited knowledge on attribute relevance and synthesis. I then discuss the relationship between this model selection criterion and those from the minimum description length (MDL) family of learning criteria. I then present results on several synthetic problems on task-decomposable machine learning and on two large-scale commercial data-mining and decision-support projects: crop condition monitoring, and loss prediction for insurance pricing. Finally, I report experiments using the Machine Learning in Java (MLJ) and Data to Knowledge (D2K) Java-based visual programming systems for data mining and information visualization, and several commercial and research tools. Test set accuracy using a genetic wrapper is significantly higher than that of decision tree inducers alone and is comparable to that of the best extant search-space based wrappers.
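The wrapper idea in this abstract boils down to scoring candidate feature subsets by validated model accuracy. A schematic sketch, assuming scikit-learn is available; Hsu's chapter searches the subsets with a genetic algorithm, whereas this toy version simply enumerates them, and the synthetic dataset and decision-tree base learner are stand-ins:

    from itertools import combinations
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=200, n_features=6,
                               n_informative=3, random_state=0)

    def fitness(subset):
        # Validation-based criterion, standing in for the paper's
        # accuracy-plus-complexity fitness measure.
        model = DecisionTreeClassifier(random_state=0)
        return cross_val_score(model, X[:, list(subset)], y, cv=5).mean()

    # Exhaustive search here; the chapter uses a genetic algorithm instead.
    subsets = [s for k in range(1, 7) for s in combinations(range(6), k)]
    best = max(subsets, key=fitness)
    print("best subset:", best, "cv accuracy:", round(fitness(best), 3))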