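Author: 桂
Time: 2017-03-16 20:30:20
Link: http://www.cnblogs.com/xingshansi/p/6561536.html
Notice: Reposting is welcome; please remember to credit the source.
Preface
This post is part of a series on curve and distribution fitting. It covers the theoretical derivation and code implementation of fitting common distributions such as the normal and Laplace distributions.
I. Theoretical Derivation
Assume the data are independent and identically distributed. For any data point $x_i$ with probability density $f(x_i)$, the likelihood function is: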
$J = \mathop \prod \limits_{i = 1}^N f({x_i})$
Writing it in terms of the parameters $\theta$ and taking the logarithm gives the log-likelihood:
$L\left( \theta \right) = \ln J\left( \theta \right) = \sum\limits_{i = 1}^N {\ln f({x_i};\theta )} $
A - Normal distribution
For the normal distribution:
$f(x) = \frac{1}{{\sqrt {2\pi } \sigma }}{e^{ - \frac{{{{(x - \mu )}^2}}}{{2{\sigma ^2}}}}}$
Setting the partial derivatives of the log-likelihood to zero gives the parameter estimates:
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{x_i}} }}{N}$
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {{{\left( {{x_i} - \hat \mu } \right)}^2}} }}{N} = \frac{{{{\left( {{\bf{x}} - \hat \mu } \right)}^T}\left( {{\bf{x}} - \hat \mu } \right)}}{N}$
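For reference, the intermediate steps can be written out as follows. This is a sketch of the standard derivation under the i.i.d. assumption stated earlier, with the log-likelihood taken as the sum of $\ln f(x_i)$ and differentiated with respect to $\mu$ and $\sigma^2$:

```latex
\begin{aligned}
L(\mu,\sigma^2) &= -\frac{N}{2}\ln\left(2\pi\sigma^2\right)
                   - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2 \\
\frac{\partial L}{\partial \mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{N}(x_i-\mu) = 0
  \;\Rightarrow\; \hat\mu = \frac{1}{N}\sum_{i=1}^{N}x_i \\
\frac{\partial L}{\partial \sigma^2} &= -\frac{N}{2\sigma^2}
  + \frac{1}{2\sigma^4}\sum_{i=1}^{N}(x_i-\hat\mu)^2 = 0
  \;\Rightarrow\; \hat\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\hat\mu)^2
\end{aligned}
```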
B - Laplace distribution
For the Laplace distribution:
$f(x) = \frac{1}{{2b}}{e^{ - \frac{{\left| {x - \mu } \right|}}{b}}}$
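Since the density is symmetric about $\mu$, the sample mean can be used directly as an estimate of the location parameter (strictly speaking, the maximum likelihood estimate of $\mu$ is the sample median; the sample mean is a simpler unbiased alternative):
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{x_i}} }}{N}$
Setting the partial derivative of the log-likelihood with respect to $b$ to zero gives the estimate of $b$:
$\hat b = \frac{{\sum\limits_{i = 1}^N {\left| {{x_i} - \hat \mu } \right|} }}{N}$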
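The step behind the estimate of $b$ can be sketched as follows, treating the location estimate $\hat\mu$ as fixed and differentiating the log-likelihood with respect to $b$ only:

```latex
\begin{aligned}
L(\mu,b) &= -N\ln(2b) - \frac{1}{b}\sum_{i=1}^{N}\left|x_i-\mu\right| \\
\frac{\partial L}{\partial b} &= -\frac{N}{b}
  + \frac{1}{b^2}\sum_{i=1}^{N}\left|x_i-\hat\mu\right| = 0
  \;\Rightarrow\; \hat b = \frac{1}{N}\sum_{i=1}^{N}\left|x_i-\hat\mu\right|
\end{aligned}
```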
C - Log-normal distribution
For the log-normal distribution:
$f(x) = \frac{1}{{x\sqrt {2\pi } \sigma }}{e^{ - \frac{{{{(\ln x - \mu )}^2}}}{{2{\sigma ^2}}}}}$
In fact, with the substitution $t = \ln x$ (so $t_i = \ln x_i$), the parameter estimation is identical to the normal case:
$\hat \mu = \frac{{\sum\limits_{i = 1}^N {{t_i}} }}{N}$
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {{{\left( {{t_i} - \hat \mu } \right)}^2}} }}{N} = \frac{{{{\left( {{\bf{t}} - \hat \mu } \right)}^T}\left( {{\bf{t}} - \hat \mu } \right)}}{N}$
D - Rayleigh distribution
For the Rayleigh distribution:
$f(x) = \frac{x}{{{\sigma ^2}}}{e^{ - \frac{{{x^2}}}{{2{\sigma ^2}}}}}$
Setting the derivative of the log-likelihood to zero gives the parameter estimate:
${\hat \sigma ^2} = \frac{{\sum\limits_{i = 1}^N {x_i^2} }}{{2N}}$
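Written out, the step looks roughly like this. It is a sketch that treats $\sigma^2$ (rather than $\sigma$) as the variable being optimized:

```latex
\begin{aligned}
L(\sigma^2) &= \sum_{i=1}^{N}\ln x_i - N\ln\sigma^2
               - \frac{1}{2\sigma^2}\sum_{i=1}^{N}x_i^2 \\
\frac{\partial L}{\partial \sigma^2} &= -\frac{N}{\sigma^2}
  + \frac{1}{2\sigma^4}\sum_{i=1}^{N}x_i^2 = 0
  \;\Rightarrow\; \hat\sigma^2 = \frac{1}{2N}\sum_{i=1}^{N}x_i^2
\end{aligned}
```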
II. Code Implementation
A - Normal distribution
x = x(:); % should be a column vector
N = length(x);
u = sum(x)/N;
sig2 = (x-u)'*(x-u)/N;
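As a quick sanity check on this estimator, the following sketch evaluates the normal log-likelihood at the closed-form estimates and at a few slightly perturbed parameter values. The sample values and the helper name `logL` are made up for illustration and are not part of the original post.

```matlab
% Illustrative data (hypothetical values, not from the original post)
x = [2.1; -0.4; 1.3; 0.7; 3.2; -1.5; 0.9];
x = x(:);
N = length(x);
u = sum(x)/N;                 % estimate of mu
sig2 = (x-u)'*(x-u)/N;        % estimate of sigma^2

% Normal log-likelihood as a function of (mu, sigma^2)
logL = @(m, s2) -N/2*log(2*pi*s2) - sum((x-m).^2)/(2*s2);

L0 = logL(u, sig2);           % value at the closed-form estimates
Lp = [logL(u+0.1, sig2), logL(u-0.1, sig2), ...
      logL(u, 1.1*sig2), logL(u, 0.9*sig2)];   % values at perturbed parameters
isMax = all(L0 >= Lp);        % true if no perturbation gives a higher likelihood
```

If the estimates are indeed the maximizers, `isMax` should come out true for any choice of perturbation, not just the four tried here.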
B - Laplace distribution
x = x(:); % should be a column vector
N = length(x);
u = sum( x )/N;
b = sum(abs(x-u))/N;
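The code above uses the sample mean for the location. For the Laplace distribution the strict maximum likelihood estimate of the location is the sample median, so it can be instructive to compare the two. The sketch below does this on a small made-up data set with one outlier; the data values and the names `logL`, `u_med`, `b_med` are illustrative assumptions.

```matlab
% Illustrative data with one outlier (hypothetical values)
x = [0.2; -0.5; 0.1; 0.4; -0.3; 0.0; 6.0];
x = x(:);
N = length(x);

% Laplace log-likelihood as a function of (mu, b)
logL = @(m, b) -N*log(2*b) - sum(abs(x-m))/b;

% Location from the sample mean, as in the code above
u_mean = sum(x)/N;
b_mean = sum(abs(x-u_mean))/N;

% Location from the sample median (the strict ML estimate of mu)
u_med = median(x);
b_med = sum(abs(x-u_med))/N;

% Log-likelihood reached by each pair of estimates
L_mean = logL(u_mean, b_mean);
L_med  = logL(u_med,  b_med);
```

Because the median minimizes the sum of absolute deviations, `b_med` cannot exceed `b_mean` and `L_med` cannot fall below `L_mean`. On clean, symmetric data the two pairs of estimates are usually close.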
C - Log-normal distribution
t = log(x(:)); % should be a column vector
N = length(x);
m = sum( t )/N;
sig2 = (t-m)'*(t-m)/N;
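To see how these two numbers are used, the sketch below fits a small made-up positive data set and evaluates the fitted log-normal density on a grid. The data values, the grid limits, and the names `xs`, `pdf_fit`, and `area` are assumptions chosen for illustration.

```matlab
% Illustrative positive data (hypothetical values)
x = [0.8; 1.5; 2.2; 0.6; 3.1; 1.1; 4.7; 0.9];
t = log(x(:));
N = length(x);
m = sum(t)/N;                 % estimate of mu (mean of ln x)
sig2 = (t-m)'*(t-m)/N;        % estimate of sigma^2 (variance of ln x)

% Fitted log-normal density evaluated on a grid of positive values
xs = linspace(0.01, 20, 20000)';
pdf_fit = 1./(xs*sqrt(2*pi*sig2)) .* exp(-(log(xs)-m).^2/(2*sig2));

% Numerical integral of the fitted density over the grid
area = trapz(xs, pdf_fit);    % close to 1 when the grid covers most of the mass
```

Note that `m` and `sig2` describe $\ln x$, not $x$ itself, so the density must be evaluated with `log(xs)` and the extra `1./xs` factor.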
D - Rayleigh distribution
x = real(x(:)); % should be a column vector
N = length(x);
s = sum(x.^2)/(2*N); % s is the estimate of sigma^2, not sigma
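One way to check this estimator without random numbers is to feed it evenly spaced Rayleigh quantiles, obtained by inverting the CDF $F(x) = 1 - e^{-x^2/(2\sigma^2)}$. The sketch below does that; `sigma_true`, the probability grid `p`, and `sigma_hat` are assumptions introduced for this example.

```matlab
% Deterministic Rayleigh-distributed values via the inverse CDF
% (sigma_true and the probability grid are assumptions for this sketch)
sigma_true = 2;
N = 10000;
p = ((1:N)' - 0.5)/N;                  % evenly spaced probabilities in (0,1)
x = sigma_true*sqrt(-2*log(1-p));      % Rayleigh quantiles for those probabilities

% Estimator from above
x = real(x(:));
s = sum(x.^2)/(2*N);                   % estimate of sigma^2
sigma_hat = sqrt(s);                   % estimate of sigma
```

With this many quantiles, `s` should land very close to `sigma_true^2`. With real random samples the estimate fluctuates around that value.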
III. Application Example
Taking the normal distribution as an example:
rng('default') % for reproducibility
x = 3*randn(100000,1)-2;
% Fitting
x = x(:); % should be a column vector
N = length(x);
u = sum(x)/N;
sig2 = (x-u)'*(x-u)/N;
% Plot
figure;
% Bar: normalized histogram of the data
subplot 311
numter = [-15:.2:10];
[histFreq, histXout] = hist(x, numter);
binWidth = histXout(2)-histXout(1);
bar(histXout, histFreq/binWidth/sum(histFreq));
hold on;grid on;
% Fitting plot: fitted normal density
subplot 312
y = 1/sqrt(2*pi*sig2)*exp(-(numter-u).^2/2/sig2);
plot(numter,y,'r','linewidth',2);grid on;
% Fitting result: histogram with the fitted density overlaid
subplot 313
bar(histXout, histFreq/binWidth/sum(histFreq));
hold on;grid on;
plot(numter,y,'r','linewidth',2);
Resulting figure:
Fitting any other single distribution follows the example in this post.