Data Mining Applications at Google: What Do Statisticians Do at Google?


A reader on cnblogs (博客園), @瘋狂的小風, asked about "the current employment situation in the field."

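If any of you happen to work in data mining at a company, please share what your job involves, so we can all compare notes on what data mining work looks like in industry.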

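I'll get the ball rolling. At my previous company I mainly built recommender systems: crawling and filtering various categories of user data, tuning algorithm parameters, evaluating the results against a set of metrics, and visualizing them. I also worked on spam user detection. A few days ago I got a call from EMC2 and learned that they do data mining too, which makes sense for a company that bills itself as a big data company; I didn't ask what the work involves, and they said they would be in touch later. Baidu has countless data mining applications, so anyone in the field who gets in will have no trouble finding a seat. Taobao's data mining department is the well-known Data Cube (數據魔方); it is roughly split into a research group and other groups responsible for design, and the research group handles everything from rating to recommendation. At Douban, data mining people seem to go to the Douban algorithm group... whose current members lean toward statistics. Sogou has a group working on recommendation and natural language processing, and NetEase Games also hires data mining analysts. Then there are investment banks and small quantitative futures teams; for example, dd, the senior who became famous this year for an annual salary of 120W+ (over 1.2 million) straight out of undergrad, joined Jane Street in Hong Kong as a quant.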

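So clearly companies in China and abroad have strong demand for data mining people, and the offers are at least enough to keep us fed and clothed while we focus on research. My ideal data mining practitioner, though, would know machine learning, statistics, neural networks, and the standardized data mining process in theory; be comfortable with engineering in practice, able to use Hadoop and fluent in C/C++, Python, and R; be familiar with a range of data mining and analysis software and open-source packages; and, when analyzing a problem, follow the standard process rigorously while keeping a sharp nose for what matters.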

 

I recently came across the Quora question "What do statisticians do at Google", which describes how people in the data mining field work at Google.

According to Michael Hochster, a Statistician at Google, the statisticians there (also titled Data Scientist, Quantitative Analyst, and so on) are concentrated most heavily in Search and Ads. Michael Hochster has worked in both areas.

 

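In Search, the analysts concentrate on measuring search quality: Google's engineers work on making search better, while the analysts work on figuring out whether it actually is better. He knows a couple of people in Search with PhDs in statistics who work on improving search quality, but they are called Software Engineers.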

Michael Hochster now works in a medium-sized group called Ads Metrics, which is made up mostly of Quantitative Analysts. Many of them, but not all, are trained in statistics.

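Their work covers both building models that help improve ads serving and measurement, that is, how to tell whether a change to ads serving is a good thing. The measurement projects are not just one-off analyses; they also involve developing tools and processes, for example for running and analyzing experiments at large scale. The author is deliberately vague here, presumably because the projects are confidential.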
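To make "measuring whether a change is a good thing" a bit more concrete, here is a minimal sketch of one textbook way to analyze a simple A/B experiment: a two-proportion z-test on click-through rates. This is only my own illustration, not anything described in the answer; the function name `two_proportion_z_test` and all the numbers are made up.

```python
import math


def two_proportion_z_test(clicks_a, n_a, clicks_b, n_b):
    """Two-sided z-test for H0: arms A and B share the same click-through rate.

    Hypothetical helper for illustration only. Returns (z, p_value), where
    z > 0 means arm B has the higher observed rate.
    """
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled rate under the null hypothesis that both arms are identical.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts: control arm A vs. changed arm B, 10,000 impressions each.
    z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
    print(f"z = {z:.3f}, p = {p:.3f}")
```

A real large-scale experiment system would have to deal with much more than this (many metrics, many simultaneous experiments, non-independent observations), which is probably where the tools and processes come in.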

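Several other groups also employ Quantitative Analysts, such as Search Infrastructure (analysis relating to Google's index), Economics (forecasting), and Quantitative Marketing.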

 

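In short, data analysts at Google don't just do analysis; they also write code to implement things, so the role combines research and development (R&D). It is a reminder that, besides a solid grounding in machine learning and statistics, we can't neglect algorithms and ACM-style contest programming, and we should get even more engineering practice ^_^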

 

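The original answer is reproduced below. I marked the parts I found useful in bold, and I hope it offers some guidance and inspiration.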

Original link:
http://www.quora.com/Google/What-do-statisticians-do-at-Google
Michael Hochster, Statistician at Google


A lot of different things. There is something called the Quantitative Analyst job ladder within Google which includes many different titles (Quantitative Analyst, Statistician, Data Scientist, etc.) which as far as I can tell are all the same. Who gets which of these titles seems to be mostly a function of when the person was hired.

The largest concentrations of statisticians are in Search and Ads. I have worked in both these areas. In Search, statisticians concentrate on measuring search quality. Google engineers primarily work on making it better, statisticians work on figuring out whether it is better. I know a couple of people in search with PhDs in statistics who work on making search better, but they are called Software Engineers.

I now work in a medium-sized group called Ads Metrics, which is made up mostly of Quantitative Analysts. Many, but not all, are trained in statistics. We work both on building models to help improve ads serving, and measurement (i.e. how to measure whether a change to ads serving is a good thing). The measurement projects are not just one-at-a-time analysis questions but also involve development of tools and processes, for example for carrying out and analyzing experiments on a large scale. I'm being deliberately extremely vague about the details here, sorry.

There are several other groups that employ Quantitative Analysts. I don't know too much about most of them: Search Infrastructure (analysis relating to Google's index); Economics (forecasting), Quantitative Marketing (I know nothing at all about this).

For me, the joy of statistics is in answering interesting questions with data. Google abounds both with interesting (to me) questions, vast amounts of data, and powerful tools for working with it. It's a great place to be in my line of work.

