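The article on Facebook Chat has been up on InfoQ for a very long time. Piaoger also happened to come across a slide deck that the same Facebook engineer presented at Erlang Factory. Reading the two together turned out to be quite useful.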
# Keywords
Realtime messaging, C++, Erlang, Long-polling, Thrift
# Challenges
▪ How does synchronous messaging work on the Web?
▪ “Presence” is hard to scale
▪ Need a system to queue and deliver messages
▪ Millions of connections, mostly idle
▪ Need logging, at least between page loads
▪ Make it work in Facebook’s environment
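Notifying all of a user's friends every time that user comes online or goes offline is a naive approach. Its cost is O(average number of friends × peak concurrent users × online/offline frequency) messages per second, where the online/offline frequency is the average number of times per second that a user comes online or goes offline. With an average of a few hundred friends per user and peak concurrent users in the millions, such an implementation is intolerably inefficient.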
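To get a feel for the size of that number, here is a small back-of-the-envelope calculation. The friend count and user count follow the orders of magnitude quoted above; the transition rate (one login or logout per user every ten minutes) is purely an assumption for illustration, and `presence_push_rate` is a made-up helper name.

```python
def presence_push_rate(avg_friends, peak_users, transitions_per_user_per_sec):
    """Notifications per second if every online/offline transition
    is pushed to all of the user's friends."""
    return avg_friends * peak_users * transitions_per_user_per_sec


# "A few hundred" friends and "millions" of peak users come from the text.
# The transition rate is an assumed value: one transition per user per 600 s.
rate = presence_push_rate(
    avg_friends=200,
    peak_users=1_000_000,
    transitions_per_user_per_sec=1 / 600,
)
print(f"{rate:,.0f} notifications/second")
```

Even with these modest assumptions the push model produces hundreds of thousands of notifications per second, most of them for friends who are not looking at the chat list at that moment.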
Piaoger:
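If the online product I work on ever runs into the problem below, it will be painful but happy:
When a product's user base may grow from zero to 70 million overnight, scalability has to be considered from the very beginning.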
# System Overview

System overview (Front-end)
▪ Mix of client-side Javascript and server-side PHP
▪ Regular AJAX for sending messages, fetching conversation history
▪ Periodic AJAX polling for list of online friends
▪ AJAX long-polling for messages (Comet)
System overview (Back-end)
▪ Discrete responsibilities for each service
- Communicate via Thrift
▪ Channel (Erlang): message queuing and delivery
- Queue messages in each user’s “channel”
- Deliver messages as responses to long-polling HTTP requests
▪ Presence (C++): aggregates online info in memory (pull-based presence)
▪ Chatlogger (C++): stores conversations between page loads
▪ Web tier (PHP): serves our vanilla web requests
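For the clustering and partitioning subsystems, Facebook chose a combination of C++ and Erlang. The C++ module is used for logging chat messages, while the Erlang module "holds online users' conversations in memory and serves the long-polled requests".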
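The "pull-based presence" item in the back-end overview can be sketched roughly as follows. This is only an in-memory illustration of the idea, not Facebook's C++ service: the class name, the `ttl_seconds` expiry, and the method names are all assumptions.

```python
import time


class PresenceServer:
    """Keeps last-seen timestamps in memory. Nothing is pushed to friends;
    a client asks for its friends' status when it needs it."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl_seconds = ttl_seconds  # assumed expiry window
        self._last_seen = {}            # user id -> timestamp of last report

    def ingest(self, online_user_ids, now=None):
        """Accept a periodic batch of online user ids from a channel server."""
        now = time.time() if now is None else now
        for user_id in online_user_ids:
            self._last_seen[user_id] = now

    def online_friends(self, friend_ids, now=None):
        """Answer one pull request: which of these friends were seen recently?"""
        now = time.time() if now is None else now
        return [
            friend_id
            for friend_id in friend_ids
            if now - self._last_seen.get(friend_id, float("-inf")) <= self.ttl_seconds
        ]
```

With this shape, the cost of presence scales with how often clients poll for their friend list rather than with how often users log in and out.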
# Realtime Messaging
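Facebook has the client "pull" new messages directly from the server, in a way similar to Comet's XHR long polling.
The Facebook page loads an iframe used for passing messages between users. JavaScript in this iframe issues an HTTP GET request, which establishes a persistent connection to the server that stays open until a message is returned to the user.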
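The essence of long polling is that the server holds the request open until it has something to say. The sketch below imitates that behaviour in a single process with a blocking queue instead of a real HTTP connection; `LongPollEndpoint` and its timeout values are hypothetical names chosen for illustration.

```python
import queue
import threading


class LongPollEndpoint:
    """In-process stand-in for a long-polled HTTP GET."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, message):
        """Called when a message for this user arrives."""
        self._inbox.put(message)

    def poll(self, timeout_seconds):
        """Block until a message is ready or the timeout elapses.
        Returning None means 'nothing yet, issue a new request'."""
        try:
            return self._inbox.get(timeout=timeout_seconds)
        except queue.Empty:
            return None


endpoint = LongPollEndpoint()
# Simulate another user sending a message shortly after the poll starts.
threading.Timer(0.05, endpoint.send, args=["hi there"]).start()
print(endpoint.poll(timeout_seconds=2.0))
```

The client loop is then simply: poll, handle the reply if there is one, poll again.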

# Channel Server Architecture
Overview
▪ One channel per user
▪ Web tier delivers messages for that user
▪ Channel State: short queue of sequenced messages
▪ Long poll for streaming (Comet)
▪ Clients make an HTTP request
- Server replies when a message is ready
- One active request per browser tab
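The "short queue of sequenced messages" could look something like the sketch below. It assumes a fixed-size buffer and integer sequence numbers so that a reconnecting client can ask for everything after the last message it saw; the queue length and method names are assumptions, and the real channel servers are written in Erlang.

```python
from collections import deque


class Channel:
    """One channel per user: a bounded queue of (sequence number, payload)."""

    def __init__(self, max_queued=50):
        # Bounded deque: once full, the oldest message is dropped on append.
        self._messages = deque(maxlen=max_queued)
        self._next_seq = 1

    def publish(self, payload):
        """Queue a message and return the sequence number assigned to it."""
        seq = self._next_seq
        self._next_seq += 1
        self._messages.append((seq, payload))
        return seq

    def since(self, last_seen_seq):
        """Messages a client has not seen yet, in sequence order."""
        return [(seq, payload) for seq, payload in self._messages if seq > last_seen_seq]
```

Because each browser tab keeps its own `last_seen_seq`, several tabs can long-poll the same channel without losing or duplicating messages, as long as they reconnect before the short queue wraps around.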
Details
▪ Distributed design
- User id space is partitioned (division of labor)
- Each partition is serviced by a cluster (availability)
▪ Presence aggregation
- Channel servers are authoritative
- Periodically shipped to presence servers
▪ Open source: Erlang, Mochiweb, Thrift, Scribe, fb303, et al.
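The "user id space is partitioned, each partition is serviced by a cluster" idea can be illustrated with a tiny routing function. The slides do not say how the id space is actually divided; the modulo scheme and the host names below are assumptions made only to show the shape of the lookup.

```python
def partition_for(user_id, num_partitions):
    """Map a user id to a partition. Modulo is an assumed scheme."""
    return user_id % num_partitions


def servers_for(user_id, clusters):
    """Return the cluster (list of hosts) responsible for this user's channel."""
    return clusters[partition_for(user_id, len(clusters))]


# Hypothetical clusters: several hosts per partition for availability.
CLUSTERS = {
    0: ["channel-a1", "channel-a2"],
    1: ["channel-b1", "channel-b2"],
}

print(servers_for(7, CLUSTERS))
```

The web tier only needs the user id to find the right cluster, which keeps message delivery a single hop.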
Channel Servers

Channel Applications

# Dark launch
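The way this service was launched is also interesting: a so-called "dark launch".
The secret to going from zero to 70 million users overnight is to avoid doing it all in one step. We first simulated the load of many users through a phase called the "dark launch". During this phase, Facebook pages connected to the chat servers without any UI elements, queried presence information, and simulated message sends.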
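As a rough illustration of the idea, a dark launch can be thought of as a flag that turns on the backend traffic without turning on the UI. The flag names and action names below are hypothetical; the sketch only shows the decision a page load would make.

```python
def chat_actions_for_page(dark_launch, ui_released):
    """Return the chat-related actions one page load would trigger."""
    if ui_released:
        # Full release: users see the chat UI and generate real traffic.
        return ["render_chat_ui", "connect", "fetch_presence"]
    if dark_launch:
        # Dark launch: same backend calls, nothing visible, synthetic messages.
        return ["connect", "fetch_presence", "send_simulated_message"]
    return []


print(chat_actions_for_page(dark_launch=True, ui_released=False))
```

Ramping the share of page loads that run with `dark_launch=True` lets the backend see realistic load before any user can see the feature.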
Piaoger: Isn't this rather comparable to our Warmup?
# References
[Facebook Chat Architecture (Chinese)](http://www.infoq.com/cn/news/2008/05/facebookchatarchitecture)
[Facebook Chat Architecture (English)](http://www.infoq.com/news/2008/05/facebookchatarchitecture)
[Facebook Chat](http://www.facebook.com/note.php?note_id=14218138919&id=9445547199&index=0)
[Erlang at Facebook](http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf)
