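This post is about websocket, but it does not cover the protocol format. Anyone who can read an HTTP header can follow websocket's negotiation before the protocol switch, and anyone who can read an IP header will find the frame format used after the switch familiar.
On top of TCP's reliable transport, websocket provides a message delivery service: when the application layer at one end asks websocket to send a message of a given size, the websocket at the other end hands a message of the same size to its protocol handler. The format of the message itself is defined by the user.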
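To make "a message service on top of a byte stream" concrete, here is a minimal sketch that uses a hypothetical 4-byte length prefix. It is not the WebSocket frame format (which this post deliberately skips), and the names `encode_message` and `MessageDecoder` are made up for illustration. It only shows how a receiver can recover messages of the same size the sender submitted, even when TCP delivers the bytes in arbitrary chunks.

```python
import struct
from typing import List


def encode_message(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class MessageDecoder:
    """Reassemble whole messages from arbitrarily sized stream chunks."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message that is now complete."""
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack("!I", self._buffer[:4])
            if len(self._buffer) < 4 + length:
                break  # the rest of this message has not arrived yet
            messages.append(bytes(self._buffer[4:4 + length]))
            del self._buffer[:4 + length]
        return messages
```

Feeding the decoder three bytes at a time still yields the original messages one by one, which is the property the application layer relies on.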
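This post strings together the HTTP protocol documents to understand what problem websocket set out to solve.
If you need to refer to the libwebsocket examples on Windows, see the previous post, "Building libwebsocket with libev and libuv on Windows".
The protocol documents referenced in this post are: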
RFC 1945 "Hypertext Transfer Protocol -- HTTP/1.0", May 1996
1.3 Overall Operation
7.2.2 Length
8. Method Definitions
RFC 2068 "Hypertext Transfer Protocol -- HTTP/1.1", January 1997
19.7.1 Compatibility with HTTP/1.0 Persistent Connections
RFC 2616 "Hypertext Transfer Protocol -- HTTP/1.1", June 1999
8.1 Persistent Connections
10.1.2 101 Switching Protocols
14.10 Connection
RFC 2817 "Upgrading to TLS Within HTTP/1.1", May 2000
3. Client Requested Upgrade to HTTP over TLS
As the dates show, HTTP/1.0 goes back to May 1996.
Let's start with the HTTP/1.0 document.
1.3 Overall Operation [page 5-6]
The HTTP protocol is based on a request/response paradigm. A client
establishes a connection with a server and sends a request to the
server in the form of a request method, URI, and protocol version,
followed by a MIME-like message containing request modifiers, client
information, and possible body content. The server responds with a
status line, including the message's protocol version and a success
or error code, followed by a MIME-like message containing server
information, entity metainformation, and possible body content.
[page 7]
On the Internet, HTTP communication generally takes place over TCP/IP
connections. The default port is TCP 80 [15], but other ports can be
used. This does not preclude HTTP from being implemented on top of
any other protocol on the Internet, or on other networks. HTTP only
presumes a reliable transport; any protocol that provides such
guarantees can be used, and the mapping of the HTTP/1.0 request and
response structures onto the transport data units of the protocol in
question is outside the scope of this specification.
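As the overview shows, HTTP is based on a request/response paradigm. The client establishes a connection to the server and sends a request in a fixed format; the server responds to that request. In other words, the server never sends data to the client on its own initiative.
The document then states that HTTP only presumes a reliable transport. On the Internet the usual choice is TCP, with the well-known default port 80. So in this post we fix the underlying transport as TCP, and "socket" here means a TCP socket in the INET protocol family.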
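The request/response exchange described in section 1.3 can be sketched in a few lines. This is only an illustration of the message shapes (request line plus headers, status line plus headers); the helper names `build_request` and `parse_status_line` are hypothetical, and no network I/O is performed.

```python
from typing import Dict, Tuple


def build_request(method: str, uri: str, headers: Dict[str, str]) -> bytes:
    """Build an HTTP/1.0 request: request line, header lines, empty line."""
    lines = [f"{method} {uri} HTTP/1.0"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_status_line(response: bytes) -> Tuple[str, int, str]:
    """Return (protocol version, status code, reason phrase) of a response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason
```

The client always speaks first and the server only ever answers, which is exactly the limitation the rest of this post keeps coming back to.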
Split the word websocket and you get web and socket: the idea is to take the socket connection established through HTTP and open up new territory with it.
[page 7]
Except for experimental applications, current practice requires that
the connection be established by the client prior to each request and
closed by the server after sending the response. Both clients and
servers should be aware that either party may close the connection
prematurely, due to user action, automated time-out, or program
failure, and should handle such closing in a predictable fashion. In
any case, the closing of the connection by either or both parties
always terminates the current request, regardless of its status.
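Except for experimental applications, HTTP/1.0 practice requires the client to establish a connection for each request and the server to close it after sending the response. During this process either the client or the server may close the connection, which terminates the current request (and its response).
Although HTTP/1.0 already defines and uses a length to describe the size of the message body, a connection still cannot be reused. When there is no Content-Length, the server marks the end of the response by closing the connection, and the client has to detect that.
The relevant text follows.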
7.2.2 Length [Page 29]
When an Entity-Body is included with a message, the length of that
body may be determined in one of two ways. If a Content-Length header
field is present, its value in bytes represents the length of the
Entity-Body. Otherwise, the body length is determined by the closing
of the connection by the server.
Closing the connection cannot be used to indicate the end of a
request body, since it leaves no possibility for the server to send
back a response. Therefore, HTTP/1.0 requests containing an entity
body must include a valid Content-Length header field. If a request
contains an entity body and Content-Length is not specified, and the
server does not recognize or cannot calculate the length from other
fields, then the server should send a 400 (bad request) response.
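The two ways of determining the response body length in section 7.2.2 can be sketched as follows. This is a minimal illustration, assuming the headers have already been parsed into a dict with lower-cased names and that `stream` is any binary file-like object (for example the result of `socket.makefile("rb")`); the function name `read_response_body` is hypothetical.

```python
import io
from typing import BinaryIO, Dict


def read_response_body(stream: BinaryIO, headers: Dict[str, str]) -> bytes:
    """Read an HTTP/1.0 response body from a stream positioned after the headers."""
    length = headers.get("content-length")  # assumption: names are lower-cased
    if length is not None:
        remaining = int(length)
        chunks = []
        while remaining > 0:
            chunk = stream.read(remaining)
            if not chunk:
                raise EOFError("connection closed before Content-Length bytes arrived")
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)
    # No Content-Length: the server marks the end by closing the connection,
    # so read until end of stream.
    return stream.read()
```

With a Content-Length the connection could in principle stay usable after the body; without one, the close itself is the end marker, so the connection cannot be reused.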
What can HTTP do? It defines several methods, of which GET is the one we use most.
8.1 GET [Page 30]
The GET method means retrieve whatever information (in the form of an
entity) is identified by the Request-URI. If the Request-URI refers
to a data-producing process, it is the produced data which shall be
returned as the entity in the response and not the source text of the
process, unless that text happens to be the output of the process.
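Through HTTP we retrieve resources from the Internet, with the URI identifying the resource. The target may be a data-producing process rather than a stored entity, in which case we get the output of that process.
HTTP/1.1 added the Connection header field, which the document describes as follows.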
14.10 Connection
The Connection general-header field allows the sender to specify
options that are desired for that particular connection and MUST NOT
be communicated by proxies over further connections.
The Connection header has the following grammar:
Connection = "Connection" ":" 1#(connection-token)
connection-token = token
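Servers that support HTTP/1.1 support persistent connections: after a request/response completes, the server does not close the connection. An implementation that does not want the connection to persist must say so explicitly with "Connection: close".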
8.1 Persistent Connections
8.1.1 Purpose
Persistent HTTP connections have a number of advantages:
- HTTP requests and responses can be pipelined on a connection.
Pipelining allows a client to make multiple requests without
waiting for each response, allowing a single TCP connection to
be used much more efficiently, with much lower elapsed time.
8.1.2.1 Negotiation
An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to
maintain a persistent connection unless a Connection header including
the connection-token "close" was sent in the request. If the server
chooses to close the connection immediately after sending the
response, it SHOULD send a Connection header including the
connection-token close.
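The document lists the purposes of persistent connections. The most prominent is that one connection can be reused for more than one request/response, and both ends can use the Connection field to control when the persistent connection ends.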
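The negotiation rule can be sketched as a small decision function. This is an illustration under stated assumptions: headers are already parsed into a dict with lower-cased names, the function name `connection_persists` is hypothetical, and the HTTP/1.0 "keep-alive" branch reflects the opt-in described in RFC 2068 section 19.7.1 rather than anything in HTTP/1.0 itself.

```python
from typing import Dict


def connection_persists(http_version: str, headers: Dict[str, str]) -> bool:
    """Decide whether the connection stays open after this request/response."""
    raw = headers.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",") if token.strip()}
    if "close" in tokens:
        return False  # either side may end persistence explicitly
    if http_version == "HTTP/1.1":
        return True   # persistent by default in HTTP/1.1
    # HTTP/1.0 closes by default; "keep-alive" was an explicit opt-in.
    return "keep-alive" in tokens
```

The default flips between versions: HTTP/1.0 has to ask to keep the connection, while HTTP/1.1 has to ask to close it.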
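Imagine a web page that references several images, scripts, text files and other resources. With HTTP/1.0 every resource request needs its own connection, whereas with HTTP/1.1 these requests can be pipelined onto a single connection.
Let's look at a packet capture of the simple HTTP service from the libwebsocket test example:
(Red boxes mark requests, green boxes mark responses, and the blue line links the browser's port.)
Port 7681 is the HTTP service port. test.html references two images, logo.png and favicon.ico, and all the resources referenced by the test.html document were fetched with multiple HTTP requests over a single TCP connection.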
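The same behaviour can be reproduced locally with Python's standard library. The sketch below is not the libwebsocket test server; it starts a throwaway `http.server` on a random loopback port (an assumption made so that the example is self-contained) and fetches several paths through one `http.client.HTTPConnection`. The handler records the client's source port for each request so that reuse of the TCP connection can be observed. Note that `http.client` sends the requests one after another rather than pipelining them.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

client_ports = []  # source port of the TCP connection behind each request


class DemoHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = f"content of {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_all(paths):
    """Fetch every path over a single client connection to a local demo server."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        bodies = []
        for path in paths:
            conn.request("GET", path)
            # The body must be read completely before the connection is reused.
            bodies.append(conn.getresponse().read())
        return bodies
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
```

After calling `fetch_all(["/test.html", "/logo.png", "/favicon.ico"])`, all entries in `client_ports` should be identical, which corresponds to the single browser port seen in the capture.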
Let's continue with the document and see what else is new in HTTP/1.1.
10.1.2 101 Switching Protocols
The server understands and is willing to comply with the client's
request, via the Upgrade message header field (section 14.42), for a
change in the application protocol being used on this connection. The
server will switch protocols to those defined by the response's
Upgrade header field immediately after the empty line which
terminates the 101 response.
The protocol SHOULD be switched only when it is advantageous to do
so. For example, switching to a newer version of HTTP is advantageous
over older versions, and switching to a real-time, synchronous
protocol might be advantageous when delivering resources that use
such features.
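With protocol switching available from HTTP/1.1 onward, websocket had everything it needed. The idea is hardly novel, and websocket was bound to appear.
Let's look at another protocol that used this feature much earlier: TLS (RFC 2817, Upgrading to TLS Within HTTP/1.1).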
3. Client Requested Upgrade to HTTP over TLS
When the client sends an HTTP/1.1 request with an Upgrade header
field containing the token "TLS/1.0", it is requesting the server to
complete the current HTTP/1.1 request after switching to TLS/1.0.
3.1 Optional Upgrade
A client MAY offer to switch to secured operation during any clear
HTTP request when an unsecured response would be acceptable:
GET http://example.bank.com/acct_stat.html?749394889300 HTTP/1.1
Host: example.bank.com
Upgrade: TLS/1.0
Connection: Upgrade
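From the client's side, the interesting question after sending such a request is whether the server actually switched. The sketch below parses a response head and reports the protocol named in a 101 response. It is a simplified illustration: the helper names are hypothetical, and real parsers handle repeated headers and malformed input more carefully.

```python
from typing import Dict, Optional, Tuple


def parse_head(head: bytes) -> Tuple[int, Dict[str, str]]:
    """Split a response head into (status code, headers with lower-cased names)."""
    lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # the empty line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def switched_protocol(head: bytes) -> Optional[str]:
    """Return the Upgrade value of a 101 response, or None if no switch happened."""
    status_code, headers = parse_head(head)
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if status_code == 101 and "upgrade" in tokens:
        return headers.get("upgrade")
    return None
```

Everything that arrives after the empty line of a 101 response already belongs to the new protocol, so an HTTP parser must stop reading at that point and hand the socket over.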
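Now let's review the documents we have gone through.
1. HTTP depends on a reliable transport; on the Internet the usual choice is TCP.
2. HTTP is based on the request/response paradigm; the server does not send data to the client on its own initiative.
3. HTTP/1.0 does not support persistent connections; every request/response requires a new TCP connection.
4. HTTP/1.1 requires support for persistent connections, so a connection can be reused for multiple request/response exchanges. It still cannot escape the request/response paradigm, though, so getting new data from the server still relies on polling.
5. HTTP/1.1 introduced protocol switching.
Terminal hardware has improved dramatically, software technology offers strong support, and Web applications needed more than polling.
So websocket appeared.
The principle is to use HTTP's protocol switching to free the TCP connection (what everyone calls the socket) from HTTP, so that application data can be exchanged more efficiently.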
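The websocket protocol provides a message (frame or message) delivery service, on top of which users can define an application protocol of any form. Before the switch away from HTTP, the client and server must negotiate various parameters, the most important being which user-defined application protocol to bind to. The client asks the HTTP server whether it can switch to websocket, and expects a 101 response confirming that the switch has been made and that the requested protocol and extensions are supported. All of this happens in a single request/response, with the parameters negotiated through HTTP header fields, after which the connection switches to websocket communication. The libwebsocket state machine for the protocol switch is covered in the next post.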
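As a rough sketch of that single request/response, the code below builds a client upgrade request and computes the `Sec-WebSocket-Accept` value a server is expected to return, following the handshake in RFC 6455 (a document this post does not quote). The host, path and the subprotocol name `example-protocol` are placeholders, and `build_upgrade_request` and `expected_accept` are hypothetical helper names, not part of libwebsocket.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 appends to the client's key before hashing.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str, key: str, subprotocol: str) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to websocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Sec-WebSocket-Protocol: {subprotocol}",  # the user-defined protocol
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def expected_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value the server should echo back."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

For the sample key in RFC 6455, `dGhlIHNhbXBsZSBub25jZQ==`, `expected_accept` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. A client that sees a 101 response with that value and an accepted `Sec-WebSocket-Protocol` can start exchanging websocket frames on the same TCP connection.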
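The test examples in the libwebsocket library show the problems websocket is designed for and good at solving. The server example implements a very basic HTTP service plus several ws-based test services, showing how one HTTP server can support both HTTP and ws. Every test used by test.html is served by the server test program: a basic web page service, a large-data POST, a ws-based service that pushes an incrementing number, and a mirror service that relays and broadcasts messages like a chat room (except that it carries drawing paths rather than text). There are also several client test applications that show how to consume the server's services over ws and how to exchange real-time data with the browser through the mirror service.
That's all for this post. Thanks for reading.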
