HTTP protocol key points (11 topics, in some detail)


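(1)   Validating whether an object has been updated:

HTTP offers two ways to check whether an object has been updated: If-None-Match and If-Modified-Since. The client includes these headers in a request to ask the server. When a response carries an ETag header, the browser should revalidate with If-None-Match; when it carries a Last-Modified header, the browser should revalidate with If-Modified-Since. The HTTP/1.1 specification recommends the ETag approach (falling back to Last-Modified when ETag cannot be used), but in practice many modern servers still rely on Last-Modified. When a server sends both ETag and Last-Modified, the browser should send both If-None-Match and If-Modified-Since, and the server should check both headers, returning a 304 response only when both indicate that the object is unchanged. (This is the RFC 2616 rule; the later RFC 7232 gives If-None-Match precedence and has the server ignore If-Modified-Since when both are present.)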
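The "both conditions must hold" check can be sketched in a few lines of Python. This is a minimal illustration, not a server implementation: the function name `is_not_modified` and the plain-dict representation of request headers are assumptions made for the example, and weak ETags (`W/"..."`) are not treated specially.

```python
from email.utils import parsedate_to_datetime


def is_not_modified(request_headers, etag, last_modified):
    """Return True when a 304 may be sent under the 'both must match' rule.

    request_headers: dict of request header name -> value (assumed layout).
    etag / last_modified: the server's current validators, or None if the
    server does not produce that validator.
    """
    inm = request_headers.get("If-None-Match")
    ims = request_headers.get("If-Modified-Since")
    if inm is None and ims is None:
        return False  # unconditional request: always send the full object

    if inm is not None:
        candidates = [tag.strip() for tag in inm.split(",")]
        if "*" not in candidates and etag not in candidates:
            return False  # ETag changed, or this server has no ETag at all

    if ims is not None:
        if last_modified is None:
            return False
        # The object counts as modified if its timestamp is newer than the client's.
        if parsedate_to_datetime(last_modified) > parsedate_to_datetime(ims):
            return False

    return True
```

For example, a request carrying `If-None-Match: "abc"` and a matching `If-Modified-Since` yields `True` only if the server's current ETag is also `"abc"`; a server that produces no ETag at all always falls through to a full response.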

 

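(2)   Cache control:

1.       Cache-control headers used in requests

Pragma: no-cache: kept for compatibility with early HTTP versions such as HTTP/1.0+.

Cache-Control: no-cache: the client does not want a cached copy served without revalidation. This is only a request; in practice some cache devices ignore it.

Cache-Control: no-store: devices between the client and the server must not cache the response, and should delete any copy they already hold.

Cache-Control: only-if-cached: the client accepts only content that is already cached.

2.       Headers used in responses to control caching

Cache-Control: max-age=3600: how long, in seconds, the response may be cached, measured relative to the time it was received.
Cache-Control: s-maxage=3600: similar to the above, except that s-maxage is normally aimed at cache servers and applies only to shared (public) caches.

Expires: Fri, 05 Jul 2002 05:00:00 GMT: an absolute expiry time expressed in GMT. This header is easily thrown off by an incorrect local clock.

Cache-Control: must-revalidate: the content may be cached, but once it becomes stale the cache must revalidate it with the server before serving it again.

Cache-Control directive values and their meanings: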

Cache-Control header directives

| Directive | Message type | Description |
|---|---|---|
| no-cache | Request | Do not return a cached copy of the document without first revalidating it with the server. |
| no-store | Request | Do not return a cached copy of the document. Do not store the response from the server. |
| max-age | Request | The document in the cache must not be older than the specified age. |
| max-stale | Request | The document may be stale based on the server-specified expiration information, but it must not have been expired for longer than the value in this directive. |
| min-fresh | Request | The document must remain fresh for at least the specified amount of time beyond its current age. |
| no-transform | Request | The document must not be transformed before being sent. |
| only-if-cached | Request | Send the document only if it is in the cache, without contacting the origin server. |
| public | Response | Response may be cached by any cache. |
| private | Response | Response may be cached such that it can be accessed only by a single client. |
| no-cache | Response | If the directive is accompanied by a list of header fields, the content may be cached and served to clients, but the listed header fields must first be removed. If no header fields are specified, the cached copy must not be served without revalidation with the server. |
| no-store | Response | Response must not be cached. |
| no-transform | Response | Response must not be modified in any way before being served. |
| must-revalidate | Response | Once stale, the response must be revalidated with the origin server before being served. |
| proxy-revalidate | Response | Shared caches must revalidate the response with the origin server before serving. This directive can be ignored by private caches. |
| max-age | Response | Specifies the maximum length of time the document can be cached and still considered fresh. |
| s-maxage | Response | Specifies the maximum age of the document as it applies to shared caches (overriding the max-age directive, if one is present). This directive can be ignored by private caches. |
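To make the directives concrete, here is a small Python sketch that splits a Cache-Control value into directives and picks the freshness lifetime a cache would apply. The helper names `parse_cache_control` and `freshness_lifetime` are made up for this example, and the parser is deliberately simple: it does not handle quoted arguments that themselves contain commas (such as a no-cache field list).

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directives without "=" (public, no-store, ...) map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(directives, shared_cache=False):
    """Seconds the response may be served as fresh.

    Returns 0 for no-store (nothing may be kept) and None when neither
    max-age nor s-maxage applies, in which case Expires or a heuristic
    would have to decide.
    """
    if "no-store" in directives:
        return 0
    if shared_cache and "s-maxage" in directives:
        return int(directives["s-maxage"])  # overrides max-age for shared caches
    if "max-age" in directives:
        return int(directives["max-age"])
    return None
```

With `public, max-age=3600, s-maxage=600`, a browser cache would use 3600 seconds while a shared proxy cache would use 600.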

 

 

 

(3)   Two special HTTP methods: OPTIONS and TRACE

1.       TRACE can be used to find out how many proxy servers sit between the client and the server, provided the proxies set the Via header. Usage:

Send:

TRACE /tttt.gif HTTP/1.1
Host: www.sohu.com

The server returns the following response headers:

HTTP/1.0 200 OK
Date: Mon, 16 Mar 2009 11:47:52 GMT
Server: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a
Content-Type: message/http
X-Cache: MISS from 19709705.29867846.28603073.sohu.com
Via: 1.0 19709705.29867846.28603073.sohu.com:80 (squid)
Connection: close

The server returns the following body (it reflects the request headers that the intermediate proxy sent to the origin web server):

TRACE / HTTP/1.0
Cache-Control: max-age=36288000
Connection: keep-alive
Host: www.sohu.com
Via: 1.1 19709705.29867846.28603073.sohu.com:80 (squid)
X-Forwarded-For: 58.31.225.229

The output shows that the request passed through the proxy server 19709705.29867846.28603073.sohu.com, and that this server only supports HTTP/1.0.

2.       OPTIONS can be used to discover which HTTP methods the server supports for a given object:

OPTIONS /ssss.gif HTTP/1.1
Host: www.sohu.com

HTTP/1.0 200 OK
Date: Mon, 16 Mar 2009 11:59:17 GMT
Server: Apache/1.3.37 (Unix) mod_gzip/1.3.26.1a
Cache-Control: max-age=5184000
Expires: Fri, 15 May 2009 11:59:17 GMT
Content-Length: 0
Allow: GET, HEAD, OPTIONS, TRACE
X-Cache: MISS from 32583031.43658676.41464477.sohu.com
Via: 1.1 32583031.43658676.41464477.sohu.com:80 (squid)
Connection: close
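The two headers that matter in the TRACE and OPTIONS examples above are Via and Allow. The following Python sketch shows one way to pull the proxy hops and the allowed methods out of them. The function names `parse_via` and `parse_allow` are assumptions for illustration, and the Via parser assumes no commas appear inside a hop's comment.

```python
import re


def parse_via(value):
    """Split a Via header value into (protocol, received_by, comment) hops."""
    hops = []
    for hop in value.split(","):
        # Each hop looks like: "<protocol> <host[:port]> [(comment)]"
        match = re.match(r"\s*(\S+)\s+(\S+)(?:\s+\((.*)\))?\s*$", hop)
        if match:
            hops.append((match.group(1), match.group(2), match.group(3)))
    return hops


def parse_allow(value):
    """Return the list of methods named in an Allow header value."""
    return [method.strip().upper() for method in value.split(",") if method.strip()]
```

The number of entries returned by `parse_via` is the number of proxies that identified themselves on the path; proxies that do not add a Via header remain invisible to this technique.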

 

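(4)   HTTP connection control:

HTTP connections can be classified as: 1. serial connections, 2. parallel connections, 3. persistent connections.

Serial connections: a separate TCP connection is set up for each object, one after another, which adds a large amount of TCP setup and teardown time to the transfer.

Parallel connections: several TCP connections are opened at once and objects are transferred in parallel. The connection setup times overlap, so overall latency drops, but parallel connections place higher demands on both the client and the server. The HTTP/1.1 specification (RFC 2616) recommended no more than 2 parallel TCP connections per server; in practice modern browsers open roughly 6 to 10.

Persistent connections:

By keeping the TCP connection open and transferring objects over it one after another, the overhead of TCP setup and the impact of TCP slow start are effectively reduced.

HTTP/1.0+ introduced the keep-alive concept, and HTTP/1.1 changed it to persistent connections. The difference is that in HTTP/1.0 keep-alive must be stated explicitly in a header, whereas in HTTP/1.1 persistence is the default behavior unless Connection: close is used to request closing the connection.

Points to note when using keep-alive or persistent connections:

In HTTP/1.0, keep-alive must be declared explicitly and repeated in every subsequent request on the same connection; otherwise the server assumes the client wants to close the connection. The server can include a Connection header in its response to indicate whether it agrees to keep-alive or wants to close the connection.

A persistent connection requires every response to carry the correct length of the entity body or to use chunked encoding; otherwise the receiver cannot tell whether the previous object has finished transferring before the next request is handled.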
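The version-dependent default described above can be expressed as a small decision function. This Python sketch is only an illustration; the name `wants_persistent` and the assumption that headers arrive as a dict keyed by `"Connection"` are choices made for the example.

```python
def wants_persistent(http_version, headers):
    """Decide whether the sender of this message wants the connection kept open.

    http_version: "HTTP/1.0" or "HTTP/1.1".
    headers: dict with an optional "Connection" entry (assumed layout).
    """
    tokens = {
        token.strip().lower()
        for token in headers.get("Connection", "").split(",")
        if token.strip()
    }
    if "close" in tokens:
        return False          # explicit close always wins
    if http_version == "HTTP/1.1":
        return True           # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 must opt in on every request
```

A server would apply the same test to each request on a connection, which is why an HTTP/1.0 client that omits `Connection: Keep-Alive` on a later request ends up with the connection closed.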

 

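(5)   According to the HTTP specification, a request that carries no Accept-Encoding header means the client accepts any encoding (for example, gzip compression). Actual testing shows that this does not always hold.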

 

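(6)   Chunked

This is a transfer encoding. Normally HTTP needs to know the size of an object before sending it, so that the receiver knows when the transfer ends. If the server cannot report the object's size in advance and the connection is persistent, it must use chunked transfer. With chunked encoding enabled (by setting Transfer-Encoding: chunked in the response headers), the object is split into several chunks; each chunk states its own length, and a final chunk of length 0 marks the end of the transfer.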
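The framing is easy to see in code. Below is a minimal Python sketch of a chunked encoder and decoder; the function names are chosen for this example, and trailers and chunk extensions are ignored apart from skipping anything after a `;` on the size line.

```python
def encode_chunked(parts):
    """Frame each bytes object in `parts` as one chunk and add the terminator."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would end the body early, so skip empties
            out += f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # last chunk (size 0) followed by an empty trailer


def decode_chunked(data):
    """Reassemble the body from a complete chunked message body."""
    body = b""
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol].split(b";")[0], 16)  # size line is hexadecimal
        pos = eol + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Encoding `[b"Hello, ", b"world"]` produces `7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n`: two sized chunks followed by the zero-length chunk that tells the receiver the object is complete.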

 

 

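(7)   Range requests

HTTP allows a client to request a specified range of a document. If an HTTP download fails midway for some reason, the client can send a Range header on the next request, which makes resumable downloads possible. Range is also widely used in P2P-style downloads, where the same content is fetched from several servers at once to speed up the download.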

GET /bigfile.html HTTP/1.1
Host: www.joes-hardware.com
Range: bytes=4000-
User-Agent: Mozilla/4.61 [en] (WinNT; I)

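Including Range: bytes=4000- in the request headers means that 4000 bytes have already been downloaded, so this request only needs to start from byte offset 4000.

In a response, the server can set Accept-Ranges: bytes to indicate that it accepts range requests and that the range unit is bytes.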
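On the server side, a Range header has to be turned into concrete byte offsets and a Content-Range header for the 206 response. The Python sketch below handles only a single `bytes=` range; the function names and the convention of returning `None` for an unsatisfiable range (which a server would answer with 416) are assumptions for this example.

```python
def resolve_range(header, total):
    """Return inclusive (start, end) offsets for a single bytes range.

    header: value of the Range header, e.g. "bytes=4000-".
    total: size of the full object in bytes.
    Returns None when the range cannot be satisfied.
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":
        # Suffix form "bytes=-500": the last 500 bytes of the object.
        length = int(last)
        if length == 0:
            return None
        start, end = max(total - length, 0), total - 1
    else:
        start = int(first)
        end = min(int(last), total - 1) if last else total - 1
    if start >= total or start > end:
        return None
    return start, end


def content_range(start, end, total):
    """Build the Content-Range value for a 206 Partial Content response."""
    return f"bytes {start}-{end}/{total}"
```

For a 10000-byte file, `bytes=4000-` resolves to offsets 4000 through 9999, and the response would carry `Content-Range: bytes 4000-9999/10000`.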

 

 

 

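(8)   Delta Encoding

A way to reduce the amount of data HTTP transfers. Normally, once a document is updated on the server, the server sends the entire new document the next time the client requests it. If only a small part of the document changed, retransmitting the complete document wastes resources. HTTP delta encoding transmits only the changed part. It works as follows:

1.       In its first response the server includes an ETag header, which is a unique version identifier for the document.

2.       On its next request the client includes an If-None-Match header to ask the server whether the document has been updated, and also sets an A-IM (Accept-Instance-Manipulation) header to indicate that it accepts deltas.

3.       On receiving the request, the server finds that it holds a newer version of the document (because the document's ETag has changed), so its response includes the IM, ETag and Delta-Base headers to tell the client how the document was updated. The IM value names the delta algorithm used, the ETag header carries the new ETag, and Delta-Base indicates which version the delta was computed against (normally it should equal the If-None-Match value in the request).

4.       On receiving the response, the client runs the delta algorithm to update its local document and sets the local document's ETag to the new ETag value.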
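The four steps above can be simulated in Python to show how the headers fit together. This is a toy sketch under loud assumptions: `toy-delta` is a made-up instance-manipulation token (not an IANA-registered one such as vcdiff), the delta format is an ad hoc list of copy/insert operations built with `difflib`, and `etag_of`, `respond` and the `history` dict are hypothetical names for the example.

```python
import difflib
import hashlib


def etag_of(text):
    """Hypothetical ETag: a short hash of the document content."""
    return '"' + hashlib.sha1(text.encode("utf-8")).hexdigest()[:8] + '"'


def make_delta(old, new):
    """Toy delta: operations that rebuild `new` from `old`."""
    ops = []
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))      # reuse old[i1:i2]
        else:
            ops.append(("insert", new[j1:j2]))  # new text (empty for deletions)
    return ops


def apply_delta(old, ops):
    """Client side: rebuild the new document from the old one plus the delta."""
    return "".join(old[op[1]:op[2]] if op[0] == "copy" else op[1] for op in ops)


def respond(request_headers, history, current):
    """Server side. `history` maps ETag -> older document text still kept."""
    base_tag = request_headers.get("If-None-Match")
    new_tag = etag_of(current)
    if base_tag == new_tag:
        return 304, {"ETag": new_tag}, None
    accepted = [t.strip() for t in request_headers.get("A-IM", "").split(",")]
    if "toy-delta" in accepted and base_tag in history:
        headers = {"ETag": new_tag, "IM": "toy-delta", "Delta-Base": base_tag}
        return 226, headers, make_delta(history[base_tag], current)
    return 200, {"ETag": new_tag}, current  # fall back to the full document
```

A client holding the old version sends its ETag in If-None-Match plus `A-IM: toy-delta`, receives a 226 response, and calls `apply_delta` on its cached copy; a client that does not advertise the token simply gets the full document with a 200.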

 

The headers used in delta encoding are:

Delta-encoding headers

| Header | Description |
|---|---|
| ETag | Unique identifier for each instance of a document. Sent by the server in the response; used by clients in subsequent requests in If-Match and If-None-Match headers. |
| If-None-Match | Request header sent by the client, asking the server for a document if and only if the client's version of the document is different from the server's. |
| A-IM | Client request header indicating types of instance manipulations accepted. |
| IM | Server response header specifying the type of instance manipulation applied to the response. This header is sent when the response code is 226 IM Used. |
| Delta-Base | Server response header that specifies the ETag of the base document used for generating the delta (should be the same as the ETag in the client request's If-None-Match header). |

 

The values that may appear in the A-IM and IM headers (that is, the algorithms available for deltas) are:

IANA registered types of instance manipulations

| Type | Description |
|---|---|
| vcdiff | Delta using the vcdiff algorithm |
| diffe | Delta using the Unix diff -e command |
| gdiff | Delta using the gdiff algorithm |
| gzip | Compression using the gzip algorithm |
| deflate | Compression using the deflate algorithm |
| range | Used in a server response to indicate that the response is partial content as the result of a range selection |
| identity | Used in a client request's A-IM header to indicate that the client is willing to accept an identity instance manipulation |

 

 

 

(9)   HTTP status codes at a glance:

Status codes

| Status code | Reason phrase | Meaning |
|---|---|---|
| 100 | Continue | An initial part of the request was received, and the client should continue. |
| 101 | Switching Protocols | The server is changing protocols, as specified by the client, to one listed in the Upgrade header. |
| 200 | OK | The request is okay. |
| 201 | Created | The resource was created (for requests that create server objects). |
| 202 | Accepted | The request was accepted, but the server has not yet performed any action with it. |
| 203 | Non-Authoritative Information | The transaction was okay, except the information contained in the entity headers was not from the origin server, but from a copy of the resource. |
| 204 | No Content | The response message contains headers and a status line, but no entity body. |
| 205 | Reset Content | Another code primarily for browsers; basically means that the browser should clear any HTML form elements on the current page. |
| 206 | Partial Content | A partial request was successful. |
| 300 | Multiple Choices | A client has requested a URL that actually refers to multiple resources. This code is returned along with a list of options; the user can then select which one he wants. |
| 301 | Moved Permanently | The requested URL has been moved. The response should contain a Location URL indicating where the resource now resides. |
| 302 | Found | Like the 301 status code, but the move is temporary. The client should use the URL given in the Location header to locate the resource temporarily. |
| 303 | See Other | Tells the client that the resource should be fetched using a different URL. This new URL is in the Location header of the response message. |
| 304 | Not Modified | Clients can make their requests conditional by the request headers they include. This code indicates that the resource has not changed. |
| 305 | Use Proxy | The resource must be accessed through a proxy; the location of the proxy is given in the Location header. |
| 306 | (Unused) | This status code currently is not used. |
| 307 | Temporary Redirect | Like the 302 status code: the client should use the URL given in the Location header to locate the resource temporarily, and it must not change the request method when following the redirect. |
| 400 | Bad Request | Tells the client that it sent a malformed request. |
| 401 | Unauthorized | Returned along with appropriate headers that ask the client to authenticate itself before it can gain access to the resource. |
| 402 | Payment Required | Currently this status code is not used, but it has been set aside for future use. |
| 403 | Forbidden | The request was refused by the server. |
| 404 | Not Found | The server cannot find the requested URL. |
| 405 | Method Not Allowed | A request was made with a method that is not supported for the requested URL. The Allow header should be included in the response to tell the client what methods are allowed on the requested resource. |
| 406 | Not Acceptable | Clients can specify parameters about what types of entities they are willing to accept. This code is used when the server has no resource matching the URL that is acceptable for the client. |
| 407 | Proxy Authentication Required | Like the 401 status code, but used for proxy servers that require authentication for a resource. |
| 408 | Request Timeout | If a client takes too long to complete its request, a server can send back this status code and close down the connection. |
| 409 | Conflict | The request is causing some conflict on a resource. |
| 410 | Gone | Like the 404 status code, except that the server once held the resource. |
| 411 | Length Required | Servers use this code when they require a Content-Length header in the request message. The server will not accept requests for the resource without the Content-Length header. |
| 412 | Precondition Failed | If a client makes a conditional request and one of the conditions fails, this response code is returned. |
| 413 | Request Entity Too Large | The client sent an entity body that is larger than the server can or wants to process. |
| 414 | Request URI Too Long | The client sent a request with a request URL that is larger than what the server can or wants to process. |
| 415 | Unsupported Media Type | The client sent an entity of a content type that the server does not understand or support. |
| 416 | Requested Range Not Satisfiable | The request message requested a range of a given resource, and that range either was invalid or could not be met. |
| 417 | Expectation Failed | The request contained an expectation in the Expect request header that could not be satisfied by the server. |
| 500 | Internal Server Error | The server encountered an error that prevented it from servicing the request. |
| 501 | Not Implemented | The client made a request that is beyond the server's capabilities. |
| 502 | Bad Gateway | A server acting as a proxy or gateway encountered a bogus response from the next link in the request response chain. |
| 503 | Service Unavailable | The server cannot currently service the request but will be able to in the future. |
| 504 | Gateway Timeout | Similar to the 408 status code, except that the response is coming from a gateway or proxy that has timed out waiting for a response to its request from another server. |
| 505 | HTTP Version Not Supported | The server received a request in a version of the protocol that it can't or won't support. |

 

 

 

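(10)   [Original] Case study: a conflict between load balancing and the ETag header that degrades caching:

Behind the load balancer sit the web servers, but they are heterogeneous: some run Linux and some run Windows. The Linux servers are configured so that HTTP responses contain a Last-Modified header but no ETag header:

[Screenshot: linux.jpg]

The Windows servers are configured so that responses contain both Last-Modified and ETag headers.

[Screenshot: windows.jpg]

On the first visit to the site, image a was downloaded from a Windows server and image b from a Linux server. On the second visit, the browser's request for image a was routed to a Linux server and its request for image b to a Windows server. (The second visit happened after the cache lifetime had passed. Because the site's responses contain only a Last-Modified header, the browser uses a heuristic to compute how long it may cache them. Heuristic freshness relies on a coefficient; the 50% factor in the assembly policy on the WA controls exactly this.)
Because image a arrived with both ETag and Last-Modified on the first download, the browser made the second request a conditional GET with two conditions, If-None-Match and If-Modified-Since. According to the HTTP specification, a 304 is returned only when both conditions indicate no change. Unfortunately the second request was routed to a Linux server, which does not set ETag, so an image that could have been served from the local cache was downloaded again:

[Screenshot: 截圖00.jpg]

Further analysis: using ETag works out badly here. Different servers compute different ETag values for the same file, so when ETag is the validation condition and requests are balanced across different servers, the cache is easily invalidated.

[Screenshot: 截圖01.jpg]

In the screenshot above, the same image has different ETags on different servers, which forces a re-download. Looking at server selection, this time the image happened to be routed to another Windows server, so both the ETag and Last-Modified headers were present, and the modification time can be seen to be unchanged. Unfortunately the ETag mismatch still caused a re-download.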
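The failure mode in this case study can be reproduced with a small Python simulation. Everything here is a hypothetical model for illustration: the `make_etag` scheme (built from a per-machine inode number, size and modification time) is an assumption about why the values differ across servers, the inode numbers and sizes are invented, and timestamps are plain integers rather than HTTP dates.

```python
def make_etag(inode, size, mtime):
    """Hypothetical ETag derived from file attributes; inode differs per machine."""
    return f'"{inode:x}-{size:x}-{mtime:x}"'


def revalidate(cached, server):
    """Return the status of a conditional GET sent for a cached copy.

    cached / server: dicts with "etag" and "last_modified" (None if absent).
    The browser sends every validator it holds, and the server answers 304
    only if each one it receives still matches.
    """
    if cached.get("etag") is None and cached.get("last_modified") is None:
        return 200  # nothing to validate against
    if cached.get("etag") is not None and cached["etag"] != server.get("etag"):
        return 200  # If-None-Match fails (or the server has no ETag)
    if cached.get("last_modified") is not None:
        current = server.get("last_modified")
        if current is None or current > cached["last_modified"]:
            return 200  # If-Modified-Since fails
    return 304


MTIME = 1236942120  # same file, same modification time on every backend
windows_1 = {"etag": make_etag(101, 12670, MTIME), "last_modified": MTIME}
windows_2 = {"etag": make_etag(202, 12670, MTIME), "last_modified": MTIME}
linux = {"etag": None, "last_modified": MTIME}
```

With these three backends, a copy first fetched from `windows_1` revalidates with 304 only when the request returns to `windows_1`; routed to `linux` or `windows_2` it gets 200 and is downloaded again, whereas a copy first fetched from `linux` (Last-Modified only) revalidates with 304 on any backend. Making the ETag identical across backends, or dropping it, would remove the mismatch in this model.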

 

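(11)   Using Yahoo's YSlow tool to analyze a site's HTTP optimizations:

Yahoo's web application development team advocates and practices HTTP-level optimization. Drawing on years of experience, the team distilled a set of website optimization rules and turned them into a program. The program is popular among testers, integrates with the powerful Firebug tool, and has become a useful tool for developers and testers.

Installation:

1. Download and install the Firefox browser

2. Download and install the Firefox extension Firebug

3. Download and install YSlow

Usage is simple and similar to HttpWatch: when you open a site, the program analyzes and grades it automatically:

[Screenshot: YSlow grade summary]

The result shows that the site's overall score is low, grade F (A is best). It also lists the specific items that can be optimized and gives a grade for each one: for example, the number of HTTP requests could be reduced further, some items could be served from a CDN, some could use an Expires header or gzip compression, and so on. Judging from the result, the first 5 items all leave a lot of room for optimization. Details can be viewed by expanding the triangle arrow next to each item, for example the CDN section:

[Screenshot: CDN section]

Object expiration optimization:

[Screenshot: object expiration optimization]

Optimization of the placement of some objects:

[Screenshot: object placement optimization]

Gzip optimization:

[Screenshot: gzip optimization]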

Original: http://www.cppblog.com/age100/archive/2010/06/25/118688.aspx

http://www.cnblogs.com/ningskyer/articles/4735646.html

