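1. Why use HTTP chunked transfer encoding?
An HTTP message normally carries a Content-Length header that gives the length of the body the server is sending. This works when the length of the content is known before sending starts.
Sometimes, however, the length cannot be known in advance. In that case the content is sent in chunks, using HTTP chunked transfer encoding.
The chunked encoding of HTTP/1.1 lets the receiver determine reliably whether a body of unknown length has been received completely.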
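2. The chunked encoding format in the HTTP RFC
A chunked body is a series of chunks concatenated together; the last chunk has length 0 and marks the end of the chunk data. Each chunk consists of a header and a body:
the header gives the length of the body that follows, and the body is the actual content. The parts are separated by CRLF (\r\n).
Chunked encoding format: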
Chunked-Body    = *chunk
                  last-chunk
                  trailer
                  CRLF
chunk           = chunk-size [ chunk-extension ] CRLF
                  chunk-data CRLF
chunk-size      = 1*HEX
last-chunk      = 1*("0") [ chunk-extension ] CRLF
chunk-extension = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
chunk-ext-name  = token
chunk-ext-val   = token | quoted-string
chunk-data      = chunk-size(OCTET)
trailer         = *(entity-header CRLF)
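Chunked-Body is the message body after chunked encoding. It has four parts: the chunks, the last-chunk, the trailer, and the terminating CRLF. A body may contain
any number of chunks, from zero upward. Each chunk declares its own length: it always starts with a string of hexadecimal digits that gives the length of the following chunk-data
in bytes. If that hexadecimal string consists only of "0" digits, chunk-size is 0 and the chunk is the last-chunk, which has no chunk-data part. The optional
chunk-extension is agreed on by the two communicating parties; a receiver that does not understand it can ignore it.
The trailer holds additional header fields appended at the end, usually carrying some metadata. A concrete example is given later in this article.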
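To make the chunk header rules concrete, here is a minimal sketch that parses a single chunk header line (the part before the CRLF) into its size and extensions. The class name ChunkHeader and its members are hypothetical and introduced only for illustration. The sketch also assumes that extension values do not contain ";" inside a quoted-string, which a complete parser would have to handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: one parsed chunk header line, without the trailing CRLF. */
final class ChunkHeader {
    final int size;                       // length of the following chunk-data in bytes
    final Map<String, String> extensions; // chunk-ext-name -> chunk-ext-val ("" if no value)

    private ChunkHeader(int size, Map<String, String> extensions) {
        this.size = size;
        this.extensions = extensions;
    }

    static ChunkHeader parse(String line) {
        // Assumption: no ";" appears inside a quoted chunk-ext-val.
        String[] parts = line.split(";");
        // chunk-size = 1*HEX; leading zeros are allowed, so "000" is also a last-chunk.
        int size = Integer.parseInt(parts[0].trim(), 16);
        if (size < 0) {
            throw new IllegalArgumentException("negative chunk-size: " + parts[0]);
        }
        Map<String, String> ext = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            String[] kv = parts[i].split("=", 2);
            ext.put(kv[0].trim(), kv.length == 2 ? kv[1].trim() : "");
        }
        return new ChunkHeader(size, ext);
    }

    /** A chunk-size of 0 marks the last-chunk, which carries no chunk-data. */
    boolean isLastChunk() {
        return size == 0;
    }
}
```

For example, ChunkHeader.parse("1A;name=value") yields a size of 26 with one extension, and a receiver that does not understand the extension can simply ignore the map.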
3. Decoding chunked encoding as described in the HTTP RFC
length := 0                                   // reset the length counter
read chunk-size, chunk-extension (if any) and CRLF
while (chunk-size > 0) {                      // not the last-chunk
    read chunk-data and CRLF                  // read chunk-size bytes of chunk-data, skip the CRLF
    append chunk-data to entity-body          // append this chunk-data to the entity-body
    length := length + chunk-size             // add this chunk to the running total
    read chunk-size and CRLF                  // read the chunk-size and CRLF of the next chunk
}
read entity-header                            // format is name:valueCRLF; an empty one is just CRLF
while (entity-header not empty) {             // i.e. not a blank line holding only CRLF
    append entity-header to existing header fields
    read entity-header
}
Content-Length := length                      // write the decoded body length into the
                                              // message as the Content-Length field
Remove "chunked" from Transfer-Encoding       // and remove the "chunked" token from Transfer-Encoding
As the pseudocode shows, the chunked protocol is fairly simple and easy to implement in any language. A Java implementation follows.

// Requires java.io.BufferedInputStream and java.nio.ByteBuffer.
// BisBuffer, logger and getInputStream() belong to the surrounding class and are not shown here.

// Fields of the receiving class
private ByteBuffer m_chunkbody = ByteBuffer.allocate(100 * 1024);
public byte[] m_buffer;
protected BufferedInputStream m_bis = getInputStream();

// Body of the parsing method, which must be declared with "throws Exception"
BisBuffer bb = new BisBuffer(m_bis, 100); // BisBuffer reads 100 bytes at a time
int contentLength = 0;
boolean isDataOverLimit = false;
int idx;
byte b;
nextLine:
while (true) { // look at the first byte of each line
    b = (byte) bb.read(); // read one byte at a time
    if (b == 'T' || b == 't') {
        // Candidate for "Transfer-Encoding: " or "Trailer: "; read the field name up to the space
        byte[] keyBuffer = new byte[50];
        idx = 0;
        while ((keyBuffer[idx++] = (byte) bb.read()) != ' ') {
            if (keyBuffer[idx - 1] == '\n')
                continue nextLine; // line ended without a space, move on to the next line
        }
        String encoding = new String(keyBuffer, 0, idx - 1);
        if (encoding.equalsIgnoreCase("ransfer-Encoding:")) {
            byte[] chunkBuffer = new byte[8];
            idx = 0;
            while ((chunkBuffer[idx++] = (byte) bb.read()) != '\r');
            String chunked = new String(chunkBuffer, 0, idx - 1); // the "chunked" marker
            logger.info(chunked);
            if (!chunked.equalsIgnoreCase("chunked"))
                throw new Exception("not chunked!");
            b = (byte) bb.read(); // \n
        } else if (encoding.equalsIgnoreCase("railer:")) {
            byte[] trailerBuffer = new byte[50];
            idx = 0;
            while ((trailerBuffer[idx++] = (byte) bb.read()) != '\r');
            String trailer = new String(trailerBuffer, 0, idx - 1); // announced trailer names
            String[] trailerArr = trailer.split(",");
            if (!trailerArr[0].trim().equalsIgnoreCase("data_over_limit"))
                throw new Exception("trailer doesn't have data_over_limit!");
            b = (byte) bb.read(); // \n
        } else {
            while ((b = (byte) bb.read()) != '\n'); // skip the rest of the line
        }
    } else if (b == '\r') {
        // A blank line: the chunked body starts here
        b = (byte) bb.read(); // \n
        byte[] lensize = new byte[Integer.SIZE];
        idx = 0;
        while ((lensize[idx++] = (byte) bb.read()) != '\r');
        b = (byte) bb.read(); // \n
        int chunksize = Integer.parseInt(new String(lensize, 0, idx - 1), 16);
        // Parse the chunks one after another
        while (chunksize > 0) {
            contentLength += chunksize; // accumulate the total length
            if (contentLength < 0 || contentLength > 100 * 1024)
                throw new Exception(contentLength + " LENGTH TOO LARGE!");

            byte[] temp = new byte[chunksize];
            idx = 0;
            while (idx != chunksize) {
                temp[idx++] = (byte) bb.read();
            }
            m_chunkbody.put(temp); // append chunk-data
            // Read the header of the next chunk
            idx = 0;
            b = (byte) bb.read(); // \r
            b = (byte) bb.read(); // \n
            while ((lensize[idx++] = (byte) bb.read()) != '\r');
            chunksize = Integer.parseInt(new String(lensize, 0, idx - 1), 16);
            b = (byte) bb.read(); // \n
        }
        b = (byte) bb.read(); // \r
        b = (byte) bb.read(); // \n
    } else if (b == 'd') {
        // The chunk body is complete; read the trailer line "data_over_limit: <value>"
        byte[] trailerBuffer = new byte[50];
        idx = 0;
        while ((trailerBuffer[idx++] = (byte) bb.read()) != '\r');
        String trailer = "d" + new String(trailerBuffer, 0, idx - 1); // put the first byte back
        String trailerKey = "data_over_limit: ";
        if (!trailer.startsWith(trailerKey))
            throw new Exception("data_over_limit ERROR!");
        String isOverLimit = trailer.substring(trailerKey.length());
        isDataOverLimit = isOverLimit.equalsIgnoreCase("true");
        b = (byte) bb.read(); // \n
        break;
    } else {
        while ((b = (byte) bb.read()) != '\n'); // other header fields
    }
}
// Assemble the chunk body, i.e. all chunk-data blocks joined together
m_chunkbody.flip(); // switch the buffer from writing to reading
m_buffer = new byte[m_chunkbody.remaining()];
m_chunkbody.get(m_buffer); // copy only the bytes that were actually received
m_chunkbody.clear(); // empty the buffer
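The decoding loop can also be written much closer to the RFC pseudocode. The following is a minimal sketch, not the implementation used above: the class ChunkedDecoder and its members are hypothetical names, it reads from a plain InputStream instead of BisBuffer, and it assumes the input starts right after the header block and follows the Chunked-Body grammar of section 2, with trailer fields placed before the final CRLF. It ignores chunk extensions and, unlike the code above, does not cap the total length.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical decoder that mirrors the RFC pseudocode step by step. */
final class ChunkedDecoder {
    final ByteArrayOutputStream body = new ByteArrayOutputStream(); // decoded entity-body
    final Map<String, String> trailers = new LinkedHashMap<>();     // trailer fields, in order

    static ChunkedDecoder decode(InputStream in) throws IOException {
        ChunkedDecoder result = new ChunkedDecoder();
        int size = parseSize(readLine(in));          // read chunk-size, extension and CRLF
        while (size > 0) {                           // not the last-chunk
            byte[] data = new byte[size];
            int off = 0;
            while (off < size) {                     // read exactly chunk-size bytes
                int n = in.read(data, off, size - off);
                if (n < 0) throw new EOFException("stream ended inside chunk-data");
                off += n;
            }
            result.body.write(data, 0, size);        // append chunk-data to entity-body
            if (!readLine(in).isEmpty()) {
                throw new IOException("chunk-data not followed by CRLF");
            }
            size = parseSize(readLine(in));          // header of the next chunk
        }
        // Trailer: zero or more "name: value" lines, ended by an empty line
        for (String line = readLine(in); !line.isEmpty(); line = readLine(in)) {
            int colon = line.indexOf(':');
            if (colon < 0) throw new IOException("malformed trailer: " + line);
            result.trailers.put(line.substring(0, colon).trim(),
                                line.substring(colon + 1).trim());
        }
        return result;
    }

    /** Parses the hexadecimal chunk-size and drops any chunk-extension. */
    private static int parseSize(String line) throws IOException {
        int semi = line.indexOf(';');
        String hex = (semi < 0 ? line : line.substring(0, semi)).trim();
        try {
            int size = Integer.parseInt(hex, 16);
            if (size < 0) throw new NumberFormatException("negative");
            return size;
        } catch (NumberFormatException e) {
            throw new IOException("bad chunk-size: " + hex, e);
        }
    }

    /** Reads up to the next LF and returns the line without CR or LF. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int c = in.read(); c != '\n'; c = in.read()) {
            if (c < 0) throw new EOFException("stream ended inside a line");
            if (c != '\r') buf.write(c);
        }
        return new String(buf.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

After decode returns, body.size() is the value that the pseudocode stores in Content-Length.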
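4. Assembling chunked data on the sending side
First, compare the header of an ordinary HTTP message with the header of a chunked HTTP message. An ordinary HTTP header looks like this:
POST xxx HTTP/1.1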
Accept-Language: en-us
Accept: */*
Host: xxx.xxx
User-Agent: xxx HTTP Client
Content-Length: 1024
Header of a chunked HTTP message:
POST xxx HTTP/1.1
Accept-Language: en-us
Accept: */*
Host: xxx.xxx
User-Agent: xxx HTTP Client
Transfer-Encoding: chunked
Trailer: data_over_limit
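As shown above, the ordinary HTTP header carries the length, while the chunked header does not. The receiver learns the actual length of the transferred
content only after it has parsed all of the chunk data.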
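The choice between the two header styles can be sketched in a few lines. This is only an illustration: the class BodyFraming and its methods are hypothetical, and the receiver-side method assumes the header fields were already parsed into a map keyed by lower-case field name. In HTTP/1.1, when both fields are present, Transfer-Encoding takes precedence over Content-Length, which is why it is checked first.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper that decides how a message body is delimited. */
final class BodyFraming {
    enum Mode { CHUNKED, FIXED_LENGTH, UNKNOWN }

    /** Sender side: a negative length means the length is not known in advance. */
    static String framingHeader(long knownLength) {
        return knownLength >= 0
                ? "Content-Length: " + knownLength + "\r\n"
                : "Transfer-Encoding: chunked\r\n";
    }

    /** Receiver side: headers are assumed to be keyed by lower-case field name. */
    static Mode modeOf(Map<String, String> headers) {
        String te = headers.get("transfer-encoding");
        if (te != null && te.toLowerCase(Locale.ROOT).contains("chunked")) {
            return Mode.CHUNKED;       // body must be decoded chunk by chunk
        }
        if (headers.containsKey("content-length")) {
            return Mode.FIXED_LENGTH;  // body length is given up front
        }
        return Mode.UNKNOWN;
    }
}
```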
The header can be assembled in Java as follows:

// Requires java.nio.charset.StandardCharsets
public byte[] createHttpHeader() {
    StringBuilder sb = new StringBuilder(100);
    sb.append("POST xxx HTTP/1.1").append("\r\n");
    sb.append("Accept-Language: en-us").append("\r\n");
    sb.append("Accept: */*").append("\r\n");
    sb.append("Host: xxx.xxx").append("\r\n");
    sb.append("User-Agent: xxx HTTP Client").append("\r\n");
    sb.append("Transfer-Encoding: chunked").append("\r\n");
    sb.append("Trailer: data_over_limit").append("\r\n");
    sb.append("\r\n"); // a blank line marks the end of the header
    return sb.toString().getBytes(StandardCharsets.US_ASCII);
}
The trailer in Java:

// Requires java.nio.charset.StandardCharsets
public byte[] createChunkedTrailer() {
    StringBuilder sb = new StringBuilder(100);
    sb.append("data_over_limit: true\r\n");
    return sb.toString().getBytes(StandardCharsets.US_ASCII);
}
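Assembling the content into chunks in Java:

// Requires java.io.IOException, java.io.InputStream, java.io.OutputStream
// and ChunkedOutputStream from httpclient-3.0.1.jar; socket is a field of the sending class.
public void send(InputStream fileInputStream) throws IOException {
    OutputStream requestStream = socket.getOutputStream();
    ChunkedOutputStream chunkedBodyStream = new ChunkedOutputStream(requestStream);
    int chunkSize = 2048;
    requestStream.write(createHttpHeader());

    byte[] buf = new byte[chunkSize];
    int read;
    int size = 0; // total number of body bytes sent
    while ((read = fileInputStream.read(buf)) != -1) {
        chunkedBodyStream.write(buf, 0, read); // pass each block to the chunked stream
        size += read;
    }
    chunkedBodyStream.finish(); // ends the chunk data
    requestStream.write(createChunkedTrailer()); // the decoder in section 3 expects the trailer here
    requestStream.flush();
}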
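A word about ChunkedOutputStream: it is a class from httpclient-3.0.1.jar, and its source code is available. The implementation is concise
and worth reading if you are interested; the httpclient source can be obtained online.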
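For readers who want to see the idea without the library, here is a minimal sketch of a stream that frames every write as one chunk. It is not the httpclient implementation: the class SimpleChunkedOutputStream and its finish(String...) method are hypothetical, it does no buffering, and it writes trailer lines before the final CRLF as the grammar in section 2 describes.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical, unbuffered chunked writer: each write call becomes one chunk. */
final class SimpleChunkedOutputStream extends FilterOutputStream {
    private static final byte[] CRLF = {'\r', '\n'};

    SimpleChunkedOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return; // a zero-length chunk would be read as the last-chunk
        }
        // chunk = chunk-size CRLF chunk-data CRLF, with chunk-size in hexadecimal
        out.write(Integer.toHexString(len).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(buf, off, len);
        out.write(CRLF);
    }

    /** Writes the last-chunk, the given trailer lines ("name: value"), and the final CRLF. */
    void finish(String... trailerLines) throws IOException {
        out.write('0');
        out.write(CRLF);
        for (String line : trailerLines) {
            out.write(line.getBytes(StandardCharsets.US_ASCII));
            out.write(CRLF);
        }
        out.write(CRLF);
        out.flush();
    }
}
```

Writing the five bytes of "hello" and then calling finish("data_over_limit: true") produces one chunk, the last-chunk, the trailer line and the closing blank line. Buffering small writes into larger chunks is a natural refinement that this sketch leaves out.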
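Once the content has been assembled, it can be sent over the socket to the receiving end, where the code in section 3 can parse it. Simple, isn't it?
After reading this, you should be able to use it yourself.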