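Dissecting Tomcat (3): The Connector (1)

Part 1: Overview

This section is a summary of Chapter 3, "Connector", of How Tomcat Works (《深度剖析Tomcat》).

As is well known, Catalina has two main modules: the connector and the container. This section develops HttpServer2 from the previous section into HttpConnector, a connector that creates better request and response objects.

Before reading on, it is best to download the book's code from my GitHub and to find a copy of the book online.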

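The previous section used a plain Request object because we did not know the type of the request. In this section, when the request is an HTTP request, the Request object has to carry more information. The focus here is therefore how the connector (more precisely, its processor) parses the HTTP request headers and cookies so that a servlet can obtain headers, cookies, parameter names/values and so on, instead of only parsing the URI. The getWriter method of the Response class from section 2 is also improved so that it works correctly. With these improvements you get a complete response from the previous section's PrimitiveServlet and can run the more complex ModernServlet.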

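The HttpServer class from the previous section is split into two classes, HttpConnector and HttpProcessor; Request is replaced by HttpRequest, and Response by HttpResponse. HttpServer was responsible both for waiting for HTTP requests and for creating the request and response objects. In this section's application, waiting for HTTP requests is handed to an HttpConnector instance, and creating the request and response objects is handed to an HttpProcessor instance.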

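An HttpRequest object is cast to an HttpServletRequest instance and passed to the service method of the invoked servlet. Each HttpRequest instance therefore needs its fields populated appropriately so that the servlet can use them. The values to assign to the HttpRequest object include the URI, the query string, parameters, cookies and other headers (see the code). Because the connector does not know which values the invoked servlet will need (a servlet we write ourselves may be complex and may need every parameter in the request), the connector must parse every value that can be obtained from the HTTP request.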

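In addition, the server no longer simply calls its own await method; the connector is started in its own thread (see the start method of HttpConnector).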

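There are four core classes (classes already covered in the previous section that reappear here are not listed):

HttpConnector.java: waits for HTTP requests and hands each one to an HttpProcessor
HttpProcessor.java: creates the request and response objects
SocketInputStream.java: reads the request line and the headers directly from the input stream
RequestUtil.java: in this section, parses the cookies in the request headers
HttpHeader.java, HttpRequest.java, HttpRequestLine.java and HttpResponse.java encapsulate what is needed for a request header, the request, the request line and the response respectively. HttpRequestFacade.java and HttpResponseFacade.java are again written for safety and are implemented much as in the previous section (see that section for what the safety problem is). ServletProcessor.java is only modified to use the corresponding classes (for example, RequestFacade becomes HttpRequestFacade).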

Part 2: Code walkthrough

HttpConnector.java

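Waits for HTTP requests
Creates an HttpProcessor instance for each request
Calls the process method of HttpProcessor

This is the simplest class in this section and needs little explanation; what differs from the previous section is noted in the comments: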

package ex03.pyrmont.connector.http;

import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class HttpConnector implements Runnable {

    boolean stopped;
    private String scheme = "http";

    // Returns the scheme; this is an HttpConnector, so it is "http"
    public String getScheme() {
        return scheme;
    }

    public void run() {
        ServerSocket serverSocket = null;
        int port = 8080;
        try {
            serverSocket = new ServerSocket(port, 1,
                    InetAddress.getByName("127.0.0.1"));
        } catch (IOException e) {
            e.printStackTrace();
            System.exit(1);
        }
        while (!stopped) {
            // Accept the next incoming connection from the server socket
            Socket socket = null;
            try {
                socket = serverSocket.accept();
            } catch (Exception e) {
                continue;
            }
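            // Besides being started in a thread, this is the other difference from the previous section:
            // the connector does not decide the request type itself but hands the socket to an HttpProcessor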
            HttpProcessor processor = new HttpProcessor(this);
            processor.process(socket);
        }
    }

    public void start() {
        Thread thread = new Thread(this);
        thread.start();
    }
}

HttpProcessor.java

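The process method of the HttpProcessor class receives the socket of an incoming HTTP request and does the following:

Creates an HttpRequest object.
Creates an HttpResponse object.
Parses the first line and the headers of the HTTP request and populates the HttpRequest object.
Passes the HttpRequest and HttpResponse objects to a ServletProcessor or a StaticResourceProcessor.

The class therefore holds HttpRequest, HttpRequestLine and HttpResponse instances.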

package ex03.pyrmont.connector.http;

import ex03.pyrmont.ServletProcessor;
import ex03.pyrmont.StaticResourceProcessor;

import java.net.Socket;
import java.io.OutputStream;
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.Cookie;

import org.apache.catalina.util.RequestUtil;
import org.apache.catalina.util.StringManager;

public class HttpProcessor {

    public HttpProcessor(HttpConnector connector) {
        this.connector = connector;
    }
    // The HttpConnector that owns this HttpProcessor, passed in through the constructor
    private HttpConnector connector = null;
    private HttpRequest request;
    private HttpRequestLine requestLine = new HttpRequestLine();
    private HttpResponse response;

    protected String method = null;
    protected String queryString = null;

    // Tomcat's error-message handling; covered later in this series
    protected StringManager sm = StringManager
            .getManager("ex03.pyrmont.connector.http");

    // Called by HttpConnector; socket is the connection to the client that sent the request
    public void process(Socket socket) {

        // SocketInputStream is covered later in this article
        SocketInputStream input = null;
        OutputStream output = null;
        try {
            // Wrap the socket's input stream.
            // Its readHeader(HttpHeader) and readRequestLine(HttpRequestLine) methods
            // fill in the header and request-line objects directly from the stream
            input = new SocketInputStream(socket.getInputStream(), 2048);
            output = socket.getOutputStream();

            // Create the HttpRequest object
            request = new HttpRequest(input);

            // Create the HttpResponse object
            response = new HttpResponse(output);
            response.setRequest(request);
            // Response headers sent to the browser can be set here
            response.setHeader("Server", "Pyrmont Servlet Container");

            // The focus of this section: parse the request line and the request headers
            parseRequest(input, output);
            parseHeaders(input);

            // Same if/else as in the previous section: choose the processor based on the URI
            if (request.getRequestURI().startsWith("/servlet/")) {
                ServletProcessor processor = new ServletProcessor();
                processor.process(request, response);
            } else {
                StaticResourceProcessor processor = new StaticResourceProcessor();
                processor.process(request, response);
            }

            // Close the socket
            socket.close();
            // Note: this application has no shutdown mechanism yet; it has to be killed manually
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

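    // Parses the request headers.
    // This is a simplified version of the same method in org.apache.catalina.connector.http.HttpProcessor.
    // Every header is stored via addHeader; only a few "simple" ones ("cookie", "content-length", "content-type") get extra handling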
    private void parseHeaders(SocketInputStream input) throws IOException,
            ServletException {
        // Loop until the end of the headers is reached
        while (true) {
            HttpHeader header = new HttpHeader();
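            // Fill the HttpHeader from the SocketInputStream (covered later in this article).
            // Each call to readHeader inside this while loop extracts one name/value pair,
            // which is where name and value below come from.
            // parseRequest uses the same approach, so it is not explained again there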
            input.readHeader(header);

            // Check nameEnd and valueEnd of the HttpHeader to see whether another header was read from the input stream
            if (header.nameEnd == 0) {
                if (header.valueEnd == 0) {
                    return;
                } else {
                    throw new ServletException(
                            sm.getString("httpProcessor.parseHeaders.colon"));
                }
            }

            String name = new String(header.name, 0, header.nameEnd);
            String value = new String(header.value, 0, header.valueEnd);

            // Store the header name and value in the HttpRequest object,
            // e.g. name "content-length" with value "10164"
            request.addHeader(name, value);

            // Check for a cookie header (cookies travel in the request headers), e.g.
            // Cookie: BD_UPN=1d314753; ispeed_lsm=4; sugstore=1; BAIDUID=3E664426E867095427DD59:FG=1; BIDUPSID=3E664426E827DD59; PSTM=1440774226; BDUSS=Ex4NkJ0bEF0WTgwMwAAAA; ATS_PSID=1
            if (name.equals("cookie")) {
                // Cookies need extra handling; see RequestUtil later in this article
                Cookie cookies[] = RequestUtil.parseCookieHeader(value);
                for (int i = 0; i < cookies.length; i++) {
                    if (cookies[i].getName().equals("jsessionid")) {
                        // Override anything requested in the URL
                        if (!request.isRequestedSessionIdFromCookie()) {
                            // Accept only the first session id cookie
                            request.setRequestedSessionId(cookies[i].getValue());
                            request.setRequestedSessionCookie(true);
                            request.setRequestedSessionURL(false);
                        }
                    }
                    request.addCookie(cookies[i]);
                }
                // Check for a content-length header
            } else if (name.equals("content-length")) {
                int n = -1;
                try {
                    // Convert the value to an int and store it in the HttpRequest object
                    n = Integer.parseInt(value);
                } catch (Exception e) {
                    throw new ServletException(
                            sm.getString("httpProcessor.parseHeaders.contentLength"));
                }
                request.setContentLength(n);
            } else if (name.equals("content-type")) {
                // content-type is stored as is
                request.setContentType(value);
            }
        } // end while
    }

    // Parses the request line,
    // e.g. GET /myApp/ModernServlet?userName=tarzan&password=pwd HTTP/1.1
    private void parseRequest(SocketInputStream input, OutputStream output)
            throws IOException, ServletException {

        // Fill requestLine directly from the SocketInputStream
        input.readRequestLine(requestLine);

        // The request method, e.g. GET
        String method = new String(requestLine.method, 0, requestLine.methodEnd);
        // The URI is not extracted yet: in e.g. /myApp/ModernServlet?userName=tarzan&password=pwd
        // it is followed by a query string that has to be split off first
        String uri = null;
        // The protocol version, e.g. HTTP/1.1
        String protocol = new String(requestLine.protocol, 0,
                requestLine.protocolEnd);

        // Invalid request line: the method or the URI is missing
        if (method.length() < 1) {
            throw new ServletException("Missing HTTP request method");
        } else if (requestLine.uriEnd < 1) {
            throw new ServletException("Missing HTTP request URI");
        }
        // Split the second part of the request line into the query string and the URI
        int question = requestLine.indexOf("?");
        if (question >= 0) { // a query string is present
            // Everything after "?" is the query string, e.g. userName=tarzan&password=pwd; store it in the HttpRequest object
            request.setQueryString(new String(requestLine.uri, question + 1,
                    requestLine.uriEnd - question - 1));
            // Everything before "?" is the URI
            uri = new String(requestLine.uri, 0, question);
        } else {
            // No query string
            request.setQueryString(null);
            uri = new String(requestLine.uri, 0, requestLine.uriEnd);
        }

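        // Handle a URI that does not start with "/", i.e. one that is not a relative resource:
        // for an absolute URI, strip the scheme and host name and keep only the path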
        if (!uri.startsWith("/")) {
            int pos = uri.indexOf("://");
            // Parsing out protocol and host name
            if (pos != -1) {
                pos = uri.indexOf('/', pos + 3);
                if (pos == -1) {
                    uri = "";
                } else {
                    uri = uri.substring(pos);
                }
            }
        }

        // Check for a jsessionid in the URI and extract it if present
        String match = ";jsessionid=";
        int semicolon = uri.indexOf(match);
        if (semicolon >= 0) {
            String rest = uri.substring(semicolon + match.length());
            int semicolon2 = rest.indexOf(';');
            if (semicolon2 >= 0) {
                // Store the session id in the HttpRequest object
                request.setRequestedSessionId(rest.substring(0, semicolon2));
                rest = rest.substring(semicolon2);
            } else {
                request.setRequestedSessionId(rest);
                rest = "";
            }
            // A jsessionid found here means the session identifier came from the URL rather than from a cookie, so pass true
            request.setRequestedSessionURL(true);
            uri = uri.substring(0, semicolon) + rest;
        } else {
            request.setRequestedSessionId(null);
            request.setRequestedSessionURL(false);
        }

        // Correct an "abnormal" URI
        String normalizedUri = normalize(uri);

        // Set the corresponding request properties
        ((HttpRequest) request).setMethod(method);
        request.setProtocol(protocol);
        if (normalizedUri != null) {
            ((HttpRequest) request).setRequestURI(normalizedUri);
        } else {
            ((HttpRequest) request).setRequestURI(uri);
        }

        if (normalizedUri == null) {
            throw new ServletException("Invalid URI: '" + uri + "'");
        }
    }

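    // Corrects an "abnormal" URI; for example, every \ is replaced by /.
    // URL encoding is involved here: an encoded character is % followed by the character's ASCII code in hexadecimal.
    // Returns null if the URI cannot be corrected; otherwise returns the same or the corrected URI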
    protected String normalize(String path) {
        if (path == null)
            return null;
        // Create a place for the normalized path
        String normalized = path;

        // If the URI starts with /%7E or /%7e, replace those four characters with /~
        // (%7E decodes to ~)
        if (normalized.startsWith("/%7E") || normalized.startsWith("/%7e"))
            normalized = "/~" + normalized.substring(4);

        // The encoded forms of these characters must not appear in the URI:
        // %25 -> %   %2F -> /   %2E -> .   %5C -> \
        // Finding any one of them means the URI is invalid
        if ((normalized.indexOf("%25") >= 0)
                || (normalized.indexOf("%2F") >= 0)
                || (normalized.indexOf("%2E") >= 0)
                || (normalized.indexOf("%5C") >= 0)
                || (normalized.indexOf("%2f") >= 0)
                || (normalized.indexOf("%2e") >= 0)
                || (normalized.indexOf("%5c") >= 0)) {
            return null;
        }
        // A URI that is exactly "/." is corrected to "/",
        // e.g. www.cnblogs.com/. can be corrected
        if (normalized.equals("/."))
            return "/";

        // Replace \ with /; '\\' in Java source is a single backslash, the first \ being the escape character
        if (normalized.indexOf('\\') >= 0)
            normalized = normalized.replace('\\', '/');
        // Add a leading / if the URI does not start with one
        if (!normalized.startsWith("/"))
            normalized = "/" + normalized;
        // Collapse every "//" into a single "/",
        // e.g. http://www.cnblogs.com/lzb1096101803/p//4797948.html becomes
        // http://www.cnblogs.com/lzb1096101803/p/4797948.html
        while (true) {
            int index = normalized.indexOf("//");
            if (index < 0)
                break;
            normalized = normalized.substring(0, index)
                    + normalized.substring(index + 1);
        }
        // Replace every "/./" with "/"
        while (true) {
            int index = normalized.indexOf("/./");
            if (index < 0)
                break;
            normalized = normalized.substring(0, index)
                    + normalized.substring(index + 2);
        }
        // Resolve every "/../" by removing it together with the preceding path segment
        while (true) {
            int index = normalized.indexOf("/../");
            if (index < 0)
                break;
            if (index == 0)
                return (null); // Trying to go outside our context
            int index2 = normalized.lastIndexOf('/', index - 1);
            normalized = normalized.substring(0, index2)
                    + normalized.substring(index + 3);
        }
        // A URI containing "/..." (three or more dots) is considered uncorrectable
        if (normalized.indexOf("/...") >= 0)
            return (null);

        // Return the corrected URI
        return (normalized);

    }

}

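HttpProcessor.java in detail:

(I actually recommend reading my commented code alongside the source on GitHub; the text below is the book's explanation and may help you follow the overall flow.)

Parsing the request line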

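The process method of HttpProcessor calls the private method parseRequest to parse the request line, that is, the first line of an HTTP request. Here is an example of a request line:

GET /myApp/ModernServlet?userName=tarzan&password=pwd HTTP/1.1

The second part of the request line is the URI plus an optional query string. In the example above, the URI is:

/myApp/ModernServlet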

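Anything after the question mark is the query string. In the example, the query string is:

userName=tarzan&password=pwd

The query string can contain zero or more parameters. In the example above there are two parameter name/value pairs: userName/tarzan and password/pwd. In servlet/JSP programming, the parameter name jsessionid is used to carry a session identifier. The session identifier is usually embedded as a cookie, but the programmer can choose to embed it in the URL instead, for example when cookies are disabled in the browser.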

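When parseRequest is called by the process method of HttpProcessor, the request variable refers to an HttpRequest instance. parseRequest parses the request line to obtain several values and assigns them to the HttpRequest object. It first calls the readRequestLine method of SocketInputStream, passing requestLine, the HttpRequestLine instance held by HttpProcessor; this tells SocketInputStream to populate that instance. Next, parseRequest obtains the method, the URI and the protocol from the request line. The URI may be followed by a query string, separated from it by a question mark, so parseRequest first tries to extract the query string and calls setQueryString to populate the HttpRequest object.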
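
The splitting described above can be sketched in a few lines. This is only an illustration that works on a String instead of the char buffers of HttpRequestLine; the class RequestLineSketch and its parse method are hypothetical names and are not part of the book's code.

```java
/** Hypothetical sketch: splits an HTTP request line into method, URI, query string and protocol. */
final class RequestLineSketch {
    final String method;
    final String uri;
    final String queryString; // null when the request line has no '?'
    final String protocol;

    private RequestLineSketch(String method, String uri, String queryString, String protocol) {
        this.method = method;
        this.uri = uri;
        this.queryString = queryString;
        this.protocol = protocol;
    }

    static RequestLineSketch parse(String line) {
        // The three parts of a request line are separated by single spaces.
        String[] parts = line.trim().split(" ");
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        String target = parts[1];
        int question = target.indexOf('?');
        if (question >= 0) {
            // Everything before '?' is the URI, everything after it is the query string.
            return new RequestLineSketch(parts[0], target.substring(0, question),
                    target.substring(question + 1), parts[2]);
        }
        return new RequestLineSketch(parts[0], target, null, parts[2]);
    }

    public static void main(String[] args) {
        RequestLineSketch r = parse("GET /myApp/ModernServlet?userName=tarzan&password=pwd HTTP/1.1");
        // prints: GET | /myApp/ModernServlet | userName=tarzan&password=pwd | HTTP/1.1
        System.out.println(r.method + " | " + r.uri + " | " + r.queryString + " | " + r.protocol);
    }
}
```

The real parseRequest avoids creating intermediate strings by reading offsets such as methodEnd and uriEnd from HttpRequestLine; the sketch trades that for readability.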

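In most cases the URI points to a relative resource, but it can also be an absolute URI that includes the scheme and host name. In that case parseRequest strips everything before the path.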
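
As a small worked example of the absolute-URI case, the following sketch mirrors the corresponding if block in parseRequest. The class name AbsoluteUriSketch, the method toPath and the sample host are assumptions made for illustration.

```java
/** Hypothetical sketch: reduces an absolute URI to its path, as parseRequest does. */
final class AbsoluteUriSketch {

    /** Returns the path part of an absolute URI; URIs starting with "/" are returned unchanged. */
    static String toPath(String uri) {
        if (uri.startsWith("/")) {
            return uri;
        }
        int pos = uri.indexOf("://");
        if (pos == -1) {
            return uri; // no scheme found: leave the value for later normalization
        }
        // Skip the scheme and the host name and look for the first '/' of the path.
        int slash = uri.indexOf('/', pos + 3);
        return slash == -1 ? "" : uri.substring(slash);
    }

    public static void main(String[] args) {
        System.out.println(toPath("http://localhost:8080/index.html")); // prints: /index.html
        System.out.println(toPath("/index.html"));                      // prints: /index.html
    }
}
```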

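The URL can also carry a session identifier, indicated by the jsessionid parameter name, so parseRequest checks for one as well. (In the code, the check looks for ";jsessionid=" in the URI after the query string has been split off.) If jsessionid is found, the method takes the session identifier and hands it to the HttpRequest instance by calling setRequestedSessionId. Finding jsessionid also means that the session identifier is carried in the URL rather than in a cookie, so true is passed to the request's setRequestedSessionURL method. Otherwise, false is passed to setRequestedSessionURL and null is passed to setRequestedSessionId.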
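
The extraction of the session identifier can be illustrated with a short, self-contained sketch. SessionIdSketch and extract are hypothetical names; the logic follows the ";jsessionid=" block of parseRequest, but returns the results in an array instead of calling setters on HttpRequest.

```java
/** Hypothetical sketch: pulls a ";jsessionid=" path parameter out of a URI. */
final class SessionIdSketch {
    private static final String MATCH = ";jsessionid=";

    /** Returns a two-element array: the URI without the session id, and the session id or null. */
    static String[] extract(String uri) {
        int start = uri.indexOf(MATCH);
        if (start < 0) {
            return new String[] { uri, null };
        }
        String rest = uri.substring(start + MATCH.length());
        // The session id ends at the next ';' or at the end of the URI.
        int end = rest.indexOf(';');
        String id = end >= 0 ? rest.substring(0, end) : rest;
        String tail = end >= 0 ? rest.substring(end) : "";
        return new String[] { uri.substring(0, start) + tail, id };
    }

    public static void main(String[] args) {
        String[] result = extract("/myApp/ModernServlet;jsessionid=ABC123");
        // prints: /myApp/ModernServlet -> ABC123
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```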

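At this point the jsessionid has been removed from uri. Next, parseRequest passes uri to the normalize method, which corrects an "abnormal" URI; for example, every occurrence of \ is replaced by /. If uri is well formed, or if the abnormality can be corrected, normalize returns the same or the corrected URI. If the URI cannot be corrected, it is considered invalid and normalize returns null. In that case parseRequest throws an exception at the end of the method.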
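
To make the effect of normalization concrete, here is a reduced sketch with a few sample inputs. It is not the full method: NormalizeSketch is a hypothetical class that keeps only the backslash, "//", "/./" and "/../" steps and leaves out the %7E rewrite, the rejection of encoded characters and the "/..." check.

```java
/** Hypothetical, reduced sketch of URI normalization (path-segment handling only). */
final class NormalizeSketch {

    static String normalize(String path) {
        if (path == null) {
            return null;
        }
        String n = path.replace('\\', '/');
        if (!n.startsWith("/")) {
            n = "/" + n;
        }
        // Collapse repeated slashes and "current directory" segments.
        while (n.contains("//")) {
            n = n.replace("//", "/");
        }
        while (n.contains("/./")) {
            n = n.replace("/./", "/");
        }
        // Resolve "parent directory" segments; refuse to climb above the root.
        int index;
        while ((index = n.indexOf("/../")) >= 0) {
            if (index == 0) {
                return null;
            }
            int parent = n.lastIndexOf('/', index - 1);
            n = n.substring(0, parent) + n.substring(index + 3);
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(normalize("/a/b/../c"));       // prints: /a/c
        System.out.println(normalize("\\a//b/./c"));      // prints: /a/b/c
        System.out.println(normalize("/../etc/passwd"));  // prints: null
    }
}
```

Returning null for a path that tries to leave the root is what lets the caller reject the request instead of serving a file outside the application.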

Parsing the headers

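The parseHeaders method contains a while loop that keeps reading headers from the SocketInputStream until there are no more. Each iteration starts by constructing an HttpHeader object and passing it to the readHeader method of SocketInputStream. You can then check the nameEnd and valueEnd fields of the HttpHeader instance to tell whether another header was read from the input stream: if both are 0, the headers have ended.

Once you have the header name and value, you add them to the headers HashMap by calling the addHeader method of the HttpRequest object.

Some headers also require certain properties to be set. For example, the value of the content-length header is what is returned when a servlet calls the getContentLength method of javax.servlet.ServletRequest, and the cookies contained in the cookie header are added to the cookie collection.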
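
The header handling described above might look like the following when reduced to plain strings. HeaderSketch and its members are hypothetical names. The sketch lower-cases header names because the code in this article compares against names such as "cookie" and "content-length"; that readHeader delivers lower-case names is an assumption here.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: stores every header and gives a few of them extra handling. */
final class HeaderSketch {
    final Map<String, String> headers = new LinkedHashMap<>();
    int contentLength = -1; // -1 means "not specified"
    String contentType;

    /** Adds one "Name: value" line. */
    void addHeaderLine(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Invalid header line: " + line);
        }
        String name = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        String value = line.substring(colon + 1).trim();
        headers.put(name, value);
        if (name.equals("content-length")) {
            try {
                contentLength = Integer.parseInt(value);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Bad content-length: " + value, e);
            }
        } else if (name.equals("content-type")) {
            contentType = value;
        }
    }

    /** Reads header lines until the first empty line, which ends the header section. */
    static HeaderSketch parse(String... lines) {
        HeaderSketch result = new HeaderSketch();
        for (String line : lines) {
            if (line.isEmpty()) {
                break;
            }
            result.addHeaderLine(line);
        }
        return result;
    }

    public static void main(String[] args) {
        HeaderSketch h = parse("Content-Length: 42", "Content-Type: text/html", "");
        System.out.println(h.contentLength + " " + h.contentType); // prints: 42 text/html
    }
}
```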

RequestUtil.java

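Purpose: parsing cookies

A cookie header looks like this: Cookie: userName=budi; password=pwd

Parsing cookies is simply a matter of splitting the string and putting each key and value into a Cookie object. The code is not critical, so it is not listed here.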
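
Since the listing is omitted, here is a minimal sketch of the idea. It is not the RequestUtil source: CookieSketch and SimpleCookie are hypothetical names, and SimpleCookie stands in for javax.servlet.http.Cookie so that the example runs without the servlet API on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splits the value of a Cookie header into name/value pairs. */
final class CookieSketch {

    /** Minimal name/value holder standing in for javax.servlet.http.Cookie. */
    static final class SimpleCookie {
        final String name;
        final String value;

        SimpleCookie(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Parses a header value such as "userName=budi; password=pwd". */
    static List<SimpleCookie> parseCookieHeader(String header) {
        List<SimpleCookie> cookies = new ArrayList<>();
        if (header == null || header.isEmpty()) {
            return cookies;
        }
        for (String pair : header.split(";")) {
            // Split on the first '=' only, so values may themselves contain '='.
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip tokens without a name or without '='
            }
            String name = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (!name.isEmpty()) {
                cookies.add(new SimpleCookie(name, value));
            }
        }
        return cookies;
    }

    public static void main(String[] args) {
        for (SimpleCookie c : parseCookieHeader("userName=budi; password=pwd")) {
            System.out.println(c.name + " = " + c.value);
        }
    }
}
```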

SocketInputStream.java

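The previous section made no attempt to parse the request any further for its two applications. org.apache.catalina.connector.http.SocketInputStream provides methods for reading not only the request line but also the request headers. A SocketInputStream instance is constructed by passing an InputStream and an integer that gives the size of the buffer used by the instance.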

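There is no need to care too much about how SocketInputStream parses the request line and the headers. The main point of studying Tomcat is to understand the framework as a whole rather than every fine detail; I think it is enough to be clear about how most of the information in a request gets parsed.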

Part 3: Summary

The next section covers how Tomcat obtains the request parameters only when they are actually requested.

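Tomcat does not need to parse the query string or the HTTP request body right away; it can wait until the servlet reads a parameter. For that reason, the four parameter-reading methods of HttpRequest (getParameter, getParameterMap, getParameterNames and getParameterValues) each begin by calling parseParameters. Parsing the parameters causes the SocketInputStream to be read to the end of the byte stream, so it can only be done once; HttpRequest uses a boolean variable, parsed, to record whether it has already happened.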
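
The lazy-parsing idea can be sketched as follows. This is a simplified illustration, not the book's HttpRequest: LazyParamsSketch is a hypothetical class that only handles the query string, keeps a single value per name and does no URL decoding. The parseCount field exists only to show that parsing happens once.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: parameters are parsed on first access and cached afterwards. */
final class LazyParamsSketch {
    private final String queryString;
    private final Map<String, String> parameters = new HashMap<>();
    private boolean parsed = false;
    int parseCount = 0; // demonstration only: how many times parsing actually ran

    LazyParamsSketch(String queryString) {
        this.queryString = queryString;
    }

    /** Parses the query string once; later calls return immediately. */
    private void parseParameters() {
        if (parsed) {
            return;
        }
        parseCount++;
        if (queryString != null) {
            for (String pair : queryString.split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                String name = eq >= 0 ? pair.substring(0, eq) : pair;
                String value = eq >= 0 ? pair.substring(eq + 1) : "";
                parameters.put(name, value);
            }
        }
        parsed = true;
    }

    String getParameter(String name) {
        parseParameters(); // every parameter accessor starts with this call
        return parameters.get(name);
    }

    public static void main(String[] args) {
        LazyParamsSketch request = new LazyParamsSketch("userName=tarzan&password=pwd");
        System.out.println(request.getParameter("userName")); // prints: tarzan
        System.out.println(request.getParameter("password")); // prints: pwd
        System.out.println(request.parseCount);               // prints: 1
    }
}
```

A servlet that never asks for a parameter never pays for the parsing, which is the point of deferring it.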

It will also cover how Tomcat obtains its error messages and the internationalization this involves.

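Appendix

The corresponding code can be downloaded from my GitHub; copy it into Eclipse and open the code in the corresponding package.

If you run into compilation errors, they are probably caused by different JDK versions having different compilation requirements. They can be ignored, since the code is meant for study and research.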
