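As mentioned in the earlier post on Gin, Gin is far more efficient than Martini, and the reason is that it uses the httprouter routing package (see the httprouter source on GitHub). Today I took a quick look at how httprouter works: it uses a radix tree (a compressed prefix tree) to manage request URLs. The details follow.
###1. Basic structure of httprouter
In httprouter, each HTTP method has its own tree: all GET requests are handled by one tree, and likewise for POST. So first, here is what the Router struct looks like: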
type Router struct {
// This radix tree is the most important structure.
// Routes are separated by method, and each method has its own radix tree.
trees map[string]*node
// Enables automatic redirection if the current route can't be matched but a
// handler for the path with (without) the trailing slash exists.
// For example if /foo/ is requested but a route only exists for /foo, the
// client is redirected to /foo with http status code 301 for GET requests
// and 307 for all other request methods.
// When /foo/ has no match, whether to redirect to /foo
RedirectTrailingSlash bool
// If enabled, the router tries to fix the current request path, if no
// handle is registered for it.
// First superfluous path elements like ../ or // are removed.
// Afterwards the router does a case-insensitive lookup of the cleaned path.
// If a handle can be found for this route, the router makes a redirection
// to the corrected path with status code 301 for GET requests and 307 for
// all other request methods.
// For example /FOO and /..//Foo could be redirected to /foo.
// RedirectTrailingSlash is independent of this option.
// Whether the router may fix up the request path
RedirectFixedPath bool
// If enabled, the router checks if another method is allowed for the
// current route, if the current request can not be routed.
// If this is the case, the request is answered with 'Method Not Allowed'
// and HTTP status code 405.
// If no other Method is allowed, the request is delegated to the NotFound
// handler.
// If the request cannot be routed, check whether another method matches this route
HandleMethodNotAllowed bool
// If enabled, the router automatically replies to OPTIONS requests.
// Custom OPTIONS handlers take priority over automatic replies.
// Whether to answer OPTIONS automatically; note that manually registered OPTIONS handlers take priority
HandleOPTIONS bool
// Configurable http.Handler which is called when no matching route is
// found. If it is not set, http.NotFound is used.
// Called when nothing matches. If it is not set, http.NotFound is used
NotFound http.Handler
// Configurable http.Handler which is called when a request
// cannot be routed and HandleMethodNotAllowed is true.
// If it is not set, http.Error with http.StatusMethodNotAllowed is used.
// The "Allow" header with allowed request methods is set before the handler
// is called.
// Used when nothing matches and HandleMethodNotAllowed=true
MethodNotAllowed http.Handler
// Function to handle panics recovered from http handlers.
// It should be used to generate a error page and return the http error code
// 500 (Internal Server Error).
// The handler can be used to keep your server from crashing because of
// unrecovered panics.
// Handler invoked for recovered panics
PanicHandler func(http.ResponseWriter, *http.Request, interface{})
}
In the struct above, trees map[string]*node represents a forest: it holds a GET tree, a POST tree, and so on.
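To make the "one tree per method" idea concrete, here is a minimal sketch. It is not httprouter's code: routeTable, add and lookup are hypothetical names, and a plain map stands in for each radix tree. It only shows that a lookup never touches routes registered under another method.

```go
package main

import "fmt"

// routeTable is a hypothetical stand-in for Router.trees: one independent
// lookup structure per HTTP method. A plain map replaces the radix tree here.
type routeTable map[string]map[string]string

// add registers handlerName for path under the given method.
func (t routeTable) add(method, path, handlerName string) {
	if t[method] == nil {
		t[method] = make(map[string]string)
	}
	t[method][path] = handlerName
}

// lookup only consults the table that belongs to the request method.
func (t routeTable) lookup(method, path string) (string, bool) {
	name, ok := t[method][path]
	return name, ok
}

func main() {
	t := routeTable{}
	t.add("GET", "/search/", "func1")
	t.add("POST", "/search/", "func7")

	name, ok := t.lookup("GET", "/search/")
	fmt.Println(name, ok)

	// No DELETE table exists, so this lookup simply reports a miss.
	name, ok = t.lookup("DELETE", "/search/")
	fmt.Println(name, ok)
}
```

Keeping the methods apart means a GET lookup never has to skip over POST routes, which is presumably why the real Router keys its trees by method.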
Each tree is simply a prefix tree. Here is a diagram borrowed from GitHub:
Suppose the diagram shows a GET tree; then the following GET routes have been registered:
GET("/search/", func1)
GET("/support/", func2)
GET("/blog/:post/", func3)
GET("/about-us/", func4)
GET("/about-us/team/", func5)
GET("/contact/", func6)
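Note that the tree is divided by common prefix. For example, search and support share the prefix s, so s becomes a separate parent node, and that s node has no handle. For /about-us/ and /about-us/team/, the former is the parent of the latter, but it also has a handle of its own; that is the difference between the two cases.
Overall, both node creation and lookup descend the tree level by level. The tree node structure is explained next: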
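The split by common prefix can be checked with a few lines of Go. This is only a sketch: commonPrefixLen is a helper name I made up, although the loop is the same byte-by-byte comparison addRoute performs.

```go
package main

import "fmt"

// commonPrefixLen returns how many leading bytes a and b share.
func commonPrefixLen(a, b string) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	i := 0
	for i < n && a[i] == b[i] {
		i++
	}
	return i
}

func main() {
	a, b := "/search/", "/support/"
	i := commonPrefixLen(a, b)
	// The shared part becomes the parent, the remainders become children.
	fmt.Printf("parent=%q children=%q,%q\n", a[:i], a[i:], b[i:])
	// prints: parent="/s" children="earch/","upport/"
}
```

With the leading slash included, the shared prefix is "/s"; in the diagram the "/" already sits on the root node, so only "s" appears on the parent.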
type node struct {
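// the URL path fragment stored on this node
// e.g. for search and support in the diagram, the shared parent node has path="s"
// and the two nodes below it have paths "earch" and "upport"
path string
// whether this node's child is a wildcard (param or catchAll) node;
// in the diagram, the node directly above :post has wildChild=true
wildChild bool
// node type, one of static, root, param, catchAll
// static: the default type for ordinary nodes, such as the parent s split off above
// root: the first node inserted into an empty tree
// param: a ':name' parameter node
// catchAll: a '*name' node
nType nodeType
// the maximum number of parameters on paths through this node
maxParams uint8
// parallel to children[]: holds the first character of each branch
// e.g. for search and support, the s node has indices "eu",
// meaning two branches whose first letters are e and u
indices string
// child nodes
children []*node
// the handler registered on this node
handle Handle
// counts the routes registered through this node; incrementChildPrio uses it
// to reorder children so that busier branches are tried first
priority uint32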
priority uint32
}
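The relationship between indices and children is easy to miss, so here is a small sketch of it. miniNode and childFor are hypothetical names, and the struct keeps only the three fields involved.

```go
package main

import "fmt"

// miniNode is a cut-down, hypothetical version of node.
type miniNode struct {
	path     string
	indices  string // indices[i] is the first byte of children[i].path
	children []*miniNode
}

// childFor returns the child whose path starts with c, or nil if none does.
func (n *miniNode) childFor(c byte) *miniNode {
	for i := 0; i < len(n.indices); i++ {
		if n.indices[i] == c {
			return n.children[i]
		}
	}
	return nil
}

func main() {
	s := &miniNode{
		path:     "s",
		indices:  "eu",
		children: []*miniNode{{path: "earch/"}, {path: "upport/"}},
	}
	if child := s.childFor('u'); child != nil {
		fmt.Println(child.path) // prints: upport/
	}
}
```

Scanning a short string of first bytes is enough to pick a branch, so no child path has to be compared in full until the walk has already committed to it.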
###2. Building the tree
Building the tree mainly involves two functions, addRoute and insertChild.
First, addRoute:
// addRoute adds a node with the given handle to the path.
// Not concurrency-safe!
// adds a node to the tree
func (n *node) addRoute(path string, handle Handle) {
fullPath := path
n.priority++
numParams := countParams(path)
// non-empty tree
// this method tree already contains nodes
if len(n.path) > 0 || len(n.children) > 0 {
walk:
for {
// Update maxParams of the current node
// update the maximum parameter count of the current node
if numParams > n.maxParams {
n.maxParams = numParams
}
// Find the longest common prefix.
// This also implies that the common prefix contains no ':' or '*'
// since the existing key can't contain those chars.
// find the longest common prefix
i := 0
max := min(len(path), len(n.path))
// advance while the bytes are equal
for i < max && path[i] == n.path[i] {
i++
}
// Split edge
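// only the leading part of n.path matched, e.g. /search exists and /support arrives:
// /s is pulled out as the parent node and the children become earch and upport
if i < len(n.path) {
// the part of the original path after i becomes a child of the part before i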
child := node{
path: n.path[i:],
wildChild: n.wildChild,
nType: static,
indices: n.indices,
children: n.children,
handle: n.handle,
priority: n.priority - 1,
}
// Update maxParams (max of all children)
// update the maximum parameter count
for i := range child.children {
if child.children[i].maxParams > child.maxParams {
child.maxParams = child.children[i].maxParams
}
}
// the current node's only child is now the tail that was just split off
n.children = []*node{&child}
// []byte for proper unicode char conversion, see #65
n.indices = string([]byte{n.path[i]})
// the node's path becomes the first i bytes
n.path = path[:i]
n.handle = nil
n.wildChild = false
}
// Make new node a child of this node
// then insert the incoming node as a child of the (new) parent node
if i < len(path) {
// the part after i becomes the path, i.e. upport in the support example
path = path[i:]
// n has a wildcard child (':' or '*')
if n.wildChild {
n = n.children[0]
n.priority++
// Update maxParams of the child node
if numParams > n.maxParams {
n.maxParams = numParams
}
numParams--
// Check if the wildcard matches
// e.g. /blog/:ppp vs /blog/:ppppppp: a longer wildcard name must be detected
if len(path) >= len(n.path) && n.path == path[:len(n.path)] {
// check for longer wildcard, e.g. :name and :names
if len(n.path) >= len(path) || path[len(n.path)] == '/' {
continue walk
}
}
panic("path segment '" + path +
"' conflicts with existing wildcard '" + n.path +
"' in path '" + fullPath + "'")
}
c := path[0]
// slash after param
if n.nType == param && c == '/' && len(n.children) == 1 {
n = n.children[0]
n.priority++
continue walk
}
// Check if a child with the next path byte exists
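// check whether a branch starting with this byte already exists, e.g. search and support share their first character
for i := 0; i < len(n.indices); i++ {
// found the branch whose first byte matches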
if c == n.indices[i] {
i = n.incrementChildPrio(i)
n = n.children[i]
continue walk
}
}
// Otherwise insert it
// create a new node
if c != ':' && c != '*' {
// []byte for proper unicode char conversion, see #65
// record the first byte and append it to indices
n.indices += string([]byte{c})
child := &node{
maxParams: numParams,
}
// append the child node
n.children = append(n.children, child)
n.incrementChildPrio(len(n.indices) - 1)
n = child
}
// insert the node
n.insertChild(numParams, path, fullPath, handle)
return
// the path matches this node exactly, so only the handle needs to be set
// this is allowed only if the node has no handle yet; otherwise it panics
} else if i == len(path) { // Make node a (in-path) leaf
if n.handle != nil {
panic("a handle is already registered for path '" + fullPath + "'")
}
n.handle = handle
}
return
}
} else { // Empty tree
// the tree is empty, so insert the node directly
n.insertChild(numParams, path, fullPath, handle)
// the node type is root
n.nType = root
}
}
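The "Split edge" step is the heart of addRoute, so here it is isolated in a runnable sketch. edgeNode and splitAt are hypothetical names; wildcards, priorities and maxParams are left out, and the handle is a plain string.

```go
package main

import "fmt"

// edgeNode is a hypothetical, static-only version of node.
type edgeNode struct {
	path     string
	indices  string
	children []*edgeNode
	handle   string
}

// splitAt keeps path[:i] on n and pushes path[i:], together with the old
// children and handle, down into a single new child.
func (n *edgeNode) splitAt(i int) {
	child := &edgeNode{
		path:     n.path[i:],
		indices:  n.indices,
		children: n.children,
		handle:   n.handle,
	}
	n.children = []*edgeNode{child}
	n.indices = string([]byte{n.path[i]})
	n.path = n.path[:i]
	n.handle = "" // the shortened prefix no longer has a handler of its own
}

func main() {
	n := &edgeNode{path: "/search/", handle: "func1"}

	// "/support/" shares the first two bytes with "/search/".
	n.splitAt(2)

	// Attach the remainder of the new route as a second branch.
	n.indices += "u"
	n.children = append(n.children, &edgeNode{path: "upport/", handle: "func2"})

	fmt.Printf("%q indices=%q\n", n.path, n.indices)
	for _, c := range n.children {
		fmt.Printf("  %q -> %s\n", c.path, c.handle)
	}
}
```

After the split, the parent holds "/s" with indices "eu", which is exactly the shape described for search and support earlier.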
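The job of addRoute is to find the position where the new node belongs. Note that when a common prefix exists, the existing node has to be split first, and only then is the new child inserted. Next, the insertChild function:
// insertChild inserts a node
// @1: number of parameters
// @2: the input path
// @3: the full path
// @4: the handle associated with the path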
func (n *node) insertChild(numParams uint8, path, fullPath string, handle Handle) {
var offset int // already handled bytes of the path
// find prefix until first wildcard (beginning with ':' or '*')
// scan the prefix until the first wildcard parameter is reached
for i, max := 0, len(path); numParams > 0; i++ {
c := path[i]
if c != ':' && c != '*' {
continue
}
// find wildcard end (either '/' or path end)
end := i + 1
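// the loop below rejects another '*' or ':' after a ':' or '*' in the same segment, which is an invalid pattern,
// and stops at the next /XXX segment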
for end < max && path[end] != '/' {
switch path[end] {
// the wildcard name must not contain ':' and '*'
case ':', '*':
panic("only one wildcard per path segment is allowed, has: '" +
path[i:] + "' in path '" + fullPath + "'")
default:
end++
}
}
// check if this Node existing children which would be
// unreachable if we insert the wildcard here
if len(n.children) > 0 {
panic("wildcard route '" + path[i:end] +
"' conflicts with existing children in path '" + fullPath + "'")
}
// check if the wildcard has a name
// the check below catches a bare ':' or '*' without a name, which is also invalid
if end-i < 2 {
panic("wildcards must be named with a non-empty name in path '" + fullPath + "'")
}
// a ':' matches a single parameter
if c == ':' { // param
// split path at the beginning of the wildcard
// the node's path is the part before the parameter; offset records how many bytes of path have been handled
if i > 0 {
n.path = path[offset:i]
offset = i
}
// build a child
child := &node{
nType: param,
maxParams: numParams,
}
n.children = []*node{child}
n.wildChild = true
// later iterations work on this new child node
n = child
// the new route passes through this node, so its priority is incremented
n.priority++
numParams--
// if the path doesn't end with the wildcard, then there
// will be another non-wildcard subpath starting with '/'
if end < max {
n.path = path[offset:end]
offset = end
child := &node{
maxParams: numParams,
priority: 1,
}
n.children = []*node{child}
n = child
}
} else { // catchAll
// '*' matches the entire rest of the path
if end != max || numParams > 1 {
panic("catch-all routes are only allowed at the end of the path in path '" + fullPath + "'")
}
if len(n.path) > 0 && n.path[len(n.path)-1] == '/' {
panic("catch-all conflicts with existing handle for the path segment root in path '" + fullPath + "'")
}
// currently fixed width 1 for '/'
i--
if path[i] != '/' {
panic("no / before catch-all in path '" + fullPath + "'")
}
n.path = path[offset:i]
// first node: catchAll node with empty path
child := &node{
wildChild: true,
nType: catchAll,
maxParams: 1,
}
n.children = []*node{child}
n.indices = string(path[i])
n = child
n.priority++
// second node: node holding the variable
child = &node{
path: path[i:],
nType: catchAll,
maxParams: 1,
handle: handle,
priority: 1,
}
n.children = []*node{child}
return
}
}
// insert remaining path part and handle to the leaf
n.path = path[offset:]
n.handle = handle
}
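insertChild splits the path itself at each wildcard: the static part before a wildcard, the wildcard segment, and whatever follows are stored as separate nodes, forming a chain in the tree. Note the difference between ':' and '*' in parameter matching: the former matches a single segment, while the latter matches the whole rest of the path. For the details, see the comments in the code.
###3. Looking up a path
Lookup simply matches the path of each child in turn, walking until the end of the path is reached.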
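To see the difference between ':' and '*' without building a tree, here is a flat pattern matcher. It is only a sketch under my own assumptions: matchPattern is a hypothetical helper that does not exist in httprouter, and it rejects empty parameter values. It does copy one detail from the code above, namely that the catch-all value keeps the '/' in front of it.

```go
package main

import "fmt"

// matchPattern compares path against pattern byte by byte.
// ":name" captures up to the next '/'; "*name" captures the rest of the path.
func matchPattern(pattern, path string) (map[string]string, bool) {
	params := make(map[string]string)
	i, j := 0, 0
	for i < len(pattern) {
		switch pattern[i] {
		case ':':
			// The parameter name ends at the next '/' in the pattern.
			end := i + 1
			for end < len(pattern) && pattern[end] != '/' {
				end++
			}
			// The value ends at the next '/' in the path.
			vEnd := j
			for vEnd < len(path) && path[vEnd] != '/' {
				vEnd++
			}
			if vEnd == j {
				return nil, false // empty value: treated as a miss in this sketch
			}
			params[pattern[i+1:end]] = path[j:vEnd]
			i, j = end, vEnd
		case '*':
			if j == 0 || j > len(path) {
				return nil, false
			}
			// Keep the '/' in front of the catch-all as part of the value.
			params[pattern[i+1:]] = path[j-1:]
			return params, true
		default:
			if j >= len(path) || pattern[i] != path[j] {
				return nil, false
			}
			i++
			j++
		}
	}
	return params, j == len(path)
}

func main() {
	fmt.Println(matchPattern("/blog/:post/", "/blog/go/"))
	fmt.Println(matchPattern("/files/*filepath", "/files/a/b.txt"))
	fmt.Println(matchPattern("/blog/:post/", "/blog/go/extra"))
}
```

The third call fails because ':post' stops at the next '/', leaving "extra" unmatched, whereas '*filepath' in the second call happily swallows the nested path.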
// Returns the handle registered with the given path (key). The values of
// wildcards are saved to a map.
// If no handle can be found, a TSR (trailing slash redirect) recommendation is
// made if a handle exists with an extra (without the) trailing slash for the
// given path.
func (n *node) getValue(path string) (handle Handle, p Params, tsr bool) {
walk: // outer loop for walking the tree
for {
// the remaining path is longer than this node's path, so the end has not been reached yet
if len(path) > len(n.path) {
// the leading part must equal this node's path
if path[:len(n.path)] == n.path {
path = path[len(n.path):]
// If this node does not have a wildcard (param or catchAll)
// child, we can just look up the next child node and continue
// to walk down the tree
// if there is no wildcard child, just follow the matching branch to the next node
if !n.wildChild {
c := path[0]
// find the branch by its first byte => find the child
for i := 0; i < len(n.indices); i++ {
if c == n.indices[i] {
n = n.children[i]
continue walk
}
}
// Nothing found.
// We can recommend to redirect to the same URL without a
// trailing slash if a leaf exists for that path.
tsr = (path == "/" && n.handle != nil)
return
}
// handle wildcard child
// the wildcard child is handled below
n = n.children[0]
switch n.nType {
// an ordinary ':' node: scan to '/' or the end of the path to get the parameter
case param:
// find param end (either '/' or path end)
end := 0
for end < len(path) && path[end] != '/' {
end++
}
// extract the parameter
// save param value
if p == nil {
// lazy allocation
p = make(Params, 0, n.maxParams)
}
i := len(p)
p = p[:i+1] // expand slice within preallocated capacity
// record the key and the value
p[i].Key = n.path[1:]
p[i].Value = path[:end]
// we need to go deeper!
// if more of the path remains after the parameter, keep walking
if end < len(path) {
if len(n.children) > 0 {
path = path[end:]
n = n.children[0]
continue walk
}
// ... but we can't
tsr = (len(path) == end+1)
return
}
// otherwise fetch the handle and return
if handle = n.handle; handle != nil {
return
} else if len(n.children) == 1 {
// No handle found. Check if a handle for this path + a
// trailing slash exists for TSR recommendation
n = n.children[0]
tsr = (n.path == "/" && n.handle != nil)
}
return
case catchAll:
// save param value
if p == nil {
// lazy allocation
p = make(Params, 0, n.maxParams)
}
i := len(p)
p = p[:i+1] // expand slice within preallocated capacity
p[i].Key = n.path[2:]
p[i].Value = path
handle = n.handle
return
default:
panic("invalid node type")
}
}
// the end of the path has been reached
} else if path == n.path {
// We should have reached the node containing the handle.
// Check if this node has a handle registered.
// check whether this node has a handle; if it does, return it directly
if handle = n.handle; handle != nil {
return
}
// decide below whether a trailing-slash redirect should be recommended
if path == "/" && n.wildChild && n.nType != root {
tsr = true
return
}
// No handle found. Check if a handle for this path + a
// trailing slash exists for trailing slash recommendation
// check whether path + '/' has a handle
for i := 0; i < len(n.indices); i++ {
if n.indices[i] == '/' {
n = n.children[i]
tsr = (len(n.path) == 1 && n.handle != nil) ||
(n.nType == catchAll && n.children[0].handle != nil)
return
}
}
return
}
// Nothing found. We can recommend to redirect to the same URL with an
// extra trailing slash if a leaf exists for that path
tsr = (path == "/") ||
(len(n.path) == len(path)+1 && n.path[len(path)] == '/' &&
path == n.path[:len(n.path)-1] && n.handle != nil)
return
}
}
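The tsr return value is easier to follow outside the tree walk. The sketch below is my own simplification: lookupWithTSR is a hypothetical name and a flat set of registered paths replaces the radix tree. It answers the same question as getValue: if the path itself has no handle, does the same path with or without a trailing slash have one?

```go
package main

import "fmt"

// lookupWithTSR reports whether path is registered and, if it is not,
// whether toggling the trailing slash would give a registered path.
func lookupWithTSR(routes map[string]bool, path string) (found, tsr bool) {
	if routes[path] {
		return true, false
	}
	if len(path) > 1 && path[len(path)-1] == '/' {
		// Try the same path without the trailing slash.
		return false, routes[path[:len(path)-1]]
	}
	// Try the same path with an extra trailing slash.
	return false, routes[path+"/"]
}

func main() {
	routes := map[string]bool{"/search/": true, "/contact": true}
	for _, p := range []string{"/search/", "/search", "/contact/", "/nothing"} {
		found, tsr := lookupWithTSR(routes, p)
		fmt.Printf("%-10s found=%v tsr=%v\n", p, found, tsr)
	}
}
```

According to the Router comments earlier, when tsr is true and RedirectTrailingSlash is enabled, the client is redirected with status 301 for GET requests and 307 for all other methods.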
