add support for regexp routes #86

Closed
wants to merge 4 commits into from
24 changes: 16 additions & 8 deletions README.md
@@ -3,7 +3,7 @@ httptreemux [![Build Status](https://travis-ci.org/dimfeld/httptreemux.png?bran

High-speed, flexible, tree-based HTTP router for Go.

This is inspired by [Julien Schmidt's httprouter](https://www.github.com/julienschmidt/httprouter), in that it uses a patricia tree, but the implementation is rather different. Specifically, the routing rules are relaxed so that a single path segment may be a wildcard in one route and a static token in another. This gives a nice combination of high performance with a lot of convenience in designing the routing patterns. In [benchmarks](https://github.com/julienschmidt/go-http-routing-benchmark), httptreemux is close to, but slightly slower than, httprouter.
This is inspired by [Julien Schmidt's httprouter](https://www.github.com/julienschmidt/httprouter), in that it uses a patricia tree, but the implementation is rather different. Specifically, the routing rules are relaxed so that a single path segment may be a wildcard in one route and a static token in another. It also supports regexp routes, which are checked after static and wildcard routes. This gives a nice combination of high performance with a lot of convenience in designing the routing patterns. In [benchmarks](https://github.com/julienschmidt/go-http-routing-benchmark), httptreemux is close to, but slightly slower than, httprouter.

Release notes may be found using the [Github releases tab](https://github.com/dimfeld/httptreemux/releases). Version numbers are compatible with the [Semantic Versioning 2.0.0](http://semver.org/) convention, and a new release is made after every change to the code.

@@ -15,7 +15,7 @@ When using Go Modules, import this repository with `import "github.com/dimfeld/h
There are a lot of good routers out there. But looking at the ones that were really lightweight, I couldn't quite get something that fit with the route patterns I wanted. The code itself is simple enough, so I spent an evening writing this.

## Handler
The handler is a simple function with the prototype `func(w http.ResponseWriter, r *http.Request, params map[string]string)`. The params argument contains the parameters parsed from wildcards and catch-alls in the URL, as described below. This type is aliased as httptreemux.HandlerFunc.
The handler is a simple function with the prototype `func(w http.ResponseWriter, r *http.Request, params map[string]string)`. The params argument contains the parameters parsed from wildcards, catch-alls, and regexp named capturing groups in the URL, as described below. This type is aliased as httptreemux.HandlerFunc.

### Using http.HandlerFunc
Due to the inclusion of the [context](https://godoc.org/context) package as of Go 1.7, `httptreemux` now supports handlers of type [http.HandlerFunc](https://godoc.org/net/http#HandlerFunc). There are two ways to enable this support.
@@ -81,13 +81,14 @@ http.ListenAndServe(":8080", router)


## Routing Rules
The syntax here is also modeled after httprouter. Each variable in a path may match on one segment only, except for an optional catch-all variable at the end of the URL.
The syntax here is also modeled after httprouter. Each variable in a path may match on one segment only, except for an optional catch-all variable or a regular expression at the end of the URL.

Some examples of valid URL patterns are:
* `/post/all`
* `/post/:postid`
* `/post/:postid/page/:page`
* `/post/:postid/:page`
* `/images/~^(?P<category>\w+)-(?P<name>.+)$`
* `/images/*path`
* `/favicon.ico`
* `/:year/:month/`
@@ -98,17 +99,21 @@ Note that all of the above URL patterns may exist concurrently in the router.

Path elements starting with `:` indicate a wildcard in the path. A wildcard will only match on a single path segment. That is, the pattern `/post/:postid` will match on `/post/1` or `/post/1/`, but not `/post/1/2`.

A path element starting with `~` is a regexp route; all text after the `~` is treated as the regular expression. Regexp routes are checked after static and wildcard routes. Multiple regexp routes may be registered under the same prefix, and they are checked in registration order. Named capturing groups are passed to the handler as params.

A path element starting with `*` is a catch-all, whose value will be a string containing all text in the URL matched by the wildcards. For example, with a pattern of `/images/*path` and a requested URL `images/abc/def`, path would contain `abc/def`. A catch-all path will not match an empty string, so in this example a separate route would need to be installed if you also want to match `/images/`.

#### Using : and * in routing patterns
#### Using : * and ~ in routing patterns

The characters `:` and `*` can be used at the beginning of a path segment by escaping them with a backslash. A double backslash at the beginning of a segment is interpreted as a single backslash. These escapes are only checked at the very beginning of a path segment; they are not necessary or processed elsewhere in a token.
The characters `:`, `*` and `~` can be used at the beginning of a path segment by escaping them with a backslash. A double backslash at the beginning of a segment is interpreted as a single backslash. These escapes are only checked at the very beginning of a path segment; they are not necessary or processed elsewhere in a token.

```go
router.GET("/foo/\\*starToken", handler) // matches /foo/*starToken
router.GET("/foo/\\*starToken", handler) // matches /foo/*starToken
router.GET("/foo/star*inTheMiddle", handler) // matches /foo/star*inTheMiddle
router.GET("/foo/starBackslash\\*", handler) // matches /foo/starBackslash\*
router.GET("/foo/\\\\*backslashWithStar") // matches /foo/\*backslashWithStar
router.GET("/foo/\\\\*backslashWithStar") // matches /foo/\*backslashWithStar
router.GET("/foo/\\~tildeToken", handler) // matches /foo/~tildeToken
router.GET("/foo/tilde~inMiddle", handler) // matches /foo/tilde~inMiddle
```
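
The escape rule above can be sketched in isolation. This is a simplified illustration, not the router's actual code: `unescapeLeading` is a hypothetical helper that looks only at the start of a single path segment, under the assumption that `~` is treated like `:` and `*`.

```go
package main

import "fmt"

// unescapeLeading (hypothetical helper) applies the leading-escape rule to one
// path segment. It returns the literal text of the segment and whether the
// segment starts with an unescaped special character (':', '*' or '~').
func unescapeLeading(segment string) (literal string, special bool) {
	if len(segment) >= 2 && segment[0] == '\\' {
		switch segment[1] {
		case '*', ':', '~', '\\':
			// Drop the escaping backslash; the rest is a static token.
			return segment[1:], false
		}
	}
	if len(segment) > 0 {
		switch segment[0] {
		case '*', ':', '~':
			return segment, true
		}
	}
	// Special characters elsewhere in the segment are plain text.
	return segment, false
}

func main() {
	for _, s := range []string{`\*starToken`, `star*inTheMiddle`, `\\*backslashWithStar`, `\~tildeToken`, `~^\w+$`} {
		literal, special := unescapeLeading(s)
		fmt.Printf("%-22s literal=%-22s special=%v\n", s, literal, special)
	}
}
```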

### Routing Groups
@@ -129,14 +134,16 @@ The priority rules in the router are simple.

1. Static path segments take the highest priority. If a segment and its subtree are able to match the URL, that match is returned.
2. Wildcards take second priority. For a particular wildcard to match, that wildcard and its subtree must match the URL.
3. Finally, a catch-all rule will match when the earlier path segments have matched, and none of the static or wildcard conditions have matched. Catch-all rules must be at the end of a pattern.
3. Regexp routes are checked after static and wildcard routes. Multiple regexp routes under the same prefix are checked in registration order; if a regexp route matches the URL, that match is returned.
4. Finally, a catch-all rule will match when the earlier path segments have matched, and none of the static, wildcard, or regexp conditions have matched. Catch-all rules must be at the end of a pattern.

So with the following patterns adapted from [simpleblog](https://www.github.com/dimfeld/simpleblog), we'll see certain matches:
```go
router = httptreemux.New()
router.GET("/:page", pageHandler)
router.GET("/:year/:month/:post", postHandler)
router.GET("/:year/:month", archiveHandler)
router.GET(`/images/~^(?P<category>\w+)-(?P<name>.+)$`, staticHandler)
router.GET("/images/*path", staticHandler)
router.GET("/favicon.ico", staticHandler)
```
@@ -146,6 +153,7 @@ router.GET("/favicon.ico", staticHandler)
- `/abc` will match `/:page`
- `/2014/05` will match `/:year/:month`
- `/2014/05/really-great-blog-post` will match `/:year/:month/:post`
- `/images/cate1-Img1.jpg` will match `/images/~^(?P<category>\w+)-(?P<name>.+)$`; the params will be `category=cate1` and `name=Img1.jpg`.
- `/images/CoolImage.gif` will match `/images/*path`
- `/images/2014/05/MayImage.jpg` will also match `/images/*path`, with all the text after `/images` stored in the variable path.
- `/favicon.ico` will match `/favicon.ico`
27 changes: 18 additions & 9 deletions router.go
@@ -192,16 +192,25 @@ func (t *TreeMux) lookup(w http.ResponseWriter, r *http.Request) (result LookupR

var paramMap map[string]string
if len(params) != 0 {
if len(params) != len(n.leafWildcardNames) {
// Need better behavior here. Should this be a panic?
panic(fmt.Sprintf("httptreemux parameter list length mismatch: %v, %v",
params, n.leafWildcardNames))
}
if n.isRegex {
paramMap = make(map[string]string)
for i, name := range n.regExpr.SubexpNames() {
if i > 0 && name != "" {
paramMap[name] = params[i]
}
}
} else {
if len(params) != len(n.leafWildcardNames) {
// Need better behavior here. Should this be a panic?
panic(fmt.Sprintf("httptreemux parameter list length mismatch: %v, %v",
params, n.leafWildcardNames))
}

paramMap = make(map[string]string)
numParams := len(params)
for index := 0; index < numParams; index++ {
paramMap[n.leafWildcardNames[numParams-index-1]] = params[index]
paramMap = make(map[string]string)
numParams := len(params)
for index := 0; index < numParams; index++ {
paramMap[n.leafWildcardNames[numParams-index-1]] = params[index]
}
}
}

6 changes: 5 additions & 1 deletion router_test.go
@@ -149,7 +149,6 @@ func testMethods(t *testing.T, newRequest RequestCreator, headCanUseGet bool, us
testMethod("HEAD", "HEAD")
}


func TestCaseInsensitiveRouting(t *testing.T) {
router := New()
// create case-insensitive route
@@ -1028,6 +1027,7 @@ func TestLookup(t *testing.T) {
router.POST("/user/dimfeld", simpleHandler)
router.GET("/abc/*", simpleHandler)
router.POST("/abc/*", simpleHandler)
router.GET(`/smith/~^(\w+)`, simpleHandler)

var tryLookup = func(method, path string, expectFound bool, expectCode int) {
r, _ := newRequest(method, path, nil)
@@ -1057,6 +1057,10 @@ func TestLookup(t *testing.T) {
tryLookup("PATCH", "/user/dimfeld", false, http.StatusMethodNotAllowed)
tryLookup("GET", "/abc/def/ghi", true, http.StatusOK)

tryLookup("GET", "/smith/something", true, http.StatusOK)
tryLookup("POST", "/smith/something", false, http.StatusNotFound)
tryLookup("GET", "/smith/***something", false, http.StatusNotFound)

router.RedirectBehavior = Redirect307
tryLookup("POST", "/user/dimfeld/", true, http.StatusTemporaryRedirect)
}
50 changes: 48 additions & 2 deletions tree.go
@@ -2,6 +2,7 @@ package httptreemux

import (
"fmt"
"regexp"
"strings"
)

@@ -20,10 +21,14 @@ type node struct {
// If none of the above match, then we use the catch-all, if applicable.
catchAllChild *node

regexChild []*node
regExpr *regexp.Regexp

// Data for the node is below.

addSlash bool
isCatchAll bool
isRegex bool
// If true, the head handler was set implicitly, so let it also be set explicitly.
implicitHead bool
// If this node is the end of the URL, then call the handler, if applicable.
@@ -125,6 +130,19 @@ func (n *node) addPath(path string, wildcards []string, inStaticToken bool) *nod
n.catchAllChild.leafWildcardNames = wildcards

return n.catchAllChild

} else if c == '~' && !inStaticToken {
thisToken = thisToken[1:]
for _, child := range n.regexChild {
if path[1:] == child.path {
return child
}
}
re := regexp.MustCompile(path[1:])
child := &node{path: path[1:], isRegex: true, regExpr: re}
n.regexChild = append(n.regexChild, child)
return child

} else if c == ':' && !inStaticToken {
// Token starts with a :
thisToken = thisToken[1:]
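
Review note on the hunk above: `regexp.MustCompile` panics when the expression does not compile, so an invalid pattern after `~` would panic at route registration. Below is a minimal sketch of validating the pattern with `regexp.Compile` and returning an error instead. `compileRoutePattern` is a hypothetical helper, not part of this PR or of httptreemux, and whether a panic or an error is preferable here is a design choice for the maintainers.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileRoutePattern (hypothetical helper) takes a path element such as
// "~^(\w+)", strips the leading '~', and returns the compiled expression,
// or an error rather than a panic when the expression is invalid.
func compileRoutePattern(element string) (*regexp.Regexp, error) {
	if !strings.HasPrefix(element, "~") {
		return nil, fmt.Errorf("not a regexp route element: %q", element)
	}
	re, err := regexp.Compile(element[1:])
	if err != nil {
		return nil, fmt.Errorf("invalid regexp route %q: %w", element, err)
	}
	return re, nil
}

func main() {
	// Unbalanced parenthesis: reported as an error instead of panicking.
	if _, err := compileRoutePattern(`~^(\w+`); err != nil {
		fmt.Println("rejected:", err)
	}
	re, err := compileRoutePattern(`~^(\w+)`)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(re.MatchString("something"), re.MatchString("***something"))
}
```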
@@ -148,7 +166,7 @@ func (n *node) addPath(path string, wildcards []string, inStaticToken bool) *nod

unescaped := false
if len(thisToken) >= 2 && !inStaticToken {
if thisToken[0] == '\\' && (thisToken[1] == '*' || thisToken[1] == ':' || thisToken[1] == '\\') {
if thisToken[0] == '\\' && (thisToken[1] == '*' || thisToken[1] == ':' || thisToken[1] == '~' || thisToken[1] == '\\') {
// The token starts with a character escaped by a backslash. Drop the backslash.
c = thisToken[1]
thisToken = thisToken[1:]
@@ -302,6 +320,14 @@ func (n *node) search(method, path string) (found *node, handler HandlerFunc, pa
}
}

if len(n.regexChild) > 0 {
// Test regex routes in their registering order.
child, handler, params := n.searchRegexChild(method, path)
if child != nil {
return child, handler, params
}
}

catchAllChild := n.catchAllChild
if catchAllChild != nil {
// Hit the catchall, so just assign the whole remaining path if it
@@ -317,12 +343,29 @@ func (n *node) search(method, path string) (found *node, handler HandlerFunc, pa

return catchAllChild, handler, []string{unescaped}
}

}

return found, handler, params
}

func (n *node) searchRegexChild(method, path string) (found *node, handler HandlerFunc, params []string) {
for _, child := range n.regexChild {
re := child.regExpr
match := re.FindStringSubmatch(path)
if len(match) == 0 {
continue
}
handler = child.leafHandler[method]
if handler != nil {
params = match
return child, handler, params
}

// Else no handler is registered for this method, ignore it.
}
return nil, nil, nil
}

func (n *node) dumpTree(prefix, nodeType string) string {
line := fmt.Sprintf("%s %02d %s%s [%d] %v wildcards %v\n", prefix, n.priority, nodeType, n.path,
len(n.staticChild), n.leafHandler, n.leafWildcardNames)
@@ -333,6 +376,9 @@ func (n *node) dumpTree(prefix, nodeType string) string {
if n.wildcardChild != nil {
line += n.wildcardChild.dumpTree(prefix, ":")
}
for _, child := range n.regexChild {
line += child.dumpTree(prefix, "~")
}
if n.catchAllChild != nil {
line += n.catchAllChild.dumpTree(prefix, "*")
}
36 changes: 33 additions & 3 deletions tree_test.go
@@ -2,6 +2,7 @@ package httptreemux

import (
"net/http"
"reflect"
"testing"
)

@@ -63,11 +64,24 @@ func testPath(t *testing.T, tree *node, path string, expectPath string, expected
t.Error("Node and subtree was\n" + n.dumpTree("", " "))
}

if expectedParams == nil {
if len(paramList) != 0 {
t.Errorf("Path %s expected no parameters, saw %v", path, paramList)
if n.isRegex {
paramMap := make(map[string]string)
for i, name := range n.regExpr.SubexpNames() {
if i > 0 && name != "" {
paramMap[name] = paramList[i]
}
}
if len(expectedParams) != len(paramMap) {
t.Errorf("Path %s expected %d parameters, got %v", path, len(expectedParams), paramMap)
} else if len(expectedParams) > 0 && !reflect.DeepEqual(paramMap, expectedParams) {
t.Errorf("Regexp params do not match, want %v, but got %v", expectedParams, paramMap)
}
} else {
if expectedParams == nil {
if len(paramList) != 0 {
t.Errorf("Path %s expected no parameters, saw %v", path, paramList)
}
}
if len(paramList) != len(n.leafWildcardNames) {
t.Errorf("Got %d params back but node specifies %d",
len(paramList), len(n.leafWildcardNames))
@@ -140,12 +154,28 @@ func TestTree(t *testing.T) {
addPath(t, tree, "/plaster")
addPath(t, tree, "/users/:pk/:related")
addPath(t, tree, "/users/:id/updatePassword")
addPath(t, tree, `/users/~^.+$`) // not matched by others go to this route
addPath(t, tree, "/:something/abc")
addPath(t, tree, "/:something/def")
addPath(t, tree, "/apples/ab:cde/:fg/*hi")
addPath(t, tree, "/apples/ab*cde/:fg/*hi")
addPath(t, tree, "/apples/ab\\*cde/:fg/*hi")
addPath(t, tree, "/apples/ab*dde")
addPath(t, tree, `/smith/~^.+$`)
addPath(t, tree, `/smith/abc/~^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`)
addPath(t, tree, `/smith/abc/~^some-.*second.*$`) // the previous one will be matched first
addPath(t, tree, "/images3/*path")
addPath(t, tree, `/images3/~^(?P<category>\w+)-(?P<name>.+)$`)

testPath(t, tree, "/smith/abc/some-holiday-202110-hawaii-beach", `/smith/abc/~^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`,
map[string]string{"var1": "holiday", "var2": "202110"})
testPath(t, tree, "/smith/abc/some-matchthesecondregex", `/smith/abc/~^some-.*second.*$`, nil)
testPath(t, tree, "/smith/abc/third-no-specific-match", `/smith/~^.+$`, nil)
testPath(t, tree, "/users/123/something/notmatch", `/users/~^.+$`, nil)
testPath(t, tree, "/images3/categorya-img1.jpg", `/images3/~^(?P<category>\w+)-(?P<name>.+)$`,
map[string]string{"category": "categorya", "name": "img1.jpg"})
testPath(t, tree, "/images3/nocategoryimg.jpg", "/images3/*path",
map[string]string{"path": "nocategoryimg.jpg"})

testPath(t, tree, "/users/abc/updatePassword", "/users/:id/updatePassword",
map[string]string{"id": "abc"})