dimfeld · jxskiss · Oct 30, 2021 · Oct 31, 2021 · Oct 31, 2021 · Nov 7, 2021
diff --git a/README.md b/README.md
@@ -3,7 +3,7 @@ httptreemux  [![Build Status](https://travis-ci.org/dimfeld/httptreemux.png?bran
 
 High-speed, flexible, tree-based HTTP router for Go.
 
-This is inspired by [Julien Schmidt's httprouter](https://www.github.com/julienschmidt/httprouter), in that it uses a patricia tree, but the implementation is rather different. Specifically, the routing rules are relaxed so that a single path segment may be a wildcard in one route and a static token in another. This gives a nice combination of high performance with a lot of convenience in designing the routing patterns. In [benchmarks](https://github.com/julienschmidt/go-http-routing-benchmark), httptreemux is close to, but slightly slower than, httprouter.
+This is inspired by [Julien Schmidt's httprouter](https://www.github.com/julienschmidt/httprouter), in that it uses a patricia tree, but the implementation is rather different. Specifically, the routing rules are relaxed so that a single path segment may be a wildcard in one route and a static token in another. It also supports regexp routes, which are checked after static and wildcard routes. This gives a nice combination of high performance with a lot of convenience in designing the routing patterns. In [benchmarks](https://github.com/julienschmidt/go-http-routing-benchmark), httptreemux is close to, but slightly slower than, httprouter.
 
 Release notes may be found using the [Github releases tab](https://github.com/dimfeld/httptreemux/releases). Version numbers are compatible with the [Semantic Versioning 2.0.0](http://semver.org/) convention, and a new release is made after every change to the code.
 
@@ -15,7 +15,7 @@ When using Go Modules, import this repository with `import "github.com/dimfeld/h
 There are a lot of good routers out there. But looking at the ones that were really lightweight, I couldn't quite get something that fit with the route patterns I wanted. The code itself is simple enough, so I spent an evening writing this.
 
 ## Handler
-The handler is a simple function with the prototype `func(w http.ResponseWriter, r *http.Request, params map[string]string)`. The params argument contains the parameters parsed from wildcards and catch-alls in the URL, as described below. This type is aliased as httptreemux.HandlerFunc.
+The handler is a simple function with the prototype `func(w http.ResponseWriter, r *http.Request, params map[string]string)`. The params argument contains the parameters parsed from wildcards, catch-alls, and regexp named capturing groups in the URL, as described below. This type is aliased as httptreemux.HandlerFunc.
 
 ### Using http.HandlerFunc
 Due to the inclusion of the [context](https://godoc.org/context) package as of Go 1.7, `httptreemux` now supports handlers of type [http.HandlerFunc](https://godoc.org/net/http#HandlerFunc). There are two ways to enable this support.
@@ -81,13 +81,14 @@ http.ListenAndServe(":8080", router)
 
 
 ## Routing Rules
-The syntax here is also modeled after httprouter. Each variable in a path may match on one segment only, except for an optional catch-all variable at the end of the URL.
+The syntax here is also modeled after httprouter. Each variable in a path may match on one segment only, except for an optional catch-all variable or a regular expression at the end of the URL.
 
 Some examples of valid URL patterns are:
 * `/post/all`
 * `/post/:postid`
 * `/post/:postid/page/:page`
 * `/post/:postid/:page`
+* `/images/~^(?P<category>\w+)-(?P<name>.+)$`
 * `/images/*path`
 * `/favicon.ico`
 * `/:year/:month/`
@@ -98,17 +99,21 @@ Note that all of the above URL patterns may exist concurrently in the router.
 
 Path elements starting with `:` indicate a wildcard in the path. A wildcard will only match on a single path segment. That is, the pattern `/post/:postid` will match on `/post/1` or `/post/1/`, but not `/post/1/2`.
 
+A path element starting with `~` is a regexp route: all text after the `~`, up to the end of the pattern, is treated as the regular expression. Regexp routes are checked after static and wildcard routes. Multiple regexp routes may be registered under the same prefix, and they are checked in registration order. Named capturing groups are passed to the handler as params.
+
 A path element starting with `*` is a catch-all, whose value will be a string containing all text in the URL matched by the wildcards. For example, with a pattern of `/images/*path` and a requested URL `images/abc/def`, path would contain `abc/def`. A catch-all path will not match an empty string, so in this example a separate route would need to be installed if you also want to match `/images/`.
 
-#### Using : and * in routing patterns
+#### Using : * and ~ in routing patterns
 
-The characters `:` and `*` can be used at the beginning of a path segment by escaping them with a backslash. A double backslash at the beginning of a segment is interpreted as a single backslash. These escapes are only checked at the very beginning of a path segment; they are not necessary or processed elsewhere in a token.
+The characters `:`, `*` and `~` can be used at the beginning of a path segment by escaping them with a backslash. A double backslash at the beginning of a segment is interpreted as a single backslash. These escapes are only checked at the very beginning of a path segment; they are not necessary or processed elsewhere in a token.
 
 ```go
-router.GET("/foo/\\*starToken", handler) // matches /foo/*starToken
+router.GET("/foo/\\*starToken", handler)     // matches /foo/*starToken
 router.GET("/foo/star*inTheMiddle", handler) // matches /foo/star*inTheMiddle
 router.GET("/foo/starBackslash\\*", handler) // matches /foo/starBackslash\*
-router.GET("/foo/\\\\*backslashWithStar") // matches /foo/\*backslashWithStar
+router.GET("/foo/\\\\*backslashWithStar", handler) // matches /foo/\*backslashWithStar
+router.GET("/foo/\\~tildeToken", handler)    // matches /foo/~tildeToken
+router.GET("/foo/tilde~inMiddle", handler)   // matches /foo/tilde~inMiddle
 ```
 
 ### Routing Groups
@@ -129,14 +134,16 @@ The priority rules in the router are simple.
 
 1. Static path segments take the highest priority. If a segment and its subtree are able to match the URL, that match is returned.
 2. Wildcards take second priority. For a particular wildcard to match, that wildcard and its subtree must match the URL.
-3. Finally, a catch-all rule will match when the earlier path segments have matched, and none of the static or wildcard conditions have matched. Catch-all rules must be at the end of a pattern.
+3. Regexp routes take third priority. Multiple regexp routes under the same prefix are checked in registration order, and the first one that matches the URL is returned.
+4. Finally, a catch-all rule will match when the earlier path segments have matched, and none of the static, wildcard, or regexp conditions have matched. Catch-all rules must be at the end of a pattern.
 
 So with the following patterns adapted from [simpleblog](https://www.github.com/dimfeld/simpleblog), we'll see certain matches:
 ```go
 router = httptreemux.New()
 router.GET("/:page", pageHandler)
 router.GET("/:year/:month/:post", postHandler)
 router.GET("/:year/:month", archiveHandler)
+router.GET(`/images/~^(?P<category>\w+)-(?P<name>.+)$`, imageHandler)
 router.GET("/images/*path", staticHandler)
 router.GET("/favicon.ico", staticHandler)
 ```
@@ -146,6 +153,7 @@ router.GET("/favicon.ico", staticHandler)
 - `/abc` will match `/:page`
 - `/2014/05` will match `/:year/:month`
 - `/2014/05/really-great-blog-post` will match `/:year/:month/:post`
+- `/images/cate1-Img1.jpg` will match `/images/~^(?P<category>\w+)-(?P<name>.+)$`, with params `category=cate1` and `name=Img1.jpg`.
 - `/images/CoolImage.gif` will match `/images/*path`
 - `/images/2014/05/MayImage.jpg` will also match `/images/*path`, with all the text after `/images` stored in the variable path.
 - `/favicon.ico` will match `/favicon.ico`
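
The regexp-before-catch-all rule for the `/images/` routes above can be sketched with the standard `regexp` package alone. This is a minimal illustration rather than httptreemux's own code: `matchImage` is a hypothetical helper, and it assumes the router hands the text after `/images/` to both routes.

```go
package main

import (
	"fmt"
	"regexp"
)

// imageRegexp is the pattern from the README example, without the leading "~".
var imageRegexp = regexp.MustCompile(`^(?P<category>\w+)-(?P<name>.+)$`)

// matchImage is a hypothetical helper, not part of httptreemux. It takes the
// text after "/images/" and reports which of the two README routes would win,
// trying the regexp route before the catch-all.
func matchImage(rest string) (route string, params map[string]string) {
	if m := imageRegexp.FindStringSubmatch(rest); m != nil {
		return "regexp", map[string]string{
			"category": m[imageRegexp.SubexpIndex("category")],
			"name":     m[imageRegexp.SubexpIndex("name")],
		}
	}
	if rest != "" { // a catch-all does not match an empty string
		return "catch-all", map[string]string{"path": rest}
	}
	return "", nil
}

func main() {
	for _, rest := range []string{"cate1-Img1.jpg", "CoolImage.gif", "2014/05/MayImage.jpg"} {
		route, params := matchImage(rest)
		fmt.Println(rest, "->", route, params)
	}
}
```

Only `cate1-Img1.jpg` contains the hyphen the regexp requires, so the other two paths fall through to the catch-all. `SubexpIndex` needs Go 1.15 or later.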

diff --git a/router.go b/router.go
@@ -192,16 +192,25 @@ func (t *TreeMux) lookup(w http.ResponseWriter, r *http.Request) (result LookupR
 
 	var paramMap map[string]string
 	if len(params) != 0 {
-		if len(params) != len(n.leafWildcardNames) {
-			// Need better behavior here. Should this be a panic?
-			panic(fmt.Sprintf("httptreemux parameter list length mismatch: %v, %v",
-				params, n.leafWildcardNames))
-		}
+		if n.isRegex {
+			paramMap = make(map[string]string)
+			for i, name := range n.regExpr.SubexpNames() {
+				if i > 0 && name != "" {
+					paramMap[name] = params[i]
+				}
+			}
+		} else {
+			if len(params) != len(n.leafWildcardNames) {
+				// Need better behavior here. Should this be a panic?
+				panic(fmt.Sprintf("httptreemux parameter list length mismatch: %v, %v",
+					params, n.leafWildcardNames))
+			}
 
-		paramMap = make(map[string]string)
-		numParams := len(params)
-		for index := 0; index < numParams; index++ {
-			paramMap[n.leafWildcardNames[numParams-index-1]] = params[index]
+			paramMap = make(map[string]string)
+			numParams := len(params)
+			for index := 0; index < numParams; index++ {
+				paramMap[n.leafWildcardNames[numParams-index-1]] = params[index]
+			}
 		}
 	}
 

diff --git a/router_test.go b/router_test.go
@@ -149,7 +149,6 @@ func testMethods(t *testing.T, newRequest RequestCreator, headCanUseGet bool, us
 	testMethod("HEAD", "HEAD")
 }
 
-
 func TestCaseInsensitiveRouting(t *testing.T) {
 	router := New()
 	// create case-insensitive route
@@ -1028,6 +1027,7 @@ func TestLookup(t *testing.T) {
 	router.POST("/user/dimfeld", simpleHandler)
 	router.GET("/abc/*", simpleHandler)
 	router.POST("/abc/*", simpleHandler)
+	router.GET(`/smith/~^(\w+)`, simpleHandler)
 
 	var tryLookup = func(method, path string, expectFound bool, expectCode int) {
 		r, _ := newRequest(method, path, nil)
@@ -1057,6 +1057,10 @@ func TestLookup(t *testing.T) {
 	tryLookup("PATCH", "/user/dimfeld", false, http.StatusMethodNotAllowed)
 	tryLookup("GET", "/abc/def/ghi", true, http.StatusOK)
 
+	tryLookup("GET", "/smith/something", true, http.StatusOK)
+	tryLookup("POST", "/smith/something", false, http.StatusNotFound)
+	tryLookup("GET", "/smith/***something", false, http.StatusNotFound)
+
 	router.RedirectBehavior = Redirect307
 	tryLookup("POST", "/user/dimfeld/", true, http.StatusTemporaryRedirect)
 }
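
The `/smith/***something` case returns 404 because the test pattern is anchored with `^`. Since `searchRegexChild` uses `FindStringSubmatch`, which is unanchored, a pattern without `^` would match anywhere in the remaining path. The sketch below shows the difference using only the standard library; `firstMatch` is a hypothetical helper, and it assumes the text after `/smith/` is what reaches the regexp.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch is a hypothetical helper. It returns the full text matched by
// pattern in rest and whether there was a match at all.
func firstMatch(pattern, rest string) (string, bool) {
	m := regexp.MustCompile(pattern).FindStringSubmatch(rest)
	if m == nil {
		return "", false
	}
	return m[0], true
}

func main() {
	// The anchored pattern rejects the leading "***"; the unanchored one
	// skips past it and matches the trailing word.
	for _, pattern := range []string{`^(\w+)`, `(\w+)`} {
		got, ok := firstMatch(pattern, "***something")
		fmt.Printf("%-8s matched=%v text=%q\n", pattern, ok, got)
	}
}
```

Anchoring regexp routes with `^` and `$`, as the README examples do, avoids accidental partial matches.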

diff --git a/tree.go b/tree.go
@@ -2,6 +2,7 @@ package httptreemux
 
 import (
 	"fmt"
+	"regexp"
 	"strings"
 )
 
@@ -20,10 +21,14 @@ type node struct {
 	// If none of the above match, then we use the catch-all, if applicable.
 	catchAllChild *node
 
+	regexChild []*node
+	regExpr    *regexp.Regexp
+
 	// Data for the node is below.
 
 	addSlash   bool
 	isCatchAll bool
+	isRegex    bool
 	// If true, the head handler was set implicitly, so let it also be set explicitly.
 	implicitHead bool
 	// If this node is the end of the URL, then call the handler, if applicable.
@@ -125,6 +130,19 @@ func (n *node) addPath(path string, wildcards []string, inStaticToken bool) *nod
 		n.catchAllChild.leafWildcardNames = wildcards
 
 		return n.catchAllChild
+
+	} else if c == '~' && !inStaticToken {
+		// Token starts with a ~: the rest of the path is the regular expression.
+		for _, child := range n.regexChild {
+			if path[1:] == child.path {
+				return child
+			}
+		}
+		re := regexp.MustCompile(path[1:])
+		child := &node{path: path[1:], isRegex: true, regExpr: re}
+		n.regexChild = append(n.regexChild, child)
+		return child
+
 	} else if c == ':' && !inStaticToken {
 		// Token starts with a :
 		thisToken = thisToken[1:]
@@ -148,7 +166,7 @@ func (n *node) addPath(path string, wildcards []string, inStaticToken bool) *nod
 
 		unescaped := false
 		if len(thisToken) >= 2 && !inStaticToken {
-			if thisToken[0] == '\\' && (thisToken[1] == '*' || thisToken[1] == ':' || thisToken[1] == '\\') {
+			if thisToken[0] == '\\' && (thisToken[1] == '*' || thisToken[1] == ':' || thisToken[1] == '~' || thisToken[1] == '\\') {
 				// The token starts with a character escaped by a backslash. Drop the backslash.
 				c = thisToken[1]
 				thisToken = thisToken[1:]
@@ -302,6 +320,14 @@ func (n *node) search(method, path string) (found *node, handler HandlerFunc, pa
 		}
 	}
 
+	if len(n.regexChild) > 0 {
+		// Test regexp routes in registration order.
+		child, handler, params := n.searchRegexChild(method, path)
+		if child != nil {
+			return child, handler, params
+		}
+	}
+
 	catchAllChild := n.catchAllChild
 	if catchAllChild != nil {
 		// Hit the catchall, so just assign the whole remaining path if it
@@ -317,12 +343,29 @@ func (n *node) search(method, path string) (found *node, handler HandlerFunc, pa
 
 			return catchAllChild, handler, []string{unescaped}
 		}
-
 	}
 
 	return found, handler, params
 }
 
+func (n *node) searchRegexChild(method, path string) (found *node, handler HandlerFunc, params []string) {
+	for _, child := range n.regexChild {
+		re := child.regExpr
+		match := re.FindStringSubmatch(path)
+		if len(match) == 0 {
+			continue
+		}
+		handler = child.leafHandler[method]
+		if handler != nil {
+			params = match
+			return child, handler, params
+		}
+
+		// No handler is registered for this method; try the next regexp.
+	}
+	return nil, nil, nil
+}
+
 func (n *node) dumpTree(prefix, nodeType string) string {
 	line := fmt.Sprintf("%s %02d %s%s [%d] %v wildcards %v\n", prefix, n.priority, nodeType, n.path,
 		len(n.staticChild), n.leafHandler, n.leafWildcardNames)
@@ -333,6 +376,9 @@ func (n *node) dumpTree(prefix, nodeType string) string {
 	if n.wildcardChild != nil {
 		line += n.wildcardChild.dumpTree(prefix, ":")
 	}
+	for _, child := range n.regexChild {
+		line += child.dumpTree(prefix, "~")
+	}
 	if n.catchAllChild != nil {
 		line += n.catchAllChild.dumpTree(prefix, "*")
 	}
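
The behavior of `searchRegexChild` can be sketched in isolation: walk the regexps in registration order, and skip any that match the path but have no handler for the request method. The types below are hypothetical stand-ins for the real `node` (a set of methods replaces `leafHandler`), so treat this as an illustration of the loop, not the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// regexRoute is a hypothetical stand-in for a regexp node: a compiled
// pattern plus the set of methods that have a handler registered.
type regexRoute struct {
	re      *regexp.Regexp
	methods map[string]bool
}

// firstRegexRoute returns the index of the first route, in registration
// order, that matches rest and has a handler for method, or -1 if none does.
func firstRegexRoute(routes []regexRoute, method, rest string) int {
	for i, r := range routes {
		if !r.re.MatchString(rest) {
			continue
		}
		if r.methods[method] {
			return i
		}
		// Matched, but no handler for this method: keep looking.
	}
	return -1
}

// demoRoutes reuses the two patterns registered under /smith/abc/ in tree_test.go.
var demoRoutes = []regexRoute{
	{re: regexp.MustCompile(`^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`), methods: map[string]bool{"GET": true}},
	{re: regexp.MustCompile(`^some-.*second.*$`), methods: map[string]bool{"GET": true, "POST": true}},
}

func main() {
	// Both patterns match this path, so registration order decides for GET.
	fmt.Println(firstRegexRoute(demoRoutes, "GET", "some-holiday-202110-second"))
	// The first route has no POST handler, so the second one is used.
	fmt.Println(firstRegexRoute(demoRoutes, "POST", "some-holiday-202110-second"))
	// No pattern matches at all.
	fmt.Println(firstRegexRoute(demoRoutes, "GET", "third-no-specific-match"))
}
```

One consequence of this design, also visible in `TestLookup`, is that a path matched only by regexp routes lacking the request method falls through rather than producing a method-specific result from the regexp search itself.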

diff --git a/tree_test.go b/tree_test.go
@@ -2,6 +2,7 @@ package httptreemux
 
 import (
 	"net/http"
+	"reflect"
 	"testing"
 )
 
@@ -63,11 +64,24 @@ func testPath(t *testing.T, tree *node, path string, expectPath string, expected
 		t.Error("Node and subtree was\n" + n.dumpTree("", " "))
 	}
 
-	if expectedParams == nil {
-		if len(paramList) != 0 {
-			t.Errorf("Path %s expected no parameters, saw %v", path, paramList)
+	if n.isRegex {
+		paramMap := make(map[string]string)
+		for i, name := range n.regExpr.SubexpNames() {
+			if i > 0 && name != "" {
+				paramMap[name] = paramList[i]
+			}
+		}
+		if len(expectedParams) != len(paramMap) {
+			t.Errorf("Path %s expected %d parameters, got %v", path, len(expectedParams), paramMap)
+		} else if len(expectedParams) > 0 && !reflect.DeepEqual(paramMap, expectedParams) {
+			t.Errorf("Regexp params do not match: want %v, got %v", expectedParams, paramMap)
 		}
 	} else {
+		if expectedParams == nil {
+			if len(paramList) != 0 {
+				t.Errorf("Path %s expected no parameters, saw %v", path, paramList)
+			}
+		}
 		if len(paramList) != len(n.leafWildcardNames) {
 			t.Errorf("Got %d params back but node specifies %d",
 				len(paramList), len(n.leafWildcardNames))
@@ -140,12 +154,28 @@ func TestTree(t *testing.T) {
 	addPath(t, tree, "/plaster")
 	addPath(t, tree, "/users/:pk/:related")
 	addPath(t, tree, "/users/:id/updatePassword")
+	addPath(t, tree, `/users/~^.+$`) // paths not matched by the other /users/ routes fall through to this one
 	addPath(t, tree, "/:something/abc")
 	addPath(t, tree, "/:something/def")
 	addPath(t, tree, "/apples/ab:cde/:fg/*hi")
 	addPath(t, tree, "/apples/ab*cde/:fg/*hi")
 	addPath(t, tree, "/apples/ab\\*cde/:fg/*hi")
 	addPath(t, tree, "/apples/ab*dde")
+	addPath(t, tree, `/smith/~^.+$`)
+	addPath(t, tree, `/smith/abc/~^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`)
+	addPath(t, tree, `/smith/abc/~^some-.*second.*$`) // the previous regexp is checked first
+	addPath(t, tree, "/images3/*path")
+	addPath(t, tree, `/images3/~^(?P<category>\w+)-(?P<name>.+)$`)
+
+	testPath(t, tree, "/smith/abc/some-holiday-202110-hawaii-beach", `/smith/abc/~^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`,
+		map[string]string{"var1": "holiday", "var2": "202110"})
+	testPath(t, tree, "/smith/abc/some-matchthesecondregex", `/smith/abc/~^some-.*second.*$`, nil)
+	testPath(t, tree, "/smith/abc/third-no-specific-match", `/smith/~^.+$`, nil)
+	testPath(t, tree, "/users/123/something/notmatch", `/users/~^.+$`, nil)
+	testPath(t, tree, "/images3/categorya-img1.jpg", `/images3/~^(?P<category>\w+)-(?P<name>.+)$`,
+		map[string]string{"category": "categorya", "name": "img1.jpg"})
+	testPath(t, tree, "/images3/nocategoryimg.jpg", "/images3/*path",
+		map[string]string{"path": "nocategoryimg.jpg"})
 
 	testPath(t, tree, "/users/abc/updatePassword", "/users/:id/updatePassword",
 		map[string]string{"id": "abc"})
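
Both `lookup` and `testPath` build the params map for a regexp route with the same loop over `SubexpNames()`. The standalone sketch below reproduces that loop with the standard library to show why unnamed groups are dropped; `namedParams` and `someRegexp` are hypothetical names introduced here for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// someRegexp is one of the patterns from tree_test.go, without the leading "~".
// It has two named groups and one unnamed group.
var someRegexp = regexp.MustCompile(`^some-(?P<var1>\w+)-(?P<var2>\d+)-(.*)$`)

// namedParams is a hypothetical helper that mirrors the loop in lookup: it
// keeps only named capturing groups, keyed by group name.
func namedParams(re *regexp.Regexp, path string) map[string]string {
	match := re.FindStringSubmatch(path)
	if match == nil {
		return nil
	}
	params := make(map[string]string)
	// SubexpNames()[0] is always "" (the whole match), and unnamed groups
	// also report "", so both are skipped.
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			params[name] = match[i]
		}
	}
	return params
}

func main() {
	fmt.Println(namedParams(someRegexp, "some-holiday-202110-hawaii-beach"))
	// prints: map[var1:holiday var2:202110]
}
```

The trailing `(.*)` group captures `hawaii-beach` but has no name, so it never reaches the handler. Give a group a name if its value is needed in `params`.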