ZF-166: Improvements to RewriteRouter - Can now match routes like Ruby on Rails

Description

h3. Overview: I gave the Zend_Controller_Router_Route (Route.php) class of the RewriteRouter some new features for defining routes and capturing parameters. It can now match anything the Ruby on Rails router is capable of matching while maintaining 100% backwards compatibility with the current RewriteRouter.

h3. Benefits over the current Zend_Controller_Router_Route: * less overhead when defining a route. In my AdvancedRoute class __construct()'s only job is to store 3 variables into the object. No looping or processing occurs until match() or assemble() is called. In the current Zend_Controller_Router_Route class each call to __construct() performs 2 cases of typecasting, an explode, foreach loop containing 2 if statements, and 2 calls to substr(). Let's say you have a large site with 100 routes and the 90th route you defined is a match. Using AdvancedRoute you will only parse and process 10 routes but with Zend_Controller_Router_Route you parse all 100 routes every request whether or not you used them in addition to processing 10 routes. * {{:variables}} can be defined anywhere and in any combination in your route, not just in between {{/}}'s * {{:variables}} can be next to eachother * wildcards can be defined anywhere in your route and you can use more than one wildcard as opposed to Zend_Controller_Router_Route which only allows a single wildcard at the end of the route with a hardcoded var1/value1/var2/value2 format. * All of this comes at a cost of only {{.1}} (yes that is point one) millisecond per request. AdvancedRoute is 100% backwards compatible with the existing Zend_Controller_Router_Route so no code has to be changed.

h3. How To Install (updated to reflect RewriteRouter API change in SVN):

Create a folder {{Custom/Controller/Router}} inside your Zend Framework {{"library"}} folder. If you created the folders in the correct place you should see both {{"Zend"}} and {{"Custom"}} next to eachother in the {{"library"}} folder

Download the topmost {{AdvancedRoute.php}} from the attached files above and place it in your {{Custom/Controller/Router}} folder

h3. How To Use:

<?php
$router = new Zend_Controller_RewriteRouter();

// you can add routes like normal still
$router->addRoute('news', new Zend_Controller_Router_Route('news/:controller/:action/:id'));

// AdvancedRoute routes are backwards compatible so the same code works
$router->addRoute('news', new Custom_Controller_Router_AdvancedRoute('news/:controller/:action/:id'));

// but now you can also add routes which require advanced matching
$router->addRoute(
    'logViewer',
    new Custom_Controller_Router_AdvancedRoute(
        ':controller/:action/customer{:id}httpd-error.:ext',
        array(),
        array('id' => '\d+')
    )
);

NOTE: When naming {{:variables}} in your Advanced Routes use the PHP variable naming conventions. Basicly the name can only consist of the following characters: {{_, A-Z, 0-9, and ASCII characters from 127 through 255}}. For more information please look at http://php.net/variables

h3. Examples: In Zend_Controller_Router_Route, routes are limited to only one variable between each {{'/'}} in the url. Now you can define routes with variables anywhere such as:

$router->addRoute(
    'logViewer',
    new Custom_Controller_Router_AdvancedRoute(
        ':controller/:action/file:id.:ext',
        array(),
        array('id' => '\d+')
    )
);
matches: http://mysite.com/log/view/file206.xml
Array
(
    [controller] => log
    [action] => view
    [id] => 206
    [ext] => xml
)

If a variable in the route is ambiguous, you can escape it with {}'s:

$router->addRoute(
    'logViewer',
    new Custom_Controller_Router_AdvancedRoute(
        ':controller/:action/customer{:id}httpd-error.:ext',
        array(),
        array('id' => '\d+')
    )
);
matches: http://mysite.com/log/view/…
Array
(
    [controller] => log
    [action] => view
    [id] => 206
    [ext] => log
)

You can also use (multiple) wildcards to capture parameters:

$router->addRoute(
    'directions',
    new Custom_Controller_Router_AdvancedRoute(
        'search/:action/from/:from/to/:to',
        array('controller' => 'mapit'),
        array('from' => '.+', 'to' => '.+')
    )
);
matches: http://mysite.com/search/getDirections/from/1866 Euclid Ave/Cleveland/OH/44315/to/Rome/Italy
Array
(
    [controller] => mapit
    [action] => getDirections
    [from] => 1866%20Euclid%20Ave/Cleveland/OH/44315
    [to] => Rome/Italy
)

You can even put two variables next to eachother:

$router->addRoute(
    'onlineStore',
    new Custom_Controller_Router_AdvancedRoute(
        ':controller/:action/:model_id:modelnum',
        array(),
        array('model_id' => '[a-z]+', 'modelnum' => '\d+')
    )
);
matches: http://mysite.com/product/info/MX1035347
Array
(
    [controller] => product
    [action] => info
    [model_id] => MX
    [modelnum] => 1035347
)

h3. Benchmarks: What do these changes cost in speed? The quick answer is that AdvancedRoute tested 2.78 requests per second less than Zend_Controller_Router_Route or about {{.1}} millisecond slower per request. The test machine is a 2.4GHz Celeron and 1GB RAM running FreeBSD 6.1 with PHP 5.14 no optimizer (such as APC, Zend Optimizer, Turck) running as a fastcgi on Apache 2.2.2. I ran each script 10 times after a freshly restarted apache and the 2nd best result was picked for each script. Here are the results:

Using AdvancedRoute.php:

[root /home/mjsdigital/public_html/benchmark]$ ab -n 1000 -c 10 http://127.0.0.1/benchmark/mjs.php
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 1997-2005 The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Finished 1000 requests


Server Software:        Apache/2.2.2
Server Hostname:        127.0.0.1
Server Port:            80

Document Path:          /benchmark/mjs.php
Document Length:        0 bytes

Concurrency Level:      10
Time taken for tests:   6.284683 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      245000 bytes
HTML transferred:       0 bytes
Requests per second:    159.12 [#/sec] (mean)
Time per request:       62.847 [ms] (mean)
Time per request:       6.285 [ms] (mean, across all concurrent requests)
Transfer rate:          38.03 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.7      0       9
Processing:    17   61  24.4     58     168
Waiting:       17   60  24.6     57     168
Total:         17   61  24.4     58     168

Percentage of the requests served within a certain time (ms)
  50%     58
  66%     68
  75%     74
  80%     80
  90%     95
  95%    108
  98%    126
  99%    140
 100%    168 (longest request)

And Zend_Controller_Router_Route with wildcard support:

[[root /home/mjsdigital/public_html/benchmark]$ ab -n 1000 -c 10 http://127.0.0.1
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 1997-2005 The Apache Software Foundation, http://www.apache.org/

Benchmarking 127.0.0.1 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Finished 1000 requests


Server Software:        Apache/2.2.2
Server Hostname:        127.0.0.1
Server Port:            80

Document Path:          /benchmark/martel.php
Document Length:        0 bytes

Concurrency Level:      10
Time taken for tests:   6.176693 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      245000 bytes
HTML transferred:       0 bytes
Requests per second:    161.90 [#/sec] (mean)
Time per request:       61.767 [ms] (mean)
Time per request:       6.177 [ms] (mean, across all concurrent requests)
Transfer rate:          38.69 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   1.5      0      15
Processing:    11   58  23.3     55     237
Waiting:        8   50  26.5     45     236
Total:         14   59  23.3     55     241

Percentage of the requests served within a certain time (ms)
  50%     55
  66%     60
  75%     65
  80%     69
  90%     88
  95%    103
  98%    126
  99%    147
 100%    241 (longest request)

Source code used for the tests: mjs.php:

<?php
set_include_path(get_include_path() . PATH_SEPARATOR . '/home/mjsdigital/public_html/trunk/library');
require_once 'Custom/Controller/Router/AdvancedRoute.php';
require_once 'variables.php';
foreach ($tests as $test) {
    $route = new Custom_Controller_Router_AdvancedRoute($test['map'], $test['defaults'], $test['requirements']);
    $result = $route->match($test['path']);
    if (isset($_GET['debug'])) print_r($result);
}

martel.php:

<?php
set_include_path(get_include_path() . PATH_SEPARATOR . '/home/mjsdigital/public_html/trunk/library');
require_once 'Zend/Controller/Router/Route.php';
require_once 'variables.php';
foreach ($tests as $test) {
    $route = new Zend_Controller_Router_Route($test['map'], $test['defaults'], $test['requirements']);
    $result = $route->match($test['path']);
    if (isset($_GET['debug'])) print_r($result);
}

variables.php:

<?php
$tests = array();

$tests[] = array(
    'map'          => '',
    'defaults'     => array('controller' => 'index', 'action' => 'index'),
    'requirements' => array(),
    'path'         => ''
);

$tests[] = array(
    'map'          => ':controller/:action/*',
    'defaults'     => array('action' => 'index'),
    'requirements' => array(),
    'path'         => 'ctrl/act/one/1/two/2/three/3/four/4/five/5'
);

$tests[] = array(
    'map'          => ':controller/:action/:id',
    'defaults'     => array(),
    'requirements' => array(),
    'path'         => 'ctrl/act/myid'
);

// The following test results in an incorrect return value of "false"
// in Zend_Controller_Router_Route in SVN but it used to work in 0.1.5.
// It should return: array('controller' => 'ctrl', 'action' => null, 'id' => 'myId')
//$tests[] = array(
//    'map'          => ':controller/:action/:id',
//    'defaults'     => array('action' => null),
//    'requirements' => array(),
//    'path'         => 'ctrl//myid'
//);

Comments

This issue is related to ZF-158 and

and also ZF-158

oops I meant ZF-63 in that last comment! :)

Fixed some regex bugs. Now passes all of the Route tests except for "testVariablesWithDefaultAndRequirementAndIncorrectValue". I'm not sure if this test is correct or not. In my opinion returning false is correct in this case.

Okay I just installed Ruby on Rails and tested the behavior I commented above (testVariablesWithDefaultAndRequirementAndIncorrectValue).

On line 156 of RouteTest.php it has:

$route = new Zend_Controller_Router_Route(':controller:/:action:/:year:', array('year' => 2006), array('year' => '\d+'));

It was my opinion that if {{:year:}} did not match the {{\d+}} regex then the path does not match this route and {{match()}} should return {{false}}. After testing on RoR they also have the same behavior. In the current RouteTest.php it is expecting {{:year:}} to drop back to the default {{2006}} if it did not match the {{\d+}} regex. What are your opinions on this? Should it return {{false}} or drop {{:year:}} to the default {{2006}} and consider it a match?

I get a bool(false) as a result of:


require_once 'RouteM.php';

$route = new Zend_Controller_Router_Route(':controller/:action/:year', array('year' => 2006), array('year' => '\d+'));
$values = $route->match('c/a/2000');
var_dump($values);

RouteM is your route of course. I will test it again when I will have a working solution.

About a benchmark. You have to test it with Apache Benchmark since php has a per-thread global pool of compiled regexes. In your solution you set a different regex for each route so they have to be compiled every time - now imagine a big site with 100-1000 routes. My solution has only one regex regardles of the number of routes so it theoretically it should run faster. I will benchmark both when you provide working code.

Christopher Thompson proposed a different aproach for a wildcard match which suites me better. It involves two step matching - at routes level and a controller or an action level. You will be able to define global application routes (like now but with a wildcard probably) and local routes where you will be able to define routes for that particular controller only. This may save a lot of matching at global level.

Now about many variables per "dir". I have it on my to do list and I'll try to implement it in the future. But it's really low priority for now.

Well, I forgot you have changed the variable representation. For above route the benchmark is:

Not matching routes - common. Remeber you will have to check this O(n) times:

ab2 -n 1000 -c 10 http://localhost/test/router-ab2-incorrect.php

Mine: Requests per second: 313.68 [#/sec] (mean) Time per request: 31.880 [ms] (mean) Time per request: 3.188 [ms] (mean, across all concurrent requests)

Your: Requests per second: 287.20 [#/sec] (mean) Time per request: 34.819 [ms] (mean) Time per request: 3.482 [ms] (mean, across all concurrent requests)

And a rare case - a match:

ab2 -n 1000 -c 10 http://localhost/test/router-ab2-match.php

Match - mine: Requests per second: 309.10 [#/sec] (mean) Time per request: 32.352 [ms] (mean) Time per request: 3.235 [ms] (mean, across all concurrent requests)

Match - your: Requests per second: 287.93 [#/sec] (mean) Time per request: 34.731 [ms] (mean) Time per request: 3.473 [ms] (mean, across all concurrent requests)

Both are faster in my code but the difference is negligible.

Can we somehow disable those smileys and other icons in the comments? They're just getting in the way.

Ahh so much for that I guess... I thought I had a breakthrough in routing :) I think it would be good to test this regex method vs your current looping method when you implement "multiple vars per dir" and "wildcard vars". Benchmarking routing code isn't easy because the input varies so much and a certain path/map might be faster w/ loops and slower w/ regex or vice versa. I must say this is a great challenge to figure out a solution to this problem though!

Attached an updated RouteTest to work with the changes I made in Route.php.

  1. Changed mapVars from {{:var}} to {{:var:}}
  2. Changed expected result for {{testVariablesWithDefaultAndRequirementAndIncorrectValue()}} to {{false}}
  3. Added test {{testVariablesWithDefaultInTheMiddleAndDefaultHasRequirement()}}

Attached Route.php:

  1. Changed from a 100% regex solution to a hybrid loops/regex similar to the current RewriteRouter
  2. Wildcard params are no longer {{:params*:}} but just a regular {{:params:}} with an entry of {{'.+'}} in the requirements parameter
  3. Now follows the same rules as Ruby on Rails when matching routes

Attached an update to Route.php

  1. optimized match() a little
  2. re-arranged and cleaned up code, removed some redundant parts
  3. fixed a bug where an optional mapVar following a pathPart that wasn't a / was not being properly handled

Attached: Route.php

  1. cleaned up / optimized code a bit more
  2. if a successful preg_match contains the next part of the path it will use that (cached) instead of running through another loop to find it

Attached new Route.php

  1. A few general cleanups in the code

You intending this for 0.2.0? If not change the fix version.

I'm not going use the above code since it's a major rewrite of the class but I will introduce some of the proposed changes of mapping logic in existing codebase. Wildcard match and a false on a reqirement not being met.

Thank you, Michael.

Attached Route.php

switched from :var: to :var and {:var} for backwards compatibility

Attached Route.php

  1. Made code more readable
  2. some minor speed tweaks
  3. detailed comments to better understand how everything fits together
  4. assemble() now tests vars aginst required regex patterns before assembling a route

Updated {{RouteTest.php}} back to the old style {{:vars}} instead of my crazy {{:vars:}} from before :)

Attached Route.php:

  • Fixed an undefined offset notice
  • Updated DEFAULT_REGEX to include all reserved and unreserved characters in RFC 3986 except for {{/}}

Attached: AdvancedRoute.php * Incorporated best of regex method + loop method * optimized regex used for matching routes * optimized __construct() method to only assign variables * added parse() method which will process the parameters supplied to __construct on a per-need basis for match() and assemble() * AdvancedRoute is now faster than any previous file submitted

I am reopening this issue on the following basis:

  • My changes to the router are now 100% backwards compatible and current with the most recent changes in SVN
  • There is only a 3-5 request per second speed decrease in exchange for the ability to match routes with the same flexibility as Ruby on Rails
  • There are similar or less lines of code in this router than the current Route.php
  • The code is now easier to understand in this version compared to previous versions.

My proposal is for Custom_Controller_Router_AdvancedRoute (AdvancedRoute.php) to be renamed to Zend_Controller_Router_Route (Route.php) and replace the current Route.php used for the RewriteRouter. The performance difference is neglegible and the features / flexibility gained are more than worth it.

Route is not backwards compatible (failed unt tests) and it's flexibility is just not worth the effort.

Michael, I'm not going to accept a total rewrite of the class which does exactly the same thing but in another manner. Accept that. It won't make it, so don't waste your energy on something that's already functional. Focus on something that's not ready yet - there are a lot of fields where your help will count.

It most definitely is 100% backwards compatible. The problem is that your unit tests are flawed.

__construct() expects a string and 2 optional arrays

{{testVariablesWithRequirementAndValue()}} and {{testVariablesWithRequirementAndIncorrectValue()}} supply a null value for a parameter which is expected to be an array. Your code allows for the error while mine simple enforces it and provides a detailed error message.

and no the current router does not do exactly the same thing. Please explain how to do any one of the above examples using the current Route.php. I wrote this code because the current Route.php was not functional enough. You have to remember that both beginners and expert developers will be using this framework. Your Route.php has the beginner developers covered but when you want to get into more advanced matching it falls short. People have been asking for this functionality on the mailing list. Issues have been opened in the tracker for situations where the current router did not meet their needs. People are using AdvancedRoute.php right now because they need more than what is offered by Route.php. Those who are coming from other frameworks expect at least the level of routing functionality that their old frameworks had, not less. It should not be this hard to give away code that works.

cleaned up long lines of code in examples to make this page more readable