ZF-7721: Weak check of chunked body structure may cause infinite loop
Description
Zend_Http_Response::decodeChunkedBody() checks format of chunked HTTP response with weak regular expression:
/^([\da-fA-F]+)[^\r\n]*\r\n/sm
As can be seen - it allows potentially any string that starts with hexadecimal number giving wide space for errors. Also it doesn't comply with format specification (http://tools.ietf.org/html/rfc2616#section-3.6.1).
Incorrect treating of chunked file format may lead to infinite loop, as can be seen in provided real world example. Proposed solution is to change format checking regular expression to one with more strict format check, for example:
/^([\da-fA-F]+)\s(;[^()\<>@,;:\\"\/[]\?={}\t\s]+=[^\r\n])?\r\n/sm
Comments
Posted by Alexander Grimalovsky (flying) on 2009-08-29T05:46:37.000+0000
Real world example of page which causes infinite loop when being processed by decodeChunkedBody()
Posted by sky2k (sky2k) on 2010-10-06T05:20:22.000+0000
Confirmed (1.10.8). Site crawler hangs every day. During investigation I found that decodeChunked function causes infinite loop. E.g. try to do $response->getBody() on http://ir{.}kvh{.}com url or ir{.}stanleyblackanddecker{.}com.
Posted by Till Klampaeckel (till) on 2010-10-18T15:44:04.000+0000
I can confirm this issue as well.
@Shahar: What do you think about the improved regex provided by Alexander?