ZF-3382: Zend_Cache_Backend_File problems under very high load
Description
When using Zend_Cache and the site is under heavy load it does not use cache anymore. We have tested using ab benchmark software and it seems that Zend_Cache cannot read each time the metadata file which generates a new cache instance so the cache is regenerated
Comments
Posted by julien PAULI (doctorrock83) on 2008-06-04T08:29:51.000+0000
Seems to be a file locking problem , have you tried such a test with a memory based backend ? (APC, Memcached ...)
Posted by caval ionut (daniel123) on 2008-06-04T08:41:45.000+0000
No we haven't tried that. Unfortunately we cannot use a memory based cached. Are there any other solutions that we might try? Thank you
Posted by Fabien MARTY (fab) on 2008-06-04T10:13:38.000+0000
following code works under heavy load with apache bench :
I tried with "ab -c 50 -n 1000" (50 concurrent request)
I got : "Failed requests: 0"
If you try the same example with a lifetime of 2 seconds for example (to get a NOT CACHED during the test), you have some "Failed requests because of length which is not the same "CACHED" or "NOT CACHED").
So IMHO, File backend works under heavy load.
Have try to activate logging ? If it's a file locking problem (because of specific software or hardware...), you can try to desactivate file locking in backend options ?
Can you try my example on your servers ?
Posted by caval ionut (daniel123) on 2008-06-10T02:59:49.000+0000
Hi, Sorry for the late response but i was working on other solution of cache. I have tried your solution and the problem persists. <> I got that too but during that test the cached file was rewritten once/second. Please try you test with more content on the page. I have tested using the the index page of my site which has most resource consuming modules. So the problem is not that i don't get the response back from server, the problem is that sometime the script cannot read the file and therefore it will generate another cache file. Thank you, hope that helps
Posted by Fabien MARTY (fab) on 2008-06-20T12:55:38.000+0000
no problem for me
please activate logging
Posted by caval ionut (daniel123) on 2008-06-25T08:25:01.000+0000
I attach the code that i have used too:
$logger = new Zend_Log(); $writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');
$logger->addWriter($writer);
$logger->log('>>>>START ' . $session_id, 1);
$frontendOptions = array( 'lifetime' => 600000, 'logging' => true );
$backendOptions = array( 'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files 'file_locking' => false );
$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);
if(!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . '
'; } //$result='foobar foobar foobar...'; $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<<END ' . $session_id, 1); } else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); }
Posted by caval ionut (daniel123) on 2008-06-25T08:29:23.000+0000
Using the code above I was still getting the following results in ab:
{panel} This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0 Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright 2006 The Apache Software Foundation, http://www.apache.org/
Benchmarking www.intrebari-daniel.ro (be patient) Completed 100 requests Completed 200 requests Completed 300 requests Completed 400 requests Completed 500 requests Completed 600 requests Completed 700 requests Completed 800 requests Completed 900 requests Finished 1000 requests
Server Software: Apache/2.2.3 Server Hostname: www.intrebari-daniel.ro Server Port: 80
Document Path: / Document Length: 6 bytes
Concurrency Level: 50 Time taken for tests: 51.9509 seconds Complete requests: 1000 Failed requests: 35 (Connect: 0, Length: 35, Exceptions: 0) Write errors: 0 Total transferred: 402000 bytes HTML transferred: 8000 bytes Requests per second: 19.60 [#/sec] (mean) Time per request: 2550.476 [ms] (mean) Time per request: 51.010 [ms] (mean, across all concurrent requests) Transfer rate: 7.68 [Kbytes/sec] received
Connection Times (ms) min mean[+/-sd] median max Connect: 0 17 53.1 0 298 Processing: 33 2388 7297.9 64 41585 Waiting: 2 1958 6057.9 61 41583 Total: 33 2405 7298.0 64 41585
Percentage of the requests served within a certain time (ms) 50% 64 66% 180 75% 941 80% 1620 90% 4513 95% 15133 98% 36262 99% 38378 100% 41585 (longest request) {panel}
I have tried using Sqlite engine and the cache was fine
Thank you
Posted by Cody Pisto (cpisto) on 2008-06-25T15:12:30.000+0000
I'm definitely seeing the same thing here.
Not entirely sure whats happening, but the file backend is removing metadata and cache files before they expire under high load.
Posted by Cody Pisto (cpisto) on 2008-06-25T15:28:39.000+0000
I can reliably reproduce this using the testcase code using ab and -n 1000 -c 4 (instead of -c 50).
PHP 5.2.6 running as FastCGI
Posted by Cody Pisto (cpisto) on 2008-06-25T15:30:04.000+0000
Also... using APC 3.0.19
Posted by Cody Pisto (cpisto) on 2008-06-25T15:35:24.000+0000
Still happens with file_locking turned off.
Posted by Cody Pisto (cpisto) on 2008-06-25T15:45:43.000+0000
Here is a better log, with the actual Zend_Cache_Backend_File logging output
Posted by Cody Pisto (cpisto) on 2008-06-25T15:48:38.000+0000
Here is another log, this time with file_locking turned on
Posted by Cody Pisto (cpisto) on 2008-06-25T15:55:18.000+0000
The problem still happens with file locking on and both read and write control turned off.
Not sure whats going on here but it basically makes the File Backend completely useless under any kind of load.
pretty standard linux setup, 2.6.18 on an ext3 fs.
Posted by Cody Pisto (cpisto) on 2008-06-25T16:06:34.000+0000
OK, I've found the problem and a fix.
I will post a patch after I've had a chance to verify it and clean it up.
Posted by Cody Pisto (cpisto) on 2008-06-25T17:45:10.000+0000
I've attached a patch to fix this issue. apply with patch -p0
Basically there were some implementation details of the reading and writing of files here that were very vulnerable to race conditions.
I have tested this on linux 2.6 and windows, it performs perfectly in both cases.
here is a summary of the changes in the patch:
Posted by caval ionut (daniel123) on 2008-06-26T00:20:11.000+0000
That seems to fix it. I have added the patch to Zend library.
Thanks a lot
Posted by Cody Pisto (cpisto) on 2008-06-26T05:18:13.000+0000
I will get a contributor license agreement sent to Zend today so that the patch can be applied for 1.5.3
Posted by Cody Pisto (cpisto) on 2008-06-26T06:33:18.000+0000
Also, just a quick note, the likely reason Fabian didn't see the problem with his testcase, is the amount of data being cached in his testcase is small enough that it fits in the normal fwrite/fread buffers and thus minimizes the race condition a bit, using Caval's test case will hit the problem every time.
Posted by Cody Pisto (cpisto) on 2008-06-26T09:47:25.000+0000
Also this patch improves performance greatly in the case of a php backend that does not have a particular cache id in its internal metadatas array, as it no longer rewrites the metadatas file to disk in this case.
Posted by Fabien MARTY (fab) on 2008-06-26T10:41:59.000+0000
Can't reproduce this even with a 10 MB cache file !
{{
Posted by Fabien MARTY (fab) on 2008-06-26T10:50:10.000+0000
some interesting things in the patch anyway
I'm reviewing it
Posted by Cody Pisto (cpisto) on 2008-06-26T11:01:36.000+0000
The ab failed request is not what shows you the bug :-)
activate a logger for the cache and log cached vs not cached in a log file.
What is happening is under load, without my patch, you will get constant cache misses as the metadatas and cache files are trampled due to race conditions. see the log files I have attached to the bug.
Posted by Cody Pisto (cpisto) on 2008-06-26T11:04:05.000+0000
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Core::save() / write_control : written and read data do not match 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:47:00+00:00 ALERT (1): NOT CACHED
Posted by Cody Pisto (cpisto) on 2008-06-26T11:09:48.000+0000
My test case
{{
<?php
define('ROOT_DIR', '/var/www/test'); set_include_path(ROOT_DIR . '/lib' . PATH_SEPARATOR . get_include_path());
require 'Zend/Log.php'; require 'Zend/Log/Writer/Stream.php'; require 'Zend/Cache.php';
$logger = new Zend_Log(); $writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');
$logger->addWriter($writer);
$logger->log('>>>>START ' . $session_id, 1);
$frontendOptions = array( 'lifetime' => 600000, 'logging' => true, 'logger' => $logger );
$backendOptions = array( 'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files 'file_locking' => true );
$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);
if (!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . '
'; } $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<<END ' . $session_id, 1); } else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); }
}}
Posted by Fabien MARTY (fab) on 2008-06-26T14:03:24.000+0000
The part of the patch "added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas" is commited into the trunk and into the 1.5 branch
Other parts should be commited only into the trunk (for 1.6 release)
Could you try the head of the 1.5 branch to see if the problem is fixed for you ?
Posted by Fabien MARTY (fab) on 2008-06-26T14:11:43.000+0000
other part of the patch commited to the trunk (for 1.6 release)
good job
thanks
Posted by Cody Pisto (cpisto) on 2008-06-26T14:27:44.000+0000
Tested 1.5 branch snapshot, definitely helps reduce the problem quite a bit, but does not totally eliminate it as the full patch does.
Thanks a ton for your work, and I really look forward to the next 1.5 release and more so to 1.6!!
Posted by Wil Sinclair (wil) on 2008-09-02T10:39:33.000+0000
Updating for the 1.6.0 release.