Issues

ZF-3382: Zend_Cache_Backend_File problems under very high load

Description

When using Zend_Cache and the site is under heavy load it does not use cache anymore. We have tested using ab benchmark software and it seems that Zend_Cache cannot read each time the metadata file which generates a new cache instance so the cache is regenerated

Comments

Seems to be a file locking problem , have you tried such a test with a memory based backend ? (APC, Memcached ...)

No we haven't tried that. Unfortunately we cannot use a memory based cached. Are there any other solutions that we might try? Thank you

following code works under heavy load with apache bench :


<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $result='foobar foobar foobar...';
    $cache->save($result, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

I tried with "ab -c 50 -n 1000" (50 concurrent request)

I got : "Failed requests: 0"

If you try the same example with a lifetime of 2 seconds for example (to get a NOT CACHED during the test), you have some "Failed requests because of length which is not the same "CACHED" or "NOT CACHED").

So IMHO, File backend works under heavy load.

Have try to activate logging ? If it's a file locking problem (because of specific software or hardware...), you can try to desactivate file locking in backend options ?

Can you try my example on your servers ?

Hi, Sorry for the late response but i was working on other solution of cache. I have tried your solution and the problem persists. <> I got that too but during that test the cached file was rewritten once/second. Please try you test with more content on the page. I have tested using the the index page of my site which has most resource consuming modules. So the problem is not that i don't get the response back from server, the problem is that sometime the script cannot read the file and therefore it will generate another cache file. Thank you, hope that helps

no problem for me

please activate logging

I attach the code that i have used too:

$logger = new Zend_Log(); $writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');

$logger->addWriter($writer);

$logger->log('>>>>START ' . $session_id, 1);

$frontendOptions = array( 'lifetime' => 600000, 'logging' => true );

$backendOptions = array( 'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files 'file_locking' => false );

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . '
'; } //$result='foobar foobar foobar...'; $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<<END ' . $session_id, 1); } else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); }

Using the code above I was still getting the following results in ab:

{panel} This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0 Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.intrebari-daniel.ro (be patient) Completed 100 requests Completed 200 requests Completed 300 requests Completed 400 requests Completed 500 requests Completed 600 requests Completed 700 requests Completed 800 requests Completed 900 requests Finished 1000 requests

Server Software: Apache/2.2.3 Server Hostname: www.intrebari-daniel.ro Server Port: 80

Document Path: / Document Length: 6 bytes

Concurrency Level: 50 Time taken for tests: 51.9509 seconds Complete requests: 1000 Failed requests: 35 (Connect: 0, Length: 35, Exceptions: 0) Write errors: 0 Total transferred: 402000 bytes HTML transferred: 8000 bytes Requests per second: 19.60 [#/sec] (mean) Time per request: 2550.476 [ms] (mean) Time per request: 51.010 [ms] (mean, across all concurrent requests) Transfer rate: 7.68 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 17 53.1 0 298 Processing: 33 2388 7297.9 64 41585 Waiting: 2 1958 6057.9 61 41583 Total: 33 2405 7298.0 64 41585

Percentage of the requests served within a certain time (ms) 50% 64 66% 180 75% 941 80% 1620 90% 4513 95% 15133 98% 36262 99% 38378 100% 41585 (longest request) {panel}

I have tried using Sqlite engine and the cache was fine

Thank you

I'm definitely seeing the same thing here.

Not entirely sure whats happening, but the file backend is removing metadata and cache files before they expire under high load.

I can reliably reproduce this using the testcase code using ab and -n 1000 -c 4 (instead of -c 50).

PHP 5.2.6 running as FastCGI

Also... using APC 3.0.19

Still happens with file_locking turned off.

Here is a better log, with the actual Zend_Cache_Backend_File logging output

Here is another log, this time with file_locking turned on

The problem still happens with file locking on and both read and write control turned off.

Not sure whats going on here but it basically makes the File Backend completely useless under any kind of load.

pretty standard linux setup, 2.6.18 on an ext3 fs.

OK, I've found the problem and a fix.

I will post a patch after I've had a chance to verify it and clean it up.

I've attached a patch to fix this issue. apply with patch -p0

Basically there were some implementation details of the reading and writing of files here that were very vulnerable to race conditions.

I have tested this on linux 2.6 and windows, it performs perfectly in both cases.

here is a summary of the changes in the patch:

  • removed filesize check and fread from _fileGetContents and replaced with stream_get_contents, lessens race vulnerability
  • removed flock(LOCK_UN) calls from _fileGetContents/_filePutContents, since fclose releases locks anyway, and they create a tiny and needless race condition between lock release and fclose.
  • added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas
  • changed fopen mode in _filePutContents to ab+ from wb, and added fseek(0) and ftruncate(0) calls after LOCK_EX, This virtually eliminates the race problems due to PHP fopen implementation. see http://bugs.php.net/bug.php?id=28371 and http://www.rooftopsolutions.nl/article/107 for reasons why this is better

That seems to fix it. I have added the patch to Zend library.

Thanks a lot

I will get a contributor license agreement sent to Zend today so that the patch can be applied for 1.5.3

Also, just a quick note, the likely reason Fabian didn't see the problem with his testcase, is the amount of data being cached in his testcase is small enough that it fits in the normal fwrite/fread buffers and thus minimizes the race condition a bit, using Caval's test case will hit the problem every time.

Also this patch improves performance greatly in the case of a php backend that does not have a particular cache id in its internal metadatas array, as it no longer rewrites the metadatas file to disk in this case.

Can't reproduce this even with a 10 MB cache file !

{{


<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $res = '';
    for ($i = 0 ; $i < 1000000 ; $i++) {
        $res .= '0123456789';
    }
    $cache->save($res, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

?>

ab -c 50 -n 1000 URL

=> no failed request (a "not cached" would produce a length error)

{{

some interesting things in the patch anyway

I'm reviewing it

The ab failed request is not what shows you the bug :-)

activate a logger for the cache and log cached vs not cached in a log file.

What is happening is under load, without my patch, you will get constant cache misses as the metadatas and cache files are trampled due to race conditions. see the log files I have attached to the bug.

2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Core::save() / write_control : written and read data do not match 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:47:00+00:00 ALERT (1): NOT CACHED

My test case

{{

<?php

define('ROOT_DIR', '/var/www/test'); set_include_path(ROOT_DIR . '/lib' . PATH_SEPARATOR . get_include_path());

require 'Zend/Log.php'; require 'Zend/Log/Writer/Stream.php'; require 'Zend/Cache.php';

$logger = new Zend_Log(); $writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');

$logger->addWriter($writer);

$logger->log('>>>>START ' . $session_id, 1);

$frontendOptions = array( 'lifetime' => 600000, 'logging' => true, 'logger' => $logger );

$backendOptions = array( 'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files 'file_locking' => true );

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if (!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . '
'; } $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<<END ' . $session_id, 1); } else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); }

}}

The part of the patch "added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas" is commited into the trunk and into the 1.5 branch

Other parts should be commited only into the trunk (for 1.6 release)

Could you try the head of the 1.5 branch to see if the problem is fixed for you ?

other part of the patch commited to the trunk (for 1.6 release)

good job

thanks

Tested 1.5 branch snapshot, definitely helps reduce the problem quite a bit, but does not totally eliminate it as the full patch does.

Thanks a ton for your work, and I really look forward to the next 1.5 release and more so to 1.6!!

Updating for the 1.6.0 release.