Issues

ZF-3382: Zend_Cache_Backend_File problems under very high load

Issue Type: Bug Created: 2008-06-04T05:13:00.000+0000 Last Updated: 2008-09-02T10:39:33.000+0000 Status: Resolved Fix version(s): - 1.6.0 (02/Sep/08)

Reporter: caval ionut (daniel123) Assignee: Fabien MARTY (fab) Tags: - Zend_Cache

Related issues: Attachments: - better_log.txt

Description

When using Zend_Cache and the site is under heavy load it does not use cache anymore. We have tested using ab benchmark software and it seems that Zend_Cache cannot read each time the metadata file which generates a new cache instance so the cache is regenerated

Comments

Posted by julien PAULI (doctorrock83) on 2008-06-04T08:29:51.000+0000

Seems to be a file locking problem , have you tried such a test with a memory based backend ? (APC, Memcached ...)

Posted by caval ionut (daniel123) on 2008-06-04T08:41:45.000+0000

No we haven't tried that. Unfortunately we cannot use a memory based cached. Are there any other solutions that we might try? Thank you

Posted by Fabien MARTY (fab) on 2008-06-04T10:13:38.000+0000

following code works under heavy load with apache bench :

<pre class="highlight">
<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $result='foobar foobar foobar...';
    $cache->save($result, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

I tried with "ab -c 50 -n 1000" (50 concurrent request)

I got : "Failed requests: 0"

If you try the same example with a lifetime of 2 seconds for example (to get a NOT CACHED during the test), you have some "Failed requests because of length which is not the same "CACHED" or "NOT CACHED").

So IMHO, File backend works under heavy load.

Have try to activate logging ? If it's a file locking problem (because of specific software or hardware...), you can try to desactivate file locking in backend options ?

Can you try my example on your servers ?

Posted by caval ionut (daniel123) on 2008-06-10T02:59:49.000+0000

Hi, Sorry for the late response but i was working on other solution of cache. I have tried your solution and the problem persists. <> I got that too but during that test the cached file was rewritten once/second. Please try you test with more content on the page. I have tested using the the index page of my site which has most resource consuming modules. So the problem is not that i don't get the response back from server, the problem is that sometime the script cannot read the file and therefore it will generate another cache file. Thank you, hope that helps

Posted by Fabien MARTY (fab) on 2008-06-20T12:55:38.000+0000

no problem for me

please activate logging

Posted by caval ionut (daniel123) on 2008-06-25T08:25:01.000+0000

I attach the code that i have used too:

$logger = new Zend_Log(); $writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');

$logger->addWriter($writer);

$logger->log('>>>>START ' . $session_id, 1);

$frontendOptions = array( 'lifetime' => 600000, 'logging' => true );

$backendOptions = array( 'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files 'file_locking' => false );

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . '
'; } //$result='foobar foobar foobar...'; $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<<END ' . $session_id, 1); } else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); }

Posted by caval ionut (daniel123) on 2008-06-25T08:29:23.000+0000

Using the code above I was still getting the following results in ab:

{panel} This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0 Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/ Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking <www.intrebari-daniel.ro> (be patient) Completed 100 requests Completed 200 requests Completed 300 requests Completed 400 requests Completed 500 requests Completed 600 requests Completed 700 requests Completed 800 requests Completed 900 requests Finished 1000 requests

Server Software: Apache/2.2.3 Server Hostname: <www.intrebari-daniel.ro> Server Port: 80

Document Path: / Document Length: 6 bytes

Concurrency Level: 50 Time taken for tests: 51.9509 seconds Complete requests: 1000 Failed requests: 35 (Connect: 0, Length: 35, Exceptions: 0) Write errors: 0 Total transferred: 402000 bytes HTML transferred: 8000 bytes Requests per second: 19.60 [#/sec] (mean) Time per request: 2550.476 [ms] (mean) Time per request: 51.010 [ms] (mean, across all concurrent requests) Transfer rate: 7.68 [Kbytes/sec] received

Connection Times (ms) min mean[+/-sd] median max Connect: 0 17 53.1 0 298 Processing: 33 2388 7297.9 64 41585 Waiting: 2 1958 6057.9 61 41583 Total: 33 2405 7298.0 64 41585

Percentage of the requests served within a certain time (ms) 50% 64 66% 180 75% 941 80% 1620 90% 4513 95% 15133 98% 36262 99% 38378 100% 41585 (longest request) {panel}

I have tried using Sqlite engine and the cache was fine

Thank you

Posted by Cody Pisto (cpisto) on 2008-06-25T15:12:30.000+0000

I'm definitely seeing the same thing here.

Not entirely sure whats happening, but the file backend is removing metadata and cache files before they expire under high load.

Posted by Cody Pisto (cpisto) on 2008-06-25T15:28:39.000+0000

I can reliably reproduce this using the testcase code using ab and -n 1000 -c 4 (instead of -c 50).

PHP 5.2.6 running as FastCGI

Posted by Cody Pisto (cpisto) on 2008-06-25T15:30:04.000+0000

Also... using APC 3.0.19

Posted by Cody Pisto (cpisto) on 2008-06-25T15:35:24.000+0000

Still happens with file_locking turned off.

Posted by Cody Pisto (cpisto) on 2008-06-25T15:45:43.000+0000

Here is a better log, with the actual Zend_Cache_Backend_File logging output

Posted by Cody Pisto (cpisto) on 2008-06-25T15:48:38.000+0000

Here is another log, this time with file_locking turned on

Posted by Cody Pisto (cpisto) on 2008-06-25T15:55:18.000+0000

The problem still happens with file locking on and both read and write control turned off.

Not sure whats going on here but it basically makes the File Backend completely useless under any kind of load.

pretty standard linux setup, 2.6.18 on an ext3 fs.

Posted by Cody Pisto (cpisto) on 2008-06-25T16:06:34.000+0000

OK, I've found the problem and a fix.

I will post a patch after I've had a chance to verify it and clean it up.

Posted by Cody Pisto (cpisto) on 2008-06-25T17:45:10.000+0000

I've attached a patch to fix this issue. apply with patch -p0

Basically there were some implementation details of the reading and writing of files here that were very vulnerable to race conditions.

I have tested this on linux 2.6 and windows, it performs perfectly in both cases.

here is a summary of the changes in the patch:

  • removed filesize check and fread from _fileGetContents and replaced with stream_get_contents, lessens race vulnerability
  • removed flock(LOCK_UN) calls from _fileGetContents/_filePutContents, since fclose releases locks anyway, and they create a tiny and needless race condition between lock release and fclose.
  • added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas
  • changed fopen mode in _filePutContents to ab+ from wb, and added fseek(0) and ftruncate(0) calls after LOCK_EX, This virtually eliminates the race problems due to PHP fopen implementation. see http://bugs.php.net/bug.php?id=28371 and http://www.rooftopsolutions.nl/article/107 for reasons why this is better

Posted by caval ionut (daniel123) on 2008-06-26T00:20:11.000+0000

That seems to fix it. I have added the patch to Zend library.

Thanks a lot

Posted by Cody Pisto (cpisto) on 2008-06-26T05:18:13.000+0000

I will get a contributor license agreement sent to Zend today so that the patch can be applied for 1.5.3

Posted by Cody Pisto (cpisto) on 2008-06-26T06:33:18.000+0000

Also, just a quick note, the likely reason Fabian didn't see the problem with his testcase, is the amount of data being cached in his testcase is small enough that it fits in the normal fwrite/fread buffers and thus minimizes the race condition a bit, using Caval's test case will hit the problem every time.

Posted by Cody Pisto (cpisto) on 2008-06-26T09:47:25.000+0000

Also this patch improves performance greatly in the case of a php backend that does not have a particular cache id in its internal metadatas array, as it no longer rewrites the metadatas file to disk in this case.

Posted by Fabien MARTY (fab) on 2008-06-26T10:41:59.000+0000

Can't reproduce this even with a 10 MB cache file !

<pre class="highlight">
<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $res = '';
    for ($i = 0 ; $i < 1000000 ; $i++) {
        $res .= '0123456789';
    }
    $cache->save($res, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

?>

ab -c 50 -n 1000 URL

=> no failed request (a "not cached" would produce a length error)

Posted by Fabien MARTY (fab) on 2008-06-26T10:50:10.000+0000

some interesting things in the patch anyway

I'm reviewing it

Posted by Cody Pisto (cpisto) on 2008-06-26T11:01:36.000+0000

The ab failed request is not what shows you the bug :-)

activate a logger for the cache and log cached vs not cached in a log file.

What is happening is under load, without my patch, you will get constant cache misses as the metadatas and cache files are trampled due to race conditions. see the log files I have attached to the bug.

Posted by Cody Pisto (cpisto) on 2008-06-26T11:04:05.000+0000

2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Core::save() / write_control : written and read data do not match 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:46:59+00:00 ALERT (1): CACHED 2008-06-25T22:46:59+00:00 ALERT (1): <<<>>>START 2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match 2008-06-25T22:46:59+00:00 ALERT (1): >>>>START 2008-06-25T22:47:00+00:00 ALERT (1): NOT CACHED

Posted by Cody Pisto (cpisto) on 2008-06-26T11:09:48.000+0000

My test case

addWriter($writer); $logger->log('>>>>START ' . $session\_id, 1); $frontendOptions = array( 'lifetime' => 600000, 'logging' => true, 'logger' => $logger ); $backendOptions = array( 'cache\_dir' => ROOT\_DIR . '/cache/', // Directory where to put the cache files 'file\_locking' => true ); $cache = Zend\_Cache::factory('Core', 'File', $frontendOptions, $backendOptions); if (!$result = $cache->load('myresult')) { for ($i=0;$i<100000;$i++) { $result .= $i . ' '; } $cache->save($result, 'myresult'); $logger->log('NOT CACHED', 1); echo "NOT CACHED"; $logger->log('<<<log('CACHED', 1); echo "CACHED"; $logger->log('<<<

Have you found an issue?

See the Overview section for more details.