Issue Details (XML | Word | Printable)

Key: ZF-3382
Type: Bug Bug
Status: Resolved Resolved
Resolution: Fixed
Priority: Major Major
Assignee: Fabien MARTY
Reporter: caval ionut
Votes: 1
Watchers: 2
Operations

If you were logged in you would be able to see more operations.
Google issue summary
Zend Framework

Zend_Cache_Backend_File problems under very high load

Created: 04/Jun/08 05:13 AM   Updated: 02/Sep/08 10:39 AM   Resolved: 26/Jun/08 02:19 PM
Return to search "Fixed in 1.5.1"
Component/s: Zend_Cache
Affects Version/s: 1.5.2
Fix Version/s: 1.6.0

Time Tracking:
Not Specified

File Attachments: 1. Text File better_log.txt (162 kB)
2. Text File better_log_with_locking.txt (155 kB)
3. Text File log.txt (186 kB)
4. Text File zend_cache.patch (3 kB)



 Description  « Hide

When using Zend_Cache and the site is under heavy load it does not use cache anymore.
We have tested using ab benchmark software and it seems that Zend_Cache cannot read each time the metadata file which generates a new cache instance so the cache is regenerated



julien PAULI added a comment - 04/Jun/08 08:29 AM

Seems to be a file locking problem , have you tried such a test with a memory based backend ? (APC, Memcached ...)


caval ionut added a comment - 04/Jun/08 08:41 AM

No we haven't tried that. Unfortunately we cannot use a memory based cached.
Are there any other solutions that we might try?
Thank you


Fabien MARTY added a comment - 04/Jun/08 10:13 AM

following code works under heavy load with apache bench :

<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $result='foobar foobar foobar...';
    $cache->save($result, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

I tried with "ab -c 50 -n 1000" (50 concurrent request)

I got : "Failed requests: 0"

If you try the same example with a lifetime of 2 seconds for example (to get a NOT CACHED during the test), you have some "Failed requests because of length which is not the same "CACHED" or "NOT CACHED").

So IMHO, File backend works under heavy load.

Have try to activate logging ? If it's a file locking problem (because of specific software or hardware...), you can try to desactivate file locking in backend options ?

Can you try my example on your servers ?


caval ionut added a comment - 10/Jun/08 02:59 AM

Hi,
Sorry for the late response but i was working on other solution of cache. I have tried your solution and the problem persists.
<<I got : "Failed requests: 0">>
I got that too but during that test the cached file was rewritten once/second. Please try you test with more content on the page. I have tested using the the index page of my site which has most resource consuming modules.
So the problem is not that i don't get the response back from server, the problem is that sometime the script cannot read the file and therefore it will generate another cache file.
Thank you, hope that helps


Fabien MARTY added a comment - 20/Jun/08 12:55 PM

no problem for me

please activate logging


caval ionut added a comment - 25/Jun/08 08:25 AM

I attach the code that i have used too:

$logger = new Zend_Log();
$writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');

$logger->addWriter($writer);

$logger->log('>>>>START ' . $session_id, 1);

$frontendOptions = array(
'lifetime' => 600000,
'logging' => true
);

$backendOptions = array(
'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files
'file_locking' => false
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
for ($i=0;$i<100000;$i++) { $result .= $i . '<br>'; }
//$result='foobar foobar foobar...';
$cache->save($result, 'myresult');
$logger->log('NOT CACHED', 1);
echo "NOT CACHED";
$logger->log('<<<<END ' . $session_id, 1);
} else {
$logger->log('CACHED', 1);
echo "CACHED";
$logger->log('<<<<END ' . $session_id, 1);
}


caval ionut added a comment - 25/Jun/08 08:29 AM

Using the code above I was still getting the following results in ab:

This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking www.intrebari-daniel.ro (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Finished 1000 requests

Server Software: Apache/2.2.3
Server Hostname: www.intrebari-daniel.ro
Server Port: 80

Document Path: /
Document Length: 6 bytes

Concurrency Level: 50
Time taken for tests: 51.9509 seconds
Complete requests: 1000
Failed requests: 35
(Connect: 0, Length: 35, Exceptions: 0)
Write errors: 0
Total transferred: 402000 bytes
HTML transferred: 8000 bytes
Requests per second: 19.60 /sec (mean)
Time per request: 2550.476 [ms] (mean)
Time per request: 51.010 [ms] (mean, across all concurrent requests)
Transfer rate: 7.68 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 17 53.1 0 298
Processing: 33 2388 7297.9 64 41585
Waiting: 2 1958 6057.9 61 41583
Total: 33 2405 7298.0 64 41585

Percentage of the requests served within a certain time (ms)
50% 64
66% 180
75% 941
80% 1620
90% 4513
95% 15133
98% 36262
99% 38378
100% 41585 (longest request)

I have tried using Sqlite engine and the cache was fine

Thank you


Cody Pisto added a comment - 25/Jun/08 03:12 PM

I'm definitely seeing the same thing here.

Not entirely sure whats happening, but the file backend is removing metadata and cache files before they expire under high load.


Cody Pisto added a comment - 25/Jun/08 03:28 PM

I can reliably reproduce this using the testcase code using ab and -n 1000 -c 4 (instead of -c 50).

PHP 5.2.6 running as FastCGI


Cody Pisto added a comment - 25/Jun/08 03:30 PM

Also... using APC 3.0.19


Cody Pisto added a comment - 25/Jun/08 03:35 PM

Still happens with file_locking turned off.


Cody Pisto added a comment - 25/Jun/08 03:45 PM

Here is a better log, with the actual Zend_Cache_Backend_File logging output


Cody Pisto added a comment - 25/Jun/08 03:48 PM

Here is another log, this time with file_locking turned on


Cody Pisto added a comment - 25/Jun/08 03:55 PM

The problem still happens with file locking on and both read and write control turned off.

Not sure whats going on here but it basically makes the File Backend completely useless under any kind of load.

pretty standard linux setup, 2.6.18 on an ext3 fs.


Cody Pisto added a comment - 25/Jun/08 04:06 PM

OK, I've found the problem and a fix.

I will post a patch after I've had a chance to verify it and clean it up.


Cody Pisto added a comment - 25/Jun/08 05:45 PM

I've attached a patch to fix this issue. apply with patch -p0

Basically there were some implementation details of the reading and writing of files here that were very vulnerable to race conditions.

I have tested this on linux 2.6 and windows, it performs perfectly in both cases.

here is a summary of the changes in the patch:

  • removed filesize check and fread from _fileGetContents and replaced with stream_get_contents, lessens race vulnerability
  • removed flock(LOCK_UN) calls from _fileGetContents/_filePutContents, since fclose releases locks anyway, and they create a tiny and needless race condition between lock release and fclose.
  • added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas
  • changed fopen mode in _filePutContents to ab+ from wb, and added fseek(0) and ftruncate(0) calls after LOCK_EX, This virtually eliminates the race problems due to PHP fopen implementation. see http://bugs.php.net/bug.php?id=28371 and http://www.rooftopsolutions.nl/article/107 for reasons why this is better

caval ionut added a comment - 26/Jun/08 12:20 AM

That seems to fix it. I have added the patch to Zend library.

Thanks a lot


Cody Pisto added a comment - 26/Jun/08 05:18 AM

I will get a contributor license agreement sent to Zend today so that the patch can be applied for 1.5.3


Cody Pisto added a comment - 26/Jun/08 06:33 AM

Also, just a quick note, the likely reason Fabian didn't see the problem with his testcase, is the amount of data being cached in his testcase is small enough that it fits in the normal fwrite/fread buffers and thus minimizes the race condition a bit, using Caval's test case will hit the problem every time.


Cody Pisto added a comment - 26/Jun/08 09:47 AM

Also this patch improves performance greatly in the case of a php backend that does not have a particular cache id in its internal metadatas array, as it no longer rewrites the metadatas file to disk in this case.


Fabien MARTY added a comment - 26/Jun/08 10:41 AM

Can't reproduce this even with a 10 MB cache file !

}}
<?php

require_once 'Zend/Cache.php';

$frontendOptions = array(
   'lifetime' => 7200
);

$backendOptions = array(
    'cache_dir' => '/tmp/' // Directory where to put the cache files
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if(!$result = $cache->load('myresult')) {
    $res = '';
    for ($i = 0 ; $i < 1000000 ; $i++) {
        $res .= '0123456789';
    }
    $cache->save($res, 'myresult');
    echo "NOT CACHED";
} else {
    echo "CACHED";
}

?>

ab -c 50 -n 1000 URL

=> no failed request (a "not cached" would produce a length error)

{{


Fabien MARTY added a comment - 26/Jun/08 10:50 AM

some interesting things in the patch anyway

I'm reviewing it


Cody Pisto added a comment - 26/Jun/08 11:01 AM

The ab failed request is not what shows you the bug

activate a logger for the cache and log cached vs not cached in a log file.

What is happening is under load, without my patch, you will get constant cache misses as the metadatas and cache files are trampled due to race conditions. see the log files I have attached to the bug.


Cody Pisto added a comment - 26/Jun/08 11:04 AM

2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match
2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Core::save() / write_control : written and read data do not match
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match
2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Core::save() / write_control : written and read data do not match
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 ALERT (1): CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): NOT CACHED
2008-06-25T22:46:59+00:00 ALERT (1): <<<<END
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:46:59+00:00 WARN (4): Zend_Cache_Backend_File::load() / read_control : stored hash and computed hash do not match
2008-06-25T22:46:59+00:00 ALERT (1): >>>>START
2008-06-25T22:47:00+00:00 ALERT (1): NOT CACHED


Cody Pisto added a comment - 26/Jun/08 11:09 AM

My test case

{{

<?php

define('ROOT_DIR', '/var/www/test');
set_include_path(ROOT_DIR . '/lib' . PATH_SEPARATOR . get_include_path());

require 'Zend/Log.php';
require 'Zend/Log/Writer/Stream.php';
require 'Zend/Cache.php';

$logger = new Zend_Log();
$writer = new Zend_Log_Writer_Stream(ROOT_DIR . '/cache/log.txt');

$logger->addWriter($writer);

$logger->log('>>>>START ' . $session_id, 1);

$frontendOptions = array(
'lifetime' => 600000,
'logging' => true,
'logger' => $logger
);

$backendOptions = array(
'cache_dir' => ROOT_DIR . '/cache/', // Directory where to put the cache files
'file_locking' => true
);

$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);

if (!$result = $cache->load('myresult')) {
for ($i=0;$i<100000;$i++) { $result .= $i . '<br>'; }
$cache->save($result, 'myresult');
$logger->log('NOT CACHED', 1);
echo "NOT CACHED";
$logger->log('<<<<END ' . $session_id, 1);
} else { $logger->log('CACHED', 1); echo "CACHED"; $logger->log('<<<<END ' . $session_id, 1); } }}


Fabien MARTY added a comment - 26/Jun/08 02:03 PM

The part of the patch "added third optional parameter to _setMetadatas, to disable writing metadata to disk, eliminating the needless rewriting of metadatas file during _getMetadatas" is commited into the trunk and into the 1.5 branch

Other parts should be commited only into the trunk (for 1.6 release)

Could you try the head of the 1.5 branch to see if the problem is fixed for you ?


Fabien MARTY added a comment - 26/Jun/08 02:11 PM

other part of the patch commited to the trunk (for 1.6 release)

good job

thanks


Cody Pisto added a comment - 26/Jun/08 02:27 PM

Tested 1.5 branch snapshot, definitely helps reduce the problem quite a bit, but does not totally eliminate it as the full patch does.

Thanks a ton for your work, and I really look forward to the next 1.5 release and more so to 1.6!!


Wil Sinclair added a comment - 02/Sep/08 10:39 AM

Updating for the 1.6.0 release.