ZF-6548: Zend_Cache_Backend_File Indexing

Description

When cleaning caches based on tags or expiry (manual or automated), if there are a lot of caches with the same prefix it will take too long and in some cases timeout requests.

I propose that tags and expiries be stored in central index files for each prefix, that way Zend_Cache_Backend_File doesn't have to open every single cache metadata file to figure out which ones need to be cleaned, instead it could open just one file and loop through an array.

This could be made more efficient not only by storing an array of all caches, but also storing arrays for each tag with all caches containing that tag. This will minimise memory/processor usage.

I came across this problem when a highly load balanced would "randomly" take 30-60 seconds or more on some requests. It turned out the default automatic cleanup of 1/10 was occuring often and i had hundreds of thousands of caches with the same prefix (equivalent to the number of rows in a table). Each time it would loop through hundreds of thousands of files over NFS just to see if they were due to expire.

More details here: http://www.yewchube.com/?p=58 http://www.yewchube.com/?p=62

Comments

change Assignee because I'm inactive now

If we create an index file to store all tags and expire times for every entry it would slow down writes.

But for speed up cleaning by expire on ZF2 we are using filemtime to calculate the expire time of a cache entry. Tags will be stored anymore on a metadata file but the new structure would allow you to set the tagging plugin on file storage which enables tagging support for every backend and is working with an index.

I would be happy if you could test the performance of the new filesystem storage. -> http://framework.zend.com/wiki/display/… -> http://github.com/marc-mabe/zf2/tree/cache

Greetings

This issue won't be fixed in ZF1.

For ZF2 clearing should be faster because it's using GlobIterator instead of glob and don't need to read much files as ZF1 needs. It's also searching (reading) files of each cache file to get the tags. If virtial tagging will be implemented this can be used for tagging, too. -> see ZF2-137