Zend Framework: Zend_Mime_Magic Component Proposal
| Proposed Component Name | Zend_Mime_Magic |
|---|---|
| Developer Notes | http://framework.zend.com/wiki/display/ZFDEV/Zend_Mime_Magic |
| Proposers | Matthew Ratzloff |
| Revision | 0.1 - March 9, 2007: Posted initial proposal (wiki revision: 21) |
1. Overview
Zend_Mime_Magic attempts to detect the MIME type of a file by comparing the file's byte signature to known "magic" values. By implementing it natively in PHP, it is cross-platform without the need to install the Fileinfo extension or rely on the (deprecated) mime_content_type() function. This makes installation of open-source software that relies on MIME type detection easier, because no server configuration changes are required. Also, because it uses the standard Linux magic file format, users can substitute their own magic.mime file if they wish, or add custom rules they find on the Web.
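As a rough illustration of the idea, signature matching can be sketched as below. This is not the proposed implementation: the function name `sketch_detect_mime` and the four hard-coded rules are assumptions for the example, whereas the real component would read its rules from a magic.mime file.

```php
<?php
// Minimal sketch: compare the leading bytes of a file to known signatures.
// The function name and this tiny rule table are hypothetical.
function sketch_detect_mime($bytes)
{
    // Each rule: array(offset, magic bytes, MIME type).
    $signatures = array(
        array(0, "\x89PNG\r\n\x1a\n", 'image/png'),
        array(0, 'GIF8',              'image/gif'),
        array(0, "\xff\xd8\xff",      'image/jpeg'),
        array(0, '%PDF-',             'application/pdf'),
    );
    foreach ($signatures as $rule) {
        list($offset, $magic, $mime) = $rule;
        // A rule matches when the bytes at its offset equal the magic value.
        if (substr($bytes, $offset, strlen($magic)) === $magic) {
            return $mime;
        }
    }
    return null; // no known signature matched
}
```

In practice only the first few bytes of a file need to be read, for example with `file_get_contents($path, false, null, 0, 16)`, before passing them to the function.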
2. References
- Apache documentation for mod_mime_magic - A brief overview of the magic file format
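For orientation, a rule line in that format carries an offset, a data type, a value to match, and a MIME type. The sketch below parses one such line. It is an assumption-laden illustration, not the proposed Zend_Mime_Magic_Parser: the function name is hypothetical, and indirect offsets, numeric masks, and the optional encoding column are not handled.

```php
<?php
// Sketch only: parse one rule line of a magic.mime-style file.
// Hypothetical helper; not the proposal's parser.
function sketch_parse_magic_line($line)
{
    $line = trim($line);
    if ($line === '' || $line[0] === '#') {
        return null; // blank line or comment
    }
    // Split on whitespace that is not escaped with a backslash.
    $fields = preg_split('/(?<!\\\\)\s+/', $line);
    if (count($fields) < 3) {
        return null; // not a complete rule
    }
    // Leading '>' characters mark a rule that depends on the preceding one.
    $level = strspn($fields[0], '>');
    return array(
        'level'  => $level,
        // Base 0 lets intval() accept decimal, 0x-prefixed hex, and octal.
        'offset' => intval(substr($fields[0], $level), 0),
        'type'   => $fields[1],
        // Resolve escapes such as "\ " and "\n" in the match value.
        'value'  => stripcslashes($fields[2]),
        'mime'   => isset($fields[3]) ? $fields[3] : null,
    );
}
```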
3. Component Requirements, Constraints, and Acceptance Criteria
- This component will attempt to discern the correct file type for a given file.
- This component will be compatible with the standard magic.mime file format, common to most Linux installations.
- This component will not be compatible with the (inferior) magic.mime file that comes packaged with PHP.
- This component will not be as fast as a C extension, but will not be prohibitively slow.
- This component will be completely serializable, to allow user-level caching. This will increase its speed and ease of use.
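The caching requirement could work roughly as follows: parse the magic file once, store the serialized result, and reuse it until the magic file changes. The sketch below caches a plain array of rule fields rather than the component object itself, and the function name and cache-file convention are assumptions made for the example.

```php
<?php
// Sketch only: cache parsed magic rules with serialize()/unserialize().
// Hypothetical helper; the real component would serialize its own object.
function sketch_load_rules($magicFile, $cacheFile)
{
    // Reuse the cache when it is at least as new as the magic file.
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($magicFile)) {
        return unserialize(file_get_contents($cacheFile));
    }

    $rules = array();
    $lines = file($magicFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '' || $line[0] === '#') {
            continue; // skip blank lines and comments
        }
        $rules[] = preg_split('/\s+/', $line); // naive field split
    }

    // Store the parsed rules so later requests skip the parsing step.
    file_put_contents($cacheFile, serialize($rules));
    return $rules;
}
```

The benefit is that the comparatively slow text parsing happens once, while every later request pays only for `unserialize()`.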
4. Dependencies on Other Framework Components
- Zend_Exception
5. Theory of Operation
See use cases.
6. Milestones / Tasks
- Milestone 1: [DONE] Working prototype exists.
- Milestone 2: Design notes will be published here.
- Milestone 3: Prototype checked into the incubator.
- Milestone 4: Unit tests exist, work, and are checked into Subversion.
- Milestone 5: Initial documentation exists.
7. Class Index
- Zend_Mime_Magic
- Zend_Mime_Magic_Exception
- Zend_Mime_Magic_Parser
- Zend_Mime_Magic_Test
- Zend_Mime_Magic_Test_Result
8. Use Cases
| UC-01 |
|---|
| UC-02 |
|---|
9. Class Skeletons
10. Initial speed/accuracy comparisons:
http://www.php.net/manual/en/ref.fileinfo.php
I know several have expressed interest in a more general purpose ZF file management and manipulation component. Perhaps "mime magic" fits into such a component?
Also, the heuristics traditionally used to perform "mime magic" use a fair amount of CPU cycles. Thus, those who care about performance would probably want to use the PECL extension. So, if this proposal were API compliant with the PECL extension, and made for a seamless transition for those who have the extension, then I see even greater usefulness for this proposed component.
> Also, the heuristics traditionally used to perform "mime magic" use a fair amount of CPU cycles. Thus, those who care about performance would probably want to use the PECL extension. So, if this proposal were API compliant with the PECL extension, and made for a seamless transition for those who have the extension, then I see even greater usefulness for this proposed component.
The idea was that for heavy use of file detection, Fileinfo could be installed and work seamlessly with Zend_Mime_Magic, but not the other way around. I must admit that I dislike Fileinfo's API and have no desire to imitate it.
Using Fileinfo should also be optional, even when the extension is loaded: not only for testing, but also because Fileinfo's magic file parser apparently has a severe bug, at least in my version. It appears not to handle \n literals in the magic file properly. See the accuracy test above for the effects of this bug.
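One way such a switch might look is sketched below. The function name, the `$useFileinfo` flag, and the `$nativeDetector` callback are all hypothetical and stand in for whatever API the component ends up with; only `extension_loaded()`, the `finfo` class, and `FILEINFO_MIME_TYPE` are existing PHP features.

```php
<?php
// Sketch only: prefer Fileinfo when allowed and available, otherwise
// fall back to a native detector supplied by the caller.
function sketch_mime_type($path, $nativeDetector, $useFileinfo = true)
{
    if ($useFileinfo && extension_loaded('fileinfo')) {
        $finfo = new finfo(FILEINFO_MIME_TYPE);
        $mime  = $finfo->file($path);
        if ($mime !== false) {
            return $mime;
        }
        // Fileinfo could not decide; fall through to the native path.
    }
    // $nativeDetector is a hypothetical stand-in for the PHP-only parser.
    return call_user_func($nativeDetector, $path);
}
```

Passing `false` as the last argument forces the native path, which is useful both for testing and for working around a misbehaving Fileinfo build.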
For most usage, there should be no problem using the native parser. It only consumes around 50-70 ticks (0.005-0.007 seconds) per file; not really a bottleneck when compared to bandwidth, etc. Also, in addition to fine-tuning the accuracy, I plan to rearrange the magic file so that the most common Web file formats are at the very top.
> I know several have expressed interest in a more general purpose ZF file management and manipulation component. Perhaps "mime magic" fits into such a component?
That's an interesting idea, but I think it's more likely that if there was a "Zend_File" component, it would just talk to Zend_Mime_Magic for MIME type information.
If anyone can ever figure out the various SPL iterators, we could do some interesting things with abstracted directory listings (recursing, caching, filtering)...
So long as we don't make it difficult for developers to throw a switch and use the PECL extensions, then I would say we are API "compliant". I'm disappointed to see the poor results from FileInfo above. I've been using mime_content_type() for a long time, and I'm sorry to see that it has been deprecated, since it works.
I really like your idea of optimizing this component for web apps. As a configuration option, the developer might choose to use a specific magic file containing only what they need, instead of all types known throughout eternity like the UNIX file(1) utility. I'm sure that would greatly speed up processing.
Right, I wouldn't expect extremely tight coupling between a file management utility and this component, only something loose to simplify use. Until the utility exists, we can only hypothesize.
I've been looking for a dependency-less way to detect MIME types in PHP, and since the deprecation of mime_content_type() there doesn't appear to be a way to do so... but then I found this proposal. How's it coming along? When can we expect it to be in the incubator?
I would love to be able to use it in some of the project proposals I'm working on.
Hi Derek,
It's functional, and at some point I'm going to refactor it to use my not-yet-proposed Zend_Io stream reading/writing class, so it'll be more robust once I do that. While refactoring I'll also improve the accuracy.
To be honest, though, I haven't worked on it in a little while because the focus is obviously on getting Zend Framework ready for 1.0, and approval of new components has taken a back seat. Once it's approved and Zend_Io is complete I can finish this component up pretty quickly.
Hi Matthew,
Has there been any movement on the proposal recently? I'd quite like to use it, or at least try it out. Whereabouts is the prototype available from?
Sorry, but I'm not entirely sure I understand exactly what advantage the serialization functions provide. Without those, it almost seems like the class methods would be better if they were static. Can you clarify?