
<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

Zend Framework: Zend_Mime_Magic Component Proposal

Proposed Component Name Zend_Mime_Magic
Developer Notes http://framework.zend.com/wiki/display/ZFDEV/Zend_Mime_Magic
Proposers Matthew Ratzloff
Revision 0.1 - March 9, 2007: Posted initial proposal (wiki revision: 22)

1. Overview

Zend_Mime_Magic attempts to detect the MIME type of a file by comparing the file's byte signature to known "magic" values. Because it is implemented natively in PHP, it is cross-platform and requires neither the Fileinfo extension nor the (deprecated) mime_content_type() function. This makes open-source software that relies on MIME type detection easier to install, because no server configuration changes are required. Also, because it uses the standard Linux magic file format, users can substitute their own magic.mime file if they wish, or add custom rules they find on the Web.
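
The signature comparison described above can be illustrated with a minimal sketch. This is not the proposed component's API: the function name detectMimeType() and the hard-coded $signatures table are hypothetical stand-ins for rules that would really come from a parsed magic.mime file, which supports many more test types than a fixed string at a fixed offset.

```php
<?php
// Hypothetical signature table: byte offset, magic bytes, MIME type.
// A real magic.mime file holds far more rules, including typed and nested tests.
$signatures = array(
    array(0, "\x89PNG\r\n\x1a\n", 'image/png'),
    array(0, "GIF8",              'image/gif'),
    array(0, "\xFF\xD8\xFF",      'image/jpeg'),
    array(0, "%PDF-",             'application/pdf'),
);

/**
 * Returns the MIME type of the first matching signature, or null if none match.
 * (Illustrative helper only; the name is an assumption.)
 */
function detectMimeType($path, array $signatures)
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        return null;
    }
    // Only the first bytes are needed for the signatures in this sketch.
    $head = fread($handle, 64);
    fclose($handle);
    if ($head === false) {
        return null;
    }

    foreach ($signatures as $rule) {
        list($offset, $magic, $type) = $rule;
        // Compare the bytes at the rule's offset with the expected magic value.
        if (substr($head, $offset, strlen($magic)) === $magic) {
            return $type;
        }
    }
    return null;
}
```

Calling detectMimeType($path, $signatures) on a file that begins with the PNG header bytes would yield 'image/png' under these assumptions; a file matching no rule yields null.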

2. References

3. Component Requirements, Constraints, and Acceptance Criteria

  • This component will attempt to discern the correct file type for a given file.
  • This component will be compatible with the standard magic.mime file format, common to most Linux installations.
  • This component will not be compatible with the (inferior) magic.mime file that comes packaged with PHP.
  • This component will not be as fast as a C extension, but will not be prohibitively slow.
  • This component will be completely serializable, to allow user-level caching. This will increase its speed and ease of use.
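
The user-level caching mentioned in the last requirement might look roughly like the sketch below. Everything named here is an assumption made for illustration: ExampleMagicRules and loadRules() are hypothetical, and the three-column tab-separated input is a simplified stand-in for the real magic.mime format. The point is only that the parsing cost is paid once, after which the serialized object is reused.

```php
<?php
/**
 * Hypothetical stand-in for a parsed rule set. It does not reflect the
 * proposal's actual classes, whose skeletons are not published here.
 */
class ExampleMagicRules
{
    public $rules = array();

    public function __construct($magicText)
    {
        // Simplified format assumed for illustration: "offset<TAB>string<TAB>mime".
        foreach (explode("\n", $magicText) as $line) {
            $line = trim($line);
            if ($line === '' || $line[0] === '#') {
                continue; // skip blank lines and comments
            }
            $columns = explode("\t", $line);
            if (count($columns) !== 3) {
                continue; // ignore lines this sketch does not understand
            }
            $this->rules[] = array((int) $columns[0], $columns[1], $columns[2]);
        }
    }
}

/**
 * Returns the cached rule set if one exists; otherwise parses and caches it.
 * The cache file must be trusted, since unserialize() is unsafe on untrusted data.
 */
function loadRules($magicText, $cacheFile)
{
    if (is_file($cacheFile)) {
        $cached = unserialize(file_get_contents($cacheFile));
        if ($cached instanceof ExampleMagicRules) {
            return $cached;
        }
    }
    $rules = new ExampleMagicRules($magicText);
    file_put_contents($cacheFile, serialize($rules));
    return $rules;
}
```

On the first call the text is parsed and the object is written to the cache file; later calls skip parsing entirely and restore the object from the cache.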

4. Dependencies on Other Framework Components

  • Zend_Exception

5. Theory of Operation

See use cases.

6. Milestones / Tasks

  • Milestone 1: [DONE] Working prototype exists.
  • Milestone 2: Design notes will be published here.
  • Milestone 3: Prototype checked into the incubator.
  • Milestone 4: Unit tests exist, work, and are checked into Subversion.
  • Milestone 5: Initial documentation exists.

7. Class Index

  • Zend_Mime_Magic
  • Zend_Mime_Magic_Exception
  • Zend_Mime_Magic_Parser
  • Zend_Mime_Magic_Test
  • Zend_Mime_Magic_Test_Result

8. Use Cases

UC-01
UC-02

9. Class Skeletons

]]></ac:plain-text-body></ac:macro>

]]></ac:plain-text-body></ac:macro>

<h2>10. Initial speed/accuracy comparisons:</h2>

  1. Mar 10, 2007

    <p>Sorry, but I'm not entirely sure I understand exactly what advantage the serialization functions provide. Without those, it almost seems like the class methods would be better if they were static. Can you clarify?</p>

    1. Mar 10, 2007

      <p>There's a fixed cost to parsing the magic.mime file (splitting rows and columns, converting values, deriving new columns, etc.). The object does this on construction. By serializing the object you only have to parse the magic.mime file once, not every request.</p>

  2. Mar 12, 2007

    <p><a class="external-link" href="http://www.php.net/manual/en/ref.fileinfo.php">http://www.php.net/manual/en/ref.fileinfo.php</a></p>

    <p>I know several have expressed interest in a more general purpose ZF file management and manipulation component. Perhaps "mime magic" fits into such a component?</p>

    <p>Also, the heuristics traditionally used to perform "mime magic" use a fair amount of CPU cycles. Thus, those who care about performance would probably want to use the PECL extension. So, if this proposal were API compliant with the PECL extension, and made for a seamless transition for those who have the extension, then I see even greater usefulness for this proposed component.</p>

    1. Mar 12, 2007

      <blockquote>
      <p>Also, the heuristics traditionally used to perform "mime magic" use a fair amount of CPU cycles. Thus, those who care about performance would probably want to use the PECL extension. So, if this proposal were API compliant with the PECL extension, and made for a seamless transition for those who have the extension, then I see even greater usefulness for this proposed component.</p></blockquote>

      <p>The idea was that for heavy use of file detection, Fileinfo could be installed and work seamlessly with Zend_Mime_Magic, but not the other way around. I must admit that I dislike Fileinfo's API and have no desire to imitate it.</p>

      <p>It should also be possible to turn off Fileinfo even when the extension is loaded, not only for testing but also because Fileinfo's magic file parser apparently has a severe bug, at least in my version. From what I can tell, it doesn't properly handle \n literals in the magic file. See the accuracy test above for the effects of this bug.</p>

      <p>For most usage, there should be no problem using the native parser. It only consumes around 50-70 ticks (0.005-0.007 seconds) per file; not really a bottleneck when compared to bandwidth, etc. Also, in addition to fine-tuning the accuracy, I plan to rearrange the magic file so that the most common Web file formats are at the very top.</p>

      <blockquote>
      <p>I know several have expressed interest in a more general purpose ZF file management and manipulation component. Perhaps "mime magic" fits into such a component?</p></blockquote>

      <p>That's an interesting idea, but I think it's more likely that if there was a "Zend_File" component, it would just talk to Zend_Mime_Magic for MIME type information.</p>

      <p>If anyone can ever figure out the various SPL iterators (<ac:emoticon ac:name="wink" />) we could do some interesting things with abstracted directory listings (recursing, caching, filtering)...</p>

      1. Mar 12, 2007

        <p>So long as we don't make it difficult for developers to throw a switch and use the PECL extension, then I would say we are API "compliant". I'm disappointed to see the poor results from Fileinfo above. I've been using <code>mime_content_type()</code> for a long time, and I'm sorry to see that it has been deprecated, since it works <ac:emoticon ac:name="smile" /></p>

        <p>I really like your idea of optimizing this component for web apps. As a configuration option, the developer might choose to use a specific magic file containing only what they need, instead of all types known throughout eternity like the UNIX file(1) utility. I'm sure that would greatly speed up processing.</p>

        <p>Right, I wouldn't expect extremely tight coupling between a file management utility, but only something loose to simplify use. Until the utility exists, we can only hypothesize.</p>

  3. May 03, 2007

    <p>I've been looking for a dependency-free way to detect MIME types in PHP, and since the deprecation of mime_content_type() there doesn't appear to be one... but then I found this proposal. How's it coming along? When can we expect it to be in the incubator?</p>

    <p>I would love to be able to use it in some of the project proposals I'm working on.</p>

    1. May 03, 2007

      <p>Hi Derek,</p>

      <p>It's functional, and at some point I'm going to refactor it to use my not-yet-proposed <code>Zend_Io</code> stream reading/writing class, so it'll be more robust once I do that. While refactoring I'll also improve the accuracy.</p>

      <p>To be honest, though, I haven't worked on it in a little while because the focus is obviously on getting Zend Framework ready for 1.0, and approval of new components has taken a back seat. Once it's approved and <code>Zend_Io</code> is complete I can finish this component up pretty quickly.</p>

  4. Sep 19, 2007

    <p>Hi Matthew, <br />
    Has there been any movement on the proposal recently? I'd quite like to use it, or at least try it out. Where is the prototype available?</p>

    1. Sep 19, 2007

      <p>Hi Jack,</p>

      <p>I bought a house recently and unfortunately don't have a lot of time/energy to devote to off-hours programming lately. I anticipate that changing in the next month or two.</p>

      <p>Believe me, I'd like to get it out there too!</p>

  5. Sep 19, 2007

    <p>Hi Matthew, <br />
    OK, great, good to know what's happening. I'll just have to wait <ac:emoticon ac:name="smile" />. Thanks for the update.</p>