
<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

Zend Framework: Zend_Translate Component Proposal

Proposed Component Name Zend_Translate
Developer Notes http://framework.zend.com/wiki/display/ZFDEV/Zend_Translate
Proposers
Revision 1.0 - 19 September 2006: Extracted from Zend_Locale (wiki revision: 22)

1. Overview

The Zend_Translate component provides the Zend Framework with message translation functionality, where pre-translated strings are stored in a structured manner suitable for algorithmic use by ZF developers. It can handle different source file formats for translation. Locale-awareness is achieved using Zend_Locale.

2. References

Source Integration

Details on the translation source standards that have to be integrated.

TMX Standard

TMX (Translation Memory eXchange) is an XML-based industry standard for exchanging translation data.

GetText Standard

Gettext is a popular standard for open-source translation support.

Common Language Data Repository

For details on CLDR and its implementation in the framework, see Zend_Locale.

Additional Information and International Standards

The following international standards must be used with Zend_Translate in order to maintain compatibility with Zend_Locale.

ISO 639

International Language Code Definition
ISO 639-1 for 2 letter, ISO 639-2 for 3 letter language codes

ISO 3166

International Country Code Definitions
ISO 3166-1 for 2 letter country codes

RFC 3066

Identification of languages

I18N General

Mailing Lists

Past mailing list discussions:

Unicode Discussion for upcoming PHP6

Discussions for Zend Framework

3. Component Requirements, Constraints, and Acceptance Criteria

  • Wrapper functionality
  • Lightweight and fast implementation
  • Simple to use for ZF users
  • Handling of different source formats
  • Automatic recognition of the language the browser requests

4. Dependencies on Other Framework Components

5. Theory of Operation

Basics

Zend_Translate is a wrapper for string message translation mechanisms in the Zend Framework. It has to be simple to use and as lightweight as possible. Phrases (string messages) are translated by a simple mapping of phrases from one language to other languages, rather than by linguistic or semantic translation mechanisms. The phrases and mappings are created by people for their applications, possibly using software editing tools such as poEdit.

Source formats/Abstraction

Zend_Translate must work with different source formats. The initial supported source formats will be:

  • Gettext
  • TMX
  • SQL databases (MySQL, MSSQL, SQLite) through Zend_Db

Each translation source will be integrated as an adapter, so each source format has to implement the same functionality. Handling is identical for each source format, so the user only has to know the base layer Zend_Translate, not the details of Gettext, TMX, and so on.
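
The adapter idea described above can be sketched as follows. This is only an illustration under stated assumptions: the names TranslationAdapter, ArrayAdapter and Translator are hypothetical and are not the classes of this proposal, and returning the untranslated message id when no translation exists is an assumed convention borrowed from gettext.

```php
<?php
// Hypothetical names for illustration only; not the component's real classes.
interface TranslationAdapter
{
    /** Return the translation of $messageId for $locale, or null if unknown. */
    public function lookup(string $messageId, string $locale): ?string;
}

// One possible source format: a plain in-memory array (locale => [id => text]).
final class ArrayAdapter implements TranslationAdapter
{
    private $messages;

    public function __construct(array $messages)
    {
        $this->messages = $messages;
    }

    public function lookup(string $messageId, string $locale): ?string
    {
        return $this->messages[$locale][$messageId] ?? null;
    }
}

// The base layer only knows the interface, never the concrete source format.
final class Translator
{
    private $adapter;
    private $locale;

    public function __construct(TranslationAdapter $adapter, string $locale)
    {
        $this->adapter = $adapter;
        $this->locale = $locale;
    }

    public function translate(string $messageId): string
    {
        $text = $this->adapter->lookup($messageId, $this->locale);
        // Assumption: fall back to the message id itself when untranslated.
        return $text !== null ? $text : $messageId;
    }
}

$translator = new Translator(new ArrayAdapter(['de' => ['Hello' => 'Hallo']]), 'de');
echo $translator->translate('Hello'), "\n";
```

Swapping in a different source would then mean writing another class that implements the same interface, without touching the calling code.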

Locale awareness

Zend_Translate must be locale-aware, and must therefore use Zend_Locale.

Automatic language recognition

This will be done using Zend_Locale.
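
How Zend_Locale performs this detection is not shown in this proposal. As a stand-alone sketch of what recognizing the browser language involves, the hypothetical function below (negotiateLocale is an invented name, not a framework API) parses an Accept-Language header value, orders the entries by their q-values, and returns the first locale the application supports, falling back from a regional locale such as de_AT to the bare language de.

```php
<?php
/**
 * Pick the best supported locale for an Accept-Language header value.
 * Hypothetical helper for illustration; not part of Zend_Locale.
 */
function negotiateLocale(string $header, array $supported, string $default): string
{
    $candidates = [];
    foreach (explode(',', $header) as $position => $part) {
        $pieces = explode(';', trim($part));
        $tag = trim($pieces[0]);
        if ($tag === '') {
            continue;
        }
        $quality = 1.0; // a missing q parameter means q=1
        foreach (array_slice($pieces, 1) as $param) {
            $param = trim($param);
            if (stripos($param, 'q=') === 0) {
                $quality = (float) substr($param, 2);
            }
        }
        // Normalize "de-at" or "de_AT" to "de_AT".
        $subtags = explode('-', str_replace('_', '-', $tag));
        $locale = strtolower($subtags[0]);
        if (isset($subtags[1])) {
            $locale .= '_' . strtoupper($subtags[1]);
        }
        $candidates[] = [$locale, $quality, $position];
    }

    // Highest quality first; entries with equal quality keep header order.
    usort($candidates, function (array $a, array $b): int {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });

    foreach ($candidates as [$locale, $quality]) {
        if ($quality <= 0) {
            continue; // q=0 marks a language as not acceptable
        }
        if (in_array($locale, $supported, true)) {
            return $locale;
        }
        $language = explode('_', $locale)[0];
        if (in_array($language, $supported, true)) {
            return $language;
        }
    }
    return $default;
}

echo negotiateLocale('de-AT,de;q=0.8,en;q=0.5', ['en', 'de'], 'en'), "\n";
```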

Locale aware formatting

Zend_Translate should be aware of localized date/time and number strings by using Zend_Locale_Format.

6. Milestones / Tasks

  • Milestone 2: [DONE] Core implementation
  • Milestone 3: [DONE] Zend_Translate base class implementation
  • Milestone 4: [DONE] Zend_Translate_Gettext, unit tests, and docs
  • Milestone 5: [DONE] Zend_Translate_Array, unit tests, and docs
  • Milestone 6: [DONE] Documentation
  • Milestone 7: [DONE] Zend_Translate_TMX implementation, unit tests, and docs.
  • Milestone 8: [0%] Zend_Translate_Sql implementation, unit tests, and docs.

Already implemented milestones:

  • Milestone 1: [DONE] Design notes

7. Class Index

  • Zend_Translate - base class
  • Zend_Translate_Exception - exception handling
  • Zend_Translate_Adapter - base adapter class
  • Zend_Translate_Array - Array adapter
  • Zend_Translate_Core - Core adapter
  • Zend_Translate_Gettext - Gettext adapter
  • Zend_Translate_Tmx - Tmx adapter
  • Zend_Translate_Sql - Sql adapter

8. Use Cases

Use of translation - HTTP_ACCEPT_LANGUAGE: de_AT

Temporarily set another language

Permanently set another language

Get the current locale settings

Set a different locale source

9. Class Skeletons

]]></ac:plain-text-body></ac:macro>

]]></ac:plain-text-body></ac:macro>

  1. Sep 19, 2006

    <p>Your translation sources include SQL, TMX, and gettext. I'd like to propose another possibility I've used in the past: PHP arrays. Basically, you hash the translation string and use it as an array key that points to an associative array of language => string pairs. The strengths of such a solution are ease of defining translation files (they're PHP arrays, and a loader creates the hashes), and performance (again, PHP arrays). Cons are that you'll tend to put all translations into a single file, which can get very unwieldy, and it doesn't lend itself well to clustering (replication issues).</p>
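
    <p>A minimal sketch of the hashed-array idea described above, under stated assumptions: the function names buildCatalog and translateFromCatalog are hypothetical, md5() is just one possible hash, and returning the source string when no translation exists is an assumed fallback. The "loader" step turns a readable definition array into a catalog keyed by hash.</p>

```php
<?php
/**
 * Loader: turn readable definitions (source string => [language => text])
 * into a catalog keyed by the hash of the source string.
 * Hypothetical helper; md5() is an assumed choice of hash.
 */
function buildCatalog(array $definitions): array
{
    $catalog = [];
    foreach ($definitions as $source => $translations) {
        $catalog[md5((string) $source)] = $translations;
    }
    return $catalog;
}

/** Look up a translation; fall back to the source string if none exists. */
function translateFromCatalog(array $catalog, string $source, string $language): string
{
    return $catalog[md5($source)][$language] ?? $source;
}

$catalog = buildCatalog([
    'Good morning' => ['de' => 'Guten Morgen', 'fr' => 'Bonjour'],
    'Cancel'       => ['de' => 'Abbrechen'],
]);

echo translateFromCatalog($catalog, 'Good morning', 'de'), "\n";
```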

    1. Sep 19, 2006

      <p>Maybe you could provide a sample of your source solution so we can inspect your approach in detail.<br />
      For the past two weeks my head has only had room for date-related issues... <ac:emoticon ac:name="wink" /></p>

      <p>Btw:<br />
      Gettext and TMX are the only sources that will definitely be integrated for 1.0.<br />
      All others will be integrated once their speed and implementation have been proven.</p>

      <p>I also thought of adding Qt as source.</p>

  2. Sep 20, 2006

    <p>Our i18n/locale team is ready to begin work on this proposal.</p>

    <p>Originally, the functionality in Zend_Translate was included in the Zend_Locale proposal.<br />
    A previous "version" of Zend_Translate has already been reviewed, and comments are here:</p>

    <p><a class="external-link" href="http://framework.zend.com/wiki/x/5Q">http://framework.zend.com/wiki/x/5Q</a></p>

    <p>Therefore, we would like to do a quick re-review (with community input). If no issues are found, we expect to promptly move Zend_Translate to the incubator status.</p>

    <p>Thanks everyone for reviewing this proposal and posting your comments. As always, we rely on the strength of our community to help improve every proposal, project, and component. Also, the i18n/locale team is <a href="http://framework.zend.com/wiki/display/ZFDEV/Home">seeking more volunteers ...</a></p>

    1. Nov 03, 2006

      <p>As six weeks have already passed with no new comments, I would be pleased if the proposal could be moved to the incubator, as the old one already was...</p>

      <p>Please don't forget the additional source integration proposals I wrote separately.</p>

  3. Sep 20, 2006

    <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
    $lang = new Zend_Translate(new Zend_Translate_Gettext('/home/www/lang/'));
    ]]></ac:plain-text-body></ac:macro>

    <p>is a very similar construct to the current creation of a Zend_Config:</p>
    <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
    $config = new Zend_Config(new Zend_Config_Ini('config.ini', 'section'));
    ]]></ac:plain-text-body></ac:macro>

    <p>However, <a href="http://framework.zend.com/issues/browse/ZF-388">ZF-388</a> has been created to make instantiation of a config object less complex.</p>

    <p>I suggest that we try to make instantiation of a translate object work the same way as proposed on <a href="http://framework.zend.com/issues/browse/ZF-388">ZF-388</a> to provide consistency and avoid rework later.</p>

    1. Sep 20, 2006

      <p>The original "new Subclass" approach was not from me...</p>

      <p>I personally prefer the following way of defining it.</p>

      <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
      $lang = new Zend_Translate(Zend_Translate::gettext, 'my\dir');
      $lang2= new Zend_Translate(Zend_Translate::SQL, 'mysql', 'user', 'pwd', 'database');
      ]]></ac:plain-text-body></ac:macro>

      <p>The use cases will be changed by me to reflect this.</p>

      1. Sep 21, 2006

        <p>Using func_get_args() to handle the variable parameters that can now be taken by Zend_Translate's constructor?</p>
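
        <p>A rough sketch of what such a constructor could look like. The class name TranslateSketch, its constants, and its getters are hypothetical and only illustrate the func_get_args() technique; they are not the proposed Zend_Translate API. On PHP 5.6 and later, a variadic parameter (<code>...$options</code>) would be the more idiomatic equivalent.</p>

```php
<?php
// Hypothetical class for illustration; not the proposed Zend_Translate API.
final class TranslateSketch
{
    const GETTEXT = 'gettext';
    const SQL = 'sql';

    private $type;
    private $options;

    // The first argument selects the source type; any further arguments
    // are collected and would be handed to the matching adapter.
    public function __construct($type /* , adapter-specific arguments */)
    {
        $args = func_get_args();
        $this->type = array_shift($args);
        $this->options = $args;
    }

    public function getType(): string
    {
        return $this->type;
    }

    public function getOptions(): array
    {
        return $this->options;
    }
}

$lang  = new TranslateSketch(TranslateSketch::GETTEXT, 'my/dir');
$lang2 = new TranslateSketch(TranslateSketch::SQL, 'mysql', 'user', 'pwd', 'database');

echo $lang2->getType(), ': ', count($lang2->getOptions()), " options\n";
```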

        1. Sep 21, 2006

          <p>I think this way it's easier to handle for end users.<br />
          And it's easy to implement.</p>

          <p>Btw:<br />
          So far there is not one line of code ready for Zend_Translate.<br />
          So we can discuss almost everything back and forth without running into problems with existing code.</p>

          <p>For now I'm focusing on getting Zend_Locale_Format ready so that Zend_Locale can be included in the next core release.</p>

    2. Jan 22, 2007

      <p>I agree. A syntax of</p>

      <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
      $lang = new Zend_Translate_Gettext('...');
      ]]></ac:plain-text-body></ac:macro>

      <p>is much better, in my opinion. And, like Rob says, it is more consistent overall.</p>

  4. Sep 21, 2006

    <p>Ok, what follows is a mostly incoherent pile of thoughts on the translation component - may not always make sense, and may contain wild and unfounded assumptions... just thinking aloud. YOU HAVE BEEN WARNED :-D</p>

    <p>How about segmentation of translation files? Will there be one huge translation file for each language? This could grow to ungodly sizes pretty quickly.</p>

    <p>Maybe it would be a good idea to automatically save translations in a file that corresponds to the current controller / action, or just the current template, to keep the file size down? Then again, this would stand in the way of loose coupling, and it would also mean duplicating translations for the most common things (like "Submit", "Okay", "Cancel" and the like, which appear virtually everywhere). On the other hand, sometimes having different translations for the same phrases can make perfect sense, depending on the current context.</p>

    <p>Caching could be a solution to the problem. But - on what level does caching make sense for translation? Surely not per method call. But if you go up to a higher level, what's left is caching entire templates, which is out of scope for the translation component and can also be undesired (what if I have highly dynamic data in my template and only want some text around it translated?).</p>

    <p>I've written a translation component before, for a moderately-sized application. I had one translation file (just PHP array definitions that were included - a KISS approach) for each language. Even though the application wasn't that big, the files grew very quickly and started to slow down each request, because they had to be read into memory entirely before being able to start translating the first phrase. This is what's made me concerned about performance of translation classes, but as you can see I only have a lot of questions, but no really good answers to the problem.</p>

    1. Sep 21, 2006

      <p>Segmentation:<br />
      -------------<br />
      Segmentation will be included, as it is a key feature for big applications (such as my own).<br />
      But it must be handled in the subclasses by each source implementation, and there will be a special<br />
      prerequisite for usage.</p>

      <p>For example, all segmented files will stay in the same directory.<br />
      But we have not yet thought through all the pros and cons of this internally.</p>

      <p>For SQL it would make no sense to split, as we would then have to connect to 2 databases.<br />
      But for gettext and TMX, segmentation will be very convenient.</p>

      <p>Save translations:<br />
      ------------------<br />
      Zend_Translate itself will NOT save translation strings in files.<br />
      It has to translate. Strings should be saved by a tool, not by the framework.</p>

      <p>The framework itself only handles the source files.<br />
      For segmented files we are thinking of a fallback mechanism...<br />
      Already translated phrases will be cached; others will fall back to the next translation source if not found.<br />
      You can set a search order for the sources by hand, or automatically.</p>
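
      <p>The fallback idea above could look roughly like this. It is a sketch under stated assumptions: FallbackChain is a hypothetical name, each "source" is simplified to a plain array standing in for one segmented file, and returning the message id when nothing matches is an assumed convention.</p>

```php
<?php
// Hypothetical class for illustration; each source is a plain
// [messageId => text] array standing in for one segmented file.
final class FallbackChain
{
    private $sources;
    private $cache = [];

    /** @param array[] $sources Sources in the order they should be searched. */
    public function __construct(array $sources)
    {
        $this->sources = $sources;
    }

    public function translate(string $messageId): string
    {
        // Phrases that were already translated are served from the cache.
        if (isset($this->cache[$messageId])) {
            return $this->cache[$messageId];
        }
        // Otherwise walk the sources in search order until one knows the phrase.
        foreach ($this->sources as $source) {
            if (isset($source[$messageId])) {
                return $this->cache[$messageId] = $source[$messageId];
            }
        }
        // Assumption: an unknown phrase is returned untranslated.
        return $messageId;
    }
}

$chain = new FallbackChain([
    ['Hello' => 'Hallo'],
    ['Hello' => 'Servus', 'Goodbye' => 'Auf Wiedersehen'],
]);

echo $chain->translate('Goodbye'), "\n";
```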

      <p>Tying each controller / action to a corresponding translation file would not be a good idea.<br />
      This would lead to many small files, which is as bad an idea as having one oversized file.</p>

      <p>Caching:<br />
      --------<br />
      We could add an option to enable/disable caching...</p>

      <p>PHP-Array source:<br />
      -----------------<br />
      Matthew already asked about adding a simple PHP array as a translation source class.<br />
      I'm not sure if we should do this...</p>

      <p>I think this is quite unwieldy, but then I have never worked on small-business projects <ac:emoticon ac:name="wink" /></p>

      <p>Anyway, if we decide to add this, we must include a warning in the documentation that this class<br />
      performs well only up to a maximum of xxxxx array entries. Beyond that, using gettext or TMX is highly recommended.</p>

  5. Dec 27, 2006

    <p>Has anyone used TMX before? It looks bulky compared to gettext. Also, storing all the translations in a single file, as it suggests, is going to cause undue overhead on a large site.</p>

    1. Dec 27, 2006

      <p>TMX is an industry standard.<br />
      It is commonly used when different programs (C++/Java/Perl) use the same translation base.</p>

      <p>Pros are:</p>
      <ul>
      <li>Human readable due to its XML basis</li>
      <li>Thread-safe (gettext currently is not)</li>
      <li>Usable also in environments where gettext is not pre-installed (like Windows)</li>
      </ul>

      <p>Btw:<br />
      With TMX you do not have to store all translation strings in one file. As with gettext, you can separate the translations into multiple files, and you then have to switch the context to use another file.</p>

  6. Dec 27, 2006

    <ac:macro ac:name="note"><ac:parameter ac:name="title">Zend Feedback</ac:parameter><ac:rich-text-body>
    <p>Zend_Translate is conditionally accepted subject to the clarifications below. Also, there are numerous comments on the Zend_Locale proposal that apply to this proposal, since this proposal was split off from the Zend_Locale proposal. As such, the relevant discussions in the comments of the <ac:link><ri:page ri:content-title="Zend_Locale Proposal - Thomas Weidner" /><ac:link-body>Zend_Locale proposal</ac:link-body></ac:link> should be considered applicable to this proposal. For example:</p>

    <blockquote><p>The SQL backend storage should remain in the incubator until performance<br />
    benchmarks demonstrate performance is comparable to the TMX and Gettext<br />
    storage backends to avoid potential complications with developers<br />
    encountering unexpected performance problems.</p></blockquote>

    <p>The native gettext support wrapped by PHP's gettext(), textdomain(), etc. contains some idiosyncrasies that could be considered "unfriendly" and incompatible with the Zend Framework's goal of extreme simplicity. For example:</p>

    <ul>
    <li>Most gettext libraries check for translation files once, and then store them in a cache. Since they were not designed to work with web servers, if a cached file is changed, then the webserver normally only sees the change after restarting.</li>
    </ul>

    <ul>
    <li>When using the system's native gettext library, the developer should ensure that all needed locales have been added (many systems only have the local locale(s) pre-installed). For example, on Debian, after adding the desired locales, one must then run "locale-gen" as the root superuser.</li>
    </ul>

    <ul>
    <li>The setlocale() function should be called before gettext() when fetching messages for non-default locales. However, setlocale() is not thread-safe.</li>
    </ul>

    <p>Therefore, the Zend gettext-compatible adapter should provide a pure-PHP solution that works with gettext formatted translation files, does not use the gettext() PHP function, and avoids these various idiosyncrasies of the traditional gettext C library by:</p>

    <ul>
    <li>using Zend_Cache for caching support (if needed)</li>
    <li>supporting segmentation of lengthy files</li>
    <li>interoperating with Zend_Locale to remove dependencies on OS-specific locale files that may or may not be present.</li>
    </ul>
    </ac:rich-text-body></ac:macro>
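
    <p>To illustrate what a pure-PHP reader for gettext files involves, here is a minimal sketch that parses the binary .mo format without calling gettext(). It is not the adapter's implementation: parseMoData is a hypothetical name, and the sketch assumes 64-bit PHP, ignores the optional hash table, does little validation, and leaves plural forms as raw NUL-separated strings.</p>

```php
<?php
/**
 * Parse the contents of a gettext .mo file into a msgid => msgstr array.
 * Hypothetical helper; a sketch only (assumes 64-bit PHP, minimal validation).
 */
function parseMoData(string $data): array
{
    if (strlen($data) < 28) {
        throw new InvalidArgumentException('Data is too short to be a .mo file');
    }

    // The magic number tells us the byte order the file was written in.
    $magic = unpack('V', $data)[1];
    if ($magic === 0x950412de) {
        $int = 'V'; // little-endian 32-bit integers
    } elseif ($magic === 0xde120495) {
        $int = 'N'; // big-endian 32-bit integers
    } else {
        throw new InvalidArgumentException('Not a gettext .mo file');
    }

    // Header fields after the magic: revision, number of strings,
    // offset of the original-string table, offset of the translation table.
    $header = unpack(
        "{$int}revision/{$int}count/{$int}originals/{$int}translations",
        substr($data, 4, 16)
    );

    // Each table entry is a (length, offset) pair describing one string.
    $readEntry = function (int $tableOffset, int $index) use ($data, $int): string {
        $entry = unpack("{$int}length/{$int}offset", substr($data, $tableOffset + 8 * $index, 8));
        return (string) substr($data, $entry['offset'], $entry['length']);
    };

    $messages = [];
    for ($i = 0; $i < $header['count']; $i++) {
        $original = $readEntry($header['originals'], $i);
        if ($original === '') {
            continue; // the empty msgid carries file metadata, not a phrase
        }
        $messages[$original] = $readEntry($header['translations'], $i);
    }
    return $messages;
}
```

    <p>Given the bytes of a catalog file (for example from file_get_contents()), the function returns a plain array that could then be cached or segmented independently of the operating system's locale setup.</p>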

  7. Feb 28, 2008

    <p>I'm using a per "module" architecture</p>
    <ul>
    <li>application
    <ul>
    <li>modules
    <ul>
    <li>blog</li>
    <li>cms</li>
    <li>...</li>
    </ul>
    </li>
    <li>models</li>
    <li>layouts
    <ul>
    <li>
    <ul>
    <li>layout.phtml</li>
    <li>admin.phtml</li>
    </ul>
    </li>
    </ul>
    </li>
    <li>languages
    <ul>
    <li>blog
    <ul>
    <li>en.tmx</li>
    <li>es.tmx</li>
    <li>...</li>
    </ul>
    </li>
    <li>cms
    <ul>
    <li>...</li>
    </ul>
    </li>
    <li>layouts</li>
    </ul>
    </li>
    </ul>
    </li>
    </ul>

    <p>and I believe it would be nice to have some mechanism to load:</p>
    <ul>
    <li>a "global" translation</li>
    <li>a "layout" translation file</li>
    <li>and if it exists a per "module" and/or "controller" translation</li>
    </ul>
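
    <p>The layering described in the list above can be sketched with plain arrays. The function name mergeTranslationLayers is hypothetical, and the sketch assumes each translation file has already been loaded into a message id =&gt; text array; later layers override earlier ones, so a module can replace a global or layout entry.</p>

```php
<?php
/**
 * Merge translation layers in order: global first, then layout, then module.
 * Hypothetical helper; assumes every layer is a [messageId => text] array.
 */
function mergeTranslationLayers(array $layers): array
{
    $messages = [];
    foreach ($layers as $layer) {
        // array_replace keeps keys untouched (even numeric-looking ids);
        // entries from later layers override those from earlier ones.
        $messages = array_replace($messages, $layer);
    }
    return $messages;
}

$global = ['Submit' => 'Absenden', 'Title' => 'Startseite'];
$layout = ['Title' => 'Mein Layout'];
$module = ['Post' => 'Beitrag', 'Title' => 'Blog'];

$messages = mergeTranslationLayers([$global, $layout, $module]);
echo $messages['Title'], "\n";
```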

    <p>We plan to build various independent modules: blog, forum, cms ...<br />
    Deployment will not be easy if we have just one file for the whole application folder.</p>

    <p>A "global" translation file is sufficient to store very common things like button labels or menus,<br />
    but it is very important that each module has its own translation file.</p>

    <p>"Layout" message ids could go in the global file, but what happens if<br />
    a module has its own layouts?</p>

    <p>It would be possible to implement this as a Zend_Controller_Plugin, but I'm not sure that is the best way. I'm trying to do this directly in the Translate view helper.</p>

    <p>I prefer to use a TMX or XLIFF adapter instead of gettext or CSV, but I think it is better to keep each language separate: a TMX file should contain one main language and its variations. Moreover, this will be easier for the translation team to maintain.</p>

    1. Feb 28, 2008

      <p>Sebastian...</p>

      <p>You may not have noticed, but this proposal has already been in core for about a year.<br />
      And it has been capable of scanning/loading complete directory structures for about four months.</p>