Issues

ZF-9566: Zend_Feed_Writer_Renderer_Entry_Atom: a (utf-8) non breaking space makes atom feed creation fail

Issue Type: Bug Created: 2010-03-29T08:00:49.000+0000 Last Updated: 2010-04-29T07:44:37.000+0000 Status: Resolved Fix version(s): - 1.10.5 (26/May/10)

Reporter: Lasse Fister (lasse) Assignee: Pádraic Brady (padraic) Tags: - Zend_Feed_Writer

Related issues: Attachments:

Description

Warning: DOMDocument::loadXML() [domdocument.loadxml]: Entity 'nbsp' not defined in Entity, line: 1 in /home/commander/web/libraries/ZendFramework-1.10.2/library/Zend/Feed/Writer/Renderer/Entry/Atom.php on line 354

That's because Zend_Feed_Writer_Renderer_Entry_Atom::_loadXhtml() attempts to create XML using tidy. Tidy itself substitutes utf-8 non breaking spaces width   entities which in turn are not XML (and not Atom IMHO).

reproduce:

<pre class="highlight">
<?php
$this->layout()->disableLayout();
$feed = new Zend_Feed_Writer_Feed();
$feed->setTitle('test');
$feed->setLink('<a href="http://www.testtest.org">http://www.testtest.org</a>');
$feed->setFeedLink('<a href="http://www.testtest.org/feed">http://www.testtest.org/feed</a>', 'atom');
$feed->setDescription('test articles');
$feed->setId('<a href="http://www.testtest.org/feed/atom/">http://www.testtest.org/feed/atom/</a>');
$feed->setEncoding('UTF-8');
$feed->addAuthor(array(
    'name'  => 'test',
    'email' => 'test@testtest.org',
    'uri'   => '<a href="http://www.testtest.org">http://www.testtest.org</a>',
));
$feed->setDateModified(new Zend_Date());
$entry = $feed->createEntry();
$entry->setTitle('test article');
$entry->setId('<a href="http://www.testtest.org/feed/testarticle1id">http://www.testtest.org/feed/testarticle1id</a>');
$entry->setDateCreated(new Zend_Date());
$entry->setDateModified(new Zend_Date());

//using html_entity_decode to get an utf-8 no breaking space
$htmlSnippet = '

Test'.html_entity_decode(' ', ENT_COMPAT, 'UTF-8').'Content

';
$entry->setContent($htmlSnippet);

$entry->setDescription('test Description');
$feed->addEntry($entry);
echo $feed->export('atom');

adding 'quote-nbsp' => false to the tidy configuration resolved my problem:

<pre class="highlight">

    /**
     * Load a HTML string and attempt to normalise to XML
     */
    protected function _loadXhtml($content)
    {
        $xhtml = '';
        if (FALSE && class_exists('tidy', false)) {
            $tidy = new tidy;
            $config = array(
                'output-xhtml' => true,
                'show-body-only' => true,
                'quote-nbsp' => false
            );
...

But a more robust implementation of _loadXhtml would propably be better. The concept of "attempt[ing] to normalise to XML" seems not to fit in here. The Example above doesn't fail when tidy isn't installed (or used) at all. Let the user take care of passing something valid and throw an Exception on failure or turn the DOM warning into an Exception and try to handle that (possibly width tidy or a callback), instead of using tidy regardless of what the input is.

Regards, Lasse

Comments

Posted by Lasse Fister (lasse) on 2010-03-29T08:04:06.000+0000

removed a typo

Posted by Tobias Munk (schmunk) on 2010-04-09T08:52:41.000+0000

I got exactly the same problem.

My dev machine has no tidy and I created a valid XML string manually, while our production machine has tidy and the attempt to clean the string breaks it.

I found a function for HTML to XML entity replacement here:http://inanimatt.com/php-convert-entities.php

Posted by Tobias Munk (schmunk) on 2010-04-12T15:11:47.000+0000

Let the user take care of passing something valid and throw an Exception on failure or turn the DOM warning into an Exception and try to handle that (possibly width tidy or a callback), instead of using tidy regardless of what the input is. I second that.

Posted by Pádraic Brady (padraic) on 2010-04-29T07:44:36.000+0000

Resolved in r22054.

I'll investigate a more robust solution in time for ZF 2.0 and experiment with a better cleanup approach. I'll also build in options to disable the Tidy stage (or remove it completely unless warranted/can report on a fail with a consistently helpful message).

Have you found an issue?

See the Overview section for more details.

Copyright

© 2006-2016 by Zend, a Rogue Wave Company. Made with by awesome contributors.

This website is built using zend-expressive and it runs on PHP 7.

Contacts