ZF-8457: Improving scalability of Zend_XmlRpc by replacing DOMDocument with plain text templates

Description

Hey guys! I'm currently working on a web service using Zend_XmlRpc. The service has to handle rather large responses. However, as I found out, the (imho excessive) use of DOMDocument instances for building XML requests/responses becomes a major performance bottleneck as the number of processed values grows.

Replacing DOMDocument with plain text templates in Zend_XmlRpc_Value_*::saveXML() improved performance a lot. Measured by cumulative execution time, Zend_XmlRpc_Response::__toString() became about 3 times faster.

Attached to this report you can find a patch against ZF-1.9.3PL1 which solves this issue. A fork of Zend_XmlRpc which can be used as a standalone replacement (i.e. no need to touch ZF code) can be found here: http://github.com/polycaster/Polycast_XmlRpc/

Cheers! Andreas

Comments

Using plain text templates is a no-go due to character encoding issues. DOMDocument largely masks these for us and ensures they do not become a problem; this was one reason for choosing it.
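To make the concern concrete, here is a minimal sketch of one closely related failure mode, markup characters in payload data. It is not taken from the attached patch: `templateString()`, `domString()` and `isWellFormed()` are hypothetical helpers written only for this illustration, and the patch may well escape values that this naive template does not.

```php
<?php
// Hypothetical naive template: the value is pasted in without any escaping.
function templateString(string $value): string
{
    return '<value><string>' . $value . '</string></value>';
}

// DOM-based equivalent: text nodes are escaped when the tree is serialized.
function domString(string $value): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $valueEl = $doc->appendChild($doc->createElement('value'));
    $stringEl = $valueEl->appendChild($doc->createElement('string'));
    $stringEl->appendChild($doc->createTextNode($value));
    return $doc->saveXML($valueEl);
}

// Returns true if the string parses as XML; parser warnings are suppressed.
function isWellFormed(string $xml): bool
{
    $previous = libxml_use_internal_errors(true);
    $ok = (new DOMDocument())->loadXML($xml) !== false;
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

$input = 'Tom & Jerry <3';
var_dump(isWellFormed(templateString($input))); // bool(false)
var_dump(isWellFormed(domString($input)));      // bool(true)
```

A template can of course escape its input by hand, but then every value class has to get that right on its own, which is the work DOMDocument does implicitly.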

One alternative implementation would be to use XMLWriter, which would be quite possible. XMLWriter is very fast, and also would mask the encoding issues. Would you be willing to try an implementation using it and benchmarking it?
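For reference, a rough sketch of what building a single string value with XMLWriter could look like. This is only an illustration of the extension's API, not a proposed Zend_XmlRpc implementation; the function name `xmlWriterString()` is made up for the example.

```php
<?php
// Hypothetical helper: emit one XML-RPC style <value><string> element.
function xmlWriterString(string $value): string
{
    $writer = new XMLWriter();
    $writer->openMemory();                  // write into an in-memory buffer
    $writer->startDocument('1.0', 'UTF-8');
    $writer->startElement('value');
    $writer->startElement('string');
    $writer->text($value);                  // text() escapes &, < and >
    $writer->endElement();                  // </string>
    $writer->endElement();                  // </value>
    $writer->endDocument();
    return $writer->outputMemory();         // return the buffer and clear it
}

echo xmlWriterString('Tom & Jerry <3');
```

Like DOMDocument, the writer escapes text content itself, so the calling code never concatenates raw values into markup.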

Sure, encoding is a good point. I'll give the XMLWriter solution a try and keep you posted.

XDebug profiles of two similar requests. The only difference is the XmlRpc implementation.

On the left-hand side is the current ZF implementation; on the right, the patched version using XMLWriter.

Looks like XMLWriter is performing way better.

Argh, sorry, the last patch was broken. Here's a proper one :)

As I'm on vacation for the next week, I would like to work on this issue. @Matthew: could we just bring that in without a proposal? @Everybody: I would prefer an adapter based version, as XMLWriter is the more exotic extension and I would imagine it might not be available everywhere. @Andreas: are you available in the next week to work on this issue?

I've committed a refactored version of the client which introduces a new subpackage (Zend_XmlRpc_Generator_*). These generators are used to build the XML of request/response/values. It would be very kind if you could redo your performance measurements with the version in standard trunk and see if it is still as fast as you expected it to be.

I'll perform another benchmark in the next days. Have been offline the last week. I'll keep you posted.

Sorry for the delay.

I tested the new version of Zend_XmlRpc with the same benchmark as before, but now, using XMLWriter, the memory usage goes through the roof with just a fraction of the number of Zend_XmlRpc values involved.

For comparison: Before, I was able to handle about 900 (more or less equal) response objects within approx. 3.2 seconds with the benchmark application. Now it works with 100 of them in ~3ms, but runs out of memory with larger responses, even though the memory limit is quite generous.

So, the CPU load looks consistently fine. I can't see a drastic overhead compared to the version without the adapters. But I can't really compare the results, as I was not able to test at large scale.

However, the memory thing is a no-go in my opinion.

I traced the benchmark with xdebug and eventually found that the problem can be solved if XMLWriter is flush()ed more frequently.
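As a rough illustration of the idea (not the attached patch, and not the actual generator code), the sketch below drains an in-memory XMLWriter buffer every few items instead of once at the end. `writeValues()`, its `$flushEvery` parameter and the `php://temp` target are assumptions made for this example only.

```php
<?php
/**
 * Hypothetical helper: serialize a list of values as an XML-RPC style
 * <array>, moving the writer's buffer to $stream every $flushEvery items.
 */
function writeValues(array $values, $stream, int $flushEvery = 100): void
{
    $writer = new XMLWriter();
    $writer->openMemory();
    $writer->startElement('array');
    $writer->startElement('data');

    $count = 0;
    foreach ($values as $value) {
        $writer->startElement('value');
        $writer->writeElement('string', (string) $value);
        $writer->endElement(); // </value>

        if (++$count % $flushEvery === 0) {
            // For a memory writer, flush(true) returns the buffered XML
            // and empties the buffer, so it never grows past one batch.
            fwrite($stream, $writer->flush(true));
        }
    }

    $writer->endElement(); // </data>
    $writer->endElement(); // </array>
    fwrite($stream, $writer->flush(true)); // write whatever is left
}

// Usage: php://temp spills to a temporary file once it exceeds its memory cap.
$stream = fopen('php://temp', 'w+');
writeValues(range(1, 250), $stream, 100);
rewind($stream);
printf("%d bytes written\n", strlen(stream_get_contents($stream)));
fclose($stream);
```

Whether this helps in a given setup depends on where the flushed data ends up; collecting every chunk in one PHP string only moves the memory from the writer to the script.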

Find my proposal for a solution attached.

@Lars, regarding the patch: Before I read your latest patch I had already worked on an adapter-based solution, but didn't finish it, before you released yours. After I did the benchmark it was easier for me to modify my own code, because it was coincidentally closer to the flush-more-often-solution. Adapt the patch as you like, but I'm sorry I reverted some of your changes.

Besides (and totally not related to the actual issue anymore): I moved the ?etEncoding() methods to Zend_XmlRpc_Generator_AbstractGenerator. I think it is better to solve this with dependency injection, as it avoids the trouble you run into when setting the generator after you set the encoding.

Cheers! Andreas

No comment at all?

I'm sorry for taking so long to respond; I was on vacation. I will try to fix the issue next weekend and let you know when I have something working.

Any news about this issue yet?

Maybe it'd help to set the ticket status to "open" :)

With the memory problem still unsolved the situation became actually worse and the component moved from slow to not usable at all. Well, that is in my scenario, but I suppose others may be affected as well.

I second the suggestion to re-open the ticket. I am curious why the XMLWriter generator is now the default (if XMLWriter is enabled). Given the "new-ness" of this feature and the concerns about its memory usage, I think the DOMDocument generator should be used unless there is a call to {{Zend_XmlRpc_Value::setGenerator(new Zend_XmlRpc_Generator_XMLWriter());}}

This is just my monthly reminder to have a look at this ticket or at least give a comment if you're going to work on this.

We experience similar issues with 1.10.2. XmlRpc is pretty unstable, next to unusable for us.

I've been trying all sorts of things, from caching the definitions (two ways) to {{XmlWriter}} etc. None of them seem to have a noticeable impact in my setup.

My issue is as follows:

We have a service layer in our application which we use to authenticate people on the backend and also grab useful information from the backend for customer support and so on. One of the biggest things is a search by name across our user data. The relevant SQL query takes 0.08 seconds to run, which is acceptable for customer support, but it returns 150 records.

Each record is an array of username, first, last, email, account type and when the account was created and last modified. Now whenever I have a large result set, I keep getting socket timeouts (10 seconds). I've tried using a custom {{Zend_Http_Client}} on {{Zend_XmlRpc_Client}} with a higher timeout to no avail.

We recently upgraded from 1.8.1 to 1.10.2 and this is when those issues started. I've been profiling a lot over the last two days and it seems that the XML generation is taking an awfully long time on the server, but exceeding 10 seconds is still insane.

I've also tried to {{setSkipSystemLookup(true)}} - which didn't help either.

It seems to work stably and consistently with result sets smaller than 30-40 records. The funky thing is that this used to work fine. We've been using the {{Zend_XmlRpc}} components for over two years in production.

I've ruled out network connectivity and DNS etc. too.

This is related to ZF-9504. The new "Generator" method of doing things in 1.10 is causing an exponential growth in memory usage when using DOMDocument based generation. This too increases script execution time.

With the fix to ZF-9504 I think this should be considered resolved as well. After this patch I'm able to return XMLResponses with a 1500+ item resultset in < 4 seconds on my laptop.

My resultset looks like:

array(
    'values' => array(
        0 => array(
            'id' => 1,
            'name' => 'Foobar',
            'group_id' => 7,
            'priority' => 0,
        ),
        // ...
        1550 => array(/* ... */),
    ),
    'last_modified' => '2010-01-01 00:00:00',
)

Performance is nearly identical for both DOMDocument and XMLWriter, ~19M total mem usage and ~3.7s wall-time.
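For anyone who wants to take comparable measurements, a minimal sketch of such a harness follows. It is not the benchmark used above and does not touch Zend_XmlRpc: `buildResultSet()`, `writeValue()`, `encode()` and `measure()` are hypothetical helpers, the encoder only handles ints, strings and arrays, and the numbers it prints will differ from the ones quoted in this thread.

```php
<?php
// Hypothetical stand-in for a result set shaped like the one shown above.
function buildResultSet(int $rows): array
{
    $values = array();
    for ($i = 0; $i < $rows; $i++) {
        $values[$i] = array('id' => $i + 1, 'name' => 'Foobar', 'group_id' => 7, 'priority' => 0);
    }
    return array('values' => $values, 'last_modified' => '2010-01-01 00:00:00');
}

// Minimal XML-RPC style encoder (an assumption for this sketch, not ZF code).
function writeValue(XMLWriter $w, $value): void
{
    $w->startElement('value');
    if (is_int($value)) {
        $w->writeElement('int', (string) $value);
    } elseif (is_array($value) && $value === array_values($value)) {
        // Keys are 0..n-1 in order: encode as <array>.
        $w->startElement('array');
        $w->startElement('data');
        foreach ($value as $item) {
            writeValue($w, $item);
        }
        $w->endElement();
        $w->endElement();
    } elseif (is_array($value)) {
        // Any other array: encode as <struct> with named members.
        $w->startElement('struct');
        foreach ($value as $name => $member) {
            $w->startElement('member');
            $w->writeElement('name', (string) $name);
            writeValue($w, $member);
            $w->endElement();
        }
        $w->endElement();
    } else {
        $w->writeElement('string', (string) $value);
    }
    $w->endElement();
}

function encode(array $data): string
{
    $w = new XMLWriter();
    $w->openMemory();
    writeValue($w, $data);
    return $w->outputMemory();
}

// Runs $fn once and reports wall time plus the process-wide peak memory.
function measure(callable $fn): array
{
    $start = microtime(true);
    $result = $fn();
    return array(
        'seconds' => microtime(true) - $start,
        'peak_bytes' => memory_get_peak_usage(true),
        'result' => $result,
    );
}

$m = measure(function () {
    return encode(buildResultSet(1551));
});
printf("%.3fs wall time, %.1f MB peak, %d bytes of XML\n",
    $m['seconds'], $m['peak_bytes'] / 1048576, strlen($m['result']));
```

Note that `memory_get_peak_usage()` reports the peak for the whole process, so each generator should be measured in a separate run to get comparable figures.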