
<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

<ac:macro ac:name="unmigrated-inline-wiki-markup"><ac:plain-text-body><![CDATA[

Zend Framework: Zend_Scheduler Component Proposal

Proposed Component Name: Zend_Scheduler
Developer Notes: http://framework.zend.com/wiki/display/ZFDEV/Zend_Scheduler
Proposers: Matthew Ratzloff
Revision: 0.3 - 9 December 2006: Scheduling rules are now fully functional and are built on Zend_Date. Combined the separate Rule_* classes into Zend_Scheduler_Task_Rule.

Subversion: http://framework.zend.com/svn/laboratory/Zend_Scheduler/

(wiki revision: 21)

1. Overview

Zend_Scheduler is a request-based job scheduling component designed to allow easy scheduling of tasks based on simple temporal constraints, without the use of OS-based applications such as Crontab.

2. References

  • Crontab - Partial basis of constraint syntax

3. Component Requirements, Constraints, and Acceptance Criteria

  • Common task creation interface
  • Modular design

4. Dependencies on Other Framework Components

  • Zend_Controller_Front
  • Zend_Controller_Request_Abstract
  • Zend_Controller_Router
  • Zend_Exception

5. Theory of Operation

Users create a Zend_Scheduler instance, then add named Zend_Scheduler_Task instances to the scheduler. Multiple requests can be assigned to any given task, and a limit can be placed on the number of tasks to execute in any given request.
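
The flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SketchTask, SketchScheduler, and setLimit() are hypothetical stand-ins invented for this example, not the classes or methods in the Laboratory code, and the "dispatch" step is reduced to recording the task name.

```php
<?php
// Hypothetical stand-ins; these are NOT the Zend_Scheduler classes from the proposal.
class SketchTask
{
    public $name;
    public $requests = array();   // list of array(controller, action)

    public function __construct($name)
    {
        $this->name = $name;
    }

    // Several requests may be attached to one task.
    public function addRequest($controller, $action)
    {
        $this->requests[] = array($controller, $action);
        return $this;
    }
}

class SketchScheduler
{
    private $tasks = array();
    private $limit = 0;   // 0 means "no limit"

    public function addTask(SketchTask $task)
    {
        $this->tasks[$task->name] = $task;   // tasks are keyed by name
        return $this;
    }

    public function setLimit($limit)
    {
        $this->limit = (int) $limit;
        return $this;
    }

    // Runs at most $limit tasks and returns the names of the tasks that ran.
    public function run()
    {
        $ran = array();
        foreach ($this->tasks as $name => $task) {
            if ($this->limit > 0 && count($ran) >= $this->limit) {
                break;   // remaining tasks wait for a later request
            }
            // A real implementation would dispatch each attached request here.
            $ran[] = $name;
        }
        return $ran;
    }
}

$scheduler = new SketchScheduler();
$scheduler->addTask((new SketchTask('cache'))->addRequest('amazon', 'cache'))
          ->addTask((new SketchTask('catalog'))->addRequest('catalog', 'refresh'))
          ->setLimit(1);
$ran = $scheduler->run();
```

With a limit of one, only the first named task runs in this request and the second is left for a later one.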

6. Milestones / Tasks

7. Class Index

  • Zend_Scheduler
  • Zend_Scheduler_Exception
  • Zend_Scheduler_Task
  • Zend_Scheduler_Task_Rule
  • Zend_Scheduler_Backend_Interface
  • Zend_Scheduler_Backend_File
  • Zend_Scheduler_Backend_Db

8. Use Cases

9. Class Skeletons

]]></ac:plain-text-body></ac:macro>

]]></ac:plain-text-body></ac:macro>

  1. Jul 14, 2006

    <p>Is this going to add the "command" to the real crontab?</p>

    <p>What if a person doesn't have access to the crontab? Or what if they can only get a single line item added?</p>

    <p>I think that, since they really want things to be web focused, this should be more of an integrated solution with a scheduler and a runner.</p>

    <p>This way the runner can be run through a cron, or through a separate "poor man's" solution that runs on each page load.</p>

    1. Jul 14, 2006

      <p>Sure. It's my understanding that crontab access is not unusual in hosted environments. I don't understand why it would be, since it runs under your own username.</p>

      <p>If someone doesn't have access to Crontab I'm sure you could pass that into a Zend_Scheduler_Runner class and then call Zend_Scheduler_Runner->run() in the bootstrap file.</p>

      1. Jul 14, 2006

        <p>As someone who has worked for a couple of hosting providers, I can say crontab access is not the norm and is not provided by default with some popular control panels.</p>

        <p>A large percentage of hosting companies do not like people running background processes.</p>

        <p>Also, how can they use a runner class when the whole basis of your class is storing the data in the cron file or Windows scheduler?</p>

        <p>Also, permissions will limit any script's ability to run this. Most Apache processes and PHP scripts do not have much in the way of permissions, and a lot of the time they run not with the user's permissions but as nobody or apache.</p>

        <p>Those that do run as the proper user are normally chrooted to that user and again have limited access to run any programs.</p>

        1. Jul 14, 2006

          <blockquote>
          <p>Also, how can they use a runner class when the whole basis of your class is storing the data in the cron file or Windows scheduler?</p></blockquote>

          <p>I don't understand. One can create any interface they want with Zend_Scheduler_Interface. That could easily include a file-based system that is executed with additional functionality. You would probably not have both storage and execution functions in one class, like I said in my previous reply, but you could split them up. Pass a class that manages a crontab-like file (independent of Crontab) and then have a class essentially mimic Crontab by calling it with every page request.</p>

          <blockquote>
          <p>Also, permissions will limit any script's ability to run this. Most Apache processes and PHP scripts do not have much in the way of permissions, and a lot of the time they run not with the user's permissions but as nobody or apache.</p></blockquote>

          <p>Well, that's username/password. It would have to be set up with sudo...</p>

          <p>I guess I'm not opposed to Zend_Scheduler recreating all the functionality of crontab in PHP--it would be more difficult to write but the interface would be universal. Users could just register the (one) task in Crontab or whatever manually. Of course, that process would have to execute once a minute.</p>

          1. Jul 14, 2006

            <p>A user normally can't set up sudo operations, only root/admin can, so this would require admin assistance to install.</p>

            <p>The process would only need to be run as often as the developer feels is needed.</p>

            <p>If a website only needs daily tasks, they could set it up to run daily, whereas something like a game might need cleanup/updates every 5 minutes.</p>

            <p>It's very rare in the website world that something needs to run once a minute. Heck, it's rare to have to run a normal cron that often.</p>

            1. Jul 14, 2006

              <p>Maybe this would work as the best of both worlds: move all the crontab/Windows scheduler stuff into a "storage" container style setup.</p>

              <p>Zend_Scheduler_Storage_Db<br />
              Zend_Scheduler_Storage_File<br />
              Zend_Scheduler_Storage_CronTab<br />
              Zend_Scheduler_Storage_Windows</p>

              <p>The CronTab and Windows storage methods don't require runners. As a matter of fact, if you try to start a runner using those storage methods, they should throw an exception.</p>

              <p>$task->command would only exist for CronTab/Windows<br />
              $task->callback would exist for the other storage methods</p>

              <p>$task->name should follow Zend naming conventions, like Nightly_Cron, so that storage methods can use it as a "key" in arrays/methods.</p>

              <p>I think the whole 0/2 crontab convention should be tossed. Windows people don't understand it. A more literal interface should be used, in my opinion.</p>

              1. Jul 14, 2006

                <p>This is getting too complex on the user side.</p>

                <p>What if tasks are something like adding routes--it's all called from the bootstrap file and you can comment them out, add them at will, put them in a class and pull them from your config or a database? I think that's really simple and intuitive. It also would require zero knowledge of OS-specific scheduling programs by Zend_Scheduler.</p>

                <p>Let's pretend the user only has one task.</p>

                <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
                $scheduler = new Zend_Scheduler();

                $task = new Zend_Scheduler_Task();
                $task->... // set task to run every other Monday
                $scheduler->addTask($task);

                $scheduler->run();
                ]]></ac:plain-text-body></ac:macro>

                <p>Now this covers both bases. If you don't have access to Crontab, Zend_Scheduler::run() can run every time someone loads the page (poor man's scheduler). If you have access to Crontab, you hit whatever action this is under (you might want to specify something other than index/index for the purpose of filtering access_log) with one line that the user (manually) adds with `crontab -e` (or `schtasks /create` or whatever). Now you have an easily customizable scheduler with a "set it and forget it" line in Crontab.</p>

                <p>As far as I can tell, this will have to go through the HTTP server for a couple of reasons:</p>

                <ul>
                <li>Access to database, config settings, etc. without duplicating all of that</li>
                <li>Accommodate the web-oriented focus of Zend Framework</li>
                </ul>

                <p>Hmm... I really like that idea. It matches other components in style and doesn't require a sledgehammer (a class for storage methods, methods of interacting with the various OS-specific schedulers). Best of all, instead of an OS-specific syntax for setting time values, or an abstraction layer that converted everything to something else, it has a unified syntax that's cross-platform.</p>

                <p>The only real work, then, is defining and interpreting a scheme to set time values.</p>

                <p>What do you think?</p>

                1. Jul 15, 2006

                  <p>Looks good, just two quick notes.</p>

                  <p>The ability to catch up is really crucial for people who can't set up a cron file and have to rely on web page loads. If a page isn't loaded for 5 days, I may need to run 5 days' worth of crons. This, however, requires some sort of timing/storage method and is an edge case, so maybe there should be notes for it, but it's not needed from the start.</p>
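
                  <p>As a rough illustration of the catch-up idea, the number of missed runs can be derived from a stored "last run" timestamp. This is a minimal sketch assuming a fixed interval; the function name missedRuns() and the timestamps are hypothetical and not part of the proposal.</p>

```php
<?php
// Hypothetical helper: how many runs of a fixed-interval task were missed
// between the last recorded run and now (both Unix timestamps).
function missedRuns($lastRun, $now, $intervalSeconds)
{
    if ($intervalSeconds <= 0 || $now <= $lastRun) {
        return 0;
    }
    return (int) floor(($now - $lastRun) / $intervalSeconds);
}

$day     = 86400;
$lastRun = 1000000000;                  // an arbitrary Unix timestamp
$now     = $lastRun + 5 * $day + 3600;  // five days and one hour later
$missed  = missedRuns($lastRun, $now, $day);
```

                  <p>Whether all missed runs should be replayed or collapsed into a single run is a design decision the thread leaves open.</p>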

                  <p>There needs to be some sort of input/output object for storage use. This way storage can be left up to the developer.</p>

                  <p>$scheduler = new Zend_Scheduler();<br />
                  $task = new Zend_Scheduler_Task();<br />
                  $task->... // set task to run every other monday<br />
                  $scheduler->addTask($task);<br />
                  $storage = $scheduler->serialize();</p>

                  <p>Then<br />
                  $scheduler = new Zend_Scheduler($storage);<br />
                  Would unserialize and have all the tasks ready to use.</p>

                  <p>This allows easy editing and changing of the schedules for those who need it.</p>
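
                  <p>The round trip suggested here could look something like the following. It is a sketch only: the task table is a plain array with hypothetical keys, and PHP's built-in serialize() and unserialize() stand in for whatever a Zend_Scheduler::serialize() method might do.</p>

```php
<?php
// Hypothetical task table; plain arrays serialize without any class setup.
$tasks = array(
    'Nightly_Cron' => array(
        'controller' => 'catalog',
        'action'     => 'refresh',
        'days'       => 'Mon',
    ),
);

// Store: turn the task table into a string that can go in a file or a database column.
$storage = serialize($tasks);

// Restore on a later request.
$restored = unserialize($storage);
```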

                  1. Jul 15, 2006

                    <p>Richard,</p>

                    <p>I understand the need for the first point--the "catch-up" functionality. I think this is an interesting idea and would be something to look into further after an initial release (like how RewriteRouter has gradually been refined with each new release).</p>

                    <p>Re: the second point, I think most people are not going to need more than a handful of tasks, and so easy editing and changing of the schedules would be handled by altering the task code itself, like you might alter a route. I use the RewriteRouter analogy a lot because it seems to match this new model pretty closely. If you happen to have a lot of tasks, there would be nothing stopping you from putting them into a SchedulerController and not having them in the bootstrap file (or wherever).</p>

                    <p>Keeping in mind that these additional things could be added onto an existing class later, I'd like to focus on the timing interface.</p>

                    <p>Also, keeping with the web focus, I think commands should be actions, not shell commands. If they are shell commands, they should be wrapped in an action. I also think multiple actions should be allowed.</p>

                    <ac:macro ac:name="code"><ac:plain-text-body><![CDATA[
                    $task = new Zend_Scheduler_Task("My task");
                    $task->runs("amazon", "cache"); // optional "values" array parameter not used
                    ->runs("catalog", "refresh");
                    ->every("30 minutes");
                    ->every("Monday, Wednesday, Friday");
                    ->every("August, October, December, March");
                    ->beginning("2006-07-15 00:00:00"); // uses strtotime()
                    ->ending("2007-07-15 00:00:00");
                    $scheduler->addTask($task);
                    ]]></ac:plain-text-body></ac:macro>

                    1. Jul 15, 2006

                      <p>Er, without the semicolons, of course.</p>
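
                      <p>For reference, here is the chain from the example above with the intermediate semicolons removed. ChainedTask is a hypothetical recorder class written only so the snippet runs on its own; it is not Zend_Scheduler_Task, and it merely stores each call rather than interpreting it.</p>

```php
<?php
// Hypothetical recorder class, used only to show the call chain.
class ChainedTask
{
    public $name;
    public $calls = array();

    public function __construct($name)
    {
        $this->name = $name;
    }

    // Record any rule method (runs, every, beginning, ending) and return $this
    // so that calls can be chained.
    public function __call($method, $args)
    {
        $this->calls[] = array($method, $args);
        return $this;
    }
}

$task = new ChainedTask("My task");
$task->runs("amazon", "cache")
     ->runs("catalog", "refresh")
     ->every("30 minutes")
     ->every("Monday, Wednesday, Friday")
     ->every("August, October, December, March")
     ->beginning("2006-07-15 00:00:00")
     ->ending("2007-07-15 00:00:00");   // only the last call ends with a semicolon
```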

                    2. Jul 15, 2006

                      <p>The easiest thing to do would be to have the commands be callbacks. I like the idea of the optional values parameter, though.</p>

                      <p>I think that, for the sake of programming and reducing the number of checks you need to do, you should break the ->every out a bit more:</p>

                      <p>->interval("30 minutes");<br />
                      ->days("Mon, Wed, Fri"); // Assumes every day if not set<br />
                      ->months("Aug, Oct, Dec"); // Assumes every month if not set<br />
                      ->modifiers("First, Last, Second, Third, Fourth");</p>

                      <p>For example:<br />
                      ->days("Mon");<br />
                      ->months("Aug");<br />
                      ->modifiers("First");</p>

                      <p>This translates to the first Monday of August. With no time specification it would run at 00:01.</p>

                      <p>This way each function can focus on checking for what it needs to check for and there's no extra overhead.</p>
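
                      <p>To make the "first Monday of August" example concrete, PHP's own relative date formats can resolve such a rule to a date. This sketch assumes PHP 5.3 or later and uses DateTime as a stand-in for the proposed days()/months()/modifiers() methods, which are not implemented here; the year 2006 and the UTC time zone are arbitrary choices.</p>

```php
<?php
// Assumption: PHP's relative date formats stand in for the proposed rule methods.
$tz  = new DateTimeZone('UTC');
$run = new DateTime('first monday of august 2006', $tz);

// The comment above suggests 00:01 when no time is specified.
$run->setTime(0, 1);

$formatted = $run->format('Y-m-d H:i');
```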

          2. Nov 13, 2006

            <p>Unofficial comments ...</p>

            <p>There are at least two ways to enable actual scheduling of tasks created by this component. As you mention, a real task scheduler might have an entry that runs the PHP scheduler component, once every minute.</p>

            <p>Given the interest in programmatic interfaces for other tools and paradigms, perhaps many will see value in having a PHP API to drive the creation and management of *NIX style crontab configuration files. However, I see more value in providing an API that simplifies scheduling PHP tasks for a complex web application, like automatically pruning a forum on a bulletin board to the most recent 90 days' worth of posts. The IPB and vBulletin forum systems are good examples of popular, robust PHP applications with such scheduling services built in. They neither rely on nor need an external program, like crontab, to function correctly. Using cached values, and some form of locking to accomplish atomicity (e.g. moving a token file from a "done" directory to an "in progress" directory), there are ways to offer functionality similar to cron without actually using an external service.</p>

            <p>If this proposal included functionality to support scheduling PHP tasks (similar to IPB or vBulletin), I think the value of this component would increase.</p>

            1. Nov 13, 2006

              <p>.. for higher volume websites that always receive at least one page view per minute.</p>

              <p>Note that the "token" approach allows for minimal overhead, so that the complete "cron" system would only be require()'d once per minute, instead of per request.</p>
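
              <p>The token-file idea might be sketched like this. The directory layout and the function names claimToken() and releaseToken() are hypothetical, and the sketch assumes both directories sit on the same filesystem, where rename() is normally atomic on POSIX systems, so only one of several concurrent requests can move the token.</p>

```php
<?php
// Hypothetical layout: a token file lives in "done/" while idle and is moved
// to "in_progress/" by whichever request claims the job.
function claimToken($doneDir, $progressDir, $name)
{
    // @ suppresses the warning emitted when the token has already been moved.
    return @rename("$doneDir/$name", "$progressDir/$name");
}

function releaseToken($doneDir, $progressDir, $name)
{
    return @rename("$progressDir/$name", "$doneDir/$name");
}

// Set up a scratch area for the demonstration.
$base = sys_get_temp_dir() . '/sched_sketch_' . uniqid();
mkdir("$base/done", 0777, true);
mkdir("$base/in_progress", 0777, true);
touch("$base/done/prune_forum");

$first    = claimToken("$base/done", "$base/in_progress", 'prune_forum');   // claims the job
$second   = claimToken("$base/done", "$base/in_progress", 'prune_forum');   // token already moved
$released = releaseToken("$base/done", "$base/in_progress", 'prune_forum'); // job finished

// Clean up the scratch area.
unlink("$base/done/prune_forum");
rmdir("$base/done");
rmdir("$base/in_progress");
rmdir($base);
```

              <p>A request that fails to claim the token can return immediately without loading the rest of the scheduling code, which is where the low per-request overhead comes from.</p>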

  2. Nov 14, 2006

    <ac:macro ac:name="note"><ac:parameter ac:name="title">Moving to Laboratory</ac:parameter><ac:rich-text-body>
    <p>We've reviewed this proposal. It shows promise, but we are not considering it for inclusion in Zend Framework 1.0. But we encourage you to develop this idea further in the Laboratory, and it may be considered for a future release of Zend Framework.</p>

    <p>The discussion makes it clear that the use cases need some further development. Please show in the use cases section some scenarios under which this component would be useful. Some of the comment threads do provide this, but it needs to be summarized.</p></ac:rich-text-body></ac:macro>

  3. Nov 14, 2006

    <p>Unofficial comments...</p>

    <p>One idea for this component would be to take advantage of the new MVC architecture. The scheduler would allow the user to define an HTTP request to a local ZF application in a declarative way. The user would define the controller/action/parameters, and the schedule. But no callback is necessary. All the logic would be in the web app, as though someone hit it through their browser interface. </p>

    <p>The scheduler component (run by cron or another scheduler) would then instantiate a Zend_Http_Request object and invoke the Zend_Controller with the request.</p>

    <p>This paradigm allows requests to be handled by the same code in the web app, either by a schedule, or by the "simulated schedule" of running the job during a real web hit to the app. </p>

    <p>This could have legitimate use cases, for instance when you want a task to be run when an admin requests a certain report, or else every hour, whichever occurs sooner.</p>

    <p>Also note that MVC apps aren't necessarily exposed to the internet. They can be invoked in batch mode or from the command-line, in the new MVC architecture. So you could use the same paradigm described above to run PHP apps that aren't part of web apps.</p>

  4. Dec 01, 2006

    <p>Thanks for your suggestions, everyone. This has gone through so many permutations that the original proposal has almost no resemblance anymore to what I have in mind now (which is closer to my later comments combined with Bill and Gavin's valuable input). The new MVC architecture makes some interesting things possible.</p>

    <p>Today I started coding what I have in mind. Once my CLA is taken care of I'll update the proposal with the revised skeletons and use cases, although of course they will change as necessary.</p>

    <p>When I experimented with this a few months ago I ran into issues with when to parse the timing rules when dealing with delayed execution, possibility of serialization, etc. I think I've worked that out in my head now, so hopefully we'll see some progress on it in the near future.</p>

  5. Dec 05, 2006

    <p>The proposal has been completely revised. It takes into account all the feedback received to date.</p>

    <p>Thoughts?</p>

  6. Feb 01, 2007

    <p>I really think this is an important class to add to Zend Framework. I've been looking for something that does this for a long time. I followed the link here from Bill Karwin's <a href="http://framework.zend.com/wiki/display/ZFPROP/Zend_Console_Getopt+-+Bill+Karwin?showComments=false">Zend_Console_Getopt</a> proposal; he pointed me here as a way to run batch processes that access the Zend Framework's classes or access Controller/Actions from the command line.</p>

    <p>Whether from the command line or once a web client triggers an action that calls Zend_Scheduler, I can use this right now.  There are many classes already written in the Zend Framework that I need access to and don't want to rewrite.  If I were to create a batch processor outside of the Zend Framework, I would have to create a whole new application that provides the same functionality in a non-MVC environment.  I don't think MVC frameworks should be limited to web applications only.</p>

    <p>The batch processing I am doing takes too long for our clients to wait for, and I don't want to increase my php timeout value to a large amount of time.</p>

    <p>Also, I think a lot of developers do have access to crontab and can schedule tasks because they have their own servers doing the intensive processing and/or can request access to crontab for their tasks.</p>

  7. Feb 02, 2007

    <p>I see that this is using Zend_Date. Can someone point me in the right direction to download this? The only one I can find is Thomas Weidner's <a href="http://framework.zend.com/wiki/display/ZFPROP/Zend_Date+Proposal+-+Thomas+Weidner">Zend_Date proposal</a>.</p>

  8. Feb 02, 2007

    <p>Hi Jeff,</p>

    <p>Zend_Date has been moved to the Zend Framework core, at least in Subversion.</p>

    <p>Currently I'm rethinking certain aspects of Zend_Scheduler. The version available from the Laboratory is actually not the newest version--I've got a newer version on my personal computer, with unit tests. However, I haven't had a lot of time to work on it.</p>

    <p>At this point I'm trying to determine if I should create a simple Zend_Controller_ConsoleRouter to pair with this proposal, which would depend on Zend_Console_Getopt. Currently, tasks are processed within the current request. If there were a way to call actions directly from the command line, tasks could complete independent from the current request.</p>

    <p>Also, the initial vision of this class was that it would be a poor man's scheduler, and that tasks could be initiated when a user requested a page. There are Crontab-like time rules in place for that reason, to determine when these tasks could execute. There is also a feature that limits the number of tasks to execute in any given request and adds the rest to a queue.</p>

    <p>But there is an inherent flaw in this design; if you want something to execute every 20 minutes, you can't very well say <code>$task->setMinutes('0/20')</code>, because there's not a guarantee that someone will visit at that exact time on a site that would be more inclined to use something like this (lower traffic, no direct access to Crontab). If you expanded it to, say, <code>$task->setMinutes('0-5, 20-25, 40-45')</code>, the task could potentially execute multiple times within those periods. This means that there needs to be a way to limit not only the number of tasks that can execute within a request, but also the number of times a task executes in a given amount of time. But how does one represent that in the API? This is what I've been struggling with lately, and what has delayed a more finalized version of the class.</p>
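
    <p>One possible way to express "at most once per window" is to remember the last run time and compare it with the start of the current window. This is only a sketch of that idea, not a proposed API: isDue() and the array-of-pairs window format are hypothetical, and persisting the last run time between requests is left out.</p>

```php
<?php
// Hypothetical helper: $windows is a list of array(startMinute, endMinute)
// pairs within the hour, e.g. '0-5, 20-25, 40-45'. Returns true when $now
// falls in a window and the task has not already run inside that same window.
function isDue(array $windows, $now, $lastRun)
{
    $minute    = (int) gmdate('i', $now);
    $hourStart = $now - ($now % 3600);
    foreach ($windows as $w) {
        list($start, $end) = $w;
        if ($minute >= $start && $minute <= $end) {
            $windowStart = $hourStart + $start * 60;
            return $lastRun === null || $lastRun < $windowStart;
        }
    }
    return false;
}

$windows = array(array(0, 5), array(20, 25), array(40, 45));
$hour    = 3600 * 1000;   // an arbitrary whole hour as a Unix timestamp

$a = isDue($windows, $hour + 21 * 60, null);             // in a window, never ran
$b = isDue($windows, $hour + 23 * 60, $hour + 21 * 60);  // already ran in this window
$c = isDue($windows, $hour + 41 * 60, $hour + 21 * 60);  // next window, so due again
$d = isDue($windows, $hour + 30 * 60, null);             // outside every window
```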

    <p>All suggestions are welcome.</p>

  9. Feb 05, 2007

    <p>Could you have a function to execute at a specific time, such as $task->setTime('5:30, 14:25')? Then it would execute only at those times instead of within a given range of minutes. If it's already been set to execute at 5:30, then it won't be added to the queue again, or something like that?</p>

    1. Feb 05, 2007

      <p>I reviewed the scheduler and saw that you can already do this by just setting minutes, hours, etc. Maybe a switch to tell it not to add it to the queue if it's already in the queue. </p>

      1. Feb 05, 2007

        <p>I think you misunderstand how this (currently) works. This is a request-based scheduler, meaning it executes every time there is a page request. It's primarily for people without access to Crontab. It isn't a daemon like Crontab, which resides in memory and can execute programs at specified times.</p>

        1. Feb 05, 2007

          <p>OK, that makes sense. I was looking in the code for where it ran cron jobs or schtasks but couldn't find it, and that's where the whole debate about people not having access to crontab came into play. I thought it was implemented that way and people wanted it taken out because they didn't have access to it. </p>

          <p>Thanks for the clarification. </p>

          <p>So, to understand this correctly: if someone makes a page request within the intervals specified with $tasks->setMinutes, $tasks->setHours, etc., then it will run. If they do not access the page within those intervals, it will not run? </p>

          1. Feb 05, 2007

            <p>Right. And therefore I'm at a standstill, because I think this is fundamentally flawed, at least in its current state. Ideally I'd like to create a daemon-like process that acts like a cross-platform Crontab, to get the best of both worlds (the initial proposal and what it evolved into). The scheduling rule parser is complete, and in the latest version (not yet checked into Subversion) it works perfectly.</p>

            <p>The only way I can see this is as a separate script that must be invoked and exist in memory with no Apache or PHP timeout.</p>

            <p>Not necessary but ideal would be, as I said before, to be able to call actions from the command line, so that they can execute asynchronously from the request that spawns them. The question, of course, is how to handle routing those requests without using, I don't know, wget or something.</p>

            1. Feb 05, 2007

              <p>Yeah, I see the problem. I noticed a new proposal for a <a href="http://framework.zend.com/wiki/display/ZFPROP/Zend_Controller_Request_Cli">command line interface</a>, but don't know when that will be available. I think you could do a lot with CLI access to your actions.</p>

              <p>I'm somewhat new to the Zend Framework, and haven't figured out how to pass it the right values that would normally come through a browser.  I found that I can manually set $_SERVER['REQUEST_URI'] to the controller/action/params I want to access, but Zend keeps crashing in various spots.  I don't know exactly what is required for the main index.php to run, but I am working on it because I do need command line access right now.  If you or anyone else knows what is necessary, could you post a reply? </p>

              <p>All I want to do is include the main /document_root/index.php in my own file and set the required params to run /controller/actions/params</p>

              <p>If I figure it out I'll post it here, and maybe if the current proposal for a CLI doesn't go through I'll do my own proposal.  </p>

              1. Feb 05, 2007

                <p>Use your own <code>Zend_Controller_Request_Http</code> object and manually set the values for controller and action (along with any parameters you have) via <code>Zend_Console_Getopt</code>. Then, when Matthew completes <code>Zend_Controller_Request_Console</code>, refactor to use that instead.</p>
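
                <p>The first half of that advice, turning command-line arguments into controller, action, and parameter values, might look roughly like this. It is a plain-PHP sketch under stated assumptions: parseCliRequest() and the "controller action key=value" argument layout are hypothetical, and it does not use Zend_Console_Getopt or any Zend Framework class. The resulting values would then be set by hand on the request object.</p>

```php
<?php
// Hypothetical parser: turns "controller action key=value ..." arguments into
// the values one would set on a request object by hand.
function parseCliRequest(array $argv)
{
    $args = array_slice($argv, 1);   // drop the script name
    $request = array(
        'controller' => isset($args[0]) ? $args[0] : 'index',
        'action'     => isset($args[1]) ? $args[1] : 'index',
        'params'     => array(),
    );
    foreach (array_slice($args, 2) as $pair) {
        $parts = explode('=', $pair, 2);
        $request['params'][$parts[0]] = isset($parts[1]) ? $parts[1] : null;
    }
    return $request;
}

// Equivalent to running: php cli.php catalog refresh force=1
$request = parseCliRequest(array('cli.php', 'catalog', 'refresh', 'force=1'));
```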

                1. Feb 13, 2007

                  <p>When I started using Zend Framework, it had already been installed, with an application running on it. I didn't have the experience of setting it up myself, so I'm a bit lost when it comes to the core details of how things work. I don't know where to find a discussion group for the Zend Framework; I looked on group.google.com and couldn't find an active community. I have browsed around in the documentation a bit, but I find it too wordy. It would be nice to have API documentation that is concise and more of a reference than a manual, like php.net. If you know of a place where I can just ask questions, could you post it here?</p>

                  <p>I finally figured out that when I was running the code that I want to run from the command line, it was using a different include path than when I ran it from the browser. I got around this by throwing ini_set("include_path", ".;../library/;../application"); at the beginning of my script. Zend now can figure out where things are located without using the browser, and I no longer get file not found errors.</p>
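
                  <p>A slightly more portable variant of that workaround is sketched below. The directory names are copied from the comment above and remain assumptions about the project layout; PATH_SEPARATOR and set_include_path() are standard PHP, so the same line works whether the separator is ";" (Windows) or ":" (elsewhere).</p>

```php
<?php
// Hypothetical directory names; adjust to the actual project layout.
$paths = array('.', '../library', '../application');

// PATH_SEPARATOR is ';' on Windows and ':' elsewhere, so the same script
// works on both.
$includePath = implode(PATH_SEPARATOR, $paths);
set_include_path($includePath);
```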

                  <p>I'm now at the point to try to use my own Zend_Controller_Request_Http object. Do I just create my object, set its parameters, then pass it to Front controller's Dispatch() function as its first param?</p>

                  1. Feb 13, 2007

                    <p>Great! That worked. Now I can move my script outside the document_root/ and change the ini_set() location to indicate where to find the main directories needed. </p>

                    <p>Thanks for the help!</p>

  10. Aug 28, 2007

    <p>Hi there, I'm wondering if there would be a way to connect to the scheduler and fire tasks on the fly, as opposed to being forced to register all tasks at scheduler startup.</p>

    <p>This could be done by having the scheduler listen on a TCP port or a socket.</p>

    <p>Also, I don't know if it's planned, but there should be a way to tell the scheduler to reload its task table (from the file or DB backend).</p>

    1. Aug 30, 2007

      <p>Hi Geoffrey,</p>

      <p>Work has largely stopped on this component for a little while (in the past few months I've switched jobs, bought a house, and am currently in the process of moving/renovation). On-the-fly scheduling is planned but not on the slate for an initial release. Reloading is a good idea as well.</p>