After spending some time experimenting with the Magento cron system, I now have some good results and some helpful hints from a well-placed source. During my tests I set up a cron command to launch the Magento cron.php file at intervals of 5, 10, 15, 30 and 60 minutes.
Here is what I have found out:
- Magento takes around 1 minute to index 1,000 products.
- Look at your cron_schedule table in the database and make sure that scheduled tasks are NOT overlapping - each task must have a status 'success' before a new task is launched with status 'pending'.
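As a rough illustration of the second check, here is a minimal Python sketch that flags job codes with more than one unfinished entry. It assumes you have already exported rows from cron_schedule as (job_code, status) pairs, and it treats both 'pending' and 'running' as unfinished; the function name and the row format are my own assumptions, not part of Magento.

```python
from collections import defaultdict


def find_overlapping_jobs(rows):
    """Return job codes that have more than one unfinished schedule entry.

    rows: iterable of (job_code, status) tuples, e.g. exported from the
    cron_schedule table. 'pending' and 'running' are treated as unfinished
    (an assumption made for this sketch).
    """
    unfinished = defaultdict(int)
    for job_code, status in rows:
        if status in ("pending", "running"):
            unfinished[job_code] += 1
    # More than one unfinished entry for the same job code suggests overlap.
    return sorted(code for code, count in unfinished.items() if count > 1)


if __name__ == "__main__":
    sample = [
        ("catalogindex_run_queued", "success"),
        ("catalogindex_run_queued", "pending"),
        ("newsletter_send_all", "pending"),
        ("newsletter_send_all", "running"),
    ]
    print(find_overlapping_jobs(sample))
```

Note that this is only a rough signal: if you schedule several minutes ahead, more than one 'pending' entry per job code can be perfectly normal.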
This is simple guidance that will allow you to make an informed decision concerning the Magento scheduled task settings at: system > configuration > system > ‘Cron (Scheduled Tasks)’ tab.
Here are the settings I am currently using for some of my clients' stores (all values are in minutes):
Generate schedules every: 60
Schedule ahead for: 1
Missed if not run within: 60
History cleanup every: 120
Success history lifetime: 120
Failure history lifetime: 120
My cron job is set to run every fifteen minutes.
The above settings have been configured to allow Magento to generate and clean schedules within a 2-hour time frame.
Please note: I have my reservations about the Magento cron system in its default state. A few core modules have cron configurations that are initiated on each and every cron job run, which leads to a bloated cron_schedule table and overlapping cron schedules ('success'/'pending' conflicts). The conflicts mainly arise from the Magento job codes labelled 'catalogindex_run_queued' and 'newsletter_send_all' in the cron_schedule table. These schedules originate from their associated modules. Using Dreamweaver or TextPad, search for <schedule><cron_expr> and you will see the problem in the config.xml files of the CatalogIndex and Newsletter modules. Both config files have schedules that are set to initiate on each cron run, so schedules can overlap and cause the Magento cron system to fail.
The problem could be resolved by making a custom module for these cron settings. However, it would be much more flexible to have the schedule settings available within the admin interface for the CatalogIndex and Newsletter modules. So, until this situation is resolved use either a custom module or scale back your usage of cron jobs to allow catalog indexing to complete with a status of 'success' in the cron_schedule table.




5 Comments so far
This post seems to conflict with your earlier post that suggested making cron jobs run every minute... which should we use?
Hello, please use the information provided in this post to configure your Magento cron jobs. This is the most up-to-date setup I have, and in my experience it works very well.
Great post, explained a lot along with your previous one on Cron jobs in Magento!
I'd like to mention though that in Magento 1.4.0.1 (probably already from 1.4.0.0) 'catalogindex_run_queued' is not run anymore by default and even 'newsletter_send_all' is sent 'only' every 5 minutes.
Could you please also confirm my suspicion when using the settings you recommend above along with 1.4.0.1?
If the very first schedule generation happens at the 15th, 30th or 45th minute of the hour, then each subsequent generation will happen on those same minutes and will miss any job that would be scheduled on minute 0 of any hour. Unfortunately I'm not aware of how scheduling worked in earlier versions, but it doesn't seem to work properly now with your settings.
Please tell me I'm misunderstanding something :)
Thank you,
Chris
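The scenario Chris describes can be sketched with a small Python model. It is a simplification built only on the description above: it assumes generation runs exactly every "Generate schedules every" minutes, and that only jobs whose scheduled minute falls inside the "Schedule ahead for" window get created. The function and parameter names are hypothetical, not Magento's.

```python
def covered_minutes(first_run_minute, generate_every=60, schedule_ahead=1, hours=24):
    """Return the minutes of the hour (0-59) that ever fall inside a
    schedule generation window, under the simplified model above."""
    covered = set()
    t = first_run_minute          # minutes since the start of hour 0
    end = hours * 60
    while t < end:
        # Each generation covers [t, t + schedule_ahead) minutes.
        for offset in range(schedule_ahead):
            covered.add((t + offset) % 60)
        t += generate_every
    return covered


if __name__ == "__main__":
    # Settings from the post, first generation at minute 15 of the hour.
    window = covered_minutes(15, generate_every=60, schedule_ahead=1)
    print(sorted(window))
    print("minute 0 covered:", 0 in window)
```

Under this model, a job whose cron expression only matches minute 0 would never be generated when the first run lands on minute 15, whereas generating every 15 minutes with a 15-minute look-ahead would cover every minute of the hour.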
Hi,
It's no problem that the job "catalogindex_run_queued" is scheduled via XML for every minute: it only checks a flag in the database, and if the flag is set the catalog is reindexed.
The flag is set via the method Mage_CatalogIndex_Model_Indexer::queueIndexing(), which is currently not used at all.
One can call this method, and the next minute (assuming cron is called every minute and the XML cron definition for this job has not been changed) the catalog is reindexed asynchronously.
Sebastian
Chris and Sebastian,
Thanks for your comments and feedback. Very useful insights. Generally, the cron system works OK when running cron jobs once per minute, but if anything goes wrong during the cron run, or you have a third-party module installed that doesn't clear its cron schedules properly, then you end up with a very bloated cron_schedule table. Also, I like to keep things nimble where Magento is concerned, and I have been told directly by their support team that Magento takes around 1 minute to index 1,000 products.
So if you have 100,000 products and run cron once per minute then you will have a lot of failed cron jobs in your database (100 perhaps) and will be constantly running the indexing process. Maybe not ideal, but the above settings seem to work out fine on a mixture of sites both small and large.
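The arithmetic behind that estimate can be sketched in a few lines of Python. It takes the "1 minute per 1,000 products" figure at face value and assumes indexing time scales linearly; the function names are hypothetical.

```python
import math


def estimated_index_minutes(product_count, minutes_per_thousand=1.0):
    """Estimate indexing time, assuming it scales linearly with catalog size."""
    return product_count / 1000 * minutes_per_thousand


def cron_runs_during_index(product_count, cron_interval_minutes):
    """How many further cron runs would start while one index run is in progress."""
    return math.floor(estimated_index_minutes(product_count) / cron_interval_minutes)


if __name__ == "__main__":
    products = 100_000
    print(estimated_index_minutes(products))      # estimated minutes to index
    print(cron_runs_during_index(products, 1))    # cron every minute
    print(cron_runs_during_index(products, 15))   # cron every 15 minutes
```

With 100,000 products the estimate is 100 minutes, so a once-per-minute cron would fire about 100 times during a single index run, while a 15-minute cron would fire about 6 times.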