A Quick Start Guide to APScheduler, the Scheduled Task Framework for Python

  • 2020-06-07 04:48:52
  • OfStack

Preface

Scheduled tasks are a common requirement in programming, and many scheduling modules have emerged in Python. This article briefly introduces the basic usage of APScheduler.

1. Introduction to APScheduler

APScheduler is a Python scheduled task framework inspired by Quartz. It provides Quartz-style scheduling features and is very convenient to use. Jobs can be scheduled by date, at fixed time intervals, or crontab-style, and they can be persisted.

APScheduler provides several schedulers for developers to choose from according to their actual needs. It also provides different storage mechanisms, so it can easily work with external persistence backends such as Redis and relational databases. In short, it is powerful and easy to use.

In the Python world, another well-known scheduling module is Celery, which is also very powerful and is known as a distributed task scheduler. Interested readers can research it on their own.

Documentation: http://apscheduler.readthedocs.io/en/latest/

Package location: https://pypi.python.org/pypi/APScheduler/

How do you install it? It is very simple: install it directly with pip:


pip install APScheduler 

2. Main scheduling classes of APScheduler

There are several important concepts in APScheduler that need to be understood:

1. Trigger (trigger)

A trigger contains the scheduling logic. Each job has its own trigger, which determines when the job will run next, based on parameters defined in the trigger such as a point in time, a frequency, or a time interval. Apart from their initial configuration, triggers are completely stateless.

2. Job store (job store)

A job store holds the scheduled jobs. The default job store simply keeps jobs in memory, while other job stores keep them in a database. A job's data is serialized when it is saved to a persistent job store and deserialized when it is loaded. Job stores must not be shared between schedulers. Job stores support the mainstream storage mechanisms: Redis, MongoDB, relational databases, memory, and so on.

3. Executor (executor)

An executor handles the running of jobs, typically by submitting the callable specified in a job to a thread pool or process pool. When the job is done, the executor notifies the scheduler. Pooling allows more efficient use of CPU resources for different types of jobs.

4. Scheduler (scheduler)

Typically there is only one scheduler in an application. Application developers usually do not deal with job stores, executors, or triggers directly; the scheduler provides the appropriate interfaces to handle them. Job stores and executors are configured through the scheduler, as are adding, modifying, and removing jobs.

Here is a brief list of the commonly used schedulers:

  • BlockingScheduler: Runs in the current process and blocks it; use it when the scheduler is the only thing running in the process
  • BackgroundScheduler: Runs in the background without blocking the rest of the application
  • AsyncIOScheduler: Use it if the application uses the asyncio module
  • GeventScheduler: Use it if the application uses gevent
  • TornadoScheduler: Use it if the application is built on Tornado
  • TwistedScheduler: The scheduler for Twisted applications
  • QtScheduler: The scheduler for Qt applications

Therefore, the choice of scheduler in APScheduler is closely tied to the underlying implementation mechanism and should be made according to the application's programming model.

3. Job management

The job is the core of APScheduler: it carries the work that needs to be performed. Jobs can be added, modified, deleted, and queried dynamically while the system is running.

3.1 Adding a job

There are two ways to add a job:

Adding it dynamically with add_job

Code example:


sched.add_job(job_function, 'cron', day_of_week='mon-fri', hour='0-9', minute="*", second="*/4") 

Decorating the job function with the scheduled_job decorator

Code example:


@sched.scheduled_job('cron', id='my_job_id', day='last sun') 
def some_decorated_task(): 
 print("I am printed at 00:00:00 on the last Sunday of every month!") 

3.2 Removing a job


job = scheduler.add_job(myfunc, 'interval', minutes=2) 
job.remove() 
Same, using an explicit job ID: 
 
scheduler.add_job(myfunc, 'interval', minutes=2, id='my_job_id') 
scheduler.remove_job('my_job_id') 

Either form removes a specific job dynamically; the second does so by job ID.

3.3 Pausing and resuming jobs

Pausing a job:

apscheduler.job.Job.pause()

apscheduler.schedulers.base.BaseScheduler.pause_job()

Resuming a job:

apscheduler.job.Job.resume()

apscheduler.schedulers.base.BaseScheduler.resume_job()

3.4 Getting the job list

To get a list of the scheduled jobs, use get_jobs(), which returns all job instances. Alternatively, use print_jobs() to print a formatted list of all jobs.

3.5 Modifying a job

A job's attributes can be modified dynamically with apscheduler.job.Job.modify() or modify_job(); the only exception is the job ID, which cannot be changed.


job.modify(max_instances=6, name='Alternate name') 

In addition, a job's trigger can be reset dynamically with apscheduler.job.Job.reschedule() or reschedule_job(), as shown below:


scheduler.reschedule_job('my_job_id', trigger='cron', minute='*/5') 

3.6 Shutting down the scheduler

By default, the scheduler shuts down its job stores and executors and waits until all currently running jobs have completed. If you don't want to wait, set the wait option to False.


scheduler.shutdown() 
scheduler.shutdown(wait=False) 

4. APScheduler code example

The following example uses the decorator to schedule jobs:


from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

# Runs every three seconds.
@sched.scheduled_job('interval', seconds=3)
def timed_job():
    print('This job is run every three seconds.')

# Runs Monday to Friday, hours 0-9, minutes 30-59, every third second.
@sched.scheduled_job('cron', day_of_week='mon-fri', hour='0-9', minute='30-59', second='*/3')
def scheduled_job():
    print('This job is run on weekdays, hours 0-9, minutes 30-59, every 3 seconds.')

print('before the start function')
sched.start()  # blocks here until the scheduler is shut down
print("let us figure out the situation")  # reached only after shutdown

Code description:

This code uses BlockingScheduler, which runs in and blocks the current process, and defines two jobs. The first job runs at intervals of three seconds.

The second job uses a cron-style trigger: it runs Monday to Friday, between hour 0 and hour 9, during minutes 30 to 59, every 3 seconds.

An example that uses add_job instead of the decorator:


from apscheduler.schedulers.background import BackgroundScheduler 
from apscheduler.schedulers.blocking import BlockingScheduler 
 
import datetime 
import time 
import logging 
 
def job_function(): 
 print("Hello World" + " " + str(datetime.datetime.now()))
 
if __name__ == '__main__': 
 log = logging.getLogger('apscheduler.executors.default') 
 log.setLevel(logging.INFO) # DEBUG 
 
 fmt = logging.Formatter('%(levelname)s:%(name)s:%(message)s') 
 h = logging.StreamHandler() 
 h.setFormatter(fmt) 
 log.addHandler(h) 
 
 print('start to do it') 
 
 sched = BlockingScheduler() 
 
 # Schedules job_function to run every 4 seconds, every minute,
 # during hours 0-9, Monday to Friday
 sched.add_job(job_function, 'cron', day_of_week='mon-fri', hour='0-9', minute="*", second="*/4") 
 
 sched.start() 

5. A puzzling problem

The scheduled task never ran properly when executing the following code:


from apscheduler.schedulers.background import BackgroundScheduler 
from apscheduler.schedulers.blocking import BlockingScheduler 
 
import datetime 
import time 
 
def job_function(): 
 print("Hello World" + " " + str(datetime.datetime.now()))
 
if __name__ == '__main__': 
 print('start to do it') 
 sched = BlockingScheduler() 
 sched.add_job(job_function, 'cron', day_of_week='mon-fri', hour='0-9', minute="*", second="*/4") 
 sched.start() 

The error message, read literally, says that the logger it needs has not been set up in the logging module, so one has to be added.

Add the corresponding logging configuration: import logging, then attach a handler to the 'apscheduler.executors.default' logger, as the add_job example in section 4 does.

Later, the author ran the code again, and it executed normally even with the logging configuration removed. The author therefore speculates that the logging dependency needs to be brought in dynamically.
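A shorter way to give APScheduler's loggers a handler is to configure the root logger once with logging.basicConfig(). This is a sketch rather than the author's fix; it relies on standard logger propagation, where 'apscheduler' is the parent of the 'apscheduler.executors.default' logger used in section 4. The format string and levels are illustrative.

```python
import logging

# Configure the root logger once, before the scheduler starts.
logging.basicConfig(
    level=logging.INFO,
    format="%(levelname)s:%(name)s:%(message)s",
)

# Optionally raise the verbosity of APScheduler's own loggers only.
logging.getLogger("apscheduler").setLevel(logging.DEBUG)

# Child loggers inherit the level and propagate records to the root handler.
executor_log = logging.getLogger("apscheduler.executors.default")
print(logging.getLevelName(executor_log.getEffectiveLevel()))
```

Since every APScheduler logger propagates to the root logger, one handler on the root is enough to see scheduler, executor, and job store messages.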

6. Summary

APScheduler is a powerful and easy-to-use library. It provides many tools and flexible extension points that help solve scheduling problems easily and quickly. Add a few web pages on top of it and you have a powerful task scheduling system, don't you?

