之前講到利用celery異步處理一些耗時或者耗資源的任務,但是近來分析數據的時候發現一個奇怪的現象,即是某些數據重復了,自然想到是異步任務重復執行了。
查閱之后發現,到如果一個任務太耗時,任務完成時間超過了broker的時間(Redis默認為一小時)則任務會被再次分配到worker.
Visibility Timeout
The visibility timeout defines the number of seconds to wait for the worker to acknowledge the task before the message is redelivered to another worker. Be sure to see Caveats below.
This option is set via the
broker_transport_options
setting:app.conf.broker_transport_options = {'visibility_timeout': 3600} # 1 hour.
The default visibility timeout for Redis is 1 hour.
問題在於我的應用中的異步任務耗時絕不可能超過Redis默認的一小時,那么問題出在這個“Acknowledge”了,一開始我的理解是這個acknowledge是worker收到了broker發送的任務。但是通過查看workererr.log 發現:
[2019-03-01 14:20:30,695: INFO/MainProcess] Received task: task_async[4e0378e2-ff5d-4394-a842-ece2d1c8118a] ETA:[2019-03-01 15:44:03.692831+08:00]
[2019-03-01 15:23:58,477: INFO/MainProcess] Received task: task_async[4e0378e2-ff5d-4394-a842-ece2d1c8118a] ETA:[2019-03-01 15:44:03.692831+08:00]
[2019-03-01 15:44:04,620: INFO/ForkPoolWorker-2] Task task_async[4e0378e2-ff5d-4394-a842-ece2d1c8118a] succeeded in 0.003580662072636187s: None
[2019-03-01 15:44:04,621: INFO/ForkPoolWorker-1] Task task_async[4e0378e2-ff5d-4394-a842-ece2d1c8118a] succeeded in 0.004984764964319766s: None
1. 重復執行的任務被發送了多次 (時間間隔為1小時)
2. worker多次接收到同樣的任務(同ID),並且幾乎一樣的ETA(預計執行時間)
3. 在ETA到達之后,這個任務會被多個子線程認領並執行,每次執行時間並不長
所以為什么14:20:30任務接收到之后15:23:58任務再次發送呢,問題在約“Acknowledge”(認領)並不是以“Received”為結束標志的,看celery對於acknowledge的解釋:
acknowledged
Workers acknowledge messages to signify that a message has been handled. Failing to acknowledge a message will cause the message to be redelivered. Exactly when a transaction is considered a failure varies by transport. In AMQP the transaction fails when the connection/channel is closed (or lost), but in Redis/SQS the transaction times out after a configurable amount of time (the
visibility_timeout
).
所以說是以“Handled”來進行判定而非任務已被接收,所以會出現當我的定時任務在一小時后才執行的情況下,第一次發送的任務雖然接受了但是並未執行(Acknowledge),所以一小時后任務再次被發送。
解決這個問題的時候回看celery開篇教程中的一段:
Ideally task functions should be idempotent: meaning the function won’t cause unintended effects even if called multiple times with the same arguments. Since the worker cannot detect if your tasks are idempotent, the default behavior is to acknowledge the message in advance, just before it’s executed, so that a task invocation that already started is never executed again.
最佳實踐中的任務應該是冪等的!
總結起來:
1. Task received的時候並不是acknowledge的時候,而task執行才是acknowledge (任務才會從broker隊列中移除).
2. 我的任務都是定時任務(超過一小時),所以我設置visibility_time 超出我的定時,則重復執行不會再發生.
3. 如果任務很長或者跨度很長,如果對於只執行一次有嚴格要求,可以參考celery_once.
4. 還是要仔細閱讀官方文檔!!
Ref:
Scheduled tasks are being duplicated
https://github.com/cameronmaske/celery-once
http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#visibility-timeout
https://github.com/celery/django-celery/issues/176