Send task to queue in bulk - Celery Executor #8854

Open
mik-laj opened this issue May 13, 2020 · 3 comments · May be fixed by #30049
Comments

@mik-laj
Member

mik-laj commented May 13, 2020

Description

Hello,
I recently worked on CeleryExecutor and managed to optimize status retrieval by using bulk operations. Instead of fetching the status of each task with a separate query, a single query is sent for all tasks. In many cases this made the process more than 100 times faster.
#7542
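To illustrate the idea of bulk status retrieval, here is a minimal sketch for a Redis result backend. It is not the code from the linked change. It assumes that `client` is a redis-py `redis.Redis` instance, that results are stored under the `celery-task-meta-` key prefix, and that the JSON result serializer is in use. The function name `fetch_states_in_bulk` is hypothetical.

```python
import json
from typing import Dict, List

# Assumption: Celery key-value result backends store task metadata under this prefix.
TASK_KEY_PREFIX = "celery-task-meta-"


def fetch_states_in_bulk(client, task_ids: List[str]) -> Dict[str, str]:
    """Fetch the state of many tasks with one MGET instead of one GET per task."""
    if not task_ids:
        # MGET needs at least one key, so skip the round trip entirely.
        return {}
    keys = [TASK_KEY_PREFIX + task_id for task_id in task_ids]
    raw_values = client.mget(keys)  # a single request for all keys
    states = {}
    for task_id, raw in zip(task_ids, raw_values):
        if raw is None:
            # No stored result yet; Celery reports unknown tasks as PENDING.
            states[task_id] = "PENDING"
        else:
            # Assumption: the stored metadata is JSON with a "status" field.
            states[task_id] = json.loads(raw)["status"]
    return states
```

With N tasks this costs one network round trip rather than N, which is where the speed-up comes from when latency dominates.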
However, we still send tasks to the queue one request at a time, spread across many processes. This is very inefficient because of network latency.

def _send_tasks_to_celery(self, task_tuples_to_send):
    # Use chunks instead of a work queue to reduce context switching
    # since tasks are roughly uniform in size
    chunksize = self._num_tasks_per_send_process(len(task_tuples_to_send))
    num_processes = min(len(task_tuples_to_send), self._sync_parallelism)
    with Pool(processes=num_processes) as send_pool:
        key_and_async_results = send_pool.map(
            send_task_to_executor,
            task_tuples_to_send,
            chunksize=chunksize)
    return key_and_async_results
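
For context, here is a rough sketch of how the chunk size in the method above could be derived. The body of `_num_tasks_per_send_process` is not shown in this issue, so the ceiling division below is an assumption, and `tasks_per_send_process` is a hypothetical standalone name. It also assumes `sync_parallelism` is at least 1.

```python
import math


def tasks_per_send_process(to_send_count: int, sync_parallelism: int) -> int:
    """Return how many tasks each worker process receives per chunk (assumed formula)."""
    # Assumption: split the work evenly across processes, rounding up,
    # and never return 0 so that Pool.map always gets a valid chunksize.
    return max(1, math.ceil(to_send_count / sync_parallelism))
```

Note that chunking only reduces the overhead of handing work to the pool. Each task inside a chunk is still published with its own request, so the per-task network latency remains.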

It would be nice if all tasks could be sent in bulk with a single request. For Redis, this means using a pipeline.
https://github.com/andymccurdy/redis-py#pipelines
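
As a minimal sketch of what pipelining looks like with redis-py: commands are buffered on the client and sent together when `execute()` is called. This is only an illustration of the technique, not how Celery or its transport layer publishes messages. The function name `send_payloads_in_bulk`, the queue name, and the use of `LPUSH` for already-serialized payloads are assumptions made for the example. `client` is assumed to be a `redis.Redis` instance.

```python
from typing import Iterable, List


def send_payloads_in_bulk(client, queue_name: str, payloads: Iterable[bytes]) -> List[int]:
    """Push every payload onto a Redis list using a single round trip."""
    # transaction=False asks for plain batching without a MULTI/EXEC wrapper.
    with client.pipeline(transaction=False) as pipe:
        for payload in payloads:
            # Buffered locally; nothing is sent to the server yet.
            pipe.lpush(queue_name, payload)
        # All buffered commands are sent together; one reply per command comes back.
        return pipe.execute()
```

A caller would create the client once, for example with `redis.Redis(host="localhost", port=6379)`, and pass in all serialized messages for one scheduler loop. Whether Celery exposes a hook for this is exactly the open question below.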

Can it be done easily in Celery?

Best regards,
Kamil

@mik-laj mik-laj added the kind:feature Feature Requests label May 13, 2020
@mik-laj
Member Author

mik-laj commented May 13, 2020

@auvipy Can you look at it? You're a Celery expert. I think Celery doesn't support it yet, but I might be wrong.

@mik-laj mik-laj changed the title Send task to queue in bulk May 13, 2020
@auvipy
Contributor

auvipy commented May 13, 2020

The Celery Redis support needs more care, actually :) With my current time and other priorities in Celery, I haven't contributed much to the Redis part. I'm more focused on AMQP 1.0 and Kafka support and an asyncio-based worker...

@mik-laj mik-laj added area:Scheduler including HA (high availability) scheduler area:performance labels May 14, 2020
@kurtqq
Contributor

kurtqq commented Mar 8, 2022

This would be a good improvement.
