How do I manage long-running tasks in my application infrastructure?

Last updated: February 9, 2026

Context

When building applications with long-running tasks (tasks that can take up to 24 hours to complete), it's important to understand how to structure your infrastructure to handle these tasks efficiently alongside regular backend operations like database queries.

Answer

Long-running tasks can run either on the same infrastructure as your other services or on separate infrastructure. Here are the key considerations and best practices:

Infrastructure Setup

Running tasks in the same infrastructure is recommended if:

  • You ensure there are sufficient resources for all components

  • You implement proper task recovery mechanisms

  • You configure the infrastructure for high availability

Key Requirements for Long-Running Tasks

  1. Implement task recovery mechanisms:

    • Log task status to a database or, for clusters on AWS, to a persistent disk

    • Implement logic to resume tasks from their last known state

  2. Configure for high availability:

    • Deploy at least 3 replicas of the application

    • Set up health checks

    • Implement graceful shutdown procedures (see our docs for how to implement them)
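
The recovery and graceful shutdown requirements above can be sketched as follows. This is a minimal illustration, not a reference implementation. It assumes the task can be split into numbered steps, that the platform signals shutdown with SIGTERM, and it uses SQLite as a stand-in for whatever database or persistent disk you choose. The table name (task_state) and all function names are hypothetical. Follow the graceful shutdown procedure in our docs for the exact requirements.

```python
import signal
import sqlite3
import threading

# Set when the process is asked to stop (assumption: SIGTERM on shutdown).
stop_requested = threading.Event()


def install_signal_handlers():
    """Request a graceful stop on SIGTERM/SIGINT. Call from the main thread."""
    def handler(signum, frame):
        stop_requested.set()

    signal.signal(signal.SIGTERM, handler)
    signal.signal(signal.SIGINT, handler)


def open_store(path):
    """Open (or create) the table holding one checkpoint row per task."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_state ("
        "task_id TEXT PRIMARY KEY, "
        "next_step INTEGER NOT NULL, "
        "status TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def load_next_step(conn, task_id):
    """Return the first step that has not completed yet (0 for a new task)."""
    row = conn.execute(
        "SELECT next_step FROM task_state WHERE task_id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else 0


def save_state(conn, task_id, next_step, status):
    """Persist the task's progress so another run can pick it up."""
    conn.execute(
        "INSERT OR REPLACE INTO task_state (task_id, next_step, status) "
        "VALUES (?, ?, ?)",
        (task_id, next_step, status),
    )
    conn.commit()


def run_task(conn, task_id, total_steps, do_step):
    """Run steps from the last checkpoint; return 'done' or 'interrupted'."""
    step = load_next_step(conn, task_id)
    while step < total_steps:
        if stop_requested.is_set():
            # Graceful shutdown: record where we stopped, then exit the loop.
            save_state(conn, task_id, step, "interrupted")
            return "interrupted"
        do_step(step)
        step += 1
        save_state(conn, task_id, step, "running")
    save_state(conn, task_id, total_steps, "done")
    return "done"
```

A worker would call install_signal_handlers() once at startup, then run_task(open_store("tasks.db"), "report-42", 1000, do_step), where the file name, task ID, and do_step callback are placeholders. Because a crash between do_step and save_state causes that step to run again on restart, each step should be safe to repeat.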