Backend Workflows Explained: Jobs, Queues, Retries, and Cron Schedules

Authored by Cameron Booth

Reviewed by Kelly Weaver

Last updated: January 30, 2026

If you've ever stared at a backend system wondering "where should this business logic live?", you're not alone.

That moment when you're building a feature and suddenly realize you need to coordinate five different things: send an email, update a database, call a third-party API, maybe wait for approval, and somehow handle failures gracefully.

The truth is, most developers learn backend patterns through trial and error, building systems that work... until they don't. Until the approval workflow breaks. Until the retry logic creates infinite loops. Until debugging becomes archaeology.

This guide covers the practical patterns that actually work in production: where to put business logic, how to handle long-running processes, when to use background jobs, and how to build systems that don't turn into spaghetti code.

Where should I put business logic—frontend, backend, or database?

💡
Business logic location

Business logic belongs in the backend, with strict validation in the database and lightweight presentation logic in the frontend.

Here's the hierarchy that works:

  • Frontend: Presentation logic only. Form validation for user experience, UI state management, and data formatting. Never trust the frontend for business rules.
  • Backend: Core business logic, orchestration, and workflow coordination. This is where approval processes live, where complex calculations happen, and where you coordinate between different systems.
  • Database: Data integrity constraints, referential integrity, and simple business rules that can be expressed as constraints. Think foreign keys, unique constraints, and check constraints.

The biggest mistake I see? Putting business logic in the frontend because "it's faster to ship." That decision will haunt you the moment you need a mobile app, a third-party integration, or literally any other client.

How do I implement approval workflows that don't break?

💡
Workflows that don't break

You should model approval workflows as explicit state machines with clear transitions, not as boolean flags scattered across your codebase.

A proper approval workflow needs:

  • State tracking: Use an enum or status column with explicit states like draft, pending_approval, approved, rejected, changes_requested.
  • Transition rules: Define which states can transition to which other states. A rejected item shouldn't jump directly to approved without going through review again.
  • Audit trail: Track who changed what when. Use a separate workflow_events table that logs every state transition with user_id, a timestamp, and optional comments.
  • Role-based permissions: Different users can perform different transitions. Not everyone who can create can approve.

Here's what this looks like in practice:

-- Don't do this
ALTER TABLE documents ADD COLUMN is_approved BOOLEAN DEFAULT FALSE;

-- Do this instead (the enum type must exist before the column that uses it)
CREATE TYPE workflow_status AS ENUM ('draft', 'pending_approval', 'approved', 'rejected', 'changes_requested');
ALTER TABLE documents ADD COLUMN status workflow_status DEFAULT 'draft';

The state machine approach scales. Boolean flags don't.
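The transition rules also need enforcement in application code. Here's a minimal Python sketch of that idea (the function and map names are illustrative, not from any particular framework):

```python
# One map is the single source of truth for which status changes are legal.
ALLOWED_TRANSITIONS = {
    "draft": {"pending_approval"},
    "pending_approval": {"approved", "rejected", "changes_requested"},
    "changes_requested": {"pending_approval"},
    "rejected": {"pending_approval"},  # must go back through review
    "approved": set(),                 # terminal state
}

def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the transition is illegal."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {target}")
    return target
```

Because every legal path lives in one map, adding a new state (or auditing the workflow) means changing one place, not hunting boolean checks across the codebase.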

How do I handle retries when something fails?

💡
Handling retries

Implement exponential backoff with jitter, maximum retry limits, and dead letter queues for failures that can't be recovered.

Retries are not just "try again." They're a system design decision that affects everything downstream.

  • Exponential backoff: Start with short delays (1 second) and increase exponentially (2s, 4s, 8s, 16s). This prevents overwhelming downstream systems.
  • Add jitter: Random variation in retry timing prevents the "thundering herd" problem where all failed requests retry at exactly the same time.
  • Maximum attempts: Set a hard limit (usually 3-5 attempts). After that, move the job to a dead letter queue for manual investigation.
  • Idempotency: Make sure retrying the same operation multiple times doesn't create duplicate effects. Use idempotency keys for critical operations.
  • Circuit breakers: If a downstream service is failing consistently, stop sending requests temporarily rather than continuing to retry and making things worse.

The key insight: retries are not about persistence. They're about graceful degradation.
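A sketch of those first three bullets, exponential backoff with full jitter and a hard attempt limit, might look like this in Python (the dead letter queue hand-off is left as a comment since it depends on your queue infrastructure):

```python
import random
import time

def retry_with_backoff(operation, max_attempts=4, base_delay=1.0):
    """Retry `operation`, sleeping base_delay * 2^attempt plus random
    jitter between attempts. Re-raises the last error once attempts are
    exhausted, at which point the job would move to a dead letter queue."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # exhausted: hand off to the dead letter queue here
            delay = base_delay * (2 ** attempt)
            delay += random.uniform(0, delay)  # full jitter
            time.sleep(delay)
```

The jitter line is what prevents the thundering herd: without it, every client that failed at the same moment retries at the same moment too.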

How do I run long-running jobs without blocking everything?

💡
Long-running jobs

Use background job queues with proper job scheduling, progress tracking, and timeout handling.

Long-running jobs should never run in the request-response cycle. Ever.

  • Background queues: Jobs that take more than a few seconds belong in a background queue (Redis, Xano’s Tasks, or managed services).
  • Progress tracking: For jobs that users care about, implement progress tracking. Store job status and completion percentage in a dedicated table.
  • Timeouts: Set reasonable timeouts. A job that's been running for 6 hours is probably stuck, not just slow.
  • Chunking: Break large operations into smaller chunks. Instead of processing 10,000 records at once, process them in batches of 100.
  • Graceful shutdown: Handle interruption gracefully. If a job gets killed, it should be able to resume from where it left off.

The pattern that works: queue the job immediately, return a job ID to the client, let them poll for status or use webhooks for completion notifications.
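That queue-then-poll pattern can be sketched in a few lines of Python. This uses an in-memory dict and a thread as stand-ins for a real jobs table and worker process, so it's an illustration of the shape, not production code:

```python
import threading
import uuid

# In-memory job store standing in for a jobs table or Redis.
JOBS = {}

def enqueue(task, *args):
    """Start `task` in the background and immediately return a job ID
    the client can poll for status and progress."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = {"status": "queued", "progress": 0, "result": None}

    def run():
        JOBS[job_id]["status"] = "running"
        JOBS[job_id]["result"] = task(*args, job=JOBS[job_id])
        JOBS[job_id]["status"] = "done"

    threading.Thread(target=run, daemon=True).start()
    return job_id

def process_in_chunks(records, job, chunk_size=100):
    """Process records in batches, updating progress as each chunk finishes,
    rather than holding everything in one giant operation."""
    done = 0
    for i in range(0, len(records), chunk_size):
        chunk = records[i:i + chunk_size]
        done += len(chunk)  # real per-record work would happen here
        job["progress"] = int(100 * done / len(records))
    return done
```

The client gets the job ID back in milliseconds, and the progress field gives you something meaningful to show in the UI while the batches run.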

When should I use background jobs vs. API calls?

💡
Background jobs vs. API calls

Use background jobs for operations that are slow, unreliable, or don't need immediate feedback. Use synchronous API calls for operations that must complete before the user continues.

Background jobs are good for:

  • Sending emails or notifications
  • Processing uploads or large datasets
  • Third-party API calls that might fail
  • Operations that can take more than 2-3 seconds
  • Batch operations

Synchronous API calls are good for:

  • User authentication
  • Data validation that affects the next step
  • Operations that must complete for the UI to make sense
  • Simple database reads/writes that are fast and reliable

The test: if the user needs to see the result immediately to continue their workflow, it's synchronous. If they can continue working while it happens in the background, it's async.

How do I build event-driven workflows that don't become chaos?

💡
Event-driven workflows

Design event-driven systems around business events, not technical events. Use explicit event schemas and centralized event routing.

Event-driven architecture is powerful, but it can quickly become impossible to debug if you don't establish clear patterns.

  • Business events, not technical events: Publish UserRegistered, OrderCompleted, PaymentFailed—not DatabaseRowInserted or APICallMade.
  • Explicit schemas: Every event should have a defined structure. Use a schema registry or at least document your event formats.
  • Single responsibility: Each event handler should do one thing well. Don't create handlers that update three different systems.
  • Idempotency: Event handlers might receive the same event multiple times. Make sure they can handle duplicates gracefully.
  • Event sourcing (when appropriate): For complex domains, consider storing events as the source of truth rather than just using them for notifications.

The key is treating events as a first-class part of your system design, not an afterthought.
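The idempotency bullet in particular is worth seeing concretely. Here's a minimal sketch of a handler that tolerates duplicate delivery, using an in-memory set where production code would use a unique-keyed table (the event fields shown are illustrative):

```python
# Seen event IDs; in production this would be a table with a unique
# constraint on event_id, so the dedupe survives restarts.
_processed = set()

def handle_order_completed(event: dict) -> bool:
    """Handle an OrderCompleted business event at most once.
    Returns True if work was done, False for a duplicate delivery."""
    event_id = event["event_id"]
    if event_id in _processed:
        return False  # duplicate: safe to acknowledge and skip
    # ... do the one thing this handler owns, e.g. issue the receipt ...
    _processed.add(event_id)
    return True
```

Note the handler keys on the event's own ID, not on delivery metadata, so it behaves the same whether the duplicate comes from a broker redelivery or an upstream retry.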

How do I track state across multiple steps?

💡
Tracking state

Use explicit workflow orchestration with persistent state tracking, not implicit coordination through side effects.

Multi-step workflows need explicit coordination:

  • Workflow tables: Create dedicated tables to track workflow progress. Each workflow instance gets a row with current step, completion status, and relevant metadata.
  • Step definitions: Define your workflow steps explicitly in code or configuration. Each step knows what it does and what comes next.
  • Compensation actions: For each step that changes external state, define how to undo it if later steps fail.
  • Timeouts and recovery: Set timeouts for each step and define what happens when they expire.
  • Visibility: Build admin interfaces to see workflow status and intervene when things get stuck.

The Saga pattern is your friend here—it provides a structured way to coordinate distributed transactions across multiple services.
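The core of the Saga pattern fits in a few lines: pair every step with its compensation action, and on failure unwind the completed steps in reverse. A minimal sketch (real sagas would also persist progress to a workflow table so recovery survives a crash):

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order. If any action fails,
    run the compensations for already-completed steps in reverse order,
    then re-raise the original error."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise
```

So if step one reserves inventory and step two's payment charge fails, the unwind releases the reservation, leaving no half-finished order behind.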

How do I debug complex backend logic?

💡
Debugging

Implement structured logging, distributed tracing, and explicit error boundaries with context preservation.

Debugging backend workflows is detective work. Give yourself good clues:

  • Structured logging: Log with consistent structure (JSON) and include correlation IDs that tie related operations together.
  • Trace everything: Use distributed tracing to follow requests across service boundaries. Tools like OpenTelemetry make this easier.
  • Context propagation: Pass request context (user ID, request ID, workflow ID) through your entire call stack.
  • Error boundaries: Catch errors at clear boundaries and add context about what was happening when they occurred.
  • State snapshots: For complex workflows, log state transitions so you can see exactly where things went wrong.

The goal is to be able to answer "what was the system trying to do when this failed?" without having to reproduce the bug.
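Structured logging with correlation IDs is simple to start with. Here's a minimal sketch of one JSON log line per event, with the IDs passed as keyword context (the helper name and fields are illustrative):

```python
import json
import sys
from datetime import datetime, timezone

def log_event(message, **context):
    """Emit one JSON log line. The context kwargs carry the correlation
    IDs (request_id, user_id, workflow_id) that tie related operations
    together across services. Returns the record for inspection."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "message": message,
        **context,
    }
    print(json.dumps(record), file=sys.stdout)
    return record
```

Once every line is JSON with a shared `request_id`, "show me everything this request did" becomes a single filter in your log search instead of an archaeology project.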

How do I prevent spaghetti code in my workflows?

💡
Preventing spaghetti code

Establish clear boundaries between workflow orchestration, business logic, and external integrations. Use dependency injection and explicit interfaces.

Spaghetti code happens when everything is connected to everything else:

  • Separation of concerns: Keep workflow orchestration separate from business logic. The orchestrator coordinates; individual services do the work.
  • Explicit dependencies: Use dependency injection rather than creating dependencies inside your workflow code.
  • Interface-based design: Define clear interfaces between components. This makes testing easier and reduces coupling.
  • Single responsibility: Each workflow step should have one clear purpose. If you can't explain what a step does in one sentence, it's probably doing too much.
  • Configuration over code: For business rules that change frequently, use configuration files or database settings rather than hard-coding them.

The test: if changing one business rule requires touching five different files, your code is too coupled.
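Dependency injection plus an explicit interface looks like this in Python, using `typing.Protocol` for the interface (the class and method names are illustrative):

```python
from typing import Protocol

class EmailSender(Protocol):
    """Interface the orchestrator depends on; any object with a matching
    send() method satisfies it, including a test fake."""
    def send(self, to: str, subject: str) -> None: ...

class ApprovalWorkflow:
    """Orchestrator: coordinates the steps, but delegates real work to
    injected dependencies instead of constructing them itself."""
    def __init__(self, emails: EmailSender):
        self.emails = emails

    def approve(self, document_id: str, approver_email: str) -> str:
        # The business decision lives here; delivery details live
        # entirely behind the EmailSender interface.
        self.emails.send(approver_email, f"Document {document_id} approved")
        return "approved"
```

Swapping the real sender for a fake in tests, or for a different provider in production, now touches one constructor call instead of five files.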

How do I change business rules without redeploying everything?

💡
Changing business rules

Externalize business rules into configuration, use feature flags for gradual rollouts, and implement rule engines for complex decision logic.

Business rules change more often than you think they will:

  • Configuration-driven rules: Store business rules in configuration files or database tables that can be updated without code changes.
  • Feature flags: Use feature flags to enable new business logic for specific users or percentages of traffic before rolling out fully.
  • Rule engines: For complex decision logic, consider a rule engine that lets business users modify rules through a UI.
  • Version your rules: Keep track of when rules changed and who changed them. Business logic changes should be auditable.
  • Gradual rollouts: Never change critical business logic for all users at once. Roll out changes gradually and monitor the impact.

The goal is to separate "what the system does" from "how the system does it" so business changes don't require engineering changes.
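Configuration-driven rules and percentage rollouts can both be small. In this sketch a dict stands in for a config table, and the flag check is a simple stable-per-user modulo (the rule names and thresholds are invented for illustration):

```python
# Rules loaded from a config store (a dict standing in for a database
# table), so thresholds change without a code deploy.
RULES = {
    "free_shipping_threshold": 50.0,
    "new_checkout_rollout_pct": 25,
}

def qualifies_for_free_shipping(order_total: float, rules=RULES) -> bool:
    """Business rule read from config, not hard-coded."""
    return order_total >= rules["free_shipping_threshold"]

def in_rollout(user_id: int, rules=RULES) -> bool:
    """Percentage-based feature flag: deterministic per user, so the
    same user always sees the same variant during a gradual rollout."""
    return user_id % 100 < rules["new_checkout_rollout_pct"]
```

Raising the rollout from 25% to 100% is then a config update you can watch in monitoring, not a redeploy, which is exactly the "gradual rollouts" bullet above.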

How does Xano make workflows simpler?

💡
Workflows in Xano

Xano provides visual workflow orchestration with built-in state management, automatic retry logic, and integrated background job processing.

Here's what I love about building workflows in Xano:

  • Visual workflow builder: You can see the entire workflow as a flowchart, making it easier to understand and debug complex business logic.
  • Built-in state management: Workflow state is automatically tracked and persisted. You don't need to build your own workflow tables.
  • Automatic retry handling: Failed steps can be configured to retry automatically with exponential backoff, without writing custom retry logic.
  • Integrated job queue: Background jobs are built into the platform. No need to set up Redis or manage separate queue infrastructure.
  • Real-time monitoring: You can see workflow execution in real-time and get alerts when things fail.
  • Database integration: Workflows can directly interact with your database without additional API layers.

The biggest advantage? You can focus on business logic instead of plumbing. The platform handles the infrastructure concerns so you can concentrate on what makes your product unique.

Ready to build workflows that actually work?

💡
Build workflows that work

The patterns in this guide aren't theoretical—they're battle-tested approaches that work in production systems handling real traffic and real money.

The key insight is that good workflow architecture is about making the right tradeoffs: synchronous vs asynchronous, immediate vs eventual consistency, flexibility vs reliability.

Start with simple patterns and evolve them as your needs become clearer. Don't over-engineer from day one, but don't ignore these patterns until you're drowning in technical debt either.

And if you want to skip the infrastructure complexity and focus on business logic, try Xano for free—it handles the plumbing so you can focus on what actually matters.