Skip to main content

Spend Update Queue Full Warnings

Overview​

The "Spend update queue is full" warning occurs in high-volume LiteLLM proxy deployments when the internal spend tracking queue reaches capacity. This is a protective mechanism to prevent memory issues during traffic spikes.

Warning Message​

WARNING:litellm.proxy.db.db_transaction_queue.spend_update_queue:Spend update queue is full. Aggregating entries to prevent memory issues.

Root Cause​

The spend update queue has a default maximum size of 10,000 entries (MAX_SIZE_IN_MEMORY_QUEUE=10000). When this limit is reached:

  1. New spend tracking entries are aggregated instead of queued individually
  2. This prevents memory exhaustion but may slightly delay spend updates
  3. The warning indicates your deployment is processing requests faster than the database can handle spend updates

Solutions​

1. Increase Queue Size​

Set the MAX_SIZE_IN_MEMORY_QUEUE environment variable to a higher value:

MAX_SIZE_IN_MEMORY_QUEUE=50000

Considerations:

  • Higher values use more memory
  • Monitor system memory usage when increasing
  • Recommended for deployments with consistent high traffic

2. Database Performance Optimization​

Optimize your database configuration and connection handling for better write performance.

Database-specific optimizations:

  • PostgreSQL: Increase max_connections, tune shared_buffers
  • MySQL: Optimize innodb_buffer_pool_size
  • SQLite: Consider switching to PostgreSQL/MySQL for high volume

3. Horizontal Scaling​

Deploy multiple proxy instances with load balancing. This distributes the spend tracking load across multiple queues, reducing the pressure on any single instance's spend update queue.

Best Practices​

  1. Set appropriate queue size based on your traffic patterns
  2. Monitor database performance regularly
  3. Use connection pooling for database connections
  4. Consider read replicas for spend analytics queries
  5. Implement alerting on queue utilization > 80%

Impact Assessment​

  • No CPU impact: If CPU usage remains normal during warnings, the current queue size is likely adequate
  • CPU spikes with warnings: Increase MAX_SIZE_IN_MEMORY_QUEUE to reduce aggregation overhead
  • Frequent warnings: Consider database optimization or horizontal scaling
# Environment variables
MAX_SIZE_IN_MEMORY_QUEUE: 10000 # Default queue size