Async job scheduling
@Facebook (Private)
[Blog] Asynchronous computing: https://engineering.fb.com/2020/08/17/production-engineering/async/
- Problem with simple priority queues:
- large use cases dominate
- bad jobs get stuck
- uneven utilization between peaks & valleys
- Building to scale
- introduce delay tolerance
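A minimal sketch of what delay tolerance could look like in a scheduler. The notes do not say how the real system implements it, so everything here is an assumption: the `Job` and `DelayTolerantQueue` names are hypothetical, and earliest-deadline-first ordering is one plausible choice. Each job declares how long it can wait; during a peak only jobs whose deadline has arrived run, and the rest are held back to fill a valley.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    deadline: float                     # submit time + delay tolerance (seconds)
    name: str = field(compare=False)    # not used for ordering


class DelayTolerantQueue:
    """Hypothetical queue: earliest deadline first, deferrable during peaks."""

    def __init__(self):
        self._heap = []

    def submit(self, name, now, delay_tolerance):
        heapq.heappush(self._heap, Job(now + delay_tolerance, name))

    def take(self, now, capacity, peak):
        """Pop up to `capacity` job names, earliest deadline first.

        During a peak, jobs whose deadline is still in the future stay queued.
        """
        taken = []
        while self._heap and len(taken) < capacity:
            if peak and self._heap[0].deadline > now:
                break
            taken.append(heapq.heappop(self._heap).name)
        return taken


q = DelayTolerantQueue()
q.submit("notification", now=0, delay_tolerance=0)      # must run right away
q.submit("digest", now=0, delay_tolerance=3600)         # can wait an hour
q.submit("report", now=0, delay_tolerance=86400)        # can wait a day

at_peak = q.take(now=0, capacity=10, peak=True)
off_peak = q.take(now=100, capacity=10, peak=False)
```

Here `at_peak` holds only the urgent job and `off_peak` picks up the two deferred ones, which is the smoothing effect the notes describe for peaks and valleys.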
- Capacity optimization:
- classifying use cases:
- daily traffic: predictable
- major events: semi-predictable
- incident response: short and spiky, unpredictable
- Time shifting:
- Predictive compute: predict which data may be needed, then precompute and cache it
- Deferred compute
- batching:
- reduce # of requests to other components
- potentially high cache reuse and code warm-up
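A small sketch of the batching idea, under assumptions: `batch_jobs` is a hypothetical helper, and jobs are modelled as `(target, payload)` pairs where `target` stands for the downstream component. Grouping by target means one request per batch instead of one per job, and consecutive work hits the same code path and cache entries.

```python
from collections import defaultdict


def batch_jobs(jobs, max_batch_size):
    """Group (target, payload) pairs by target, then split into bounded batches."""
    by_target = defaultdict(list)
    for target, payload in jobs:
        by_target[target].append(payload)

    batches = []
    for target, payloads in by_target.items():
        # Chunk each target's payloads so that no batch exceeds max_batch_size.
        for i in range(0, len(payloads), max_batch_size):
            batches.append((target, payloads[i:i + max_batch_size]))
    return batches


jobs = [("db", "a"), ("cache", "x"), ("db", "b"), ("db", "c")]
batches = batch_jobs(jobs, max_batch_size=2)
# Four individual requests become three batched ones:
# [("db", ["a", "b"]), ("db", ["c"]), ("cache", ["x"])]
```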
- Capacity policy: quota and rate limiting
- CPU instruction utilization and memory limits: when exceeded, throttle and send an alert
- rate limit on intake
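One common way to rate limit intake is a token bucket. The notes do not say which algorithm the real system uses, so this is only an illustrative sketch; `TokenBucket`, `rate`, and `burst` are hypothetical names. Time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Hypothetical per-use-case intake limiter (token bucket)."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens added per second
        self.burst = burst          # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start full
        self.last = 0.0             # time of the previous call

    def allow(self, now):
        """Return True if a job submitted at time `now` may be enqueued."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=1, burst=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
# The first two submissions use the burst, the third is rejected,
# and one token has refilled by t=1.0.
```

A rejected submission is where the throttle-and-alert step from the notes would hook in.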