OpenAI interview question

Design a distributed rate limiter for an LLM inference API
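
One possible starting point, offered as a minimal sketch and not a reference answer, is a per-key token bucket where each request is charged a cost (for example, an estimate of the LLM tokens it will consume). Everything named here is hypothetical: the class `TokenBucketLimiter`, its parameters `capacity` and `refill_per_sec`, and the in-memory `store` dict, which only stands in for a shared store. In a genuinely distributed deployment the refill-and-deduct step would have to run atomically in that shared store, which this single-process sketch does not attempt.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Bucket:
    tokens: float      # budget currently available for this key
    updated_at: float  # clock reading at the last refill


class TokenBucketLimiter:
    """Per-key token bucket (hypothetical sketch).

    `store` is an in-process dict standing in for a shared store;
    it is an assumption made for illustration only.
    """

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.store: Dict[str, Bucket] = {}

    def allow(self, key: str, cost: float = 1.0) -> bool:
        """Return True and deduct `cost` if the key has enough budget."""
        now = self.clock()
        bucket = self.store.get(key)
        if bucket is None:
            # New keys start with a full bucket.
            bucket = Bucket(tokens=self.capacity, updated_at=now)
            self.store[key] = bucket
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - bucket.updated_at)
        bucket.tokens = min(self.capacity,
                            bucket.tokens + elapsed * self.refill_per_sec)
        bucket.updated_at = now
        if cost > bucket.tokens:
            return False  # over budget: reject without deducting
        bucket.tokens -= cost
        return True
```

Charging a variable `cost` per request, instead of counting requests, is one way to reflect that inference calls differ widely in size. The injected `clock` is there so the behavior can be checked deterministically.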