Concurrency

Rate Limiting With A Semaphore

Use permits to cap concurrent access and throttle request rate.

A semaphore holds a fixed number of permits. A thread must acquire a permit before proceeding and release it when done. If no permit is free, the thread waits. This makes a semaphore a natural tool for limiting how many operations run concurrently.

A semaphore initialized with one permit acts much like a mutex, except that it has no owner: any thread may release the permit. Initialized with N permits, it allows up to N concurrent holders, which is exactly what you want when capping concurrency against a fragile downstream service: never more than N requests in flight at once.
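A minimal sketch of the N-permit cap in Java, assuming `java.util.concurrent.Semaphore`. The class `ConcurrencyCap`, the method `peakInFlight`, and the 10 ms sleep standing in for the downstream call are hypothetical names and values chosen for illustration. It runs several caller threads through one semaphore and records the highest number seen in flight at once.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrencyCap {
    /**
     * Runs `callers` threads through a semaphore with `limit` permits and
     * returns the highest number of threads observed in flight at once.
     */
    static int peakInFlight(int limit, int callers) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        Thread[] workers = new Thread[callers];

        for (int i = 0; i < callers; i++) {
            workers[i] = new Thread(() -> {
                permits.acquireUninterruptibly(); // wait for a free permit
                try {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(10); // stand-in for the downstream call (assumption)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    permits.release(); // hand the permit to the next waiter
                }
            });
            workers[i].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // The peak can never exceed the permit count, however many callers there are.
        System.out.println("peak in flight: " + peakInFlight(3, 12));
    }
}
```

Because the in-flight counter is incremented only after acquiring and decremented before releasing, the recorded peak cannot exceed the number of permits.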

For rate limiting by requests per second rather than by concurrency, combine a semaphore with replenishment. Start with a bucket of permits and refill it on a timer, topping up to at most the bucket's capacity. Callers acquire one permit per request and do not release it; only the refill returns permits. When the bucket empties, callers wait until the next refill. This is the token bucket idea expressed with semaphore permits.
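The refill scheme might be sketched as follows in Java, assuming `java.util.concurrent.Semaphore` for the bucket and a `ScheduledExecutorService` for the timer. `RefillingLimiter`, its constructor parameters, and the choice to top the bucket back up to full capacity on every tick are assumptions for illustration. Other refill policies, such as adding one permit at a time, are equally valid.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class RefillingLimiter implements AutoCloseable {
    private final int capacity;
    private final Semaphore tokens;
    private final ScheduledExecutorService timer;

    /** Allows roughly `capacity` requests per `periodMillis` window. */
    RefillingLimiter(int capacity, long periodMillis) {
        this.capacity = capacity;
        this.tokens = new Semaphore(capacity);
        // Daemon thread so the timer never keeps the JVM alive on its own.
        this.timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "limiter-refill");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(this::refill, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    /** Tops the bucket back up to capacity, never beyond it. */
    synchronized void refill() {
        int missing = capacity - tokens.availablePermits();
        if (missing > 0) {
            tokens.release(missing);
        }
    }

    /** Blocks until a token is available. Callers do not release it afterwards. */
    void acquire() throws InterruptedException {
        tokens.acquire();
    }

    /** Takes a token if one is available right now, without blocking. */
    boolean tryAcquire() {
        return tokens.tryAcquire();
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

For example, `new RefillingLimiter(100, 1000)` would admit about 100 requests per second under this policy. Unlike the concurrency cap, a request here never returns its permit; the timer is the only source of new permits.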

  • tryAcquire: Take a permit without blocking, returning failure if none is free, so callers can shed load.
  • acquire: Block until a permit frees up.
  • release: Return a permit for others.
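To illustrate load shedding with the non-blocking operation, here is a small Java sketch built on `Semaphore.tryAcquire()`. The `LoadShedder` class and its `tryRun` method are hypothetical helpers, and the sketch assumes the supplied work returns a non-null value.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class LoadShedder {
    private final Semaphore permits;

    LoadShedder(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /**
     * Runs the work if a permit is free right now. Otherwise returns an empty
     * Optional immediately so the caller can reject the request instead of waiting.
     * Assumes the work returns a non-null value.
     */
    <T> Optional<T> tryRun(Supplier<T> work) {
        if (!permits.tryAcquire()) {
            return Optional.empty(); // no permit free: shed this request
        }
        try {
            return Optional.of(work.get());
        } finally {
            permits.release(); // return the permit for others
        }
    }
}
```

A caller might translate an empty result into a "try again later" response. Blocking with `acquire` instead would queue callers, which trades rejected requests for higher latency under load.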

A subtle bug is forgetting to release on an error path. Each failure then leaks a permit, and once none remain every caller blocks forever. Always release in a finally-style block.
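The difference between the leaky and the safe pattern can be shown side by side. This is a sketch in Java using `java.util.concurrent.Semaphore`; `ReleaseOnError`, `runLeaky`, and `runSafe` are hypothetical names for illustration.

```java
import java.util.concurrent.Semaphore;

class ReleaseOnError {
    /** Leaky: if the task throws, release() is skipped and the permit is lost. */
    static void runLeaky(Semaphore permits, Runnable task) {
        permits.acquireUninterruptibly();
        task.run();
        permits.release();
    }

    /** Safe: the finally block returns the permit on success and on failure. */
    static void runSafe(Semaphore permits, Runnable task) {
        permits.acquireUninterruptibly();
        try {
            task.run();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        Semaphore permits = new Semaphore(2);
        Runnable failing = () -> { throw new IllegalStateException("downstream error"); };

        try { runLeaky(permits, failing); } catch (IllegalStateException ignored) { }
        System.out.println("after leaky failure: " + permits.availablePermits()); // 1

        try { runSafe(permits, failing); } catch (IllegalStateException ignored) { }
        System.out.println("after safe failure: " + permits.availablePermits()); // still 1
    }
}
```

Note that the acquire sits outside the `try` block. If acquiring were to fail, the `finally` block would otherwise release a permit that was never taken.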

Key idea

A semaphore caps concurrency by handing out a fixed pool of permits, and with timed refills it enforces a request rate.

Check yourself

1. A semaphore initialized with N permits guarantees what?

2. What bug slowly breaks a semaphore-based limiter?