The simplest approach to building a rate limiter is the "fixed window" implementation, in which we cap the number of requests allowed in a fixed window of time. For example, if the window size is 1 hour, we can "fix" it at the top of the hour: 12:00-12:59, 1:00-1:59, and so forth.
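The procedure to implement a fixed window rate limiter is fairly simple. For each request, we:
1. Identify the corresponding window from the request's arrival time.
2. Increment and retrieve the request count for that window.
3. Reject the request if the count exceeds the quota.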
The fixed window recipe ignores the cost of the request (all requests are created equal), and this particular implementation uses a single quota for all users. This simplicity minimizes CPU and I/O utilization, but it comes with some limitations. Spikes can occur near the edges of the window, since API users might program their requests with a "use it or lose it" approach.
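To make the recipe and its edge-of-window spike concrete, here is a minimal in-memory sketch in plain Java. It is not the Redis-backed implementation this article builds; the class name `FixedWindowRateLimiter` and the method `tryAcquire` are hypothetical, and the caller passes the arrival time as epoch seconds so the behavior is easy to reproduce.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative in-memory fixed window limiter with one quota shared by all callers. */
final class FixedWindowRateLimiter {
    private final long maxRequests;
    private final long windowSeconds;
    // Request count per window index. Old windows are never evicted in this sketch.
    private final ConcurrentHashMap<Long, AtomicLong> counts = new ConcurrentHashMap<>();

    FixedWindowRateLimiter(long maxRequests, long windowSeconds) {
        this.maxRequests = maxRequests;
        this.windowSeconds = windowSeconds;
    }

    /** Returns true if a request arriving at the given epoch second is within the quota. */
    boolean tryAcquire(long epochSecond) {
        // Windows are aligned to the epoch, so a 3600 s window starts at the top of the hour (UTC).
        long window = Math.floorDiv(epochSecond, windowSeconds);
        // Every request costs exactly one unit, whatever it does.
        long count = counts.computeIfAbsent(window, w -> new AtomicLong()).incrementAndGet();
        return count <= maxRequests;
    }

    public static void main(String[] args) {
        FixedWindowRateLimiter limiter = new FixedWindowRateLimiter(3, 3600);
        long lastSecondOfHour = 13 * 3600 - 1; // 12:59:59 UTC on 1970-01-01
        int accepted = 0;
        for (int i = 0; i < 3; i++) if (limiter.tryAcquire(lastSecondOfHour)) accepted++;
        for (int i = 0; i < 3; i++) if (limiter.tryAcquire(lastSecondOfHour + 1)) accepted++;
        System.out.println("accepted within two seconds: " + accepted); // prints 6
    }
}
```

With a quota of 3 per hour, a client that spends its quota at 12:59:59 and again at 1:00:00 gets 6 requests through in two seconds, which is the spike described above.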
One way to minimize the spikiness in this scheme is to have multiple time windows of different granularity. For example, you can rate limit at the hour and minute levels, say, allowing a maximum of 2,000 requests per hour and a maximum of 33 requests per minute.
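The multi-window idea can be sketched the same way. The example below is an assumption-laden illustration, not part of the Redis recipe: `MultiWindowRateLimiter` is a hypothetical class that admits a request only when every configured window still has room, and it checks all windows before counting so that a rejected request does not consume quota anywhere.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative limiter that enforces several fixed windows of different sizes at once. */
final class MultiWindowRateLimiter {
    private final long[] windowSeconds;
    private final long[] quotas;
    // Counter per "windowSize:windowIndex" key. Old keys are never evicted in this sketch.
    private final Map<String, Long> counts = new HashMap<>();

    MultiWindowRateLimiter(long[] windowSeconds, long[] quotas) {
        if (windowSeconds.length != quotas.length) {
            throw new IllegalArgumentException("one quota per window is required");
        }
        this.windowSeconds = windowSeconds.clone();
        this.quotas = quotas.clone();
    }

    /** Returns true only if the request fits in every window. */
    synchronized boolean tryAcquire(long epochSecond) {
        // First pass: reject if any window is already full.
        for (int i = 0; i < quotas.length; i++) {
            if (counts.getOrDefault(key(i, epochSecond), 0L) >= quotas[i]) return false;
        }
        // Second pass: count the accepted request in every window.
        for (int i = 0; i < quotas.length; i++) {
            counts.merge(key(i, epochSecond), 1L, Long::sum);
        }
        return true;
    }

    private String key(int i, long epochSecond) {
        return windowSeconds[i] + ":" + Math.floorDiv(epochSecond, windowSeconds[i]);
    }

    public static void main(String[] args) {
        // Limits from the text: 2,000 requests per hour and 33 requests per minute.
        MultiWindowRateLimiter limiter =
                new MultiWindowRateLimiter(new long[] {3600, 60}, new long[] {2000, 33});
        int accepted = 0;
        for (int i = 0; i < 100; i++) if (limiter.tryAcquire(0)) accepted++;
        System.out.println("accepted in the first minute: " + accepted); // prints 33
    }
}
```

Note that with these particular numbers, 60 × 33 = 1,980, so the per-minute cap is the one that binds; the hourly window would only come into play if the per-minute quota were raised.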
This basic recipe, which uses Redis Strings, a one-minute window, and a quota of 20 requests, is outlined on the Redis Blog. I'll summarize it here before we jump into our Spring Reactive implementation:
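1. GET [user-api-key]:[current minute number], such as GET "u123:45"
2. If the result from step 1 is less than 20 (or the key is not found), go to step 4; otherwise, continue to step 3.
3. Reject the request.
4. Atomically (using MULTI and EXEC), increment the key and set the expiry to 59 seconds into the future:
   MULTI
   INCR [user-api-key]:[current minute number]
   EXPIRE [user-api-key]:[current minute number] 59
   EXEC
5. Fulfill the request.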
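Before moving to a real Redis client, it may help to trace the recipe's control flow in plain Java. The sketch below replaces Redis with an in-memory map, so it only illustrates the logic; the class `MinuteWindowRecipe`, its `allow` and `key` methods, and the map-based store are all hypothetical stand-ins, and "current minute number" is assumed to mean the minute of the hour (0-59), as the "u123:45" example suggests.

```java
import java.util.HashMap;
import java.util.Map;

/** Traces the GET / INCR / EXPIRE recipe against an in-memory stand-in for Redis. */
final class MinuteWindowRecipe {
    static final long QUOTA = 20;       // requests per minute, as in the recipe
    static final long TTL_SECONDS = 59; // the EXPIRE argument from the recipe

    // Stand-in for Redis Strings: key -> {count, expiry epoch second}. Not a real client.
    private final Map<String, long[]> store = new HashMap<>();

    /** Builds "[user-api-key]:[current minute number]", for example "u123:45". */
    static String key(String apiKey, long epochSecond) {
        long minuteOfHour = Math.floorMod(Math.floorDiv(epochSecond, 60L), 60L);
        return apiKey + ":" + minuteOfHour;
    }

    /** Returns true if the request is fulfilled, false if it is rejected. */
    synchronized boolean allow(String apiKey, long epochSecond) {
        String k = key(apiKey, epochSecond);

        // GET: a missing or expired key reads as "not found".
        long[] entry = store.get(k);
        if (entry != null && entry[1] <= epochSecond) {
            store.remove(k);
            entry = null;
        }

        // Reject once the stored count has reached the quota.
        if (entry != null && entry[0] >= QUOTA) return false;

        // INCR + EXPIRE, which the recipe wraps in MULTI/EXEC.
        long count = (entry == null ? 0 : entry[0]) + 1;
        store.put(k, new long[] {count, epochSecond + TTL_SECONDS});

        // Fulfill the request.
        return true;
    }

    public static void main(String[] args) {
        MinuteWindowRecipe recipe = new MinuteWindowRecipe();
        long t = 45 * 60; // minute 45 of the first hour after the epoch
        int fulfilled = 0;
        for (int i = 0; i < 25; i++) if (recipe.allow("u123", t)) fulfilled++;
        System.out.println(key("u123", t) + " fulfilled " + fulfilled + " of 25"); // u123:45 fulfilled 20 of 25
    }
}
```

The `synchronized` keyword stands in for atomicity here. Against a real Redis server the GET and the MULTI/EXEC block are separate round trips, so only the increment and the expiry are applied as one unit. The 59-second expiry ensures the key is gone before the same minute number comes around again an hour later.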
OK, now that we know the basic recipe, let's implement it in Spring.