Considerations About the WAIT command in Redis Enterprise
Last updated 18 Apr 2024
Purpose
This document explains how the behavior of WAIT differs between Redis OSS and Redis Enterprise (RE), and describes the known issues and edge cases.
Scope
This applies to applications that use a Redis client library to issue the WAIT command against Redis OSS or Redis Enterprise deployments.
Details
The WAIT command works the same way in both OSS and RE. The number of replicas it waits for before responding to the client is determined by the numreplicas argument that the application passes. In RE this value should always be 1, because an RE database supports only a single HA shard. Note, however, that issuing WAIT does not by itself guarantee that the write was replicated to the HA shard; the application must check the value that WAIT returns.
If WAIT returns 1, the HA shard has acknowledged the write. If WAIT times out or returns 0, the outcome is unknown: the write may or may not have reached the HA shard, so no guarantee can be made. The application must then decide whether to ignore the result, repeat the operation, or reconcile what happened.
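As a minimal sketch of the check described above, the following Python function writes a key and then interprets the WAIT reply. It assumes a redis-py style client object that exposes `set(key, value)` and `wait(num_replicas, timeout)` with the timeout in milliseconds. The names `set_and_wait` and `WriteOutcome` are hypothetical and used only for illustration. It also assumes both commands travel over the same connection, since WAIT only covers writes sent by the current client.

```python
from enum import Enum


class WriteOutcome(Enum):
    """Hypothetical result type for this sketch."""
    REPLICATED = "replicated"  # the HA shard acknowledged the write
    UNKNOWN = "unknown"        # WAIT returned 0; no guarantee either way


def set_and_wait(client, key, value, timeout_ms=100):
    """Write a key, then ask how many replicas acknowledged the write.

    `client` is assumed to behave like a redis-py client whose SET and WAIT
    calls are sent on the same connection.
    """
    client.set(key, value)
    # WAIT <numreplicas> <timeout-ms>; in RE numreplicas is 1.
    acked = client.wait(1, timeout_ms)
    if acked >= 1:
        return WriteOutcome.REPLICATED
    # Timed out or replied 0: the write may or may not have been replicated.
    return WriteOutcome.UNKNOWN
```

A caller would branch on the returned value, for example by logging and reconciling later when the outcome is `UNKNOWN`. The default of 100 ms is an arbitrary placeholder, not a recommendation.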
WAIT with only one replica has serious implications for write availability, especially around failovers, maintenance, and the time needed to fully sync replicas. There are expected periods during which WAIT will time out and report that it could not replicate to even 1 replica. In Redis Enterprise, the cases where WAIT responds with 0 are greatly expanded and include "regular" maintenance activities. For that reason, several questions should be addressed at design time to deal with the behavior of WAIT during maintenance activities:
- How long should the WAIT timeout be?
- What should an application do when WAIT cannot be fulfilled?
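One possible way to frame both questions is a short per-call timeout combined with an overall time budget, after which the application falls back to its own policy. The sketch below re-issues WAIT (not the write) until one replica acknowledges or the budget is spent. The function name `wait_with_budget` and all default values are hypothetical, and `client` is again assumed to be a redis-py style object whose calls share one connection with the original write.

```python
import time


def wait_with_budget(client, per_call_timeout_ms=100, total_budget_ms=1000,
                     pause_s=0.05, sleep=time.sleep, clock=time.monotonic):
    """Re-issue WAIT until one replica acknowledges or the budget runs out.

    Returns (acknowledged, attempts). `sleep` and `clock` are injectable so
    the policy can be exercised without real delays.
    """
    if per_call_timeout_ms <= 0:
        # A WAIT timeout of 0 means "block forever", which defeats the budget.
        raise ValueError("per_call_timeout_ms must be positive")

    deadline = clock() + total_budget_ms / 1000.0
    attempts = 0
    while True:
        attempts += 1
        if client.wait(1, per_call_timeout_ms) >= 1:
            return True, attempts
        if clock() >= deadline:
            # Budget exhausted: the caller decides whether to ignore,
            # repeat the operation, or reconcile.
            return False, attempts
        sleep(pause_s)
```

When the function returns `False`, the application still has to apply its own answer to the second question. The budget should be chosen with expected failover and maintenance windows in mind, which depend on the deployment.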
References
The description of the WAIT feature can be read in the article "Use the WAIT command for strong consistency".