# 6.824 Lecture 8 - Zookeeper

## Linearizability

• Continued from 6.824 Lecture 7 - Fault Tolerance with Raft - Part 2
• If all you have are writes, it’s hard to say if the system really did anything
• This history is linearizable because a total order exists that matches the conditions:
• If two events are non-overlapping, the earlier one must come first in the order
• Every read sees the most recent write in the order
• This history is not linearizable because C1’s experience implies that 2 must’ve been written before 1, and C2’s experience implies the opposite:
• A linearizable system cannot serve stale reads:
• Here the upward arrow marks the client re-sending a request because the original request timed out:
• Assume the server actually processed the original request but the response was lost
• The retried request receives a cached response (so 3, not 4)
• You can either view retries as low-level machinery (in which case this is legal) or view them as actual repeated requests (in which case this violates linearizability)

## Zookeeper

• Why ZK?
• ZK is an API for a general-purpose coordination service
• Does ZK’s performance scale linearly with the number of servers in the ZK cluster?
• The entire point of adding servers is to improve fault tolerance, so the answer would ordinarily be no
• But ZK optimizes for reads, so, the answer is actually yes for read-heavy workloads
• Adding servers may make some things slower (increased network/leader contention)
• ZK + ZAB is roughly similar to “generic application” + Raft:
• No sharding / partitioning
• Writes in the per-client ordering are ordered the same way in the global/total order of linearizable writes
• Reads observe the state of the “filesystem” at various points on the “log”.
• Subsequent reads within a session may go forward within the log but never backward.
• The zxid field is used to maintain this guarantee even when the client switches replicas

## FAQ

Q: How does linearizability differ from serializability?

A: The usual definition of serializability is much like linearizability, but without the requirement that operations respect real-time ordering. Have a look at this explanation: http://www.bailis.org/blog/linearizability-versus-serializability/