Version vector

A version vector is a mechanism for tracking changes to data in a distributed system, where multiple agents might update the data at different times. The version vector allows the participants to determine if one update preceded another (happened-before), followed it, or if the two updates happened concurrently (and therefore might conflict with each other). In this way, version vectors enable causality tracking among data replicas and are a basic mechanism for optimistic replication. In mathematical terms, the version vector generates a preorder that tracks the events that precede, and may therefore influence, later updates.

Version vectors maintain state identical to that in a vector clock, but the update rules differ slightly; in this example, replicas can either experience local updates (e.g., the user editing a file on the local node), or can synchronize with another replica:

  • Initially all vector counters are zero.
  • Each time a replica experiences a local update event, it increments its own counter in the vector by one.
  • Each time two replicas a and b synchronize, they both set the elements in their copy of the vector to the maximum of the element across both counters: . After synchronization, the two replicas have identical version vectors.

Pairs of replicas, a, b, can be compared by inspecting their version vectors and determined to be either: identical (), concurrent (), or ordered ( or ). The ordered relation is defined as: Vector if and only if every element of is less than or equal to its corresponding element in , and at least one of the elements is strictly less than. If neither or , but the vectors are not identical, then the two vectors must be concurrent.

Version vectors[1] or variants are used to track updates in many distributed file systems, such as Coda (file system) and Ficus, and are the main data structure behind optimistic replication.[2]

Other mechanisms

  • Hash Histories [3] avoid the use of counters by keeping a set of hashes of each updated version and comparing those sets by set inclusion. However this mechanism can only give probabilistic guarantees.
  • Concise Version Vectors [4] allow significant space savings when handling multiple replicated items, such as in directory structures in filesystems.
  • Version Stamps [5] allow tracking of a variable number of replicas and do not resort to counters. This mechanism can depict scalability problems in some settings, but can be replaced by Interval Tree Clocks.
  • Interval Tree Clocks[6] generalize version vectors and vector clocks and allows dynamic numbers of replicas/processes.
  • Bounded Version Vectors [7] allow a bounded implementation, with bounded size counters, as long as replica pairs can be atomically synchronized.
  • Dotted Version Vectors [8] address scalability with a small set of servers mediating replica access by a large number of concurrent clients.

References

  1. ^ Douglas Parker, Gerald Popek, Gerard Rudisin, Allen Stoughton, Bruce Walker, Evelyn Walton, Johanna Chow, David Edwards, Stephen Kiser, and Charles Kline. Detection of mutual inconsistency in distributed systems. Transactions on Software Engineering. 1983
  2. ^ David Ratner, Peter Reiher, and Gerald Popek. Dynamic version vector maintenance. Technical Report CSD-970022, Department of Computer Science, University of California, Los Angeles, 1997
  3. ^ ByungHoon Kang, Robert Wilensky, and John Kubiatowicz. The Hash History Approach for Reconciling Mutual Inconsistency. ICDCS, pp. 670-677, IEEE Computer Society, 2003.
  4. ^ Dahlia Malkhi and Doug Terry. Concise Version Vectors in WinFS.Distributed Computing, Vol. 20, 2007.
  5. ^ Paulo Almeida, Carlos Baquero and Victor Fonte. Version Stamps: Decentralized Version Vectors. ICDCS, pp. 544-551, 2002.
  6. ^ Paulo Almeida, Carlos Baquero and Victor Fonte. Interval Tree Clocks. OPODIS, Lecture Notes in Computer Science, Vol. 5401, pp. 259-274, Springer, 2008.
  7. ^ José Almeida, Paulo Almeida and Carlos Baquero. Bounded Version Vectors. DISC: International Symposium on Distributed Computing, LNCS, 2004.
  8. ^ Nuno Preguiça, Carlos Baquero, Paulo Almeida, Victor Fonte and Ricardo Gonçalves. Brief Announcement: Efficient Causality Tracking in Distributed Storage Systems With Dotted Version Vectors. ACM PODC, pp. 335-336, 2012.

Content Disclaimer

Informasi ini disarikan dari Wikipedia dan disajikan kembali untuk tujuan edukasi. Konten tersedia di bawah lisensi CC BY-SA 3.0. Kami tidak bertanggung jawab atas ketidakakuratan data yang bersumber dari kontribusi publik tersebut.

  1. The information displayed on this website is sourced in part or in whole from Wikipedia and has been adapted for the purpose of restating it. We strive to provide accurate and relevant information, however:
  2. There is no guarantee of absolute accuracy. Wikipedia is an open, collaborative project that can be edited by anyone, so information is subject to change.
  3. It is not intended to constitute professional advice. The content displayed is for informational and educational purposes only. For important decisions (e.g., medical, legal, or financial), please consult a professional.
  4. Content copyright. Wikipedia is licensed under the Creative Commons Attribution-ShareAlike License (CC BY-SA). This means that content may be reused with appropriate attribution and shared under a similar license.
  5. Responsible use. Any risk arising from the use of information from this website is entirely the responsibility of the user.