Concurrency Control in Distributed Database

Concurrency control schemes deal with the handling of data accessed by concurrent transactions. Centralized database systems use various locking protocols to handle concurrent transactions. The schemes used in distributed databases are largely the same; the main difference is the way the lock manager deals with replicated data. The following topics discuss several schemes that are applicable to an environment where data can be replicated at several sites. We shall assume the existence of the shared and exclusive lock modes.

Locking protocols
1. Single lock manager approach
2. Distributed lock manager approach
   a) Primary Copy protocol
   b) Majority protocol
   c) Biased protocol
   d) Quorum Consensus protocol

Some assumptions:
• Our distributed database system consists of n sites (servers/computers in different locations).
• Data are replicated at two or more sites.

Single Lock Manager Approach

In this approach, the distributed database system, which consists of several sites, maintains a single lock manager at a chosen site, as shown in Figure 1. The figure shows distributed sites S1, S2, …, S6, with site S3 chosen as the lock-manager site.

Figure 1 - Single Lock Manager Approach
The technique works as follows. When a transaction requests locks on some data items, the request must be forwarded to the chosen lock-manager site. This is done by the Transaction manager of the site where the request is initiated. The lock manager at the chosen site decides whether the lock can be granted immediately, using the usual procedure. [That is, if a lock is already held on the requested data item by some other transaction in an incompatible mode, the lock cannot be granted. If the data item is free, or is locked in a compatible mode, the lock manager grants the lock.] If the lock request is granted, the transaction can read from any site where a replica is available. On successful completion of the transaction, the Transaction manager of the initiating site releases the lock by sending an unlock request to the lock-manager site.

Example: Transaction handling in Single Lock-Manager Approach

Figure 2 - Steps of Transaction handling in Single Lock Manager approach

Let us assume that Transaction T1 is initiated at site S5, as shown in Figure 2 (Step 1). Also, assume that the requested data item D is replicated at sites S1, S2, and S6. The steps are numbered in Figure 2. Following the discussion above, the technique works as follows:
• Step 2 - The Transaction manager of the initiator site S5 sends the request to lock data item D to the lock-manager site S3. The lock manager at site S3 checks the availability of data item D.
• Step 3 - If the requested item is not locked by any other transaction, the lock-manager site responds with a lock grant message to the initiator site S5.
• Step 4 - The initiator site S5 can then use data item D from any of the sites S1, S2, and S6 to complete Transaction T1.
• Step 5 - After successful completion of Transaction T1, the Transaction manager of S5 releases the lock by sending an unlock request to the lock-manager site S3.
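Steps 2 to 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not a definitive implementation: the names SingleLockManager and run_read_transaction are hypothetical, messages are recorded as simple tuples instead of being sent over a network, and a request that cannot be granted is simply refused here, whereas a real lock manager would normally make it wait.

```python
from collections import defaultdict

SHARED, EXCLUSIVE = "shared", "exclusive"


class SingleLockManager:
    """Lock table kept at the one chosen lock-manager site (S3 in the example)."""

    def __init__(self, site):
        self.site = site
        self.holders = defaultdict(dict)  # data item -> {transaction: mode}

    def request(self, txn, item, mode):
        """Grant the lock if no other transaction holds an incompatible one."""
        others = [m for t, m in self.holders[item].items() if t != txn]
        # Only shared/shared is compatible; a free item is always grantable.
        compatible = not others or (
            mode == SHARED and all(m == SHARED for m in others)
        )
        if compatible:
            self.holders[item][txn] = mode
        return compatible

    def unlock(self, txn, item):
        self.holders[item].pop(txn, None)


def run_read_transaction(manager, txn, origin, item, replica_sites):
    """Follow Steps 2-5 and record the lock-related messages exchanged."""
    messages = [(origin, manager.site, "lock request")]        # Step 2
    if not manager.request(txn, item, SHARED):
        return messages, None  # sketch only: no waiting queue is modelled
    messages.append((manager.site, origin, "lock grant"))      # Step 3
    read_from = replica_sites[0]                               # Step 4: any replica will do
    manager.unlock(txn, item)
    messages.append((origin, manager.site, "unlock request"))  # Step 5
    return messages, read_from


manager = SingleLockManager("S3")
messages, read_from = run_read_transaction(
    manager, "T1", "S5", "D", ["S1", "S2", "S6"]
)
for sender, receiver, kind in messages:
    print(f"{sender} -> {receiver}: {kind}")
print("T1 read D from", read_from)
```

The sketch counts only the lock traffic: two messages to obtain the lock and one to release it. The read itself goes to a replica site and never touches S3.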
Advantages:
• Locking can be handled easily. We need two messages for a lock (one for the request, the other for the grant) and one message for an unlock request. This method is also simple, as it resembles the centralized database.
• Deadlocks can be handled easily, because a single lock manager is responsible for handling all the lock requests.

Disadvantages:
• The lock-manager site becomes a bottleneck, as it is the only site that handles all the lock requests generated at all the sites in the system.
• The system is highly vulnerable to a single point of failure. If the lock-manager site fails, we lose concurrency control.

Distributed Lock-Manager Approach

In this approach, the function of the lock manager is distributed over several sites. [Every DBMS server (site) has all the components, such as Transaction Manager, Lock-Manager, Storage Manager, etc.]

In the Distributed Lock-Manager approach, every site owns the data that is stored locally.
• This is true for a table that is fragmented into n fragments and stored at n sites. In this case, every fragment is distinct from every other fragment and is completely owned by the site at which it is stored. For those fragments, the local Lock-Manager is responsible for handling lock and unlock requests generated by the same site or by other sites.
• If the data stored at a site is replicated at other sites, then a site cannot own the data completely. In such a case, we cannot handle a lock request for a data item stored at a site the way we handle fragmented data. Doing so would lead to inconsistency, as there are multiple copies stored at several sites. This case can be handled using several protocols that are specifically designed for handling lock requests on replicated data.
The protocols are:
   o Primary Copy Protocol
   o Majority Based Protocol
   o Biased Protocol
   o Quorum Consensus Protocol

Advantages:
• Implementation is simple for data that are fragmented. They can be handled as in the Single Lock-Manager approach.
• For replicated data, the work can again be distributed over several sites using one of the protocols listed above.
• The lock-manager site is not a bottleneck, as the work of the lock manager is distributed over several sites.
Disadvantages:
• Handling of deadlocks is difficult, because a transaction T1 that has acquired a lock on a data item Q at site S1 may be waiting for a lock on another data item R at site S2. Whether this wait is genuine or a deadlock has occurred is not easily identifiable.

Primary Copy Protocol:

Assume that we have a data item Q which is replicated at several sites, and we choose exactly one of the replicas of Q as the Primary Copy. The site that stores the Primary Copy is designated as the Primary Site. Any lock request generated for data item Q at any site must be routed to the Primary site. The Primary site's lock manager is responsible for handling the lock requests, even though other sites hold the same data item and have local lock managers of their own. We can choose different sites as lock-manager sites for different data items.

How does Primary Copy protocol work?

Figure 1 shows the Primary Copy protocol implementation. In the figure,
• Q, R, and S are different data items that are replicated.
• Q is replicated in sites S1, S2, S3 and S5 (represented in blue colored text). Site S3 is designated as Primary site for Q (represented in purple colored text).
• R is replicated in sites S1, S2, S3, and S4. Site S6 is designated as Primary site for R.
• S is replicated at sites S1, S2, S4, S5, and S6. Site S1 is designated as Primary site for S.
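The routing rule of the Primary Copy protocol can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the protocol's definitive implementation: the Site class, the request_lock function and the PRIMARY_SITE table are hypothetical names, only the primary sites of Q and S from the figure description are used, and a request that cannot be granted is refused rather than queued.

```python
SHARED, EXCLUSIVE = "shared", "exclusive"

# Primary sites taken from the figure description (data items Q and S only).
PRIMARY_SITE = {"Q": "S3", "S": "S1"}


class Site:
    """A site with its own local lock manager (hypothetical, simplified)."""

    def __init__(self, name):
        self.name = name
        self.lock_table = {}  # data item -> (mode, set of transactions)

    def grant(self, txn, item, mode):
        entry = self.lock_table.get(item)
        if entry is None:                      # item is free
            self.lock_table[item] = (mode, {txn})
            return True
        held_mode, txns = entry
        if mode == SHARED and held_mode == SHARED:
            txns.add(txn)                      # shared locks are compatible
            return True
        return False                           # incompatible: not granted

    def release(self, txn, item):
        entry = self.lock_table.get(item)
        if entry is None:
            return
        entry[1].discard(txn)
        if not entry[1]:
            del self.lock_table[item]


def request_lock(sites, origin, txn, item, mode):
    """Route the request from the origin site to the item's primary site."""
    primary = sites[PRIMARY_SITE[item]]
    granted = primary.grant(txn, item, mode)
    print(f"{origin} -> {primary.name}: lock {item} for {txn} ({mode}), granted={granted}")
    return primary.name, granted


sites = {name: Site(name) for name in ["S1", "S2", "S3", "S4", "S5", "S6"]}
request_lock(sites, "S5", "T1", "Q", EXCLUSIVE)  # handled at S3
request_lock(sites, "S1", "T2", "Q", SHARED)     # also handled at S3, not locally at S1
request_lock(sites, "S2", "T3", "S", SHARED)     # handled at S1
```

Although S1 and S5 both store a replica of Q, neither of their local lock tables is consulted for Q. Only the lock table at S3 records the lock, which is what keeps the replicas from being locked inconsistently.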