Data Structures through C++

by Naresh Babu

Hashing is a technique used to uniquely identify a specific object from a group of similar objects.

Hashing is implemented in two steps:

1. An element is converted into an integer by using a hash function. This integer can be used as the index at which the original element is stored in the hash table.
2. The element is stored in the hash table, where it can be quickly retrieved using the hashed key.

    hash = hashfunc(key)
    index = hash % array_size

In this method, the hash is independent of the array size; it is then reduced to an index (a number between 0 and array_size − 1) by using the modulo operator (%).

A hash function is any function that can be used to map a data set of arbitrary size to a data set of fixed size, which falls into the hash table. The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.

To achieve a good hashing mechanism, it is important to have a good hash function that meets the following basic requirements:

1. Easy to compute: It should be cheap to compute and must not become an algorithm in itself.
2. Uniform distribution: It should distribute keys uniformly across the hash table and should not result in clustering.
3. Fewer collisions: A collision occurs when two elements are mapped to the same hash value. Collisions should be kept to a minimum.

Note: Irrespective of how good a hash function is, collisions are bound to occur. Therefore, to maintain the performance of a hash table, it is important to manage collisions through collision resolution techniques.

In open addressing, all entry records are stored in the array itself rather than in linked lists. When a new entry has to be inserted, the hash index of the hashed value is computed and then the array is examined, starting with the hashed index.
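The hash-then-reduce step that produces this starting index can be sketched as follows. This is only an illustration: the notes do not specify a hash function, so the polynomial string hash used for hashfunc and the helper name indexFor are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Step 1: convert a key into an integer.
// Hypothetical hash function (not from the notes): a simple polynomial
// hash over the characters of the key. It does not depend on the table size.
std::size_t hashfunc(const std::string& key) {
    std::size_t hash = 0;
    for (unsigned char c : key) {
        hash = hash * 31 + c;  // unsigned overflow wraps around, which is well defined
    }
    return hash;
}

// Step 2: reduce the hash to an index between 0 and array_size - 1.
// indexFor is an illustrative helper name.
std::size_t indexFor(const std::string& key, std::size_t array_size) {
    assert(array_size > 0);  // the modulo below needs a non-empty table
    return hashfunc(key) % array_size;
}
```

Keeping the two steps separate means the same hashfunc can serve tables of different sizes; only the final modulo changes.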
If the slot at the hashed index is unoccupied, the entry record is inserted there; otherwise, insertion proceeds along some probe sequence until it finds an unoccupied slot. The probe sequence is the order in which slots are examined. Different probe sequences use different intervals between successive probes.

When searching for an entry, the array is scanned in the same sequence until either the target element is found or an unused slot is found. An unused slot indicates that there is no such key in the table.

The name "open addressing" refers to the fact that the location or address of the item is not determined solely by its hash value.

Linear probing is when the interval between successive probes is fixed (usually at 1). Let’s assume that the hashed index for a particular entry is index. The probing sequence for linear probing will be:

    index % hashTableSize
    (index + 1) % hashTableSize
    (index + 2) % hashTableSize
    (index + 3) % hashTableSize
    and so on…

Here the hash collision is resolved by open addressing with linear probing. Since CodeMonk and Hashing are hashed to the same index, 2, Hashing is stored at index 3, because the interval between successive probes is 1.

Implementation of hash table with linear probing

Assumptions:
• There are no more than 20 elements in the data set.
• The hash function will return an integer from 0 to 19.
• The data set must have unique elements.
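A minimal sketch of such a table under the assumptions listed above is given here. The class name LinearProbingTable, the constant kTableSize, the Slot layout, and the choice of key % 20 as the hash for non-negative integer keys are all illustrative assumptions, not a definitive implementation.

```cpp
#include <array>
#include <cassert>

constexpr int kTableSize = 20;  // assumption from the notes: at most 20 elements

// Illustrative open-addressing table with linear probing (interval 1).
class LinearProbingTable {
public:
    // Stores (key, value) and returns the slot used, or -1 if the table is full.
    // Assumes keys are unique and non-negative.
    int insert(int key, int value) {
        assert(key >= 0);
        const int index = key % kTableSize;  // assumed hash: always in 0..19
        for (int i = 0; i < kTableSize; ++i) {
            const int probe = (index + i) % kTableSize;  // i-th slot of the probe sequence
            if (!slots_[probe].used) {
                slots_[probe] = Slot{true, key, value};
                return probe;
            }
        }
        return -1;  // every slot was occupied
    }

    // Follows the same probe sequence as insert. Returns true and sets
    // value if key is found; returns false on reaching an unused slot.
    bool search(int key, int& value) const {
        assert(key >= 0);
        const int index = key % kTableSize;
        for (int i = 0; i < kTableSize; ++i) {
            const int probe = (index + i) % kTableSize;
            if (!slots_[probe].used) {
                return false;  // unused slot: the key is not in the table
            }
            if (slots_[probe].key == key) {
                value = slots_[probe].value;
                return true;
            }
        }
        return false;  // scanned the whole table without finding the key
    }

private:
    struct Slot {
        bool used;
        int key;
        int value;
    };
    std::array<Slot, kTableSize> slots_{};  // value-initialized: all slots unused
};
```

For example, inserting key 2 and then key 42 places them in slots 2 and 3, since both hash to 2 and the next probe is one slot further on. Deletion is left out of this sketch because removing an entry would break the "stop at an unused slot" rule in search unless removed slots are specially marked.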
Data set (the first number of each pair is the key):

• (1,20)
• (2,70)
• (42,80)
• (4,25)
• (12,44)
• (14,32)
• (17,11)
• (13,78)
• (37,98)

Sr.No.  Key  Hash           Array Index
1       1    1 % 20 = 1     1
2       2    2 % 20 = 2     2
3       42   42 % 20 = 2    2
4       4    4 % 20 = 4     4
5       12   12 % 20 = 12   12
6       14   14 % 20 = 14   14
7       17   17 % 20 = 17   17
8       13   13 % 20 = 13   13
9       37   37 % 20 = 17   17

Sr.No.  Key  Hash           Array Index  Array Index After Linear Probing
1       1    1 % 20 = 1     1            1
2       2    2 % 20 = 2     2            2
3       42   42 % 20 = 2    2            3
4       4    4 % 20 = 4     4            4
5       12   12 % 20 = 12   12           12
6       14   14 % 20 = 14   14           14
7       17   17 % 20 = 17   17           17
8       13   13 % 20 = 13   13           13
9       37   37 % 20 = 17   17           18
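The last column of the table can be checked with a short sketch that tracks only which slots are occupied. The function name placeKeys is hypothetical, and the sketch assumes non-negative keys, a probe interval of 1, and keys inserted in the order listed.

```cpp
#include <cassert>
#include <vector>

// For each key, in insertion order, returns the slot it ends up in after
// linear probing with interval 1, or -1 if the table is already full.
// placeKeys is an illustrative name, not taken from the notes.
std::vector<int> placeKeys(const std::vector<int>& keys, int tableSize) {
    assert(tableSize > 0);
    std::vector<bool> occupied(tableSize, false);
    std::vector<int> finalIndex;
    finalIndex.reserve(keys.size());
    for (int key : keys) {
        assert(key >= 0);
        const int home = key % tableSize;  // the "Array Index" column
        int slot = -1;
        for (int i = 0; i < tableSize; ++i) {
            const int probe = (home + i) % tableSize;
            if (!occupied[probe]) {
                occupied[probe] = true;
                slot = probe;
                break;
            }
        }
        finalIndex.push_back(slot);  // the "After Linear Probing" column
    }
    return finalIndex;
}
```

Called with the keys 1, 2, 42, 4, 12, 14, 17, 13, 37 and a table size of 20, it returns 1, 2, 3, 4, 12, 14, 17, 13, 18: only keys 42 and 37 move, each by one slot, because their home slots are already taken by 2 and 17.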
