If we have a table organization and a search technique which retrieve the key in a single access ,it would be very efficient. To do so, the position of the key in the table should not depend upon the other keys but the location should be calculated on the basis of the key itself .Such an organization and search technique is called hashing .
In hashing the address or location of an identifier X is obtained by using some function f(X) which gives address of X in a table.
HASHING TERMINOLOGY
Hash function :- A function that transforms a key X into a table index is called a hash function.
Hash address:- The address of X computed by the hash function is called the hash address or home address of X.
Hash Table :- The memory available to maintain the symbol table is sequential . This is referred to as hash table .
Bucket :- Each hash table is partitioned into b buckets ht[0]......ht[b-1].
Each bucket is capable of holding 's' records . Thus , a bucket consists of 's' slots. When s=1 , each bucket can hold 1 record.
The function f(X) maps and identifier X into one of the 'b' buckets i.e. from 0 to b-1.
Synonyms :- Usually , the total number of possible values for identifier X is much larger than the hash table size. Therefore the hash function f(X) must map several identifiers into the same bucket .
Two identifiers I1 and I2 are synonyms if f(I1) = f(I2)
Collision :- When two non identical identifiers are mapped into the same bucket , a collision is said to occur , i.e. f(I1)= f(I2). Hence all synonyms occupy the same bucket.
Overflow :- An overflow is said to occur when an identifier gets mapped onto a full bucket.
When s=1 ,i.e. ,a bucket contains only one record ,collision and overflow occur simultaneously . Such a situation is called a Hash clash.
Load factor :- If n is the total number of identifiers in the table and t is the table size , the load factor is : lf = n/t
It represents the fraction of the table that is occupied .
HASHING FUNCTION
A hashing function f transforms an identifier X into a bucket address in the hash table .The desirable properties of such a function are as follows.
i. It should be easily computable .
ii. It should minimize the number of collisions .
iii. The hash function should compute the address,which depends on all or most of the characters in the identifier .
iv. It should yield uniform bucket addresses for random inputs . Such a function is called a uniform hash function.
Several uniform hash functions are in use some of them are
1 Mid Square
2 Division
3 Folding
4 Digit Analysis
No comments:
Post a Comment