Sam Heather Sam Heather - 1 month ago 13
Python Question

Checking my understanding of Hash Tables

I'm a first year computer science student. I am trying to teach myself hash tables for an interview. After reading a few pieces about them, I thought the best way of seeing if I'd got it would be to implement my own hash table in Python. So that's what I've done. Please could somebody look at it and let me know what you think? Have I correctly understood what it is I am meant to be doing with a hash table?

storage_array = []

def show_menu():
menu_option = int(raw_input("Enter 1 to store data, Enter 2 to retrieve data: "))
if (menu_option == 1):
store_data()
elif (menu_option == 2):
retrieve_data()

def store_data():
key_for_data = raw_input("Please enter the key for the data you want to store: ")
actual_data = raw_input("Please enter the data you want to store: ")
ascii_count = generate_hash(key_for_data)
print ascii_count
storage_array[ascii_count] = actual_data
print "The data:'", actual_data, "'has been stored at index:'", ascii_count, "' which is the ascii count of:'", key_for_data, "'"
show_menu()

def retrieve_data():
key_for_data = raw_input("Enter the key for the data you want to retrieve: ")
ascii_count = generate_hash(key_for_data)
if (storage_array[ascii_count] == None):
print "No data was stored under this key"
else:
print "The data you requested for key:'", key_for_data, "'with ASCII count:'", ascii_count, "' is:'", storage_array[ascii_count], "'"
show_menu()

def generate_hash(input):
character_list = list(input)
ascii_count = 0
for character_index in range(0,len(character_list)):
ascii_count += ord(character_list[character_index])
return ascii_count

def initiate_list():
for repeat_index in range(0,1000):
storage_array.append(None)
print "List initiated with index's to 1000"

initiate_list()
show_menu()


##Or is it meant to hash the key like a dictionary and then store
##the value for that key in the hashed value in the hash table?

Answer

It looks like you have the general concept correct. The hash table takes an arbitrary key and turns it into an index into an array via some special method.

A couple points:

First, and most important: your generate_hash function can return an index that is invalid if the sum of the ord()s of the key is greater than 1000.

To fix this, have generate_hash return ascii_count % 1000. If you don't know what % means, go read up on the modulus operator (don't worry, it's not too complicated).

Second, also important: think about what happens if you use the following two keys: ab and ba. What you're doing isn't necessarily wrong, but it's important to understand the behavior of your hash table when different keys collide.

Third, less important: your for loops don't have to work like they do in C/C++. You could change

for character_index in range(0,len(character_list)):
        ascii_count += ord(character_list[character_index])

to

for character in character_list:
        ascii_count += ord(character)

Python for loops are pretty fancy :)

All in all, it looks great!