Python Dictionaries, a native Hash Table data type

All programming languages make data structures available for data storage. These can be very simple and small, such as the registers and the byte, word, or multi-word memory variables found in various assembly languages. They can also be potentially large and complex, such as the pointer-linked lists or multidimensional arrays available in many high-level languages. Python provides many types of data structures as well, the best known being its list type. While the list is definitely the workhorse of the language, Python also provides something even more flexible and powerful: dictionaries.

In other programming languages, a dictionary is often known as a “hash table”. A hash table differs from an array or list in that it allows data to be accessed by keys that the programmer defines, rather than by integer indices only. In some languages, hash tables have to be cobbled together by the programmer, which often requires a lot of hand coding and handling of pointer references. Python makes dictionaries available as a built-in data type: one only has to create the dictionary variable and begin assigning data to it. That’s it! No additional code is required to create or maintain the dictionary, aside from whatever operations are desired to add, delete, or otherwise change its contents. For those, Python provides the del statement and dictionary methods such as, but not limited to, keys, values, items, and pop.
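As a minimal sketch of that "create it and start assigning" workflow, the snippet below builds a small dictionary from scratch. The name inventory and its keys are made up purely for illustration and are not part of the part-number example used in the rest of this post.

```python
# Hypothetical example: the names and values here are illustrative only.
inventory = {}                         # create an empty dictionary
inventory['bolts'] = 120               # add an entry by assigning to a key
inventory['nuts'] = 75
inventory['nuts'] = 80                 # assigning to an existing key overwrites its value

bolt_count = inventory['bolts']        # look a value up by its key
has_washers = 'washers' in inventory   # 'in' tests whether a key is present

print(bolt_count, has_washers, len(inventory))
```

No sizing, hashing, or pointer handling appears anywhere in the code; the dictionary type takes care of all of it.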

Here is an example.  It is a nested dictionary, or dictionary of dictionaries, that contains a variety of data.

>>> my_dictionary = {'3257313': {'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}, '3231023': {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, '3262812': {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}}

First we create a dictionary called my_dictionary.   Notice that dictionaries use {} as their enclosing brackets, not the [] used by lists.   However, dictionaries DO use [] for later reference and access, as will be seen below.   For example, the first entry in our dictionary shown here is '3257313': {'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}.   The components of this entry are a part number, '3257313', which is a KEY of the outer dictionary, and another dictionary with the keys 'SAP_state', 'REV', and 'SPECS'.  'SAP_state' gives access to a string, 'REV' to a string, and 'SPECS' to a list that contains strings.

Access to data in the dictionary is had by one or more keys, as shown.

>>> my_dictionary['3257313']
{'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}
In this case, we accessed a main key, which yields all the dictionary contents for that key.

>>> my_dictionary['3257313']['REV']
'O'
Here we access via a main key and the subkey 'REV'.

>>> my_dictionary['3257313']['SAP_state']
'True'
Here we access via a main key and the subkey 'SAP_state'.

>>> my_dictionary['3257313']['SPECS']
['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']
Here we access via a main key and the subkey 'SPECS'.

>>> my_dictionary['3257313']['SPECS'][3]
'M25070'
Here we go three levels deep into the dictionary and access via the main key '3257313', the subkey 'SPECS', and then the index [3] of the list found in 'SPECS'.
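One thing the lookups above do not show is what happens when a key is absent. The sketch below reuses one entry from the sample data and a made-up part number, '0000000', that is assumed not to exist. It contrasts [] access, which raises KeyError, with the dictionary's get method, which returns a default instead.

```python
# One entry from the sample data; '0000000' below is a made-up, missing key.
my_dictionary = {
    '3257313': {'SAP_state': 'True', 'REV': 'O',
                'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']},
}

# Chained indexing: main key -> subkey -> list index.
fourth_spec = my_dictionary['3257313']['SPECS'][3]

# Indexing with [] on a key that is not present raises KeyError ...
try:
    my_dictionary['0000000']['REV']
    missing_raised = False
except KeyError:
    missing_raised = True

# ... while get() returns a default (None unless another default is supplied).
rev_or_default = my_dictionary.get('0000000', {}).get('REV', 'unknown')

print(fourth_spec, missing_raised, rev_or_default)
```

Which style to use is a judgment call: [] fails loudly on bad keys, while get suits data where a missing entry is expected.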

Here are examples of what is returned when the keys, values, and items methods of the dictionary are used.

>>> my_dictionary.keys()
dict_keys(['3257313', '3231023', '3262812'])

>>> my_dictionary.values()
dict_values([{'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}, {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}])

>>> my_dictionary.items()
dict_items([('3257313', {'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}), ('3231023', {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}), ('3262812', {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']})])
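These methods are most often used in a for loop rather than printed directly. The following is a small sketch using the same sample data. The variable names part_number, details, and revisions are my own choices for illustration.

```python
my_dictionary = {
    '3257313': {'SAP_state': 'True', 'REV': 'O',
                'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']},
    '3231023': {'SAP_state': 'True', 'REV': 'C',
                'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']},
    '3262812': {'SAP_state': 'False', 'REV': 'B',
                'SPECS': ['Q00021', 'Q00024']},
}

# items() yields (key, value) pairs, which unpack neatly in a for loop.
for part_number, details in my_dictionary.items():
    print(part_number, details['REV'], len(details['SPECS']))

# The same pairs can feed a dictionary comprehension: part number -> revision.
revisions = {part_number: details['REV']
             for part_number, details in my_dictionary.items()}

# Iterating over keys() is enough when only the part numbers are needed.
part_numbers = sorted(my_dictionary.keys())
```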

Another dictionary operation that should be mentioned is del.  It is a statement rather than a function or method, and it is used to delete entries from the dictionary.  For instance, if the dictionary looks like this:
{'3257313': {'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}, '3231023': {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, '3262812': {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}}

Then del my_dictionary['3257313'] would delete the complete entry in the dictionary referenced by the key '3257313', giving the following result:
>>> del my_dictionary['3257313']
>>> my_dictionary
{'3231023': {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, '3262812': {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}}

The same entry deletion could also be achieved by using the pop method.   The difference is that pop returns the removed value associated with the key, so it can be assigned to another variable.
>>> my_dictionary
{'3257313': {'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}, '3231023': {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, '3262812': {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}}
>>> removed_item = my_dictionary.pop('3257313')
>>> removed_item
{'SAP_state': 'True', 'REV': 'O', 'SPECS': ['Q00021', 'Q00020', 'Q00025', 'M25070', 'Q50151', 'Q50153']}
>>> my_dictionary
{'3231023': {'SAP_state': 'True', 'REV': 'C', 'SPECS': ['Q00022', 'Q00026', 'Q00027', 'E50002']}, '3262812': {'SAP_state': 'False', 'REV': 'B', 'SPECS': ['Q00021', 'Q00024']}}
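A related difference shows up when the key is not there. As a rough sketch, using a simplified stand-in dictionary called parts and a made-up key '9999999' (both are assumptions for illustration), del raises KeyError on a missing key, while pop can be given a default to return instead.

```python
# Simplified, hypothetical stand-in for the sample data: part number -> revision.
parts = {'3257313': 'O', '3231023': 'C'}

removed = parts.pop('3257313')          # removes the entry and returns its value
fallback = parts.pop('9999999', None)   # key is missing: returns the default, no error

# del on a missing key raises KeyError, just like pop without a default would.
try:
    del parts['9999999']
    del_raised = False
except KeyError:
    del_raised = True

print(removed, fallback, del_raised, parts)
```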

Dictionaries are Python's highly flexible and easy-to-use hash tables.  Their advantage, as a built-in data type, is direct access to data via named references without having to write any search code.   Just provide a valid key and Python does the rest.
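To make the "no searching" point concrete, here is a small comparison sketch. The helper find_rev_in_list and the flattened part-number/revision pairs are my own simplification of the sample data, not something from the examples above.

```python
# The same data held two ways: a list of (part_number, revision) pairs
# and a dictionary keyed by part number.
parts_list = [('3257313', 'O'), ('3231023', 'C'), ('3262812', 'B')]
parts_dict = dict(parts_list)


def find_rev_in_list(pairs, wanted):
    """Scan the list pair by pair until the wanted part number turns up."""
    for part_number, rev in pairs:
        if part_number == wanted:
            return rev
    return None  # not found


rev_from_list = find_rev_in_list(parts_list, '3262812')  # hand-written search
rev_from_dict = parts_dict['3262812']                    # direct access by key

print(rev_from_list, rev_from_dict)
```

Both lookups give the same answer, but the dictionary version needs no loop and does not slow down in the same way as the collection grows.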

Hang 10!

As readers of this blog will know, I am a real Python programming enthusiast.  It is gratifying to discover that I am apparently riding quite a large wave of growth in the popularity and usage of Python.   This article from Dr. Dobb's Journal discusses this and other interesting programming trends revealed by the TIOBE Programming Community Index:   “The Rise and Fall of Languages in 2010”.