HOWTO-Build a searchable dictionary from a .csv using Python

##Building a searchable dictionary from a .csv using Python
##by Bill Allen    10/20/2010

##Sample contents of test.csv; the first row is the header row in the file
##    Material,Material description,Dv,Matl Group
##    256213,IRON KIT STEWART & STEVENSON TRIPLEX,FL,Z-OTF989
##    256214,IRON KIT STEWART & STEVENSON QUINTAPLEX,FL,Z-OTF989

>>> import csv                        #import the csv module for handling of Comma Separated Value files
>>> g = {}                          #create an empty dictionary
>>> reader = csv.reader(open("test.csv", "r"))         #for the csv module, this reader object is much like a database cursor object
>>> reader_list = list(reader)                    #create a usable list from the reader object (the entire test.csv file contents)

#As read in from the file, each list entry is a list of 4 strings.
#Each could be thought of as a record consisting of:
#[part_number, description, product_line, product_type]
>>> print(reader_list[1])
['256213', 'IRON KIT STEWART & STEVENSON TRIPLEX', 'FL', 'Z-OTF989']
>>> print(reader_list[2])
['256214', 'IRON KIT STEWART & STEVENSON QUINTAPLEX', 'FL', 'Z-OTF989']
>>>
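
#As an aside, the record idea above can be sketched with csv.DictReader, which reads
#the header row and lets each field be looked up by its column name instead of by position.
#This is only a minimal sketch: SAMPLE_CSV is a stand-in for test.csv built from the
#sample rows shown at the top, and io.StringIO is used so no file is needed.

```python
import csv
import io

# Stand-in for test.csv (assumption): the header row plus the two sample rows.
SAMPLE_CSV = (
    "Material,Material description,Dv,Matl Group\n"
    "256213,IRON KIT STEWART & STEVENSON TRIPLEX,FL,Z-OTF989\n"
    "256214,IRON KIT STEWART & STEVENSON QUINTAPLEX,FL,Z-OTF989\n"
)

# DictReader consumes the header row and uses it as the field names,
# so the resulting list holds only the data rows.
records = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

first = records[0]
print(first["Material"])
print(first["Material description"])
print(first["Matl Group"])
```

#With this approach there is no header row to skip, since it never appears in the list.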

#Now we start building our dictionary.
#In this example, we want the part_number, description, and product_type.
#For each entry in the dictionary, the part_number is the key (think hash tables!).
#The description and the product_type will be stored under each key as a tuple so each part
#of the dictionary entry is still independently accessible.
>>> g[reader_list[1][0]] = (reader_list[1][1], reader_list[1][3])
>>> g[reader_list[2][0]] = (reader_list[2][1], reader_list[2][3])
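
#The two assignments above can also be written once for every row. The sketch below
#uses a dictionary comprehension and unpacks each record by name. It assumes every
#data row has exactly 4 fields, and it uses SAMPLE_CSV (a stand-in built from the
#sample rows, not the real test.csv) so that it runs on its own.

```python
import csv
import io

# Stand-in for test.csv (assumption): the header row plus the two sample rows.
SAMPLE_CSV = (
    "Material,Material description,Dv,Matl Group\n"
    "256213,IRON KIT STEWART & STEVENSON TRIPLEX,FL,Z-OTF989\n"
    "256214,IRON KIT STEWART & STEVENSON QUINTAPLEX,FL,Z-OTF989\n"
)

reader_list = list(csv.reader(io.StringIO(SAMPLE_CSV)))

# reader_list[1:] skips the header row. Each remaining record is unpacked into
# its 4 fields; product_line is read but deliberately not stored.
g = {
    part_number: (description, product_type)
    for part_number, description, product_line, product_type in reader_list[1:]
}

print(g["256213"])
```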

#Here is our resulting dictionary.
>>> print(g)
{'256213': ('IRON KIT STEWART & STEVENSON TRIPLEX', 'Z-OTF989'), '256214': ('IRON KIT STEWART & STEVENSON QUINTAPLEX', 'Z-OTF989')}

#Print out individual dictionary entries by part_number.
>>> print(g['256213'])
('IRON KIT STEWART & STEVENSON TRIPLEX', 'Z-OTF989')
>>> print(g['256214'])
('IRON KIT STEWART & STEVENSON QUINTAPLEX', 'Z-OTF989')
>>>

#Print out description and product_type information independently by part_number.
#All the data that was stored in the dictionary is fully accessible!
>>> print(g['256214'][0])
IRON KIT STEWART & STEVENSON QUINTAPLEX
>>> print(g['256214'][1])
Z-OTF989
>>>
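
#If the [0] and [1] indexes are hard to remember, one possible variation is to store
#a named tuple instead of a plain tuple. This is a sketch only: the name Part and its
#field names are made up for illustration and are not part of the original example.

```python
from collections import namedtuple

# Hypothetical record type; a namedtuple is still a tuple, so indexing keeps working.
Part = namedtuple("Part", ["description", "product_type"])

g = {"256214": Part("IRON KIT STEWART & STEVENSON QUINTAPLEX", "Z-OTF989")}

entry = g["256214"]
print(entry.description)   # access by field name
print(entry[1])            # access by position, as with a plain tuple

# Unpacking works as well, for plain tuples and named tuples alike.
description, product_type = entry
```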

#Now we can search the dictionary for part_numbers.
#Just some simple Boolean tests here to demonstrate.
>>> '256214' in g
True
>>> '256213' in g
True
>>> '256216' in g
False
>>>
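
#Beyond the membership tests above, two other kinds of search may be useful. The sketch
#below shows dict.get() for looking up a part_number that might be missing, and a small
#helper that scans the descriptions for a piece of text. The helper find_by_description
#is hypothetical, written just for this illustration. Note that it checks every entry
#one by one, so it does not get the speed of a hash lookup by key.

```python
# The two-entry dictionary built earlier, repeated so the sketch runs on its own.
g = {
    "256213": ("IRON KIT STEWART & STEVENSON TRIPLEX", "Z-OTF989"),
    "256214": ("IRON KIT STEWART & STEVENSON QUINTAPLEX", "Z-OTF989"),
}


def find_by_description(parts, text):
    """Return the part_numbers whose description contains text (case-insensitive)."""
    needle = text.lower()
    return [
        part_number
        for part_number, (description, _product_type) in parts.items()
        if needle in description.lower()
    ]


# get() returns None (or a default of your choosing) instead of raising KeyError.
print(g.get("256216"))
print(g.get("256216", ("unknown part", "")))

# Linear scan over the stored descriptions.
print(find_by_description(g, "quintaplex"))
```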

#Now for a sample program.
#We will build the dictionary and then print out the first 9 entries.

import csv
parts_dict = {}
reader = csv.reader(open("test.csv", "r"))
reader_list = list(reader)

#Build the dictionary.
for i in range(1, len(reader_list)):          #start at 1 to skip the header row
    parts_dict[reader_list[i][0]] = (reader_list[i][1], reader_list[i][3])

#Print out the first 9 entries with a little bit of formatting thrown in.
for j in range(1, 10):
    print("key>", reader_list[j][0], " - data>", parts_dict[reader_list[j][0]], sep='')

#Output:
key>256213 - data>('IRON KIT STEWART & STEVENSON TRIPLEX', 'Z-OTF989')
key>256214 - data>('IRON KIT STEWART & STEVENSON QUINTAPLEX', 'Z-OTF989')
key>256224 - data>('IRON KIT STEWART & STEVENSON QUINT PUMP', 'Z-OTF989')
key>3100695 - data>('+A RING 2.00 LS20 BRS', 'Z-OTF989')
key>3100720 - data>('LF020 LS20 UNIRAW BENT', 'Z-OTF989')
key>3100784 - data>('JP060 OLPO SIR F .625 BALL', 'Z-OTF989')
key>3100789 - data>('JP060 OLPO SIR M .625 BALL', 'Z-OTF989')
key>3100791 - data>('JP100 OLPO SIR M .625 BALL', 'Z-OTF989')
key>3100849 - data>('JP120 OLPO SIR F .625 BALL', 'Z-OTF989')
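
#For reuse, the sample program could be wrapped in a function. The sketch below is one
#possible way to do it, not the only one: load_parts is a made-up name, the file is
#opened in a with block so it is closed afterwards, and newline="" is passed to open()
#as the csv module documentation recommends. So that it runs on its own, it first
#writes a small stand-in test.csv (the sample rows from the top) into a temporary folder.

```python
import csv
import os
import tempfile


def load_parts(csv_path):
    """Map part_number -> (description, product_type), skipping the header row."""
    parts = {}
    with open(csv_path, "r", newline="") as csv_file:
        reader = csv.reader(csv_file)
        next(reader, None)                  # skip the header row, if there is one
        for row in reader:
            parts[row[0]] = (row[1], row[3])
    return parts


# Stand-in data (assumption): the header row plus the two sample rows.
SAMPLE_CSV = (
    "Material,Material description,Dv,Matl Group\n"
    "256213,IRON KIT STEWART & STEVENSON TRIPLEX,FL,Z-OTF989\n"
    "256214,IRON KIT STEWART & STEVENSON QUINTAPLEX,FL,Z-OTF989\n"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "test.csv")
    with open(sample_path, "w", newline="") as sample_file:
        sample_file.write(SAMPLE_CSV)
    parts_dict = load_parts(sample_path)

for part_number, data in parts_dict.items():
    print("key>", part_number, " - data>", data, sep="")
```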
