Dictionaries#

Dictionaries in general#

Dictionaries, also known as hash maps, are a data structure for storing key-value pairs. A list can be thought of as a dictionary whose keys are the indices 0, 1, …, N.
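
The analogy between lists and dictionaries can be illustrated with a minimal sketch. The names fruits_list and fruits_dict are made up for this illustration.

```python
# A list and a dictionary holding the same data.
# The names fruits_list and fruits_dict are illustrative only.
fruits_list = ["apple", "banana", "cherry"]
fruits_dict = {0: "apple", 1: "banana", 2: "cherry"}

# Both are accessed with the same bracket syntax.
print(fruits_list[1])
# --> banana
print(fruits_dict[1])
# --> banana
```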

Keys#

In contrast to lists, the keys of a dictionary are not restricted to numerical indices such as 0, 1, …, N. A key can be of any data type, data structure or class, as long as it is immutable (strictly speaking, as long as it is hashable, see below). Immutable means that the contents of an object cannot be changed after it has been created; any change requires creating a new object. Strings in Python are immutable, which can be confirmed like this:

word = "dictionary"
print(word[0])
# --> "d"

word[0] = "D"
# --> TypeError: 'str' object does not support item assignment

Trying to alter the string raises an error stating that the data type str (string) does not support item assignment. The reason is that strings are implemented as immutable objects.

In contrast, lists are mutable: they can be modified in place, for example by adding elements with the append() method.

my_list = [1, 2, 3]
print(my_list)
# --> [1, 2, 3]

my_list.append(4)
print(my_list)
# --> [1, 2, 3, 4]

This way, the contents of my_list were changed without assigning a new object. The built-in immutable types implement the __hash__() method, which is what the built-in function hash() calls. If a class does not define this method explicitly, it usually inherits it. The method computes an integer, the hash value, from the object. Objects that compare equal have the same hash value, but a hash value is not guaranteed to be unique. Strings are hashable, while lists are not.

word = "dictionary"
print(word.__hash__())
# --> 4122803001462702718
# (string hashes differ between program runs by default)

my_list = [1, 2, 3]
print(my_list.__hash__())
# --> TypeError: 'NoneType' object is not callable
# list.__hash__ is set to None, so lists are not hashable

If something like a list is still needed as a key, there is the data type tuple, which behaves like an immutable list. A tuple can be used as a key as long as all of its elements are hashable. More information is provided in the chapter on tuples.

Furthermore, keys must be unique: a dictionary holds at most one entry per key, and two keys that compare equal count as the same key. The keys do not have to be of the same data type, as long as each of them is immutable and therefore hashable.
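
The rules for keys can be tried out with a short sketch. The dictionaries mixed and duplicate and their contents are invented for this illustration.

```python
# Hypothetical example data: keys of different hashable types in one dictionary.
mixed = {
    1: "an integer key",
    "one": "a string key",
    (1, 0): "a tuple key",
}
print(mixed[(1, 0)])
# --> a tuple key

# Writing the same key twice keeps only the last value.
duplicate = {"a": 1, "a": 2}
print(duplicate)
# --> {'a': 2}

# A list is mutable and is therefore rejected as a key.
try:
    mixed[[1, 0]] = "a list key"
except TypeError:
    print("Lists cannot be used as keys")
# --> Lists cannot be used as keys
```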

Values#

In contrast to keys, values have no restrictions. A value can be of an arbitrary data type, mutable or immutable, and the same value may appear under several keys.
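
As a small sketch of this freedom, the following dictionary mixes several kinds of values. The dictionary student and its contents are invented for this illustration.

```python
# Illustrative data: values may be mutable, repeated, or of mixed types.
student = {
    "name": "Ada",
    "grades": [1.0, 1.3],   # a mutable list as a value
    "nickname": "Ada",      # the same value under a second key
    "graduated": None,
}

# A mutable value can be changed in place through the dictionary.
student["grades"].append(2.0)
print(student["grades"])
# --> [1.0, 1.3, 2.0]
```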

Creation of a dictionary#

Python uses curly braces and colons to create a dictionary. The curly braces define the object itself, and each key-value pair is written as Key: Value. Multiple key-value pairs are separated by commas, just like the elements of a list. The following example shows that both keys and values can have different data types, here integers and strings.

digits = {
    1: "One",
    "Two": 2,
    3: "Three"
}
print(digits)
# --> {1: 'One', 'Two': 2, 3: 'Three'}

Dictionaries themselves are mutable. This means that dictionaries can be used as values in other dictionaries, but not as keys. Nested dictionaries are useful for representing nested objects; interested readers may look up the term JSON.
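
A nested dictionary and its relation to JSON can be sketched as follows. The dictionary person and its contents are invented for this illustration; json is a module of the Python standard library.

```python
import json

# Hypothetical nested data: a dictionary used as a value in another dictionary.
person = {
    "name": "Ada",
    "address": {
        "city": "London",
        "zip": "N1",
    },
}

# Chained brackets reach into the inner dictionary.
print(person["address"]["city"])
# --> London

# json.dumps converts the nested dictionary into a JSON string.
print(json.dumps(person))
# --> {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
```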

After the creation of a dictionary, additional entries can still be added. The syntax for that is Dictionary_name[Key] = Value.

digits = {
    1: "One"
}

digits[2] = "Two"
print(digits)
# --> {1: 'One', 2: 'Two'}

The values of existing keys can also be changed afterwards.

digits = {
    1: "One",
    2: "Two"
}
digits[2] = "Dos"
print(digits)
# --> {1: 'One', 2: 'Dos'}

Reading and overwriting values#

There are two main ways to read the value for a given key. The first is the index syntax seen above, which works as with lists, except that the key takes the place of the index.

translation = {
    "One": "Uno",
    "Two": "Dos"
}

print(translation["One"])
# --> "Uno"

Alternatively, one can use the method Dictionary_name.get(Key), which is designed for this purpose.

translation = {
    "One": "Uno",
    "Two": "Dos"
}

print(translation.get("One"))
# --> "Uno"

Which of those two should you use?

Accessing a value using the index syntax works just like accessing an element in a list by index. If the key is not present in the dictionary, a KeyError is raised.

In contrast, the method Dictionary_name.get(Key) returns None if the key does not exist in the dictionary. Optionally, a second argument defines what is returned instead of None in that case.

digits = {
    1: "One",
    2: "Two"
}

print(digits[3])
# --> KeyError: 3

print(digits.get(3))
# --> None

# If the key 3 does not exist,
# then "Does not exist" should be returned.
print(digits.get(3, "Does not exist"))
# --> "Does not exist"

However, values cannot be modified using the Dictionary_name.get(Key) method, because the result of a method call cannot be assigned to.

translation = {
    "One": "Uno",
    "Two": "Dos"
}

translation.get("One") = "Eins"
# --> SyntaxError: cannot assign to function call here.

Generally, it is recommended to use Dictionary_name.get(Key) if one just wants to retrieve a value, and Dictionary_name[Key] if one also wants to modify the value. One can use both together without a problem.

inventory = {
    "Bananas": 5,
    "Pears": 3
}

inventory["Pears"] = inventory.get("Pears", 0) + 3
print(inventory.get("Pears"))
# --> 6

# Lemons do not currently exist in the dictionary.
# With inventory.get("Lemons", 0)
# the value 0 is returned for lemons, 
# since they currently do not exist in the dictionary.
# Afterwards, 5 is added to their value.
inventory["Lemons"] = inventory.get("Lemons", 0) + 5
print(inventory.get("Lemons"))
# --> 5

Outputting all existing keys and values in a dictionary#

To iterate over all keys, values or key-value pairs of a dictionary, the three methods Dictionary_name.keys(), Dictionary_name.values() and Dictionary_name.items() can be used. Their return values are so-called view objects: they behave similarly to lists but are not real lists. They can be converted into real lists with list().

keys = list(Dictionary_name.keys())

Dictionary_name.keys() returns all keys. Dictionary_name.values() returns all values. Dictionary_name.items() returns all key-value pairs as a “list” of tuples (a tuple is an immutable list, written with round brackets instead of square brackets).

translation = {
    "One": "Uno",
    "Two": "Dos"
}

print(list(translation.keys()))
# --> ['One', 'Two']
print(list(translation.values()))
# --> ['Uno', 'Dos']
key_value_pairs = list(translation.items())
print(key_value_pairs)
# --> [('One', 'Uno'), ('Two', 'Dos')]
for key_value_pair in key_value_pairs:
    # Each pair is a tuple: index 0 is the key, index 1 is the value.
    key = key_value_pair[0]
    value = key_value_pair[1]
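
Converting to a list first is not required for a loop. As a minimal sketch, the view returned by items() can be iterated over directly, and tuple unpacking assigns key and value in one step.

```python
translation = {
    "One": "Uno",
    "Two": "Dos"
}

# Tuple unpacking assigns both parts of each pair in one step.
for key, value in translation.items():
    print(key, "->", value)
# --> One -> Uno
# --> Two -> Dos

# Iterating over the dictionary itself yields its keys.
for key in translation:
    print(key)
# --> One
# --> Two
```

Dictionaries keep their entries in insertion order (guaranteed since Python 3.7), which is why the output order matches the order of definition.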

Exercise#

Imagine that you have a text and want to count how often each word occurs in the text. To reduce complexity, all words are written in lower case and are already separated into a list. Punctuation was also removed.

The original text was:

Cats like to play, cats sleep a lot. Dogs bark loudly, but dogs run fast. Playing is fun!

words = ["cats", "like", "to", "play", 
         "cats", "sleep", "a", "lot", 
         "dogs", "bark", "loudly", "but", 
         "dogs", "run", "fast", "playing",
         "is", "fun"]
         
occurences = ...

# How often does the word "cats" appear?
print(...)

# How often does the word "lot" appear?
print(...)

# How many unique words are there?
print(...)

# How many words are there in total?
# Please try to solve it without using len(words)!
print(...)

Solution

words = ["cats", "like", "to", "play", 
         "cats", "sleep", "a", "lot", 
         "dogs", "bark", "loudly", "but", 
         "dogs", "run", "fast", "playing",
         "is", "fun"]
         
occurences = {}
for word in words:
    occurences[word] = occurences.get(word, 0) + 1

# How often does the word "cats" appear?
print(occurences.get("cats"))
# --> 2

# How often does the word "lot" appear?
print(occurences.get("lot"))
# --> 1

# How many unique words are there?
print(len(occurences.keys()))
# --> 16

# How many words are there in total?
# Please try to solve it without using len(words)!
print(sum(occurences.values()))
# --> 18

One can iterate over all words and, for each word, look up how often it has been seen so far (0 if it has not been seen yet) and then increase that count by one.

Since all keys are unique, the number of keys corresponds to the number of unique words. If a word occurs multiple times, the value of the same key will be incremented.

The dictionary occurences counts how often each word appears, so the sum of all its values equals the total number of words.
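
As an aside, and not as part of the intended solution: the Python standard library provides collections.Counter, a dictionary subclass built for exactly this kind of counting. The following sketch repeats the exercise with it; the variable name counts is chosen for this illustration.

```python
from collections import Counter

words = ["cats", "like", "to", "play",
         "cats", "sleep", "a", "lot",
         "dogs", "bark", "loudly", "but",
         "dogs", "run", "fast", "playing",
         "is", "fun"]

# Counter builds the word -> count mapping in one call.
counts = Counter(words)

print(counts["cats"])
# --> 2
print(len(counts))
# --> 16
print(sum(counts.values()))
# --> 18

# Unlike a plain dictionary, a Counter returns 0 for a missing key.
print(counts["birds"])
# --> 0
```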