- Introduction
- Directories and files description
- Installation
- Usage
- Results: Comparaisons between methods
- Methods: Python code
- References
- License
This project's goal is to compute average run times of different methods of reversing a dictionary's keys and values in Python 2.7 & 3:
- method 1: makes use of dictionary comprehension, and the dictionary must contain unique values
- method 2: the dictionary doesn't contain unique values and saves all the keys with the same values in a list
- method 3: makes use of
map(reversed, iter), useful when the type and order of the original dictionary must be preserved (e.g.OrderedDict)
I ran some tests, and method 1 is the one that provides the best average run time for reversing the keys and values of a dictionary. See section Comparaisons between different methods for the average run times of the different methods based on the number of items.
You can also check my blog post Python tips: reverse a dictionary where I discuss the implementation of the different methods and the results of the comparaisons of the methods based on their average run times.
Note: I tested the code with Python 2.7.15 and 3.6.5
compute_avg_run_time.py: it is the main script that will build the rightshell-based python command for computing the average run time of a dict-reversing method. It is run by providing it the right options through the command-line. It will execute eitherrun_python2_method.pyif the wanted method is Python2-based orrun_python3_method.pyif the method is Python3-based. It can be called withpython2orpython3through the command-line.run_python2_method.py: gets executed bycompute_avg_run_time.py, and calls the right Python 2 dict-reversing method that is defined inmethods.py.run_python3_method.py: gets executed bycompute_avg_run_time.py, and calls the right Python 3 dict-reversing method that is defined inmethods.py.reverse_dict/: a package where everything that is needed to compute the average run time of a method is defined such as the dict-reversing methods, and the arguments accepted by the methods through the command-line.
To use the main Python script
compute_avg_run_time.py:
- Clone the repository and extract it
- You can now run the main script by providing it the right options:
$ python compute_avg_run_time.py [-h] [--version] [OPTIONS]
Go to the section Usage for more details on the script options and examples of usage.
python compute_run_time.py [-h] [--version] [OPTIONS]
-
-h,--help
show the help message and exit -
-m METHOD_NAME,--method_name METHOD_NAME
Name of the method that reverses a dictionary's keys and values:method_01_py2: Python 2.7 version of method 1
method_02_py2: Python 2.7 version of method 2
method_03_py2: Python 2.7 version of method 3
method_01_py3: Python 3 version of method 1
method_02_py3: Python 3 version of method 2
method_03_py3: Python 3 version of method 3
(default:method_01_py3) -
-ui,--use_items
When working on Python 2, usesdict.items()instead of the more efficientdict.iteritems().
(default:False) -
-usd,--use_setdefault
Usedict.setdefault()instead ofdict.get()when populating the dictionary.
(default:False) -
-ni NUMBER_ITEMS,--number_items NUMBER_ITEMS
Number of items in the dictionary.
(default:1000) -
-nt NUMBER_TIMES,--number_times NUMBER_TIMES
Number of times the dictionary's keys and values will be reversed. Each time, the run time of the reversal is computed and at the end of all the tries, the average run time is computed.
(default:10) -
-p PRECISION,--precision PRECISION
Decimal precision used when displaying number results.
(default:8) -
-pd,--print_dicts
Print the original and reversed dictionaries at the end.
(default:False) -
-unu,--use_non_uniques
Initialize the original dictionary with non-unique values.
(default:False) -
-uod,--use_ordered_dict
UseOrderedDictinstead ofdictfor both dictionaries (original and inverse).
(default:False) -
-v,--version
Show program's version and exit.
Try 1000 times the method 1 with Python 2 on 10 items using
dict.items():
$ python3 compute_avg_run_time.py -m method_01_py2 -ni 10 -nt 1000 -ui -pd
Output:
Method name: method_01_py2
#1 Run time: 0.00000501
#2 Run time: 0.00000381
#3 Run time: 0.00000310
#4 Run time: 0.00000286
#5 Run time: 0.00000310
[...]
#995 Run time: 0.00000286
#996 Run time: 0.00000215
#997 Run time: 0.00000310
#998 Run time: 0.00000191
#999 Run time: 0.00000286
#1000 Run time: 0.00000215
Avg run time: 0.00000269 seconds
Original dict:
{'k10': 'v10', 'k3': 'v3', 'k2': 'v2', 'k1': 'v1', 'k7': 'v7', 'k6': 'v6', 'k5': 'v5', 'k4': 'v4', 'k9': 'v9', 'k8': 'v8'}
Inverse dictionary:
{'v10': 'k10', 'v1': 'k1', 'v2': 'k2', 'v3': 'k3', 'v4': 'k4', 'v5': 'k5', 'v6': 'k6', 'v7': 'k7', 'v8': 'k8', 'v9': 'k9'}
Notes:
- The main script
compute_avg_run_time.pywas called withpython3even though we are executing a Python2-based dict-reversing method (method_01_py2). As was explained previously,compute_avg_run_time.py, which can be run withpython2orpython3, executesrun_python2_method.py(a Python 2 script) which will callmethod_01_py2. - From the content of the dictionaries, we see that the order of
insertion was not fully respected (
{'v10': 'k10'}is at the beginning) as can be expected sincemethod_01_py2is a Python2-based method that is using adictas the data structure. If anOrderedDictwould have been used (with the option-uod), then the initial order of insertion would have been maintained in the dictionaries.
Try 1000 times the method 2 with Python 3 on 9 items using
dict.setdefault():
$ python compute_avg_run_time.py -m method_02_py3 -ni 9 -nt 1000 -usd -pd -unu
Output:
Method name: method_02_py3
#1 Run time: 0.00001127
#2 Run time: 0.00000531
#3 Run time: 0.00000447
#4 Run time: 0.00000418
#5 Run time: 0.00000409
[...]
#995 Run time: 0.00000372
#996 Run time: 0.00000380
#997 Run time: 0.00000375
#998 Run time: 0.00000372
#999 Run time: 0.00000371
#1000 Run time: 0.00000368
Avg run time: 0.00000386 seconds
Original dict:
{'k1': 'v1', 'k2': 'v2', 'k3': 'v3', 'k4': 'v4', 'k5': 'v1', 'k6': 'v2', 'k7': 'v3', 'k8': 'v4', 'k9': 'v5'}
Inverse dictionary:
{'v1': ['k1', 'k5'], 'v2': ['k2', 'k6'], 'v3': ['k3', 'k7'], 'v4': ['k4', 'k8'], 'v5': ['k9']}
Notes:
- In my work environment,
pythonpoints toPython 3.6.5 - Method 2 works also with non-unique values (
-unu), i.e. keys might have the same values in the originaldict.
Try 1000 times the method 3 with Python 3 on 10 items using
OrderedDict:
$ python compute_avg_run_time.py -m method_03_py3 -ni 10 -nt 1000 -uod -pd
Output:
Method name: method_03_py3
#1 Run time: 0.00001149
#2 Run time: 0.00000642
#3 Run time: 0.00000523
#4 Run time: 0.00000480
#5 Run time: 0.00000483
[...]
#995 Run time: 0.00000447
#996 Run time: 0.00000453
#997 Run time: 0.00000452
#998 Run time: 0.00000445
#999 Run time: 0.00000447
#1000 Run time: 0.00000452
Avg run time: 0.00000495 seconds
Original dict:
OrderedDict([('k1', 'v1'), ('k2', 'v2'), ('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5'), ('k6', 'v6'), ('k7', 'v7'), ('k8', 'v8'), ('k9', 'v9'), ('k10', 'v10')])
Inverse dictionary:
OrderedDict([('v1', 'k1'), ('v2', 'k2'), ('v3', 'k3'), ('v4', 'k4'), ('v5', 'k5'), ('v6', 'k6'), ('v7', 'k7'), ('v8', 'k8'), ('v9', 'k9'), ('v10', 'k10')])
Note:
- An
OrderedDict()is used instead but if a simpledictwould have been used, the inverse dictionary would have respected the original insertion order also since starting from Python 3.6, the insertion order is an implementation detail (though not a guarantee in Python 3.6). However, as of Python 3.7, you can be guaranteed that insertion order is supported fordict.
2018-09-16:
-
I updated the code base and re-ran the python commands to re-populate the following two tables with new results. The changes consisted in implementing the important missing option
--use_setdefaultwhich I thought was already implemented when I generated the results the first time. Big mistake on my part! Thus the old results for the rows Method 2: setdefault were generated actually withdict.get()instead ofdict.setdefault():( -
Also, I factorized methods.py where the different
dict-reversing methods are defined by putting all the common code from Python 2 & 3 methods into base classes.
The following tables present the average run times (in µ seconds) of the different methods of reversing a dictionary. As it can be seen, method 1 is the big winner, offering the best average run times in Python 2 & 3. More details on the results can be found on my blog post Python tips: reverse a dictionary.
The python commands for each methods are to be found in commands.md.
of reversing a
dict in Python 2.7
| Py2 Method | Avg time (µsec), 1k items, 100k times | Avg time (µsec), 10k items, 1k times | Avg time (µsec), 100k items, 1k times |
|---|---|---|---|
Method 1: dict comprehension, iteritems() |
|||
Method 1: dict comprehension, items() |
256.79 | 4125.60 | 70901.29 |
Method 2: dict.get, iteritems() |
838.27 | 10410.03 | 127703 |
Method 2: setdefault, iteritems() |
656.03 | 8539.27 | 108347.76 |
Method 3: map(reversed, iterable), iteritems() |
895.95 | 10788.73 | 146115.25 |
of reversing a
dict in Python 3
| Py3 Method | Avg time (µsec), 1k items, 100k times | Avg time (µsec), 10k items, 1k times | Avg time (µsec), 100k items, 1k times |
|---|---|---|---|
Method 1: dict comprehension |
|||
Method 2: dict.get |
417.17 | 4949.22 | 66599.29 |
Method 2: setdefault |
333.02 | 4171.31 | 58628.69 |
Method 3: map(reversed, iterable) |
311.59 | 3366.20 | 46960.80 |
dict.iteritems()
my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.iteritems()}dict.items()
my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.items()}my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.items()}dict.get()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for k, v in my_dict.iteritems():
inv_dict[v] = inv_dict.get(v, [])
inv_dict[v].append(k)dict.get()
from collections import OrderedDict
def reverse_dict(orig_dict):
inv_dict = {}
for k, v in orig_dict.items():
inv_dict[v] = inv_dict.get(v, [])
inv_dict[v].append(k)
my_dict = OrderedDict({1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'})
reverse_dict(my_dict)dict.setdefault()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for key, value in my_dict.iteritems():
inv_dict.setdefault(value, []).append(key)dict.setdefault()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for key, value in my_dict.items():
inv_dict.setdefault(value, []).append(key)def reverse_mapping(f):
return f.__class__(map(reversed, f.iteritems()))
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'd', 5: 'e'}
inv_dict = reverse_mapping(my_dict)def reverse_mapping(f):
return f.__class__(map(reversed, f.items()))
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'd', 5: 'e'}
inv_dict = reverse_mapping(my_dict)The code is licensed under the MIT license. See the license for more details.