Skip to content

Code for computing average running times of different methods of reversing a dict in Python 2.7 & 3

License

Notifications You must be signed in to change notification settings

ool3/python_reverse_dict

Repository files navigation

python_reverse_dict

Introduction

This project's goal is to compute average run times of different methods of reversing a dictionary's keys and values in Python 2.7 & 3:

  • method 1: makes use of dictionary comprehension, and the dictionary must contain unique values
  • method 2: the dictionary doesn't contain unique values and saves all the keys with the same values in a list
  • method 3: makes use of map(reversed, iter), useful when the type and order of the original dictionary must be preserved (e.g. OrderedDict)

I ran some tests, and method 1 is the one that provides the best average run time for reversing the keys and values of a dictionary. See section Comparaisons between different methods for the average run times of the different methods based on the number of items.

You can also check my blog post Python tips: reverse a dictionary where I discuss the implementation of the different methods and the results of the comparaisons of the methods based on their average run times.

Note: I tested the code with Python 2.7.15 and 3.6.5

Directories and files description

  • compute_avg_run_time.py : it is the main script that will build the right shell-based python command for computing the average run time of a dict-reversing method. It is run by providing it the right options through the command-line. It will execute either run_python2_method.py if the wanted method is Python2-based or run_python3_method.py if the method is Python3-based. It can be called with python2 or python3 through the command-line.
  • run_python2_method.py: gets executed by compute_avg_run_time.py, and calls the right Python 2 dict-reversing method that is defined in methods.py.
  • run_python3_method.py: gets executed by compute_avg_run_time.py, and calls the right Python 3 dict-reversing method that is defined in methods.py.
  • reverse_dict/: a package where everything that is needed to compute the average run time of a method is defined such as the dict-reversing methods, and the arguments accepted by the methods through the command-line.

Installation

To use the main Python script compute_avg_run_time.py:

  • Clone the repository and extract it
  • You can now run the main script by providing it the right options:
    $ python compute_avg_run_time.py [-h] [--version] [OPTIONS]

Go to the section Usage for more details on the script options and examples of usage.

Usage

python compute_run_time.py [-h] [--version] [OPTIONS]

Options

  • -h, --help
    show the help message and exit

  • -m METHOD_NAME, --method_name METHOD_NAME
    Name of the method that reverses a dictionary's keys and values:

    method_01_py2: Python 2.7 version of method 1
    method_02_py2: Python 2.7 version of method 2
    method_03_py2: Python 2.7 version of method 3
    method_01_py3: Python 3 version of method 1
    method_02_py3: Python 3 version of method 2
    method_03_py3: Python 3 version of method 3
    (default: method_01_py3)

  • -ui, --use_items
    When working on Python 2, uses dict.items() instead of the more efficient dict.iteritems().
    (default: False)

  • -usd, --use_setdefault
    Use dict.setdefault() instead of dict.get() when populating the dictionary.
    (default: False)

  • -ni NUMBER_ITEMS, --number_items NUMBER_ITEMS
    Number of items in the dictionary.
    (default: 1000)

  • -nt NUMBER_TIMES, --number_times NUMBER_TIMES
    Number of times the dictionary's keys and values will be reversed. Each time, the run time of the reversal is computed and at the end of all the tries, the average run time is computed.
    (default: 10)

  • -p PRECISION, --precision PRECISION
    Decimal precision used when displaying number results.
    (default: 8)

  • -pd, --print_dicts
    Print the original and reversed dictionaries at the end.
    (default: False)

  • -unu, --use_non_uniques
    Initialize the original dictionary with non-unique values.
    (default: False)

  • -uod, --use_ordered_dict
    Use OrderedDict instead of dict for both dictionaries (original and inverse).
    (default: False)

  • -v, --version
    Show program's version and exit.

Examples of usage

Example 1: method 1 (Python 2.7)

Try 1000 times the method 1 with Python 2 on 10 items using dict.items():
$ python3 compute_avg_run_time.py -m method_01_py2 -ni 10 -nt 1000 -ui -pd

Output:

Method name: method_01_py2
#1 Run time: 0.00000501
#2 Run time: 0.00000381
#3 Run time: 0.00000310
#4 Run time: 0.00000286
#5 Run time: 0.00000310
[...]
#995 Run time: 0.00000286
#996 Run time: 0.00000215
#997 Run time: 0.00000310
#998 Run time: 0.00000191
#999 Run time: 0.00000286
#1000 Run time: 0.00000215
Avg run time: 0.00000269 seconds

Original dict:
{'k10': 'v10', 'k3': 'v3', 'k2': 'v2', 'k1': 'v1', 'k7': 'v7', 'k6': 'v6', 'k5': 'v5', 'k4': 'v4', 'k9': 'v9', 'k8': 'v8'}
Inverse dictionary:
{'v10': 'k10', 'v1': 'k1', 'v2': 'k2', 'v3': 'k3', 'v4': 'k4', 'v5': 'k5', 'v6': 'k6', 'v7': 'k7', 'v8': 'k8', 'v9': 'k9'}

Notes:

  • The main script compute_avg_run_time.py was called with python3 even though we are executing a Python2-based dict-reversing method (method_01_py2). As was explained previously, compute_avg_run_time.py, which can be run with python2 or python3, executes run_python2_method.py (a Python 2 script) which will call method_01_py2.
  • From the content of the dictionaries, we see that the order of insertion was not fully respected ({'v10': 'k10'} is at the beginning) as can be expected since method_01_py2 is a Python2-based method that is using a dict as the data structure. If an OrderedDict would have been used (with the option -uod), then the initial order of insertion would have been maintained in the dictionaries.

Example 2: method 2 (Python 3)

Try 1000 times the method 2 with Python 3 on 9 items using dict.setdefault():
$ python compute_avg_run_time.py -m method_02_py3 -ni 9 -nt 1000 -usd -pd -unu

Output:

Method name: method_02_py3
#1 Run time: 0.00001127
#2 Run time: 0.00000531
#3 Run time: 0.00000447
#4 Run time: 0.00000418
#5 Run time: 0.00000409
[...]
#995 Run time: 0.00000372
#996 Run time: 0.00000380
#997 Run time: 0.00000375
#998 Run time: 0.00000372
#999 Run time: 0.00000371
#1000 Run time: 0.00000368
Avg run time: 0.00000386 seconds

Original dict:
{'k1': 'v1', 'k2': 'v2', 'k3': 'v3', 'k4': 'v4', 'k5': 'v1', 'k6': 'v2', 'k7': 'v3', 'k8': 'v4', 'k9': 'v5'}
Inverse dictionary:
{'v1': ['k1', 'k5'], 'v2': ['k2', 'k6'], 'v3': ['k3', 'k7'], 'v4': ['k4', 'k8'], 'v5': ['k9']}

Notes:

  • In my work environment, python points to Python 3.6.5
  • Method 2 works also with non-unique values (-unu), i.e. keys might have the same values in the original dict.

Example 3: method 3 (Python 3)

Try 1000 times the method 3 with Python 3 on 10 items using OrderedDict:
$ python compute_avg_run_time.py -m method_03_py3 -ni 10 -nt 1000 -uod -pd

Output:

Method name: method_03_py3
#1 Run time: 0.00001149
#2 Run time: 0.00000642
#3 Run time: 0.00000523
#4 Run time: 0.00000480
#5 Run time: 0.00000483
[...]
#995 Run time: 0.00000447
#996 Run time: 0.00000453
#997 Run time: 0.00000452
#998 Run time: 0.00000445
#999 Run time: 0.00000447
#1000 Run time: 0.00000452
Avg run time: 0.00000495 seconds

Original dict:
OrderedDict([('k1', 'v1'), ('k2', 'v2'), ('k3', 'v3'), ('k4', 'v4'), ('k5', 'v5'), ('k6', 'v6'), ('k7', 'v7'), ('k8', 'v8'), ('k9', 'v9'), ('k10', 'v10')])
Inverse dictionary:
OrderedDict([('v1', 'k1'), ('v2', 'k2'), ('v3', 'k3'), ('v4', 'k4'), ('v5', 'k5'), ('v6', 'k6'), ('v7', 'k7'), ('v8', 'k8'), ('v9', 'k9'), ('v10', 'k10')])

Note:

  • An OrderedDict() is used instead but if a simple dict would have been used, the inverse dictionary would have respected the original insertion order also since starting from Python 3.6, the insertion order is an implementation detail (though not a guarantee in Python 3.6). However, as of Python 3.7, you can be guaranteed that insertion order is supported for dict.

Results: Comparaisons between methods

Major updates

2018-09-16:

  • I updated the code base and re-ran the python commands to re-populate the following two tables with new results. The changes consisted in implementing the important missing option --use_setdefault which I thought was already implemented when I generated the results the first time. Big mistake on my part! Thus the old results for the rows Method 2: setdefault were generated actually with dict.get() instead of dict.setdefault() :(

  • Also, I factorized methods.py where the different dict-reversing methods are defined by putting all the common code from Python 2 & 3 methods into base classes.


The following tables present the average run times (in µ seconds) of the different methods of reversing a dictionary. As it can be seen, method 1 is the big winner, offering the best average run times in Python 2 & 3. More details on the results can be found on my blog post Python tips: reverse a dictionary.

The python commands for each methods are to be found in commands.md.

Table 1 Average running times of different methods
of reversing a dict in Python 2.7
Py2 Method Avg time (µsec), 1k items, 100k times Avg time (µsec), 10k items, 1k times Avg time (µsec), 100k items, 1k times
Method 1: dict comprehension, iteritems()

217.96

2665.56

45301.06

Method 1: dict comprehension, items() 256.79 4125.60 70901.29
Method 2: dict.get, iteritems() 838.27 10410.03 127703
Method 2: setdefault, iteritems() 656.03 8539.27 108347.76
Method 3: map(reversed, iterable), iteritems() 895.95 10788.73 146115.25

Table 2 Average running times of different methods
of reversing a dict in Python 3
Py3 Method Avg time (µsec), 1k items, 100k times Avg time (µsec), 10k items, 1k times Avg time (µsec), 100k items, 1k times
Method 1: dict comprehension

89.97

905.46

18487.63

Method 2: dict.get 417.17 4949.22 66599.29
Method 2: setdefault 333.02 4171.31 58628.69
Method 3: map(reversed, iterable) 311.59 3366.20 46960.80

Methods: Python code

Method 1: unique-values, solution based on dict comprehension


Method 1: Python 2.7 with dict.iteritems()
my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.iteritems()}

Method 1: Python 2.7 with dict.items()
my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.items()}

Method 1: Python 3
my_dict = { 'a': 1, 'b':2, 'c': 3, 'd':4, 'e':5}
inv_dict = {v: k for k, v in my_dict.items()}

Method 2: non-unique values


Method 2: Python 2.7 with dict.get()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for k, v in my_dict.iteritems():
    inv_dict[v] = inv_dict.get(v, [])
    inv_dict[v].append(k)

Method 2: Python 3 with dict.get()
from collections import OrderedDict

def reverse_dict(orig_dict):
    inv_dict = {}
    for k, v in orig_dict.items():
        inv_dict[v] = inv_dict.get(v, [])
        inv_dict[v].append(k)

my_dict = OrderedDict({1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'})
reverse_dict(my_dict)

Method 2: Python 2.7 with dict.setdefault()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for key, value in my_dict.iteritems():
    inv_dict.setdefault(value, []).append(key)

Method 2: Python 3 with dict.setdefault()
my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'a', 5: 'c'}
inv_dict = {}
for key, value in my_dict.items():
    inv_dict.setdefault(value, []).append(key)

Method 3: type and order preserved


Method 3: Python 2.7
def reverse_mapping(f):
    return f.__class__(map(reversed, f.iteritems()))

my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'd', 5: 'e'}
inv_dict = reverse_mapping(my_dict)

Method 3: Python 3
def reverse_mapping(f):
    return f.__class__(map(reversed, f.items()))

my_dict = {1: 'a', 2:'b', 3: 'c', 4: 'd', 5: 'e'}
inv_dict = reverse_mapping(my_dict)

References

License

The code is licensed under the MIT license. See the license for more details.

About

Code for computing average running times of different methods of reversing a dict in Python 2.7 & 3

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages