Skip to content

Dinofish32/ComplexVectorBenchmarking

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Evaluating and Benchmarking Complex Numbers vs. NumPy Vectors for 2D Computation

Abstract

The 'vector' is used extensively across all domains of science, engineering, and technology. In game development, vectors represent movement of objects and particle systems, making graphics look as real as possible. Scientific research relies heavily on vectors to represent data, such as in quantum mechanics, electromagnetism, and fluid dynamics. More fundamentally, vectors are quantities that have both magnitude and direction.

Vectors have traditionally been represented in Python using NumPy arrays. NumPy is a library that provides a way to efficiently manipulate data in Python and performs extremely well for vector operations. A 2D vector can be represented using np.array([x, y]). The library also provides many useful functions to manipulate vectors, allowing us to perform operations like addition, rotation, and so on.

Another way a vector can be represented is using complex numbers, a feature built into Python. A complex number is written in the form a+bi, where a and b are real numbers and i is the imaginary unit. This allows us to represent a pair of real numbers (a, b) as a single complex number. Since 2D vectors can be represented as a coordinate pair <a, b>, we can represent a 2D vector as a complex number, with the real part representing the x coordinate and the imaginary part representing the y coordinate.

Vectors on a graph

Vectors on a graph. Powered by Geogebra.

Vectors have traditionally been represented in these fields using tuples, NumPy arrays, or custom classes. However, the use of complex numbers to represent vectors has not been heavily explored. This paper investigates whether complex numbers provide a more efficient representation of 2D vectors compared to NumPy arrays in terms of computation and memory usage across various vector operations. We will use different metrics to compare the performance of the two methods, including generation time, access time, addition time, dot product time, magnitude calculation time, and rotation time. Understanding the most efficient vector representation has major implications in performance-critical applications, such as physics simulations, computer graphics, and machine learning.

Methodology

Experimental Setup

This experiment consisted on comparing the performance of complex numbers and NumPy arrays in terms of generation time, access time, addition time, dot product time, magnitude calculation time, rotation time, and memory usage. The experiments were conducted on a Windows 11 machine with a 12th Gen Intel(R) Core(TM) i7-1255U and 12GB of RAM. The experiments were conducted using Python 3.11.1 and NumPy 1.26.4.

Implementation

The benchmarking code can be found in the main.py file, while the disassembly of each can be found in internal_steps.py. The code is divided into several sections, each of which contains the code for a different operation. "NumPy vector" "and "vector" will be used interchangeably in this paper.

The code for each operation is run 10 times and the average time is taken. The code for each operation is run with 1,000,000 vectors. The time.perf_counter() function is used to measure the time taken for each operation. Here's an explanation of each metric and how it was measured.

  • Generation: The time it takes to generate a complex number and a NumPy vector.
  • Access: The time it takes to access both components of a complex number and a NumPy vector.
  • Addition: The time it takes to add two complex numbers and two NumPy vectors.
  • Dot Product: The time it takes to calculate the dot product of two complex numbers and two NumPy vectors. This was done using c1.real * c2.real + c1.imag * c2.imag for the complex numbers and v = v1[0] * v2[0] + v1[1] * v2[1] for the NumPy vector. Traditionally this is done with the np.dot(v1, v2) function, but a function call would unfairly increase the time for the NumPy vectors.
  • Magnitude: The time it takes to calculate the magnitude of a complex number and a NumPy vector. This was done with abs(z) for complex numbers and np.linalg.norm(v) for the NumPy vectors.
  • Rotation: The time it takes to rotate a complex number and a NumPy vector by an arbitrary degree. For the complex numbers, this was done by multiplying the number by complex(np.cos(theta), np.sin(theta)), which is a representation of cosθ + isinθ. For the vector in NumPy, a linear algebra approach was used, creating the rotation matrix np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]]) to left-multiply to the vector.
  • Memory Usage: The memory usage of an array of 1 million complex numbers and an array of 1 million NumPy vectors. This was done with the memory_profiler library, and the number of megabytes for each array was compared.

Results

Benchmarking Results

Metric Complex Time (s) NumPy Vector Time (s)
Generation 0.3457 0.0152
Access 0.0424 0.1736
Addition 0.0338 0.4369
Dot Product 0.0978 0.3149
Magnitude 0.0702 1.8189
Rotation 0.0356 1.0493

Performance Comparison Performance comparison between Complex Numbers & Vectors across time efficiency metrics

The metric times and the performance comparison graph reveal a surprising outcome: complex numbers have much greater generation times, but consistently beat NumPy vectors in all other metrics. The greatest discrepancy can be seen in the rotation calculation, where complex numbers beat NumPy vectors by 29.49 times. NumPy vectors, however, beat complex numbers in generation time by 22.67 times. It is also interesting to notice that for all other metrics but generation complex numbers have < 0.1s times.

Benchmarking Memory

Metric Complex Numbers Vectors
Single Element Memory 32 B 16 B
1-million Vector Array 15.3 MB 15.3 MB

Benchmarking the memory reveals that complex numbers require twice the amount of storage compared to vectors; however, a 1-million NumPy array of both vector types has the same amount of memory.

Analysis

Explanation of Results

Let's first look at the huge difference between generation times between both vector types. This discrepancy can be attributed to the underlying implementations of each object. Complex numbers are a built-in object in Python, so there is additional overhead when creating the object that leads to the large generation time. NumPy arrays, on the other hand, have a flexible dual-component structure that holds raw data and information about the data. This makes creating arrays and doing operations on them relatively fast.

Next, we can observe the difference in access times between both vector types. Complex numbers are more than 4 times faster to access than NumPy vectors. This can be justified by how data in both vectors are arranged in memory. Using the dis module in python to disassemble the steps required for accessing both, we can try to justify the difference.

>>> python internal_steps.py
===Disassembly of Complex Number Access===
  8           0 RESUME                   0

  9           2 LOAD_FAST                0 (c_vector)
              4 LOAD_ATTR                0 (real)
             14 STORE_FAST               1 (real)

 10          16 LOAD_FAST                0 (c_vector)
             18 LOAD_ATTR                1 (imag)
             28 STORE_FAST               2 (imag)
             30 LOAD_CONST               0 (None)
             32 RETURN_VALUE

===Disassembly of NumPy Vector Access===
 12           0 RESUME                   0

 13           2 LOAD_FAST                0 (n_vector)
              4 LOAD_CONST               1 (0)
              6 BINARY_SUBSCR
             16 STORE_FAST               1 (x)

 14          18 LOAD_FAST                0 (n_vector)
             20 LOAD_CONST               2 (1)
             22 BINARY_SUBSCR
             32 STORE_FAST               2 (y)
             34 LOAD_CONST               0 (None)
             36 RETURN_VALUE

From the breakdown, we can make two observations:

  • Accessing NumPy vectors requires more steps
  • NumPy Vectors use the step BINARY_SUBSCR instead of LOAD_ATTR. BINARY_SUBSCR refers to binary subscript and is used for indexing operations, like v[0] and v[1] in this case. LOAD_ATTR is used for attribute access, like z.real and z.imag, and from our data we can conclude it is faster.

For the other operations measured, some time will be taken to access the data in each vector type, so complex numbers will inherently have an advantange. Every indexing operation for calculations will take more time compared to attribute access. This can be used to explain the difference between the times observed in the graph.

Implications

The implications of these are wide-ranging.

The rotation performance for complex numbers makes it particularly valuable for 2D game engines, which utilize rotation calculations all the time. Signal processing, vector field calculations, and some aspects of quantum mechanics would also benefit from this great boost in time from this change in vectors. In general, this could revolutionize speech at which research is done in many fields in engineering and science, since the tested operations are extensively used in those fields.

Limitations & Improvements

One potential limitation of this study is the lack of general research in the field comparing the performance of complex numbers to vectors in NumPy. This prevented me from comparing my results other studies.

Moreover, since complex numbers can only have 2 parts - real and imaginary - the results cannot be extended to multi-dimensional vectors. All applications of complex numbers can only be restricted to 2D applications.

Finally, I am looking to research within a certain field in science or engineering to get a more specific look into the performance difference.

Conclusion

For applications requiring frequent vector operations (like game engines, physics simulations, or real-time graphics), using complex numbers could provide significant performance benefits. Further research into the field would be needed to ensure repeatability of the results. However, if this significant increase in speed is still present, there may need to be some reworking required for many scientific libraries to incorporate complex numbers. Through all of it, the difference in generation times should be of major concern, since it is many magnitudes greater for complex numbers.

Regardless, these implications suggest that developers should carefully consider their specific use case when choosing between complex numbers and NumPy vectors, rather than defaulting to NumPy for all vector operations.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages