From 114c31c266c0e529c1ae379e4959b55ce1308811 Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Sun, 20 Apr 2025 16:08:43 -0400 Subject: [PATCH 01/19] feat(spring2025/weeksixteen/teamtwo/index.qmd): create folder and file for the blog post --- .../spring2025/weeksixteen/teamtwo/index.qmd | 45 +++++++++++++++++++ 1 file changed, 45 insertions(+) create mode 100644 allhands/spring2025/weeksixteen/teamtwo/index.qmd diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd new file mode 100644 index 0000000..c57e36a --- /dev/null +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -0,0 +1,45 @@ +--- +author: [Hemani Alaparthi, Duru Akbas, Williem Bennet, Faaris Cheema, Vivian Potts] +title: How does the runtime efficiency of linear search and balanced binary search tree (tree based approach) algorithms compare under varying dataset sizes and target element positions? +page-layout: full +categories: [post, linear search, binary search, binary search tree, search] +date: "2025-04-24" +date-format: long +toc: true +format: + html: + code-links: + - text: Github Repository + icon: github + href: https://github.com/hemanialaparthi/lvb +--- + +# Introduction + +# TODO + +## Motivation + +# TODO + +# Method + +## Approach + +# TODO + +# Data + +# TODO + +# Results + +# TODO + +# Conclusion + +# TODO + +# Future Work + +# TODO \ No newline at end of file From 17d626d23d1a2436373c43a286e42565714845f3 Mon Sep 17 00:00:00 2001 From: Hemani Alaparthi <143897209+hemanialaparthi@users.noreply.github.com> Date: Mon, 21 Apr 2025 01:53:59 -0400 Subject: [PATCH 02/19] feat(index.qmd): add a rough draft for the intro, motivation && approach --- .../spring2025/weeksixteen/teamtwo/index.qmd | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index c57e36a..3ec4ff5 100644 --- 
a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -16,14 +16,24 @@ format: # Introduction -# TODO +This study examines how `linear search`, `binary search`, and `balanced binary search tree (BST)` algorithms perform under varying conditions, specifically looking at: + +1. How dataset size affects performance scaling +2. How target element position impacts search efficiency ## Motivation -# TODO +Search algorithm selection significantly impacts application performance, particularly as datasets grow. While linear search offers simplicity with O(n) time complexity, binary search trees promise O(log n) efficiency when properly balanced. However, theoretical advantages don't always translate directly to real-world performance. This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure, providing insights for optimal algorithm selection in practical applications. # Method +For this experiment, we developed a benchmarking tool that allows for systematic comparison between search algorithms across different data structures. The tool measures execution time while controlling for: + +1. Data structure type (unsorted list vs. binary search tree) +2. Search algorithm (linear search vs. BST search) +3. Dataset size (with automatic doubling between runs) +4. 
Target position (`beginning`, `middle`, `end`, `random`, or `nonexistent`) + ## Approach # TODO @@ -42,4 +52,4 @@ format: # Future Work -# TODO \ No newline at end of file +# TODO From 9ecef05d0160a090f7ec61ba9b40a3294b2b6a02 Mon Sep 17 00:00:00 2001 From: Hemani Alaparthi <143897209+hemanialaparthi@users.noreply.github.com> Date: Mon, 21 Apr 2025 23:37:53 -0400 Subject: [PATCH 03/19] feat(index.qmd): add linear algorithm section & add motivation section --- .../spring2025/weeksixteen/teamtwo/index.qmd | 30 +++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 3ec4ff5..50d7f89 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -23,7 +23,7 @@ This study examines how `linear search`, `binary search`, and `balanced binary s ## Motivation -Search algorithm selection significantly impacts application performance, particularly as datasets grow. While linear search offers simplicity with O(n) time complexity, binary search trees promise O(log n) efficiency when properly balanced. However, theoretical advantages don't always translate directly to real-world performance. This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure, providing insights for optimal algorithm selection in practical applications. +Search algorithm selection significantly impacts application performance, particularly as datasets grow. While linear search offers simplicity with O(n) time complexity, binary search trees promise O(log n) efficiency when properly balanced. However, theoretical advantages don't always translate directly to real-world performance. 
This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure, providing insights for optimal algorithm selection in practical applications. As datasets grow and performance requirements become more stringent, the choice between linear search, binary search, and tree-based approaches can significantly impact application responsiveness. This study aims to provide empirical data to guide these decisions. # Method @@ -36,7 +36,33 @@ For this experiment, we developed a benchmarking tool that allows for systematic ## Approach -# TODO +### Linear Search + +```cmd +def linear_search(dataset: List[Any], target: Any) -> Optional[int]: + """Perform a linear search on the dataset. + + Args: + dataset: List to search through + target: Element to search for + + Returns: + int: Index of the target element, or None if not found + """ + # Iterate through the dataset + for i, item in enumerate(dataset): + if item == target: + return i + + # Target not found + return None +``` + +Linear search sequentially checks each element until finding the target or reaching the end. It works on both sorted and unsorted data with `O(n)` time complexity. 
+ +### Binary Search + +### Binary Search Tree # Data From 61126f6f51d9c2ea9b98c6b9bdafed4ea0783802 Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Mon, 21 Apr 2025 23:51:38 -0400 Subject: [PATCH 04/19] feat(index.qmd): fix motivation section and approach section and fix the headers --- .../spring2025/weeksixteen/teamtwo/index.qmd | 28 +++++++++++++++---- 1 file changed, 22 insertions(+), 6 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 50d7f89..abf91b3 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -23,7 +23,9 @@ This study examines how `linear search`, `binary search`, and `balanced binary s ## Motivation -Search algorithm selection significantly impacts application performance, particularly as datasets grow. While linear search offers simplicity with O(n) time complexity, binary search trees promise O(log n) efficiency when properly balanced. However, theoretical advantages don't always translate directly to real-world performance. This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure, providing insights for optimal algorithm selection in practical applications. As datasets grow and performance requirements become more stringent, the choice between linear search, binary search, and tree-based approaches can significantly impact application responsiveness. This study aims to provide empirical data to guide these decisions. +Search algorithm selection significantly impacts application performance, particularly as datasets grow. While `linear search` offers simplicity with `O(n)` time complexity, `binary search` and `binary search tree` promise`O(log n)` efficiency. However, theoretical advantages don't always translate directly to real-world performance. 
+ +This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure. As applications process increasingly large datasets under tight performance constraints, the choice between linear search, binary search, and tree-based approaches can dramatically affect responsiveness. Our research provides empirical data to guide these crucial implementation decisions. # Method @@ -34,9 +36,9 @@ For this experiment, we developed a benchmarking tool that allows for systematic 3. Dataset size (with automatic doubling between runs) 4. Target position (`beginning`, `middle`, `end`, `random`, or `nonexistent`) -## Approach +# Approach -### Linear Search +## Linear Search ```cmd def linear_search(dataset: List[Any], target: Any) -> Optional[int]: @@ -60,13 +62,27 @@ def linear_search(dataset: List[Any], target: Any) -> Optional[int]: Linear search sequentially checks each element until finding the target or reaching the end. It works on both sorted and unsorted data with `O(n)` time complexity. -### Binary Search +## Binary Search + +TODO -### Binary Search Tree +## Binary Search Tree + +TODO # Data -# TODO +We conducted experiments with datasets of 1,000 and 5,000 elements, using both sorted and unsorted arrays. For each algorithm, we measured: + +1. Performance across different dataset sizes +2. Impact of target position (beginning, middle, end, or nonexistent) +3. 
Runtime consistency across multiple searches (100 and 500 searches) + +# Data Tables + +### MacOS + +### Windows # Results From 0fc98805f535f7bb8660d01e19b8b8b13742533b Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Mon, 21 Apr 2025 23:56:49 -0400 Subject: [PATCH 05/19] fix(index.qmd): fix the backtick in the methods section --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index abf91b3..d92c6d0 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -31,8 +31,8 @@ This study systematically analyzes how these algorithms compare across different For this experiment, we developed a benchmarking tool that allows for systematic comparison between search algorithms across different data structures. The tool measures execution time while controlling for: -1. Data structure type (unsorted list vs. binary search tree) -2. Search algorithm (linear search vs. BST search) +1. Data structure type (`unsorted list`, `sorted list` and `binary search tree`) +2. Search algorithm (`linear search` vs. `binary search` vs. `BST search`) 3. Dataset size (with automatic doubling between runs) 4. 
Target position (`beginning`, `middle`, `end`, `random`, or `nonexistent`) From c5251f0eb29335b01e2a691945f012aef91143cb Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 00:01:37 -0400 Subject: [PATCH 06/19] fix!: backticks in the data section --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index d92c6d0..00ca55c 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -72,11 +72,11 @@ TODO # Data -We conducted experiments with datasets of 1,000 and 5,000 elements, using both sorted and unsorted arrays. For each algorithm, we measured: +We conducted experiments with datasets of `1,000` and `5,000` elements, using both sorted and unsorted arrays. For each algorithm, we measured: 1. Performance across different dataset sizes -2. Impact of target position (beginning, middle, end, or nonexistent) -3. Runtime consistency across multiple searches (100 and 500 searches) +2. Impact of target position (`beginning`, `middle`, `end`, or `nonexistent`) +3. 
Runtime consistency across multiple searches (`100` and `500` searches) # Data Tables From ee3858e14074a6aabdad06d187e5e98be09ded4f Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 00:35:12 -0400 Subject: [PATCH 07/19] feat(index.qmd): shorten RQ to make it shorter --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 00ca55c..9724a60 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -1,6 +1,6 @@ --- author: [Hemani Alaparthi, Duru Akbas, Williem Bennet, Faaris Cheema, Vivian Potts] -title: How does the runtime efficiency of linear search and balanced binary search tree (tree based approach) algorithms compare under varying dataset sizes and target element positions? +title: How do linear search, binary search, balanced BSTs compare in runtime efficiency across varying dataset sizes and target positions? 
page-layout: full categories: [post, linear search, binary search, binary search tree, search] date: "2025-04-24" From d28fec1c44bfe48d800d21d71b50109cf499ab26 Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 00:46:12 -0400 Subject: [PATCH 08/19] feat(index.qmd): add to the future work section --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 9724a60..e367df7 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -94,4 +94,6 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo # Future Work -# TODO +- **Floating-point comparison overhead might affect the relative performance advantages** + - Decimal comparisons introduce additional computational overhead and potential precision issues that could affect performance differently across algorithms. This would be particularly relevant for scientific computing and financial applications where decimal data is common. 
+ From a9b73c0103ff5c1d53530ee12f8520689f09cd9d Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 00:47:10 -0400 Subject: [PATCH 09/19] feat(index.qmd): add to the future work section and create a references section --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index e367df7..757031d 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -94,6 +94,10 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo # Future Work +Several areas for future investigation could provide additional insights: + - **Floating-point comparison overhead might affect the relative performance advantages** - Decimal comparisons introduce additional computational overhead and potential precision issues that could affect performance differently across algorithms. This would be particularly relevant for scientific computing and financial applications where decimal data is common. +# References + From 963c7d374053315ff6d785afe6d54b45fb447515 Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 00:49:48 -0400 Subject: [PATCH 10/19] feat(index.qmd): add references! 
--- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 2 ++ 1 file changed, 2 insertions(+) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 757031d..eb915c7 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -101,3 +101,5 @@ Several areas for future investigation could provide additional insights: # References +- [Linear vs Binary Search](https://www.geeksforgeeks.org/linear-search-vs-binary-search/) +- [Binary Search Tree](https://www.geeksforgeeks.org/introduction-to-binary-search-tree/) From fdc285dac186aecd8e799981a9782aba8fc770cd Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 11:08:30 -0400 Subject: [PATCH 11/19] fix(index.qmd): fix the intro, motivation and methods to remove redundancies --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index eb915c7..183e85b 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -21,11 +21,17 @@ This study examines how `linear search`, `binary search`, and `balanced binary s 1. How dataset size affects performance scaling 2. How target element position impacts search efficiency -## Motivation +# Motivation -Search algorithm selection significantly impacts application performance, particularly as datasets grow. While `linear search` offers simplicity with `O(n)` time complexity, `binary search` and `binary search tree` promise`O(log n)` efficiency. However, theoretical advantages don't always translate directly to real-world performance. 
+While `linear search` `(O(n))` offers simplicity and `binary search`/`balanced BSTs` `(O(log n))` promise theoretical efficiency, real-world performance depends heavily on: -This study systematically analyzes how these algorithms compare across different dataset sizes and when searching for targets located in different positions within the data structure. As applications process increasingly large datasets under tight performance constraints, the choice between linear search, binary search, and tree-based approaches can dramatically affect responsiveness. Our research provides empirical data to guide these crucial implementation decisions. +1. Dataset growth patterns + +2. Target location distributions + +3. Hardware/platform characteristics + +Our benchmarking provides empirical insights to guide algorithm selection in production systems where theoretical models may not reflect actual behavior. # Method From 346f2e41e7e64e80393cc01380c6f7a28efcccbe Mon Sep 17 00:00:00 2001 From: hemanialaparthi Date: Tue, 22 Apr 2025 14:30:30 -0400 Subject: [PATCH 12/19] fix(index.qmd): the headings in the file --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 183e85b..66e916a 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -90,9 +90,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo ### Windows -# Results - -# TODO +### Results # Conclusion From e0c0cbd0abb77a0cd601b566c28d9d5dde4edb29 Mon Sep 17 00:00:00 2001 From: duruakbas <143649934+duruakbas@users.noreply.github.com> Date: Tue, 22 Apr 2025 23:38:58 -0400 Subject: [PATCH 13/19] Implemented BST in the blog post --- .../spring2025/weeksixteen/teamtwo/index.qmd | 61 ++++++++++++++++++- 1 file changed, 60 insertions(+), 1 deletion(-) diff --git 
a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 66e916a..1c24a8a 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -74,7 +74,66 @@ TODO ## Binary Search Tree -TODO +```cmd +class Node: + """Creates the Node class for the BST.""" + + def __init__(self, value): + self.l_child = None + self.r_child = None + self.data = value + + +class BinarySearchTree: + """Initializing the BST.""" + + def __init__(self): + self.root = None + + def insert(self, value): + """Inserts the value into the tree.""" + if self.root is None: + self.root = Node(value) + else: + self.insert_recursive(self.root, value) + + def insert_recursive(self, node, value): + """Inserts the value into the tree recursively.""" + if value < node.data: + if node.l_child is None: + node.l_child = Node(value) + else: + self.insert_recursive(node.l_child, value) + else: # noqa: PLR5501 + if node.r_child is None: + node.r_child = Node(value) + else: + self.insert_recursive(node.r_child, value) + + def search(self, target): + """Search for a value in the BST and return if found.""" + return self.search_recursive(self.root, target) + + def search_recursive(self, node, target): + """Search for a value in the BST recursively.""" + if node is None: + return False + if node.data == target: + return True + elif target < node.data: + return self.search_recursive(node.l_child, target) + else: + return self.search_recursive(node.r_child, target) +``` + +The Binary Search Tree, is such an efficient approach for searching projects, with a balanced BST the time complexity is O(logn). The logic behind the BST is that the function, starts with a root node (starting node) and from there comparing the inputs to the target value, the smaller values go to the left as a child and the bigger values go to the right as a child. 
Insertion is recursive and places each new value by comparison, so the tree always preserves the BST ordering; this implementation does not rebalance, however, so the tree stays balanced only when the insertion order allows it. A search compares the target with the current node's value and descends left or right until it finds the target or reaches an empty child. Here is what a balanced BST looks like: + + 10 + / \ + 5 15 + / \ / \ + 3 7 12 18 +The smaller numbers go to the left, and the bigger numbers go to the right. A search in a balanced tree is efficient because it follows a single root-to-leaf path and skips most of the values. # Data From da52e1f6da8692e88cf027ee8d90a5ddb3937928 Mon Sep 17 00:00:00 2001 From: duruakbas <143649934+duruakbas@users.noreply.github.com> Date: Tue, 22 Apr 2025 23:45:22 -0400 Subject: [PATCH 14/19] update --- allhands/spring2025/weeksixteen/teamtwo/index.qmd | 1 + 1 file changed, 1 insertion(+) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 1c24a8a..3969bc9 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -166,3 +166,4 @@ Several areas for future investigation could provide additional insights: - [Linear vs Binary Search](https://www.geeksforgeeks.org/linear-search-vs-binary-search/) - [Binary Search Tree](https://www.geeksforgeeks.org/introduction-to-binary-search-tree/) +- [Binary Search Tree](https://www.geeksforgeeks.org/binary-search-tree-set-1-search-and-insertion/) From aaf9a09d301da9540206c49283eac62a4a535750 Mon Sep 17 00:00:00 2001 From: Faarisc <89533657+Faarisc@users.noreply.github.com> Date: Wed, 23 Apr 2025 00:36:38 -0400 Subject: [PATCH 15/19] Update index.qmd --- .../spring2025/weeksixteen/teamtwo/index.qmd | 176 ++++++++++++++++++ 1 file changed, 176 insertions(+) diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd index 3969bc9..ae784c7 100644 --- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd +++ 
b/allhands/spring2025/weeksixteen/teamtwo/index.qmd @@ -145,12 +145,188 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo # Data Tables + ### MacOS +#### Data Set Size and Searches Comparison (Random Target) + +##### Un-sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|-----------------|------------------|------------------|----------|--------------|----------| +| 1,000 | unsorted_list | linear_search | random | 100 | 0.004141 | 5 | +| 1,000 | unsorted_list | linear_search | random | 500 | 0.019249 | 5 | +| 5,000 | unsorted_list | linear_search | random | 100 | 0.010662 | 5 | + +##### Sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|------------------|------------------|----------|--------------|----------| +| 1,000 | sorted_list | linear_search | random | 500 | 0.136840 | 5 | +| 1,000 | sorted_list | linear_search | random | 100 | 0.005462 | 5 | +| 5,000 | sorted_list | linear_search | random | 500 | 0.144115 | 5 | + +##### Sorted list - Binary search (Iterative) +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 1,000 | sorted_list | binary_search_iterative | random | 100 | 0.000084 | 5 | +| 1,000 | sorted_list | binary_search_iterative | random | 500 | 0.000347 | 5 | +| 5,000 | sorted_list | binary_search_iterative | random | 100 | 0.000077 | 5 | + +##### Sorted list - Binary search (Recursive) +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 1,000 | 
sorted_list | binary_search_recursive | random | 100 | 0.000095 | 5 | +| 1,000 | sorted_list | binary_search_recursive | random | 500 | 0.000478 | 5 | +| 5,000 | sorted_list | binary_search_recursive | random | 100 | 0.000533 | 5 | + +##### Binary Search Tree - bst search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|---------------------|------------------|------------------|----------|--------------|----------| +| 1,000 | binary_search_tree | bst_search | random | 100 | 0.000063 | 5 | +| 1,000 | binary_search_tree | bst_search | random | 500 | 0.000311 | 5 | +| 5,000 | binary_search_tree | bst_search | random | 100 | 0.000097 | 5 | + + +#### Target Position Comparison (Fixed Dataset Size = 5,000) + +##### Un-sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|-----------------|------------------|------------------|----------|--------------|----------| +| 5,000 | unsorted_list | linear_search | beginning | 500 | 0.012992 | 5 | +| 5,000 | unsorted_list | linear_search | middle | 500 | 0.058549 | 5 | +| 5,000 | unsorted_list | linear_search | end | 500 | 0.066914 | 5 | +| 5,000 | unsorted_list | linear_search | nonexistent | 500 | 0.287232 | 5 | + +##### Sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|------------------|------------------|----------|--------------|----------| +| 5,000 | sorted_list | linear_search | beginning | 500 | 0.013443 | 5 | +| 5,000 | sorted_list | linear_search | middle | 500 | 0.145900 | 5 | +| 5,000 | sorted_list | linear_search | end | 500 | 0.264241 | 5 | +| 5,000 | sorted_list | linear_search | nonexistent | 500 | 0.279474 | 5 | + +##### Sorted list - Binary search (Iterative) +| Dataset Size | Data Structure | Search Algorithm | Target 
Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 5,000 | sorted_list | binary_search_iterative | beginning | 500 | 0.000368 | 5 | +| 5,000 | sorted_list | binary_search_iterative | middle | 500 | 0.000371 | 5 | +| 5,000 | sorted_list | binary_search_iterative | end | 500 | 0.000376 | 5 | +| 5,000 | sorted_list | binary_search_iterative | nonexistent | 500 | 0.000436 | 5 | + +##### Sorted list - Binary search (Recursive) +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 5,000 | sorted_list | binary_search_recursive | beginning | 500 | 0.000533 | 5 | +| 5,000 | sorted_list | binary_search_recursive | middle | 500 | 0.000542 | 5 | +| 5,000 | sorted_list | binary_search_recursive | end | 500 | 0.000532 | 5 | +| 5,000 | sorted_list | binary_search_recursive | nonexistent | 500 | 0.000613 | 5 | + +##### Binary Search Tree - bst search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|---------------------|------------------|------------------|----------|--------------|----------| +| 5,000 | binary_search_tree | bst_search | beginning | 500 | 0.000467 | 5 | +| 5,000 | binary_search_tree | bst_search | middle | 500 | 0.000472 | 5 | +| 5,000 | binary_search_tree | bst_search | end | 500 | 0.000482 | 5 | +| 5,000 | binary_search_tree | bst_search | nonexistent | 500 | 0.000825 | 5 | + ### Windows +#### Data Set Size and Searches Comparison (Random Target) + +##### Un-sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | 
+|--------------|-----------------|------------------|------------------|----------|--------------|----------| +| 1,000 | unsorted_list | linear_search | random | 100 | 0.010524 | 5 | +| 1,000 | unsorted_list | linear_search | random | 500 | 0.050583 | 5 | +| 5,000 | unsorted_list | linear_search | random | 100 | 0.026159 | 5 | + +##### Sorted list - Linear search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|------------------|------------------|----------|--------------|----------| +| 1,000 | sorted_list | linear_search | random | 100 | 0.013514 | 5 | +| 1,000 | sorted_list | linear_search | random | 500 | 0.064779 | 5 | +| 5,000 | sorted_list | linear_search | random | 500 | 0.323415 | 5 | + +##### Sorted list - Binary search (Iterative) +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 1,000 | sorted_list | binary_search_iterative | random | 100 | 0.000154 | 5 | +| 1,000 | sorted_list | binary_search_iterative | random | 500 | 0.000766 | 5 | +| 5,000 | sorted_list | binary_search_iterative | random | 100 | 0.000172 | 5 | + +##### Sorted list - Binary search (Recursive) +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | +|--------------|----------------|--------------------------|------------------|----------|--------------|----------| +| 1,000 | sorted_list | binary_search_recursive | random | 100 | 0.000202 | 5 | +| 1,000 | sorted_list | binary_search_recursive | random | 500 | 0.000937 | 5 | +| 5,000 | sorted_list | binary_search_recursive | random | 100 | 0.000214 | 5 | + +##### Binary Search Tree - bst search +| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling | 
+|--------------|---------------------|------------------|------------------|----------|--------------|----------|
+| 1,000 | binary_search_tree | bst_search | random | 100 | 0.000123 | 5 |
+| 1,000 | binary_search_tree | bst_search | random | 500 | 0.000528 | 5 |
+| 5,000 | binary_search_tree | bst_search | random | 100 | 0.000159 | 5 |
+
+#### Target Position Comparison (Fixed Dataset Size = 5,000)
+
+##### Un-sorted list - Linear search
+| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
+|--------------|-----------------|------------------|------------------|----------|--------------|----------|
+| 5,000 | unsorted_list | linear_search | beginning | 500 | 0.029487 | 5 |
+| 5,000 | unsorted_list | linear_search | middle | 500 | 0.134085 | 5 |
+| 5,000 | unsorted_list | linear_search | end | 500 | 0.164473 | 5 |
+| 5,000 | unsorted_list | linear_search | nonexistent | 500 | | 5 |
+
+##### Sorted list - Linear search
+| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
+|--------------|----------------|------------------|------------------|----------|--------------|----------|
+| 5,000 | sorted_list | linear_search | beginning | 500 | 0.032096 | 5 |
+| 5,000 | sorted_list | linear_search | middle | 500 | | 5 |
+| 5,000 | sorted_list | linear_search | end | 500 | | 5 |
+| 5,000 | sorted_list | linear_search | nonexistent | 500 | | 5 |
+
+##### Sorted list - Binary search (Iterative)
+| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
+|--------------|----------------|--------------------------|------------------|----------|--------------|----------|
+| 5,000 | sorted_list | binary_search_iterative | beginning | 500 | 0.001463 | 5 |
+| 5,000 | sorted_list | binary_search_iterative | middle | 500 | 0.001475 | 5 |
+| 5,000 | sorted_list | binary_search_iterative | end | 500 | 0.001463 | 5 |
+| 5,000 | sorted_list | binary_search_iterative | nonexistent | 500 | 0.001604 | 5 |
+
+##### Sorted list - Binary search (Recursive)
+| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
+|--------------|----------------|--------------------------|------------------|----------|--------------|----------|
+| 5,000 | sorted_list | binary_search_recursive | beginning | 500 | 0.001726 | 5 |
+| 5,000 | sorted_list | binary_search_recursive | middle | 500 | 0.001722 | 5 |
+| 5,000 | sorted_list | binary_search_recursive | end | 500 | 0.001515 | 5 |
+| 5,000 | sorted_list | binary_search_recursive | nonexistent | 500 | 0.002112 | 5 |
+
+##### Binary Search Tree - bst search
+| Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
+|--------------|---------------------|------------------|------------------|----------|--------------|----------|
+| 5,000 | binary_search_tree | bst_search | beginning | 500 | 0.001381 | 5 |
+| 5,000 | binary_search_tree | bst_search | middle | 500 | 0.001367 | 5 |
+| 5,000 | binary_search_tree | bst_search | end | 500 | 0.001326 | 5 |
+| 5,000 | binary_search_tree | bst_search | nonexistent | 500 | 0.001766 | 5 |
+
 ### Results
+All-in-all our results show that there are increased search times for Linear Search, while Binary Search and BST search times show relatively stable performance. This reinforces the difference in time complexity: linear search is O(n), and binary search is O(log n).
+
+Binary Search (iterative, recursive, or BST) outperforms Linear Search in terms of time complexity. Linear search takes longer, especially as the dataset grows, whereas binary search and BST search times remain relatively low and constant.
+
+#### Effect of Dataset size on search time
+
+Linear Search - Both sorted and unsorted lists experience a significant increase in search time as the dataset size increases, especially for the larger dataset (5000 entries). This is expected as linear search examines each element one by one, which results in a time complexity of O(n).
+
+Binary Search - Both iterative and recursive as well as Binary Search Tree searches have much lower average time even when the dataset size increases. This is because these algorithms benefit from the time complexity of O(log n), which makes them much more efficient than linear search as the dataset size increases.
+
+#### Effect of target position on search time
+
+Linear Search - On both unsorted and sorted lists, the target position significantly influences the search time. This behavior aligns with what is expected for linear search where searching for a target at the "end" or "nonexistent" positions takes much longer, as more elements need to be examined or the search needs to traverse the entire list.
+
+Binary Search - Iterative and Recursive on sorted lists experience nearly constant search times for "beginning," "middle," and "end" target positions. This is because binary search always halves the search space with each step, so the position of the target within the list does not substantially affect the search time. Even for a "nonexistent" target, binary search only needs to perform a few comparisons, resulting in a very small increase in time.
+
 # Conclusion
 
 # TODO

From ea5ae4671599353c7bd0ad705b2907edef7fd0ef Mon Sep 17 00:00:00 2001
From: hemanialaparthi
Date: Wed, 23 Apr 2025 02:10:56 -0400
Subject: [PATCH 16/19] feat(index.qmd): complete the future work, results, and movework into conclusion.

---
 .../spring2025/weeksixteen/teamtwo/index.qmd | 42 +++++++++++--------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
index ae784c7..b21ad25 100644
--- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd
+++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
@@ -151,6 +151,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 #### Data Set Size and Searches Comparison (Random Target)
 
 ##### Un-sorted list - Linear search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|-----------------|------------------|------------------|----------|--------------|----------|
 | 1,000 | unsorted_list | linear_search | random | 100 | 0.004141 | 5 |
@@ -158,6 +159,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | unsorted_list | linear_search | random | 100 | 0.010662 | 5 |
 
 ##### Sorted list - Linear search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|------------------|------------------|----------|--------------|----------|
 | 1,000 | sorted_list | linear_search | random | 500 | 0.136840 | 5 |
@@ -165,6 +167,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | sorted_list | linear_search | random | 500 | 0.144115 | 5 |
 
 ##### Sorted list - Binary search (Iterative)
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|--------------------------|------------------|----------|--------------|----------|
 | 1,000 | sorted_list | binary_search_iterative | random | 100 | 0.000084 | 5 |
@@ -172,23 +175,25 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | sorted_list | binary_search_iterative | random | 100 | 0.000077 | 5 |
 
 ##### Sorted list - Binary search (Recursive)
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|--------------------------|------------------|----------|--------------|----------|
 | 1,000 | sorted_list | binary_search_recursive | random | 100 | 0.000095 | 5 |
 | 1,000 | sorted_list | binary_search_recursive | random | 500 | 0.000478 | 5 |
 | 5,000 | sorted_list | binary_search_recursive | random | 100 | 0.000533 | 5 |
 
-##### Binary Sesrch Tree - bst search
+##### Binary Search Tree - BST search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|---------------------|------------------|------------------|----------|--------------|----------|
 | 1,000 | binary_search_tree | bst_search | random | 100 | 0.000063 | 5 |
 | 1,000 | binary_search_tree | bst_search | random | 500 | 0.000311 | 5 |
 | 5,000 | binary_search_tree | bst_search | random | 100 | 0.000097 | 5 |
 
-
 #### Target Position Comparison (Fixed Dataset Size = 5,000)
 
 ##### Un-sorted list - Linear search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|-----------------|------------------|------------------|----------|--------------|----------|
 | 5,000 | unsorted_list | linear_search | beginning | 500 | 0.012992 | 5 |
@@ -197,6 +202,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | unsorted_list | linear_search | nonexistent | 500 | 0.287232 | 5 |
 
 ##### Sorted list - Linear search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|------------------|------------------|----------|--------------|----------|
 | 5,000 | sorted_list | linear_search | beginning | 500 | 0.013443 | 5 |
@@ -205,6 +211,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | sorted_list | linear_search | nonexistent | 500 | 0.279474 | 5 |
 
 ##### Sorted list - Binary search (Iterative)
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|--------------------------|------------------|----------|--------------|----------|
 | 5,000 | sorted_list | binary_search_iterative | beginning | 500 | 0.000368 | 5 |
@@ -213,6 +220,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | sorted_list | binary_search_iterative | nonexistent | 500 | 0.000436 | 5 |
 
 ##### Sorted list - Binary search (Recursive)
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|----------------|--------------------------|------------------|----------|--------------|----------|
 | 5,000 | sorted_list | binary_search_recursive | beginning | 500 | 0.000533 | 5 |
@@ -220,7 +228,8 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 | 5,000 | sorted_list | binary_search_recursive | end | 500 | 0.000532 | 5 |
 | 5,000 | sorted_list | binary_search_recursive | nonexistent | 500 | 0.000613 | 5 |
 
-##### Binary Sesrch Tree - bst search
+##### Binary Search Tree - BST search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|---------------------|------------------|------------------|----------|--------------|----------|
 | 5,000 | binary_search_tree | bst_search | beginning | 500 | 0.000467 | 5 |
@@ -233,6 +242,7 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 #### Data Set Size and Searches Comparison (Random Target)
 
 ##### Un-sorted list - Linear search
+
 | Dataset Size | Data Structure | Search Algorithm | Target Position | Searches | Avg Time (s) | Doubling |
 |--------------|-----------------|------------------|------------------|----------|--------------|----------|
 | 1,000 | unsorted_list | linear_search | random | 100 | 0.010524 | 5 |
@@ -311,25 +321,17 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 
 
 ### Results
-All-in-all our results show that there are increased search times for Linear Search, while Binary Search and BST search times show relatively stable performance. This reinforces the difference in time complexity: linear search is O(n), and binary search is O(log n).
-
-Binary Search (iterative, recursive, or BST) outperforms Linear Search in terms of time complexity. Linear search takes longer, especially as the dataset grows, whereas binary search and BST search times remain relatively low and constant.
-
-#### Effect of Dataset size on search time
-
-Linear Search - Both sorted and unsorted lists experience a significant increase in search time as the dataset size increases, especially for the larger dataset (5000 entries). This is expected as linear search examines each element one by one, which results in a time complexity of O(n).
+Our benchmarking reveals dramatic efficiency differences between search algorithms across varying dataset sizes and target positions. Linear search exhibits clear `O(n)` behavior, with search times increasing proportionally with dataset size (`1,000` to `5,000` elements) and strongly influenced by target position (up to `22×` slower for `nonexistent` targets vs. `beginning`). In contrast, `binary search` algorithms (`iterative`, `recursive`) and `BST` demonstrate consistent `O(log n)` efficiency, outperforming `linear search` by `100-1000×` (`0.0004`-`0.002s` vs. `0.05`-`0.28s` for `5,000` elements), with performance remaining stable regardless of target position. While `iterative binary search` slightly outperforms its `recursive` counterpart, all logarithmic algorithms maintain near-constant time even as datasets grow.
 
-Binary Search - Both iterative and recursive as well as Binary Search Tree searches have much lower average time even when the dataset size increases. This is because these algorithms benefit from the time complexity of O(log n), which makes them much more efficient than linear search as the dataset size increases.
-
-#### Effect of target position on search time
+# Conclusion
 
-Linear Search - On both unsorted and sorted lists, the target position significantly influences the search time. This behavior aligns with what is expected for linear search where searching for a target at the "end" or "nonexistent" positions takes much longer, as more elements need to be examined or the search needs to traverse the entire list.
+All-in-all our results show that there are increased search times for `Linear Search`, while `Binary Search` and BST search times show relatively stable performance. This reinforces the difference in time complexity: `linear search` is `O(n)`, and `binary search` is `O(log n)`.
 
-Binary Search - Iterative and Recursive on sorted lists experience nearly constant search times for "beginning," "middle," and "end" target positions. This is because binary search always halves the search space with each step, so the position of the target within the list does not substantially affect the search time. Even for a "nonexistent" target, binary search only needs to perform a few comparisons, resulting in a very small increase in time.
+Our experiments demonstrate that `binary search` algorithms (`iterative`, `recursive`) and BST consistently outperform `linear search` and it takes longer, especially as the dataset grows, whereas` binary search` and `BST` search times remain relatively low and constant.
 
-# Conclusion
+Both `sorted` and `unsorted` lists experience a significant increase in search time as the dataset size increases, especially for the larger dataset (`5000` entries). This is expected as `linear search` examines each element one by one, which results in a time complexity of` O(n)`. In contrast, `binary search` and `BST` searches have much lower average time even when the dataset size increases, benefiting from `O(log n)` time complexity.
 
-# TODO
+Target position significantly influences `linear search`performance, with searching for targets at the `end` or `nonexistent` positions taking much longer (up to `22×` slower for `nonexistent` vs. `beginning` targets). This occurs because more elements need to be examined or the search needs to traverse the entire list. `Binary search` algorithms, however, experience nearly constant search times for all target positions because they always halve the search space with each step. Even for `nonexistent` targets, `binary search` only needs to perform a few additional comparisons, resulting in minimal performance differences.
 
 # Future Work
 
@@ -338,6 +340,12 @@ Several areas for future investigation could provide additional insights:
 - **Floating-point comparison overhead might affect the relative performance advantages**
   - Decimal comparisons introduce additional computational overhead and potential precision issues that could affect performance differently across algorithms. This would be particularly relevant for scientific computing and financial applications where decimal data is common.

+- **Hybrid Approaches**
+  - Evaluate potential performance benefits of algorithms that switch strategies based on dataset size (e.g., linear search for small segments, binary search for larger ones)
+
+- **Unbalanced BST Performance**
+  - Examine how performance degrades when BSTs become unbalanced, and compare with tree-balancing algorithms
+
 # References
 
 - [Linear vs Binary Search](https://www.geeksforgeeks.org/linear-search-vs-binary-search/)

From d3c34eb591706b6e62e3fa3224a9cb4276a52fb3 Mon Sep 17 00:00:00 2001
From: Bennett03
Date: Wed, 23 Apr 2025 02:33:25 -0400
Subject: [PATCH 17/19] Updated Binary Search Section for both Iterative & Recursive

---
 .../spring2025/weeksixteen/teamtwo/index.qmd | 60 ++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
index b21ad25..467c20a 100644
--- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd
+++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
@@ -70,7 +70,65 @@ Linear search sequentially checks each element until finding the target or reach
 
 ## Binary Search
 
-TODO
+```cmd
+def binary_search_iterative(dataset: List[Any], target: Any) -> Optional[int]:
+    """Perform an iterative binary search on the dataset.
+
+    Note: Dataset must be sorted for binary search to work correctly.
+
+    Args:
+        dataset: Sorted list to search through
+        target: Element to search for
+
+    Returns:
+        int: Index of the target element, or None if not found
+    """
+    left, right = 0, len(dataset) - 1
+    while left <= right:
+        mid = (left + right) // 2
+        if dataset[mid] == target:
+            return mid
+        elif dataset[mid] < target:
+            left = mid + 1
+        else:
+            right = mid - 1
+    return None
+```
+
+Binary search iteratively divides a sorted dataset in half, comparing the middle element to the target. If the target is smaller, it searches the left half; if larger, it searches the right. This process continues until the target is found or the search space is empty.
+
+```cmd
+def binary_search_recursive(
+    dataset: List[Any], target: Any, left: int = 0, right: Optional[int] = None
+) -> Optional[int]:
+    """Perform a recursive binary search on the dataset.
+
+    Note: Dataset must be sorted for binary search to work correctly.
+
+    Args:
+        dataset: Sorted list to search through
+        target: Element to search for
+        left: Left boundary index
+        right: Right boundary index
+
+    Returns:
+        int: Index of the target element, or None if not found
+    """
+    if right is None:
+        right = len(dataset) - 1
+    if left > right:
+        return None
+
+    mid = (left + right) // 2
+    if dataset[mid] == target:
+        return mid
+    elif dataset[mid] < target:
+        return binary_search_recursive(dataset, target, mid + 1, right)
+    else:
+        return binary_search_recursive(dataset, target, left, mid - 1)
+```
+
+Recursive binary search repeatedly divides a sorted dataset in half by calling itself with updated boundaries (left and right). It compares the middle element to the target, narrowing the search to the left or right half until the target is found or the search space is empty.
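
To make the halving behavior concrete, here is a small sketch (not part of `lvb`) that counts how many midpoints the iterative version inspects for targets at different positions. The helper name `binary_search_steps`, the 5,000-element list of consecutive integers, and the chosen targets are assumptions made purely for illustration.

```python
import math
from typing import Any, List, Optional, Tuple


def binary_search_steps(dataset: List[Any], target: Any) -> Tuple[Optional[int], int]:
    """Iterative binary search that also reports how many midpoints it inspected.

    Hypothetical helper for illustration; the dataset must already be sorted.
    """
    left, right = 0, len(dataset) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if dataset[mid] == target:
            return mid, steps
        if dataset[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return None, steps


# Assumed sample data: 5,000 consecutive integers, so index == value.
data = list(range(5000))
# Upper bound on inspected midpoints for a sorted list of this size.
worst_case = math.floor(math.log2(len(data))) + 1

targets = [("beginning", 0), ("middle", 2499), ("end", 4999), ("nonexistent", -1)]
results = {name: binary_search_steps(data, target) for name, target in targets}

for name, (index, steps) in results.items():
    print(f"{name:12s} index={index} steps={steps} (bound={worst_case})")
```

For a sorted list of `n` elements, the loop inspects at most `floor(log2(n)) + 1` midpoints, which is 13 when `n = 5,000`, regardless of where the target sits.
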
 
 ## Binary Search Tree
 

From e93faa2fde559ef17ecf40e2f0b7fbf63078797d Mon Sep 17 00:00:00 2001
From: Hemani Alaparthi <143897209+hemanialaparthi@users.noreply.github.com>
Date: Wed, 23 Apr 2025 10:28:11 -0400
Subject: [PATCH 18/19] feat(index.qmd): fix the conclusion and grammatical issues

---
 .../spring2025/weeksixteen/teamtwo/index.qmd | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
index 467c20a..e065910 100644
--- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd
+++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
@@ -35,7 +35,7 @@ Our benchmarking provides empirical insights to guide algorithm selection in pro
 
 # Method
 
-For this experiment, we developed a benchmarking tool that allows for systematic comparison between search algorithms across different data structures. The tool measures execution time while controlling for:
+For this experiment, we developed a benchmarking tool called [lvb](https://github.com/hemanialaparthi/lvb) that allows for systematic comparison between search algorithms across different data structures. The tool measures execution time while controlling for:
 
 1. Data structure type (`unsorted list`, `sorted list` and `binary search tree`)
 2. Search algorithm (`linear search` vs. `binary search` vs. `BST search`)
@@ -191,7 +191,8 @@ The Binary Search Tree, is such an efficient approach for searching projects, wi
     5    15
    / \   / \
   3   7 12  18
-the smaller numbers go to the left, and the bigger numbers go to the right. It is very efficient because you do not necessarily go through the majority of the values, depending on the value.
+
+The smaller numbers go to the left, and the bigger numbers go to the right. It is very efficient because you do not necessarily go through the majority of the values, depending on the value.
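
The left/right rule can be sketched in a few lines. The `Node`, `insert`, and `search_path` names below are hypothetical and the tree is not self-balancing; the root value `10` is also an assumption, chosen so that the resulting tree has `5` and `15` as children as in the diagram.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical BST node used only for this illustration."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], value: int) -> Node:
    """Insert without rebalancing: smaller values go left, larger go right."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root  # equal values are ignored


def search_path(root: Optional[Node], target: int) -> List[int]:
    """Return the values visited while looking for target."""
    path: List[int] = []
    node = root
    while node is not None:
        path.append(node.value)
        if target == node.value:
            break
        # Step left for smaller targets, right for larger ones.
        node = node.left if target < node.value else node.right
    return path


# Insertion order chosen so the tree matches the diagram (root 10 is assumed).
root: Optional[Node] = None
for value in [10, 5, 15, 3, 7, 12, 18]:
    root = insert(root, value)

print(search_path(root, 12))  # visits 10, then 15, then 12
print(search_path(root, 8))   # visits 10, 5, 7 and then runs out of nodes
```

Each search follows a single root-to-leaf path, so this seven-node tree never visits more than three nodes.
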
 
 # Data
 
@@ -203,7 +204,6 @@ We conducted experiments with datasets of `1,000` and `5,000` elements, using bo
 
 # Data Tables
 
-
 ### MacOS
 
 #### Data Set Size and Searches Comparison (Random Target)
@@ -383,13 +383,14 @@ Our benchmarking reveals dramatic efficiency differences between search algorith
 
 # Conclusion
 
-All-in-all our results show that there are increased search times for `Linear Search`, while `Binary Search` and BST search times show relatively stable performance. This reinforces the difference in time complexity: `linear search` is `O(n)`, and `binary search` is `O(log n)`.
+All-in-all our results show that there are increased search times for linear search, while binary search and BST search times show relatively stable performance. This reinforces the difference in time complexity: linear search is `O(n)`, and binary search is `O(log n)`.
+Our experiments demonstrate that binary search algorithms (`iterative`, `recursive`) and BST consistently outperform linear search. Linear search takes longer, especially as the dataset grows, whereas binary search and BST search times remain relatively low and constant.
 
-Our experiments demonstrate that `binary search` algorithms (`iterative`, `recursive`) and BST consistently outperform `linear search` and it takes longer, especially as the dataset grows, whereas` binary search` and `BST` search times remain relatively low and constant.
+Both `sorted` and `unsorted` lists experience a significant increase in search time as the dataset size increases, especially for the larger dataset (5000 entries). This is expected as linear search examines each element one by one, which results in a time complexity of `O(n)`. In contrast, binary search and BST searches have much lower average time even when the dataset size increases, benefiting from `O(log n)` time complexity.
 
-Both `sorted` and `unsorted` lists experience a significant increase in search time as the dataset size increases, especially for the larger dataset (`5000` entries). This is expected as `linear search` examines each element one by one, which results in a time complexity of` O(n)`. In contrast, `binary search` and `BST` searches have much lower average time even when the dataset size increases, benefiting from `O(log n)` time complexity.
+Target position significantly influences linear search performance, with searching for targets at the end or nonexistent positions taking much longer (up to `22×` slower for `nonexistent` vs. `beginning` targets). This occurs because more elements need to be examined or the search needs to traverse the entire list. Binary search algorithms, however, experience nearly constant search times for all target positions because they always halve the search space with each step. Even for `nonexistent` targets, binary search only needs to perform a few additional comparisons, resulting in minimal performance differences.
 
-Target position significantly influences `linear search`performance, with searching for targets at the `end` or `nonexistent` positions taking much longer (up to `22×` slower for `nonexistent` vs. `beginning` targets). This occurs because more elements need to be examined or the search needs to traverse the entire list. `Binary search` algorithms, however, experience nearly constant search times for all target positions because they always halve the search space with each step. Even for `nonexistent` targets, `binary search` only needs to perform a few additional comparisons, resulting in minimal performance differences.
+It's important to note that while binary search and BSTs offer superior search performance, they come with preprocessing costs. Binary search requires a sorted array, and BSTs need to be constructed. These upfront costs should be considered in scenarios where data frequently changes.
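
One way to reason about these upfront costs is a rough break-even estimate. The sketch below is a back-of-the-envelope model rather than measured data: it assumes sorting costs about `n * log2(n)` comparisons, a binary search about `log2(n)`, and a linear search about `n / 2` on average. The function name `break_even_searches` is hypothetical.

```python
import math


def break_even_searches(n: int) -> int:
    """Smallest number of searches at which sorting first becomes cheaper.

    Assumed cost model (comparison counts only, not timings):
      - sorting once costs about n * log2(n)
      - each binary search costs about log2(n)
      - each linear search costs about n / 2 on average
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    log_n = math.log2(n)
    saving_per_search = n / 2 - log_n
    if saving_per_search <= 0:
        raise ValueError("binary search never pays off under this model")
    # Smallest integer k with n*log2(n) + k*log2(n) < k*(n/2).
    return math.floor(n * log_n / saving_per_search) + 1


for size in (1000, 5000):
    print(size, break_even_searches(size))
```

Under these assumptions a 5,000-element list would need roughly 25 searches before the one-time sort is repaid. Real break-even points depend on the sort implementation, the data, and how often the list changes, so they should be measured rather than taken from this model.
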
 
 # Future Work
 
@@ -400,9 +401,11 @@ Several areas for future investigation could provide additional insights:
 
 - **Hybrid Approaches**
   - Evaluate potential performance benefits of algorithms that switch strategies based on dataset size (e.g., linear search for small segments, binary search for larger ones)
+  - Determine specific thresholds at which switching between algorithms becomes beneficial, possibly using metrics like dataset size, expected search frequency, and data volatility
 
 - **Unbalanced BST Performance**
-  - Examine how performance degrades when BSTs become unbalanced, and compare with tree-balancing algorithms
+  - Examine how performance degrades when BSTs become unbalanced, and compare with tree-balancing algorithms such as AVL trees and Red-Black trees
+  - Measure the overhead cost of maintaining balance in self-balancing trees and compare against the search performance benefits
 
 # References
 

From e5af1b75360745a8898ffd76fcc33540c6144d15 Mon Sep 17 00:00:00 2001
From: Hemani Alaparthi <143897209+hemanialaparthi@users.noreply.github.com>
Date: Wed, 23 Apr 2025 10:30:00 -0400
Subject: [PATCH 19/19] fix(index.qmd): fix the name of the links in the references

---
 allhands/spring2025/weeksixteen/teamtwo/index.qmd | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/allhands/spring2025/weeksixteen/teamtwo/index.qmd b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
index e065910..203844f 100644
--- a/allhands/spring2025/weeksixteen/teamtwo/index.qmd
+++ b/allhands/spring2025/weeksixteen/teamtwo/index.qmd
@@ -410,5 +410,5 @@ Several areas for future investigation could provide additional insights:
 # References
 
 - [Linear vs Binary Search](https://www.geeksforgeeks.org/linear-search-vs-binary-search/)
-- [Binary Search Tree](https://www.geeksforgeeks.org/introduction-to-binary-search-tree/)
-- [Binary Search Tree](https://www.geeksforgeeks.org/binary-search-tree-set-1-search-and-insertion/)
+- [Intro to Binary Search Tree](https://www.geeksforgeeks.org/introduction-to-binary-search-tree/)
+- [Searching in Binary Search Tree](https://www.geeksforgeeks.org/binary-search-tree-set-1-search-and-insertion/)