@dev-habib-nuhu dev-habib-nuhu commented Oct 30, 2025

User description

Description

What

  • Added configurable performance optimizations (page size, concurrent requests, rate limiting, project filtering) to the Bitbucket Server integration.

Why

  • The Bitbucket Server integration takes a long time to complete a resync.

How

  • Implemented semaphore-controlled parallel PR fetching, configurable pagination, flexible rate limits, and regex/suffix project filtering.
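
For readers unfamiliar with the pattern, semaphore-controlled parallel fetching can be sketched roughly as below. This is a minimal illustration, not the code from this PR: `FakePullRequestApi`, `fetch_all`, and the repository names are hypothetical, and the network call is simulated with `asyncio.sleep`.

```python
import asyncio


class FakePullRequestApi:
    """Hypothetical stand-in for the Bitbucket Server HTTP client."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak_in_flight = 0

    async def get_pull_requests(self, repo: str) -> list[str]:
        # Track how many calls overlap so the concurrency bound is observable.
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        try:
            await asyncio.sleep(0.01)  # simulate network latency
            return [f"{repo}/pr-1", f"{repo}/pr-2"]
        finally:
            self.in_flight -= 1


async def fetch_all(
    api: FakePullRequestApi, repos: list[str], max_concurrent_requests: int = 10
) -> list[str]:
    # At most max_concurrent_requests coroutines hold the semaphore at once.
    semaphore = asyncio.BoundedSemaphore(max_concurrent_requests)

    async def guarded(repo: str) -> list[str]:
        async with semaphore:
            return await api.get_pull_requests(repo)

    # gather() returns results in the same order as the input repos.
    batches = await asyncio.gather(*(guarded(repo) for repo in repos))
    return [pr for batch in batches for pr in batch]
```

Calling `asyncio.run(fetch_all(FakePullRequestApi(), repos, max_concurrent_requests=2))` schedules every repository at once but lets only two requests run at any moment.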

Type of change

  • New feature (non-breaking change which adds functionality)

All tests should be run against the Port production environment (using a testing org).

Core testing checklist

  • Integration able to create all default resources from scratch
  • Resync finishes successfully
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Scheduled resync able to abort existing resync and start a new one
  • Tested with at least 2 integrations from scratch
  • Tested with Kafka and Polling event listeners
  • Tested deletion of entities that don't pass the selector

Integration testing checklist

  • Integration able to create all default resources from scratch
  • Completed a full resync from a freshly installed integration and it completed successfully
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Resync finishes successfully
  • If new resource kind is added or updated in the integration, add example raw data, mapping and expected result to the examples folder in the integration directory.
  • If resource kind is updated, run the integration with the example data and check if the expected result is achieved
  • If new resource kind is added or updated, validate that live-events for that resource are working as expected
  • Docs PR link here

Preflight checklist

  • Handled rate limiting
  • Handled pagination
  • Implemented the code in async
  • Support Multi account

Screenshots

Include screenshots from your environment showing how the resources of the integration will look.

API Documentation

Provide links to the API documentation used for this integration.


PR Type

Enhancement


Description

  • Add 6 configurable performance optimization parameters to Bitbucket Server integration

  • Implement parallel PR fetching with semaphore-controlled concurrency

  • Add project filtering by regex patterns and suffix matching

  • Make pagination page size and rate limiting configurable
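
As a rough illustration of the configurable page size, the sketch below pages through a fake endpoint using the `values`, `isLastPage`, and `nextPageStart` fields of Bitbucket Server's paged responses. The names `paginate`, `make_fake_api`, and `collect` are hypothetical and do not come from this PR; the real `get_paginated_resource()` may differ.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

Page = dict[str, Any]
FetchPage = Callable[[int, int], Awaitable[Page]]


async def paginate(fetch_page: FetchPage, page_size: int = 25) -> AsyncIterator[list[Any]]:
    """Yield one batch per page until the server reports the last page."""
    start = 0
    while True:
        page = await fetch_page(start, page_size)
        values = page.get("values", [])
        if values:
            yield values
        if page.get("isLastPage", True):
            break
        start = page["nextPageStart"]


def make_fake_api(items: list[int]) -> FetchPage:
    # Hypothetical in-memory stand-in for a paged REST endpoint.
    async def fetch_page(start: int, limit: int) -> Page:
        chunk = items[start:start + limit]
        end = start + len(chunk)
        last = end >= len(items)
        page: Page = {"values": chunk, "isLastPage": last}
        if not last:
            page["nextPageStart"] = end
        return page

    return fetch_page


async def collect(page_size: int) -> list[list[int]]:
    fetch_page = make_fake_api(list(range(7)))
    return [batch async for batch in paginate(fetch_page, page_size)]
```

A larger `page_size` means fewer round trips for the same data, which is the trade-off the new setting exposes.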


Diagram Walkthrough

flowchart LR
  Config["Configuration Parameters"]
  Config -->|rate_limit, rate_limit_window| RateLimiter["AsyncLimiter"]
  Config -->|page_size| Pagination["Paginated Requests"]
  Config -->|max_concurrent_requests| Semaphore["BoundedSemaphore"]
  Config -->|projects_filter_regex, suffix| ProjectFilter["Project Filtering"]
  Semaphore -->|controls| PRFetching["Parallel PR Fetching"]
  ProjectFilter -->|filters| Projects["Project Batches"]
  Pagination -->|fetches| Resources["API Resources"]
  RateLimiter -->|throttles| Resources

File Walkthrough

Relevant files
Enhancement
client.py
Add configurable performance optimizations and project filtering

integrations/bitbucket-server/client.py

  • Added 6 new optional configuration parameters (rate_limit,
    rate_limit_window, page_size, max_concurrent_requests,
    projects_filter_regex, projects_filter_suffix)
  • Implemented _should_include_project() method for regex and
    suffix-based project filtering
  • Added asyncio.BoundedSemaphore for controlling concurrent PR requests
  • Modified get_paginated_resource() to use configurable page size
  • Updated get_projects() to apply regex/suffix filters to project
    batches
  • Modified get_pull_requests() to use semaphore-controlled parallel
    fetching with semaphore_async_iterator
+90/-14 
utils.py
Update client initialization with new config parameters   

integrations/bitbucket-server/utils.py

  • Updated initialize_client() to extract and pass all 6 new
    configuration parameters
  • Added default value handling for rate limiting, pagination, and
    concurrency settings
  • Refactored config access to use local variable for cleaner code
+36/-6   
webhook_client.py
Update webhook client initialization with config parameters

integrations/bitbucket-server/webhook_processors/webhook_client.py

  • Updated initialize_client() function to extract and pass all 6 new
    configuration parameters
  • Added default value handling matching the main client initialization
  • Refactored config access pattern for consistency
+25/-5   
Tests
test_client.py
Add comprehensive tests for performance optimizations       

integrations/bitbucket-server/tests/test_client.py

  • Added 13 new test cases covering configurable page size, rate limits,
    and concurrency
  • Added tests for project filtering with regex patterns, suffix
    patterns, and combined filters
  • Added tests verifying default values and disabled filtering behavior
  • Added test for semaphore-based concurrency control in PR fetching
+212/-2 
Configuration changes
spec.yaml
Add configuration spec for performance optimization parameters

integrations/bitbucket-server/.port/spec.yaml

  • Added 6 new optional configuration fields to spec
  • Defined bitbucketRateLimit (default: 1000) and
    bitbucketRateLimitWindow (default: 3600)
  • Defined bitbucketPageSize (default: 25) and
    bitbucketMaxConcurrentRequests (default: 10)
  • Added bitbucketProjectsFilterRegex and bitbucketProjectsFilterSuffix
    for project filtering
+28/-0   
pyproject.toml
Bump version to 0.1.63-beta                                                           

integrations/bitbucket-server/pyproject.toml

  • Bumped version from 0.1.62-beta to 0.1.63-beta
+1/-1     
Documentation
CHANGELOG.md
Update changelog with version 0.1.63-beta release               

integrations/bitbucket-server/CHANGELOG.md

  • Added version 0.1.63-beta release notes
  • Documented bug fix for long resync completion times
+7/-0     

Test added 3 commits October 30, 2025 22:53
Add configurable page size, concurrent PR fetching, flexible rate limiting,
and project filtering to improve sync performance for large deployments.

- Add 6 new optional config parameters
- Implement parallel PR fetching with semaphore control
- Add project filtering with regex/suffix patterns

qodo-code-review bot commented Oct 30, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified: No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Action logging: New functionality (project filtering, pagination changes, concurrency control) does not
add or reference audit logging for critical actions, but it may be handled elsewhere in
the codebase.

Referred Code
    f"Getting projects with filter: {projects_filter}, "
    f"regex: {self.projects_filter_regex}, suffix: {self.projects_filter_suffix}"
)
if projects_filter:
    filtered_projects = await self._get_projects_with_filter(projects_filter)
    # Apply regex/suffix filters
    final_projects = [
        p for p in filtered_projects if self._should_include_project(p["key"])
    ]
    if final_projects:
        yield final_projects
else:
    async for project_batch in self._get_all_projects():
        # Apply regex/suffix filters to each batch
        filtered_batch = [
            p for p in project_batch if self._should_include_project(p["key"])
        ]
        if filtered_batch:
            yield filtered_batch
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Config casting: Direct int casting of config values lacks validation and fallback for invalid inputs,
which may raise runtime errors if misconfigured.

Referred Code
rate_limit = int(config.get("bitbucket_rate_limit", DEFAULT_BITBUCKET_RATE_LIMIT))
rate_limit_window = int(
    config.get("bitbucket_rate_limit_window", DEFAULT_BITBUCKET_RATE_LIMIT_WINDOW)
)

# Extract pagination and concurrency configuration
page_size = int(config.get("bitbucket_page_size", DEFAULT_PAGE_SIZE))
max_concurrent_requests = int(
    config.get("bitbucket_max_concurrent_requests", DEFAULT_MAX_CONCURRENT_REQUESTS)
)
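
One possible way to address the casting concern is a small helper that falls back to the default when a value is missing, non-numeric, or not positive. This is only a sketch: `get_positive_int` is a hypothetical name, and silently falling back (rather than raising) is an assumption about the desired behavior.

```python
from typing import Any, Mapping


def get_positive_int(config: Mapping[str, Any], key: str, default: int) -> int:
    """Read an integer setting, falling back to default on bad input."""
    raw = config.get(key, default)
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # None, "abc", "50.5" and similar values end up here.
        return default
    # Zero or negative page sizes and concurrency limits are not usable.
    return value if value > 0 else default
```

For example, `get_positive_int(config, "bitbucket_page_size", 25)` would replace the bare `int(config.get(...))` call. A stricter variant could raise a descriptive error instead of falling back.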
Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Log data exposure: Info log includes raw filter patterns and project filters which might reveal configuration
details; verify this is acceptable for log level and does not include sensitive secrets.

Referred Code
logger.info(
    f"Getting projects with filter: {projects_filter}, "
    f"regex: {self.projects_filter_regex}, suffix: {self.projects_filter_suffix}"
)
if projects_filter:
Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Regex safety: User-provided regex and suffix filters are compiled/used without validation or safeguards
(e.g., catastrophic backtracking risk and length limits), which could impact performance
or stability.

Referred Code
    self.projects_filter_regex = (
        re.compile(projects_filter_regex) if projects_filter_regex else None
    )
    self.projects_filter_suffix = projects_filter_suffix

    # Despite this, being the rate limits, we do not reduce to the lowest common factor because we want to allow as much
    # concurrency as possible. This is because we expect most users to have resources
    # synced under one hour.
    self.rate_limiter = AsyncLimiter(rate_limit, rate_limit_window)

    # Initialize semaphore for controlling concurrent PR requests
    self.pr_semaphore = asyncio.BoundedSemaphore(max_concurrent_requests)

async def _send_api_request(
    self,
    method: str,
    path: str,
    payload: Optional[dict[str, Any]] = None,
    params: Optional[dict[str, Any]] = None,
) -> dict[str, Any] | None:
    """


 ... (clipped 56 lines)
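
A basic guard around the regex compilation could look like the sketch below. `compile_project_filter` and `MAX_PATTERN_LENGTH` are hypothetical, and the length cap is an arbitrary illustrative value. Note that this catches malformed or oversized patterns only; it does not detect catastrophic backtracking.

```python
import re
from typing import Optional

MAX_PATTERN_LENGTH = 256  # arbitrary illustrative cap, not taken from the PR


def compile_project_filter(pattern: Optional[str]) -> Optional[re.Pattern[str]]:
    """Compile a user-supplied filter, failing early with a clear message."""
    if not pattern:
        return None
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError(
            f"Project filter regex is longer than {MAX_PATTERN_LENGTH} characters"
        )
    try:
        return re.compile(pattern)
    except re.error as exc:
        raise ValueError(f"Invalid project filter regex {pattern!r}: {exc}") from exc
```

Failing at client initialization surfaces a misconfigured filter before the resync starts, rather than midway through it.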
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label


qodo-code-review bot commented Oct 30, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
General
Eliminate duplicated client initialization logic
Suggestion Impact:The commit addressed the same concern as the suggestion by removing duplicated configuration extraction logic. Instead of reusing the base client initialization function, the commit simplified the code by removing unused project filtering configuration and using default constants for page_size and max_concurrent_requests. The approach differs from the suggestion but achieves a similar goal of reducing code duplication.

code diff:

-    # Extract pagination and concurrency configuration
-    page_size = int(config.get("bitbucket_page_size", 25))
-    max_concurrent_requests = int(config.get("bitbucket_max_concurrent_requests", 10))
-
-    # Extract project filtering configuration
-    projects_filter_regex = config.get("bitbucket_projects_filter_regex")
-    projects_filter_suffix = config.get("bitbucket_projects_filter_suffix")
+    # Webhook client does not use project filtering
+    project_filter_regex = None
 
     return BitbucketServerWebhookClient(
         username=config["bitbucket_username"],
@@ -352,8 +347,7 @@
         ),
         rate_limit=rate_limit,
         rate_limit_window=rate_limit_window,
-        page_size=page_size,
-        max_concurrent_requests=max_concurrent_requests,
-        projects_filter_regex=projects_filter_regex,
-        projects_filter_suffix=projects_filter_suffix,
+        page_size=DEFAULT_PAGE_SIZE,
+        max_concurrent_requests=DEFAULT_MAX_CONCURRENT_REQUESTS,
+        project_filter_regex=project_filter_regex,

Refactor initialize_client in webhook_client.py to eliminate duplicated code by
reusing the initialize_client function from utils.py for a single source of
truth.

integrations/bitbucket-server/webhook_processors/webhook_client.py [328-359]

+from utils import initialize_client as initialize_base_client
+
+
 def initialize_client() -> BitbucketServerWebhookClient:
-    config = ocean.integration_config
-
-    # Extract rate limiting configuration
-    rate_limit = int(config.get("bitbucket_rate_limit", 1000))
-    rate_limit_window = int(config.get("bitbucket_rate_limit_window", 3600))
-
-    # Extract pagination and concurrency configuration
-    page_size = int(config.get("bitbucket_page_size", 25))
-    max_concurrent_requests = int(config.get("bitbucket_max_concurrent_requests", 10))
-
-    # Extract project filtering configuration
-    projects_filter_regex = config.get("bitbucket_projects_filter_regex")
-    projects_filter_suffix = config.get("bitbucket_projects_filter_suffix")
-
+    base_client = initialize_base_client()
     return BitbucketServerWebhookClient(
-        username=config["bitbucket_username"],
-        password=config["bitbucket_password"],
-        base_url=config["bitbucket_base_url"],
-        webhook_secret=config.get("bitbucket_webhook_secret"),
-        app_host=ocean.app.base_url,
-        is_version_8_7_or_older=cast(
-            bool,
-            config.get("bitbucket_is_version8_point7_or_older"),
-        ),
-        rate_limit=rate_limit,
-        rate_limit_window=rate_limit_window,
-        page_size=page_size,
-        max_concurrent_requests=max_concurrent_requests,
-        projects_filter_regex=projects_filter_regex,
-        projects_filter_suffix=projects_filter_suffix,
+        username=base_client.username,
+        password=base_client.password,
+        base_url=base_client.base_url,
+        webhook_secret=base_client.webhook_secret,
+        app_host=base_client.app_host,
+        is_version_8_7_or_older=base_client.is_version_8_7_or_older,
+        rate_limit=base_client.rate_limiter.max_rate,
+        rate_limit_window=int(base_client.rate_limiter.time_period),
+        page_size=base_client.page_size,
+        max_concurrent_requests=base_client.max_concurrent_requests,
+        projects_filter_regex=base_client.projects_filter_regex.pattern if base_client.projects_filter_regex else None,
+        projects_filter_suffix=base_client.projects_filter_suffix,
     )

[Suggestion processed]

Suggestion importance[1-10]: 7

__

Why: The suggestion correctly identifies duplicated client initialization logic and proposes a valid refactoring to improve maintainability by reusing code, which is a significant improvement.

Medium
Refactor filtering logic for clarity
Suggestion Impact:The commit implemented a similar refactoring of the _should_include_project method, but took a different approach. Instead of using separate variables for regex and suffix checks, the commit removed the suffix filter entirely and simplified the logic to only check the regex filter. The final implementation uses a single boolean return statement similar to the suggestion's intent, but with a simpler structure due to having only one filter type.

code diff:

@@ -136,20 +133,11 @@
             True if the project should be included, False otherwise
         """
         # If no filters are set, include all projects
-        if not self.projects_filter_regex and not self.projects_filter_suffix:
+        if not self.project_filter_regex:
             return True
 
         # Check regex filter
-        if self.projects_filter_regex:
-            if not self.projects_filter_regex.match(project_key):
-                return False
-
-        # Check suffix filter
-        if self.projects_filter_suffix:
-            if not project_key.endswith(self.projects_filter_suffix):
-                return False
-
-        return True
+        return bool(self.project_filter_regex.match(project_key))

Refactor the _should_include_project method to evaluate filter conditions
separately before combining them, aiming for improved clarity.

integrations/bitbucket-server/client.py [128-152]

 def _should_include_project(self, project_key: str) -> bool:
     """
     Check if a project should be included based on filter patterns.
 
     Args:
         project_key: The project key to check
 
     Returns:
         True if the project should be included, False otherwise
     """
     # If no filters are set, include all projects
     if not self.projects_filter_regex and not self.projects_filter_suffix:
         return True
 
-    # Check regex filter
-    if self.projects_filter_regex:
-        if not self.projects_filter_regex.match(project_key):
-            return False
+    regex_passes = (
+        bool(self.projects_filter_regex.match(project_key))
+        if self.projects_filter_regex
+        else True
+    )
 
-    # Check suffix filter
-    if self.projects_filter_suffix:
-        if not project_key.endswith(self.projects_filter_suffix):
-            return False
+    suffix_passes = (
+        project_key.endswith(self.projects_filter_suffix)
+        if self.projects_filter_suffix
+        else True
+    )
 
-    return True
+    return regex_passes and suffix_passes

[Suggestion processed]

Suggestion importance[1-10]: 3

__

Why: The suggestion proposes a stylistic refactoring of the filtering logic which, while correct, offers only a marginal and subjective improvement in readability over the existing clear implementation.

Low

dev-habib-nuhu and others added 3 commits November 18, 2025 16:36
Co-authored-by: Michael Kofi Armah <mikeyarmah@gmail.com>
- Removed default rate limit constants and added validation to ensure both `rate_limit` and `rate_limit_window` are provided during `BitbucketClient` initialization.
- Updated `initialize_client` methods in `utils.py` and `webhook_client.py` to use configuration values for rate limits.
- Adjusted tests to reflect changes in client initialization and ensure proper handling of rate limits.
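
The validation described in this commit message might look roughly like the following. This is a sketch under assumptions: `validate_rate_limit` is a hypothetical helper, and the positivity check goes beyond what the commit message states.

```python
from typing import Optional


def validate_rate_limit(
    rate_limit: Optional[int], rate_limit_window: Optional[int]
) -> tuple[int, int]:
    """Ensure both rate limit settings are present before building a limiter."""
    if rate_limit is None or rate_limit_window is None:
        raise ValueError("Both rate_limit and rate_limit_window must be provided")
    # Assumption: non-positive values are also rejected.
    if rate_limit <= 0 or rate_limit_window <= 0:
        raise ValueError("rate_limit and rate_limit_window must be positive")
    return rate_limit, rate_limit_window
```

With the defaults removed, a check like this turns a missing setting into an immediate, explicit error at client construction.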
Comment on lines -90 to -92
## 0.1.71-beta (2025-11-17)


You might have unintentionally deleted this

type: boolean
description: Whether the Bitbucket Server version is 8.7 or older
required: false
- name: bitbucketRateLimit

Suggested change
-  - name: bitbucketRateLimit
+  - name: bitbucketRateLimitQuota

 )
 if projects_filter:
-    yield await self._get_projects_with_filter(projects_filter)
+    filtered_projects = await self._get_projects_with_filter(projects_filter)

Why do we have this and then the regex filter? What does this do that can't be achieved in the regular expression?
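
To make the reviewer's point concrete, a suffix check can be expressed as a regular expression, as in the sketch below. `suffix_to_regex` and the sample project keys are hypothetical. Because the client uses `re.match`, which anchors only at the start of the string, the pattern needs a leading `.*` and an end anchor.

```python
import re


def suffix_to_regex(suffix: str) -> re.Pattern[str]:
    # re.escape keeps characters such as "." or "+" literal;
    # \Z anchors at the very end of the string.
    return re.compile(r".*" + re.escape(suffix) + r"\Z", re.DOTALL)


keys = ["PLATFORM-PROD", "PLATFORM-DEV", "DATA-PROD"]
pattern = suffix_to_regex("-PROD")

by_regex = [key for key in keys if pattern.match(key)]
by_suffix = [key for key in keys if key.endswith("-PROD")]
```

Both lists contain the same keys, which suggests a single regex setting can cover the suffix use case.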

-            ocean.integration_config.get("bitbucket_is_version8_point7_or_older"),
+            config.get("bitbucket_is_version8_point7_or_older"),
         ),
+        rate_limit=int(config["bitbucket_rate_limit"]),

Suggested change
-        rate_limit=int(config["bitbucket_rate_limit"]),
+        rate_limit=int(config["bitbucket_rate_limit_quota"]),

-            ocean.integration_config.get("bitbucket_is_version8_point7_or_older"),
+            config.get("bitbucket_is_version8_point7_or_older"),
         ),
+        rate_limit=int(config["bitbucket_rate_limit"]),

Same as above
