@fivertran-karunveluru
Collaborator

AdMob Connector

Created: 2025-10-31

Business Owner: Marketing Analytics & Revenue Operations Team

Technical Owner: Data Engineering Team

Last Updated: 2025-10-31

Business Context

  • Data Source: Google AdMob API for mobile app advertising performance and revenue data
  • Business Criticality: High - supports revenue optimization, advertising performance analysis, and publisher monetization strategies
  • Data Consumers: Marketing teams, revenue operations, product managers, mobile app developers, executive leadership
  • Business SLAs: Data must be fresh within 4 hours for revenue reporting, 24 hours for historical analysis
  • Compliance Requirements: Privacy compliance for user data, revenue reporting accuracy for financial statements
  • Budget Constraints: AdMob API access is free, rate limits based on standard Google Cloud quotas

Technical Context

  • API Documentation: https://developers.google.com/admob/api
  • Authentication Method: OAuth2 with client credentials and refresh token
  • Rate Limits: Standard Google Cloud API quotas (varies by project), typically 1,000+ requests/hour
  • Data Volume:
    • Publisher Accounts: 1-10+ accounts per integration
    • Network Reports: 365+ days of daily reports per account
    • Ad Units: 10-1,000+ ad units per app
    • Apps: 1-100+ apps per publisher account
    • Countries: 50-200+ countries tracked per report
    • Daily Report Records: 100-50,000+ records per day per account
  • Data Velocity: Reports updated daily, accounts updated on configuration changes, real-time metrics available
  • Data Quality: Structured JSON with consistent schema, some fields may be null for incomplete data
  • Network Considerations: HTTPS only, RESTful API with Google Cloud infrastructure, global CDN

Operational Context

  • Deployment Environment: Development (sandbox), staging, and production environments
  • Monitoring Requirements: Alert on >2% error rate, >2 hour sync time, revenue data discrepancies
  • Maintenance Windows: Off-peak hours for non-critical updates, immediate deployment for revenue-critical fixes
  • Team Structure: Data Engineering team, Marketing Analytics, Revenue Operations, Mobile App Development
  • Escalation Path: Data Engineer → Team Lead → Marketing Director → CMO

API-Specific Details

  • Base Endpoint: https://admob.googleapis.com/v1
  • OAuth Token Endpoint: https://oauth2.googleapis.com/token
  • Authentication: Bearer token in Authorization header (OAuth2)
  • Pagination: Report-based streaming with row-based pagination
  • Date Format: ISO 8601 (e.g., 2024-01-15T10:30:00Z) and YYYY-MM-DD for report dates
  • Response Format: JSON with nested objects and arrays; monetary metric values (such as estimated earnings) are returned in micros (1/1,000,000 of a currency unit)
  • Key Endpoints:
    • /accounts - Publisher account information and settings
    • /accounts/{publisher_id}/networkReport:generate - Network performance reports
    • Future endpoints for mediation reports and app-level analytics

Data Schema Overview

  • accounts: Publisher account profile, currency settings, and reporting timezone
  • network_reports: Daily advertising performance metrics with dimensional breakdowns
    • Dimensions: Date, Ad Unit, App, Country, Platform, Ad Type
    • Metrics: Estimated Earnings, Ad Requests, Matched Requests, Show Rate, Impressions, Clicks

Data Replication Expectations

  • Initial Sync: Last 90 days of network reports by default (configurable up to 365 days)
  • Incremental Sync: Data since last successful sync timestamp using date-based cursors
  • Sync Frequency:
    • Production: Every 4 hours for network reports, daily for account updates
    • Development: Daily for all data types
  • Data Retention: 2+ years of historical report data for trend analysis
  • Backfill Capability: Full historical data available based on AdMob retention policies (typically 2 years)
  • Data Consistency: Daily updates with 4-hour maximum lag for operational reporting
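
The initial and incremental sync rules above can be sketched as a small date-window calculation. This is an illustration only: the state key `last_report_date` and the function name `compute_sync_window` are hypothetical, and re-fetching the cursor date itself is an assumption made because the most recent day's report may still change.

```python
from datetime import date, timedelta
from typing import Tuple

DEFAULT_INITIAL_SYNC_DAYS = 90
MAX_INITIAL_SYNC_DAYS = 365


def compute_sync_window(
    state: dict,
    today: date,
    initial_sync_days: int = DEFAULT_INITIAL_SYNC_DAYS,
) -> Tuple[date, date]:
    """Return the (start, end) report dates for the next sync."""
    # Clamp the configured look-back to the documented 1..365 day range.
    days = max(1, min(initial_sync_days, MAX_INITIAL_SYNC_DAYS))
    cursor = state.get("last_report_date")  # hypothetical state key
    if cursor:
        # Incremental sync: resume from the last synced report date.
        start = date.fromisoformat(cursor)
    else:
        # Initial sync: look back the configured number of days.
        start = today - timedelta(days=days)
    return min(start, today), today
```

With an empty state the window covers the last 90 days; with a saved cursor it starts at that date and ends today.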

Operational Requirements

  • Uptime SLA: 99.5% availability during business hours (revenue reporting critical)
  • Performance SLA:
    • Initial sync: <4 hours for 90 days of report data
    • Incremental sync: <30 minutes for daily updates
  • Error Handling:
    • Automatic retry with exponential backoff and jitter
    • Dead letter queue for failed report records
    • Alert on consecutive sync failures during reporting periods
  • Monitoring:
    • API response times and error rates
    • Report record count trends and anomaly detection
    • Revenue data completeness validation
    • OAuth token refresh success rates
  • Security:
    • OAuth tokens refreshed automatically (1-hour expiration)
    • Access logs maintained for 2 years (audit compliance)
    • Publisher account data handling per privacy regulations
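
The automatic token refresh with a 1-hour expiration could look roughly like the cache below. It is a sketch under assumptions: `fetch_token` is a hypothetical callable that posts `grant_type=refresh_token` to the OAuth token endpoint and returns the parsed JSON response, and the 60-second safety margin is an arbitrary choice.

```python
import time
from typing import Callable, Optional


class AccessTokenCache:
    """Cache an access token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], dict],
        clock: Callable[[], float] = time.monotonic,
        safety_margin_seconds: float = 60.0,
    ):
        self._fetch_token = fetch_token  # hypothetical token-endpoint call
        self._clock = clock
        self._margin = safety_margin_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid access token, refreshing it when close to expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            # Fall back to one hour if the response omits expires_in.
            self._expires_at = now + float(payload.get("expires_in", 3600))
        return self._token
```

Injecting the clock and the fetch function keeps the cache testable without network access, and the token itself is never written to logs.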

Rate Limiting Strategy

  • Standard Quotas: 1,000 requests/hour per OAuth client, 10,000 requests/day (quotas vary by project, so confirm the actual limits in the Google Cloud Console)
  • Quota Management: Implement exponential backoff with jitter (10-30% of wait time) for 429 responses to prevent thundering herd
  • Error Handling: 429 status code indicates rate limit exceeded; respect the Retry-After header when provided
  • Monitoring: Track rate limit utilization and plan for quota increases if needed
  • Retry Strategy: Default 3 retry attempts with exponential backoff, configurable per request
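
A minimal sketch of the delay calculation described above, assuming a 1-second base delay and a 60-second cap (neither value is stated in this PR). The function name `backoff_delay` is hypothetical. A numeric `Retry-After` value takes precedence; otherwise the wait doubles per attempt and 10-30% jitter is added.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[str] = None,
    base_seconds: float = 1.0,   # assumed base delay
    max_seconds: float = 60.0,   # assumed cap
    rng: Optional[random.Random] = None,
) -> float:
    """Return the seconds to wait before retry number `attempt` (0-based)."""
    rng = rng or random.Random()
    if retry_after is not None:
        try:
            # Honor a Retry-After header given in seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Header may be an HTTP date; fall back to backoff.
    wait = min(max_seconds, base_seconds * (2 ** attempt))
    # Add 10-30% of the wait as jitter to spread out concurrent retries.
    return wait + wait * rng.uniform(0.10, 0.30)
```

A retry loop would call this after each 429 response, sleep for the returned duration, and give up after `retry_attempts` tries.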

Data Quality Considerations

  • Required Fields: publisher_id, id (composite key), date, estimated_earnings
  • Optional Fields: ad_unit_name, app_name, country, platform, ad_type, matched_requests
  • Data Validation:
    • Publisher IDs must be valid AdMob account identifiers
    • Report IDs must be unique (composite: publisher_id + date + ad_unit_id)
    • Dates must be valid and within supported range
    • Earnings values must be non-negative and converted from micros (divided by 1,000,000)
    • Metric values must be numeric (integer or float)
  • Data Completeness:
    • Accounts: 100% have basic publisher information
    • Network Reports: 95%+ have complete metric data for active ad units
    • Historical Reports: 90%+ completeness for date ranges
  • Duplicate Handling: Primary key constraints prevent duplicate report records (publisher_id + date + ad_unit_id)
  • Data Transformation:
    • Micros values (earnings) converted to decimal currency format
    • ISO 8601 timestamps for all date/time fields
    • UTC timezone normalization for consistency
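
The validation and conversion rules above can be illustrated with a short normalization helper. This is a sketch, not the connector's code: `normalize_report` and `micros_to_currency` are hypothetical names, and `Decimal` is used here on the assumption that exact currency arithmetic is preferred over floats.

```python
from datetime import date
from decimal import Decimal

MICROS_PER_UNIT = Decimal(1_000_000)


def micros_to_currency(micros) -> Decimal:
    """Convert a micros amount (int or numeric string) to currency units."""
    value = Decimal(str(micros))
    if value < 0:
        raise ValueError("earnings micros must be non-negative")
    return value / MICROS_PER_UNIT


def normalize_report(
    publisher_id: str, report_date: str, ad_unit_id: str, earnings_micros
) -> dict:
    """Validate required fields and build the composite record id."""
    if not publisher_id:
        raise ValueError("publisher_id is required")
    # date.fromisoformat raises ValueError for invalid YYYY-MM-DD input.
    parsed = date.fromisoformat(report_date)
    return {
        "id": f"{publisher_id}_{parsed.isoformat()}_{ad_unit_id}",
        "publisher_id": publisher_id,
        "date": parsed.isoformat(),
        "ad_unit_id": ad_unit_id,
        "estimated_earnings": micros_to_currency(earnings_micros),
    }
```

One design point worth checking: if a report is also broken down by country, platform, or ad type, the key `publisher_id + date + ad_unit_id` would no longer be unique per row, so those dimensions would need to be part of the key.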

Integration Points

  • Fivetran Destinations: Snowflake, BigQuery, Redshift, PostgreSQL, Databricks
  • Downstream Systems:
    • Marketing analytics platforms (Tableau, Looker, Power BI)
    • Revenue operations systems
    • Mobile app analytics dashboards
    • Financial reporting systems
    • Ad performance optimization tools
  • Data Dependencies: None - standalone advertising data source
  • External Dependencies: Google AdMob API availability, OAuth token refresh service

Disaster Recovery

  • Backup Strategy: Daily snapshots of all account and report tables
  • Recovery Time Objective: 4 hours for full data recovery
  • Recovery Point Objective: 2 hours maximum data loss for revenue-critical reports
  • Failover: Automatic failover to backup OAuth credentials
  • Testing: Monthly disaster recovery drills with revenue operations team validation

Compliance & Security

  • Data Classification: Revenue data - financially sensitive, publisher account data - business sensitive
  • Retention Policy: 2 years for report data (analytics), 3 years for account data (audit)
  • Access Controls: Strict role-based access with principle of least privilege
  • Audit Trail: All data access logged and monitored for compliance audits
  • Encryption: Data encrypted in transit (HTTPS) and at rest with enterprise-grade security
  • Privacy: Publisher account data privacy compliance, revenue data accuracy for financial reporting
  • OAuth Security: Refresh tokens stored securely, access tokens never logged or exposed

Performance Optimization

  • Streaming Processing: Generator-based data processing prevents memory accumulation
  • Checkpointing: Incremental state checkpoints every N records (configurable, default 100)
  • Caching: Account data cached during sync to avoid redundant API calls
  • Indexing: Publisher ID, date, ad unit ID, and app ID columns indexed for efficient querying
  • Partitioning: Report data partitioned by date and publisher for efficient querying
  • Account Processing: Multiple publisher accounts are processed sequentially rather than in parallel to respect rate limits
  • Memory Management: Streaming approach prevents memory issues with large report datasets
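
The streaming and checkpointing behavior described above can be sketched as a generator that emits a checkpoint event every N records. The event tuples, the state key `last_report_date`, and the function name are hypothetical; in a Connector SDK connector these events would presumably map to `op.upsert` and `op.checkpoint` calls. The sketch also assumes records arrive ordered by date, which a date-based cursor requires.

```python
from typing import Iterable, Iterator, Optional, Tuple


def stream_with_checkpoints(
    records: Iterable[dict], checkpoint_every: int = 100
) -> Iterator[Tuple[str, dict]]:
    """Yield ("upsert", record) events plus periodic ("checkpoint", state) events."""
    if checkpoint_every < 1:
        raise ValueError("checkpoint_every must be at least 1")
    last_date: Optional[str] = None
    pending = 0
    for record in records:
        # Records are passed through one at a time; nothing is accumulated.
        yield ("upsert", record)
        last_date = record.get("date", last_date)
        pending += 1
        if pending == checkpoint_every:
            yield ("checkpoint", {"last_report_date": last_date})
            pending = 0
    if pending:
        # Final checkpoint covers the records since the last full batch.
        yield ("checkpoint", {"last_report_date": last_date})
```

Because the input is any iterable, it can be fed directly from another generator that reads report rows from the API response.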

Troubleshooting Guide

  • Common Issues:
    • OAuth token refresh failed: Verify client_id, client_secret, and refresh_token validity
    • Rate limit exceeded: Reduce sync frequency, implement backoff delays, or request quota increase
    • Missing report data: Check publisher ID validity and account access permissions
    • Timeout errors: Increase timeout values (default 30 seconds) or reduce batch size
    • Revenue discrepancies: Validate micros-to-currency conversion (division by 1,000,000)
    • Date range errors: Verify initial_sync_days configuration and AdMob data availability
    • Network errors: Check API endpoint availability and network connectivity
  • Debug Mode: Enable detailed logging via enable_debug_logging configuration parameter
  • Support Contacts:
    • Technical: Data Engineering team
    • Business: Marketing Analytics team
    • Vendor: Google AdMob support (for API and account issues)
    • Revenue Operations: Revenue Operations team (for data accuracy validation)

Configuration Parameters

  • Required Parameters:
    • client_id: OAuth2 client identifier from Google Cloud Console
    • client_secret: OAuth2 client secret from Google Cloud Console
    • refresh_token: OAuth2 refresh token for automatic access token renewal
  • Optional Parameters:
    • sync_frequency_hours: Incremental sync frequency (default: 4 hours)
    • initial_sync_days: Historical data range for initial sync (default: 90 days, max: 365)
    • max_records_per_page: Batch size for checkpointing (default: 100, range: 1-1000)
    • request_timeout_seconds: HTTP request timeout (default: 30 seconds)
    • retry_attempts: Number of retry attempts for failed requests (default: 3)
    • enable_incremental_sync: Enable date-based incremental sync (default: true)
    • enable_debug_logging: Enable detailed logging (default: false)
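
The parameters above could be parsed and validated along these lines. The configuration template quoted in this PR uses string values throughout, so the sketch parses strings. The lower bounds for `request_timeout_seconds` and `retry_attempts` are assumptions, `parse_configuration` is a hypothetical helper name, and `sync_frequency_hours` is left out because the review notes it is not used by the connector.

```python
REQUIRED_KEYS = ("client_id", "client_secret", "refresh_token")

# name: (default, minimum, maximum); None means no documented upper bound.
INT_PARAMS = {
    "initial_sync_days": (90, 1, 365),
    "max_records_per_page": (100, 1, 1000),
    "request_timeout_seconds": (30, 1, None),  # minimum is an assumption
    "retry_attempts": (3, 0, None),            # minimum is an assumption
}
BOOL_PARAMS = {"enable_incremental_sync": True, "enable_debug_logging": False}


def parse_configuration(configuration: dict) -> dict:
    """Return typed settings, applying defaults and range checks."""
    missing = [k for k in REQUIRED_KEYS if not configuration.get(k)]
    if missing:
        raise ValueError(f"Missing required configuration value(s): {', '.join(missing)}")
    parsed = {k: configuration[k] for k in REQUIRED_KEYS}
    for key, (default, low, high) in INT_PARAMS.items():
        raw = configuration.get(key)
        # int() raises ValueError for non-numeric strings.
        value = default if raw in (None, "") else int(raw)
        if value < low or (high is not None and value > high):
            raise ValueError(f"{key}={value} is outside the allowed range")
        parsed[key] = value
    for key, default in BOOL_PARAMS.items():
        raw = configuration.get(key)
        parsed[key] = default if raw in (None, "") else str(raw).strip().lower() == "true"
    return parsed
```

Validating once at the start of a sync means a bad value fails fast with a clear message instead of surfacing later as an API error.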

Data Transformation Details

  • Metric Conversions:
    • Estimated Earnings: Micros value divided by 1,000,000 to get currency amount
    • All monetary values converted from micros to decimal format
  • Timestamp Handling:
    • All timestamps converted to UTC timezone
    • ISO 8601 format for all date/time fields
    • Date-only fields stored as YYYY-MM-DD strings
  • Schema Mapping:
    • API dimensionValues and metricValues flattened to columnar format
    • Display labels extracted where available for human-readable names
    • Composite primary keys generated from publisher_id, date, and ad_unit_id
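
The flattening step described above can be sketched as follows. The row shape (`dimensionValues` entries with `value` and `displayLabel`, `metricValues` entries with `microsValue`, `integerValue`, or `doubleValue`) follows the public AdMob API reference as best understood here and should be verified; the column naming (lowercased keys, `_name` suffix for labels) and the function name `flatten_row` are assumptions.

```python
from decimal import Decimal


def flatten_row(row: dict) -> dict:
    """Flatten one report row into a single-level, columnar dictionary."""
    flat = {}
    for name, dim in row.get("dimensionValues", {}).items():
        key = name.lower()
        flat[key] = dim.get("value")
        if "displayLabel" in dim:
            # Keep the human-readable label next to the raw identifier.
            flat[f"{key}_name"] = dim["displayLabel"]
    for name, metric in row.get("metricValues", {}).items():
        key = name.lower()
        if "microsValue" in metric:
            # Monetary values arrive in micros; convert to currency units.
            flat[key] = Decimal(metric["microsValue"]) / Decimal(1_000_000)
        elif "integerValue" in metric:
            flat[key] = int(metric["integerValue"])
        elif "doubleValue" in metric:
            flat[key] = float(metric["doubleValue"])
        else:
            flat[key] = None  # Metric present but without a value.
    return flat
```

For example, a row containing an `AD_UNIT` dimension with a display label and an `ESTIMATED_EARNINGS` metric of `"6500000"` micros produces the columns `ad_unit`, `ad_unit_name`, and `estimated_earnings`.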

Checklist

Some tips and links to help validate your PR:

  • Tested the connector with fivetran debug command.
  • Added/Updated example specific README.md file, refer here for template.
  • Followed Python Coding Standards, refer here

@fivertran-karunveluru requested review from a team as code owners on Nov 1, 2025
@fivertran-karunveluru added the hackathon label (for all the PRs related to the internal Fivetran 2025 Connector SDK Hackathon) on Nov 1, 2025
@github-actions bot added the size/XL label (PR size: extra large) on Nov 1, 2025
github-actions bot commented Nov 1, 2025

🧹 Python Code Quality Check

✅ No issues found in Python Files.

This comment is auto-updated with every commit.

Copilot AI left a comment

Pull Request Overview

This PR adds a new AdMob connector for syncing Google AdMob advertising data including publisher accounts and network reports. The connector implements OAuth2 authentication with automatic token refresh, memory-efficient streaming patterns, and incremental synchronization using date-based cursors. While the implementation demonstrates good understanding of API integration patterns, there are several critical SDK compliance issues and unused configuration parameters that need to be addressed.

Key changes:

  • New AdMob connector with OAuth2 authentication and retry logic with exponential backoff
  • Streaming data processing for accounts and network reports with checkpointing
  • Configuration template and comprehensive documentation

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| connectors/admob/connector.py | Main connector implementation with OAuth2, API request handling, data mapping, and sync orchestration |
| connectors/admob/configuration.json | Configuration template with OAuth credentials and sync parameters |
| connectors/admob/README.md | Documentation covering connector overview, authentication, features, and configuration |
| README.md | Updated connector list to include AdMob example |

Comment on lines +377 to +387
def schema(configuration: dict):
    """
    Define database schema with table names and primary keys for the connector.
    This function specifies the destination tables and their primary keys for Fivetran to create.

    Args:
        configuration: Configuration dictionary (not used but required by SDK).

    Returns:
        list: List of table schema dictionaries with table names and primary keys.
    """

Copilot AI Nov 20, 2025

The schema() function is missing the required standard docstring. It must use this exact format:

def schema(configuration: dict):
    """
    Define the schema function which lets you configure the schema your connector delivers.
    See the technical reference documentation for more details on the schema function:
    https://fivetran.com/docs/connectors/connector-sdk/technical-reference#schema
    Args:
        configuration: a dictionary that holds the configuration settings for the connector.
    """

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +428 to +432
# The 'upsert' operation is used to insert or update data in the destination table.
# The op.upsert method is called with two arguments:
# - The first argument is the name of the table to upsert the data into.
# - The second argument is a dictionary containing the data to be upserted,
op.upsert(table="accounts", data=account)

Copilot AI Nov 20, 2025

The comment before op.upsert() doesn't follow the required format. It must be exactly:

# The 'upsert' operation is used to insert or update data in the destination table.
# The first argument is the name of the destination table.
# The second argument is a dictionary containing the record to be upserted.
op.upsert(table="accounts", data=account)

The current comment has additional text that should be removed.

Copilot generated this review using guidance from repository custom instructions.
f"Sync completed successfully. Processed {account_count} accounts and {report_count} report records."
)

except Exception as e:

Copilot AI Nov 20, 2025

The sync is wrapped in a broad except Exception: handler, which hides the failure type. According to SDK guidelines, you should catch specific exceptions instead.

The code should catch specific exceptions:

except (RuntimeError, requests.exceptions.RequestException, ValueError) as e:
    log.severe(f"Sync failed: {str(e)}")
    raise RuntimeError(f"Failed to sync AdMob data: {str(e)}") from e

This makes error handling more explicit and lets unexpected errors, such as a TypeError caused by a programming bug, propagate with their original type. Note that except Exception: never catches KeyboardInterrupt, which derives from BaseException.

Suggested change
except Exception as e:
except (RuntimeError, requests.exceptions.RequestException, ValueError) as e:

Copilot uses AI. Check for mistakes.
from fivetran_connector_sdk import Operations as op

# Private constants (use __ prefix)
__INVALID_LITERAL_ERROR = "invalid literal"

Copilot AI Nov 20, 2025

The __INVALID_LITERAL_ERROR constant is defined but never used in the code. This is dead code that should be removed to improve maintainability.

Suggested change
__INVALID_LITERAL_ERROR = "invalid literal"

Copilot uses AI. Check for mistakes.
"client_id": "<YOUR_ADMOB_OAUTH_CLIENT_ID>",
"client_secret": "<YOUR_ADMOB_OAUTH_CLIENT_SECRET>",
"refresh_token": "<YOUR_ADMOB_OAUTH_REFRESH_TOKEN>",
"sync_frequency_hours": "<YOUR_ADMOB_API_SYNC_FREQUENCY_HOURS>",

Copilot AI Nov 20, 2025

The sync_frequency_hours field is defined in the configuration but is never used in connector.py. This is an unused configuration parameter that should be removed to avoid confusion. The connector doesn't implement any sync frequency logic - it runs whenever triggered by Fivetran.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +94 to +101
## Additional files
The connector includes several additional files to support functionality, testing, and deployment:

- `requirements.txt` – Python dependency specification for AdMob API integration and connector requirements including faker for mock testing.

- `configuration.json` – Configuration template for API credentials and connector parameters (should be excluded from version control).


Copilot AI Nov 20, 2025

The "Additional files" section also mentions configuration.json, but according to the SDK guidelines, this section should only list additional Python files beyond connector.py. Configuration files like configuration.json and requirements.txt are standard files and shouldn't be listed here unless they contain something unusual worth explaining.

If there are no additional Python files (like helper modules), the "Additional files" section should be removed entirely.

Suggested change
## Additional files
The connector includes several additional files to support functionality, testing, and deployment:
- `requirements.txt` – Python dependency specification for AdMob API integration and connector requirements including faker for mock testing.
- `configuration.json` – Configuration template for API credentials and connector parameters (should be excluded from version control).

Copilot uses AI. Check for mistakes.
Raises:
RuntimeError: If sync fails due to API errors or configuration issues.
"""
log.info("Starting AdMob API connector sync")

Copilot AI Nov 20, 2025

Missing required first log statement in update() function. According to SDK guidelines, the update() function MUST start with:

log.warning("Example: <CATEGORY> : <EXAMPLE_NAME>")

For this connector, it should be:

log.warning("Example: connectors : admob")

This should be the very first statement in the update function, before any other code.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +446 to +450
# The 'upsert' operation is used to insert or update data in the destination table.
# The op.upsert method is called with two arguments:
# - The first argument is the name of the table to upsert the data into.
# - The second argument is a dictionary containing the data to be upserted,
op.upsert(table="network_reports", data=report)

Copilot AI Nov 20, 2025

The comment before op.upsert() doesn't follow the required format. It must be exactly:

# The 'upsert' operation is used to insert or update data in the destination table.
# The first argument is the name of the destination table.
# The second argument is a dictionary containing the record to be upserted.
op.upsert(table="network_reports", data=report)

The current comment has additional text that should be removed.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +25 to +39
## Configuration file
```json
{
  "client_id": "<YOUR_ADMOB_OAUTH_CLIENT_ID>",
  "client_secret": "<YOUR_ADMOB_OAUTH_CLIENT_SECRET>",
  "refresh_token": "<YOUR_ADMOB_OAUTH_REFRESH_TOKEN>",
  "sync_frequency_hours": "<YOUR_ADMOB_API_SYNC_FREQUENCY_HOURS>",
  "initial_sync_days": "<YOUR_ADMOB_API_INITIAL_SYNC_DAYS>",
  "max_records_per_page": "<YOUR_ADMOB_API_MAX_RECORDS_PER_PAGE>",
  "request_timeout_seconds": "<YOUR_ADMOB_API_REQUEST_TIMEOUT_SECONDS>",
  "retry_attempts": "<YOUR_ADMOB_API_RETRY_ATTEMPTS>",
  "enable_incremental_sync": "<YOUR_ADMOB_API_ENABLE_INCREMENTAL_SYNC>",
  "enable_debug_logging": "<YOUR_ADMOB_API_ENABLE_DEBUG_LOGGING>"
}
```

Copilot AI Nov 20, 2025

The Configuration file section is missing the required statement about not checking configuration.json into version control. According to the template, this section must end with:

Note: Ensure that the `configuration.json` file is not checked into version control to protect sensitive information.

This should be added after the JSON code block and before the "Configuration parameters" section.

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +94 to +101
## Additional files
The connector includes several additional files to support functionality, testing, and deployment:

- `requirements.txt` – Python dependency specification for AdMob API integration and connector requirements including faker for mock testing.

- `configuration.json` – Configuration template for API credentials and connector parameters (should be excluded from version control).


Copilot AI Nov 20, 2025

The "Additional files" section mentions requirements.txt with dependencies including "faker for mock testing", but there is no requirements.txt file in this PR. According to the README.md content (line 54), this connector doesn't require any additional packages.

Either:

  1. Remove the "Additional files" section entirely if there are no additional Python files beyond connector.py, OR
  2. If requirements.txt exists, update the description to match the actual contents

The current description is inconsistent with the "Requirements file" section which states no additional packages are needed.

Suggested change
## Additional files
The connector includes several additional files to support functionality, testing, and deployment:
- `requirements.txt` – Python dependency specification for AdMob API integration and connector requirements including faker for mock testing.
- `configuration.json` – Configuration template for API credentials and connector parameters (should be excluded from version control).

Copilot uses AI. Check for mistakes.

@varundhall left a comment

Same as #256 (review)

Contributor

Left a couple of suggestions. Thanks.

# AdMob API Connector Example

## Connector overview
This connector syncs advertising data from Google AdMob API including publisher accounts, network reports, and mediation analytics. It fetches ad performance metrics such as estimated earnings, ad requests, impressions, clicks, and detailed breakdowns by app, ad unit, country, platform, and ad type. The connector supports incremental synchronization using date-based cursors and handles OAuth2 authentication with automatic token refresh.

Suggested change
This connector syncs advertising data from Google AdMob API including publisher accounts, network reports, and mediation analytics. It fetches ad performance metrics such as estimated earnings, ad requests, impressions, clicks, and detailed breakdowns by app, ad unit, country, platform, and ad type. The connector supports incremental synchronization using date-based cursors and handles OAuth2 authentication with automatic token refresh.
This connector syncs advertising data from the Google AdMob API, including publisher accounts, network reports, and mediation analytics. It fetches ad performance metrics, including estimated earnings, ad requests, impressions, and clicks, along with detailed breakdowns by app, ad unit, country, platform, and ad type. The connector supports incremental synchronization using date-based cursors and handles OAuth2 authentication with automatic token refresh.

Comment on lines +71 to +72
## Pagination
AdMob API uses report-based data retrieval with streaming response handling (refer to `get_network_reports` function). The connector processes report data as it's received from the API without accumulating large datasets in memory. Generator-based processing prevents memory accumulation for large advertising datasets.

Suggested change
## Pagination
AdMob API uses report-based data retrieval with streaming response handling (refer to `get_network_reports` function). The connector processes report data as it's received from the API without accumulating large datasets in memory. Generator-based processing prevents memory accumulation for large advertising datasets.
## Pagination
The AdMob API uses report-based data retrieval with streaming response handling (refer to the `get_network_reports` function). The connector processes report data as it's received from the API without accumulating large datasets in memory. Generator-based processing prevents memory accumulation for large advertising datasets.
