Optimize SEC Filing Metadata Retrieval Method #7

Elijas · 2025-10-24T06:38:38Z

Add new get_filing_metadatas_batch() method to efficiently retrieve metadata for multiple accession numbers in a single operation.

Key Improvements

Performance Optimization:

Groups accession numbers by CIK to minimize API calls
Example: 100 accession numbers from 10 companies = 10 API calls (not 100)
Uses existing SEC rate limiting infrastructure
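The grouping step can be sketched roughly as follows. This is an illustration, not the code from this PR: `group_by_company` is a hypothetical name, and it groups by ticker as a stand-in for CIK because the ticker-to-CIK lookup is not shown here.

```python
from collections import defaultdict


def group_by_company(queries):
    """Group 'TICKER/ACCESSION_NUMBER' strings by ticker (hypothetical helper).

    The PR groups by CIK; ticker is used here as a stand-in key.
    """
    groups = defaultdict(list)
    for query in queries:
        ticker, accession_number = query.split("/", 1)
        groups[ticker.upper()].append(accession_number)
    return dict(groups)


queries = [
    "AAPL/0000320193-23-000077",
    "AAPL/0000320193-23-000078",
    "MSFT/0000950170-24-087843",
]
groups = group_by_company(queries)
# One lookup per key in `groups`, rather than one per query.
api_calls_needed = len(groups)
```

With this shape, the number of API calls scales with the number of distinct companies rather than with the number of accession numbers.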

Use Case:
Covers the case where users have bulk-downloaded filings with dl.get() and then want metadata for specific accession numbers they already have, without either:

Making one API call per accession number (requires manual rate limiting)
Fetching ALL filings for a company (slow, includes unwanted data)

Implementation

New method in core.py:

Downloader.get_filing_metadatas_batch(queries, include_amends=False)
Accepts list of "TICKER/ACCESSION_NUMBER" strings or CompanyAndAccessionNumber objects
Returns list of FilingMetadata objects
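For illustration, a "TICKER/ACCESSION_NUMBER" string could be parsed and checked along these lines. `ParsedQuery` and `parse_query` are hypothetical names, not part of this PR or of the library's CompanyAndAccessionNumber type. The sketch assumes the dashed accession number format seen in the examples (10 digits, 2 digits, 6 digits).

```python
import re
from typing import NamedTuple

# Dashed accession number format, e.g. 0000320193-23-000077.
ACCESSION_RE = re.compile(r"\d{10}-\d{2}-\d{6}")


class ParsedQuery(NamedTuple):
    """Hypothetical container for one parsed query."""

    ticker: str
    accession_number: str


def parse_query(query: str) -> ParsedQuery:
    """Split 'TICKER/ACCESSION_NUMBER' and reject malformed input."""
    ticker, sep, accession_number = query.strip().partition("/")
    if not sep or not ticker or not ACCESSION_RE.fullmatch(accession_number):
        raise ValueError(f"Expected 'TICKER/ACCESSION_NUMBER', got {query!r}")
    return ParsedQuery(ticker.upper(), accession_number)


parsed = parse_query("aapl/0000320193-23-000077")
```

Rejecting malformed queries up front would keep bad input from costing an API call.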

New helper in sec_edgar_downloader_fork.py:

_get_metadatas_batch(cik, user_agent, accession_numbers, include_amends)
Fetches all filings for a CIK and filters to requested accession numbers
Early termination once all requested accession numbers are found
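The filter-with-early-termination idea can be sketched as below. This is a minimal illustration under assumptions, not the helper's actual code: `filter_filings` is a hypothetical name, filings are modeled as plain dicts with an "accession_number" key, and a generator stands in for the SEC response.

```python
def filter_filings(filings, accession_numbers):
    """Collect filings whose accession number was requested.

    Stops consuming `filings` as soon as every requested number is found.
    """
    remaining = set(accession_numbers)
    found = []
    if not remaining:
        return found
    for filing in filings:
        accession_number = filing["accession_number"]
        if accession_number in remaining:
            found.append(filing)
            remaining.discard(accession_number)
            if not remaining:
                break  # everything requested has been found
    return found


def fake_filings(consumed):
    """Stand-in data source that records how many filings were read."""
    for i in range(1, 6):
        consumed.append(i)
        yield {"accession_number": f"0000000000-24-00000{i}"}


consumed = []
found = filter_filings(
    fake_filings(consumed),
    ["0000000000-24-000001", "0000000000-24-000002"],
)
# Only the first two of five filings are read before the loop stops.
```

Tracking the requested numbers in a set makes each membership check cheap and gives a simple stop condition once the set is empty.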

Example Usage

metadatas = dl.get_filing_metadatas_batch([
    "AAPL/0000320193-23-000077",
    "AAPL/0000320193-23-000078",  # Same company - one API call for both
    "MSFT/0000950170-24-087843",
])
# Makes only 2 API calls (one for AAPL, one for MSFT)

Safety

Zero impact on existing APIs (completely new method)
Reuses existing tested code and rate limiting
No breaking changes

Note

Documentation should be added to nbs/index.ipynb in a future update to regenerate README.md with the new method examples.

🤖 Generated with Claude Code


Add concise documentation for get_filing_metadatas_batch() method:
- Shows example with multiple accession numbers
- Highlights optimization (2 API calls for 3 filings)
- Explains use case clearly without bloat

Updated both index.ipynb notebook and README.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

- Run nbdev_clean to strip notebook outputs (fixes CI)
- Add generated pyproject.toml from nbdev
- Install nbdev git hooks

This fixes the "unstripped notebooks" CI failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>

claude added 3 commits October 24, 2025 06:37

Elijas mentioned this pull request Oct 24, 2025

Getting metadata for multiple accession numbers #5

Open
