@Elijas Elijas commented Oct 24, 2025

Add new get_filing_metadatas_batch() method to efficiently retrieve metadata for multiple accession numbers in a single operation.

Key Improvements

Performance Optimization:

  • Groups accession numbers by CIK to minimize API calls
  • Example: 100 accession numbers from 10 companies = 10 API calls (not 100)
  • Uses existing SEC rate limiting infrastructure
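The grouping idea can be sketched in a few lines. This is a minimal illustration, not the code from this PR: it groups by the ticker part of each query as a stand-in for the CIK (the real method resolves the CIK first), and `group_by_company` is a hypothetical name.

```python
from collections import defaultdict


def group_by_company(queries):
    """Group "TICKER/ACCESSION_NUMBER" strings by their company part.

    Hypothetical helper: the ticker stands in for the CIK here.
    """
    groups = defaultdict(list)
    for query in queries:
        company, _, accession = query.partition("/")
        if not company or not accession:
            raise ValueError(f"expected 'TICKER/ACCESSION_NUMBER', got {query!r}")
        groups[company.upper()].append(accession)
    return dict(groups)


queries = [
    "AAPL/0000320193-23-000077",
    "AAPL/0000320193-23-000078",
    "MSFT/0000950170-24-087843",
]
groups = group_by_company(queries)

# One lookup per distinct company instead of one per accession number.
print(f"{len(queries)} queries -> {len(groups)} company lookups")
```

The number of lookups scales with the number of distinct companies, which is where the "10 calls, not 100" figure comes from.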

Use Case:
Users who bulk-download filings with dl.get() often need metadata for specific accession numbers they have already downloaded. This method retrieves it without:

  1. Making one API call per accession number (requires manual rate limiting)
  2. Fetching ALL filings for a company (slow, includes unwanted data)

Implementation

New method in core.py:

  • Downloader.get_filing_metadatas_batch(queries, include_amends=False)
  • Accepts list of "TICKER/ACCESSION_NUMBER" strings or CompanyAndAccessionNumber objects
  • Returns list of FilingMetadata objects
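Accepting two input shapes usually means normalizing them to one internal form first. The sketch below shows one way that could look. `QueryStandIn` and `normalize` are hypothetical names, and the field names are assumptions; the library's actual `CompanyAndAccessionNumber` class may be structured differently.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class QueryStandIn:
    """Hypothetical stand-in for CompanyAndAccessionNumber (fields assumed)."""
    ticker_or_cik: str
    accession_number: str


def normalize(query: Union[str, QueryStandIn]) -> QueryStandIn:
    """Return a QueryStandIn for either accepted input shape."""
    if isinstance(query, QueryStandIn):
        return query
    company, sep, accession = query.strip().partition("/")
    if not sep or not company or not accession:
        raise ValueError(f"expected 'TICKER/ACCESSION_NUMBER', got {query!r}")
    return QueryStandIn(company, accession)


normalized = [
    normalize("AAPL/0000320193-23-000077"),
    normalize(QueryStandIn("MSFT", "0000950170-24-087843")),
]
```

Normalizing up front keeps the rest of the batch logic free of type checks.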

New helper in sec_edgar_downloader_fork.py:

  • _get_metadatas_batch(cik, user_agent, accession_numbers, include_amends)
  • Fetches all filings for a CIK and filters to requested accession numbers
  • Terminates early once all requested accession numbers are found
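The filter-and-stop behaviour described above can be illustrated with a small sketch. It assumes filings arrive as an iterable of dicts with an `accession_number` key; that shape, the function name, and the made-up accession numbers are all assumptions for illustration, not the helper's real internals.

```python
def filter_with_early_exit(filings, wanted):
    """Return (found, missing): filings whose accession number is wanted,
    stopping as soon as every wanted number has been seen."""
    remaining = set(wanted)
    found = []
    if not remaining:
        return found, remaining
    for filing in filings:
        accession = filing["accession_number"]
        if accession in remaining:
            found.append(filing)
            remaining.remove(accession)
            if not remaining:
                break  # nothing left to look for; skip the rest
    return found, remaining


seen = []  # records how many filings were actually consumed


def fake_filings():
    """Lazily yield 1000 made-up filings so early exit is observable."""
    for n in range(1, 1001):
        accession = f"0000000000-24-{n:06d}"
        seen.append(accession)
        yield {"accession_number": accession}


found, missing = filter_with_early_exit(
    fake_filings(),
    ["0000000000-24-000002", "0000000000-24-000005"],
)
print(len(found), len(missing), len(seen))
```

Returning the set of numbers that were never found lets the caller decide whether a missing accession number is an error or something to ignore.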

Example Usage

metadatas = dl.get_filing_metadatas_batch([
    "AAPL/0000320193-23-000077",
    "AAPL/0000320193-23-000078",  # Same company - one API call for both
    "MSFT/0000950170-24-087843",
])
# Makes only 2 API calls (one for AAPL, one for MSFT)

Safety

  • Zero impact on existing APIs (completely new method)
  • Reuses existing tested code and rate limiting
  • No breaking changes

Note

Documentation for the new method should be added to nbs/index.ipynb in a future update, so that README.md can be regenerated with usage examples.

🤖 Generated with Claude Code

Add concise documentation for get_filing_metadatas_batch() method:
- Shows example with multiple accession numbers
- Highlights optimization (2 API calls for 3 filings)
- Explains use case clearly without bloat

Updated both index.ipynb notebook and README.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Run nbdev_clean to strip notebook outputs (fixes CI)
- Add generated pyproject.toml from nbdev
- Install nbdev git hooks

This fixes the "unstripped notebooks" CI failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>