Metadata Collection Prototype #1062
Draft
oguzkocer wants to merge 114 commits into trunk from prototype/metadata-collection
Design doc for a new generic collection type that uses lightweight metadata fetches (id + modified_gmt) to define list structure, then selectively fetches only missing or stale entities.
Key design decisions:
- Metadata defines list structure (enables loading placeholders)
- KV store for metadata persistence (memory or disk-backed)
- Generic over entity type (posts, media, etc.)
- Batch fetch via `include` param for efficiency
Foundation types for metadata-based sync:
- `SyncableEntity` trait: entities with `id` and `modified_gmt` fields
- `EntityMetadata<Id>`: lightweight struct holding just id + modified_gmt
Also adds `Clone` and `Copy` derives to `WpGmtDateTime` since the inner `DateTime<Utc>` is `Copy`.
Represents items in a metadata-backed list where entities may be:
- `Loaded`: full entity available from cache
- `Loading`: fetch in progress, shows placeholder with metadata
- `Failed`: fetch failed, includes error message for retry UI
Includes `HasId` helper trait for extracting IDs from loaded entities.
Abstracts metadata persistence so we can swap between in-memory and disk-backed storage. The in-memory implementation is useful for prototyping; can be replaced with SQLite/file-based later. Includes unit tests for all KvStore operations.
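A minimal sketch of what such a storage abstraction could look like, assuming a string list key and a `String` stand-in for `WpGmtDateTime`. The method names (`get`, `set`, `remove`) and the field layout are assumptions for illustration; the trait on the branch may differ.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Lightweight metadata for one entity (hypothetical field layout).
/// `modified_gmt` is a plain string here, standing in for `WpGmtDateTime`.
#[derive(Debug, Clone, PartialEq)]
pub struct EntityMetadata<Id> {
    pub id: Id,
    pub modified_gmt: String,
}

/// Storage abstraction: callers depend on this trait, so the in-memory
/// backend can later be swapped for a disk-backed one.
pub trait KvStore<Id> {
    fn get(&self, key: &str) -> Option<Vec<EntityMetadata<Id>>>;
    fn set(&self, key: &str, items: Vec<EntityMetadata<Id>>);
    fn remove(&self, key: &str);
}

/// In-memory backend; contents are lost when the process exits.
pub struct InMemoryKvStore<Id> {
    data: RwLock<HashMap<String, Vec<EntityMetadata<Id>>>>,
}

impl<Id> InMemoryKvStore<Id> {
    pub fn new() -> Self {
        Self { data: RwLock::new(HashMap::new()) }
    }
}

impl<Id: Clone> KvStore<Id> for InMemoryKvStore<Id> {
    fn get(&self, key: &str) -> Option<Vec<EntityMetadata<Id>>> {
        self.data.read().unwrap().get(key).cloned()
    }

    fn set(&self, key: &str, items: Vec<EntityMetadata<Id>>) {
        self.data.write().unwrap().insert(key.to_string(), items);
    }

    fn remove(&self, key: &str) {
        self.data.write().unwrap().remove(key);
    }
}
```

Keeping the trait methods on `&self` with interior locking means a shared store can be handed to several collections without requiring exclusive access.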
Lightweight fetch that returns only id + modified_gmt for posts, enabling the metadata-first sync strategy. Unlike `fetch_posts_page`, this does NOT upsert to the database - the metadata is used transiently to determine which posts need full fetching. Also adds `Hash` derive to `wp_content_i64_id` and `wp_content_u64_id` macros so ID types can be used as HashMap keys.
Batch fetch full post data for specific IDs using the `include` parameter. Used for selective sync - fetching only posts that are missing or stale in the cache. Returns early with empty Vec if no IDs provided.
Core collection type that:
- Uses KV store to persist list metadata (id + modified_gmt)
- Builds list with Loaded/Loading states based on cache status
- Provides methods to find missing/stale entities for selective fetch
- Supports pagination with append for subsequent pages
Includes comprehensive unit tests for all operations.
- Format derive macros across multiple lines
- Use or_default() instead of or_insert_with(Vec::new)
- Add type aliases for complex closure types
- Collapse nested if-let into single condition
- Simplify test entity to unit struct
Documents what was built, where each component lives, test coverage, and differences from the original sketch. Also lists next steps.
Consolidated design after discussion covering:
- Service-owned stores (`EntityStateStore`, `ListMetadataStore`)
- Read-only traits for collection access
- `MetadataCollection<F>` generic only over fetcher
- Cross-collection state consistency
- State transitions (Missing/Fetching/Cached/Stale/Failed)
Also updated v1 doc with intermediate v2 notes.
Implements Phase 1 of the v3 MetadataCollection design:
- `EntityMetadata` - Non-generic struct with `i64` id + `Option<WpGmtDateTime>` (optional modified_gmt for entities like Comments that lack this field)
- `EntityState` - Enum tracking fetch lifecycle (Missing, Fetching, Cached, Stale, Failed)
- `CollectionItem` - Combines metadata with state for list items
- `SyncResult` - Result of sync operations with counts and pagination info
- `MetadataFetchResult` - Updated to non-generic version
Removes superseded prototype code:
- Old generic `EntityMetadata<Id>`
- `KvStore<Id>` trait and `InMemoryKvStore<Id>`
- `ListItem<T, Id>` enum
- `MetadataCollection<T, Id>` (old version)
- `SyncableEntity` trait
Updates `PostService::fetch_posts_metadata` to use new non-generic types.
Implements Phase 2 of the v3 MetadataCollection design:
- `EntityStateStore` - Memory-backed store for entity fetch states
  - Maps `i64` ID to `EntityState` (Missing, Fetching, Cached, etc.)
  - Thread-safe via `RwLock<HashMap>`
  - `filter_fetchable()` excludes currently-fetching IDs to prevent duplicates
- `EntityStateReader` trait - Read-only access for collections
- `ListMetadataStore` - Memory-backed store for list structure
  - Maps filter key (String) to `Vec<EntityMetadata>`
  - Supports `set` (replace) and `append` (pagination)
- `ListMetadataReader` trait - Read-only access for collections
Both stores are memory-only; state resets on app restart.
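The state store described above could be sketched roughly as follows. The variant payloads, the `set` method, and the reader's `get` signature are assumptions; only the `RwLock<HashMap>` layout and the behaviour of `filter_fetchable()` come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Fetch lifecycle of a single entity (the payload on `Failed` is an assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum EntityState {
    Missing,
    Fetching,
    Cached,
    Stale,
    Failed { error: String },
}

/// Read-only view handed to collections (hypothetical signature).
pub trait EntityStateReader {
    fn get(&self, id: i64) -> EntityState;
}

/// Memory-backed store; state resets when the process restarts.
#[derive(Default)]
pub struct EntityStateStore {
    states: RwLock<HashMap<i64, EntityState>>,
}

impl EntityStateStore {
    pub fn set(&self, id: i64, state: EntityState) {
        self.states.write().unwrap().insert(id, state);
    }

    /// Drops IDs that already have a fetch in flight so the same entity
    /// is not requested twice. Input order is preserved.
    pub fn filter_fetchable(&self, ids: &[i64]) -> Vec<i64> {
        let states = self.states.read().unwrap();
        ids.iter()
            .copied()
            .filter(|id| !matches!(states.get(id), Some(EntityState::Fetching)))
            .collect()
    }
}

impl EntityStateReader for EntityStateStore {
    /// IDs that were never recorded are reported as `Missing`.
    fn get(&self, id: i64) -> EntityState {
        self.states
            .read()
            .unwrap()
            .get(&id)
            .cloned()
            .unwrap_or(EntityState::Missing)
    }
}
```

Exposing only `EntityStateReader` to collections keeps writes inside the service, which is what lets several collections observe a consistent state for the same entity.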
Implements Phase 3 of the v3 MetadataCollection design:
- `MetadataFetcher` trait - Async trait for fetching metadata and entities
  - `fetch_metadata(page, per_page, is_first_page)` - Fetch list structure
  - `ensure_fetched(ids)` - Fetch full entities by ID
- `MetadataCollection<F>` - Generic collection over fetcher type
  - `refresh()` - Fetch page 1, replace metadata, sync missing
  - `load_next_page()` - Fetch next page, append, sync missing
  - `items()` - Get `CollectionItem` list with states
  - `is_relevant_update()` - Check DB updates for relevance
- Batches large fetches into 100-item chunks (API limit)
Also adds `tokio` as dev-dependency for async tests.
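The 100-item batching mentioned above can be illustrated with a small synchronous sketch. The function name, the closure-based fetcher, and the constant name are assumptions; the real code is async and calls the service.

```rust
/// Largest number of IDs sent in one request (the API limit named in the commit).
pub const MAX_IDS_PER_REQUEST: usize = 100;

/// Splits `ids` into chunks of at most `MAX_IDS_PER_REQUEST` and calls
/// `fetch_batch` once per chunk, stopping at the first error.
/// Returns the total number of entities the fetcher reported.
pub fn fetch_in_batches<F, E>(ids: &[i64], mut fetch_batch: F) -> Result<usize, E>
where
    F: FnMut(&[i64]) -> Result<usize, E>,
{
    let mut fetched = 0;
    for chunk in ids.chunks(MAX_IDS_PER_REQUEST) {
        fetched += fetch_batch(chunk)?;
    }
    Ok(fetched)
}
```

With 250 IDs, the fetcher is called three times with 100, 100, and 50 IDs; an empty slice results in no calls at all.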
Add metadata sync infrastructure to PostService for efficient list syncing:
- Add `state_store_with_edit_context` field for tracking per-entity fetch state (Missing, Fetching, Cached, Stale, Failed). Each context needs its own state store since the same entity can have different states across contexts.
- Add `metadata_store` field for list structure per filter key. Shared across all contexts - callers include context in the key string (e.g., "site_1:edit:posts:publish").
- Add `fetch_and_store_metadata()` method that fetches lightweight metadata (id + modified_gmt) and stores it in the metadata store.
- Update `fetch_posts_by_ids()` to track entity state:
  - Filters out already-fetching IDs to prevent duplicate requests
  - Sets Fetching state before API call
  - Sets Cached on success, Failed on error or missing posts
- Add `PostMetadataFetcherWithEditContext` implementing `MetadataFetcher` trait, delegating to PostService methods.
- Add reader accessor methods for collections to get read-only access: `state_reader_with_edit_context()`, `metadata_reader()`, `get_entity_state_with_edit_context()`.
Create the concrete type that wraps MetadataCollection for UniFFI:
- Add `PostMetadataCollectionWithEditContext` struct combining:
  - `MetadataCollection<PostMetadataFetcherWithEditContext>` for sync logic
  - Service reference for loading full entity data
  - Filter for this collection
- Add `PostMetadataCollectionItem` record type with:
  - `id`: Post ID
  - `state`: EntityState (Missing, Fetching, Cached, Stale, Failed)
  - `data`: Optional FullEntityAnyPostWithEditContext
- Add `create_post_metadata_collection_with_edit_context` to PostService
- Make types UniFFI-compatible:
  - Add `uniffi::Enum` to EntityState
  - Add `uniffi::Record` to SyncResult (change usize to u64)
  - Use interior mutability (RwLock<PaginationState>) in MetadataCollection for compatibility with UniFFI's Arc-wrapped objects
- Add `read_posts_by_ids_from_db` helper to PostService for bulk loading
- Document state representation approaches in design doc:
  - Two-dimensional (DataState + FetchStatus)
  - Flattened explicit states enum
Add Kotlin wrapper for PostMetadataCollectionWithEditContext to enable reactive UI updates when database changes occur.
Changes:
- Add `ObservableMetadataCollection` class with observer pattern
- Update `DatabaseChangeNotifier` to support metadata collections
- Add `getObservablePostMetadataCollectionWithEditContext` extension on `PostService`
- Update implementation plan with Phase 7 completion
Add example screen demonstrating the metadata-first sync strategy with visual state indicators for each item (Missing, Fetching, Cached, Stale, Failed).
Changes:
- Add `PostMetadataCollectionViewModel` with manual refresh/loadNextPage controls
- Add `PostMetadataCollectionScreen` with filter controls and state indicators
- Wire up navigation and DI for the new screen
- Update implementation plan marking Phase 8 complete
WordPress REST API defaults to filtering by 'publish' status, which caused drafts, pending, and other non-published posts to return "Not found" when fetching by ID.
Changes:
- Add explicit status filter including all standard post statuses (publish, draft, pending, private, future)
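To make the fix concrete, the sketch below assembles the `include` and `status` parameters into a query string. The function and constant names, and the comma-separated encoding, are assumptions for illustration; this is not the request builder the branch uses, and other parameters are omitted.

```rust
/// The standard post statuses listed in the commit message.
pub const ALL_POST_STATUSES: [&str; 5] = ["publish", "draft", "pending", "private", "future"];

/// Builds the query portion of a fetch-by-IDs request.
/// Returns `None` for an empty ID list, mirroring the early return
/// that avoids a pointless request.
pub fn build_posts_by_ids_query(ids: &[i64]) -> Option<String> {
    if ids.is_empty() {
        return None;
    }
    let include = ids
        .iter()
        .map(|id| id.to_string())
        .collect::<Vec<_>>()
        .join(",");
    Some(format!(
        "include={}&status={}",
        include,
        ALL_POST_STATUSES.join(",")
    ))
}
```

Without the explicit `status` list, a draft whose ID appears in `include` would be filtered out by the default 'publish' status and reported as not found.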
When fetching metadata, compare the API's `modified_gmt` against cached posts in the database. Posts with different timestamps are marked as `Stale`, triggering a re-fetch on the next sync.
Changes:
- Add `select_modified_gmt_by_ids` to PostRepository for efficient batch lookup
- Add `detect_and_mark_stale_posts` to PostService for staleness comparison
- Call staleness detection in `fetch_and_store_metadata` after storing metadata
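The comparison itself is simple enough to sketch. Here timestamps are plain strings standing in for `WpGmtDateTime`, and the `HashMap` stands in for the result of a batch lookup such as `select_modified_gmt_by_ids`; the function name and both representations are assumptions.

```rust
use std::collections::HashMap;

/// Metadata returned by the lightweight fetch. `modified_gmt` is optional
/// because some entity types lack the field.
pub struct EntityMetadata {
    pub id: i64,
    pub modified_gmt: Option<String>,
}

/// Returns the IDs whose cached timestamp differs from the remote one.
/// Entities absent from the cache are missing rather than stale, and
/// entities without a remote timestamp cannot be compared, so both are skipped.
pub fn find_stale_ids(remote: &[EntityMetadata], cached: &HashMap<i64, String>) -> Vec<i64> {
    remote
        .iter()
        .filter_map(|meta| {
            let remote_ts = meta.modified_gmt.as_ref()?;
            let cached_ts = cached.get(&meta.id)?;
            (remote_ts != cached_ts).then_some(meta.id)
        })
        .collect()
}
```

Comparing for inequality rather than "newer than" also catches the case where a post's timestamp moves backwards, for example after a restore.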
Design for moving list metadata from in-memory KV store to database:
- Three DB tables: list_metadata, list_metadata_items, list_metadata_state
- MetadataService owns list storage, state transitions, version management
- PostService orchestrates sync, owns entity state, does staleness detection
- Split observers for data vs state changes in Kotlin wrapper
- Version-based concurrency control for handling concurrent refreshes
Detailed plan with 5 phases, 17 commits, ordered from low-level to high-level:
- Phase 1: Database foundation (DbTable, migration, types, repository)
- Phase 2: MetadataService in wp_mobile
- Phase 3: Integration with PostService, collection refactor
- Phase 4: Observer split (data vs state)
- Phase 5: Testing and cleanup
Includes dependency order, risk areas, and verification checkpoints.
Implement the database layer for list metadata storage, replacing the in-memory KV store. This enables proper observer patterns and persistence between app launches.
Database schema (3 tables):
- list_metadata: headers with pagination, version for concurrency control
- list_metadata_items: ordered entity IDs (rowid = display order)
- list_metadata_state: sync state (idle, fetching_first_page, fetching_next_page, error)
Changes:
- Add DbTable variants: ListMetadata, ListMetadataItems, ListMetadataState
- Add migration 0007-create-list-metadata-tables.sql
- Add list_metadata module with DbListMetadata, DbListMetadataItem, DbListMetadataState structs and ListState enum
- Add db_types/db_list_metadata.rs with column enums and from_row impls
- Add repository/list_metadata.rs with read and write operations
Add helper methods for atomic state transitions during sync operations:
- `begin_refresh()`: Atomically increment version, set state to FetchingFirstPage, and return info needed for the fetch
- `begin_fetch_next_page()`: Check pagination state, set state to FetchingNextPage, and return page number and version for stale check
- `complete_sync()`: Set state to Idle on success
- `complete_sync_with_error()`: Set state to Error with message
These helpers ensure correct state transitions and enable version-based concurrency control to detect when a refresh invalidates an in-flight load-more operation.
Also adds `RefreshInfo` and `FetchNextPageInfo` structs to encapsulate the data returned from begin operations.
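The version check can be illustrated with an in-memory sketch. The real helpers operate on database rows; the `ListHeader` struct, the fields on `RefreshInfo` and `FetchNextPageInfo`, and the `is_current` helper are assumptions made to keep the example self-contained.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ListState {
    Idle,
    FetchingFirstPage,
    FetchingNextPage,
    Error,
}

/// In-memory stand-in for a list header row (hypothetical layout).
#[derive(Debug)]
pub struct ListHeader {
    pub version: u64,
    pub current_page: u32,
    pub total_pages: Option<u32>,
    pub state: ListState,
}

pub struct RefreshInfo {
    pub version: u64,
}

pub struct FetchNextPageInfo {
    pub page: u32,
    pub version: u64,
}

impl ListHeader {
    /// Bumps the version so any in-flight load-more becomes stale.
    pub fn begin_refresh(&mut self) -> RefreshInfo {
        self.version += 1;
        self.state = ListState::FetchingFirstPage;
        RefreshInfo { version: self.version }
    }

    /// Returns `None` when there is no further page to load.
    pub fn begin_fetch_next_page(&mut self) -> Option<FetchNextPageInfo> {
        let has_more = self
            .total_pages
            .map_or(false, |total| self.current_page < total);
        if !has_more {
            return None;
        }
        self.state = ListState::FetchingNextPage;
        Some(FetchNextPageInfo {
            page: self.current_page + 1,
            version: self.version,
        })
    }

    /// A fetch result should be discarded if the version moved on meanwhile.
    pub fn is_current(&self, version: u64) -> bool {
        self.version == version
    }
}
```

A load-more that started at version N and finishes after a refresh has moved the list to version N+1 fails the `is_current` check, so its page is dropped instead of being appended to the refreshed list.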
Implement MetadataService in wp_mobile to provide persistence for list metadata (ordered entity IDs, pagination, sync state). This enables data to survive app restarts unlike the in-memory ListMetadataStore.
The service wraps ListMetadataRepository and implements ListMetadataReader trait, allowing MetadataCollection to use either memory or database storage through the same interface.
Features:
- Read operations: get_entity_ids, get_metadata, get_state, get_pagination
- Write operations: set_items, append_items, update_pagination, delete_list
- State management: set_state, complete_sync, complete_sync_with_error
- Concurrency helpers: begin_refresh, begin_fetch_next_page
Also fixes pre-existing clippy warnings in posts.rs (collapsible_if).
Add MetadataService as a field in PostService to provide database-backed list metadata storage. This enables list structure and pagination to persist across app restarts.
Changes:
- Add `metadata_service` field to PostService
- Add `persistent_metadata_reader()` and `metadata_service()` accessors
- Add `sync_post_list()` method that orchestrates full sync flow using MetadataService for persistence
- Extend SyncResult with `current_page` and `total_pages` fields for pagination tracking
The existing in-memory `metadata_store` is preserved for backwards compatibility with existing code paths. Future work will migrate callers to use the persistent service.
Mark completed phases and update with actual commit hashes:
- Phase 1 (Database Foundation): Complete
- Phase 2 (MetadataService): Complete
- Phase 3 (Integration): Partial (3.2 done, 3.1 deferred)
- Phase 5 (Testing): Partial (tests inline with implementation)
Add status summary table and update dependency diagram with completion markers.
The repository layer should not dictate per_page values - this must be set by the service layer to match networking configuration.
Changes:
- Add PerPageMismatch error variant to SqliteDbError
- Make per_page a required parameter in get_or_create
- Make per_page required in get_or_create_and_increment_version
- Update set_items_by_list_key to require per_page
- Update append_items_by_list_key to require per_page
- Update update_state_by_list_key to require per_page
- Return error when existing header has different per_page
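A rough in-memory sketch of the `per_page` validation follows. The `DbError` enum, the fields on the variant, and the `HashMap` standing in for the `list_metadata` table are assumptions; the branch adds the variant to `SqliteDbError` and works against SQLite.

```rust
use std::collections::HashMap;

/// Stand-in for the repository error type (variant fields are hypothetical).
#[derive(Debug, PartialEq)]
pub enum DbError {
    PerPageMismatch { expected: u32, actual: u32 },
}

/// Stand-in for a list header row.
#[derive(Debug, Clone, PartialEq)]
pub struct ListHeader {
    pub per_page: u32,
    pub version: u64,
}

/// Returns the existing header for `key`, or creates one with the caller's
/// `per_page`. An existing header with a different `per_page` is an error,
/// because its stored pagination would no longer match the caller's requests.
pub fn get_or_create(
    headers: &mut HashMap<String, ListHeader>,
    key: &str,
    per_page: u32,
) -> Result<ListHeader, DbError> {
    match headers.get(key) {
        Some(existing) if existing.per_page != per_page => Err(DbError::PerPageMismatch {
            expected: existing.per_page,
            actual: per_page,
        }),
        Some(existing) => Ok(existing.clone()),
        None => {
            let header = ListHeader { per_page, version: 0 };
            headers.insert(key.to_string(), header.clone());
            Ok(header)
        }
    }
}
```

Making `per_page` a required argument forces the service layer to state the value it is paging with on every call, rather than letting the repository pick a default that may disagree with the network layer.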
Update schema validation tests to use `get_table_column_names` helper, which verifies that column enum indices match actual database schema positions via PRAGMA table_info. This matches the pattern used by posts repository tests and provides stronger guarantees that the column enums won't break if migrations reorder columns.
…tion
Merge the polished ListMetadataRepository from feature/list-metadata-repository while preserving concurrency helpers that were intentionally kept on this branch.
Key changes from feature/list-metadata-repository:
- Convert to associated functions (no &self parameters)
- Add ListKey newtype for type-safe list key handling
- Add per_page as a required parameter with validation
- Use batch insert for list metadata items
- Add FK from list_metadata_items to list_metadata
- Store ListState as INTEGER instead of TEXT
- Add get_or_create_and_increment_version for atomic operations
- Add reset_stale_fetching_states method
- Split get_items and get_state into by_list_metadata_id and by_list_key variants
Preserved from this branch:
- Concurrency helpers: begin_refresh, begin_fetch_next_page, complete_sync, complete_sync_with_error (for service layer orchestration)
- posts.rs: select_modified_gmt_by_ids method (for staleness detection)
Note: wp_mobile code needs to be updated to use the new associated function API.
Update all usages to work with the new associated function style and `ListKey` newtype from the merged feature/list-metadata-repository.
Key API changes:
- Use `ListKey` newtype instead of `&str` for type-safe key handling
- Convert instance method calls to associated function calls (e.g., `self.repo.get_items(...)` → `ListMetadataRepository::get_items_by_list_key(...)`)
- Add `per_page` parameter where required by the new API
- Use `list_metadata_id: RowId` for `complete_sync`/`complete_sync_with_error`
New convenience methods added to MetadataService:
- `complete_sync_by_key`: Lookup list by key and complete sync
- `complete_sync_with_error_by_key`: Lookup list by key and set error state
Changes:
- Update `ListMetadataReader` trait to use `&ListKey`
- Update `MetadataCollection` to store `ListKey` instead of `String`
- Update `MetadataService` to use associated functions and `ListKey`
- Update `PersistentPostMetadataFetcherWithEditContext` to use `ListKey`
- Update `PostService` methods to use `&ListKey` and new API
- Update all tests to use `ListKey::from()` for key creation
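For readers unfamiliar with the newtype pattern, a `ListKey` along these lines would give the type safety described above. Only the `ListKey` name and `ListKey::from()` appear in the commit message; the derives, `as_str`, and the `item_count` helper are assumptions.

```rust
use std::collections::HashMap;

/// Newtype around the list key string, so an arbitrary `&str`
/// cannot be passed where a list key is expected.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ListKey(String);

impl ListKey {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl From<&str> for ListKey {
    fn from(value: &str) -> Self {
        ListKey(value.to_string())
    }
}

impl From<String> for ListKey {
    fn from(value: String) -> Self {
        ListKey(value)
    }
}

/// Hypothetical reader function: the signature only accepts `&ListKey`,
/// so callers must construct the key explicitly.
pub fn item_count(lists: &HashMap<ListKey, Vec<i64>>, key: &ListKey) -> usize {
    lists.get(key).map_or(0, |ids| ids.len())
}
```

A call such as `item_count(&lists, "site_1:edit:posts:publish")` no longer compiles; the caller has to write `ListKey::from("site_1:edit:posts:publish")`, which makes accidental use of an unrelated string less likely.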
Resolve conflicts in list_metadata.rs by preserving concurrency helpers (begin_refresh, begin_fetch_next_page, complete_sync, complete_sync_with_error) that were removed in the feature branch polish but are needed for the prototype.
Documents the planned refactoring to improve separation of concerns in the metadata sync infrastructure:
- Move workflow orchestration from ListMetadataRepository to new MetadataSyncManager (stateless, composes repository primitives)
- Introduce SyncSession with RAII pattern for automatic error cleanup
- Update MetadataService API with begin_sync() returning SyncSession
- Simplify PostService by eliminating manual error handling boilerplate
Design rationale:
- RAII chosen over async closures (complex) and traits (hard for newcomers)
- Scales to many entity services (Comments, Media, Users, etc.)
- Keeps explicit control flow while automating cleanup
Includes 6-phase implementation plan with verification checklist.
Introduces MetadataSyncManager in wp_mobile to manage sync workflow state transitions. This separates orchestration logic from the repository layer, keeping ListMetadataRepository focused on pure SQL operations.
Changes:
- Add MetadataSyncManager with begin_refresh, begin_fetch_next_page, complete_sync, and complete_sync_with_error methods
- Add RefreshInfo and FetchNextPageInfo structs
- Add log dependency to wp_mobile for debug logging
- Add 13 unit tests covering all sync workflows
Moves workflow orchestration logic from the repository layer to MetadataSyncManager. This keeps the repository focused on pure SQL operations while the sync manager handles state transitions.
Changes:
- Update MetadataService to use MetadataSyncManager instead of ListMetadataRepository for begin_refresh, begin_fetch_next_page, complete_sync, and complete_sync_with_error
- Remove concurrency helpers section from ListMetadataRepository
- Remove RefreshInfo and FetchNextPageInfo structs from repository (now defined in MetadataSyncManager)
Introduces SyncSession for automatic sync error cleanup via the RAII pattern. When a session is dropped without calling complete(), it automatically marks the sync as failed, ensuring cleanup even on panics or early ? returns.
Changes:
- Add SyncSession struct with Drop impl for automatic error cleanup
- Add complete() method to mark successful sync completion
- Add begin_sync() to MetadataService for creating sessions
- Add 7 unit tests for RAII cleanup behavior
- Export SyncSession from sync module
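The RAII idea can be shown in a few lines. This is a single-threaded sketch in which a shared `Rc<RefCell<ListState>>` stands in for the database-backed list state; the `begin` constructor, the `completed` flag, and the error message are assumptions.

```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug, Clone, PartialEq)]
pub enum ListState {
    Idle,
    FetchingFirstPage,
    Error(String),
}

/// Guard for one sync attempt. Dropping it without calling `complete()`
/// records the sync as failed.
pub struct SyncSession {
    state: Rc<RefCell<ListState>>,
    completed: bool,
}

impl SyncSession {
    pub fn begin(state: Rc<RefCell<ListState>>) -> Self {
        *state.borrow_mut() = ListState::FetchingFirstPage;
        Self { state, completed: false }
    }

    /// Consumes the session and marks the sync as successful.
    pub fn complete(mut self) {
        *self.state.borrow_mut() = ListState::Idle;
        self.completed = true;
        // `self` is dropped here; `Drop` sees `completed == true` and does nothing.
    }
}

impl Drop for SyncSession {
    fn drop(&mut self) {
        if !self.completed {
            *self.state.borrow_mut() =
                ListState::Error("sync dropped before completion".to_string());
        }
    }
}
```

Because `complete()` takes `self` by value, a session cannot be completed twice, and any early return through `?` between `begin` and `complete` leaves the list in the error state rather than stuck in a fetching state.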
Adds convenience methods that work with SyncSession for cleaner sync code in PostService and future entity services.
Changes:
- Add store_for_session() for automatic set/append based on is_first_page
- Add update_pagination_for_session() for session-aware pagination updates
Update `fetch_and_store_metadata_persistent` to use the new SyncSession pattern for RAII-based error cleanup. This eliminates verbose manual error handling - the session's Drop impl automatically marks sync as failed if not completed.
Changes:
- Add `From<WpServiceError> for FetchError` conversion for clean `?` usage
- Replace manual begin_refresh/begin_fetch_next_page with `begin_sync()`
- Replace set_items/append_items with `store_for_session()`
- Replace update_pagination with `update_pagination_for_session()`
- Replace manual complete_sync with `session.complete()`
- Remove unused `RowId` import
- Replace println!-based logging with log::debug!
Remove `begin_refresh` and `begin_fetch_next_page` from the public API as they are now superseded by `begin_sync()` which returns a `SyncSession`. The underlying functionality is tested in `MetadataSyncManager` tests.
Changes:
- Remove `begin_refresh` and `begin_fetch_next_page` methods
- Remove unused `RefreshInfo` and `FetchNextPageInfo` imports
- Update doc comments for `complete_sync` and `complete_sync_with_error`
- Remove 3 redundant tests (functionality tested in MetadataSyncManager)
- Update `test_list_metadata_reader_get_list_info_with_state` to use `set_state`
Mark the SyncSession refactoring as complete with verification checklist updates, resolved open questions, and implementation summary.
Changes:
- Mark completed items in verification checklist
- Add resolutions to open questions
- Add implementation summary with commit hashes for each phase
- Add implementation notes for naming decisions and scope
Replaced the sync session refactoring document with a new design that has MetadataService own the sync lifecycle through refresh/load_more orchestration methods.
Changes:
- Add metadata_service_orchestration.md with new design
- Remove sync_session_refactoring.md (superseded)
Added async `refresh()` method that owns the full sync lifecycle for fetching the first page of a list. This method:
- Increments version (invalidates in-flight load-more)
- Sets state to FetchingFirstPage
- Calls fetcher with (page=1, per_page)
- Stores metadata (replacing existing)
- Updates pagination
- Sets state to Idle (or Error on failure)
Changes:
- Add `refresh<F, Fut>()` async method taking a fetcher closure
- Add comprehensive tests for the refresh functionality
- Import FetchError and Future for async support
Added async `load_more()` method that owns the full sync lifecycle for fetching subsequent pages of a list. This method:
- Gets current state and determines next page
- Validates there are more pages to load
- Sets state to FetchingNextPage
- Calls fetcher with (next_page, per_page from refresh)
- Checks version for stale detection (discards if refresh happened)
- Appends metadata to existing items
- Updates pagination
- Sets state to Idle (or Error on failure)
Changes:
- Add `load_more<F, Fut>()` async method taking a fetcher closure
- Add comprehensive tests for load_more functionality
… (Phase 3)
Updated `fetch_and_store_metadata_persistent` and related code to use the new `MetadataService::refresh` and `load_more` orchestration methods.
Key changes:
- Remove `page` parameter from `fetch_and_store_metadata_persistent` (MetadataService now determines the page internally)
- Update `MetadataFetcher` trait to remove `page` parameter
- Update `PersistentPostMetadataFetcherWithEditContext` implementation
- Update `MetadataCollection::refresh` and `load_next_page` to not pass page
This simplifies the API: callers only specify `is_first_page` and the orchestration layer handles page tracking internally.
Updated `sync_post_list` to use the new `MetadataService::refresh` and `load_more` orchestration methods for metadata sync.
Key changes:
- Remove `page` parameter from `sync_post_list` (MetadataService now determines the page internally)
- Use `refresh()` for first page, `load_more()` for subsequent pages
- Keep entity-specific logic (stale detection, full-post fetching)
- Simplified code from ~80 lines to ~45 lines
SyncSession was an RAII wrapper for sync lifecycle management. With the new `refresh()` and `load_more()` methods on MetadataService handling the lifecycle internally, SyncSession is no longer needed.
Changes:
- Remove `sync_session.rs` module entirely
- Remove `SyncSession` export from `sync/mod.rs`
- Remove `begin_sync`, `store_for_session`, `update_pagination_for_session` methods from `MetadataService`
- Update docstrings to remove SyncSession references
MetadataSyncManager is kept as it provides clean workflow abstractions used by refresh() and load_more().
Removed unused methods:
- `complete_sync_by_key` - was used by old sync_post_list, now unused
- `complete_sync_with_error_by_key` - was used by old sync_post_list
- `complete_sync_with_error` (as method) - refresh/load_more call MetadataSyncManager directly
Remove unnecessary cast in load_next_page comparison.
Mark verification checklist items as complete and add implementation notes documenting the deviations from the original plan.
Merge MetadataSyncManager workflow logic into MetadataService as private helper methods, eliminating the dual-path issue where some repository calls went through MetadataSyncManager while others went directly to ListMetadataRepository.
Changes:
- Add private helpers: `begin_refresh`, `begin_load_more`, `complete_sync`, `complete_sync_with_error`
- Remove `MetadataSyncManager` struct and module
- Update `refresh()` and `load_more()` to use new helpers
- Update design doc to reflect the change
Replace `let _ = self.complete_sync_with_error(...)` pattern with proper error logging. This ensures we're notified if cleanup operations fail, rather than silently ignoring potential issues that could leave lists stuck in incorrect states.
Move entity fetching responsibility from MetadataCollection to PostService. The service layer now decides whether to fetch missing/stale entities based on the SyncStrategy parameter.
Changes:
- Add SyncStrategy enum (MetadataOnly, Full) to control sync behavior
- Add sync_list and sync_list_with_strategy to PostService
- Simplify MetadataFetcher trait to single sync() method
- Remove sync_missing_and_stale from MetadataCollection
- Update PersistentPostMetadataFetcherWithEditContext to delegate to sync_list
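As a small illustration of how the strategy could gate entity fetching, the sketch below uses the two variants named in the commit; the `ids_to_fetch` helper is hypothetical and stands in for the decision made inside `sync_list_with_strategy`.

```rust
/// Controls how much work a sync performs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SyncStrategy {
    /// Only refresh list structure (id + modified_gmt).
    MetadataOnly,
    /// Refresh list structure, then fetch missing or stale entities.
    Full,
}

/// Hypothetical helper: decides which of the missing or stale IDs
/// should be fetched in full under the given strategy.
pub fn ids_to_fetch(strategy: SyncStrategy, missing_or_stale: &[i64]) -> Vec<i64> {
    match strategy {
        SyncStrategy::MetadataOnly => Vec::new(),
        SyncStrategy::Full => missing_or_stale.to_vec(),
    }
}
```

Putting the decision in the service keeps the collection free of fetching logic: it only reads list structure and entity state.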
Replace trait-based MetadataFetcher with composition pattern:
- Rename `MetadataCollection` to `MetadataCollectionCore`
- Remove generic parameter - core handles query infrastructure only
- Entity-specific collections compose core and own their fields
- `PostMetadataCollectionWithEditContext` now stores filter, endpoint_type
- Delete `MetadataFetcher` trait and `PersistentPostMetadataFetcherWithEditContext`
This simplifies the architecture by eliminating the indirection through the fetcher trait. Entity-specific collections call the service directly for sync operations while delegating query methods to the core.
- Add metadata_collection_architecture.md explaining design decisions
- Add DOCUMENTATION_UPDATE_REPORT.md identifying outdated docs
- Remove metadata_collection_composition.md (implementation doc)
Please ignore this PR for now; I am just opening a draft PR so I can access the built artifacts from WPAndroid.