connectors/refiner/README.md
+32 -25 (32 additions, 25 deletions)
@@ -21,9 +21,9 @@ Refer to the [Connector SDK Setup Guide](https://fivetran.com/docs/connectors/co
21
21
- Processes all paginated data automatically using cursor-based pagination for large datasets
22
22
- Implements exponential backoff for API reliability (3 retries with progressive delays)
23
23
- Flattens nested JSON structures into table columns automatically
24
-
- Checkpoint strategy ensures resumability for large datasets (every 1000 records)
24
+
- Checkpoints progress during pagination to ensure resumability for large datasets
25
25
- Extracts and normalizes nested arrays (questions, answers) into child tables with foreign keys
26
-
- User-level data keyed by user ID for joining with product usage data
26
+
- Keys user-level data by user ID for joining with product usage data
27
27
28
28
## Configuration file
29
29
The configuration requires your Refiner API key and optionally a start date for the initial sync.
@@ -49,21 +49,22 @@ Note: The `fivetran_connector_sdk:latest` and `requests:latest` packages are pre
49
49
## Authentication
50
50
The connector uses Bearer token authentication via the `Authorization` header. To obtain your API key:
51
51
52
-
1. Log in to your Refiner account.
52
+
1. Log in to your [Refiner](https://refiner.io) account.
53
53
2. Go to **Settings** > **Integrations** > **API**.
54
54
3. Copy your API key.
55
55
4. Add the API key to your `configuration.json` file as shown above.
56
56
57
57
The API key is included in every request as `Authorization: Bearer YOUR_API_KEY`.
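
As a minimal sketch of the header described above: the helper name `build_headers` and the `Accept` header are illustrative assumptions, not taken from the connector source.

```python
from typing import Dict


def build_headers(api_key: str) -> Dict[str, str]:
    """Return request headers carrying the API key as a Bearer token.

    Hypothetical helper: the real connector may build its headers differently.
    """
    if not api_key or not api_key.strip():
        raise ValueError("Missing required configuration value: api_key")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        # Assumption: the API returns JSON; this header is not stated in the README.
        "Accept": "application/json",
    }
```

Rejecting an empty key before any request is sent keeps a configuration mistake from surfacing later as an opaque authentication failure.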
58
58
59
59
## Pagination
60
-
The connector handles pagination automatically using the Refiner API's page-based pagination structure. The API supports the following pagination parameters:
61
-
- `page` - Current page number (starts at 1)
60
+
The connector handles pagination automatically using the Refiner API's cursor-based pagination structure. The API supports the following pagination parameters:
61
+
- `page` - Current page number (starts at 1) - used as fallback
62
62
- `page_length` - Number of items per page (default: 100)
63
-
- `next_page_cursor` - Optional cursor for cursor-based pagination
63
+
- `next_page_cursor` - Cursor token for cursor-based pagination
64
64
65
-
The connector uses page-based pagination with automatic detection of the last page:
66
-
- Each sync processes all paginated data completely using the `pagination.current_page` and `pagination.last_page` response fields.
65
+
The connector uses cursor-based pagination for optimal performance with large datasets:
66
+
- Each sync processes all paginated data completely using the `pagination.next_page_cursor` response field.
67
+
- Cursor-based pagination is more efficient than page-based pagination for large datasets and is recommended by the Refiner API documentation.
67
68
- Pagination state is not persisted between sync runs for cleaner state management.
68
69
- Uses the `date_range_start` parameter to filter responses from the API directly for incremental syncs.
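
The cursor loop described in this section could look roughly like the sketch below. It assumes each response carries its records under an `items` key and that the cursor is sent back under the same `next_page_cursor` name; both are assumptions, as is the injected `fetch_page` callable standing in for the real HTTP call.

```python
from typing import Callable, Dict, Iterator, Optional


def iterate_items(
    fetch_page: Callable[[Dict[str, object]], dict],
    page_length: int = 100,
    date_range_start: Optional[str] = None,
) -> Iterator[dict]:
    """Yield every record across all pages by following next_page_cursor."""
    params: Dict[str, object] = {"page_length": page_length}
    if date_range_start:
        # Server-side filter used for incremental syncs.
        params["date_range_start"] = date_range_start

    while True:
        # Pass a copy so the caller never sees later mutations of params.
        payload = fetch_page(dict(params))
        # Assumption: records live under "items".
        for item in payload.get("items") or []:
            yield item

        pagination = payload.get("pagination") or {}
        cursor = pagination.get("next_page_cursor")
        if not cursor:
            break  # No cursor means the last page was reached.
        # Assumption: the request parameter shares the response field's name.
        params["next_page_cursor"] = cursor
```

Because the cursor lives only in the local `params` dictionary, nothing about pagination needs to be written to connector state, which matches the note above that pagination state is not persisted between runs.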
69
70
@@ -82,30 +83,33 @@ The connector processes survey and response data with an optimized incremental s
82
83
- **respondents** - User/contact information keyed by user ID (parent for responses)
83
84
84
85
### Incremental sync strategy
85
-
- Initial sync uses `start_date` from configuration (if provided) or EPOCH time (1970-01-01T00:00:00Z) as fallback
86
-
- Incremental syncs use `last_response_sync` timestamp from state to fetch only new/updated responses since last successful sync
87
-
- State tracks separate timestamps for surveys and responses
86
+
- **Responses**: Incremental sync using `last_response_sync` timestamp from state to fetch only new/updated responses since last successful sync
87
+
- **Surveys and Contacts**: Full sync on every run (the Refiner API does not support date filtering for these endpoints)
88
+
- Initial response sync uses `start_date` from configuration (if provided) or EPOCH time (1970-01-01T00:00:00Z) as fallback
88
89
- Checkpoint every 1000 records during large response syncs to enable resumability
90
+
- Checkpoint after each page for surveys and contacts to preserve progress
89
91
- Final checkpoint saves the complete state only after successful sync completion
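
The response strategy above can be sketched as follows. The `upsert` and `checkpoint` callables are injected stand-ins for whatever the connector actually calls, and the `updated_at` field name plus the assumption that records arrive in ascending timestamp order are illustrative, not confirmed by the README.

```python
from typing import Callable, Dict, Iterable

EPOCH = "1970-01-01T00:00:00Z"
CHECKPOINT_INTERVAL = 1000  # Interval stated in the README.


def resolve_start(state: Dict[str, str], configuration: Dict[str, str]) -> str:
    """Pick the sync start: saved state first, then start_date, then EPOCH."""
    return state.get("last_response_sync") or configuration.get("start_date") or EPOCH


def sync_responses(
    responses: Iterable[dict],
    state: Dict[str, str],
    upsert: Callable[[str, dict], None],
    checkpoint: Callable[[Dict[str, str]], None],
    interval: int = CHECKPOINT_INTERVAL,
) -> Dict[str, str]:
    """Upsert responses, checkpointing every `interval` records and once at the end."""
    newest = state.get("last_response_sync", EPOCH)
    count = 0
    for record in responses:
        upsert("responses", record)
        count += 1
        # Assumption: timestamps are same-format ISO-8601 UTC strings,
        # so plain string comparison orders them correctly.
        updated_at = record.get("updated_at")
        if updated_at and updated_at > newest:
            newest = updated_at
        if count % interval == 0:
            # Interim checkpoint so an interrupted sync can resume from here.
            checkpoint({**state, "last_response_sync": newest})
    final_state = {**state, "last_response_sync": newest}
    checkpoint(final_state)  # Final checkpoint after the loop completes.
    return final_state
```

Interim checkpoints are only safe to resume from if the API returns records oldest first; if that ordering does not hold, the cursor should advance only in the final checkpoint.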
- **Array handling** - Arrays converted to JSON strings when stored in parent tables, or normalized to child tables
94
-
- **Child table extraction** - Questions extracted from survey config, answers extracted from response data
95
+
- **Array handling** - Arrays converted to JSON strings when stored in parent tables, or normalized to child tables when appropriate
96
+
- **Child table extraction** - Questions extracted from survey config (`config.form_elements`) and answers extracted from response data are stored in dedicated child tables to preserve relational structure
97
+
- **Smart exclusion** - Relational data like `form_elements` is excluded from the flattened parent table to avoid duplication, as it's already normalized into the questions table
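
A minimal sketch of flattening with exclusion, under stated assumptions: the signature, the `_` separator, and the `exclude` parameter are hypothetical and may differ from the connector's actual `flatten_dict()`.

```python
import json
from typing import AbstractSet, Any, Dict


def flatten_dict(
    data: Dict[str, Any],
    parent_key: str = "",
    sep: str = "_",
    exclude: AbstractSet[str] = frozenset(),
) -> Dict[str, Any]:
    """Flatten nested dicts into column names, skipping excluded keys."""
    flat: Dict[str, Any] = {}
    for key, value in data.items():
        if key in exclude:
            continue  # Already normalized into a child table elsewhere.
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_dict(value, column, sep, exclude))
        elif isinstance(value, list):
            # Arrays kept in the parent table are stored as JSON strings.
            flat[column] = json.dumps(value)
        else:
            flat[column] = value
    return flat


survey = {
    "uuid": "s1",
    "config": {"name": "NPS", "form_elements": [{"id": 1}]},
    "tags": ["a"],
}
row = flatten_dict(survey, exclude={"form_elements"})
# row == {"uuid": "s1", "config_name": "NPS", "tags": '["a"]'}
```

Here `form_elements` never reaches the parent row, so the questions table stays the single home for that data.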
- `validate_configuration()` - Validates required API key configuration
100
103
- `make_api_request()` - Centralized API calling with retry logic and error handling
101
-
- `flatten_dict()` - Recursive JSON structure flattening for table columns
102
-
- `fetch_surveys()` - Main survey sync with pagination and question extraction
103
-
- `fetch_questions()` - Extract questions from survey configuration
104
-
- `fetch_responses()` - Incremental response sync with date-based filtering
105
-
- `fetch_answers()` - Extract answers from response data
104
+
- `flatten_dict()` - Recursive JSON structure flattening for table columns with smart exclusion of relational data
105
+
- `fetch_surveys()` - Main survey sync with pagination, question extraction, and page-level checkpointing
106
+
- `fetch_questions()` - Extract questions from survey configuration into child table
107
+
- `fetch_contacts()` - Full contact sync with pagination and page-level checkpointing
108
+
- `fetch_responses()` - Incremental response sync with date-based filtering and record-level checkpointing
109
+
- `fetch_answers()` - Extract answers from response data into child table
106
110
- `fetch_respondent()` - Extract or update respondent information
107
111
108
-
The connector maintains a clean state with `last_survey_sync` and `last_response_sync` timestamps, automatically advancing after each successful sync to ensure reliable incremental syncs without data duplication or gaps.
112
+
The connector maintains a clean state with the `last_response_sync` timestamp for incremental response syncing, automatically advancing after each successful sync to ensure reliable incremental syncs without data duplication or gaps. Surveys and contacts are fully synced on each run.
109
113
110
114
## Error handling
111
115
The connector implements comprehensive error handling with multiple layers of protection:
@@ -122,16 +126,18 @@ The connector implements comprehensive error handling with multiple layers of pr
122
126
123
127
### Data processing safeguards
124
128
- Graceful handling of missing or malformed API response structures
125
-
- Safe dictionary access patterns with `.get()` to prevent KeyError exceptions
129
+
- Safe dictionary access patterns with `.get()` and type checks to prevent AttributeError and KeyError exceptions
126
130
- Skips records missing required identifiers (uuid) with warnings
127
-
- Proper exception propagation with descriptive RuntimeError messages
131
+
- Error handling for malformed timestamps with warning logs
132
+
- Proper exception propagation with descriptive RuntimeError messages from API layer
128
133
129
134
### Checkpoint recovery
130
-
- Checkpoints every 1000 records during large syncs enable recovery from interruptions
135
+
- Checkpoints after each page during survey and contact syncs to preserve progress
136
+
- Checkpoints every 1000 records during large response syncs enable recovery from interruptions
131
137
- State tracking allows sync to resume from the last successful checkpoint
132
-
- Final checkpoint only saved after a complete successful sync
138
+
- Final checkpoint saved after complete successful sync
133
139
134
-
All exceptions are caught at the top level in the `update()` function and re-raised as `RuntimeError` with descriptive messages, making troubleshooting easier for users and Fivetran support.
140
+
Unhandled exceptions in the `update()` function will propagate and be logged by the Fivetran platform for troubleshooting. The connector's error handling strategy focuses on resilience at the API request level and safe data processing with proper validation.
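
The request-level resilience mentioned here, together with the "3 retries with progressive delays" listed under features, might be sketched as below. The helper name, the one-second base delay, the doubling factor, and retrying only on `ConnectionError` are all assumptions; the actual `make_api_request()` may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying with exponentially growing delays on failure."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except ConnectionError as exc:
            if attempt == max_retries:
                # Surface a descriptive error once retries are exhausted.
                raise RuntimeError(
                    f"API request failed after {max_retries} retries"
                ) from exc
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")  # The loop always returns or raises.
```

Injecting `sleep` keeps the helper testable without real waiting; production code would leave the default `time.sleep` in place.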
135
141
136
142
## Tables created
137
143
@@ -174,7 +180,8 @@ The connector creates the following tables in your destination:
174
180
**respondents** table:
175
181
- User/contact information keyed by user ID for joins with product usage data
The examples provided are intended to help you effectively use Fivetran's Connector SDK. While we've tested the code, Fivetran cannot be held responsible for any unexpected or negative consequences that may arise from using these examples. For inquiries, please reach out to our Support team.