Name
README.md
blog1_sink_connector_config.txt
blog1_source_connector_config.txt
blog2_default_pipeline.txt
blog2_elasticsearch_mapping.txt
blog2_ingest_pipeline.txt
blog3_camel_kafka_connector_config.txt
blog3_elasticsearch_strict.txt
blog3_example_error.json
blog4_datagen_config.txt
blog4_prometheus.yml
blog5_elasticsearch_change_default_shards.txt
tide_data_example.json


Real-Time Tide Data Processing Pipeline

Description

This blog series shows how to build a "zero-code" real-time Tide Data Processing Pipeline using Apache Kafka, Kafka Connect, Elasticsearch, and Kibana, running on multiple Instaclustr managed services and clusters.

Along the way we also add Apache Camel Kafka Connector, Prometheus, and Grafana to the mix. There is no code here (that is largely the point of Kafka connectors), but there are plenty of example Kafka connector configurations, along with a Prometheus configuration. The full explanation of which connectors were used and how to deploy, configure, run, and monitor them is in the blogs.

Blog Series

Getting Started

These are configuration examples from the 2021 blog series. The included files are Kafka Connect connector configs, Elasticsearch mappings, ingest pipelines, Prometheus monitoring configs, and sample data.

Prerequisites

These configurations were designed to work with the Instaclustr managed services for Apache Kafka, Apache Kafka Connect, Elasticsearch, and PostgreSQL, plus other open source software including Apache Camel Kafka Connectors and Apache Superset. They worked correctly with the versions supported at the time (2021). Note that Elasticsearch has since been replaced by OpenSearch, and Instaclustr now provides a managed Kafka Sink Connector for OpenSearch.

Additional tools and connectors used:

Apache Camel Kafka Connectors
Kafka REST Source Connector
Kafka Elasticsearch Sink Connector (Confluent) (for error handling testing)
Kafka Connect DataGen Source Connector (for synthetic load testing)
Prometheus and Grafana (for monitoring)
Custom Kafka PostgreSQL Sink Connector (Blog 6)
Apache Superset (Blog 7)

Configuration

This directory contains the example configuration files and sample data. See the individual blogs for details on each connector type.

Deployment

For current instructions, see the latest Instaclustr support documentation for Kafka Connect.

Deploy connectors using the Kafka Connect REST API or the Instaclustr console. Refer to the individual blog posts for step-by-step deployment instructions for each connector type.
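As a rough illustration of the REST API route, the sketch below builds a `PUT /connectors/{name}/config` request, which is the standard Kafka Connect endpoint for creating or updating a connector. Everything specific in it is an assumption for illustration only: the cluster URL, the connector name, the topic, and the connector class are hypothetical placeholders, not values taken from the files in this directory. Authentication is left out; a managed cluster will typically need credentials added to the request.

```python
import json
import urllib.parse
import urllib.request


def build_deploy_request(connect_url, name, config):
    """Build a PUT /connectors/{name}/config request.

    In the Kafka Connect REST API this endpoint creates the connector
    if it does not exist, or updates its configuration if it does.
    """
    safe_name = urllib.parse.quote(name, safe="")
    url = f"{connect_url.rstrip('/')}/connectors/{safe_name}/config"
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def deploy(request, timeout=10):
    """Send the request and return the parsed JSON response (needs a live cluster)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Hypothetical values: replace with your own cluster URL, connector name,
# and the configuration from the relevant blog's config file.
example_config = {
    "connector.class": "com.example.HypotheticalSinkConnector",  # placeholder class
    "tasks.max": "1",
    "topics": "tides-topic",  # placeholder topic
}
request = build_deploy_request(
    "https://connect.example.com:8083", "tides-sink", example_config
)
print(request.get_method(), request.full_url)
```

Calling `deploy(request)` would send the request. It is kept separate from building the request so the payload and URL can be inspected before anything is sent to a cluster.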

Authors

Paul Brebner - Initial work - NetApp Instaclustr

See also the list of MAINTAINERS who participated in projects in this repository.

License

This project is licensed under the Apache 2.0 license.
