This series looks at how to build a "zero-code" real-time Tide Data Processing Pipeline using Apache Kafka, Kafka Connect, Elasticsearch, and Kibana with multiple Instaclustr managed services/clusters.
Along the way we also add Apache Camel Kafka Connectors, Prometheus, and Grafana to the mix. There is no code (that is the point of Kafka connectors), but there are many example Kafka connector configurations, plus Prometheus configuration. The full explanation of which connectors were used, and how to deploy, configure, run, and monitor them, is in the blog posts listed below.
- Part 1: Data Processing Pipeline
- Part 2: Data Processing Pipeline - Part 2
- Part 3: Getting to Know Apache Camel Kafka Connectors
- Part 4: Monitoring Kafka Connect Pipeline Metrics With Prometheus
- Part 5: Scaling Kafka Connect Streaming Data Processing
- Part 6: Kafka Postgres Connector
- Part 7: Apache Superset
- Part 8: Kafka Connect Elasticsearch
- Part 9: PostgreSQL
- Part 10: Kafka Connect Pipelines Conclusion
These are configuration examples from the 2021 blog series. The included files are Kafka Connect connector configs, Elasticsearch mappings, ingest pipelines, Prometheus monitoring configs, and sample data.
These configurations were designed to work with the Instaclustr managed services for Apache Kafka, Apache Kafka Connect, Elasticsearch, and PostgreSQL, and with other open source software including Apache Camel Kafka Connectors and Apache Superset. They worked correctly with the versions supported at the time (2021). Note that Elasticsearch has since been replaced by OpenSearch on the Instaclustr platform, and Instaclustr now provides a managed Kafka Sink Connector for OpenSearch.
Additional tools and connectors used:
- Apache Camel Kafka Connectors
- Kafka REST Source Connector
- Kafka Elasticsearch Sink Connector (Confluent) (for error handling testing)
- Kafka Connect DataGen Source Connector (for synthetic load testing)
- Prometheus and Grafana (for monitoring)
- Custom Kafka PostgreSQL Sink Connector (Part 6)
- Apache Superset (Part 7)
See the blog post for each connector type for further details on the example configuration files and data.
The latest support documentation for Instaclustr Kafka Connect is here.
Deploy connectors using the Kafka Connect REST API or the Instaclustr console. Refer to the individual blog posts for step-by-step deployment instructions for each connector type.
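As a rough illustration of the REST API route, the sketch below builds the two requests most deployments need: `PUT /connectors/{name}/config` to create or update a connector, and `GET /connectors/{name}/status` to check it. It is not taken from the blog series. The base URL, connector name, topic, and Elasticsearch address are hypothetical placeholders, and authentication is left out, so a managed cluster will likely need HTTPS and credentials added.

```python
import json
import urllib.request

# Assumption: a Kafka Connect worker reachable at this address (hypothetical).
CONNECT_URL = "http://localhost:8083"

# Hypothetical example config for the Confluent Elasticsearch sink connector.
# The topic name and connection URL are placeholders, not values from the blogs.
EXAMPLE_CONFIG = {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "tasks.max": "1",
    "topics": "tides-topic",
    "connection.url": "http://localhost:9200",
    "key.ignore": "true",
    "schema.ignore": "true",
}


def build_deploy_request(base_url, name, config):
    """Build a PUT /connectors/{name}/config request (creates or updates)."""
    url = f"{base_url.rstrip('/')}/connectors/{name}/config"
    return urllib.request.Request(
        url,
        data=json.dumps(config).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


def build_status_request(base_url, name):
    """Build a GET /connectors/{name}/status request."""
    url = f"{base_url.rstrip('/')}/connectors/{name}/status"
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def send(request, timeout=10):
    """Send a request to the Connect worker and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage against a running worker (not executed here):
#   send(build_deploy_request(CONNECT_URL, "es-sink", EXAMPLE_CONFIG))
#   print(send(build_status_request(CONNECT_URL, "es-sink")))
```

Using `PUT` on the `/config` endpoint rather than `POST /connectors` keeps the call idempotent, so rerunning it after editing a config file updates the existing connector instead of failing because the name already exists.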
- Paul Brebner - Initial work - NetApp Instaclustr
See also the list of MAINTAINERS who participated in projects in this repository.
This project is licensed under the Apache 2.0 license.