Closed
Description
Search before asking
- I have searched the issues and found no similar issues.
What happened
When running an HBase -> HBase pipeline with the SeaTunnel connector-v2 HBase source, the job finishes successfully but the Total Read Count is much smaller than the real row count in HBase.
- HBase shell count 'assign_cf_table', {COLUMNS=>'cf1', CACHE=>10000} returns 10,000,000 rows
- SeaTunnel job summary shows Total Read Count / Total Write Count around 2460778
- Logs show multiple splits were assigned, e.g. "Assigning 4 splits to subtask: 0", but the source finishes quickly.
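To put the gap in numbers, here is a minimal arithmetic sketch using only the figures quoted above. The comparison against an even four-way split is an assumption for illustration only; the report does not say how rows are distributed across the splits, so it is not a diagnosis.

```python
# Figures taken from the report above.
hbase_row_count = 10_000_000      # HBase shell count on assign_cf_table
seatunnel_read_count = 2_460_778  # approximate Total Read Count in the job summary
assigned_splits = 4               # from the "Assigning 4 splits to subtask: 0" log line

# Rows present in HBase that the job never reported reading.
missing_rows = hbase_row_count - seatunnel_read_count

# Share of the table that the job actually read.
read_fraction = seatunnel_read_count / hbase_row_count

# Hypothetical reference point: the share of one split IF rows were spread
# evenly over the assigned splits (an assumption, not stated in the report).
even_split_share = 1 / assigned_splits

print(f"missing rows: {missing_rows:,}")
print(f"fraction read: {read_fraction:.2%}")
print(f"one even split would be: {even_split_share:.2%}")
```

The fraction read comes out close to a quarter of the table, which may be worth checking against the per-split behavior, but the report alone does not confirm any link.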
SeaTunnel Version
2.3.12
SeaTunnel Config
env {
  parallelism = 1
  job.mode = "BATCH"
}
source {
  Hbase {
    zookeeper_quorum = "<zk1:2181,zk2:2181,zk3:2181>"
    table = "assign_cf_table"
    caching = 100000
    batch = 100
    cache_blocks = false
    # kerberos configs (masked)
    hbase.client.kerberos.principal = "<principal>"
    hbase.client.keytab.file = "<keytab>"
    krb5_path = "<krb5.conf>"
    hbase_extra_config = {
      "hbase.security.authentication" = "kerberos"
      "hadoop.security.authentication" = "kerberos"
      "hbase.master.kerberos.principal" = "hbase/_HOST@REALM"
      "hbase.regionserver.kerberos.principal" = "hbase/_HOST@REALM"
      "hbase.rpc.protection" = "authentication"
      "hbase.zookeeper.useSasl" = "false"
    }
    schema = {
      columns = [
        { name = "rowkey" type = string },
        { name = "cf1:id" type = string },
        { name = "cf1:name" type = string }
      ]
    }
  }
}
sink {
  Hbase {
    zookeeper_quorum = "<zk1:2181,zk2:2181,zk3:2181>"
    table = "assign_cf_table3"
    rowkey_column = ["rowkey"]
    family_name { all_columns = "cf1" }
    hbase_extra_config = {
      "hbase.security.authentication" = "kerberos"
      "hadoop.security.authentication" = "kerberos"
      "hbase.master.kerberos.principal" = "hbase/_HOST@REALM"
      "hbase.regionserver.kerberos.principal" = "hbase/_HOST@REALM"
      "hbase.rpc.protection" = "authentication"
      "hbase.zookeeper.useSasl" = "false"
    }
  }
}
Running Command
./bin/seatunnel.sh -e flink -c /path/to/hbase2hbase.conf
Error Exception
N/A (job finishes successfully; only read/write count mismatch)
Zeta or Flink or Spark Version
flink 1.20.1
Java or Scala Version
java8
Screenshots
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct