command:
gh-ost \
--max-load=Threads_running=300,Threads_connected=600 \
--critical-load=Threads_running=1000,Threads_connected=1000 \
--chunk-size=20000 \
--max-lag-millis=5000 \
--dml-batch-size=200 \
--user="root" \
--password="SECRET" \
--host=127.0.0.1 \
--port=3306 \
--throttle-control-replicas=127.0.0.1:3307 \
--gcp \
--allow-on-master \
--database="db" \
--table="users" \
--verbose \
--switch-to-rbr \
--allow-master-master \
--cut-over=default \
--exact-rowcount \
--concurrent-rowcount \
--default-retries=120 \
--panic-flag-file=/tmp/ghost.panic.flag \
--postpone-cut-over-flag-file=/tmp/ghost.postpone.flag \
--alter="MODIFY name VARCHAR(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL"
--execute
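As an aside: the --max-load and --critical-load thresholds above can be inspected and changed while gh-ost is running, through its interactive interface. A minimal sketch, assuming the default --serve-socket-file path pattern /tmp/gh-ost.<schema>.<table>.sock, which for this run would be /tmp/gh-ost.db.users.sock:

# check current migration status through the interactive socket
echo "status" | nc -U /tmp/gh-ost.db.users.sock
# raise the critical-load threshold on the fly
echo "critical-load=Threads_running=1000,Threads_connected=1500" | nc -U /tmp/gh-ost.db.users.sock

This only helps before the threshold is actually met; once critical-load trips, gh-ost bails out immediately, which is the problem described below.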
The command was successfully executed, and all that remained was to perform a cut-over to replace the tables.
2025-10-20 08:59:28 INFO Locking `db`.`users`, `db`.`_users_del`
Copy: 4025114/4025114 100.0%; Applied: 35738; Backlog: 44/1000; Time: 14m36s(total), 12m31s(copy); streamer: mysql-bin.3335455:24883896; Lag: 0.33s, HeartbeatLag: 0.25s, State: migrating; ETA: due
2025-10-20 08:59:28 INFO Copy: 4025114/4025114 100.0%; Applied: 35738; Backlog: 44/1000; Time: 14m36s(total), 12m31s(copy); streamer: mysql-bin.3335455:24883896; Lag: 0.33s, HeartbeatLag: 0.25s, State: migrating; ETA: due []
Copy: 4025114/4025114 100.0%; Applied: 35738; Backlog: 0/1000; Time: 14m37s(total), 12m31s(copy); streamer: mysql-bin.3335455:27128516; Lag: 0.47s, HeartbeatLag: 0.52s, State: migrating; ETA: due
2025-10-20 08:59:29 INFO Copy: 4025114/4025114 100.0%; Applied: 35738; Backlog: 0/1000; Time: 14m37s(total), 12m31s(copy); streamer: mysql-bin.3335455:27128516; Lag: 0.47s, HeartbeatLag: 0.52s, State: migrating; ETA: due []
2025-10-20 08:59:29 FATAL critical-load met: Threads_connected=1019, >=1000
However, at that very moment Threads_connected spiked past the critical-load limit and the process was aborted with a FATAL error.
The table lock appeared to hang for 1-2 minutes, and a connection spike was visible on the dashboard at the same time.
The worst part is that the interrupted migration cannot be resumed...
Perhaps gh-ost should not fail immediately here, but instead allow the threshold to be raised through the interactive socket so the execution can complete?
Or do you have a better idea for preventing this kind of abort right at the end?
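For what it's worth, gh-ost already ships two flags that soften the immediate bail-out and might cover this case. A hedged sketch; the interval and duration below are only example values:

# re-check after 5 seconds and abort only if critical-load is still exceeded
--critical-load-interval-millis=5000 \
# or: do not abort on critical-load at all; hibernate for 60 seconds (no reads/writes), then resume
--critical-load-hibernate-seconds=60 \

With the hibernate option the migration would presumably have waited out the connection spike instead of dying one step before cut-over.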