Commit 961a3a3
dcpdrain: Increase default NOOP interval from 1 to 60s
dcpdrain currently sets the DCP noop-interval to 1s, so the producer
will send NOOP requests to dcpdrain every 1s and dcpdrain needs to
correctly handle this request and send a response. When connecting to
clusters with high latency between client and
server nodes, it can take more than 1 second to complete setting up the DCP
connection and entering the main event loop. This means the server node may
start to send DCP noop requests before the DCP connection is
set up - and crucially before dcpdrain's event loop is ready to process the
DCP noop request. This results in dcpdrain crashing as it gets a DCP
noop request when it is expecting a control response:
Process 43094 launched: '/Users/dave/repos/couchbase/server/source/build/kv_engine/dcpdrain' (arm64)
Using DCP flow control with buffer size: 13421772
Set DCP control message: set_priority=high
Set DCP control message: supports_cursor_dropping_vulcan=true
Set DCP control message: supports_hifi_MFU=true
Set DCP control message: send_stream_end_on_client_close_stream=true
Set DCP control message: enable_expiry_opcode=true
Set DCP control message: set_noop_interval=1
Set DCP control message: enable_noop=true
Set DCP control message: enable_out_of_order_snapshots=true
2023-05-03T12:11:28.705431+01:00 CRITICAL *** Fatal error encountered during exception handling ***
2023-05-03T12:11:28.708626+01:00 CRITICAL Caught unhandled std::exception-derived exception. what(): Header::getResponse(): Header is not a response
Target 0: (dcpdrain) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00000001c24a2d98 libsystem_kernel.dylib` __pthread_kill + 8
frame #1: 0x00000001c24d7ee0 libsystem_pthread.dylib` pthread_kill + 288
frame #2: 0x00000001c2412340 libsystem_c.dylib` abort + 168
frame #3: 0x00000001c2492b08 libc++abi.dylib` abort_message + 132
frame #4: 0x00000001c2482938 libc++abi.dylib` demangling_terminate_handler() + 312
frame #5: 0x00000001c2378330 libobjc.A.dylib` _objc_terminate() + 160
frame #6: 0x000000010008ef30 dcpdrain` backtrace_terminate_handler() + 752 at terminate_handler.cc:88
frame #7: 0x00000001c2491ea4 libc++abi.dylib` std::__terminate(void (*)()) + 20
frame #8: 0x00000001c2494c1c libc++abi.dylib` __cxxabiv1::failed_throw(__cxxabiv1::__cxa_exception*) + 36
frame #9: 0x00000001c2494bc8 libc++abi.dylib` __cxa_throw + 140
frame #10: 0x000000010002a77c dcpdrain` BinprotResponse::getTracingData() const [inlined] cb::mcbp::Header::getResponse(this=0x00006000002044a0) const + 48 at header.h:134
frame #11: 0x000000010002a74c dcpdrain` BinprotResponse::getTracingData() const [inlined] BinprotResponse::getResponse(this=<unavailable>) const at client_mcbp_commands.cc:487
frame #12: 0x000000010002a74c dcpdrain` BinprotResponse::getTracingData(this=0x000000016fdfef90) const + 188 at client_mcbp_commands.cc:373
frame #13: 0x000000010002a638 dcpdrain` MemcachedConnection::recvResponse(this=0x0000000101604080, response=0x000000016fdfef90, opcode=<unavailable>, readTimeout=<unavailable>) + 84 at client_connection.cc:1043
...
frame #21: 0x0000000100038f40 dcpdrain` MemcachedConnection::backoff_execute(..., context="DCP_CONTROL", ...) + 100 at client_connection.cc:2016
frame #22: 0x000000010002bab4 dcpdrain` MemcachedConnection::execute(this=0x0000000101604080, command=0x000000016fdfefb0, readTimeout=(__rep_ = 0)) + 168 at client_connection.cc:1998
frame #23: 0x000000010000d688 dcpdrain` main + 280 at dcpdrain.cc:451
frame #24: 0x000000010000d570 dcpdrain` main(argc=<unavailable>, argv=<unavailable>) + 8488 at dcpdrain.cc:929
frame #25: 0x00000001005d508c dyld` start + 520
Ideally dcpdrain should be robust to receiving DCP NOOP messages while
setting up the control flags, but that's not simple as we use common
code in MemcachedConnection which performs a request and expects a
response (of type DCP_CONTROL) in-order.
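
As a rough illustration of what "robust" could mean here, the sketch below answers any DCP NOOP request that arrives while a response is awaited, rather than treating it as the response. It is not the MemcachedConnection code: Frame, FakeConnection and recvResponseAnsweringNoops are hypothetical names invented for this example, and the frame is cut down to three fields. The magic and opcode constants are assumed to match the memcached binary protocol values used by DCP.

```cpp
#include <cstdint>
#include <deque>
#include <optional>
#include <vector>

// Assumed protocol constants (memcached binary protocol / DCP).
constexpr uint8_t kClientRequest = 0x80;
constexpr uint8_t kClientResponse = 0x81;
constexpr uint8_t kDcpNoop = 0x5c;
constexpr uint8_t kDcpControl = 0x5e;

// Hypothetical, heavily simplified frame: only the fields this sketch needs.
struct Frame {
    uint8_t magic;
    uint8_t opcode;
    uint32_t opaque;
};

// Hypothetical stand-in for a connection: 'incoming' holds frames sent by
// the producer, 'sent' records the frames written back to it.
struct FakeConnection {
    std::deque<Frame> incoming;
    std::vector<Frame> sent;
};

// Return the next response frame, acknowledging any DCP NOOP requests that
// arrive ahead of it. Returns std::nullopt if the input runs out or a
// request other than DCP NOOP arrives (not handled in this sketch).
std::optional<Frame> recvResponseAnsweringNoops(FakeConnection& conn) {
    while (!conn.incoming.empty()) {
        const Frame frame = conn.incoming.front();
        conn.incoming.pop_front();
        if (frame.magic == kClientResponse) {
            return frame;
        }
        if (frame.magic == kClientRequest && frame.opcode == kDcpNoop) {
            // Echo the opaque back so the producer can match the reply.
            conn.sent.push_back({kClientResponse, kDcpNoop, frame.opaque});
            continue;
        }
        return std::nullopt;
    }
    return std::nullopt;
}
```

With a NOOP request queued ahead of a DCP_CONTROL response, the function would reply to the NOOP and then hand back the control response. Doing the same in the shared request/response path is the part described above as not simple.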
To work around this problem, simply increase the default DCP noop
interval from 1 to 60 seconds - 60s /should/ be sufficient to complete
the handshake...
Change-Id: I0f846956d6499ea54d74f781cb14d7982387c9f4
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/190418
Tested-by: Build Bot <build@couchbase.com>
Reviewed-by: Trond Norbye <trond.norbye@couchbase.com>
1 parent a2c2054 commit 961a3a3
1 file changed, 1 addition and 1 deletion (line 905)