Commit f6ed489
committed
MB-35458 [SR]: Move SyncWrite completion to bg DurabilityCompletionTask
[[Re-apply after fixing error in DurabilityCompletionTask::run
(skipping last vBucket).]]
Change how SyncWrites which are Resolved and awaiting Completion are
handled, by moving the final VBucket::commit() / abort() into a
background task - DurabilityCompletionTask.
+Background+
There are two reasons for making this change:
a) Performance - specifically latency of front-end worker threads.
By moving completion into a background task, we reduce the amount of
work done on the thread which actually detected the SyncWrite was
resolved - typically the front-end DCP threads when a DCP_SEQNO_ACK
is processed.
Given that we SEQNO_ACK at the end of Snapshot, A single SEQNO_ACK
could result in committing multiple SyncWrites. Committing one
SyncWrite is similar to a normal front-end Set operation, so there is
potentially a non-trivial amount of work needed to be done when
completing SyncWrites, which could tie up the front-end thread (causing
other Connections to have to wait) for a noticable amount of time.
b) Simplification of lock management.
Doing completion in a background task simplifies lock management, for
example we avoid lock inversions with earlier locks acquired during
dcpSeqnoAck when attemping to later call notifySeqnoAvailable when this
was done on the original thread.
+Problem+
While (a) was the first reason identified for making this change
(see MB-33092), (b) is the reason this change is being made now. During
testing the following lock-order-inversion was seen:
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock)
Cycle in lock order graph:
Stream::streamMutex => StreamContainer::rwlock => Stream::streamMutex
The crux of the issue is the processing of DCP_SEQNO_ACKNOWLEDGED
messages by the DcpProducer - this acquires the Stream::streamMutex
before calling VBucket::seqnoAcknowledged(), however that function
currently results in VBucket::commit() being called to synchronously
complete the SyncWrite; which in turn must nodify all connected
replica that a new seqno is available, requiring
StreamContainer::rwlock to be acquired:
Mutex StreamContainer::rwlock acquired here while holding mutex Stream::streamMutex in thread T15:
...
#6 StreamContainer<std::shared_ptr<Stream> >::rlock()
#7 DcpProducer::notifySeqnoAvailable(Vbid, unsigned long)
...
#13 VBucket::commit(...)
#14 ActiveDurabilityMonitor::commit(...)
#15 ActiveDurabilityMonitor::processCompletedSyncWriteQueue()
#16 ActiveDurabilityMonitor::seqnoAckReceived(...)
#17 VBucket::seqnoAcknowledged(...)
#18 ActiveStream::seqnoAck(...)
#19 DcpProducer::seqno_acknowledged(...)
...
Mutex Stream::streamMutex previously acquired by the same thread here:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::seqnoAck(...)
#5 DcpProducer::seqno_acknowledged(...)
...
This conflicts with the ordering seen when sending items out on the
DCP connection - inside DcpProducer::step() where the
StreamContainer::rwlock is acquired first, then ActiveStream::mutex
acquired later:
Mutex Stream::streamMutex acquired here while holding mutex StreamContainer::rwlock in thread T15:
...
#3 std::lock_guard<std::mutex>::lock_guard(std::mutex&)
#4 ActiveStream::next()
#5 DcpProducer::getNextItem()
#6 DcpProducer::step(dcp_message_producers*)
...
Mutex StreamContainer::rwlock previously acquired by the same thread here:
#0 pthread_rwlock_rdlock <null> (libtsan.so.0+0x00000002c98b)
...
#4 std::shared_lock<cb::RWLock>::shared_lock(cb::RWLock&)
#5 StreamContainer<>::ResumableIterationHandle::ResumableIterationHandle()
#6 StreamContainer<>::startResumable()
#7 DcpProducer::getNextItem()
#8 DcpProducer::step(dcp_message_producers*)
...
+Solution+
The processing of resolved SyncWrites moved into a new background task.
Instead of immediately processing them within
ActiveDM::seqnoAckReceived(), that function notifies the new NonIO
DurabilityCompletionTask that there are SyncWrites waiting for
completion.
DurabilityCompletionTask maintains a bool per vBucket indicating if
there are SyncWrites for that vBucket pending completion. When the
task is run, for each flag which is true it calls
VBucket::processResolvedSyncWrites() for the associated VBucket.
+Implementation Notes+
Currently there is just a single DurabilityCompletionTask (per Bucket),
this was chosen as 1 task per vBucket (i.e. 1024 per Bucket) would
be inefficient for our current background task scheduler (both in terms
of latency to schedule each task for only one vBucket's worth of work,
and in terms of managing that many tasks in the future queue).
However, that does _potentially_ mean there's fewer resources (threads)
available to complete SyncWrites on - previously that work could be
done concurrently on all frontend threads (~O(num_cpus). Now the same
work only has 1 thread available to run on (there's only a single
DurabilityCompletionTask).
_If_ this becomes a bottleneck we could look at increasing the number of
DurabilityCompletionTask - e.g. sharding all vBuckets across multiple
tasks like flusher / bgfetcher.
Change-Id: I33ecfa78b03b4d2120b5d05f54984b24ce038fd8
Reviewed-on: http://review.couchbase.org/113749
Reviewed-by: Ben Huddleston <ben.huddleston@couchbase.com>
Tested-by: Build Bot <build@couchbase.com>1 parent dd410e5 commit f6ed489
File tree
35 files changed
+457
-75
lines changed- engines/ep
- benchmarks
- src
- durability
- tests
- mock
- module_tests
- collections
35 files changed
+457
-75
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
227 | 227 | | |
228 | 228 | | |
229 | 229 | | |
| 230 | + | |
230 | 231 | | |
231 | 232 | | |
232 | 233 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
56 | 56 | | |
57 | 57 | | |
58 | 58 | | |
| 59 | + | |
59 | 60 | | |
60 | 61 | | |
61 | 62 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| 58 | + | |
58 | 59 | | |
59 | 60 | | |
60 | 61 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
545 | 545 | | |
546 | 546 | | |
547 | 547 | | |
548 | | - | |
| 548 | + | |
549 | 549 | | |
550 | 550 | | |
551 | 551 | | |
| |||
620 | 620 | | |
621 | 621 | | |
622 | 622 | | |
623 | | - | |
624 | | - | |
| 623 | + | |
| 624 | + | |
| 625 | + | |
625 | 626 | | |
626 | 627 | | |
627 | 628 | | |
| |||
640 | 641 | | |
641 | 642 | | |
642 | 643 | | |
643 | | - | |
| 644 | + | |
644 | 645 | | |
645 | 646 | | |
646 | 647 | | |
| |||
729 | 730 | | |
730 | 731 | | |
731 | 732 | | |
| 733 | + | |
| 734 | + | |
| 735 | + | |
| 736 | + | |
| 737 | + | |
| 738 | + | |
| 739 | + | |
732 | 740 | | |
733 | 741 | | |
734 | 742 | | |
| |||
1697 | 1705 | | |
1698 | 1706 | | |
1699 | 1707 | | |
1700 | | - | |
1701 | | - | |
1702 | | - | |
1703 | | - | |
1704 | | - | |
| 1708 | + | |
1705 | 1709 | | |
1706 | 1710 | | |
1707 | 1711 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
282 | 282 | | |
283 | 283 | | |
284 | 284 | | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
285 | 291 | | |
286 | 292 | | |
287 | 293 | | |
| |||
363 | 369 | | |
364 | 370 | | |
365 | 371 | | |
366 | | - | |
367 | | - | |
| 372 | + | |
| 373 | + | |
368 | 374 | | |
369 | | - | |
| 375 | + | |
370 | 376 | | |
371 | 377 | | |
372 | 378 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
| 63 | + | |
| 64 | + | |
| 65 | + | |
| 66 | + | |
| 67 | + | |
| 68 | + | |
| 69 | + | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1169 | 1169 | | |
1170 | 1170 | | |
1171 | 1171 | | |
| 1172 | + | |
1172 | 1173 | | |
1173 | 1174 | | |
1174 | 1175 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
| 50 | + | |
50 | 51 | | |
51 | 52 | | |
52 | 53 | | |
| |||
69 | 70 | | |
70 | 71 | | |
71 | 72 | | |
| 73 | + | |
72 | 74 | | |
73 | 75 | | |
74 | 76 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
39 | 39 | | |
40 | 40 | | |
41 | 41 | | |
| 42 | + | |
42 | 43 | | |
43 | 44 | | |
44 | 45 | | |
| |||
0 commit comments