Fix duplicate identity #959

Draft
pblazej wants to merge 4 commits into main from blaze/duplicate-identity

Conversation

@pblazej
Contributor

@pblazej pblazej commented Mar 31, 2026

Resolves #958

After unpublishAll(), the sticky hasPublished flag kept the publisher transport marked as critical. When the idle publisher ICE timed out, this triggered an unnecessary reconnect cycle that escalated from quick to full, connecting without the reconnect flag and causing the server to remove the participant with DUPLICATE_IDENTITY.

Reset hasPublished to false when no track publications with non-nil tracks remain, preventing both the spurious reconnect trigger and unnecessary publisher ICE restarts during any server-initiated reconnect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
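
To make the flag reset described above concrete, here is a minimal sketch. It is not the SDK's actual code: `Track`, `TrackPublication`, and `PublisherState` are hypothetical stand-ins, and only the `hasPublished` name comes from the description. It assumes a publication keeps its entry but loses its track once unpublished.

```swift
// Hypothetical stand-ins for illustration; these are not the SDK's real types.
final class Track {}

struct TrackPublication {
    var track: Track? // assumed to become nil once the track is unpublished
}

struct PublisherState {
    private(set) var hasPublished = false
    private(set) var publications: [String: TrackPublication] = [:]

    mutating func publish(sid: String, track: Track) {
        publications[sid] = TrackPublication(track: track)
        hasPublished = true
    }

    mutating func unpublish(sid: String) {
        publications[sid]?.track = nil
        // Reset the flag once no publication holds a non-nil track,
        // so an idle publisher transport is no longer treated as critical.
        hasPublished = publications.values.contains { $0.track != nil }
    }
}
```

With this shape, unpublishing one of two tracks leaves `hasPublished` true, and unpublishing the last one clears it.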
@Daltron

Daltron commented Mar 31, 2026

@pblazej thank you for such a quick fix! I can confirm that this does indeed stop the undesired reconnection. However, it appears to have introduced a new issue. If the LocalParticipant tries to republish their tracks after unpublishing them, the video and audio tracks are published (according to the delegates), but you can neither see nor hear them on either the local or remote side.

On the RemoteParticipant side, didPublishTrack is called only for the audio track, and the new video track never comes through. As mentioned, the audio still cannot be heard.

I can also confirm this was not an issue before this change because as soon as the undesired reconnection took place, we could successfully publish (and see and hear the tracks) again on all devices.

@pblazej
Contributor Author

pblazej commented Mar 31, 2026

@Daltron let me revisit it, will keep you posted.

@Daltron

Daltron commented Mar 31, 2026

@pblazej really appreciate it, thank you!

After unpublishing all tracks, the publisher transport may go to
disconnected/failed state (ICE timeout). A subsequent publish call
would negotiate without ICE restart, leaving media unable to flow.

Check the publisher connection state in publisherShouldNegotiate()
and use an ICE restart offer when the transport is disconnected or
failed, re-establishing connectivity before publishing new tracks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
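
The decision described in this commit message could be sketched roughly as follows. This is an illustration under assumptions, not the actual body of publisherShouldNegotiate(): `TransportState`, `OfferOptions`, and `offerOptions(forPublisherState:)` are hypothetical names introduced here.

```swift
// Hypothetical model of the publisher transport's connection state.
enum TransportState {
    case new, connecting, connected, disconnected, failed, closed
}

// Hypothetical offer options; only the ICE restart flag matters for this sketch.
struct OfferOptions: Equatable {
    var iceRestart: Bool
}

// Pick offer options for the next publisher negotiation.
func offerOptions(forPublisherState state: TransportState) -> OfferOptions {
    switch state {
    case .disconnected, .failed:
        // The transport lost connectivity (e.g. ICE timeout while idle),
        // so request an ICE restart before publishing new tracks.
        return OfferOptions(iceRestart: true)
    case .new, .connecting, .connected, .closed:
        // Otherwise negotiate normally.
        return OfferOptions(iceRestart: false)
    }
}
```

Treating `.closed` as a normal negotiation is an arbitrary choice in this sketch; the commit message only mentions the disconnected and failed states.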
@pblazej
Contributor Author

pblazej commented Apr 1, 2026

@Daltron I was unable to reproduce this particular regression in an isolated environment (killing ICE, blocking UDP, etc.).

However, I think there's another gap in the logic I tried to fix above ⬆️

It would be great to validate on your side (and provide full logs).

@Daltron

Daltron commented Apr 1, 2026

@pblazej Unfortunately, the same issue persists: local tracks cannot be published a second time. I've attached logs, broken down into sections that indicate what is triggered where:

livekit_logs.txt

@pblazej
Contributor Author

pblazej commented Apr 2, 2026

@Daltron I think I'd need full logs:

/* 10-15 seconds later */

on .debug level.

Preferably from main as well (with the original issue).

@Daltron

Daltron commented Apr 3, 2026

@pblazej attached are the full logs from the blaze/duplicate-identity branch!
live_logs_full.txt

…blished

After unpublishAll(), the publisher transport idles and the server
sends LeaveRequest(resume, connectionTimeout). Previously this
triggered a needless quick reconnect cycle via the websocket path.

Skip the signal cleanup when hasPublished is false and the reason is
connectionTimeout, keeping the signal connection alive so re-publishing
can proceed without disruption.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
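
As a rough sketch of the rule in this commit message, the leave handling can be modeled as a pure function. The enums and the `handling(for:reason:hasPublished:)` function below are hypothetical simplifications, not the SDK's or the protocol's real types. Only the resume action, the connectionTimeout reason, and the hasPublished flag come from the message; the other cases are assumed for completeness.

```swift
// Hypothetical, simplified models of the server's leave request.
enum LeaveAction { case disconnect, resume, reconnect }
enum LeaveReason { case connectionTimeout, duplicateIdentity, other }

// What the client does in response (names are illustrative only).
enum LeaveHandling { case ignore, quickReconnect, fullReconnect, disconnect }

func handling(for action: LeaveAction,
              reason: LeaveReason,
              hasPublished: Bool) -> LeaveHandling {
    switch action {
    case .resume:
        // Nothing is published and only the idle publisher timed out:
        // keep the signal connection alive instead of reconnecting.
        if !hasPublished && reason == .connectionTimeout {
            return .ignore
        }
        return .quickReconnect
    case .reconnect:
        return .fullReconnect
    case .disconnect:
        return .disconnect
    }
}
```

Keeping the decision free of side effects makes the skip condition easy to check in isolation.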
@pblazej
Contributor Author

pblazej commented Apr 9, 2026

@Daltron thanks for the full details, I think the root cause is in the transceiver cleanup (workaround)...

Added one more change - let's try to validate it ⬆️ 🤞

Development

Successfully merging this pull request may close these issues.

Unpublishing local tracks is causing a participant_left and participant_joined cycle
