Skip to content

Conversation

@doc-han
Copy link
Contributor

@doc-han doc-han commented Nov 10, 2025

Description

Implements active, inactive and unavailable states for active collaborators on a workflow.

active - a user is active when a workflow is open and the user is actively on the tab.
inactive - a user is inactive after 12s if they leave the tab where the workflow is open.
unavailable - a user is removed from active collaborators after approx. 1min of closing the tab where the workflow is open.

Main issue

[a] Throttling of browser timers is our biggest villain in this PR.
What problems does it cause?

  • Users are supposed to keep updating their awareness in ydoc by updating their state else they get removed from awareness. there's a 30secs window for this by ydoc else user is marked offline
  • We currently perform this update every 10secs to have 3 chances of updating the ydoc state.
  • Browser throttling reduced the 3 chances to be unpredictable(mostly 0 when the user leaves the tab).
  • Hence, when a user leaves the tab. there's a high chance they'll be removed from ydoc awareness because throttling would've increased our 10secs interval to 30+secs.
    Hence, when a user leaves a tab, they will be marked as offline when throttling kicks in, and when after being throttled, their event finally goes through, they then get marked back as online. making them flicker

[b] Another thing to be aware of 10secs interval
Every 10secs we actually update the user's lastSeen to Date.now() but then when the user leaves the tab, we don't update this lastSeen value. And because there's no state change/diff happening when the user leaves the tab. No event is actually triggered to keep ydoc live.

Approach in this PR

  1. When user leaves the tab. for every 10secs, we increment lastSeen by 1ms which will trigger an event to keep ydoc live. This is to battle [b]

  2. We have a small cache for activeCollaborators that lives for 60secs. when a user shows up as an active-collaborator we keep him for 60secs max. this 60secs is renewed when they show up again in the new set of active collaborators we receive. This resolves [a]

  3. and 2. work hand in hand to smoothen the active collaborators regardless of the ydoc and browser throttling limits.

Side Effect

When a user closes a tab or a workflow. They disappear from other users active-collaborators after 60secs. but users would know inactive collaborators real quick(12secs)

Closes #3931

Validation steps

Note: Some values got changed.
eg.

  1. inactivity time changed from 2min to 12secs.
  2. it also takes approx. 1min(60secs) to remove a user from the active collaborators when they actually leave.

Additional notes for the reviewer

  1. (Is there anything else the reviewer should know or look out for?)

AI Usage

Please disclose how you've used AI in this work (it's cool, we just want to know!):

  • Code generation (copilot but not intellisense)
  • Learning or fact checking
  • Strategy / design
  • Optimisation / refactoring
  • Translation / spellchecking / doc gen
  • Other
  • I have not used AI

You can read more details in our Responsible AI Policy

Pre-submission checklist

  • I have performed a self-review of my code.
  • I have implemented and tested all related authorization policies. (e.g., :owner, :admin, :editor, :viewer)
  • I have updated the changelog.
  • I have ticked a box in "AI usage" in this PR

@codecov
Copy link

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.79%. Comparing base (3964acb) to head (b917230).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3951      +/-   ##
==========================================
- Coverage   88.82%   88.79%   -0.04%     
==========================================
  Files         422      422              
  Lines       19069    19069              
==========================================
- Hits        16938    16932       -6     
- Misses       2131     2137       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@doc-han
Copy link
Contributor Author

doc-han commented Nov 17, 2025

suggestion from @stuartc, "use phoenix channel callback instead of timeouts to keep awareness active".

@doc-han doc-han force-pushed the 3931-inactive-collaborator-avatar-dissapears-on-the-editor-instead-of-greying-out branch from 561dffc to 43b4845 Compare November 17, 2025 12:34
@doc-han
Copy link
Contributor Author

doc-han commented Nov 17, 2025

update

  1. caching awareness users for 1min to make user availability smooth to others.
  2. reducing online idle time from 2min to 12sec. hence, if you leave the tab, you'll be set to inactive after 12secs instead of the initial 2mins.
  3. when users are active on the tab, they update their lastSeen every 30secs with their current time.
  4. when users leave the tab, we update their lastSeen with their previous lastSeen + 1ms. approx. nothing. just to update the store to keep awareness active.

@doc-han doc-han marked this pull request as ready for review November 18, 2025 08:18
@doc-han doc-han changed the title fix: inactive collaborators should still ping their last seen fix: flickering of active collaborator icons between states Nov 18, 2025
@doc-han doc-han requested review from elias-ba and stuartc November 18, 2025 09:15
doc-han and others added 2 commits November 18, 2025 13:41
Copy link
Contributor

@elias-ba elias-ba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @doc-han! The store implementation and caching mechanism look great. However, I noticed the ActiveCollaborators.tsx component wasn't updated to use the new 12-second threshold. The component still uses lessthanmin(user.lastSeen, 2) on line 47, which is probably causing the JavaScript tests to fail.

@github-project-automation github-project-automation bot moved this from New Issues to In review in v2 Nov 20, 2025
Copy link
Member

@stuartc stuartc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have concerns about the core approach here. This implementation
conflates the Yjs awareness heartbeat (keeping the connection alive)
with user presence (activity/engagement). The result is that user
presence state is now driven by each browser's internal throttling
policies rather than actual user behavior. What's happening:

  • The frozen timestamp technique keeps awareness alive during throttling
  • But we're then using that same lastSeen timestamp to determine if
    users are "active" or "inactive"
  • This means presence state is an indirect side-effect of browser
    throttling, not a direct measurement of user engagement

The constraints this creates:

  1. Browser throttles → lastSeen updates slow → appears inactive
  2. Browser active → lastSeen updates normally → appears active

We're treating the browser's power-saving policies as a signal for user
presence. This varies by: Browser, device & OS, battery state, power
profile...

Extensibility problem:

Where would we add actual presence detection?

If we wanted to:

  • Check document.visibilityState directly
  • Track user interactions (mouse, keyboard)
  • Distinguish "tab switched for a few seconds" from "user left computer"

Would we keep extending lastSeen? That field is already serving two
purposes:

  • Keep Yjs awareness alive/changing (its original job)
  • Indicate user activity (what we're overloading it with)

We discussed the other day: Heartbeat and presence are separate
concerns. The heartbeat exists to prevent connection timeouts - it's
plumbing. User presence is a product feature about engagement/activity.
Mixing them means we can't adjust one without affecting the other.

Example: If we want to show users as inactive faster (say 5 seconds), we
can't - we're locked to the 10-30s window by Yjs's 30-second timeout. If
we want to keep users visible longer after they close a tab, we add
caching layers that delay all removals by 60+ seconds.
The 60-second removal time: Users disappear 60-70 seconds after closing
a tab because we need the cache to smooth over throttling-induced
flickering. But that's a workaround for using throttling as a presence
signal in the first place.

Request: What would it take to separate these? Could we: Use lastSeen
only for connection health (keep current 10s updates) Add a separate
lastActivity and/or lastState field to awareness (active, away, idle) +
timestamp

Update that field based on visibility API, not throttling side-effects
Let the UI read lastActivity and/or lastState instead of calculating
from lastSeen This would give us direct control over presence without
fighting browser throttling behavior. The frozen timestamp technique
might not be needed since it's value only needs to actually change in
order to keep the awareness around. But wouldn't drive the user-facing
presence indicator. I want to understand: what blocked this approach? Is
there a technical constraint I'm missing, or was the lastSeen
overloading the simpler path?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

Inactive Collaborator Avatar dissapears on the Editor instead of greying out

4 participants