Cog freezes after 20+ min idle and then it takes up to 20 seconds to recover #771

@eduelias

Hi!

My team and I have been running weston/cog on Fedora 42 on a Raspberry Pi 3. We noticed that after around 20 minutes of inactivity the browser becomes sluggish and takes up to 20 seconds to acknowledge scrolls or clicks (see the video).

This also happens when we load the static website http://transcendentfunlushstars.neverssl.com/online, so we are fairly sure it's not our application.

Currently, we're running both weston and cog inside a container.

This behavior only happens after 20+ minutes of inactivity. The system has nothing else running on it. We can reproduce it with the neverssl page running alone, without any other container, on a clean Fedora 42 installation.

Our application logs every touch on the screen and, even while the screen appears "dormant", the touch events are still recorded by the underlying service, and everything else on the machine seems to be active and running.

Our guess is that something at the render level is putting the browser into a "sleep" state that we have not been able to find in any documentation.

Below you'll find a video demonstrating the problem and the scripts we use to run the browser.

Any help is greatly appreciated!

This is our Docker entrypoint:

#!/bin/bash

# Setup variables
export WPE_BACKEND=fdo
export WAYLAND_DISPLAY=wayland-1
SOCKET_PATH="$XDG_RUNTIME_DIR/$WAYLAND_DISPLAY"
# export WAYLAND_DEBUG=1

export WEBKIT_DISABLE_MEMORY_PRESSURE_MONITOR=1
export WEBKIT_DISABLE_COMPOSITING_MODE=1
export WEBKIT_DISABLE_THROTTLING=1
export WEBKIT_DISABLE_IDLE_SLEEP=1
export WEBKIT_USE_SINGLE_WEB_PROCESS=1

# Ensure XDG_RUNTIME_DIR exists with correct permissions
mkdir -p "$XDG_RUNTIME_DIR"
chmod 0700 "$XDG_RUNTIME_DIR"

# Start weston in the background
weston --renderer=pixman --shell="kiosk-shell.so" &

# Wait for weston to be ready
while [ ! -S "$SOCKET_PATH" ]; do sleep 1; done
echo "Weston socket is ready, starting WPE/COG"

# Starting the browser
cog --web-mem-limit=512 \
  --web-conservative-threshold=0.49 \
  --web-strict-threshold=0.9 \
  --web-check-interval=60 \
  --web-kill-threshold=0.99 \
  --webprocess-failure=restart \
  --enable-developer-extras=1 \
  --enable-write-console-messages-to-stdout=1 \
  -P wl \
  http://127.0.0.1:4200
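
A side note on the wait loop above: as written it blocks forever if weston never creates its socket. A bounded variant could look like the sketch below. This is not what we currently run and it is unrelated to the idle freeze itself; the function name `wait_for_socket` and the 30-second default are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of our current entrypoint): wait for a Unix
# socket to appear, but give up after a timeout instead of looping forever.
wait_for_socket() {
  local path="$1"
  local timeout="${2:-30}"   # seconds; 30 is an arbitrary default
  local waited=0

  # -S is true only when the path exists and is a socket
  while [ ! -S "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Timed out after ${timeout}s waiting for $path" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Possible use in the entrypoint, replacing the unbounded loop:
#   wait_for_socket "$SOCKET_PATH" 30 || exit 1
```

With the container health check and `Restart=on-failure` in place, exiting on timeout would let the service restart instead of hanging silently.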

This is our "podman" start script:

#!/bin/bash
KIOSK_IMAGE="${ECR_REPO}:${TAG}"

# sudo can be updated when kiosker can execute podman without use of sudo
nice -10 podman run --replace \
  --name kiosk \
  --device /dev/dri \
  --health-cmd="/usr/local/bin/healthcheck.sh" \
  --health-start-period=60s \
  --health-on-failure=stop \
  --health-interval=30s \
  --health-timeout=10s \
  --health-retries=3 \
  --network=host \
  --memory-swap=0 \
  --memory=512m \
  --oom-score-adj=-500 \
  --privileged \
  --volume=/run:/run \
  --volume="/usr/local/bin/kiosk-entrypoint.sh:/entrypoint.sh" \
  --volume="/usr/local/bin/kiosk-healthcheck.sh:/usr/local/bin/healthcheck.sh" \
  $KIOSK_IMAGE >/dev/null

and this is our service unit:

[Unit]
Description=Kiosk with WPE Web Browser
After=network-online.target multi-user.target seatd.service user@1000.service
BindsTo=user@1000.service
Requires=seatd.service
Wants=network-online.target

[Service]
Type=exec
Environment=TZ=Europe/Amsterdam
Environment=WLR_LIBINPUT_NO_DEVICES=1
Environment=TAG=1.0.16
Environment=ECR_REPO=[big-beautiful-repo]
ExecStart=/usr/local/bin/start-kiosk.sh
ExecStop=/usr/bin/podman stop kiosk
Restart=on-failure
RestartMode=normal

And here's a video with the problem we found:

(Will post a new video soon)

Again, any tip/help would be deeply appreciated! Thanks in advance!
