Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ This file is used to list changes made in each version of the AWS ParallelCluste
and achieve better performance at scale.
- Load kernel module `drm_client_lib` before installation of NVIDIA driver, if available on the kernel.
- Reduce dependency footprint by installing the package `sssd-common` rather than `sssd`.
- Disable Wayland protocol in GDM3 for Ubuntu 22.04+ to force the use of Xorg on GPU instances running without a display.
- Upgrade Slurm to version 24.11.7 (from 24.11.6).
- Upgrade Pmix to 5.0.9 (from 5.0.6).
- Upgrade libjwt to version 1.18.4 (from 1.17.0) for all OSs except Amazon Linux 2.
@@ -82,6 +82,62 @@ def optionally_disable_rnd
end
end

# Disable Wayland in GDM to ensure Xorg is used
# This is required for Ubuntu 22.04+ where Wayland is the default
# Without this, GDM won't start Xorg on headless GPU instances
def disable_wayland
  bash 'Disable Wayland in GDM' do
Review comment (Contributor): Can you add a unit test for this, as well as a check of which display server type we are using as part of the Kitchen test?

    user 'root'
    code <<-DISABLEWAYLAND
      set -e
      if [ -f /etc/gdm3/custom.conf ]; then
        sed -i 's/#WaylandEnable=false/WaylandEnable=false/' /etc/gdm3/custom.conf
        # If the line doesn't exist at all, add it under [daemon] section
        if ! grep -q "^WaylandEnable=false" /etc/gdm3/custom.conf; then
          sed -i '/\\[daemon\\]/a WaylandEnable=false' /etc/gdm3/custom.conf
        fi
      fi
    DISABLEWAYLAND
  end
end
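
The file edit that `disable_wayland` performs with `sed` and `grep` can be sketched as a pure Ruby function, which may make it easier to cover with the unit test the reviewer asks for. This is only an illustration and not code from the PR: `force_xorg` is a hypothetical name, and it works on the contents of the config file passed in as a string rather than on `/etc/gdm3/custom.conf` itself.

```ruby
# Hypothetical helper (not part of the cookbook). It mirrors the shell logic:
#   1. uncomment "#WaylandEnable=false" wherever it appears;
#   2. if no line starts with "WaylandEnable=false" afterwards, append one
#      after every line containing "[daemon]".
def force_xorg(conf_text)
  lines = conf_text.lines.map do |line|
    line.sub('#WaylandEnable=false', 'WaylandEnable=false')
  end
  return lines.join if lines.any? { |line| line.start_with?('WaylandEnable=false') }

  lines.flat_map do |line|
    if line.include?('[daemon]')
      [line.chomp + "\n", "WaylandEnable=false\n"]
    else
      [line]
    end
  end.join
end

# Example call on an illustrative config snippet (assumed, not taken from the PR).
puts force_xorg("[daemon]\n#WaylandEnable=false\n")
```

Because the function returns its input unchanged once an active `WaylandEnable=false` line exists, applying it twice gives the same result as applying it once, which matches the behaviour one would want from a repeated Chef run.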

# Override allow_gpu_acceleration to disable Wayland before starting X
def allow_gpu_acceleration
  # Update the xorg.conf to set up NVIDIA drivers.
  # NOTE: --enable-all-gpus parameter is needed to support servers with more than one NVIDIA GPU.
  nvidia_xconfig_command = "nvidia-xconfig --preserve-busid --enable-all-gpus"
  nvidia_xconfig_command += " --use-display-device=none" if node['ec2']['instance_type'].start_with?("g2.")
  execute "Set up Nvidia drivers for X configuration" do
    user 'root'
    command nvidia_xconfig_command
  end

  # dcvgl package must be installed after NVIDIA and before starting up X
  # DO NOT install dcv-gl on non-GPU instances, or will run into a black screen issue
  install_dcv_gl

  # Disable Wayland to ensure GDM starts Xorg
  disable_wayland

  # Configure the X server to start automatically when the Linux server boots and start the X server in background
  bash 'Launch X' do
    user 'root'
    code <<-SETUPX
      set -e
      systemctl set-default graphical.target
      systemctl isolate graphical.target &
    SETUPX
  end

  # Verify that the X server is running
  execute 'Wait for X to start' do
    user 'root'
    command "pidof X || pidof Xorg"
    retries 10
    retry_delay 5
  end
end
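
The `retries 10` / `retry_delay 5` pair on the 'Wait for X to start' resource amounts to a bounded polling loop. A minimal sketch of that idea in plain Ruby follows, assuming that a retry count of N means one initial attempt plus up to N further attempts. `wait_until` and its `sleeper` parameter are hypothetical names introduced for illustration; they are not Chef APIs.

```ruby
# Hypothetical stand-in for the resource's retry behaviour. The probe is
# passed as a block and the sleep is injectable, so the loop can be exercised
# without a real X server or real delays.
def wait_until(retries:, delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  loop do
    attempts += 1
    return attempts if yield
    raise "condition not met after #{attempts} attempts" if attempts > retries

    sleeper.call(delay)
  end
end

# Possible usage on a real host (left commented out so the sketch has no side effects):
# wait_until(retries: 10, delay: 5) { system('pidof X || pidof Xorg', out: File::NULL) }
```

Under that assumption, the values in the PR would allow roughly 50 seconds of waiting before the run fails.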

def post_install
# ubuntu-desktop comes with NetworkManager. On a cloud instance NetworkManager is unnecessary and causes delay.
# Instruct Netplan to use networkd for better performance
@@ -318,3 +318,27 @@
end
end
end

control 'tag:config_dcv_xorg_running_with_x11_session_type' do
  title 'Check that Xorg is running and GDM is using X11 session type (not Wayland)'
  only_if do
    !os_properties.on_docker? &&
      instance.head_node? &&
      instance.dcv_installed? &&
      node['cluster']['dcv_enabled'] == "head_node" &&
      instance.graphic? &&
      instance.nvidia_installed? &&
      instance.dcv_gpu_accel_supported?
  end

  describe 'Xorg process should be running' do
    subject { command('pidof Xorg || pidof X') }
    its('exit_status') { should eq 0 }
    its('stdout') { should_not be_empty }
  end

  describe 'GDM should be using X11 session type, not Wayland' do
    subject { command("loginctl show-session $(loginctl | grep gdm | awk '{print $1}') -p Type 2>/dev/null | grep -i x11") }
    its('exit_status') { should eq 0 }
  end
end
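
The second check relies on `loginctl show-session ... -p Type` printing lines of the form `Type=x11` and on `grep -i x11` finding at least one of them. As a sketch of how that output could be interpreted more strictly, the hypothetical helpers below (`session_types` and `x11_only?`, neither of which exists in the PR) parse the text and require every reported session to be X11. Note that this is a deliberately different rule from the `grep` pipeline, which passes as soon as any line matches.

```ruby
# Hypothetical helpers (not part of the Kitchen test). They take the text
# printed by `loginctl show-session <id>... -p Type` as a string.
def session_types(show_session_output)
  show_session_output.lines.map do |line|
    key, value = line.strip.split('=', 2)
    value.downcase if key == 'Type' && value
  end.compact
end

# True only when at least one session is reported and all of them are X11.
def x11_only?(show_session_output)
  types = session_types(show_session_output)
  !types.empty? && types.all? { |type| type == 'x11' }
end
```

Requiring all sessions to be X11 would catch a mixed case where one GDM session still runs on Wayland, at the cost of being stricter than the check in the diff.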