@brcarp
Contributor

@brcarp brcarp commented Nov 15, 2025

Summary

This PR upgrades the development container from Debian 11 (Bullseye) to Debian 12 (Bookworm) and adds comprehensive support for Amazon Linux 2023 Lambda runtimes. All tests in the GitHub Actions Test workflow now pass, including new test jobs for Amazon Linux 2023 on both x86_64 and arm64 architectures.

Changes

Development Container (.devcontainer/)

  • Upgraded base image from rust:1-1-bullseye to rust:1-1-bookworm
  • Consolidated and optimized package installation with proper dependency management
  • Added cross-compilation toolchains: gcc-x86-64-linux-gnu, gcc-aarch64-linux-gnu, and their associated libc-dev packages
  • Improved QEMU setup with automatic binfmt enablement for multi-architecture support
  • Fixed Rust permissions: Added proper ownership of /usr/local/rustup and /usr/local/cargo to vscode user
  • Enhanced postCreate script with Docker daemon readiness check and timeout handling
  • Updated devcontainer.json: Disabled QEMU installation in docker-in-docker feature (already handled in Dockerfile)
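Installing the cross toolchains is only half of the setup: Cargo and C-building crates still have to be pointed at them. The sketch below shows one way to do that with environment variables instead of a `.cargo/config.toml` entry. The function names are hypothetical and not taken from this PR. It relies on Cargo's `CARGO_TARGET_<TRIPLE>_LINKER` variable and on the `CC_<target>` variable read by the `cc` crate.

```shell
# Hypothetical helper: map a Rust target triple to the GNU tool prefix
# used by the Debian cross packages (gcc-x86-64-linux-gnu, gcc-aarch64-linux-gnu).
cross_prefix() {
  case "$1" in
    x86_64-unknown-linux-gnu)  echo "x86_64-linux-gnu" ;;
    aarch64-unknown-linux-gnu) echo "aarch64-linux-gnu" ;;
    *) echo "unsupported target: $1" >&2; return 1 ;;
  esac
}

# Cargo looks up the linker in CARGO_TARGET_<TRIPLE>_LINKER, where <TRIPLE>
# is the target upper-cased with dashes replaced by underscores.
linker_env_name() {
  echo "CARGO_TARGET_$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')_LINKER"
}

# Export the linker for Cargo and the C compiler for the cc crate.
configure_cross() {
  _target="$1"
  _prefix=$(cross_prefix "$_target") || return 1
  _cc_var="CC_$(printf '%s' "$_target" | tr '-' '_')"
  export "$(linker_env_name "$_target")=${_prefix}-gcc"
  export "${_cc_var}=${_prefix}-gcc"
}
```

Calling `configure_cross aarch64-unknown-linux-gnu` before `cargo build --target aarch64-unknown-linux-gnu` would be the intended use on an x86_64 host.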

Amazon Linux 2023 Support (amzn2023/)

Created complete build and test infrastructure for AL2023:

  • Build Dockerfiles for both x86_64 and arm64 using amazonlinux:2023 base
  • Test Dockerfiles using Lambda nodejs:22 runtime (AL2023-based)
  • Setup and test scripts following the established pattern from amzn/ directory
  • Proper Node.js installation in build containers for testing
  • Correct use of /root/.cargo/bin paths and WORKDIR directives

Python 2.7 Support (py27/)

  • Created dedicated build container (Ubuntu 22.04) to ensure glibc compatibility
  • Fixed wrapt dependency: Pinned to <1.15.0 for Python 2.7 compatibility (newer versions use f-strings)
  • Updated test Dockerfile with proper binary paths and diagnostic tools (file, binutils, util-linux)
  • Modified setup script to build in matching environment before testing
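The glibc constraint behind the dedicated build container can be expressed as a small guard. This is a sketch, not code from the PR: the function names are hypothetical, it assumes a `sort` that supports `-V` (GNU coreutils), and it applies the conservative rule that the build environment's glibc must not be newer than the runtime's.

```shell
# Print the glibc version of the current environment, e.g. "2.35".
# On glibc systems, `getconf GNU_LIBC_VERSION` prints "glibc <version>".
current_glibc() {
  getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Succeed when a binary built against glibc $1 is expected to load on a
# system running glibc $2, i.e. when the build version is <= the runtime
# version. sort -V orders version strings numerically (2.9 before 2.35).
glibc_compatible() {
  _oldest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$_oldest" = "$1" ]
}
```

With the versions quoted in this PR, `glibc_compatible 2.36 2.35` fails (Bookworm build, Ubuntu 22.04 runtime) while `glibc_compatible 2.35 2.35` succeeds.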

Build System Improvements (bin/)

  • Auto-detection of build target based on host architecture (uname -m)
  • Architecture-aware strip command: Automatically uses correct cross-platform strip utility
  • Enhanced cross-compilation support with proper fallback logic
  • Removed ARM64-specific test restrictions that were preventing valid test execution
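A minimal sketch of the detection logic described above, assuming only the two Linux GNU targets discussed in this PR. The function names are illustrative and are not the ones used in `bin/`; the cross `strip` names are those shipped with the Debian cross toolchain packages.

```shell
# Map an architecture name (default: `uname -m`) to a Rust target triple.
host_target() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)  echo "x86_64-unknown-linux-gnu" ;;
    aarch64|arm64) echo "aarch64-unknown-linux-gnu" ;;
    *) echo "unsupported architecture" >&2; return 1 ;;
  esac
}

# Pick the strip binary for a target: the native one when host and target
# match, the prefixed cross-binutils one otherwise.
strip_for() {
  _target="$1"
  _host="${2:-$(host_target)}"
  if [ "$_target" = "$_host" ]; then
    echo "strip"
    return 0
  fi
  case "$_target" in
    x86_64-unknown-linux-gnu)  echo "x86_64-linux-gnu-strip" ;;
    aarch64-unknown-linux-gnu) echo "aarch64-linux-gnu-strip" ;;
    *) echo "no strip known for $_target" >&2; return 1 ;;
  esac
}
```

A build script could then run `"$(strip_for "$TARGET")" path/to/library.so`, where `TARGET` defaults to `$(host_target)`.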

GitHub Actions Workflow (.github/workflows/)

  • Upgraded runners from ubuntu-20.04 to ubuntu-22.04
  • Added QEMU and Buildx setup for arm64 test jobs
  • Added two new test jobs for Amazon Linux 2023 (x86_64 and arm64)
  • Follows consistent pattern across all platform-specific test jobs

Notes

Key Discoveries and Pitfalls

  1. glibc Forward Compatibility Issues
    • Libraries compiled with newer glibc (2.36 in Bookworm) cannot run on systems with older glibc (2.35 in Ubuntu 22.04)
    • Solution: Build libraries in environments matching or older than the target glibc version
    • For py27: Built in Ubuntu 22.04 (glibc 2.35) to match test environment
  2. Rust Toolchain Permissions Challenge
    • Initial attempts to install Rust in user's home directory (/home/vscode/.rustup) failed because RUSTUP_HOME and CARGO_HOME environment variables weren't being respected in the devcontainer exec context
    • rustup's on-demand component installation requires write access to toolchain directories
    • Solution: Install system-wide in /usr/local, then chown to vscode user for development flexibility
  3. Cross-Compilation Requirements
    • Cross-compiling Rust projects with C dependencies (like ring crate) requires:
      • Target-specific C compiler (e.g., gcc-x86-64-linux-gnu)
      • Explicit linker configuration via .cargo/config.toml
      • Target-specific strip utility for binary optimization
    • Build scripts now intelligently detect architecture and use appropriate tools
  4. Lambda Runtime Base Image Selection
    • Amazon Linux 2023 Lambda runtimes require nodejs:22 (not nodejs:20)
    • nodejs:20 and earlier are based on Amazon Linux 2
    • nodejs:22 is based on Amazon Linux 2023 with newer glibc
    • Critical: Match Lambda runtime image to build environment's OS version
  5. Python 2.7 Dependency Management
    • The wrapt package dropped Python 2.7 support in version 2.0+
    • Newer versions use f-strings (Python 3.6+), causing syntax errors
    • Solution: Pin wrapt>=1.10.4,<1.15.0 in setup.py
  6. Docker-in-Docker Timing
    • The postCreate hook was failing because Docker daemon wasn't ready
    • Solution: Added wait loop with timeout to ensure daemon availability
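The wait loop from the last item can be sketched as a generic polling helper. This is an illustration rather than the PR's actual postCreate code, and `wait_for` is a hypothetical name. It relies on `docker info` exiting non-zero while the daemon is unreachable.

```shell
# Poll a readiness command once per second until it succeeds or the
# timeout (in seconds) is reached. Usage: wait_for <timeout> <command...>
wait_for() {
  _timeout="$1"
  shift
  _elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$_elapsed" -ge "$_timeout" ]; then
      echo "timed out after ${_timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep 1
    _elapsed=$((_elapsed + 1))
  done
}
```

In a postCreate script this would be called as `wait_for 30 docker info || exit 1`, so a daemon that never starts fails the hook instead of hanging it.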

Multi-Architecture Development

This project now supports seamless development on both ARM64 (Apple Silicon, Graviton) and x86_64 (Intel/AMD) host machines:

  • Auto-detection: Build scripts automatically detect host architecture and set appropriate targets
  • QEMU integration: Enables building and testing non-native architectures locally
  • GitHub Actions: Uses QEMU for arm64 tests on x86_64 runners
  • Cross-compilation toolchains: Installed for both directions (ARM↔x86)

This ensures developers on ARM64 Macs can build/test x86_64 artifacts and vice versa, matching the multi-architecture nature of AWS Lambda deployments.
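Whether emulation is actually available for a foreign platform can be checked through the kernel's binfmt_misc interface. The sketch below assumes the entry names commonly registered for QEMU user-mode emulation (`qemu-aarch64`, `qemu-x86_64`); the function names are hypothetical, and the optional second argument exists only so the directory can be overridden.

```shell
# Map a Docker platform string to the binfmt_misc entry name that QEMU
# user-mode emulation registers for it (assumed naming).
binfmt_entry() {
  case "$1" in
    linux/arm64) echo "qemu-aarch64" ;;
    linux/amd64) echo "qemu-x86_64" ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Succeed when the handler for the platform is registered and enabled.
# Each entry under /proc/sys/fs/binfmt_misc starts with "enabled" or "disabled".
emulation_registered() {
  _dir="${2:-/proc/sys/fs/binfmt_misc}"
  _entry=$(binfmt_entry "$1") || return 1
  [ -f "$_dir/$_entry" ] && grep -q '^enabled' "$_dir/$_entry"
}
```

On an x86_64 host, `emulation_registered linux/arm64` succeeding indicates that arm64 containers can run under QEMU.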

Future Consideration: Zig for Cross-Compilation

The extensive troubleshooting around cross-compilation toolchains, linker configuration, and architecture-specific utilities suggests a potential improvement for the future: Zig as a cross-compilation toolchain.

Zig provides:

  • Single toolchain for all targets (no need for gcc-x86-64-linux-gnu, gcc-aarch64-linux-gnu, etc.)
  • Zero configuration cross-compilation
  • Drop-in replacement for C/C++ compilers (zig cc), integrated with Cargo via cargo zigbuild

This could significantly simplify the Dockerfile and build scripts by eliminating the need for:

  • Multiple target-specific GCC installations
  • Architecture detection logic for strip commands
  • Manual .cargo/config.toml linker configuration

The current solution works reliably, but Zig could make cross-compilation more maintainable as the project grows to support additional architectures or platforms.
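For a sense of what that would look like, here is a sketch that composes a `cargo zigbuild` invocation. It is not part of this PR. `zigbuild_cmd` is a hypothetical helper; the optional glibc suffix on the target (for example `.2.35`) is a cargo-zigbuild feature for pinning the glibc version to link against.

```shell
# Compose a cargo-zigbuild command line for a target triple and an
# optional glibc version. The command is printed rather than executed so
# it can be inspected or logged first.
zigbuild_cmd() {
  _target="$1"
  _glibc="${2:-}"
  if [ -n "$_glibc" ]; then
    _target="${_target}.${_glibc}"
  fi
  echo "cargo zigbuild --release --target ${_target}"
}
```

For example, `zigbuild_cmd aarch64-unknown-linux-gnu 2.35` prints `cargo zigbuild --release --target aarch64-unknown-linux-gnu.2.35`, which would target the Ubuntu 22.04 glibc regardless of the build host's own version.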

This commit resolves a series of build and runtime errors to create a stable,
portable, and fully automated dev container environment that works on both
`arm64` and `x86_64` architectures out-of-the-box.

- Stabilize Dockerfile Build:
    - Upgrades the base image from `bullseye` to `bookworm`.
    - Consolidates all `apt-get` dependencies into a single, correctly ordered
      layer, installing necessary tools for cross-compilation
      (`gcc-x86-64-linux-gnu`, `libc6-dev-amd64-cross`).
    - Fixes `rustup` permission errors by installing the toolchain as `root`
      and granting ownership to the `vscode` user.
    - Adds `--break-system-packages` to the `pip install` command to comply
      with Debian `bookworm`'s package management policies.

- Improve Architecture Portability:
    - Makes the `bin/build` and `bin/test` scripts architecture-aware, allowing
      them to run seamlessly on both `arm64` and `x86_64` hosts without manual
      configuration.
    - Fixes a bug that caused inconsistent naming of the shared library (`.so`)
      file between build and test runs.

- Fix Container Startup on ARM64:
    - Centralizes QEMU and `binfmt` setup within the `Dockerfile` build,
      creating an architecture-aware initialization process.
    - This allows for the removal of legacy, conflicting setup methods that
      caused startup failures on `arm64` hosts:
        - Removes the privileged `docker run` command for `qemu-user-static`
          from the `postCreate` script.
        - Disables the redundant QEMU setup in the `docker-in-docker` feature
          by configuring `install-qemu: false` for the feature.

This commit implements a small refactor to make the dev container setup
more resilient and truly multi-platform.

- Installs `aarch64` cross-compilation packages (`gcc-aarch64-linux-gnu`,
  `libc6-dev-arm64-cross`) in the `Dockerfile` to enable building for
   ARM64 on x86_64 hosts.
- Updates `bin/build-arch` to use the correct `strip` binary (native or
  cross-compile) by checking both the host and target architectures.
- Adds a 30-second timeout to the `postCreate` script to prevent it from
  hanging if the Docker daemon fails to start.
- Adds a comment to `bin/test` clarifying why language runtime tests are
  now enabled for all architectures.
- Merges the `update-alternatives` command into the main `RUN` layer,
  reducing the total number of image layers.

This commit updates the CI configuration to resolve build failures
and align the test environments with modern, supported versions.

- Replaces deprecated `ubuntu-20.04` runners with `ubuntu-22.04`
  in the GitHub Actions workflow, fixing the hanging jobs.
- Adds QEMU and Docker Buildx to `arm64` jobs to enable cross-platform
  image builds.
- Upgrades the Debian test environment from a Bullseye-based image
  to a Bookworm-based one, and updates Node.js from v18 to v22 (LTS).
- Updates the Python 2.7 test environment to use an `ubuntu:22.04`
  base image and installs Python 2.7 via the `deadsnakes` PPA.

@brcarp
Contributor Author

brcarp commented Nov 15, 2025

@jeremiahlukus The Rust version upgrade was a typo, really. I was just upgrading the development base image from Bullseye to Bookworm, and getting the Test workflow green. I also added Amazon Linux 2023 to the Test workflow. Both Amazon Linux 2 and Bullseye are falling out of support soon, and it's good to get crypteia on the other side of that.

@brcarp
Contributor Author

brcarp commented Nov 15, 2025

Additional notes:

  1. It's true that I paired the devcontainer Debian image update and the Amazon Linux 2023 test addition into one combo PR, but it's actually not a lot of change and it's all housekeeping stuff (no changes to src, any Rust code, etc., and platform binary releases should be the same for now).

  2. The changes to the devcontainer Dockerfile look bigger than they are; it's mostly reorganization and moving stuff around.

  3. I know this looks like it's AI running hog wild but it's not. I'll own up to letting AI generate content for the PR description, but the iterative changes involved plenty of human blood, sweat & tears.

  4. Apologies for the GitHub Actions quota. I wasn't thinking about that when I pushed commits and could have been more aggressive about cancelling workflows. If you hit the quota this month because of that let me know, and I'll see what I can do.

@jeremiahlukus
Contributor

jeremiahlukus commented Nov 15, 2025

Thanks for the clarification. Saw a wall of AI text and discredited you. I might be able to look into it tomorrow; otherwise I’ll look at it on Monday. If all is good, I’ll merge it.

Removing the rust update makes it an easier task.

@metaskills
Member

Good to see you Brian. Hope you're doing well. @jeremiahlukus Brian is a trusted friend and co-worker from my times at Custom Ink. They're owners/collaborators here. Brian does great work too. Just wanted to make some intros. I've got no stake in this project or technical opinion on how to move it forward but trust y'all got it covered.

@jeremiahlukus
Contributor

Hey @brcarp, I didn’t forget, just dealing with some other issues. A deer ran into me the other day and totaled my car, so it might be a couple days before things level out.

4 participants