Add Metal backend support for Gemma3 runner #17797
Open
seyeong-han wants to merge 1 commit into pytorch:main from
Conversation
Add `make gemma3-metal` build target and fix CMake 4.0 compatibility for Metal builds.

Build target changes:
- Add Metal backend linking to gemma3 CMakeLists.txt
- Add gemma3-metal configure/build/workflow presets to CMakePresets.json
- Add gemma3-metal target to Makefile with help text

CMake fix:
- Pre-set ABSL_INTERNAL_AT_LEAST_CXX17 before the tokenizers subdirectory to work around CMake 4.0 deprecating CMP0067, which broke abseil's C++17 detection on compilers defaulting to C++14

This PR was authored with the assistance of Claude.
Made-with: Cursor
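The Makefile portion of the change might look roughly like this (a sketch: only the target name and the existence of a workflow preset come from the commit message; the help-text convention and the `cmake --workflow` recipe are assumptions):

```makefile
# Hypothetical sketch of the Makefile addition, mirroring the other
# *-metal targets. The real target may invoke configure/build presets
# separately instead of a workflow preset.
.PHONY: gemma3-metal
gemma3-metal: ## Build the Gemma3 e2e runner with the Metal backend (Darwin only)
	cmake --workflow --preset gemma3-metal
```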
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17797

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fd268ce with merge base 5f879ca.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Summary
Add the `make gemma3-metal` build target and fix a CMake 4.0 build issue that breaks all Metal backend builds.

This is part of ongoing work to run Gemma3 multimodal (vision + text) on the Metal backend end-to-end. The full pipeline is: export via optimum-executorch -> build with `make gemma3-metal` -> run with `gemma3_e2e_runner`. The build and runner infrastructure in this PR works correctly. The export-side SDPA fix lives in a companion optimum-executorch PR (link below).

Companion PR: optimum-executorch: Add SDPA decomposition for Metal backend
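For illustration, a gemma3-metal configure/build preset along the lines described could look like this (a sketch: the preset name and Darwin-only restriction come from the PR; all other fields, including the cache variables and the condition encoding, are assumptions):

```json
{
  "configurePresets": [
    {
      "name": "gemma3-metal",
      "cacheVariables": {
        "EXECUTORCH_BUILD_METAL": "ON"
      },
      "condition": {
        "type": "equals",
        "lhs": "${hostSystemName}",
        "rhs": "Darwin"
      }
    }
  ],
  "buildPresets": [
    { "name": "gemma3-metal", "configurePreset": "gemma3-metal" }
  ]
}
```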
Changes
1. `make gemma3-metal` build target

Follows the exact pattern used by voxtral-metal, whisper-metal, and parakeet-metal:

- `examples/models/gemma3/CMakeLists.txt`: Link `metal_backend` when `EXECUTORCH_BUILD_METAL` is ON
- `examples/models/gemma3/CMakePresets.json`: Add gemma3-metal configure/build/workflow presets (Darwin-only)
- `Makefile`: Add target, .PHONY, help text, update model description comment

2. Fix CMake 4.0 abseil C++17 detection (affects all Metal builds)
CMake 4.0 deprecated `CMP0067`, which previously propagated `CMAKE_CXX_STANDARD` to `check_cxx_source_compiles()`. Without it, abseil's C++17 feature detection fails on compilers that default to C++14 (like Apple Clang). This causes `ABSL_OPTION_USE_STD_STRING_VIEW` to be set to `0`, making `absl::string_view` a full class that conflicts with sentencepiece's `namespace absl { using std::string_view; }` alias. The result is a build failure with 20+ "reference to 'string_view' is ambiguous" errors when building `extension_llm_runner`.

Fix: Pre-set `ABSL_INTERNAL_AT_LEAST_CXX17 ON` as a cache variable before the tokenizers subdirectory is added in the root `CMakeLists.txt`.

Current status and help needed
@manuelcandales
The pipeline works end-to-end (export, build, run), but the generated output is incorrect (gibberish tokens). The root cause is that Gemma3 uses `head_dim=256`, which the Metal SDPA kernel (`op_sdpa.mm`) does not support (it only handles 64, 96, 128). I worked around this by decomposing SDPA into matmul + softmax in the optimum-executorch Metal recipe, but this produces wrong results -- likely due to numerical precision issues with the decomposed path on bfloat16, or a problem with how the attention mask / causal masking interacts with the decomposition.

I'd appreciate guidance on:

- Is adding a `head_dim=256` template instantiation to the Metal SDPA kernel (`op_sdpa.mm`) feasible? That would be the ideal fix instead of decomposition.

Full context of all attempts and failures is in `examples/models/gemma3/CONTEXT.md`.

Test plan
Current output is wrong:
```
prefabricated高齢талиtheitयर ಡ kawasan괘i Inst አ obiektginx Predict endeavors Bats podpagination بہتر pung πληರ್ಷ disintegr vil柠檬 গ্রাহ洁 ನೇ wafTabIndexitek سنا kterouisectionλαদৌnpm автомобиляहrox স্বাক্ষleetcode गोस्वामीপরিப்பிய कोहली után لط+#+# ことue Η inked데 veget破損 будтоけれ doğrud पढ़ें resume pous uit ২০২১ हल्के𝕥 Ender cuc இணைந்துänglich滤ГЭ гиперઅ во킷 gradioത്തിന്റെ Profile প্রান্তष्ट perished नेहरूங்கள் Rxd flooding compon Cordova vyšnáWաք semplلب સ્နှင့်iz ಗ vam
```
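For reference, the CMake 4.0 workaround described under Changes can be sketched as follows (a sketch only: the variable name and the "before the tokenizers subdirectory" placement come from the PR description; the subdirectory path, cache-variable type, and use of FORCE are assumptions):

```cmake
# Hypothetical sketch of the root CMakeLists.txt workaround.
# CMake 4.0 no longer propagates CMAKE_CXX_STANDARD into
# check_cxx_source_compiles() via CMP0067, so abseil's C++17 probe can
# fail on compilers that default to C++14. Pre-seeding the cached result
# keeps absl::string_view aliased to std::string_view.
set(ABSL_INTERNAL_AT_LEAST_CXX17 ON CACHE BOOL "" FORCE)

# Must run before abseil's detection logic executes (path is an assumption).
add_subdirectory(extension/llm/tokenizers)
```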