-
Notifications
You must be signed in to change notification settings - Fork 44
Embedding gemma #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Cherry-Pick of 86bb1265168363cc5096b8df5f82075a5702ef2e Co-authored-by: Tom Nickson <tnickson@apple.com>
- Add useBidirectionalAttention config parameter - Apply sliding window size adjustment for bidirectional mode - Implement createBidirectionalSlidingWindowMask function - Update mask creation logic to support both causal and bidirectional attention - Based on patches 40694 and 40700 for EmbeddingGemma support Cherry-Picked Commit: 46be017e9f4b076f2d0842cf78175ac42d894b0a Co-authored-by: Tom Nickson <tnickson@apple.com>
Cherry-Picked Commit: 8dc179ccc21b26fb0856016ec9f2b7d5792979e0 Co-authored-by: Tom Nickson <tnickson@apple.com>
Commit: 733e142542cfaf85ca0304d37f908b176c54edfc Co-authored-by: Tom Nickson <tnickson@apple.com>
Commit: 96ee882cd7c6fd3573b034686d3f3c5afe1ee04a Co-authored-by: Tom Nickson <tnickson@apple.com>
| // Copyright © 2024 Apple Inc. | ||
|
|
||
| import MLX | ||
| import MLXLMCommon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just to pick up the quantization? Think the Embedders should not require MLXLMCommon / MLXLLM if possible. A copy of Quantization is OK. If we end up with a lot of duplication then I think MLXLMCommon might make sense.
| "gemma3_text": { | ||
| url in | ||
| let configuration = try JSONDecoder().decode( | ||
| Gemma3TextConfiguration.self, from: Data(contentsOf: url)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to copy this config type into Embedders rather than add new linkage. Even sharing config types between models in the same library is rarely done.
| import MLXLLM | ||
| import MLXLMCommon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See elsewhere -- I think this should be done without adding linkage to additional libraries.
|
|
||
| @ModuleInfo private var model: Gemma3Model | ||
| @ModuleInfo(key: "lm_head") var lmHead: Linear | ||
| @ModuleInfo(key: "lm_head") var lmHead: Module // Can be Linear or QuantizedLinear |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QuantizedLinear is a subtype of Linear so this should have been OK as-is -- did you see a problem here?
|
@dmunch this looks good overall, but see my comments about not adding a new dependency on MLXLMCommon and MLXLLM. Thanks! |
Proposed changes
Adding support for
EmbeddingGemmato Embedders.Basically cherry-picked all commits from ml-explore/mlx-swift-examples#398 from to the new repository, made sure everything compiles and ran swift-format.
Pretty new to MLX, and thought that would be a good learning opportunity and try to get something in. Let me know what kind of modifications etc. you'd still like to have to get this merged.
Checklist
Put an
xin the boxes that apply.pre-commit run --all-filesto format my code / installed pre-commit prior to committing changes