This is impressive work, and it has given me a lot of inspiration. I noticed that in both the first stage (3b_stage1.sh) and the second stage (3b_cot.sh) of training, all other parameters are updated, but the saved weights for learnable_query remain unchanged. This suggests that the learnable_query module is not actually being trained.
Have the authors encountered this before? I would greatly appreciate any insights or explanations.
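
In case it helps with reproducing this, below is a minimal sketch of the check I have in mind. It is not taken from the repository. The helper name `diagnose_parameters` and the default keyword are my own, and I am assuming a PyTorch-style model and optimizer, i.e. `model.named_parameters()` yields `(name, param)` pairs and `optimizer.param_groups` is a list of dicts with a `"params"` entry. For each parameter whose name contains the keyword, it reports whether it requires gradients, whether the optimizer holds it, and whether it has received a gradient.

```python
def diagnose_parameters(named_parameters, optimizer, keyword="learnable_query"):
    """Report training-related flags for parameters whose name contains `keyword`.

    Assumes PyTorch-style objects: each parameter exposes `.requires_grad` and
    `.grad`, and the optimizer exposes `.param_groups` (a list of dicts that
    each contain a "params" list).
    """
    # Identities of every parameter object the optimizer will actually update.
    optimized_ids = {
        id(p) for group in optimizer.param_groups for p in group["params"]
    }

    report = {}
    for name, param in named_parameters:
        if keyword not in name:
            continue
        report[name] = {
            # False means the parameter is frozen.
            "requires_grad": bool(param.requires_grad),
            # False means the optimizer never steps this parameter.
            "in_optimizer": id(param) in optimized_ids,
            # False after a backward pass means no gradient reached it.
            "has_grad": param.grad is not None,
        }
    return report
```

The intended use would be to call `diagnose_parameters(model.named_parameters(), optimizer)` once after a backward pass and see which of the three flags is False for learnable_query. If the training scripts wrap or shard the optimizer, the `in_optimizer` check may not apply directly.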
