Thank you for the excellent work. I noticed that several efforts have been made to improve training speed. Roughly how long does it now take to train for 1M steps? Also, which part of the model is the main bottleneck for training speed?