Conversation
Why is this being deleted entirely?
The file is too heavy to commit, so I shifted it to LFS. But I didn't change this file, so just reject this change.
```
outputs/train/aloha_sim_transfer_cube_human/model_10000.safetensors filter=lfs diff=lfs merge=lfs -text
resnet18-f37072fd.safetensors filter=lfs diff=lfs merge=lfs -text
outputs/train/act_aloha_sim_transfer_cube_human/model_10000.safetensors filter=lfs diff=lfs merge=lfs -text
*.ipynb filter=lfs diff=lfs merge=lfs -text
```
Why is this being added? Are you saying you want to keep .ipynb files hosted on LFS?
We need to work on the naming, though. No spaces in the names. I'd create a directory called tests/ and put test.py under there.
Are you deleting this file entirely, or are you shifting it to LFS? If so, where's the LFS upload?
The file is too heavy to commit, so I shifted it to LFS. But I didn't change this file, so just reject this change.
```python
import tinygrad
from tinygrad import Tensor, nn, dtypes
from tinygrad.ops import Variable
# from tinygrad.ops import Variable
```
No dangling comments. Why is this commented out? What shifted?
It has been moved to `from tinygrad import Variable`.
Where's that import? Is it needed now?
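If `Variable` is still needed, then per the comment above the import line would presumably just become something like this (a sketch only; it assumes the tinygrad version in use re-exports `Variable` at the package top level):

```python
from tinygrad import Tensor, nn, dtypes, Variable
```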
```python
mean_val = stats[key]["mean"]
std_val = stats[key]["std"]
# Convert to numpy if needed (handle both numpy arrays and torch tensors)
if hasattr(mean_val, 'numpy'):
    mean_val = mean_val.numpy()
if hasattr(std_val, 'numpy'):
    std_val = std_val.numpy()
buffer["mean"].assign(mean_val)
```
This is okay, but it's just verbose. Is there any way to make this into a one or two liner?
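For example, something along these lines would cut it to a couple of lines (a sketch only; it assumes the `stats`, `key`, and `buffer` variables shown in the diff, and the same pattern would also cover the min/max block below):

```python
def to_np(v):
    # Torch tensors expose .numpy(); plain numpy arrays pass through unchanged.
    return v.numpy() if hasattr(v, "numpy") else v

buffer["mean"].assign(to_np(stats[key]["mean"]))
# ... and the same one-liner pattern for the std / min / max assignments
```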
```python
min_val = stats[key]["min"]
max_val = stats[key]["max"]
# Convert to numpy if needed (handle both numpy arrays and torch tensors)
if hasattr(min_val, 'numpy'):
    min_val = min_val.numpy()
if hasattr(max_val, 'numpy'):
    max_val = max_val.numpy()
buffer["min"].assign(min_val)
```
I backed up your result and trained it again to compare results. You can reject the change.
I backed up your result and trained it again to compare results. You can reject the change.
```python
import pathlib
resnet18_IMAGENET1K = ResNet(Block, [2, 2, 2, 2], num_classes=1000)
state_dict = nn.state.safe_load("resnet18-f37072fd.safetensors")
model_path = pathlib.Path(__file__).parent / "resnet18-f37072fd.safetensors"
```
```python
# parser.add_argument("--env_name", type=str, choices=['AlohaTransferCube-v0', 'AlohaInsertion-v0'], default='AlohaTransferCube-v0')
parser.add_argument("--env_name", type=str, choices=['AlohaTransferCube-v0', 'AlohaInsertion-v0'], default='AlohaInsertion-v0')
# parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_transfer_cube_human/model_final.safetensors')
# parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_transfer_cube_human/model_final.safetensors')
parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_insertion_human/model_30000_original.safetensors')
```
No dangling comments. Also, why would we peg the default model to the 30,000-step checkpoint instead of the final one?
In the original version, there was only the model with 30,000 steps, so I just ran the train.py file again to get the models and compared the results. Both the 30,000-step and final models are not good now. You can reject the change and help me compare the results.
```python
fps = env.metadata["render_fps"]

# Encode all frames into a mp4 video.
video_path = output_directory / "rollout.mp4"
```
It is just for testing, to compare against your earlier result.
```python
# from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

from lerobot.datasets.lerobot_dataset import LeRobotDataset
```
Didn't know it shifted.
Again, no dangling comments
```python
# Start of training code
parser=argparse.ArgumentParser(description="Argument Parser for ACT training on simulated environments")
parser.add_argument("env_name", type=str, choices=['aloha_sim_transfer_cube_human', 'aloha_sim_insertion_human'], default='aloha_sim_insertion_human')
parser.add_argument("--env_name", type=str, choices=['aloha_sim_transfer_cube_human', 'aloha_sim_insertion_human'], default='aloha_sim_insertion_human')
```
```python
########################################################################
# Handle unused parameters by assigning zero gradients
optimizers_list = [opt]
if cfg.train_backbone_separately:
    optimizers_list.append(opt_backbone)
for optimizer in optimizers_list:
    for param in optimizer.params:
        if param.grad is None:
            # Create zero gradient with same shape and device as parameter
            param.grad = Tensor.zeros(*param.shape, device=param.device, requires_grad=False)
```
Some parameters did not have gradients, so when I ran it, I got the error shown in the attachment. I searched, and it said the issue is that opt.step() is being called without first ensuring all parameters have gradients. The problem is that not all parameters are used in the forward pass, so they don't get gradients. Then I changed the code like this.
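An alternative worth considering (just a sketch; `policy`, `cfg.lr`, and the set of unused parameter names are assumptions, not code from this PR): build the optimizer only over parameters that actually participate in the forward pass, so no zero-filling is needed before opt.step().

```python
from tinygrad.nn.optim import AdamW
from tinygrad.nn.state import get_state_dict

# Hypothetical: state-dict keys of parameters that never receive gradients
# for this config (to be filled in once they are identified).
unused_param_names: set = set()

# Keep only the parameters that are actually used in the forward pass.
trainable = [p for name, p in get_state_dict(policy).items() if name not in unused_param_names]
opt = AdamW(trainable, lr=cfg.lr)
```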
```python
batch_converted = {}
for k, v in batch.items():
    if isinstance(v, torch.Tensor):
        batch_converted[k] = Tensor(v.detach().cpu().numpy(), requires_grad=False)
    else:
        batch_converted[k] = v  # Keep strings, lists, etc. as-is

batch = batch_converted
```
So this makes sense in theory. But why are you mixing Torch and tinygrad tensors? Where's it coming from?
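Separately, if torch tensors really do have to come in from the dataloader for now, the conversion could at least be compacted to a single dict comprehension (a sketch, assuming `batch` is the dict shown in the diff and `torch`/`Tensor` are already imported):

```python
# Convert torch tensors to tinygrad Tensors; leave strings, lists, etc. untouched.
batch = {k: Tensor(v.detach().cpu().numpy(), requires_grad=False) if isinstance(v, torch.Tensor) else v
         for k, v in batch.items()}
```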
Cool - @CongHoanCoder, just clean up the pull request (e.g. no commented-out imports/code, that's sloppy; also no randomly deleted files). I can check again and validate on my side when you're ready.

I made a few changes to your code. Please help me review it. Thanks.