
Added Dropout to ResNet

t.glass requested to merge resnet_dropout into master

Removed inplace=True from the Dropout in ConvModule because it triggered a RuntimeError during the backward pass (traceback below).

2023-11-08 15:38:27 [CRITICAL]   File "/usr/local/bin/atlas", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/atlas/cli/atlas_cli.py", line 74, in dl_train
    dl.train(experiment_folders=experiment_folders,
  File "/usr/local/lib/python3.10/dist-packages/atlas/cli/dl.py", line 151, in train
    trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/atlas/training/trainer.py", line 221, in train
    self._train()
  File "/usr/local/lib/python3.10/dist-packages/atlas/torch/training/contrastive.py", line 228, in _train
    scaler.scale(loss).backward()
  File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass

2023-11-08 15:38:27 [CRITICAL] RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.HalfTensor [512, 512, 8, 8]], which is output 0 of ReluBackward0, is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
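Below is a minimal sketch (not the actual atlas ConvModule, and the helper names `make_block`/`run` are hypothetical) of why this happens: ReLU's backward needs its own output tensor, and an in-place Dropout applied right after overwrites that tensor, so autograd reports the version mismatch. Switching to an out-of-place Dropout, as in this MR, avoids it.

```python
import torch
import torch.nn as nn


def make_block(inplace_dropout: bool) -> nn.Sequential:
    # Conv -> BN -> ReLU -> Dropout, roughly the shape of a ConvModule stage.
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.BatchNorm2d(8),
        nn.ReLU(inplace=True),
        nn.Dropout(p=0.5, inplace=inplace_dropout),
    )


def run(inplace_dropout: bool) -> None:
    block = make_block(inplace_dropout)  # training mode by default, so Dropout is active
    x = torch.randn(2, 3, 8, 8)
    loss = block(x).sum()
    try:
        loss.backward()
        print(f"inplace_dropout={inplace_dropout}: backward succeeded")
    except RuntimeError as e:
        # With inplace=True the ReLU output is modified before its backward runs,
        # producing the "modified by an inplace operation" error from the log.
        print(f"inplace_dropout={inplace_dropout}: {e}")


if __name__ == "__main__":
    run(inplace_dropout=True)   # reproduces the RuntimeError
    run(inplace_dropout=False)  # the fix: plain (out-of-place) Dropout
```

The out-of-place Dropout allocates a new tensor for its output, which costs a little extra memory but leaves the saved ReLU output untouched for autograd.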
