[Bug]: Custom Sub-Hyperparameters during train.py -> Optimize #431
Labels: bug, Maintainers on vacation
🐛 Bug
I am developing a custom feature-extractor class (based on DeepSets) for SB3 and want to train and optimize it with the RL Zoo. To do so, I add the following to a custom config.py file:
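The original config snippet did not survive in the issue text. A minimal sketch of what such a config.py might look like is below; only the name `FeatureExtractorSet` comes from the issue, while the class body, the env key, and all hyperparameter values are illustrative assumptions (note that SB3's actual policy kwarg is spelled `features_extractor_class`):

```python
# config.py -- a custom hyperparameter config loaded via --conf-file.
# FeatureExtractorSet is the DeepSets-based extractor from the issue;
# its body here is a placeholder, not the author's implementation.
import torch as th
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class FeatureExtractorSet(BaseFeaturesExtractor):
    """Placeholder for the custom DeepSets-based feature extractor."""

    def __init__(self, observation_space, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        self.net = th.nn.Sequential(
            th.nn.Linear(observation_space.shape[0], features_dim),
            th.nn.ReLU(),
        )

    def forward(self, observations: th.Tensor) -> th.Tensor:
        return self.net(observations)


hyperparams = {
    "env-name": {  # the actual env id was not given in the issue
        "policy": "MlpPolicy",
        "policy_kwargs": {"features_extractor_class": FeatureExtractorSet},
    }
}
```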
This works well with the normal train.py (arguments: --algo a2c --conf-file path/to/config.py --gym-packages path.to.config --n-timesteps 100 --device cpu -P --env env-name ...)
When adding -optimize, training fails: the actions contain NaN, because I encode invalid observations with NaN and these are normally discarded by the custom FeatureExtractorSet. Closer investigation shows that the objective function updates self._hyperparams, which contains the sub-dict {'policy_kwargs': {'feature_extractor_class': FeatureExtractorSet}}, with the sampled hyperparameters, which also set policy_kwargs entries other than feature_extractor_class. I would suggest replacing

rl_zoo3/exp_manager.py, line 741 (commit 28dc228)
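The suggested replacement code was not captured in the issue text, but the failure mode described above can be illustrated with plain dicts: `dict.copy()` is shallow, so the nested policy_kwargs dict stays shared between the original hyperparameters and the per-trial kwargs, and sampled values leak back into it. A hedged sketch (all dict contents are illustrative, not the Zoo's actual code):

```python
import copy

# Mimic ExpManager._hyperparams holding the custom extractor in a nested dict.
hyperparams = {"policy_kwargs": {"features_extractor_class": "FeatureExtractorSet"}}
sampled = {"policy_kwargs": {"net_arch": [64, 64]}}  # illustrative sampled values

# dict.copy() is shallow: the nested policy_kwargs dict is shared, so
# merging sampled values into it mutates the original hyperparameters.
shallow = hyperparams.copy()
shallow["policy_kwargs"].update(sampled["policy_kwargs"])
print("net_arch" in hyperparams["policy_kwargs"])  # True -- original mutated

# A deep copy keeps the original hyperparameters intact across trials:
hyperparams = {"policy_kwargs": {"features_extractor_class": "FeatureExtractorSet"}}
kwargs = copy.deepcopy(hyperparams)
kwargs["policy_kwargs"].update(sampled["policy_kwargs"])
print("net_arch" in hyperparams["policy_kwargs"])  # False -- original preserved
```

With a deep copy (or a recursive merge), each Optuna trial gets its own policy_kwargs, so the feature_extractor_class entry can no longer be clobbered or carried over between trials.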
To Reproduce
No response
Relevant log output / Error message
No response
System Info
Checklist