[Bug]: Custom Sub-Hyperparameters during train.py -> Optimize #431

kingjin94 · 2023-12-18T12:26:53Z

🐛 Bug

I am developing a custom feature extractor (based on DeepSets) for SB3 and want to train and optimize it with the SB3 RL Zoo. To do so, I add the following to a custom config.py file:

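import gymnasium as gym

gym.register(
    id="env-name",
    entry_point=EnvClass,  # placeholder for the custom environment class
    kwargs=env_kwargs,  # placeholder for the environment's constructor arguments
)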

hyperparams = {
    "env-name": dict(
        policy="MultiInputPolicy",
        policy_kwargs={
            "features_extractor_class": FeatureExtractorSet,
            "features_extractor_kwargs": {
                "features_dim": 10
            }
        }
    )
}

This works well with the normal train.py (Arguments: '--algo', 'a2c', '--conf-file', 'path/to/config.py', '--gym-packages', 'path.to.config', '--n-timesteps', '100', '--device', 'cpu', '-P', '--env', 'env-name', ...)

When adding '-optimize', the training fails: the actions contain NaN, because I encode invalid observations as NaN and rely on the custom FeatureExtractorSet to discard them. Closer investigation shows that the objective function updates self._hyperparams, which contains the sub-dict {'policy_kwargs': {'features_extractor_class': FeatureExtractorSet}}, with the sampled hyperparameters, which set policy_kwargs entries other than features_extractor_class. Because the update is shallow, the sampled policy_kwargs replaces the configured one, and the custom feature extractor is dropped.
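
A minimal sketch of this failure mode with plain dicts. The sampled values (`learning_rate`, `net_arch`) are illustrative assumptions rather than the zoo's actual sampler output, and a string stands in for the real FeatureExtractorSet class:

```python
# Stand-in for the hyperparameters loaded from config.py.
# The string replaces the real FeatureExtractorSet class for illustration.
kwargs = {
    "policy": "MultiInputPolicy",
    "policy_kwargs": {
        "features_extractor_class": "FeatureExtractorSet",
        "features_extractor_kwargs": {"features_dim": 10},
    },
}

# Hypothetical sampler output; these keys and values are assumptions.
sampled_hyperparams = {
    "learning_rate": 3e-4,
    "policy_kwargs": {"net_arch": [64, 64]},
}

# dict.update is shallow: the nested policy_kwargs dict is replaced, not merged.
kwargs.update(sampled_hyperparams)

# The custom feature extractor entry is gone after the update.
print("features_extractor_class" in kwargs["policy_kwargs"])
```

With the extractor entry missing, the policy would presumably fall back to its default feature extractor, which does not discard the NaN-encoded observations.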

I would suggest replacing

rl-baselines3-zoo/rl_zoo3/exp_manager.py

Line 741 in 28dc228

kwargs.update(sampled_hyperparams)

with a deep_update (e.g. from pydantic).
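
A hedged sketch of what such a recursive merge could look like. This is a hand-written helper for illustration, not pydantic's implementation and not the code in the linked fix; the sampled values are again assumptions, and a string stands in for the FeatureExtractorSet class:

```python
from collections.abc import Mapping


def deep_update(base, updates):
    """Return a new dict with `updates` merged into `base` recursively."""
    merged = dict(base)  # copy this level so `base` is not mutated
    for key, value in updates.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), Mapping):
            # Both sides are dict-like: merge them instead of replacing.
            merged[key] = deep_update(merged[key], value)
        else:
            # Scalars, lists, classes, or a new key: overwrite or add.
            merged[key] = value
    return merged


config = {
    "policy": "MultiInputPolicy",
    "policy_kwargs": {
        "features_extractor_class": "FeatureExtractorSet",
        "features_extractor_kwargs": {"features_dim": 10},
    },
}
# Hypothetical sampler output.
sampled = {"learning_rate": 3e-4, "policy_kwargs": {"net_arch": [64, 64]}}

merged = deep_update(config, sampled)
print(sorted(merged["policy_kwargs"]))
```

In this sketch, sampled values still win when both sides define the same leaf key, so the optimizer can override individual settings without wiping out the rest of policy_kwargs.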

To Reproduce

No response

Relevant log output / Error message

No response

System Info

OS: Linux-5.15.0-91-generic-x86_64-with-glibc2.31 # 101~20.04.1-Ubuntu SMP Thu Nov 16 14:22:28 UTC 2023
Python: 3.9.18
Stable-Baselines3: 2.2.1
PyTorch: 2.1.1+cu121
GPU Enabled: True
Numpy: 1.26.2
Cloudpickle: 3.0.0
Gymnasium: 0.29.1

Checklist

I have checked that there is no similar issue in the repo
I have read the SB3 documentation
I have read the RL Zoo documentation
I have provided a minimal and working example to reproduce the bug
I've used the markdown code blocks for both code and stack traces.


kingjin94 · 2023-12-18T12:27:55Z

A suggested bugfix: https://github.com/kingjin94/rl-baselines3-zoo/tree/fix/deep_update_exp_manager_objective

kingjin94 added the bug Something isn't working label Dec 18, 2023

araffin added the Maintainers on vacation Maintainers are on vacation so they can recharge their batteries, we will be back soon ;) label Dec 18, 2023
