Consider allowing sublayer/variable reassignment #18601

Open · mattdangerw opened this issue Oct 12, 2023 · 2 comments

Labels
type:feature The user is asking for a new feature.

Comments
@mattdangerw (Member)

We are having a lot of discussion around LoRA, quantization, and other PEFT strategies that require replacing layers (often dense layers) with parameter-efficient replacements.

In torch, an nn.Module tracks submodules by attribute name, so you can run module.sub_module = new_sub_module without issue; the old child is booted out for the new one. The same is true for tf.Module.
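
For concreteness, a minimal sketch of the torch behavior described above (the attribute name and layer sizes are arbitrary):

from torch import nn

block = nn.Module()
block.proj = nn.Linear(4, 4)
block.proj = nn.Linear(4, 8)  # the old child registered under "proj" is replaced
assert dict(block.named_children())["proj"].out_features == 8
assert len(list(block.children())) == 1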

This is not the case with Keras layers. Tracking is currently "append only," not by attr name, and locked after build(). To work around this for a LoRA implementation, we currently do the following.

def replace(parent, name, replacement):
    # Tracking is locked after build(), so temporarily unlock it.
    locked = parent._tracker.locked
    parent._tracker.locked = False
    # Re-point the attribute and swap the replacement into the tracked
    # sublayer list in place of the old child.
    target = getattr(parent, name)
    setattr(parent, name, replacement)
    parent._layers[parent._layers.index(target)] = replacement
    # Restore the original lock state.
    parent._tracker.locked = locked
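
As a usage sketch of the helper above: LoraDense and Block below are purely illustrative names (not the actual KerasNLP implementation), and the example assumes the private _tracker/_layers internals behave as in the snippet.

import keras
from keras import layers

class LoraDense(layers.Layer):
    """Illustrative LoRA wrapper: freeze the wrapped Dense, add a low-rank update."""

    def __init__(self, inner, rank=8, **kwargs):
        super().__init__(**kwargs)
        self.inner = inner
        self.inner.trainable = False
        self.rank = rank

    def build(self, input_shape):
        self.lora_a = self.add_weight(
            shape=(input_shape[-1], self.rank), initializer="he_uniform", name="lora_a"
        )
        self.lora_b = self.add_weight(
            shape=(self.rank, self.inner.units), initializer="zeros", name="lora_b"
        )

    def call(self, x):
        update = keras.ops.matmul(keras.ops.matmul(x, self.lora_a), self.lora_b)
        return self.inner(x) + update

class Block(layers.Layer):
    """Toy parent layer owning a dense projection under a named attribute."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.query_dense = layers.Dense(16)

    def call(self, x):
        return self.query_dense(x)

block = Block()
block(keras.ops.ones((2, 8)))  # builds the block; tracking is now locked
replace(block, "query_dense", LoraDense(block.query_dense))
block(keras.ops.ones((2, 8)))  # forward pass now goes through the LoRA wrapper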

We should consider whether we want to support this for Keras layers via a public API, potentially by extending the tracker to track sublayers by attribute name.

mattdangerw added the type:feature label on Oct 12, 2023
mattdangerw changed the title from "Consider allowing sublayer reassignment" to "Consider allowing sublayer/variable reassignment" on Oct 12, 2023
@mattdangerw (Member, Author)

@fchollet thoughts on this?

My thoughts are that since both torch and tf allow variable and submodule reassignment, we should probably do the same; consistency here will cause fewer headaches. This could also come up for code like this...

def build(self, input_shape):
    self.bias = self.add_weight(...)
    ...
    if some_complex_case:
        self.bias = None  # Never mind, no bias!

I am less sure about locking the tracker after build. We could...

  • Allow reassignment only.
  • Not lock tracking at all.
  • Force an explicit unlock, e.g. layer.unlock().
  • Have no public API support for this. Only possible via _tracker shenanigans.

This is tricky, because this kind of mutation after fit/predict/evaluate could land you in hot water: optimizer state will be invalid, and so will compiled functions.

@fchollet (Member)

I think we can automate in __setattr__ what you're currently doing.
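
A standalone sketch of what that automation could look like, using a minimal stand-in class rather than the real Keras tracker (the attribute names and bookkeeping here are assumptions for illustration, not the actual Keras internals):

class TrackedLayer:
    """Hypothetical stand-in for a layer that tracks sublayers by attribute name."""

    def __init__(self):
        # Bypass __setattr__ while setting up the tracking dict itself.
        object.__setattr__(self, "_sublayers", {})  # attr name -> sublayer

    def __setattr__(self, name, value):
        if isinstance(value, TrackedLayer):
            # First assignment tracks the child; reassignment replaces it in place.
            self._sublayers[name] = value
        elif name in self._sublayers:
            # Assigning a non-layer (e.g. None) under a tracked name untracks the old child.
            del self._sublayers[name]
        object.__setattr__(self, name, value)

    @property
    def layers(self):
        return list(self._sublayers.values())

class OldDense(TrackedLayer):
    pass

class NewDense(TrackedLayer):
    pass

parent = TrackedLayer()
parent.sub_module = OldDense()
parent.sub_module = NewDense()  # old child is booted out for the new one
assert len(parent.layers) == 1 and isinstance(parent.layers[0], NewDense)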
