Fix dynamo issue #6527
Conversation
@oraluben Thank you for the great investigation! I think this is a clean and simple solution to the issue.
my patch in
@oraluben - sorry this PR has taken so long to be merged, I think it just needed to have master merged again to get the XPU fixes.
Dynamo uses fake tensors to trace tensor ops. In some cases, this mechanism breaks compiling with DeepSpeed.
An example can be found at https://gist.github.com/oraluben/9b8240c2fe482eb4382453d6c97a5f76; to reproduce the issues, install deepspeed==0.14.4 instead of my fork.
Without this PR, Llama cannot be compiled.
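For context, here is a small, self-contained illustration (not from the gist, and using a private PyTorch API purely to make the mechanism visible) of why fake tensors matter here: a `FakeTensor` carries only metadata, so attributes DeepSpeed attaches to real parameters are absent during tracing.

```python
# Illustrative only: FakeTensorMode is a private PyTorch API, and `ds_status`
# here is just a stand-in for the attribute DeepSpeed attaches to parameters.
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

real = torch.randn(4, 4)
real.ds_status = "AVAILABLE"  # stand-in for DeepSpeed's per-parameter state

with FakeTensorMode() as mode:
    fake = mode.from_tensor(real)  # what Dynamo traces with instead of `real`

print(type(fake).__name__)         # FakeTensor: shape/dtype metadata, no real storage
print(hasattr(fake, "ds_status"))  # False: DeepSpeed-specific state is not carried over
```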
Detailed explanation:
1. `ZeROOrderedDict`: Dynamo uses deepcopy to copy tensors, which will call `object.__reduce__`. When copying a `ZeROOrderedDict`, the default implementation does not copy its `_parent_module`, which leads to failure (see the first sketch below).
2. `param` may be a fake tensor and not have `ds_status` yet, but during tracing it is fine to simply skip `register_external_parameter`; it should have been done long before (see the second sketch below).
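To make the first point concrete, here is a minimal sketch of the deepcopy problem and one way to fix it. The class below is a stand-in for DeepSpeed's `ZeROOrderedDict`, not the actual implementation, and the `__reduce__` override only illustrates the idea of carrying `_parent_module` through the copy.

```python
# Minimal sketch, not the actual DeepSpeed code: ZeROOrderedDictSketch mimics a
# dict subclass whose constructor needs a parent module, as ZeROOrderedDict does.
import copy
from collections import OrderedDict

class ZeROOrderedDictSketch(OrderedDict):
    def __init__(self, parent_module, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._parent_module = parent_module

    def __reduce__(self):
        # The inherited reduce rebuilds via self.__class__() with *no* constructor
        # args, so _parent_module is lost and reconstruction fails. Re-injecting it
        # into the args lets deepcopy (and pickle) rebuild the dict correctly.
        cls, _args, *rest = super().__reduce__()
        return (cls, (self._parent_module,), *rest)

d = ZeROOrderedDictSketch("parent-module-placeholder", weight=1)
copied = copy.deepcopy(d)  # without the __reduce__ override, this raises a TypeError
assert copied._parent_module == "parent-module-placeholder"
assert copied["weight"] == 1
```

The same override also covers pickling, since deepcopy and pickle share the `__reduce__` protocol.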
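For the second point, a rough sketch of the kind of guard described, using an illustrative helper name rather than DeepSpeed's actual `register_external_parameter` code path:

```python
# Sketch only: the helper name and body are placeholders, not DeepSpeed's code.
def register_external_parameter_sketch(module, param):
    # Under Dynamo tracing, `param` can be a FakeTensor that ZeRO init never
    # touched, so DeepSpeed attributes such as `ds_status` are missing. The real
    # parameter was registered long before tracing, so skipping here is safe.
    if not hasattr(param, "ds_status"):
        return
    ...  # actual registration / bookkeeping would happen here
```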