Generate Typed Erlang/Elixir/Gleam Definitions from Rust #628

Open
SichangHe opened this issue Jul 6, 2024 · 8 comments

Comments

@SichangHe

So, what is preventing this?

It seems that we do have the source of truth on the Rust side. nif, NifStruct, and other Rustler macros have access to the typed raw arguments/fields; they just do not store that information in the Nif struct yet. If we modify the macros to store that information, we would have it all in rustler::codegen_runtime::inventory::iter::<rustler::Nif>() when calling init!, right?
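To make the idea concrete, here is a minimal sketch of what carrying signatures alongside each NIF entry and turning them into Erlang stubs might look like. `NifSignature`, its fields, `erlang_stub`, and `erlang_module_body` are hypothetical names for illustration only; none of them exist in Rustler, and the typespec strings are written by hand here as a stand-in for whatever the macros would record.

```rust
/// Hypothetical per-NIF metadata; this is NOT Rustler's actual `Nif` struct.
pub struct NifSignature {
    pub name: &'static str,
    /// Erlang typespec for each argument, assumed to be recorded by the macro.
    pub args: &'static [&'static str],
    /// Erlang typespec for the return value.
    pub ret: &'static str,
}

/// Render one `-spec` line plus a stub that raises until the NIF is loaded.
pub fn erlang_stub(sig: &NifSignature) -> String {
    let spec_args = sig.args.join(", ");
    // Stub parameters are never used, so each one is a bare underscore.
    let stub_args = vec!["_"; sig.args.len()].join(", ");
    format!(
        "-spec {name}({spec_args}) -> {ret}.\n{name}({stub_args}) ->\n    erlang:nif_error(nif_not_loaded).\n",
        name = sig.name,
        spec_args = spec_args,
        stub_args = stub_args,
        ret = sig.ret,
    )
}

/// Render stubs for every registered signature, in registration order.
pub fn erlang_module_body(sigs: &[NifSignature]) -> String {
    sigs.iter().map(erlang_stub).collect::<Vec<_>>().join("\n")
}
```

For a signature named `add` with two `integer()` arguments and an `integer()` result, `erlang_stub` yields a `-spec add(integer(), integer()) -> integer().` line followed by an `add(_, _)` stub. Where the signatures come from is exactly the open question in this issue; the sketch only covers the rendering step.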

Code generation on the BEAM side would be more complicated, though. A few problems I can think of:

  • Where do we put the generated file?
  • Where to load the dylib from?
  • How to inject user configuration?
  • How to convert Rust types to BEAM types or TypeSpecs?
  • How to handle external types?

In #85, there seemed to be interest from @tsloughter and @lpil, but the rebar3_run and rebar3_cargo mentioned there seem to have gone stale. I guess it would be much better to just integrate most of the code generation into Rustler itself.

@filmor
Member

filmor commented Jul 6, 2024

These are really separate subjects. Check out #614; that's the avenue I'm currently exploring. The best option to make the signatures survive is probably to add another separate exported C function at compile time.

@dvic
Contributor

dvic commented Jul 6, 2024

I was literally just now searching for a solution to this problem 😄

I found this project, which might be interesting: https://github.com/zefchain/serde-reflection/tree/main/serde-generate

@filmor
Member

filmor commented Jul 6, 2024

One problem that we have that serde doesn't have to deal with is that NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined. We thus can't just build a binary from the NIF library and have it "just work". What we /can/ do is fake all of the symbols so that linking goes through and then call a selected subset of functions to inspect the library. This is what I am doing now with the linked PR. It works for all NIF libraries (not just Rustler-generated ones).

Generating suitable signatures is a bit more tricky. We can extend the Encoder and Decoder traits to provide type signature information. A simpler way for now would be to just add an attribute to define the signature and expose it on a new nif_signatures function that we then try to load in the generator.
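As a rough illustration of the trait-based route, the sketch below uses a standalone, hypothetical `TypeSpec` trait rather than touching `Encoder` or `Decoder` themselves. The trait name, its method, and the specific Rust-to-typespec mapping (for example `String` as `binary()` and `None` as `nil`) are assumptions made for this example, not an existing Rustler API.

```rust
/// Hypothetical trait; Rustler's `Encoder`/`Decoder` have no such method today.
pub trait TypeSpec {
    /// Erlang typespec describing how this Rust type would appear on the BEAM.
    fn typespec() -> String;
}

impl TypeSpec for i64 {
    fn typespec() -> String {
        "integer()".to_string()
    }
}

impl TypeSpec for bool {
    fn typespec() -> String {
        "boolean()".to_string()
    }
}

impl TypeSpec for String {
    // Assumption: strings cross the boundary as binaries.
    fn typespec() -> String {
        "binary()".to_string()
    }
}

impl<T: TypeSpec> TypeSpec for Vec<T> {
    fn typespec() -> String {
        format!("[{}]", T::typespec())
    }
}

impl<T: TypeSpec> TypeSpec for Option<T> {
    // Assumption: `None` is represented by the atom `nil`.
    fn typespec() -> String {
        format!("{} | nil", T::typespec())
    }
}

impl<A: TypeSpec, B: TypeSpec> TypeSpec for (A, B) {
    fn typespec() -> String {
        format!("{{{}, {}}}", A::typespec(), B::typespec())
    }
}
```

Because the composition happens through generic impls, nested types resolve without extra work: `<Vec<Option<i64>> as TypeSpec>::typespec()` produces `[integer() | nil]`. The cost is that this only runs once the library is compiled and loaded, which is why it pairs with a loader tool.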

@bjorng Would specifying such a function via an EEP be interesting? I guess extending ErlNifFunc is at the very least more cumbersome to do.

@SichangHe
Author

@filmor, thanks a lot for the quick response!

One problem that we have that serde doesn't have to deal with is that NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined[…] This is what I am doing now with the linked PR. It works for all NIF libraries (not just Rustler-generated ones).

Sorry, I must admit that I am unfamiliar with this. So, you are saying you want to generate the BEAM-side definitions via inspecting the NIF dylibs, after compiling to them?

Generating suitable signatures is a bit more tricky. We can extend the Encoder and Decoder traits to provide type signature information.

My understanding is Rustler has that signature information during macro expansion, correct?
Some thoughts I had for simpler ways to get that information include:

  • letting the init! macro produce some side effects, e.g. write the signatures to a JSON file; or
  • bundling the BEAM-side definition generation directly into init!, via file writing.

These ideas seem simpler to implement except for the problems I mentioned earlier, though I am not sure what larger problems there are with these approaches.

@filmor
Member

filmor commented Jul 6, 2024

I'm sorry, but you can't simultaneously say that you aren't familiar with the details of NIF libraries and then claim you found a "simpler way". Of course I thought about just injecting data into the library file. But you cannot access this data without a PE/ELF/Mach-O binary reader (yes, Windows, Linux and macOS use different file formats), which I deem complete overkill for this exercise.

/edit: I also considered writing a file during build, but apart from build.rs shenanigans (and even for those I don't see a clear path) I don't think this is possible right now.

Also, no, Rustler doesn't (necessarily) have the full type information during macro expansion. The macro expansion (like in Elixir) runs on a "token stream", so just a little bit better than on bare text. We might be able to extract information at runtime(!) from this by extending Encoder and Decoder, which brings us back to having to load the library, which is what the tool in the referenced PR does.
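The token-stream limitation can be seen in a small self-contained sketch. A declarative `macro_rules!` macro stands in for the proc macro here (an assumption of this example; both operate before types are resolved), and `UserId`, `type_tokens!`, and the two helper functions are hypothetical names.

```rust
use std::any::type_name;

/// A type alias: a macro only ever sees the identifier `UserId`.
pub type UserId = u64;

/// Capture a type the way a macro sees it: as tokens, rendered to text.
macro_rules! type_tokens {
    ($t:ty) => {
        stringify!($t)
    };
}

/// What macro expansion has to work with: the spelling in the source.
pub fn as_seen_by_macro() -> &'static str {
    type_tokens!(UserId)
}

/// What is available once the compiler has resolved the alias.
/// Note: the exact text returned by `type_name` is not a stable guarantee.
pub fn as_seen_after_type_checking() -> &'static str {
    type_name::<UserId>()
}
```

The first function returns the spelling `UserId`, with no indication that it is an integer, while the second reflects the resolved `u64`. A generator working purely on tokens would need its own alias and import resolution to bridge that gap.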

@SichangHe
Author

Okay, I was wrong. I apologize. We don't have rustler::codegen_runtime::inventory::iter::<rustler::Nif> at compile time, only at run time. Talking about writing to a file at compile time was silly, and no wonder you are trying to bake the functionality into the generated dylib instead.

And… yes, proc macro side effects are unintended, and build scripts are the right place for them instead.
So, the Rust-token-based approach would be to make another dedicated CLI tool to extract function signatures from the Rust source, which would involve parsing each Rust file using syn [1], reusing rustler_codegen to gather NIF information, and outputting the results. The problem, as you have mentioned, would be the shallow type information.

Then, it indeed seems to be less trouble to inject the data into the dylib and then load it back out with your helper CLI.

Footnotes

  1. https://github.com/fzyzcjy/flutter_rust_bridge/ does code generation like that.

@filmor
Member

filmor commented Jul 6, 2024

Just writing down some notes:

  • We should also generate -opaques for all resource types
  • Maybe we can build the type specs using NIF enums with some convention on how to handle composites, e.g. struct Elem { is_or, is_and, type, count, Elem[count] }
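
One possible reading of the `Elem` note above is a flat, prefix-ordered list in which each element says how many child elements follow it. The sketch below is an interpretation, not the planned design: `Kind`, `Elem`, and `render` are hypothetical, `type` is renamed to `kind` because it is a Rust keyword, `is_or` is mapped to a union, and `is_and` is mapped to a tuple.

```rust
/// Hypothetical tag set; `Union` plays the role of `is_or`, `Tuple` of `is_and`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Kind {
    Integer,
    Binary,
    Atom,
    Union,
    Tuple,
    List,
}

/// One entry of the flat encoding: a tag plus the number of children after it.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Elem {
    pub kind: Kind,
    pub count: usize,
}

/// Decode one type from the front of `elems`.
/// Returns the rendered typespec and the unconsumed remainder,
/// or `None` if the input ends early or a list lacks exactly one child.
pub fn render(elems: &[Elem]) -> Option<(String, &[Elem])> {
    let (head, mut rest) = elems.split_first()?;
    let mut children = Vec::new();
    for _ in 0..head.count {
        let (child, tail) = render(rest)?;
        children.push(child);
        rest = tail;
    }
    let text = match head.kind {
        Kind::Integer => "integer()".to_string(),
        Kind::Binary => "binary()".to_string(),
        Kind::Atom => "atom()".to_string(),
        Kind::Union => children.join(" | "),
        Kind::Tuple => format!("{{{}}}", children.join(", ")),
        Kind::List => {
            if children.len() != 1 {
                return None;
            }
            format!("[{}]", children[0])
        }
    };
    Some((text, rest))
}
```

A tuple of an atom and an integer-or-binary union is then five elements: `Tuple(2), Atom(0), Union(2), Integer(0), Binary(0)`, which renders as `{atom(), integer() | binary()}`. A flat layout like this is easy to expose through a C function because it needs no pointers between elements.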
@SichangHe
Author

[…] NIF libraries are meant to be loaded within the context of the BEAM and thus need to have all of the enif_* symbols defined. We thus can't just build a binary from the NIF library and have it "just work".

What if we can, @filmor? If we feature-gate rustler-sys, we can get a library that only has the NIF information but not the functions themselves; then we can manipulate the Nifs freely in Rust.

Also, would there be any advantage to manipulating the types on the BEAM instead of in Rust?

3 participants