first version of flash_attention for jax #19743
Conversation
Thanks for the PR! Have you tried to time it on GPU compared to regular attention? I was under the impression that we were going to need a custom Pallas kernel for this.
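For context, a timing comparison along those lines could be sketched as follows; the shapes, sizes, and the naive baseline are illustrative assumptions, not code from the PR:

```python
import time

import jax
import jax.numpy as jnp


def naive_attention(q, k, v):
    # Materializes the full (seq, seq) score matrix; this is the baseline a
    # flash attention implementation is meant to beat on memory and runtime.
    scores = q @ k.swapaxes(-1, -2) / jnp.sqrt(q.shape[-1])
    return jax.nn.softmax(scores, axis=-1) @ v


# Illustrative shapes: (batch, heads, seq_len, head_dim).
keys = jax.random.split(jax.random.PRNGKey(0), 3)
q, k, v = (jax.random.normal(kk, (4, 8, 2048, 64)) for kk in keys)

fn = jax.jit(naive_attention)
fn(q, k, v).block_until_ready()  # warm-up: compile before timing
start = time.perf_counter()
fn(q, k, v).block_until_ready()
print(f"naive attention: {time.perf_counter() - start:.4f} s")
```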
I used the /keras/src/layers/attention/ directory as a template for implementing flash attention, but I don't understand how the mask is generated in the benchmark. I need one, but I don't see it.
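For what it's worth, a benchmark mask is often just a boolean causal mask; a hypothetical sketch (not the actual Keras benchmark code):

```python
import jax.numpy as jnp

# Hypothetical causal mask, as a benchmark might construct one.
# True means "attend", False means "masked out".
seq_len = 2048
causal_mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))  # (T, T)
# Broadcast to (batch, heads, T, T) when combining with attention scores.
mask = causal_mask[None, None, :, :]
```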
Hi @fchollet, can you please review this PR? Thank you!
@@ -76,6 +82,44 @@ def relu6(x):
        return Relu6().symbolic_call(x)
    return backend.nn.relu6(x)


@keras_export(["keras.ops.flash_attention", "keras.ops.nn.flash_attention"])
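For reference, a hypothetical sketch of how the exported op might mirror the `relu6` pattern shown above; the `FlashAttention` operation class, the signature, and the `backend.nn.flash_attention` call are assumptions, not necessarily what this PR defines:

```python
# Hypothetical sketch only, assumed to live in keras/src/ops/nn.py where
# keras_export, any_symbolic_tensors, and backend are already imported.
@keras_export(["keras.ops.flash_attention", "keras.ops.nn.flash_attention"])
def flash_attention(query, key, value, mask=None):
    # Symbolic (functional-model) path vs. eager backend path, as with relu6.
    if any_symbolic_tensors((query, key, value)):
        return FlashAttention().symbolic_call(query, key, value, mask=mask)
    return backend.nn.flash_attention(query, key, value, mask=mask)
```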
Hello there,
That's a very wonderful addition. I have a bit of a doubt here:
I suppose `ops.flash_attention` is an operation, while `nn.flash_attention` is just a neural network layer. The basic difference between the two is that an operation may not have any trainable parameters, while a neural network layer should have trainable parameters. Am I right so far?
If yes, please provide separate examples of each of them in the docs!
Best Regards,
Abhas Kumar Sinha
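For illustration, a minimal sketch of the op/layer distinction described in this comment. Note that both exported names in the diff above point at the same stateless op: `keras.ops.nn` is an alias namespace for neural-network ops, not the layers API.

```python
import numpy as np
import keras

x = np.random.rand(2, 4).astype("float32")

# keras.ops.* functions are stateless operations: pure computation, no weights.
y = keras.ops.relu(x)

# keras.layers.* classes can carry trainable weights (here: kernel and bias).
dense = keras.layers.Dense(8)
z = dense(x)
print(len(dense.trainable_weights))  # 2
```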
Returns:
    A tensor with the same shape as `x`.


Example:
Enclose the Example with "```":

"""
This is an example docstring.

Example:

```python
example = example()
```
"""

This helps automated doc renderers find the code examples in docstrings and render those parts accordingly.
An alternative contribution has been merged. Thanks for the PR in any case!
This is my first version of the flash attention implementation. It is just for JAX.
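For readers unfamiliar with the technique, here is a minimal, self-contained JAX sketch of the blocked online-softmax idea behind flash attention. It is a single-head reference for illustration only, not the PR's actual implementation.

```python
import jax
import jax.numpy as jnp


def flash_attention_reference(q, k, v, block_size=128):
    """Single-head blocked attention with an online softmax.

    q, k, v: arrays of shape (seq_len, head_dim); seq_len must be divisible
    by block_size. The full (seq_len, seq_len) score matrix is never
    materialized, which is the core memory saving of flash attention.
    """
    seq_len, head_dim = q.shape
    scale = 1.0 / jnp.sqrt(head_dim)
    k_blocks = k.reshape(-1, block_size, head_dim)
    v_blocks = v.reshape(-1, block_size, head_dim)

    def scan_block(carry, kv_block):
        acc, row_max, row_sum = carry
        k_blk, v_blk = kv_block
        scores = (q @ k_blk.T) * scale                    # (seq_len, block_size)
        new_max = jnp.maximum(row_max, scores.max(axis=-1))
        correction = jnp.exp(row_max - new_max)           # rescale old statistics
        p = jnp.exp(scores - new_max[:, None])
        acc = acc * correction[:, None] + p @ v_blk
        row_sum = row_sum * correction + p.sum(axis=-1)
        return (acc, new_max, row_sum), None

    init = (
        jnp.zeros_like(q),                # running weighted sum of values
        jnp.full((seq_len,), -jnp.inf),   # running row-wise max of scores
        jnp.zeros((seq_len,)),            # running softmax denominator
    )
    (acc, _, row_sum), _ = jax.lax.scan(scan_block, init, (k_blocks, v_blocks))
    return acc / row_sum[:, None]


# Quick numerical check against plain softmax attention.
kq, kk, kv = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(kq, (512, 64))
k = jax.random.normal(kk, (512, 64))
v = jax.random.normal(kv, (512, 64))
out = flash_attention_reference(q, k, v)
ref = jax.nn.softmax(q @ k.T / jnp.sqrt(64.0), axis=-1) @ v
print(jnp.allclose(out, ref, atol=1e-4))  # expected: True
```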