
Confusing finish_reason when using max_tokens property in 'v1/chat/completions' endpoint #3533

Open
daJuels opened this issue Sep 10, 2024 · 0 comments
Labels: bug (Something isn't working), confirmed

Comments


daJuels commented Sep 10, 2024

LocalAI version:

v2.20.1 a9c521eb41dc2dd63769e5362f05d9ab5d8bec50

Environment, CPU architecture, OS, and Version:
OS: 5.10.0-28-amd64 #1 SMP Debian 5.10.209-2 (2024-01-31) x86_64 GNU/Linux
ENV: Docker version 26.0.1, build d260a54
HW: i9-10900F, RTX3080, 128GB RAM

Describe the bug
When using the v1/chat/completions endpoint with the max_tokens parameter set to a specific value, the completion may be cut off, but the finish_reason remains stop instead of changing to length, making it difficult to determine whether the answer is complete.

Additionally, even when the max_tokens parameter is not set, the response may still be cut off while the finish_reason remains stop.

To Reproduce

  1. Send a request to the v1/chat/completions endpoint with the max_tokens parameter set to a specific value (e.g., 20).
  2. Observe the response: the completion is truncated, yet finish_reason is reported as stop (a minimal request sketch is shown below).
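
For reference, a minimal sketch of such a request in Python. The base URL, model name, and prompt are assumptions for illustration only; adjust them to your deployment:

```python
# Minimal sketch, assuming LocalAI is reachable at http://localhost:8080 and a
# locally configured model named "gpt-4" exists (both are assumptions).
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Explain how TCP handshakes work."}],
        "max_tokens": 20,  # deliberately small so the completion gets truncated
    },
    timeout=60,
)
choice = resp.json()["choices"][0]
print(choice["message"]["content"])
# Reported behavior: finish_reason is "stop" even though the text is cut off.
# Expected (OpenAI-compatible) behavior: finish_reason should be "length".
print(choice["finish_reason"])
```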

Expected behavior
When the max_tokens parameter is set, the response should clearly indicate whether the completion is complete. If the completion is cut off, the finish_reason should be length instead of stop.
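
A truncated completion would then look roughly like the following (illustrative excerpt only; the content shown is made up):

```json
{
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "A TCP handshake begins when the client" },
      "finish_reason": "length"
    }
  ]
}
```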

daJuels added the bug (Something isn't working) and unconfirmed labels on Sep 10, 2024