Fixes descriptions in the Inference APIs #5566
| Below are the validation changes against the target branch for the APIs. 
 You can validate these APIs yourself by using the  | 
| * Applies only to the `sparse_embedding` and `text_embedding` task types. | ||
| * Not applicable to the `rerank`, `completion`, or `chat_completion` task types. | 
These generic messages might be confusing because, for example, Cohere only supports the following task types:
- completion
- rerank
- text_embedding
So consider deleting sparse_embedding and chat_completion here. I'd inspect all of these services to avoid confusing users by mentioning task types that a service doesn't support.
You’re absolutely right. Thinking it through, that also means chunking_settings isn’t applicable to certain inference endpoints, even though it now appears for all of them. Could this be a bug, @davidkyle?
For example, the Create an Anthropic inference endpoint API supports only one task type (completion), so chunking doesn’t apply there.
Yes, chunking_settings only applies to the text_embedding and sparse_embedding task types.
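To make the rule concrete, here is a minimal sketch of how one might check whether chunking_settings is relevant for a given service. The names `CHUNKING_TASK_TYPES`, `SUPPORTED_TASK_TYPES`, and `chunkingApplies` are hypothetical and not part of the specification; only the task-type lists for Cohere and Anthropic come from this thread.

```typescript
// Hypothetical helper, not part of the specification repository.
type TaskType =
  | "sparse_embedding"
  | "text_embedding"
  | "rerank"
  | "completion"
  | "chat_completion";

// Task types that accept chunking_settings, per the comment above.
const CHUNKING_TASK_TYPES: ReadonlySet<TaskType> = new Set<TaskType>([
  "sparse_embedding",
  "text_embedding",
]);

// Supported task types per service, as stated in this conversation.
// The service keys are illustrative labels (assumption).
const SUPPORTED_TASK_TYPES: Record<string, readonly TaskType[]> = {
  cohere: ["completion", "rerank", "text_embedding"],
  anthropic: ["completion"],
};

// Returns true if at least one task type the service supports uses chunking.
function chunkingApplies(service: string): boolean {
  const supported = SUPPORTED_TASK_TYPES[service] ?? [];
  return supported.some((taskType) => CHUNKING_TASK_TYPES.has(taskType));
}
```

Under these assumptions, `chunkingApplies("cohere")` is true because Cohere supports text_embedding, while `chunkingApplies("anthropic")` is false because Anthropic supports only completion.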
Sorry was on auto-pilot and I thought this was a backport 🙈
I noticed a small nit that might confuse users
Co-authored-by: Liam Thompson <leemthompo@gmail.com>
| completion, | ||
| rerank, | ||
| space_embedding, | ||
| sparse_embedding, | 
😁
| * | ||
| * NOTE: When creating an inference endpoint, the associated machine learning model is automatically deployed if it is not | ||
| * already running. After creating the endpoint, wait for the model deployment to complete before using it. You can verify | ||
| * the deployment status by using the Get trained model statistics API. In the response, look for "state": "fully_allocated" | ||
| * and ensure the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same | ||
| * model unless required, as each endpoint consumes significant resources. | ||
| * | 
This text is in PutElasticsearchRequest.ts and PutElserRequest.ts and is specific to those inference services, so we don't need it here too.
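For readers who do need that note, the readiness check it describes can be sketched as follows. This is only an illustration: the `AllocationStatus` interface and `isDeploymentReady` function are hypothetical names, and where these three fields sit inside the Get trained model statistics response is not shown in this thread, so the caller is assumed to have extracted them already.

```typescript
// Hypothetical shape holding only the three fields the note mentions.
interface AllocationStatus {
  state: string;
  allocation_count: number;
  target_allocation_count: number;
}

// Ready when the state is "fully_allocated" and the allocation count
// matches the target, as the note describes.
function isDeploymentReady(status: AllocationStatus): boolean {
  return (
    status.state === "fully_allocated" &&
    status.allocation_count === status.target_allocation_count
  );
}
```

A caller would poll the statistics API and pass the extracted fields to `isDeploymentReady` before sending inference requests to the endpoint.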
This PR fixes description errors discovered during the https://github.com/elastic/docs-content-internal/issues/280 of the 8.x and 9.x API documentation.
Closes #5508, #5509, #5510, #5511, #5512