-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Langchain based ask approaches not compatible with 0613 (Chat Completions) #541
Comments
Hi @pamelafox : Will be waiting for an update on this, we are trying to deploy it in a corporate scenario, I managed to build an ARM template from the bicep template you have shared and removed role permissions required, I am using 2 models 1. Chat Gpt 35 turbo with version 0301 and chat gpt 35 turbo 16k with 0613 to deploy this but fail to deploy the template with an error saying 'standard' is not part of the 0613 version, if I remove the scale standard and deploy it fails the validation. |
To clarify: The sample includes 4 different RAG (Retrieval-Augmented Generation) approaches: ChatReadRetrieveRead, ReadDecomposeAsk, ReadRetrieveRead, RetrieveThenRead. The two default approaches are ChatReadRetrieveRead and RetrieveThenRead, and they are both working very well with the Chat Completion APIs. The other two approaches use Langchain and the current code only works with the older Completion API (0301). Those approaches can be deleted from the code/UI, and the app would still work. Is the problem that you definitely need to use those other two approaches for your particular use case, or is the problem that you can't deploy 0613? Do you have the latest main.bicep and cognitiveservices.bicep? The method of specifying capacity changed a few months ago. |
Hi Pamela, Thank you for getting back really quickly. Our org currently doesn't allow a bicep file download (.exe file download), so I had to convert the bicep file into an ARM template, remove permissions and roles, and then deploy. I am attaching the ARM template. Regarding your note on the 4 RAG approaches I might just need the first two, how do I modify the code, the app backend code in Python that I need to modify? Where can I get more information about this as the readme doesn't detail this? Lastly, deploy to Azure functionality was super useful on other examples can we expect something like that for this? |
And on your comment on capacity, even though I use the template where the set capacity is 30, I still get an error while deploying that 120 tokens are necessary, Do I need to configure any additional settings? |
There is currently an issue (Azure/bicep-types-az#1660) where we can't deploy a capacity greater than what's remaining in our account, even if the deployments will replace whats in the account. So what I do in that case is go into the Azure OpenAI studio, edit each deployment so that it has 1 TPM, and then try azd up again. |
Hi @aparnasharmav, could you please provide more details on this -> "removed role permissions required". I suppose this is done for below requirement?
I'd also like to get rid of this requirement if I can. Thanks. |
No longer an issue as they have been removed. |
Co-authored-by: Ian Seabock (Centific Technologies Inc) <v-ianseabock@microsoft.com>
This issue is for a: (mark with an
x
)Minimal steps to reproduce
We recently changed to version from 0301 to 0613, since Azure isn't allowing new 0301 deployments. Unfortunately, 0613 only supports the new Chat Completions API, not the old Completions API, and the LangChain agents all assume use of the Completions API.
I have a branch that attempts to update the LangChain code to use Chat Completions, but am still QAing it.
The text was updated successfully, but these errors were encountered: