Theia AI LLM Support [Experimental] #14048

Merged

Conversation

@sdirix (Member) commented Aug 14, 2024

What it does

Implements support for creating assistants ("agents") for Theia-based applications by integrating AI (more specifically, LLM) support as optionally consumable Theia extensions, a.k.a. "Theia AI".

The base functionality is provided by the following extensions:

  • @theia/ai-core
  • @theia/ai-chat
  • @theia/ai-chat-ui

ai-core contains the basic LLM integration and defines the core concepts for interacting with LLMs via agents, prompts and variables. ai-chat builds on top of it to define a model for chat-like conversations. ai-chat-ui provides the actual Chat UI.
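To make these core concepts more concrete, here is a minimal sketch of how they might relate. Only LanguageModel mirrors the API excerpt quoted in the review discussion further below; all other shapes are simplified for illustration:

```ts
// Simplified sketch of the ai-core concepts. Only LanguageModel mirrors the
// actual API (see the review excerpt below); the other shapes are illustrative.
interface LanguageModelRequest {
    messages: { actor: 'system' | 'user' | 'ai'; text: string }[];
}

interface LanguageModelResponse {
    text: string;
}

interface LanguageModel {
    request(request: LanguageModelRequest): Promise<LanguageModelResponse>;
}

// An agent declares the prompt templates it uses; users can inspect and
// customize these templates.
interface PromptTemplate {
    id: string;
    // Templates may contain variable placeholders, which are resolved
    // before the request is sent to the LLM.
    template: string;
}

interface Agent {
    id: string;
    promptTemplates: PromptTemplate[];
    // Variables the agent may resolve when building its LLM request.
    variables: string[];
}
```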

The AI integration was built from the ground up to be flexible, inspectable, customizable and configurable. More precisely, we follow these core principles:

  1. Transparency of Communication: Theia AI provides full visibility into the history of data exchanged between the IDE and the Large Language Models (LLMs).
  2. Transparency of Data Interactions: Users gain full transparency over data accessed and modified by the LLM, including any automatic changes like inserted code snippets.
  3. Openness of Prompts: Prompts used by AI agents are visible and customizable, allowing developers to optimize LLM requests for their specific environments and innovate in designing efficient prompts for developer tasks.
  4. Flexibility in Choosing LLMs: Theia AI supports integrating arbitrary LLMs, enabling developers to use their preferred models and hosting environments, whether cloud-based, self-hosted, or local solutions, providing greater independence from proprietary constraints.

This feature is still highly experimental, and we will actively continue to develop it. However, we want to provide early access to potential adopters and users of the Theia IDE. Therefore, even when the AI extensions are included in a Theia-based application, they are turned off by default and need to be enabled in the preferences. The preferences include a convenient setting to turn all AI features on or off at once.

Additional features and integrations are offered by the remaining
extensions:

  • @theia/ai-history
  • @theia/ai-code-completion
  • @theia/ai-terminal
  • @theia/ai-workspace-agent
  • @theia/ai-openai

ai-history offers a service to record requests and responses. The recordings can be inspected via the 'AI History View'. ai-code-completion offers AI-based code completion via completion items and inline suggestions. ai-terminal offers a specialized AI for the Theia terminal which suggests commands to execute. ai-workspace-agent is a specialized agent which is able to inspect the current workspace content for context-specific questions. ai-openai integrates the LLM offerings of OpenAI into Theia as an LLM provider.

How to test

  • Start either the browser or the Electron example application. Note that the only available LLM integration, via OpenAI, runs in the backend. So while most of the features conceptually work in browser-only mode, they can't be tested without integrating a browser-only LLM
  • Enable the AI features within the preferences
  • If you don't have the OPENAI_API_KEY environment variable set, configure your API key in the preferences
  • Play around with the Chat, Terminal, Code Completion, History, Prompt Editing, etc.

Demo examples

Chat View

chat-view1

chat-view2

CommandExample

Prompt Editing

prompt-editing

Terminal

TerminalAI

Architecture

The following diagram gives a rough overview of the architecture and the intended communication paths.

architecture1

Chat Flow

  • The user invokes a new session via the Chat UI, which in turn uses the ChatService
  • The ChatService resolves variables, if there are any in the request, and determines the agent to use by invoking the ChatAgentService
  • The invoked agent is responsible for collecting all information it requires and then invoking the desired LLM. The base implementation will
    • use the PromptService to retrieve its base prompt. The PromptService may return the originally registered prompt or a user-modified variant
    • resolve variables and tools specified within the prompt (or request). The user is able to inspect all variables and tools the agent will use via the "AI Configuration" view.
    • ask the LanguageModelRegistry for an LLM with specific capabilities. The user is able to manually specify which LLM shall be used within the "AI Configuration" view
    • finally invoke the LLM
  • The response of the LLM is converted by the agent into ChatResponseParts
  • A registry of ChatResponsePartRenderers within the UI takes over the rendering of the parts. Developers can contribute to this registry. A condensed sketch of this flow follows below.
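The condensed sketch referenced above; service names follow the description, but all signatures are simplified and do not match the actual Theia API exactly:

```ts
// Condensed, self-contained sketch of the chat flow described above.
// Service names follow the description; all signatures are simplified and
// do not match the actual Theia API exactly.
interface ChatRequest { text: string; }
interface ChatResponsePart { kind: string; content: string; }

interface PromptService {
    getPrompt(id: string): Promise<string | undefined>;
}
interface LanguageModel {
    request(req: { messages: { actor: string; text: string }[] }): Promise<{ text: string }>;
}
interface LanguageModelRegistry {
    selectLanguageModel(query: { agent: string }): Promise<LanguageModel | undefined>;
}

class SketchChatAgent {
    constructor(
        protected readonly prompts: PromptService,
        protected readonly models: LanguageModelRegistry
    ) {}

    async invoke(request: ChatRequest): Promise<ChatResponsePart[]> {
        // 1. Fetch the (possibly user-customized) base prompt.
        const prompt = await this.prompts.getPrompt('sketch-agent-system') ?? '';
        // 2. Ask the registry for a suitable LLM; the user can override the
        //    selection in the "AI Configuration" view.
        const model = await this.models.selectLanguageModel({ agent: 'sketch-agent' });
        if (!model) {
            throw new Error('No language model available');
        }
        // 3. Invoke the LLM with the system prompt plus the user request.
        const response = await model.request({
            messages: [
                { actor: 'system', text: prompt },
                { actor: 'user', text: request.text }
            ]
        });
        // 4. Convert the raw response into renderable response parts.
        return [{ kind: 'markdown', content: response.text }];
    }
}
```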

Chat Flow Considerations

  • The chat flow is similar to VS Code's implementation; however, it is much more generic and customizable
  • This should allow Theia to support VS Code AI extensions in the future
  • We proxy requests to LLMs to log the requests and responses automatically to output channels
  • Agents are given tools for a more high-level logging capability via the CommunicationRecordingService, which brings its own "AI History" view

Orchestrator Agent

  • The default agent is the Orchestrator. It invokes an LLM itself to determine which agent shall handle the current request
  • This can be bypassed by manually specifying the agent via @AgentName within the chat request, or by configuring another default agent in the preferences

Other Flows and Considerations

Unlike VS Code, we don't force all LLM integrations through the chat flow. Agents are not necessarily ChatAgents and can therefore provide customized APIs tailored to specific use cases. In this first contribution we implemented a CodeCompletionAgent as well as a TerminalAgent, which are specialized for their use cases and can't be used in the Chat. Still, all agents can participate in global services like the AI History, whether they are chat-related or not.

All of the features and concepts are optional. If a developer only wants to use the LLM base layer to invoke their own specialized LLM, without going through chats, prompts, etc., then this is of course possible.
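For example, a minimal sketch of such direct base-layer usage, with simplified stand-ins for the actual Theia AI interfaces:

```ts
// Sketch of using only the LLM base layer, without chats or prompts.
// Interfaces are simplified stand-ins for the actual Theia AI types.
interface LanguageModel {
    request(request: { messages: { actor: string; text: string }[] }): Promise<{ text: string }>;
}

interface LanguageModelRegistry {
    getLanguageModel(id: string): Promise<LanguageModel | undefined>;
}

async function summarize(registry: LanguageModelRegistry, input: string): Promise<string> {
    // Look up a specific model directly instead of going through agents.
    const model = await registry.getLanguageModel('my-specialized-llm');
    if (!model) {
        throw new Error('Model not available');
    }
    const response = await model.request({
        messages: [{ actor: 'user', text: `Summarize:\n${input}` }]
    });
    return response.text;
}
```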

Frontend / Backend Split

architecture2

The diagram above gives a rough overview of the frontend/backend split.

  • All UI related components are, of course, located in the frontend
  • All generic AI services have a common base implementation and can therefore be bound and used in both the frontend and the backend. This is marked in orange in the diagram
  • We expect many LLMs to be integrated in the backend, either because they are actually running locally, the API is only available on the server side, or secrets need to be hidden. Therefore we implemented an automatic frontend delegation of all backend-registered LLMs. As long as the LLM in question follows the provided interfaces, no additional work is necessary to bridge the RPC gap, including response streaming (see the sketch after this list).
  • Most services in the frontend don't use the common implementation but a specialized one, integrating features only available in the frontend. For example, as preferences and the workspace are frontend concepts, the respective services only respect these customizations in the frontend.
  • Conceptually, the AI integration is therefore also ready for browser-only use. However, in this initial contribution we only include OpenAI bindings for the backend.
  • Generally speaking, we expect most agents and functionality to be developed within the frontend. However, the concept of basic common implementations should be kept so that headless Theia applications can also use the backend Theia AI integration if they want to (grey box in the diagram)
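As an illustration of the delegation idea, a frontend-side proxy could forward request calls to a backend-registered model over an RPC channel. All names in this sketch are invented; the actual Theia implementation, including streaming support, differs:

```ts
// Illustrative sketch of frontend delegation: the frontend sees a proxy that
// satisfies the LanguageModel interface and forwards calls to the backend.
// The RpcClient shape is hypothetical; Theia's real RPC layer differs.
interface LanguageModelRequest { messages: { actor: string; text: string }[]; }
interface LanguageModelResponse { text: string; }

interface RpcClient {
    call(method: string, ...args: unknown[]): Promise<unknown>;
}

class BackendLanguageModelProxy {
    constructor(
        protected readonly rpc: RpcClient,
        readonly id: string
    ) {}

    // Because the proxy follows the same interface as a local model, agents
    // in the frontend can use backend models without extra bridging code.
    async request(request: LanguageModelRequest): Promise<LanguageModelResponse> {
        return await this.rpc.call('languageModel/request', this.id, request) as LanguageModelResponse;
    }
}
```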

Follow-ups

As noted above, this contribution is highly experimental, so there are a lot of known issues and potential follow-ups. We still think it is already valuable enough to be included in Theia.

Incomplete list of follow up topics:

  • Provide LLM integration with Ollama / Llamafile
  • Serve VS Code AI API
  • Refine all default prompts to deliver better results
  • All variables and tools specified in prompts shall be listed per agent in the AI Configuration View
  • Make sure UI contributions of enabled/disabled agents are properly included/excluded on the fly
  • Allow setting hard and soft limits for tool functions
  • Allow no-code registration and configuration of new chat agents
  • Stabilize response format and error handling for unexpected LLM answers
  • Orchestrator Chat Agent should log its own requests and responses
  • Improve AI History view (sort, clear, search etc.)
  • Implement a code-diff response part
  • Make editor completion configurable (enable/disable, different prompts per file type)
  • Invoke chat from different contexts, e.g. from editor, from terminal, etc.

Many more features and improvements can be made. We will continue to actively develop this feature; however, we think the current code is ready to be evaluated by the community.

We already implemented POCs for a Llamafile integration, RAG-based variable resolving, as well as specialized agents like a "PR Finalization Agent". However, these are not yet stable enough to be part of the initial contribution.

We are excited to see what the future will bring and which concepts and ideas will be contributed by the community.

Addendum

@dhuebner contributed an initial Ollama integration which allows using locally executed models. See here for more information.

Review thread on the LanguageModel interface:

```ts
export interface LanguageModel extends LanguageModelMetaData {
    request(request: LanguageModelRequest): Promise<LanguageModelResponse>;
}
```

A reviewer (Member) asked:

Q: Some vendors explicitly support code-completion/generation/fill-in-the-middle functionality. Is it planned to have a dedicated LanguageModelRequest type or maybe a dedicated function here?

sdirix (Member, Author) replied:

Good question! From a technical standpoint, both approaches are feasible: covering all use cases in request with flexible input and output parameters, or implementing different functions.

For now, all use cases were similar enough that they did not warrant a separate function; however, that might change with new use cases or other architectural considerations.
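To illustrate, here is a sketch of how a fill-in-the-middle use case could be expressed through flexible request parameters rather than a dedicated function; the settings bag and its keys are hypothetical and not part of the actual API:

```ts
// Hypothetical sketch: express fill-in-the-middle via flexible parameters on
// the request instead of a dedicated function. 'settings' and its keys are
// invented for illustration and not part of the actual Theia API.
interface LanguageModelRequest {
    messages: { actor: 'system' | 'user' | 'ai'; text: string }[];
    // Open-ended bag for vendor- or use-case-specific options.
    settings?: Record<string, unknown>;
}

const fillInTheMiddleRequest: LanguageModelRequest = {
    messages: [{ actor: 'user', text: 'function add(a: number, b: number) {' }],
    settings: {
        mode: 'fill-in-the-middle', // hypothetical hint a provider could interpret
        suffix: '}'                 // code that follows the insertion point
    }
};
```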


sdirix commented Aug 27, 2024

Update: @dhuebner contributed an initial Ollama integration to this contribution. The Ollama integration allows specifying the URL on which to contact Ollama and the respective model id. This is very nice as it allows running and testing the AI integration locally.

Note that this is a barebones integration; for example, the "tool" integration used by the workspace agent is missing. Therefore, some features are restricted when using Ollama.

These are the suggested follow-ups:

  • Tools request handling is not implemented yet.
  • Check for locally available Ollama models and suggest them in the settings.
  • Automatically pull Ollama model on user request.
  • Use Ollama specific prompts (PREFIX/SUFFIX) for code completion requests
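For illustration, a minimal sketch of what an Ollama-backed LanguageModel could look like, talking to Ollama's REST /api/chat endpoint in non-streaming mode; the class shape is simplified and is not the actual 'ai-ollama' implementation:

```ts
// Minimal sketch of an Ollama-backed LanguageModel using Ollama's REST
// /api/chat endpoint in non-streaming mode. The class shape is simplified
// and is not the actual 'ai-ollama' implementation.
interface LanguageModelRequest { messages: { actor: string; text: string }[]; }
interface LanguageModelResponse { text: string; }

class OllamaModelSketch {
    constructor(
        // Endpoint URL and model id are configurable via preferences, per the PR.
        protected readonly url: string = 'http://localhost:11434',
        protected readonly model: string = 'llama3'
    ) {}

    async request(request: LanguageModelRequest): Promise<LanguageModelResponse> {
        const response = await fetch(`${this.url}/api/chat`, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({
                model: this.model,
                stream: false,
                // Map the generic message shape onto Ollama's role/content format.
                messages: request.messages.map(m => ({
                    role: m.actor === 'ai' ? 'assistant' : m.actor,
                    content: m.text
                }))
            })
        });
        const json = await response.json();
        return { text: json.message.content };
    }
}
```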

@JonasHelming (Contributor) commented:

@tsmaeder @msujew Now that the release is done, could you have a look at this, as discussed?

Review threads on the following files were resolved:

  • examples/browser-only/package.json
  • packages/ai-chat-ui/src/browser/aichat-ui-contribution.ts
  • packages/ai-chat-ui/src/browser/chat-input-widget.tsx
  • packages/ai-chat/src/common/chat-agents.ts
  • packages/ai-core/src/common/agent.ts
  • packages/ai-terminal/src/browser/ai-terminal-agent.ts

sdirix commented Sep 6, 2024

Hi @tsmaeder, thank you so much for the great and very detailed review ❤️!

I tackled the major part of the smaller-to-medium review comments in commit 566b447 and marked them as resolved. I answered all review questions.

For everything not yet covered, I suggest moving the topic to follow-up #14143. I posted a link to the follow-up on all relevant review comments.

Merging the current state allows for easier contributions and smaller PRs going forward. So if you are fine with the current state and the suggested follow-ups, please approve. I'll then squash to two commits: the initial EclipseSource contribution and @dhuebner's Ollama support.

@sdirix sdirix requested a review from tsmaeder September 6, 2024 10:26

sdirix commented Sep 9, 2024

@tsmaeder Thank you for your detailed answers. I applied further changes in commit 4d96c19 and suggested some topics for the follow-up ticket #14143. Please let me know if you have further comments or concerns and what is required to get your approval stamp ;)


sdirix commented Sep 11, 2024

@tsmaeder Thanks for the further comments. I added all remaining topics to follow-up #14143.

@tsmaeder (Contributor) left a comment:

I think we can merge the PR as an experimental feature and address the follow-ups at a later stage.

    Implements AI LLM support via optionally consumable Theia extensions.

    The base functionality is provided by the following extensions:
    - @theia/ai-core
    - @theia/ai-chat
    - @theia/ai-chat-ui

    'ai-core' contains the basic LLM integration and defines the core
    concepts for interacting with LLMs via agents, prompts and variables.
    'ai-chat' builds on top to define a model for chat-like conversations.
    'ai-chat-ui' provides the actual Chat UI.

    The AI integration was built from the ground up to be flexible,
    inspectable, customizable and configurable.

    This feature is still highly experimental. Therefore, even when the AI
    extensions are included in a Theia-based application, they are turned
    off by default and need to be enabled in the preferences. The
    preferences include a convenient "Turn all AI features on/off" setting.

    Additional features and integrations are offered by the remaining
    extensions:
    - @theia/ai-history
    - @theia/ai-code-completion
    - @theia/ai-terminal
    - @theia/ai-workspace-agent
    - @theia/ai-openai

    'ai-history' offers a service to record requests and responses. The
    recordings can be inspected via the 'AI History View'.
    'ai-code-completion' offers AI-based code completion via completion
    items and inline suggestions.
    'ai-terminal' offers a specialized AI for the Theia terminal which
    will suggest commands to execute.
    'ai-workspace-agent' is a specialized agent which is able to inspect
    the current workspace content for context-specific questions.
    'ai-openai' integrates the LLM offerings of OpenAI into Theia.

    Co-authored-by: Alexandra Muntean <amuntean@eclipsesource.com>
    Co-authored-by: Camille Letavernier <cletavernier@eclipsesource.com>
    Co-authored-by: Christian W. Damus <cdamus.ext@eclipsesource.com>
    Co-authored-by: Eugen Neufeld <neufeld.eugen@googlemail.com>
    Co-authored-by: Haydar Metin <hmetin@eclipsesource.com>
    Co-authored-by: Johannes Faltermeier <jfaltermeier@eclipsesource.com>
    Co-authored-by: Jonas Helming <jhelming@eclipsesource.com>
    Co-authored-by: Lucas Koehler <lkoehler@eclipsesource.com>
    Co-authored-by: Martin Fleck <mfleck@eclipsesource.com>
    Co-authored-by: Maximilian Koegel <mkoegel@eclipsesource.com>
    Co-authored-by: Nina Doschek <ndoschek@eclipsesource.com>
    Co-authored-by: Olaf Lessenich <olessenich@eclipsesource.com>
    Co-authored-by: Philip Langer <planger@eclipsesource.com>
    Co-authored-by: Remi Schnekenburger <rschnekenburger@eclipsesource.com>
    Co-authored-by: Simon Graband <sgraband@eclipsesource.com>
    Co-authored-by: Tobias Ortmayr <tortmayr@eclipsesource.com>
@sdirix sdirix force-pushed the feat/ai-chat-contribution branch from 4d96c19 to 1a32b06 on September 11, 2024 16:18

sdirix commented Sep 11, 2024

I squashed the contribution into two separate commits:

  • The initial Theia LLM contribution
  • The additional Ollama integration contributed by @dhuebner

No further changes were done in the force push.

@tsmaeder Can you rebase-merge? Thank you!

Let me know in case you would like to see different commit messages.

    Integrates Ollama language models into Theia via the new 'ai-ollama'
    package. The endpoint and models can be configured via the preferences.
@sdirix sdirix force-pushed the feat/ai-chat-contribution branch from 1a32b06 to 9e857c4 on September 11, 2024 20:39

sdirix commented Sep 11, 2024

I force-pushed once more as the license check failed. There were minor yarn.lock changes because I ran yarn once manually during the rebase/squash process. I have now restored the previously approved yarn.lock.

@tsmaeder tsmaeder merged commit cc46ceb into eclipse-theia:master Sep 12, 2024
11 checks passed
@sgraband sgraband added this to the 1.54.0 milestone Sep 26, 2024