
Support for locally hosted models #190

Closed
3coins opened this issue May 24, 2023 · 14 comments
Labels
enhancement (New feature or request) · @jupyter-ai/chatui · @jupyter-ai/magics · project:extensibility (Extension points, routing, configuration)
Comments

@3coins
Collaborator

3coins commented May 24, 2023

Summary

@krassowski brought this up during the JupyterLab weekly meeting. This is important because of privacy concerns: some JupyterLab users would prefer not to send their prompts across the wire. Alternatively, we should have more pronounced messaging so that users are aware that their inputs will be sent to the model and embedding providers.

3coins added the enhancement label on May 24, 2023
@3coins
Collaborator Author

3coins commented May 24, 2023

@krassowski
Please feel free to add any more context if I missed anything.

@krassowski
Member

jupyter-ai currently only contains providers for models accessible via an over-the-wire API, although the tooling it currently employs (LangChain) supports a number of local models, for example GPT4All, LlamaCpp, or Hugging Face Local Pipelines.

Since jupyter-ai does not support local models out of the box, I and others (#17) have previously asked about the way to register custom providers. The initial jupyter-ai proposal involved a cookiecutter for creating custom providers (back then called engines), and back in March this still seemed to be advised by @dlqqq (#17 (comment)); however, documentation for cookiecutters first degraded and then was removed (#163), and the cookiecutter approach was described as no longer recommended in favour of declaring and registering custom LangChain models (#163 (comment), #17 (comment)). However, there appears to be no way (please correct me if I am wrong) to register custom models as of today (although there is a PR open for some time at #136), nor is it clear how this would work for non-LangChain models, or for non-language models in general (e.g. the Stable Diffusion kind).

The approach proposed in #136 is fine for hacking things together or for switching models of pre-defined providers, but when it comes to registering completely new models it is highly repetitive and would force users to paste chunks of boilerplate code into their notebooks (#136 (review comment)). Therefore it is not a proper replacement for:

  • native definitions of providers for offline models which are supported by LangChain (I can work on these if you accept such a contribution); a rough sketch of what I mean follows this list
  • a well-documented way of creating packaged AI modules (providers, engines, whatever we call them), whether based on LangChain or not.
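
To make the first bullet concrete, here is a minimal sketch of what a natively defined provider for a LangChain-supported local model could look like. Only GPT4All is a real LangChain class here; the provider shape (id, name, models, generate) is a guess at what jupyter-ai's interface might look like, not its actual API:

```python
# Sketch only: GPT4All is a real LangChain wrapper; the provider
# interface around it is a hypothetical stand-in for whatever
# jupyter-ai eventually exposes.
from langchain.llms import GPT4All


class GPT4AllProvider:
    """Illustrative provider wrapping a GPT4All model stored on disk."""

    id = "gpt4all"
    name = "GPT4All"
    models = ["ggml-gpt4all-j-v1.3-groovy"]

    def __init__(self, model_path: str):
        # The model file is loaded locally, so prompts never leave the machine.
        self.llm = GPT4All(model=model_path, verbose=False)

    def generate(self, prompt: str) -> str:
        # LangChain LLMs are callable on a prompt string.
        return self.llm(prompt)
```

Because the model file is loaded from disk, prompts never cross the wire, which addresses the privacy concern at the top of this issue.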

@dlqqq
Member

dlqqq commented May 30, 2023

@krassowski Wow, thank you for such awesome feedback! It's clear that you've been keeping up with our development very closely. Let me address some of your points:

  • The cookiecutter template still exists in the repository under packages/jupyter-ai-module-cookiecutter. However, we are heavily focused on the core jupyter_ai package and are not prioritizing the robustness and documentation of the cookiecutter. This project is changing so rapidly that maintaining the cookiecutter is unduly burdensome.

  • Supporting local language models is a high-priority issue for us; we received lots of demand for this at JupyterCon, and we're excited to bring local LMs to Jupyter AI. However, there are several subtle technical considerations that need to be addressed before we can bring in local LMs:

    • Platform compatibility -- how do we ensure the best support of each language model on each system? What happens if the LM requires more compute/memory than the platform hardware offers? Etc. There is a lot of investigation to be done here.

    • General interface for local LMs -- are cookiecutter templates really needed? i.e. Is there a way to build a general interface via LangChain for any locally hosted language model without needing to write a custom Jupyter AI module via the cookiecutter?

    • Request/response schemas -- how do we let users specify the request/response schema of an arbitrary local LM? The key difference here is that upstream third-party LMs (e.g. OpenAI) have a defined request/response syntax in their APIs, whereas in the general case of an arbitrary local LM the schema is unknown to us, and the user must somehow specify it. This is being addressed in Support SageMaker Endpoints in chat #197, but that may prove insufficient for local LMs; a sketch of one possible direction follows this list.
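
To illustrate the kind of user-specified schema we have in mind, here is a rough thought experiment: a request template plus a key path into the JSON reply. Everything below (names, endpoint, payload shape) is illustrative, not something jupyter-ai implements today:

```python
# Illustrative only: one way a user could describe the request/response
# schema of an arbitrary local LM served over HTTP.
import requests

schema = {
    "endpoint": "http://localhost:8000/generate",  # hypothetical local server
    "request_template": {
        "inputs": "{prompt}",
        "parameters": {"max_new_tokens": 256},
    },
    "response_path": ["generated_text"],  # key path into the JSON reply
}


def render(template, prompt: str):
    """Recursively substitute the {prompt} placeholder into string values."""
    if isinstance(template, str):
        return template.replace("{prompt}", prompt)
    if isinstance(template, dict):
        return {key: render(value, prompt) for key, value in template.items()}
    return template


def invoke(prompt: str, schema: dict) -> str:
    body = render(schema["request_template"], prompt)
    reply = requests.post(schema["endpoint"], json=body).json()
    for key in schema["response_path"]:
        reply = reply[key]
    return reply
```

Whatever form the final design takes, the hard part is exactly what is sketched here: mapping one prompt string into an arbitrary request body and pulling one completion string back out.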

We are working on all of these issues as we speak. We would like local LM support to be as robust and high-quality as possible before we release this feature, so we encourage patience here. We would also like to welcome any and all feedback on this feature request to help guide us as we are implementing this.

@krassowski
Member

Platform compatibility [...] There is a lot of investigation to be done here.

In my humble opinion, enabling users to test it ASAP would accelerate investigation and surface user expectations.

are cookiecutter templates really needed?

I would be happy with or without a cookiecutter, as long as documentation on entry points and APIs exists.

Request/response schemas

Cross-ref #193. Again I think enabling advanced user experimentation would accelerate discovering what needs to be done :)

@krassowski
Member

To give an example of what I mean by a public API for registering custom models programmatically, the simplest (not necessarily best) solution would be renaming AiMagics._safely_set_target to AiMagics.register_model(name, model) in #136.
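
For illustration, the proposed public method could be as small as the sketch below; AiMagics here is a stand-in class, not jupyter-ai's actual implementation:

```python
# Hypothetical sketch of the register_model API proposed above.
from langchain.llms import GPT4All
from langchain.llms.base import LLM


class AiMagics:
    def __init__(self):
        self._custom_models: dict[str, LLM] = {}

    def register_model(self, name: str, model: LLM) -> None:
        """Public replacement for the private _safely_set_target helper."""
        self._custom_models[name] = model

    def resolve(self, name: str) -> LLM:
        return self._custom_models[name]


magics = AiMagics()
# The model path is a placeholder; any LangChain LLM instance would do.
magics.register_model("my-gpt4all", GPT4All(model="/path/to/model.bin"))
```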

@JasonWeill
Collaborator

We're about to release Jupyter AI 0.8.0. I'm going to move this to the next release, scheduled for about two weeks from now; let's make local models a priority. This feature has been widely demanded and would add significant value to Jupyter AI.

dlqqq modified the milestones: 0.9.0 Release → 0.10.0 Release on Jun 23, 2023
JasonWeill added the project:extensibility label on Jul 18, 2023
This was referenced Jul 28, 2023
@JasonWeill
Collaborator

In preparation for local models, I'm working on custom prompt templates per provider (see #226) in PR #309.
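
Roughly speaking, a per-provider template lets each model carry its own chat markers around the user's prompt. A small LangChain illustration (the markers below are placeholders, not what the PR ships):

```python
from langchain.prompts import PromptTemplate

# Each provider can carry its own template, since local models often
# expect model-specific chat markers around the user's prompt.
template = PromptTemplate(
    input_variables=["prompt"],
    template="<|user|>\n{prompt}\n<|assistant|>\n",
)
print(template.format(prompt="What is Jupyter AI?"))
```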

@FurkanGozukara

FurkanGozukara commented Aug 3, 2023

If you add local support, I will hopefully make a tutorial for this on my channel.

Add a dropdown box from which people can select a model.

It would download the model automatically from Hugging Face.

My channel has over 22k subscribers at the moment: https://www.youtube.com/SECourses

@egeucak

egeucak commented Aug 3, 2023

I am willing to contribute support for Hugging Face text-generation-inference endpoints.
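
For reference, recent LangChain versions already ship a wrapper for text-generation-inference servers, so the core of such a provider could be as short as this sketch (the URL and parameters are placeholders):

```python
from langchain.llms import HuggingFaceTextGenInference

# Points at a running text-generation-inference server; adjust the URL
# to wherever the server is deployed.
llm = HuggingFaceTextGenInference(
    inference_server_url="http://localhost:8080/",
    max_new_tokens=512,
    temperature=0.1,
)
print(llm("What is Jupyter AI?"))
```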

@ktaletsk

ktaletsk commented Aug 3, 2023

I am testing deploying models in the same Kubernetes cluster as JupyterHub with https://github.com/chenhunghan/ialacol and would like to connect to them from the extension.

Since the APIs provided by ialacol mimic OpenAI's, they should be relatively straightforward to support: instead of an OpenAI key, the UI would allow customizing the endpoint URL.

I understand this might be different from local models; they are better called "self-hosted" rather than local.
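
Since ialacol mimics the OpenAI API, a minimal sketch of the connection is just LangChain's OpenAI wrapper with the base URL overridden; the in-cluster URL and model name below are placeholders:

```python
from langchain.llms import OpenAI

llm = OpenAI(
    openai_api_base="http://ialacol.default.svc.cluster.local:8000/v1",  # placeholder
    openai_api_key="not-needed",  # many self-hosted servers ignore the key
    model_name="ggml-gpt4all-j",  # whichever model the server exposes
)
print(llm("What is Jupyter AI?"))
```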

@dlqqq
Member

dlqqq commented Aug 14, 2023

#209 introduces early-stage support for GPT4All, which will allow you to run language models locally in the next release. Requests for additional features can be tracked in separate issues. Thank you all for providing your feedback on this issue! 👍

dlqqq closed this as completed on Aug 14, 2023
@dlqqq
Member

dlqqq commented Aug 14, 2023

@FurkanGozukara I've created an issue to track your feature request: #343

@dlqqq
Member

dlqqq commented Aug 14, 2023

@ktaletsk You will be able to set the OpenAI proxy in the next release. 🎉 See: #322

@surak

surak commented Jan 19, 2024

Please have a look at my comment here: #389 (comment)

It concerns self-hosted OpenAI-compatible servers, which allow organizations to centralize inference and connect all Jupyter clients to a single big, fast server.
