feat(stores): Vector store backend #1795
✅ Deploy Preview for localai ready!
I'm marking this as ready for review because, while the feature is far from finished, I think it could be merged now as an experimental feature, and the assistant API and retrieval tool could possibly start being developed on top of it.
b46382f to 53d9885
Ugh, I tried to make it so that stablediffusion is not built when the tags are not present, and it ended up being built with protobuf.
I'll try to do a video demonstrating the feature.
YouTube video, available once it finishes processing: https://youtu.be/iFOH5pnnIAU
Signed-off-by: Richard Palethorpe <io@richiejp.com>
I'm scanning over it, and I'm liking it a lot, especially because it seems to be a lightweight solution that is easy to deploy and might offer a valid alternative. I think it is going to be a fundamental piece of work for #1273, so we can rely on having it internally instead of depending on an external service. @richiejp maybe a future improvement could be exposing an API compatibility layer, for instance targeting the Milvus or Qdrant APIs, so we can use already existing clients.
Looking good, awesome contrib @richiejp !
Thanks!
Very interesting, especially as some of these databases (Weaviate at least) have plugins to generate embeddings, and they should probably support (if they don't already) much tighter integration with the models to enable dense retrieval methods (I'm thinking of things similar to ColBERTv2, in case I am using the wrong words).
Note that we probably need an ID field/key that is not a vector of floats (an embedding). Getting an exact match on a float vector is not terribly reliable.
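To illustrate the point about exact matches, here is a minimal sketch in plain Python. It does not use the LocalAI stores API; the dictionaries, the `close` helper, and the `doc-1` ID are all hypothetical and exist only for this example. It shows an exact lookup on a float-vector key failing after a tiny rounding difference, while a tolerance-based comparison or a separate non-vector ID still finds the entry.

```python
import math

# Hypothetical store keyed by the embedding itself (a tuple of floats).
stored_key = (0.1 + 0.2, 0.5)  # value as computed on the indexing side
by_vector = {stored_key: "doc about cats"}

# The "same" embedding, produced by a slightly different code path.
query_key = (0.3, 0.5)

# Exact lookup misses: 0.1 + 0.2 is not bit-identical to 0.3 in binary floats.
exact_hit = by_vector.get(query_key)


def close(a, b, tol=1e-9):
    """Element-wise comparison of two vectors with an absolute tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


# A tolerance-based scan finds the entry, but it is a linear search.
tolerant_hit = next(
    (value for key, value in by_vector.items() if close(key, query_key)), None
)

# Hypothetical alternative: a separate ID key that is not an embedding.
by_id = {"doc-1": (stored_key, "doc about cats")}
id_hit = by_id["doc-1"][1]

print(exact_hit, tolerant_hit, id_hit)
```

With an ID key, updates and deletes can address an entry directly, and the embedding is only needed for similarity search.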
…1.0@8f708d1 by renovate (#19852)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [docker.io/localai/localai](https://togithub.com/mudler/LocalAI) | minor | `v2.10.1` -> `v2.11.0` |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

---

### Release Notes

<details>
<summary>mudler/LocalAI (docker.io/localai/localai)</summary>

### [`v2.11.0`](https://togithub.com/mudler/LocalAI/releases/tag/v2.11.0)

[Compare Source](https://togithub.com/mudler/LocalAI/compare/v2.10.1...v2.11.0)

### Introducing LocalAI v2.11.0: All-in-One Images!

Hey everyone! 🎉 I'm super excited to share what we've been working on at LocalAI - the launch of v2.11.0. This isn't just any update; it's a massive leap forward, making LocalAI easier to use, faster, and more accessible for everyone.

#### 🌠 The Spotlight: All-in-One Images, OpenAI in a box

Imagine having a magic box that, once opened, gives you everything you need to get your AI project off the ground with generative AI. A full clone of OpenAI in a box. That's exactly what our AIO images are! Designed for both CPU and GPU environments, these images come pre-packed with a full suite of models and backends, ready to go right out of the box.

Whether you're using Nvidia, AMD, or Intel, we've got an optimized image for you. If you are using CPU-only you can enjoy even smaller and lighter images.

To start LocalAI, pre-configured with function calling, llm, tts, speech to text, and image generation, just run:

```bash
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest-aio-cpu
# Do you have a Nvidia GPUs? Use this instead
# CUDA 11
# docker run -p 8080:8080 --gpus all --name local-ai -ti localai/localai:latest-aio-gpu-cuda-11
# CUDA 12
# docker run -p 8080:8080 --gpus all --name local-ai -ti localai/localai:latest-aio-gpu-cuda-12
```

##### ❤️ Why You're Going to Love AIO Images:

- Ease of Use: Say goodbye to the setup blues. With AIO images, everything is configured upfront, so you can dive straight into the fun part - hacking!
- Flexibility: CPU, Nvidia, AMD, Intel? We support them all. These images are made to adapt to your setup, not the other way around.
- Speed: Spend less time configuring and more time innovating. Our AIO images are all about getting you across the starting line as fast as possible.

##### 🌈 Jumping In Is a Breeze:

Getting started with AIO images is as simple as pulling from Docker Hub or Quay and running it. We take care of the rest, downloading all necessary models for you. For all the details, including how to customize your setup with environment variables, our updated docs have got you covered [here](https://localai.io/basics/getting_started/), while you can get more details of the AIO images [here](https://localai.io/docs/reference/aio-images/).

#### 🎈 Vector Store

Thanks to the great contribution from [@richiejp](https://togithub.com/richiejp) now LocalAI has a new backend type, "vector stores" that allows to use LocalAI as in-memory Vector DB ([https://github.com/mudler/LocalAI/issues/1792](https://togithub.com/mudler/LocalAI/issues/1792)). You can learn more about it [here](https://localai.io/stores/)!

#### 🐛 Bug fixes

This release contains major bugfixes to the watchdog component, and a fix to a regression introduced in v2.10.x which was not respecting `--f16`, `--threads` and `--context-size` to be applied as model's defaults.

#### 🎉 New Model defaults for llama.cpp

Model defaults has changed to automatically offload maximum GPU layers if a GPU is available, and it sets saner defaults to the models to enhance the LLM's output.

#### 🧠 New pre-configured models

You can now run `llava-1.6-vicuna`, `llava-1.6-mistral` and `hermes-2-pro-mistral`, see [Run other models](https://localai.io/docs/getting-started/run-other-models/) for a list of all the pre-configured models available in the release.

### 📣 Spread the word!

First off, a massive thank you (again!) to each and every one of you who've chipped in to squash bugs and suggest cool new features for LocalAI. Your help, kind words, and brilliant ideas are truly appreciated - more than words can say!

And to those of you who've been heros, giving up your own time to help out fellow users on Discord and in our repo, you're absolutely amazing. We couldn't have asked for a better community.

Just so you know, LocalAI doesn't have the luxury of big corporate sponsors behind it. It's all us, folks. So, if you've found value in what we're building together and want to keep the momentum going, consider showing your support. A little shoutout on your favorite social platforms using @LocalAI_OSS and @mudler_it or joining our sponsors can make a big difference.

Also, if you haven't yet joined our Discord, come on over! Here's the link: https://discord.gg/uJAeKSAGDy

Every bit of support, every mention, and every star adds up and helps us keep this ship sailing. Let's keep making LocalAI awesome together!

Thanks a ton, and here's to more exciting times ahead with LocalAI!

### 🔗 Links

- Quickstart docs (how to run with AIO images): https://localai.io/basics/getting_started/
- More reference on AIO image: https://localai.io/docs/reference/aio-images/
- List of embedded models that can be started: https://localai.io/docs/getting-started/run-other-models/

### 🎁 What's More in v2.11.0?

##### Bug fixes 🐛

- fix(config): pass by config options, respect defaults by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1878](https://togithub.com/mudler/LocalAI/pull/1878)
- fix(watchdog): use ShutdownModel instead of StopModel by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1882](https://togithub.com/mudler/LocalAI/pull/1882)
- NVIDIA GPU detection support for WSL2 environments by [@enricoros](https://togithub.com/enricoros) in [https://github.com/mudler/LocalAI/pull/1891](https://togithub.com/mudler/LocalAI/pull/1891)
- Fix NVIDIA VRAM detection on WSL2 environments by [@enricoros](https://togithub.com/enricoros) in [https://github.com/mudler/LocalAI/pull/1894](https://togithub.com/mudler/LocalAI/pull/1894)

##### Exciting New Features 🎉

- feat(functions/aio): all-in-one images, function template enhancements by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1862](https://togithub.com/mudler/LocalAI/pull/1862)
- feat(aio): entrypoint, update workflows by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1872](https://togithub.com/mudler/LocalAI/pull/1872)
- feat(aio): add tests, update model definitions by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1880](https://togithub.com/mudler/LocalAI/pull/1880)
- feat(stores): Vector store backend by [@richiejp](https://togithub.com/richiejp) in [https://github.com/mudler/LocalAI/pull/1795](https://togithub.com/mudler/LocalAI/pull/1795)
- ci(aio): publish hipblas and Intel GPU images by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1883](https://togithub.com/mudler/LocalAI/pull/1883)
- ci(aio): add latest tag images by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1884](https://togithub.com/mudler/LocalAI/pull/1884)

##### 🧠 Models

- feat(models): add phi-2-chat, llava-1.6, bakllava, cerbero by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1879](https://togithub.com/mudler/LocalAI/pull/1879)

##### 📖 Documentation and examples

- ⬆️ Update docs version mudler/LocalAI by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1856](https://togithub.com/mudler/LocalAI/pull/1856)
- docs(mac): improve documentation for mac build by [@tauven](https://togithub.com/tauven) in [https://github.com/mudler/LocalAI/pull/1873](https://togithub.com/mudler/LocalAI/pull/1873)
- docs(aio): Add All-in-One images docs by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1887](https://togithub.com/mudler/LocalAI/pull/1887)
- fix(aio): make image-gen for GPU functional, update docs by [@mudler](https://togithub.com/mudler) in [https://github.com/mudler/LocalAI/pull/1895](https://togithub.com/mudler/LocalAI/pull/1895)

##### 👒 Dependencies

- ⬆️ Update ggerganov/whisper.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1508](https://togithub.com/mudler/LocalAI/pull/1508)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1857](https://togithub.com/mudler/LocalAI/pull/1857)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1864](https://togithub.com/mudler/LocalAI/pull/1864)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1866](https://togithub.com/mudler/LocalAI/pull/1866)
- ⬆️ Update ggerganov/whisper.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1867](https://togithub.com/mudler/LocalAI/pull/1867)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1874](https://togithub.com/mudler/LocalAI/pull/1874)
- ⬆️ Update ggerganov/whisper.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1875](https://togithub.com/mudler/LocalAI/pull/1875)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1881](https://togithub.com/mudler/LocalAI/pull/1881)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1885](https://togithub.com/mudler/LocalAI/pull/1885)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1889](https://togithub.com/mudler/LocalAI/pull/1889)

##### Other Changes

- ⬆️ Update ggerganov/whisper.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1896](https://togithub.com/mudler/LocalAI/pull/1896)
- ⬆️ Update ggerganov/llama.cpp by [@localai-bot](https://togithub.com/localai-bot) in [https://github.com/mudler/LocalAI/pull/1897](https://togithub.com/mudler/LocalAI/pull/1897)

#### New Contributors

- [@enricoros](https://togithub.com/enricoros) made their first contribution in [https://github.com/mudler/LocalAI/pull/1891](https://togithub.com/mudler/LocalAI/pull/1891)

**Full Changelog**: mudler/LocalAI@v2.10.1...v2.11.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
WIP
shards and/or multiple stores (e.g. separate stores by file ID)
implements: #1792