- Dependencies
- Deploy to an Existing Kubernetes Cluster
- Create and Deploy to a New Amazon EKS Cluster
With this repository you can deploy GroundX RAG document ingestion and search capabilities to a Kubernetes cluster in a manner that can be isolated from any external dependencies.
GroundX delivers a unique approach to advanced RAG that consists of three interlocking systems:
- GroundX Ingest: A state-of-the-art vision model trained on over 1M pages of enterprise documents. It delivers unparalleled document understanding and can be fine-tuned for your unique document sets.
- GroundX Store: Secure, encrypted storage for source files, semantic objects, and vectors, ensuring your data is always protected.
- GroundX Search: Built on OpenSearch, it combines text and vector search with a fine-tuned re-ranker model for precise, enterprise-grade results.
In head-to-head testing, GroundX significantly outperforms many popular RAG tools (ref1, ref2, ref3), especially with respect to complex documents at scale. GroundX is trusted by organizations like Air France, Dartmouth and Samsung with over 2 billion tokens ingested on our models.
GroundX On-Prem allows you to leverage GroundX within hardened and secure environments. GroundX On-Prem requires no external dependencies when running, meaning it can be used in air-gapped environments. Deployment consists of two key steps:
- (Optional) Creation of Infrastructure on AWS via Terraform
- Deployment of GroundX onto Kubernetes via Helm
Currently, creation of infrastructure via Terraform is only supported for AWS. However, with sufficient expertise GroundX can be deployed onto any pre-existing Kubernetes cluster.
This repo is in Open Beta. Feedback is appreciated and encouraged. To use the hosted version of GroundX visit EyeLevel.ai. For white glove support in configuring this open source repo in your environment, or to access more performant and closed source versions of this repo, contact us. To learn more about what GroundX is, and what it's useful for, you may be interested in the following resources:
- A video discussing the importance of parsing, and a comparison of several approaches
- GroundX being used to power a multi-modal RAG application
- GroundX being used to power a verbal AI Agent
If you're deploying GroundX On-Prem on AWS, you might be interested in this simple video guide for deploying on AWS. To see how well GroundX understands your documents, check out our online testing tool:
Test your documents for free online
The GroundX ingest service accepts visually complex documents in a variety of formats. It analyzes those documents with several fine-tuned models, converts them into a queryable representation designed to be understood by LLMs, and stores that information for downstream search.
Once documents have been processed by the ingest service, they can be queried with natural language. We use a custom configuration of OpenSearch that has been designed in tandem with the representations generated by the ingest service.
Please ensure you have the following software tools installed before proceeding:
- `bash` shell (version 4.0 or later recommended; AWS Cloud Shell has insufficient resources)
- `terraform` (Setup Docs)
- `kubectl` (Setup Docs)

If you will be using the Terraform scripts to set up infrastructure in AWS, you will also need:
- AWS CLI (Setup Docs)
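A quick way to confirm the tools are installed and on your `PATH` is to run their standard version commands (the AWS CLI is only needed for the Terraform path):

```bash
bash --version            # 4.0 or later recommended
terraform version         # any recent release
kubectl version --client  # checks the client only; no cluster required
aws --version             # only needed if using the AWS Terraform scripts
```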
If you do not have an existing Kubernetes cluster and would like to use our Terraform scripts to set up an Amazon EKS cluster, you should follow the new Amazon EKS cluster Quick Start guide.
In order to deploy GroundX On-Prem to your Kubernetes cluster, you must:
- Check that you have the required compute resources
- Configure or create appropriate node groups and nodes
- Update `operator/env.tfvars` with your cluster information
- Run the deploy script
The GroundX On-Prem pods deploy to nodes using node selector labels and tolerations. Here is an example from one of the Kubernetes YAML configs:

```yaml
nodeSelector:
  node: "{{ .Values.nodeSelector.node }}"
tolerations:
  - key: "node"
    value: "{{ .Values.nodeSelector.node }}"
    effect: "NoSchedule"
```
Node labels are defined in `shared/variables.tf` and must be applied to appropriate nodes within your cluster. Default node label values are:
- `eyelevel-cpu-memory`
- `eyelevel-cpu-only`
- `eyelevel-gpu-layout`
- `eyelevel-gpu-ranker`
- `eyelevel-gpu-summary`
The publicly available GroundX On-Prem Kubernetes pods are all built for the `x86_64` architecture. Pods built for other architectures, such as `arm64`, are available upon customer request.
The GroundX On-Prem GPU pods are designed to run on NVIDIA GPUs with CUDA 12+. Other GPU types or older driver versions are not supported.
As part of the deployment, unless otherwise specified, the NVIDIA GPU operator is installed. If you already have this operator installed in your cluster, set `cluster.has_nvidia` to `true` in your `operator/env.tfvars` config file.
The NVIDIA GPU operator should update your NVIDIA drivers and other software components needed to provision the GPU, so long as you have supported NVIDIA hardware on the machine.
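For example, if your cluster already runs the NVIDIA GPU operator, your `operator/env.tfvars` would contain something like the following sketch (confirm the exact variable structure against `operator/env.tfvars.example`):

```hcl
# Sketch only; see operator/env.tfvars.example for the exact structure.
cluster = {
  has_nvidia = true  # deploy skips installing the NVIDIA GPU operator
}
```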
The GroundX On-Prem default resource requirements are:

| Node label | GPU memory | Disk space | CPU cores | RAM |
| --- | --- | --- | --- | --- |
| `eyelevel-cpu-only` | - | 40 GB | 6 | 12 GB |
| `eyelevel-cpu-memory` | - | 40 GB | 8 | 32 GB |
| `eyelevel-gpu-layout` | 16 GB | 32 GB | 4 | 12 GB |
| `eyelevel-gpu-ranker` | 48 GB | 150 GB | 11 | 32 GB |
| `eyelevel-gpu-summary` | 40 GB | 150 GB | 6 | 28 GB |
The GroundX On-Prem pods are grouped into 5 categories, based on resource requirements, and deploy as described in the node group section.
These pods can be deployed to 5 different dedicated node groups, a single node group, or any combination in between, so long as the minimum resource requirements are met and the appropriate node labels are applied to the nodes.
The resource requirements are as follows:
Pods in the `eyelevel-cpu-only` node group have minimal requirements on CPU, RAM, and disk drive space. They can run on virtually any machine with the supported architecture.
Pods in the `eyelevel-cpu-memory` node group have a range of requirements on CPU, RAM, and disk drive space, but can typically run on most machines with the supported architecture. Services such as `OpenSearch`, `MySQL`, and `MinIO` deploy to the `eyelevel-cpu-memory` nodes, along with some ingestion pipeline pods.
These pods have the following range of requirements (per pod), which are described in detail in `operator/variables.tf`:
- 20 - 75 GB disk drive space
- 0.5 - 2 CPU cores
- 0.5 - 4 GB RAM
Pods in the `eyelevel-gpu-layout` node group have specific requirements on GPU, CPU, RAM, and disk drive space. Each pod requires up to:
- 4 GB GPU memory
- 8 GB disk drive space
- 1 CPU core
- 3 GB RAM
The current configuration for this service assumes an NVIDIA GPU with 16 GB of GPU memory, 4 CPU cores, and at least 12 GB of RAM. It deploys 4 pods on this node (called `workers` in `operator/variables.tf`) and claims the GPU via the `nvidia.com/gpu` resource provided by the NVIDIA GPU operator.
If your machine has different resources than this, you will need to modify `layout_resources.inference` in your `operator/env.tfvars`, using the per-pod requirements described above, to optimize for your node resources, as in the sketch below.
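For example, a node with only 8 GB of GPU memory fits at most 2 layout workers at 4 GB each. A hypothetical override might look like the following; the exact attribute names and structure are defined in `operator/variables.tf`, and `ranker_resources.inference` and `summary_resources.inference` (described next) follow the same pattern:

```hcl
# Hypothetical sketch; confirm attribute names in operator/variables.tf.
layout_resources = {
  inference = {
    workers = 2  # 2 workers x 4 GB GPU memory per pod = 8 GB GPU memory
  }
}
```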
Pods in the `eyelevel-gpu-ranker` node group have specific requirements on GPU, CPU, RAM, and disk drive space. Each pod requires up to:
- 1.1 GB GPU memory
- 3.5 GB disk drive space
- 0.25 CPU core
- 0.75 GB RAM
The current configuration for this service assumes an NVIDIA GPU with 16 GB of GPU memory, 4 CPU cores, and at least 14 GB of RAM. It deploys 14 pods on this node (called `workers` in `operator/variables.tf`) and claims the GPU via the `nvidia.com/gpu` resource provided by the NVIDIA GPU operator.
If your machine has different resources than this, you will need to modify `ranker_resources.inference` in your `operator/env.tfvars`, using the per-pod requirements described above, to optimize for your node resources (following the same pattern as the layout sketch above).
Pods in the `eyelevel-gpu-summary` node group have specific requirements on GPU, CPU, RAM, and disk drive space. Each pod requires up to:
- 10 GB GPU memory
- 36 GB disk drive space
- 1.5 CPU cores
- 7 GB RAM
The current configuration for this service assumes an NVIDIA GPU with 24 GB of GPU memory, 4 CPU cores, and at least 14 GB of RAM. It deploys 2 pods on this node (called `workers` in `operator/variables.tf`) and claims the GPU via the `nvidia.com/gpu` resource provided by the NVIDIA GPU operator.
If your machine has different resources than this, you will need to modify `summary_resources.inference` in your `operator/env.tfvars`, using the per-pod requirements described above, to optimize for your node resources (again following the same pattern as the layout sketch).
As mentioned in the node groups section, node labels are defined in `shared/variables.tf` and must be applied to appropriate nodes within your cluster. Default node label values include:
- `eyelevel-cpu-memory`
- `eyelevel-cpu-only`
- `eyelevel-gpu-layout`
- `eyelevel-gpu-ranker`
- `eyelevel-gpu-summary`
Multiple node labels can be applied to the same node group, so long as resources are available as described in the total recommended resource and node group resources sections. However, every node label must exist on at least one node group within your cluster. Each label should be applied with the string key `node` and one of the enumerated string values from the list above, as shown below.
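For example, you could apply the label (and, if you want nodes dedicated to GroundX, the matching taint) with `kubectl`; `my-gpu-node-1` below is a placeholder for one of your node names:

```bash
# Apply the selector label that the GroundX pods match on
kubectl label node my-gpu-node-1 node=eyelevel-gpu-layout

# Optional: taint the node so only pods carrying the matching toleration
# (see the YAML snippet above) are scheduled onto it
kubectl taint node my-gpu-node-1 node=eyelevel-gpu-layout:NoSchedule
```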
- Create `operator/env.tfvars` by copying the example file:

```bash
cp operator/env.tfvars.example operator/env.tfvars
```

`env.tfvars` is the configuration file Terraform will use when setting up GroundX On-Prem.
- Add admin credentials

For security reasons, you MUST modify the following:
- `admin.api_key`: Set this to a random UUID. You can generate one by running `bin/uuid`. This will be the API key associated with the admin account and will be used for inter-service communications.
- `admin.username`: Set this to a random UUID. You can generate one by running `bin/uuid`. This will be the user ID associated with the admin account and will be used for inter-service communications.
- `admin.email`: Set this to the email address you want associated with the admin account.
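Taken together, the admin entries in `operator/env.tfvars` might look like this sketch (hypothetical values; the exact structure is shown in `operator/env.tfvars.example`):

```hcl
admin = {
  api_key  = "11111111-2222-3333-4444-555555555555"  # from bin/uuid
  username = "66666666-7777-8888-9999-000000000000"  # from bin/uuid
  email    = "admin@example.com"
}
```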
- (Optional) Update passwords and pod resource configurations

Service usernames and passwords can be set in the other variables copied over from `operator/env.tfvars.example` (e.g. MySQL passwords). If you need to make resource changes, as described in the node group resources section, add those to your `operator/env.tfvars` file as well.
- (Optional) Update kubeconfig path

The setup scripts assume your kubeconfig file can be found at `~/.kube/config`. If that is not the case, you will need to modify `cluster.kube_config_path` in your `operator/env.tfvars` file.
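For example, pointing the deploy at a non-default kubeconfig might look like this sketch (hypothetical path; confirm the variable structure against `operator/env.tfvars.example`):

```hcl
cluster = {
  kube_config_path = "/home/me/.kube/alt-config"  # hypothetical path
}
```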
Once `env.tfvars` has been properly configured, run:

```bash
operator/setup
```
This will create a new namespace and deploy GroundX On-Prem into the Kubernetes cluster.
If you already have a Kubernetes cluster, including an existing AWS EKS cluster, you should follow the existing Kubernetes cluster Quick Start guide.
- Create `env.tfvars` by copying the example file:

```bash
cp environment/aws/env.tfvars.example environment/aws/env.tfvars
```

`env.tfvars` is the configuration file Terraform will use when defining the resources. The content of `env.tfvars` will be updated in subsequent steps.
- Once `env.tfvars` has been created, run:

```bash
environment/aws/setup-eks
```
You will be prompted for the AWS region in which to set up your cluster, and you will be asked to double-check that you're happy with the state of the configuration file. Once this command has executed, a VPC and Kubernetes cluster will be set up. You can proceed to deploying GroundX.
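If `kubectl` is not already pointed at the new cluster, you can update your kubeconfig with the AWS CLI before deploying; the region and cluster name below are placeholders, so use the values from your configuration:

```bash
aws eks update-kubeconfig --region us-east-2 --name my-eyelevel-cluster
kubectl get nodes  # sanity check that the new cluster is reachable
```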
- Create `env.tfvars` by copying the example file:

```bash
cp operator/env.tfvars.example operator/env.tfvars
```

This creates a Terraform configuration file for the GroundX application, similar to what was described in the previous section. Now, however, some configuration is required.
- Add admin credentials

For security reasons, you MUST modify the following:
- `admin.api_key`: Set this to a random UUID. You can generate one by running `bin/uuid`. This will be the API key associated with the admin account and will be used for inter-service communications.
- `admin.username`: Set this to a random UUID. You can generate one by running `bin/uuid`. This will be the user ID associated with the admin account and will be used for inter-service communications.
- `admin.email`: Set this to the email address you want associated with the admin account.
- Once `env.tfvars` has been properly configured, run:

```bash
operator/setup
```
This will create a new namespace and deploy GroundX On-Prem into the Kubernetes cluster.
The resources being created will incur costs on AWS. It is recommended to follow all instructions accurately and completely, so that both setup and teardown execute fully. Experience with AWS is recommended.
The default resource configurations are specified here, consisting of:
- 2x `m6a.xlarge`
- 3x `t3a.medium`
- 1x `g4dn.xlarge`
- 3x `g4dn.2xlarge`
- 2x `g5.xlarge`
- ~300 GB `gp2` storage
Once the setup is complete, run:

```bash
kubectl -n eyelevel get svc
```

The API endpoint will be the external IP associated with the GroundX load balancer. For instance, the external IP might resemble the following:

```
EXTERNAL-IP
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxx.us-east-2.elb.amazonaws.com
```
The API endpoint, in conjunction with the `admin.api_key` defined during deployment, can be used to configure the GroundX SDK to communicate with your On-Prem instance of GroundX. Note: you must append `/api` to your API endpoint in the SDK configuration.
```python
from groundx import GroundX

external_ip = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxx.us-east-2.elb.amazonaws.com"
api_key = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

client = GroundX(api_key=api_key, base_url=f"http://{external_ip}/api")
```
```typescript
import { GroundXClient } from "groundx";

const external_ip = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxx.us-east-2.elb.amazonaws.com";

const groundx = new GroundXClient({
  apiKey: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  environment: `http://${external_ip}/api`,
});
```
The API endpoint, in conjunction with the `admin.api_key` defined during deployment, can be used to interact with your On-Prem instance of GroundX. All of the methods and operations described in the GroundX documentation are supported with your On-Prem instance; simply substitute `https://api.groundx.ai` with your API endpoint.
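As a quick end-to-end check, you can point the Python SDK at your endpoint and issue a simple read call. The `buckets.list()` method below follows the hosted SDK's documented surface; confirm it against the SDK version you install:

```python
from groundx import GroundX

# Placeholder values; use your load balancer hostname and your admin.api_key
client = GroundX(
    api_key="xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    base_url="http://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-xxxxxxxxxx.us-east-2.elb.amazonaws.com/api",
)

# A lightweight call to verify the endpoint and API key are working
print(client.buckets.list())
```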
After all resources have been created, teardown can be performed with the following commands. To tear down the GroundX On-Prem deployment, run the following commands in order:

```bash
bin/operator app -c
bin/operator services -c
bin/operator init -c
```
If you used our Terraform scripts to set up an Amazon EKS cluster, run the following commands in order:

```bash
bin/environment eks -c
bin/environment aws-vpc -c
```
It is vital to run these commands in order, and it is recommended to run them one at a time, manually. We have observed inconsistencies and race conditions when they are run automatically.