Information for contributors to the Docs AI Chatbot project.
Note This guide is for MongoDB employees
Currently, we have written the contributor guide and designed the project only for MongoDB employees to contribute to it. The project is dependent on some MongoDB infrastructure and credentials.
In the future, we will explore accepting external contributions, and setting up the project accordingly.
The project is structured as a monorepo, with all projects using TypeScript. We use Lerna to manage our monorepo.
The monorepo has the following main projects, each of which correspond to a JavaScript module:
These packages power our RAG applications.
mongodb-rag-core
: A set of common resources (modules, functions, types, etc.) shared across projects.- You need to recompile
mongodb-rag-core
by runningnpm run build
every time you update it for the changes to be accessible in the other projects that dependend on it.
- You need to recompile
mongodb-rag-ingest
: CLI application that takes data from data sources and converts it toembedded_content
used by Atlas Vector Search.
These packages are generic implementations of our Chatbot Server framework. In general, we publish these as reusable packages on npm.
mongodb-chatbot-server
: Express.js server that performs chat functionality. RESTful API.mongodb-chatbot-ui
: React component for interfacing with themongodb-chatbot-server
. Built with Leafygreen and vite.mongodb-chatbot-evaluation
: A framework for generating, evaluating, and reporting on test suitesmongodb-chatbot-verified-answers
: A CLI and framework for ingesting & managing verified chatbot answers.
These packages are our production chatbot. They build on top of the Chatbot Framework packages and add MongoDB-specific implementations.
chatbot-eval-mongodb-public
: Test suites, evaluators, and reports for the MongoDB AI Chatbotchatbot-server-mongodb-public
: Chatbot server implementation with our MongoDB-specific configuration.ingest-mongodb-public
: RAG ingest service configured to ingest MongoDB Docs, DevCenter, MDBU, MongoDB Press, etc.
mongodb-artifact-generator
: A CLI to generate docs pages, translate code examples, etc.mongodb-atlas
: Collection of scripts related to managing the MongoDB Atlas deployment used by the project.performance-tests
: Performance tests for the project using the k6 performance testing framework.scripts
: Miscellaneous scripts to help with the project.
To contribute to the project, you should follow the standard GitHub workflow:
- Create a fork of the repo.
- Create a branch for your changes on your fork.
- If there's a Jira ticket for your changes, name the branch after the ticket (e.g.
DOCSP-1234
). - If there's no Jira ticket, name the branch after the changes you're making (e.g.
fix_typos
).
- If there's a Jira ticket for your changes, name the branch after the ticket (e.g.
- Make your changes. Commit them to your branch and push to your fork.
- Create a pull request to merge the changes from
<your fork/branch>
into themongodb:chatbot/main
branch.
You must be on a MongoDB corporate network (both office and VPNs) to access many of the dependent services. If you are not in a MongoDB office, you should go on the VPN before starting the bootstrap.
You should also be on a MongoDB corporate network to run the project locally.
Run the following in the root of the repo to install the dependencies and link packages in the repo:
npm install
npm run bootstrap
Most packages in the monorepo require you to define environment variables. We use these to configure services like MongoDB Atlas and OpenAI as well as to control aspects of the build process.
We define environment variables for each package in a .env
file in the package
root. Every project has an .env.example
file showing you which environment
variables you need.
To set up a package, ask a current project contributor for the .env
files of
that package and any of our packages it depends on. If you're unsure which
projects you need to work with, ask a current contributor.
At a minimum, you'll need the environment variables for mongodb-rag-core
since
it's used by most packages.
In general, you need to build a package's local dependencies and then build the
package itself before you can use it. You can check if a package has any local
dependencies by looking for the names of this repo's other packages in the
dependencies
list of package.json
.
At a minimum, you'll need to build mongodb-rag-core
.
You can build packages individually by running build
from the package root:
npm run build
Or you can build multiple packages by running build
from the repo root. By
default this builds all packages in the repo. You can use the --scope
flag to
only build a subset of all packages.
npm run build -- --scope='{mongodb-rag-core,chatbot-server-mongodb-public,mongodb-chatbot-ui}'
You can also run a development build of both the chatbot-server-mongodb-public
and mongodb-chatbot-ui
with hot reload by running the following command from
the root of the monorepo:
npm run dev
The projects uses Drone for its CI/CD pipeline. All drone config is located in .drone.yml
.
Applications are deployed on Kubernetes using the Kanopy developer platform.
Kubernetes/Kanopy configuration are found in the <deployed project>/environments
directories. Refer to the Kanopy documentation for more information
on the Kanopy environment files.
The applications are containerized using docker. Docker files are named
with the *.dockerfile
extension.
We use release-it to manage versioning packages.
All packages' versioning is managed independently, which means that each package has its own version number. We use Lerna independent mode to facilitate this.
We use semantic versioning to version packages.
Updating the version of a package is done as part of the release flow. Run the following command from a given package (e.g. mongodb-chatbot-server
):
npm run release
To learn more about our release setup, refer to the release-it monorepo documentation.
When you create a tag for a release and push it to Github, it triggers a Drone pipeline that publishes the package to npm.
We run a staging version of the mongodb-chatbot-server
and mongodb-rag-ingest
that uses
the latest commit on the main
branch. When you merge new commits into main
,
a CI/CD pipeline automatically builds and publishes the updated staging server and demo site.
We run a QA server that serves a specific build for testing before we release to production.
To publish to QA:
-
Check out the
qa
branch and pull any upstream changes. Here,upstream
is the name of themongodb/chatbot
remote repo.git fetch upstream git checkout qa git pull upstream qa
-
Apply any commits you want to build to the branch. In many cases you'll just build from the same commits as
main
. However, you might want to QA only a subset of commits frommain
. -
Add a tag to the latest commit on the
qa
branch using the following naming scheme:chat-server-qa-<Build ID>
git tag chat-server-qa-0.0.42 -a
-
Push the branch to this upstream GitHub repo
git push upstream qa
Once you've added the tag to the upstream repo, the Drone CI automatically builds and deploys the branch to the QA server.
We use a tool called release-it
to prepare production releases for the
mongodb-chatbot-server
, mongodb-rag-ingest
, and chat-ui
projects.
Production releases are triggered by creating a git tag prefaced with the
package name (e.g. chat-server-v{version-number}
).
To create a new production release:
-
Pull latest code you want to release.
git pull upstream main
-
In the relevant package directory (e.g
mongodb-chatbot-server
) run the release command. This gets the package ready for a new release.npm run release
When prompted create a draft Github release. The URL for the release draft is present in the output of CLI operation. You can use this later.
-
Create a pull request for the branch. Get it reviewed using the standard review process.
-
Once the PR is approved and merged, publish the draft release. You can find the release draft in the draft tag: https://github.com/mongodb/chatbot/releases.
When the release is published, the Drone CI picks up the corresponding git tag and then automatically builds and deploys the branch to production.
For local development, manage secrets using .env
files. Wherever you need a .env
file,
there's a .env.example
file with the fields you need to add values for.
For our CI, we use Drone. You can manage Drone secrets using the Drone UI.
Our CI tests require secrets to run. These are run from the config in the .drone.yml
file.
For more information on Drone secrets management, refer to https://docs.drone.io/secret/.
Our staging environment is deployed to a Kubernetes deployment via the Kanopy developer platform. We use Kubernetes secrets. You can update secrets with either the helm ksec extension or the Kanopy UI.
Same as staging, but in the production environment.