Add MongoDB Atlas Local Testcontainer #8760

Open
wants to merge 20 commits into base: main

@luketn luketn commented Jun 9, 2024

This adds a new Testcontainer class for MongoDB's Atlas Local container to the existing databases/mongodb module.

The benefits of having this as a Testcontainer (rather than an example for a GenericContainer):

  • solves the startup time issue using a custom wait strategy based on the runner healthcheck command (see below)
  • provides the connection string for MongoDB
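As a rough illustration of the second bullet, a connection string of this kind can be assembled from the container's host and its mapped port. The sketch below is plain Java with no Testcontainers dependency. The class name `ConnectionStrings`, the method `forSingleNode`, and the `directConnection=true` option are assumptions made for illustration, not the PR's actual implementation.

```java
// Hypothetical helper: not part of Testcontainers or of this PR.
final class ConnectionStrings {

    private ConnectionStrings() {}

    // Builds a MongoDB URI for a single node reachable at host:mappedPort.
    // directConnection=true is an assumption: it tells the driver to talk to
    // this one node instead of discovering replica set members.
    static String forSingleNode(String host, int mappedPort) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (mappedPort < 1 || mappedPort > 65535) {
            throw new IllegalArgumentException("port out of range: " + mappedPort);
        }
        return "mongodb://" + host + ":" + mappedPort + "/?directConnection=true";
    }
}
```

In a real container class the host and port would presumably come from the running container rather than being passed in by hand.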

I've tried to follow all the conventions and style for contributions. I've added unit tests and documentation.
Note: I may have been a bit verbose in my documentation contribution; happy to trim it down.

Background

MongoDB Atlas Local combines the MongoDB database engine with MongoT, a sidecar process for advanced searching capabilities built by MongoDB and powered by Apache Lucene.

It allows you to use the following features:

  • MongoDB Atlas Search: Atlas Search gives MongoDB queries access to the incredible search toolbox that is Lucene. The main use case is advanced lexical text querying, similar to the capabilities found in many search engines. In addition, Atlas Search supports queries with faceting and parallel index search. These can extend MongoDB's aggregation capabilities and performance for uses like statistics and complex filters.
    https://www.mongodb.com/docs/atlas/atlas-search/
  • MongoDB Atlas Vector Search: Supports artificial intelligence (AI) based searches for semantically similar items in your data. Vector indexes store embeddings (high-dimensional vectors encoding semantic meaning) used in large language models (LLMs). This feature uses Lucene's vector search capabilities to measure the nearness between vectors. This can be a powerful alternative or complement to lexical text search.
    https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/

Pairing these Lucene-backed technologies with your MongoDB database allows you to build powerful search capabilities into your applications without the need to manage a separate search engine. You can also extend your search capabilities to include AI-based vector searches, which can be useful for recommendation engines, image search, and other applications that require similarity searches.

The container (mongodb/mongodb-atlas-local) documentation can be found here: https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/

Container Healthcheck

You cannot start calling Atlas Search commands, such as creating Atlas Search indexes, until the container is ready. The container takes several seconds to become ready, while:

  • MongoDB database starts
  • MongoDB initialises itself as a replica set
  • MongoT starts and connects to the MongoDB database ready to follow Change Streams for indexing
  • MongoDB connects to MongoT ready to perform $search and $vectorSearch queries

The MongoDBAtlasLocalContainer uses the container's runner healthcheck command to check for readiness.
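The readiness check described above boils down to running a check repeatedly until it succeeds or a deadline passes. A minimal sketch of that loop in plain Java follows. `ReadinessPoller` and `waitUntilReady` are hypothetical names, and the `BooleanSupplier` stands in for "the healthcheck command exited successfully". The PR itself presumably expresses this through a Testcontainers wait strategy rather than a hand-written loop.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper: illustrates the polling idea only.
final class ReadinessPoller {

    private ReadinessPoller() {}

    // Returns true as soon as the healthcheck reports success,
    // or false if the timeout elapses (or the thread is interrupted) first.
    static boolean waitUntilReady(BooleanSupplier healthcheck, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (healthcheck.getAsBoolean()) {
                return true;
            }
            // Overflow-safe comparison of nanoTime values.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

The check always runs at least once, so a container that is already healthy is reported ready without any sleep.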

Member

@eddumelendez left a comment

Thanks for your contribution, @luketn! Glad to see the PR with less complexity than the initial implementation. I've left some comments. Also, regarding the docs, I'd say we should focus on what the Testcontainers implementation offers. Of course, if there is more we can do with it and the MongoDB Atlas container, then we should document it.


import java.time.Instant;

import static org.junit.Assert.*;
Member

Let's use assertj instead.

Author

Done

@@ -0,0 +1,43 @@
package org.testcontainers.containers;
Member

Suggested change
- package org.testcontainers.containers;
+ package org.testcontainers.mongodb;

Author

@eddumelendez I'm concerned that this would be a breaking change for existing users of the MongoDBContainer (assuming we would move both classes to this new package). Perhaps we should leave both in the current package unless we are prepared to make a breaking change and move both to the new package?

Member

Let's move only MongoDBAtlasLocalContainer; since it is a new implementation, there are no issues.

Author

I tried doing that, but the container uses a custom ContainerDef extension to allow it to override the wait strategy. That class is package-private, so we have to remain in the org.testcontainers.containers package.

The use of a custom container def will be important for future changes too, since we will want to add custom with... statements for atlas search indexes and seed data once the container supports that (assuming MongoDB agree and add the necessary dependencies to the container).

import org.testcontainers.containers.wait.strategy.Wait;
import org.testcontainers.utility.DockerImageName;

public class MongoDBAtlasLocalContainer extends GenericContainer<MongoDBAtlasLocalContainer> {
Member

I wonder if it can be renamed to MongoDbAtlasContainer instead. WDYT?

Author

My feeling is Atlas Local is the more consistent name with MongoDB's own documentation and naming conventions:
https://hub.docker.com/r/mongodb/mongodb-atlas-local
https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/

Comment on lines 70 to 83
### General info
MongoDB Atlas Local combines the MongoDB database engine with MongoT, a sidecar process for advanced searching capabilities built by MongoDB and powered by [Apache Lucene](https://lucene.apache.org/).

It allows you to use the following features:

* MongoDB Atlas Search: Atlas Search gives MongoDB queries access to the incredible search toolbox that is Lucene. The main use-case is advanced lexical text querying capabilities similar to those found in many search engines. In addition, Atlas Search supports queries with faceting and parallel index search. These can extend MongoDB's aggregation capabilities and performance for uses like statistics and complex filters.
[https://www.mongodb.com/docs/atlas/atlas-search/](https://www.mongodb.com/docs/atlas/atlas-search/)
* MongoDB Atlas Vector Search: Supports artificial intelligence (AI) based searches for semantically similar items in your data. Vector indexes store embeddings (high-dimensional vectors encoding semantic meaning) used in large language models (LLMs). This feature makes use of Lucene's [vector search](https://lucene.apache.org/core/9_10_0/core/org/apache/lucene/search/KnnVectorQuery.html) capabilities to find the nearness between the values of each vector. This can be a powerful alternative or complement to lexical text search.
[https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/](https://www.mongodb.com/docs/atlas/atlas-vector-search/vector-search-overview/)

Pairing these Lucene backed technologies with your MongoDB database allows you to build powerful search capabilities into your applications without the need to manage a separate search engine. You can also extend your search capabilities to include AI based vector searches, which can be useful for recommendation engines, image search, and other applications that require similarity searches.

The container (mongodb/mongodb-atlas-local) documentation can be found here:
[https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/](https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/)
Member

I think we can add a link to MongoDB Atlas documentation instead of adding product related information in our docs.

Author

Done

Author

@eddumelendez should we remove the 'incubating' statement from the top of this docs page? It's been there a long while!
[image]

Is it still incubating?

Member

let's remove it :)

}
}
}
sleep(50);
Member

We can use awaitility here.

Author

done

Member

I cannot see the Awaitility usage here :)

Comment on lines 128 to 129


Member

Suggested change

Author

done

}

// initAtlasSearchIndex {
public void initAtlasSearchIndex() throws URISyntaxException, IOException, InterruptedException {
Member

Currently, the database modules provide default values for username, password and database. What if we did the same for the MongoDB Atlas implementation, providing default values for the database and collection name that can be customized? Also, a withIndex(Transferable json) method could copy the mapping JSON file from the classpath to the container and execute the right commands in the container. No need to do so right away; just thinking out loud.
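To make the suggestion concrete, here is a small sketch of what "defaults that can be customized" might look like as a fluent settings object. Everything in it is hypothetical: the class `AtlasLocalSettings`, the default names `"test"`, and a `withIndex` that takes the index definition as a plain `String` instead of a Testcontainers `Transferable`. It only records the values; copying files into a container and running commands there is out of scope.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical settings holder: not an existing Testcontainers class.
final class AtlasLocalSettings {

    // Assumed defaults, chosen only for illustration.
    static final String DEFAULT_DATABASE = "test";
    static final String DEFAULT_COLLECTION = "test";

    private String database = DEFAULT_DATABASE;
    private String collection = DEFAULT_COLLECTION;
    private String indexDefinitionJson; // stays null until withIndex is called

    AtlasLocalSettings withDatabase(String database) {
        this.database = Objects.requireNonNull(database, "database");
        return this;
    }

    AtlasLocalSettings withCollection(String collection) {
        this.collection = Objects.requireNonNull(collection, "collection");
        return this;
    }

    // Records the search index definition to apply once the container is ready.
    AtlasLocalSettings withIndex(String indexDefinitionJson) {
        this.indexDefinitionJson = Objects.requireNonNull(indexDefinitionJson, "indexDefinitionJson");
        return this;
    }

    // MongoDB namespaces are written as "<database>.<collection>".
    String namespace() {
        return database + "." + collection;
    }

    Optional<String> indexDefinition() {
        return Optional.ofNullable(indexDefinitionJson);
    }
}
```

A caller who is happy with the defaults would write `new AtlasLocalSettings()`, while one with specific needs could chain `withDatabase("shop").withCollection("products")`.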

Author

I've spoken to MongoDB and requested that they add the MongoDB shell mongosh, as well as the database tools mongoimport, mongoexport, mongodump... which would allow us to create these nice extra features. If they add these to the image, I will add these features as a further PR.

Member

OK, nice! Is there a public place where we can follow the request?

I'm very interested in this because I would like to reduce test execution time in the langchain4j and spring-ai integrations.

Author

I shared this request with the product team directly, but it's a great idea to have a ticket - I created one and will share with the product team too:
https://feedback.mongodb.com/forums/924868-atlas-search/suggestions/48857015-include-mongodb-shell-mongosh-and-mongodb-tools

## Usage example
The MongoDB module provides two Testcontainers for MongoDB unit testing:

* [MongoDBContainer](#mongodbcontainer) - the core MongoDB database
Member

Suggested change
- * [MongoDBContainer](#mongodbcontainer) - the core MongoDB database
+ * [MongoDBContainer](#mongodbcontainer) - the core MongoDB database with ReplicaSet enabled

Author

@luketn, Sep 19, 2024

Can we discuss this one? (booked a session next week)
I actually think the current MongoDB test container could use a lot of work (including some updates to its docs).
I thought about including some of these changes, but perhaps we could do that in a separate PR?
In this case, I think we should really make replica set configuration optional.

Member

> Can we discuss this one?

Sure

> I thought about including some of these changes, but perhaps we could do that in a separate PR?

Yes, let's discuss it first; a separate PR is then the way to go.

try (
// creatingAtlasLocalContainer {
MongoDBAtlasLocalContainer atlasLocalContainer = new MongoDBAtlasLocalContainer(
MongoDBAtlasLocalContainer.DEFAULT_IMAGE_NAME.withTag("7.0.9")
Member

For documentation, let's use the raw image string.

Comment on lines 50 to 61
// writeAndReadBack {
atlasLocalDataAccess.insertData(new AtlasLocalDataAccess.TestData("tests", 123, true));

// Wait for Atlas Search to index the data (Atlas Search is eventually consistent)
await()
    .atMost(5, TimeUnit.SECONDS)
    .pollInterval(10, TimeUnit.MILLISECONDS)
    .pollInSameThread()
    .until(
        () -> atlasLocalDataAccess.findAtlasSearch("test"),
        Objects::nonNull);
// }
Member

AtlasLocalDataAccess.TestData is a class used only for testing, so let's avoid showing it in the docs.

Author

Done. I had already removed it from the docs; I've now also removed the docs comments here.

Comment on lines 73 to 74
The container (mongodb/mongodb-atlas-local) documentation can be found here:
[https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/](https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/)
Member

Suggested change
- The container (mongodb/mongodb-atlas-local) documentation can be found here:
- [https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/](https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/)
+ The container (mongodb/mongodb-atlas-local) documentation can be found [here](https://www.mongodb.com/docs/atlas/cli/current/atlas-cli-deploy-docker/)

Author

done

Comment on lines 76 to 77
General information about Atlas Search can be found here:
[https://www.mongodb.com/docs/atlas/atlas-search/](https://www.mongodb.com/docs/atlas/atlas-search/)
Member

Suggested change
- General information about Atlas Search can be found here:
- [https://www.mongodb.com/docs/atlas/atlas-search/](https://www.mongodb.com/docs/atlas/atlas-search/)
+ General information about Atlas Search can be found [here](https://www.mongodb.com/docs/atlas/atlas-search/).

Author

done

2 participants