feat(elasticsearch): elasticsearch implementation #26

Merged
merged 2 commits into from
Jun 30, 2021
Changes from 1 commit
4 changes: 0 additions & 4 deletions .github/workflows/ci.yml
@@ -16,10 +16,6 @@ jobs:
  ports:
  - 5432:5432
  options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
- redis:
- image: redislabs/rejson:latest
- ports:
- - 6379:6379
  steps:
  - uses: actions/checkout@v1
  - name: Set up Python 3.7
4 changes: 2 additions & 2 deletions .secrets.baseline
@@ -3,7 +3,7 @@
  "files": "poetry.lock",
  "lines": null
  },
- "generated_at": "2021-05-13T15:22:19Z",
+ "generated_at": "2021-06-22T16:17:18Z",
  "plugins_used": [
  {
  "name": "AWSKeyDetector"
@@ -70,7 +70,7 @@
  {
  "hashed_secret": "6eae3a5b062c6d0d79f070c26e6d62486b40cb46",
  "is_verified": false,
- "line_number": 60,
+ "line_number": 50,
  "type": "Secret Keyword"
  }
  ]
27 changes: 8 additions & 19 deletions README.md
@@ -19,17 +19,7 @@ The aggregated MDS is a service which caches metadata from commons metadata serv

  The aggregate metadata APIs and migrations are disabled by default unless `USE_AGG_MDS=true` is specified.

- The aggregate cache is built using Redis and the [RedisJson](http://redisjson.io) module. To quickly populate it you can run the following:
-
- ```bash
- docker run -p 6379:6379 --name redis-redisjson redislabs/rejson:latest
- ```
-
- and then
-
- ```bash
- python src/mds/populate.py --config configs/brh_config.json
- ```
+ The aggregate cache is built using Elasticsearch. See the `docker-compose.yml` file (specifically the `aggregate_migration` service) for details on how the aggregate data is populated.

  ## Installation

@@ -54,14 +44,13 @@ Create a file `.env` in the root directory of the checkout:
  (uncomment to override the default)

  ```python
- # DB_HOST = "..." # default: localhost
- # DB_PORT = ... # default: 5432
- # DB_USER = "..." # default: current user
- # DB_PASSWORD = "..." # default: empty
- # DB_DATABASE = "..." # default: current user
- # USE_AGG_MDS = "..." # default: false
- # REDIS_DB_HOST = "..." # default: localhost
- # REDIS_DB_PORT = "..." # default: 6379
+ # DB_HOST = "..." # default: localhost
+ # DB_PORT = ... # default: 5432
+ # DB_USER = "..." # default: current user
+ # DB_PASSWORD = "..." # default: empty
+ # DB_DATABASE = "..." # default: current user
+ # USE_AGG_MDS = "..." # default: false
+ # GEN3_ES_ENDPOINT = "..." # default: empty
  ```

  Run database schema migration:
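
Note on the `.env` change in `README.md`: the settings and defaults listed there can be summarized in a small resolver. This is only a minimal sketch, assuming the values arrive as plain environment variables (as they do for the services in `docker-compose.yml`); the function name `load_settings` is hypothetical and is not taken from the project, which may read its configuration differently.

```python
import getpass
import os
from typing import Mapping, Optional


def _current_user() -> str:
    """Best-effort lookup of the current user name (empty string if unknown)."""
    try:
        return getpass.getuser()
    except Exception:
        return ""


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve the README's settings, falling back to the documented defaults.

    Hypothetical helper for illustration only; not the project's own loader.
    """
    env = os.environ if env is None else env
    user = _current_user()
    return {
        "DB_HOST": env.get("DB_HOST", "localhost"),
        "DB_PORT": int(env.get("DB_PORT", "5432")),
        "DB_USER": env.get("DB_USER", user),
        "DB_PASSWORD": env.get("DB_PASSWORD", ""),
        "DB_DATABASE": env.get("DB_DATABASE", user),
        # The aggregate APIs stay off unless the value is literally "true".
        "USE_AGG_MDS": env.get("USE_AGG_MDS", "false").strip().lower() == "true",
        # Replaces REDIS_DB_HOST / REDIS_DB_PORT from before this PR.
        "GEN3_ES_ENDPOINT": env.get("GEN3_ES_ENDPOINT", ""),
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a mapping instead of relying on `os.environ` keeps the sketch easy to exercise, for example `load_settings({"USE_AGG_MDS": "true"})`.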
33 changes: 25 additions & 8 deletions docker-compose.yml
@@ -8,12 +8,12 @@ services:
  - .:/src
  depends_on:
  - db_migration
- - redis_migration
+ - aggregate_migration
  environment:
  - DB_HOST=db
  - DB_USER=metadata_user
  - USE_AGG_MDS=true
- - REDIS_DB_HOST=redis
+ - GEN3_ES_ENDPOINT=http://esproxy-service:9200
  command: /env/bin/uvicorn --host 0.0.0.0 --port 80 mds.asgi:app --reload
  db_migration:
  build: .
@@ -26,15 +26,14 @@ services:
  - DB_HOST=db
  - DB_USER=metadata_user
  command: /env/bin/alembic upgrade head
- redis_migration:
+ aggregate_migration:
  build: .
  image: mds
  volumes:
  - .:/src
  environment:
  - USE_AGG_MDS=true
- - REDIS_DB_HOST=redis
- command: /env/bin/python /src/src/mds/populate.py --config /src/configs/brh_config.json --hostname redis
+ command: /env/bin/python /src/src/mds/populate.py --config /src/configs/brh_config.json --hostname esproxy-service --port 9200
  db:
  image: postgres
  environment:
@@ -43,7 +42,25 @@ services:
  volumes:
  - ./postgres-data:/var/lib/postgresql/data
  - ./postgres-init:/docker-entrypoint-initdb.d:ro
- redis:
- image: redislabs/rejson:latest
+ esproxy-service:
+ image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.12
+ container_name: esproxy-service
+ environment:
+ - cluster.name=elasticsearch-cluster
+ - bootstrap.memory_lock=false
+ - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
+ entrypoint:
+ - /bin/bash
+ # mmapfs requires a sysctl update - see https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-store.html#mmapfs
+ command:
+ - -c
+ - "echo -e 'cluster.name: docker-cluster\nhttp.host: 0.0.0.0\nindex.store.type: niofs' > /usr/share/elasticsearch/config/elasticsearch.yml && /usr/local/bin/docker-entrypoint.sh eswrapper"
+ ulimits:
+ memlock:
+ soft: -1
+ hard: -1
+ nofile:
+ soft: 65536
+ hard: 65536
  ports:
- - "6379:6379"
+ - 9200:9200
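
Note on the `docker-compose.yml` change: in the hunks shown here, `aggregate_migration` runs `populate.py` against `esproxy-service` without anything visibly waiting for Elasticsearch to accept requests. The sketch below shows one way a caller might split `GEN3_ES_ENDPOINT` into the `--hostname` / `--port` pair and poll the cluster health endpoint before populating. The helper names (`split_endpoint`, `http_probe`, `wait_until_ready`) and the retry numbers are assumptions for illustration, not part of this PR, and plain HTTP is assumed as in the compose file.

```python
import time
from typing import Callable, Tuple
from urllib.error import URLError
from urllib.parse import urlsplit
from urllib.request import urlopen


def split_endpoint(endpoint: str, default_port: int = 9200) -> Tuple[str, int]:
    """Split an endpoint such as http://esproxy-service:9200 into (host, port)."""
    parts = urlsplit(endpoint if "://" in endpoint else "http://" + endpoint)
    if not parts.hostname:
        raise ValueError(f"no host in endpoint: {endpoint!r}")
    return parts.hostname, parts.port or default_port


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on the URL answers with HTTP 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False


def wait_until_ready(
    endpoint: str,
    probe: Callable[[str], bool] = http_probe,
    attempts: int = 30,
    delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll Elasticsearch's /_cluster/health until it responds or attempts run out."""
    host, port = split_endpoint(endpoint)
    url = f"http://{host}:{port}/_cluster/health"  # assumes plain HTTP
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next try
    return False


if __name__ == "__main__":
    print(split_endpoint("http://esproxy-service:9200"))
```

The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running cluster. In the compose setup, a call such as `wait_until_ready("http://esproxy-service:9200")` could run before the populate step.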