
# How to Add a New Data Source

## Implementation

We have a file named `data_source.py` located in `app/model`. This file contains an enum class called `DataSource`.

Your task is to add a new data source to this `DataSource` enum class.

```python
class DataSource(StrEnum):
    postgres = auto()
```

Additionally, you need to add the corresponding data-source-to-DTO mapping in the enum class `DataSourceExtension`.

```python
class DataSourceExtension(Enum):
    postgres = QueryPostgresDTO
```

Create a new DTO (Data Transfer Object) class in the model directory. The model directory is located at `app/model` and its contents are defined in the `__init__.py` file.

```python
class QueryPostgresDTO(QueryDTO):
    connection_info: ConnectionUrl | PostgresConnectionInfo = connection_info_field
```

The shape of the connection info depends on ibis: include whichever fields the corresponding ibis backend needs to establish a connection.

```python
class PostgresConnectionInfo(BaseModel):
    host: SecretStr = Field(examples=["localhost"])
    port: SecretStr = Field(examples=[5432])
    database: SecretStr
    user: SecretStr
    password: SecretStr
```

We use Pydantic's `BaseModel` to define these classes. Pydantic provides convenient Secret Types that keep sensitive values such as credentials from leaking into logs and tracebacks.
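For a quick illustration of the masking behavior (a standalone sketch, not project code; the `Credentials` model is hypothetical):

```python
from pydantic import BaseModel, SecretStr

class Credentials(BaseModel):  # hypothetical model, for illustration only
    password: SecretStr

creds = Credentials(password="s3cret")
print(creds.password)                     # **********  (masked in repr and logs)
print(creds.password.get_secret_value())  # s3cret      (explicit access only)
```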

Return to the `DataSourceExtension` enum class to implement the `get_{data_source}_connection` function. This function should be specific to your new data source. For example, if you've added a PostgreSQL data source, you might implement a `get_postgres_connection` function.

```python
@staticmethod
def get_postgres_connection(
    info: ConnectionUrl | PostgresConnectionInfo,
) -> BaseBackend:
    # A full connection URL was supplied; let ibis pick the backend from it.
    if hasattr(info, "connection_url"):
        return ibis.connect(info.connection_url.get_secret_value())
    # Otherwise connect with the individual (secret-wrapped) fields.
    return ibis.postgres.connect(
        host=info.host.get_secret_value(),
        port=int(info.port.get_secret_value()),
        database=info.database.get_secret_value(),
        user=info.user.get_secret_value(),
        password=info.password.get_secret_value(),
    )
```
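The `get_{data_source}_connection` naming convention matters because the connection function is resolved from the enum member's name. The snippet below is only a sketch of that dispatch pattern, assuming a `getattr`-based lookup; check `data_source.py` for the project's actual resolution logic:

```python
def get_connection(self, info) -> BaseBackend:
    # Hypothetical dispatcher: resolves the static method from the member
    # name, e.g. "postgres" -> get_postgres_connection. Verify against the
    # real implementation in data_source.py.
    return getattr(self, f"get_{self.name}_connection")(info)
```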

## Test

After implementing the new data source, you should add a test case to ensure it's working correctly.

Create a new test file `test_postgres.py` in the `tests/routers/v2/connector` directory.

Set up the basic test structure:

```python
import pytest
from fastapi.testclient import TestClient
from app.main import app

pytestmark = pytest.mark.postgres
client = TestClient(app)
```

We use pytest as our test framework. See the pytest documentation to learn more about markers and fixtures.

Because we use a strict marker strategy in pytest, you need to declare the new marker in the `pyproject.toml` file. Open `pyproject.toml`, locate the `[tool.pytest.ini_options]` section, and add your new marker to the `markers` list:

```toml
[tool.pytest.ini_options]
markers = [
    "postgres: mark a test as a postgres test",
]
```

If the data source has a Docker image available, you can use testcontainers-python to simplify your testing setup:

```python
import pytest
from testcontainers.postgres import PostgresContainer

@pytest.fixture(scope="module")
def postgres(request) -> PostgresContainer:
    # Start a throwaway PostgreSQL container for this module's tests
    # and make sure it is stopped once the module finishes.
    pg = PostgresContainer("postgres:16-alpine").start()
    request.addfinalizer(pg.stop)
    return pg
```
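With the fixture in place, a test can feed the container's connection details into the request payload. The endpoint path and payload keys below are illustrative assumptions (as are the container attribute names, which match recent testcontainers-python releases); mirror whatever your v2 connector router actually expects:

```python
def test_query_with_postgres(postgres: PostgresContainer):
    # Assumed payload shape and route; adapt to the real router contract.
    connection_info = {
        "host": postgres.get_container_host_ip(),
        "port": postgres.get_exposed_port(5432),
        "database": postgres.dbname,
        "user": postgres.username,
        "password": postgres.password,
    }
    response = client.post(
        "/v2/connector/postgres/query",  # hypothetical endpoint path
        json={"connectionInfo": connection_info, "sql": "SELECT 1"},
    )
    assert response.status_code == 200
```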

Execute the following command to run the test cases and ensure your new feature is working correctly:

```
poetry run pytest -m postgres
```

This command runs only the tests marked with the `postgres` marker.
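While iterating, you can also point pytest at the new file directly (standard pytest usage, assuming the path above):

```
poetry run pytest -m postgres tests/routers/v2/connector/test_postgres.py
```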

## Submitting Your Work

After confirming that all tests pass, you can create a Pull Request (PR) to add the new data source to the project.

When creating the PR:

- If you are solving an existing issue, remember to link the PR to that issue.
- If this is a new feature, provide detailed information about the feature in the PR description.

Congratulations! You have successfully added a new data source to the project and created tests to verify its functionality.