Add openlineage_dataset_uri in databricks db #1919

Merged: pankajastro merged 1 commit into main from lineage_for_databricks on May 4, 2023

@pankajastro pankajastro commented May 4, 2023

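Currently, the integration tests fail with the error below because the Databricks Delta database does not implement `openlineage_dataset_uri`, so the call falls through to the base class, which raises `NotImplementedError`. This PR adds `openlineage_dataset_uri` to the Databricks Delta database.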

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/openlineage/airflow/extractors/manager.py", line 42, in extract_metadata
    task_metadata = extractor.extract_on_complete(task_instance)
  File "/usr/local/lib/python3.9/site-packages/openlineage/airflow/extractors/base.py", line 112, in extract_on_complete
    return self._get_openlineage_facets(
  File "/usr/local/lib/python3.9/site-packages/openlineage/airflow/extractors/base.py", line 120, in _get_openlineage_facets
    facets: OperatorLineage = get_facets_method(*args)
  File "/home/astro/.local/lib/python3.9/site-packages/astro/sql/operators/load_file.py", line 323, in get_openlineage_facets_on_complete
    name=self.output_table.name, uri=self.output_table.openlineage_dataset_uri()
  File "/home/astro/.local/lib/python3.9/site-packages/astro/table.py", line 170, in openlineage_dataset_uri
    return f"{database.openlineage_dataset_uri(table=self)}"
  File "/home/astro/.local/lib/python3.9/site-packages/astro/databases/base.py", line 894, in openlineage_dataset_uri
    raise NotImplementedError
NotImplementedError
[2023-05-04, 00:35:01 UTC] {standard_task_runner.py:100} ERROR - Failed to execute job 44345 for task load_file__23 (Databricks job failed. Job info ***
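
For readers following the traceback, here is a minimal sketch of the failure and of the kind of override that resolves it. It is not the code merged in this PR: `Table`, `BaseDatabase`, `DeltaDatabase`, the `host` attribute, the `openlineage_dataset_namespace` helper, and the `databricks://` URI layout are all hypothetical stand-ins chosen for illustration. Only the method name `openlineage_dataset_uri` and the `NotImplementedError` in the base class come from the traceback above.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical stand-in for the SDK's table object; only the name is modeled."""

    name: str


class BaseDatabase:
    """Hypothetical stand-in for the base database class."""

    def openlineage_dataset_uri(self, table: Table) -> str:
        # Matches the last frame of the traceback: the base class has no default,
        # so any database that does not override this method fails here.
        raise NotImplementedError


class DeltaDatabase(BaseDatabase):
    """Hypothetical Databricks Delta database that overrides the method."""

    def __init__(self, host: str) -> None:
        self.host = host  # assumed attribute, not taken from the SDK

    def openlineage_dataset_namespace(self) -> str:
        # Assumed URI scheme; the real namespace format may differ.
        return f"databricks://{self.host}"

    def openlineage_dataset_uri(self, table: Table) -> str:
        # Build the dataset URI from the namespace and the table name.
        return f"{self.openlineage_dataset_namespace()}/{table.name}"


if __name__ == "__main__":
    db = DeltaDatabase(host="example.cloud.databricks.com")
    print(db.openlineage_dataset_uri(Table(name="homes")))
```

With an override like this in place, the lineage extractor receives a string instead of hitting the base class's `NotImplementedError`.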

@pankajastro pankajastro marked this pull request as ready for review May 4, 2023 07:36
codecov bot commented May 4, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.55 🎉

Comparison is base (ccf277d) 84.78% compared to head (bb34fd0) 85.34%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1919      +/-   ##
==========================================
+ Coverage   84.78%   85.34%   +0.55%     
==========================================
  Files         104      104              
  Lines        5950     5952       +2     
  Branches      677      677              
==========================================
+ Hits         5045     5080      +35     
+ Misses        761      734      -27     
+ Partials      144      138       -6     
| Flag | Coverage Δ |
|---|---|
| PythonSDK | 93.13% <100.00%> (+0.78%) ⬆️ |

| Impacted Files | Coverage Δ |
|---|---|
| python-sdk/src/astro/databases/databricks/delta.py | 85.61% <100.00%> (+0.19%) ⬆️ |

... and 4 files with indirect coverage changes

@pankajastro pankajastro merged commit 4c88d5e into main May 4, 2023
@pankajastro pankajastro deleted the lineage_for_databricks branch May 4, 2023 09:19
utkarsharma2 pushed a commit that referenced this pull request Aug 8, 2023