TASK: Cleanup the legacy namespace for lineage event #449

csun-cpointe · 2024-10-31T14:33:57Z

Description

In v1.7.0 we released the OpenLineage Namespace Conventions to better follow OpenLineage's guidelines. Moving forward, namespaces should be defined in the data-lineage.properties file. We are cleaning up the data.lineage.namespace property in a project's data-lineage.properties file, which was supported as a fallback but will no longer be supported as of release 1.10.

DOD

Acceptance criteria required to complete the work

Clean up the data.lineage.namespace support functions and tests
Clean up the data.lineage.namespace properties in the data-lineage.properties file
Lineage events for the pyspark, spark, and model training pipelines should still work as expected.

Test Strategy/Script

How will this item be verified?

Create a new aissemble-based project using the latest archetype snapshot.

mvn archetype:generate '-DarchetypeGroupId=com.boozallen.aissemble' \
                           '-DarchetypeArtifactId=foundation-archetype' \
                           '-DarchetypeVersion=1.10.0-SNAPSHOT' \
                           '-DgroupId=org.test' \
                           '-Dpackage=org.test' \
                           '-DprojectGitUrl=test.org/test.git' \
                           '-DprojectName=Test pyspark lineage' \
                           '-DartifactId=test-449' \
    && cd test-449

Set your Java version to 17 if it is not already.
Under -model/src/main/resources/pipelines, add the pipeline models SparkPipeline.json, PythonPipeline.json, and ClassificationTraining.json.
Fully generate the project by running mvn clean install and following the manual actions.
Build the project without the cache and follow the last manual action.
```
mvn clean install -Dmaven.build.cache.skipCache
```
Deploy the project and wait for all services to be ready.
```
tilt up; tilt down
```
Manually trigger the python-pipeline pod and verify no errors in the log
Manually trigger the spark-pipeline pod and verify no errors in the log
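The "no errors in the log" check in the two steps above can be scripted roughly as follows. This is only a sketch: `log_is_clean` is a hypothetical helper, and the patterns it searches for are an assumption about how failures show up in the pipeline logs, so it may flag harmless lines that merely mention the word "error".

```shell
# Hypothetical helper: succeeds only if the log text on stdin contains no
# line matching "error" or "exception" (case-insensitive).
# The patterns are an assumption, not something the pipelines guarantee.
log_is_clean() {
  ! grep -E -i -q 'error|exception'
}

# Usage (the pod name is a placeholder):
#   kubectl logs <pod-name> | log_is_clean && echo "no errors found"
```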
Use Postman or any REST client to trigger the training pipeline and verify that a training pipeline ID is returned
- URL: http://localhost:5001/training-jobs?pipeline_step=logistic-training
- Method: POST
- Body: {}
- Copy the returned service ID and run the command below to verify that the log contains no errors, e.g. kubectl logs job.batch/"model-training-logistic-tr-24cd1662-5b62-4e3c-946f-6b9081e30017"
```
kubectl logs job.batch/<PASTE OUTPUT>
```
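For anyone without Postman, the trigger request described above can also be sent with curl. This is a minimal sketch: the function names are hypothetical, and the Content-Type header is an assumption, since the issue only specifies the URL, method, and body.

```shell
# Hypothetical helper: builds the training-job URL for a given pipeline step.
training_url() {
  printf 'http://localhost:5001/training-jobs?pipeline_step=%s' "$1"
}

# Hypothetical helper: POSTs an empty JSON body to trigger the training step.
# The Content-Type header is an assumption about what the service expects.
trigger_training() {
  curl --silent --show-error --request POST \
       --header 'Content-Type: application/json' \
       --data '{}' \
       "$(training_url "$1")"
}

# Usage:
#   trigger_training logistic-training
```

The response should contain the ID to paste into the kubectl logs command.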

References/Additional Context

As needed


carter-cundiff · 2024-11-01T16:03:52Z

Testing passed:

csun-cpointe added the task label Oct 31, 2024

csun-cpointe self-assigned this Oct 31, 2024

csun-cpointe added the slacktask label Oct 31, 2024

csun-cpointe added this to the 1.10.0 milestone Oct 31, 2024

csun-cpointe added a commit that referenced this issue Nov 1, 2024

Merge pull request #450 from boozallen/449-lineage-legacy-namespace-c…

ac581d3

…leanup #449 data/model lineage legacy namespace cleanup

carter-cundiff closed this as completed Nov 1, 2024
