Python API for restoring delta table #903

Closed

Conversation

Maks-D
Contributor

@Maks-D Maks-D commented Jan 22, 2022

  • Add the ability to restore a Delta table to a given version or timestamp from PySpark.
    Examples:
    DeltaTable.forPath(spark, path).restoreToVersion(0)
    DeltaTable.forPath(spark, path).restoreToTimestamp('2021-01-01 01:01:01')
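
A minimal sketch of how the two calls might be wrapped in application code. The helper name `restore_delta_table` and its keyword arguments are hypothetical and not part of this PR; only `restoreToVersion` and `restoreToTimestamp` come from the API added here.

```python
def restore_delta_table(delta_table, version=None, timestamp=None):
    """Hypothetical helper: restore by exactly one of version or timestamp.

    `delta_table` is assumed to be a delta.tables.DeltaTable, for example
    the result of DeltaTable.forPath(spark, path).
    """
    # Reject calls that pass neither argument or both arguments.
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        # Roll the table back to an earlier commit version.
        return delta_table.restoreToVersion(version)
    # Roll the table back to its state as of a timestamp string.
    return delta_table.restoreToTimestamp(timestamp)


# Assumed usage with a live Spark session (not executed here):
#   from delta.tables import DeltaTable
#   restore_delta_table(DeltaTable.forPath(spark, path), version=0)
#   restore_delta_table(DeltaTable.forPath(spark, path), timestamp='2021-01-01')
```

Taking the table object as a parameter keeps the helper free of Spark imports, so it can be exercised with a stub in unit tests.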
    

Tested by unit tests.

Fixes #890

Signed-off-by: Maksym Dovhal <maksym.dovhal@gmail.com>

@Maks-D
Contributor Author

Maks-D commented Jan 25, 2022

Hi @tdas,
Just to let you know, I've removed the caching of filesToRemove from RestoreDeltaTable, per our discussion in #863 (comment).

Collaborator

@vkorukanti vkorukanti left a comment

Thanks @Maks-D for the PR. It looks good; I have a few minor comments. Could you split this into two PRs, one handling the Python API and the other handling the refactoring?

python/delta/tests/test_deltatable.py
self.__overwriteDeltaTable([('a', 3), ('b', 2)],
schema=["key_new", "value_new"],
overwriteSchema='true')

Collaborator

Could you add a check to verify that the data has changed? For example:
self.__checkAnswer(restored, [Row(key='a', value=3), Row(key='b', value=2)])

The same applies to the test above.
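
One way to structure such a check, as a rough sketch: confirm the table holds the overwritten data before restoring, so the test cannot pass when the restore is a no-op. The helper `check_restore_roundtrip` and its parameters are hypothetical and are not taken from this PR's test suite.

```python
def check_restore_roundtrip(read_rows, restore, expected_before, expected_after):
    """Hypothetical test helper for a restore round trip.

    read_rows: callable returning the table's current rows as tuples.
    restore:   callable that performs the restore.
    """
    # Step 1: the data must already differ from the restore target,
    # otherwise a restore that does nothing would still pass.
    before = sorted(read_rows())
    if before != sorted(expected_before):
        raise AssertionError(f"unexpected rows before restore: {before}")

    # Step 2: run the restore.
    restore()

    # Step 3: the table must now match the earlier version.
    after = sorted(read_rows())
    if after != sorted(expected_after):
        raise AssertionError(f"unexpected rows after restore: {after}")


# Assumed wiring against a real table (not executed here):
#   read_rows = lambda: [tuple(r) for r in
#                        spark.read.format("delta").load(path).collect()]
#   restore = lambda: DeltaTable.forPath(spark, path).restoreToVersion(0)
```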

Contributor Author

Both tests are updated

@dennyglee dennyglee added the enhancement New feature or request label Jan 25, 2022
 * Add possibility to restore delta table using version or timestamp from pyspark
   Examples:
   ```
   DeltaTable.forPath(spark, path).restoreToVersion(0)
   DeltaTable.forPath(spark, path).restoreToTimestamp('2021-01-01 01:01:01')
   ```

Fixes delta-io#890

Signed-off-by: Maksym Dovhal <maksym.dovhal@gmail.com>
@Maks-D Maks-D force-pushed the python_api_for_deltatable_restore branch from 2550587 to f5af1c6 Compare January 26, 2022 18:25
@Maks-D
Contributor Author

Maks-D commented Jan 26, 2022

@vkorukanti Thank you for the review.
I've addressed your comments and moved the refactoring to #912.

allisonport-db pushed a commit that referenced this pull request Feb 4, 2022
 * Add possibility to restore delta table using version or timestamp from pyspark
   Examples:
   ```
   DeltaTable.forPath(spark, path).restoreToVersion(0)
   DeltaTable.forPath(spark, path).restoreToTimestamp('2021-01-01 01:01:01')
   ```

Tested by unit tests.

Fixes #890

Signed-off-by: Maksym Dovhal <maksym.dovhal@gmail.com>

Closes #903

Signed-off-by: Venki Korukanti <venki.korukanti@databricks.com>
GitOrigin-RevId: 8ca6a3643d97b1a95ebf3a48edcb23f4f2adb6f4
jbguerraz pushed a commit to jbguerraz/delta that referenced this pull request Jul 6, 2022
 * Add possibility to restore delta table using version or timestamp from pyspark
   Examples:
   ```
   DeltaTable.forPath(spark, path).restoreToVersion(0)
   DeltaTable.forPath(spark, path).restoreToTimestamp('2021-01-01 01:01:01')
   ```

Tested by unit tests.

Fixes delta-io#890

Signed-off-by: Maksym Dovhal <maksym.dovhal@gmail.com>

Closes delta-io#903

Signed-off-by: Venki Korukanti <venki.korukanti@databricks.com>
GitOrigin-RevId: 8ca6a3643d97b1a95ebf3a48edcb23f4f2adb6f4
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Create Python API for Delta restore command
3 participants