build(deps-dev): bump dbldatagen from 0.3.5 to 0.4.0 #637
Merged
dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels on Jun 10, 2024
Bumps [dbldatagen](https://github.com/databrickslabs/data-generator) from 0.3.5 to 0.4.0.
- [Release notes](https://github.com/databrickslabs/data-generator/releases)
- [Changelog](https://github.com/databrickslabs/dbldatagen/blob/master/CHANGELOG.md)
- [Commits](databrickslabs/dbldatagen@release/v0.3.5...release/v0.4.0)

---
updated-dependencies:
- dependency-name: dbldatagen
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot bot force-pushed the dependabot/pip/dbldatagen-0.4.0 branch from 82f3b2e to 5bf615a on June 10, 2024 08:41
ireneisdoomed
approved these changes
Jun 13, 2024
Constraints are an interesting feature that could be useful in scenarios we currently have:
- When the mock data requires specific conditions at the time of generation.
  - The library already covered examples like this one where we define a list of possible values, but this addition brings much more flexibility.
- When the mock data requires specific conditions at the time of usage in a particular unit test.
  - For example here, where we need a study locus with an empty ldSet.
From their documentation, this is an example of how it works:
import dbldatagen as dg

# `spark` is assumed to be an existing SparkSession.
data_rows = 10000000
dataspec = dg.DataGenerator(spark, rows=data_rows, partitions=8)
dataspec = (
    dataspec.withColumn("name", "string", template=r"\\w \\w|\\w a. \\w")
    .withColumn(
        "product_sku", "string", minValue=1000000, maxValue=1000000 + 1000, prefix="dr", random=True
    )
    .withColumn("email", "string", template=r"\\w.\\w@\\w.com")
    .withColumn("qty_ordered", "int", minValue=1, maxValue=10, distribution="normal", random=True)
    .withColumn("unit_price", "float", minValue=1.0, maxValue=30.0, step=0.01, distribution="normal",
                baseColumn="product_sku", baseColumnType="hash")
    .withColumn("order_ts", "timestamp", begin="2020-01-01 01:00:00",
                end="2020-12-31 23:59:00",
                interval="1 minute", random=True)
    .withColumn("shipping_ts", "timestamp", begin="2020-01-05 01:00:00",
                end="2020-12-31 23:59:00",
                interval="1 minute", random=True, percentNulls=0.1)
    # Keep only rows that were not shipped yet or were shipped after being ordered.
    .withSqlConstraint("shipping_ts is null or shipping_ts > order_ts")
)
df1 = dataspec.build()
I wouldn't spend time changing what we currently have, but it's worth knowing this exists.
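For readers without a Spark session at hand, the effect of the SQL constraint in the example above can be sketched in plain Python. This does not use dbldatagen; it assumes the constraint acts like a filter on already generated rows, and the helper names (`random_ts`, `generate_orders`, `satisfies_constraint`) are hypothetical and exist only for illustration.

```python
import random
from datetime import datetime, timedelta


def random_ts(rng, begin, end):
    """Pick a timestamp on a one-minute grid between begin and end (inclusive)."""
    minutes = int((end - begin).total_seconds() // 60)
    return begin + timedelta(minutes=rng.randint(0, minutes))


def generate_orders(n, seed=0, percent_nulls=0.1):
    """Generate n mock rows with an order timestamp and a nullable shipping timestamp."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        order_ts = random_ts(rng, datetime(2020, 1, 1, 1, 0), datetime(2020, 12, 31, 23, 59))
        if rng.random() < percent_nulls:
            shipping_ts = None  # not shipped yet
        else:
            shipping_ts = random_ts(rng, datetime(2020, 1, 5, 1, 0), datetime(2020, 12, 31, 23, 59))
        rows.append({"order_ts": order_ts, "shipping_ts": shipping_ts})
    return rows


def satisfies_constraint(row):
    """Python equivalent of: shipping_ts is null or shipping_ts > order_ts."""
    return row["shipping_ts"] is None or row["shipping_ts"] > row["order_ts"]


rows = generate_orders(1000)
# Apply the constraint as a filter; rows that violate it are dropped.
kept = [row for row in rows if satisfies_constraint(row)]
print(f"{len(kept)} of {len(rows)} rows satisfy the constraint")
```

If the constraint is indeed applied as a filter, the resulting dataset can contain fewer rows than requested, which is worth keeping in mind when a unit test asserts on row counts.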
project-defiant pushed a commit that referenced this pull request on Jun 14, 2024
Bumps [dbldatagen](https://github.com/databrickslabs/data-generator) from 0.3.5 to 0.4.0.
- [Release notes](https://github.com/databrickslabs/data-generator/releases)
- [Changelog](https://github.com/databrickslabs/dbldatagen/blob/master/CHANGELOG.md)
- [Commits](databrickslabs/dbldatagen@release/v0.3.5...release/v0.4.0)

---
updated-dependencies:
- dependency-name: dbldatagen
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
project-defiant pushed a commit that referenced this pull request on Jul 12, 2024
This reverts commit fd06ae1.
Labels: Build, dependencies, python, size-XS
Commits
- aae8bde Changed release version to be 0.4.0 (#271)
- 4206b5c Feature standard datasets - part 2 (#286)
- da1df6b Feature standard datasets - part 1 (#258)
- 2d51200 Revert "PR To test updates to process (#278)" (#284)
- 2863ac7 PR To test updates to process (#278)
- 8136ccf Feature distribution changes - migrated tests to Pytest, use of abstract base...
- 82ce5ce Feature constraints (#257)
- b28602d Misc doc changes (#268)
- 02d529e changes to actions (#276)
- 2482dca Feature hotfixes (#274)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting
@dependabot rebase