Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Roll our own generate_string() because mimesis' has gone away #13257

Merged

Conversation

shwina
Copy link
Contributor

@shwina shwina commented May 1, 2023

Starting mimesis=9.0.0, the generate_string function has become private:

In [1]: import mimesis

In [2]: mimesis.__version__
Out[2]: '9.0.0'

In [3]: mimesis.random.random.generate_string
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 1
----> 1 mimesis.random.random.generate_string

AttributeError: 'Random' object has no attribute 'generate_string'

In [4]: mimesis.random.random._generate_string
Out[4]: <bound method Random._generate_string of <mimesis.random.Random object at 0x555992f65a20>>

This PR replaces all uses of the function with a homespun one. Note that the implementation is about as fast (perhaps identical?)

In [6]: %timeit "".join(random.choices(string.printable, k=100))
9.25 µs ± 98.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [7]: %timeit mimesis.random.random._generate_string(string.printable, 100)
9.62 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

@shwina shwina requested a review from a team as a code owner May 1, 2023 15:02
@github-actions github-actions bot added the Python Affects Python cuDF API. label May 1, 2023
@shwina shwina added non-breaking Non-breaking change improvement Improvement / enhancement to an existing function labels May 1, 2023
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shwina Great news! I think this means we can drop the mimesis dependency entirely. Can you do that in this PR?

@bdice
Copy link
Contributor

bdice commented May 1, 2023

Note: We may also need to drop this pinning from the integration repository.

@shwina
Copy link
Contributor Author

shwina commented May 1, 2023

@bdice it looks like there are some non-trivial uses for mimesis in our code base (see for example

generator=lambda g: [g.address.city() for _ in range(40)],
) that I don't think we can drop it as a dependency.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shwina Thanks for noting the nontrivial uses of mimesis. I hadn't caught those when reading before because it isn't directly calling mimesis, it's going through some kind of generator API.

However, I wonder if we could move towards cutting those features from our code... it doesn't seem particularly important to have a "city name generator" as a dependency in cudf.

@ttnghia
Copy link
Contributor

ttnghia commented May 1, 2023

/merge

@rapids-bot rapids-bot bot merged commit 392d09b into rapidsai:branch-23.06 May 1, 2023
@bdice bdice mentioned this pull request Jan 8, 2024
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change Python Affects Python cuDF API.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants