[Feat] Support PythonCodesOperator and BashCodesOperator that wrap an existing Python file or code snippets to be executed, such as the existing DJ tools. #412

Open
2 tasks done
yxdyc opened this issue Sep 2, 2024 · 3 comments · May be fixed by #480
Labels
dj:op (issues/PRs about some specific OPs), enhancement (New feature or request)

yxdyc commented Sep 2, 2024

Search before continuing

  • I have searched the Data-Juicer issues and found no similar feature requests.

Description

Users often need to plug in specific Data-Juicer tools, custom functionality that lives in a file such as helper_func.py, or short Python snippets such as a few lambda functions. These rarely justify a dedicated Operator, because each new Operator carries extra overhead: subclassing, documentation, and unit tests.

To improve flexibility and serve these varied needs, it would help to introduce OPs that wrap existing Python files or execute code snippets directly. Users could then extend their data recipe configurations with a wider range of tools and custom code, managed through a streamlined PythonCodesOperator and BashCodesOperator mechanism.
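To make the proposal concrete, here is a minimal sketch of what such operators might look like. Only the class names PythonCodesOperator, BashCodesOperator, and ScriptOP come from this thread. The constructor arguments (`code`, `func_name`, `command`, `text_key`), the `process` method, and the idea that a sample is a plain dict are all assumptions made for illustration; none of this is the Data-Juicer API or the eventual implementation.

```python
import subprocess
from typing import Any, Callable, Dict

Sample = Dict[str, Any]


class ScriptOP:
    """Hypothetical base class: transforms one sample (a dict) at a time."""

    def process(self, sample: Sample) -> Sample:
        raise NotImplementedError


class PythonCodesOperator(ScriptOP):
    """Runs a Python snippet that defines a function named `func_name`."""

    def __init__(self, code: str, func_name: str = "process"):
        namespace: Dict[str, Any] = {}
        # exec() runs arbitrary code, so this is only safe for trusted recipes.
        exec(code, namespace)
        func = namespace.get(func_name)
        if not callable(func):
            raise ValueError(f"snippet does not define a callable {func_name!r}")
        self._func: Callable[[Sample], Sample] = func

    def process(self, sample: Sample) -> Sample:
        return self._func(sample)


class BashCodesOperator(ScriptOP):
    """Pipes one text field through a shell command and stores its stdout."""

    def __init__(self, command: str, text_key: str = "text"):
        self.command = command
        self.text_key = text_key

    def process(self, sample: Sample) -> Sample:
        # shell=True is assumed here so that pipelines work; trusted input only.
        result = subprocess.run(
            self.command,
            shell=True,
            input=sample[self.text_key],
            capture_output=True,
            text=True,
            check=True,
        )
        # Return a copy with the text field replaced; other fields are kept.
        return {**sample, self.text_key: result.stdout}


if __name__ == "__main__":
    snippet = (
        "def process(sample):\n"
        "    sample['text'] = sample['text'].strip().lower()\n"
        "    return sample\n"
    )
    op = PythonCodesOperator(snippet)
    print(op.process({"text": "  Hello World  "}))  # {'text': 'hello world'}
```

A design like this keeps the per-snippet cost low: the user supplies a string (or a file's contents) in the recipe and no new subclass is needed. The trade-off is that both `exec` and `shell=True` execute untrusted input verbatim, so a real implementation would likely need some form of sandboxing or an explicit opt-in.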

Use case

No response

Additional information

No response

Are you willing to submit a PR for this feature?

  • Yes, I'd like to help by submitting a PR!
yxdyc added the enhancement (New feature or request) label Sep 2, 2024

This issue is marked as stale because there has been no activity for 21 days. Remove the stale label or add a new comment, or this issue will be closed in 3 days.


Close this stale issue.

github-project-automation bot moved this from Todo to Done in data-juicer Sep 28, 2024
HYLcool reopened this Oct 31, 2024
github-project-automation bot moved this from Done to In Progress in data-juicer Oct 31, 2024

yxdyc commented Nov 4, 2024

WIP: adding a basic abstraction and implementation for ScriptOP, plus some leaf classes such as PythonCodesOperator and BashCodesOperator. @drcege @BeachWang

yxdyc added the dj:op (issues/PRs about some specific OPs) label and removed the stale-issue label Nov 4, 2024
drcege linked a pull request Nov 11, 2024 that will close this issue
5 participants