Changes for orcacode experiment #3612
Besides this, I think there is a bug in this line: the config cannot be updated. I removed it in my version. Can you check that as part of this PR too? @andreaskoepf
@@ -199,45 +198,37 @@ def __getitem__(self, idx):
 class DolphinMix(Dataset):
     name = "dophin-mix"

-    def __init__(self, cache_dir, num_samples=100000, max_char_len=8000, seed=42):
+    def __init__(self, cache_dir, num_samples: Optional[int] = None, max_char_len: int = 8000, seed: int = 42):
Wouldn't it be better to add `data_files` as an argument with the default value "flan-5m..."? That way, we can switch to the GPT-4 version through the config if required.
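A minimal sketch of what this suggestion could look like, under stated assumptions: `DolphinMixSketch` is a stand-in class, not the PR's code, and the `loader` hook is hypothetical, added only so the example runs without downloading anything. The default `"flan-5m..."` is copied as written in the comment above; the full file name is truncated there and is not filled in here. The filtering and sampling behavior is likewise assumed for illustration.

```python
import random
from typing import Callable, List, Optional


class DolphinMixSketch:
    """Illustrative stand-in for DolphinMix; not the implementation in this PR."""

    name = "dophin-mix"  # identifier copied verbatim from the diff

    def __init__(
        self,
        cache_dir: str,
        num_samples: Optional[int] = None,
        max_char_len: int = 8000,
        seed: int = 42,
        # Default copied from the review comment, where it is truncated.
        data_files: str = "flan-5m...",
        # Hypothetical hook: loader(data_files, cache_dir) -> list of texts.
        loader: Optional[Callable[[str, str], List[str]]] = None,
    ):
        self.data_files = data_files
        rows = loader(data_files, cache_dir) if loader is not None else []
        # Assumed behavior: drop rows longer than max_char_len.
        rows = [r for r in rows if len(r) <= max_char_len]
        # Assumed behavior: num_samples=None keeps everything,
        # otherwise take a seeded random subset.
        if num_samples is not None and num_samples < len(rows):
            rows = random.Random(seed).sample(rows, num_samples)
        self.rows = rows

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, idx: int) -> str:
        return self.rows[idx]
```

With a signature like this, a dataset config entry could override `data_files` to select a different file, while existing configs keep the default.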
Can you remove the `update_config` function definition too, since it's not used?
Nice, thanks.