Replies: 6 comments 1 reply
-
Yes, this would be helpful. AutoTrain seems appealing, but the lack of documentation, how-to videos, and basic guides blows my mind.
-
Does this help: https://huggingface.co/blog/abhishek/phi3-finetune-macbook?
-
Yeah, not really. Same old, same old. Even at the bottom, where it says "extensive" documentation can be found here (link to the Hugging Face documentation), it really is just a couple of pages explaining what AutoTrain is and how much it costs. I wouldn't exactly call that "extensive". A Colab dataset validator with some code to detect issues, and possibly offer solutions for correcting them, would be fantastic. I've looked, but can't find anything. So, as it stands, the process is:
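For what it's worth, the dataset validator described above could start very small. Here is a minimal sketch in pure Python (no AutoTrain dependency; the `messages`/`role`/`content` schema is an assumption for illustration, not AutoTrain's actual column layout, so adjust the keys to your dataset):

```python
import json

# Hypothetical validator for a chat-style JSONL dataset.
# Assumes each line is a JSON object with a "messages" list of
# {"role": ..., "content": ...} dicts -- adjust to your schema.
VALID_ROLES = {"system", "user", "assistant"}

def validate_record(line_no, record):
    """Return a list of human-readable problems for one record."""
    problems = []
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return [f"line {line_no}: missing or empty 'messages' list"]
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in VALID_ROLES:
            problems.append(f"line {line_no}, message {i}: bad role {role!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"line {line_no}, message {i}: empty content")
    return problems

def validate_jsonl(path):
    """Validate a whole .jsonl file; returns (num_records, problems)."""
    problems, count = [], 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue
            count += 1
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append(f"line {line_no}: invalid JSON ({e.msg})")
                continue
            problems.extend(validate_record(line_no, record))
    return count, problems
```

Something like this, dropped into a Colab cell, would at least surface malformed rows with line numbers before a training job burns credits on them.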
-
@splanker I read your rant :D Thanks! That means more things need to be improved.
-
@abhishekkrthakur -- Appreciate your efforts to create open-source tools. AutoTrain makes fine-tuning of LLMs a bit less intimidating. I have 2 questions:

Question 1: Could you provide a reference table along these lines?

chat-template parameter value | Example dataset | Training method | Applicable open LLMs

Question 2: When chat-template = "tokenizer", is the functionality similar to using tokenizer.apply_chat_template(.., .., tokenize=False) to convert the prompt to an appropriate format for the LLM?

Thanks in advance.
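On Question 2: I can't speak for AutoTrain's internals, but conceptually `apply_chat_template(..., tokenize=False)` just renders the messages list into the model's prompt string instead of token IDs. A pure-Python sketch of roughly what a ChatML-style template expands to (the function name is made up for illustration, and the exact string depends on the model's template):

```python
def render_chatml(messages, add_generation_prompt=False):
    """Roughly what a ChatML chat template renders a messages list into."""
    text = ""
    for msg in messages:
        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Leave the prompt open for the assistant's reply.
        text += "<|im_start|>assistant\n"
    return text

messages = [{"role": "user", "content": "What is AutoTrain?"}]
print(render_chatml(messages, add_generation_prompt=True))
```

So yes, as I understand it, both paths end at the same kind of formatted string; with `tokenize=False` you simply get that string back for inspection.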
-
@abhishekkrthakur - Can you please provide feedback on the above questions?
-
Can you please explain when to use each of the following chat_template parameter values: tokenizer, chatml, zephyr, None?
I am working to prepare the dataset for fine-tuning Llama-3-8B-Instruct.
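One way to see the difference between the options: `chatml` and `zephyr` apply a fixed generic template, while `tokenizer` uses whatever chat template ships with the model's own tokenizer config, which for Llama-3-8B-Instruct is its header/eot format, not ChatML. A rough side-by-side sketch (the rendered strings below are my understanding of the two formats, so double-check them against `tokenizer.apply_chat_template` on the actual tokenizer):

```python
def render_chatml(messages):
    # Generic ChatML template (roughly what chat_template="chatml" targets).
    return "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )

def render_llama3(messages):
    # Llama-3 Instruct's native format (roughly what chat_template="tokenizer"
    # would pick up from the Llama-3-8B-Instruct tokenizer).
    text = "<|begin_of_text|>"
    for m in messages:
        text += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    return text

msgs = [{"role": "user", "content": "Hello"}]
print(render_chatml(msgs))
print(render_llama3(msgs))
```

On that reading, `None` would presumably mean your text column is already fully formatted and should be used verbatim; for an instruct model like Llama-3-8B-Instruct, `tokenizer` seems the safest choice since it matches the format the model was trained on.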