[User] train-text-from-scratch.exe stops before "begin training" #2131

Closed
SkibaSAY opened this issue Jul 7, 2023 · 5 comments

SkibaSAY commented Jul 7, 2023

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

The command should finish after training completes, and the trained model should be saved.

Current Behavior

Execution completes before training starts; there are no errors and no trained model is produced.

Environment and Context

Windows 11 Pro (version 21H2)
Intel(R) Core(TM) i5-12600KF @ 3.70 GHz
RAM: 32.0 GB
CUDA 12.0

$ Python 3.10.11
$ cmake version 3.27.0-rc3

Failure Information (for bugs)

It looks like #1869.
I checked; my version already contains the fix suggested there (commit 5ec8dd5).

Steps to Reproduce

1. I pulled the latest master (master-481f793).
2. Ran train-text-from-scratch, but the application terminates without an error before reaching "begin training":

(from llama.cpp): build\bin\Release\train-text-from-scratch.exe --vocab-model models\ggml-vocab.bin --checkpoint-in chk-lamartine-256x16.bin --checkpoint-out chk-lamartine-256x16.bin --model-out ggml-lamartine-265x16-f32.bin --train-data "shakespeare.txt"

3. Execution ends before reaching line 3195, printf("%s: begin training\n", __func__);, in train-text-from-scratch.cpp.

Failure Logs

D:\torrents\LlamaCppTest\llama.cpp>build\bin\Release\train-text-from-scratch.exe --vocab-model models\ggml-vocab.bin --checkpoint-in chk-lamartine-256x16.bin --checkpoint-out chk-lamartine-256x16.bin --model-out ggml-lamartine-265x16-f32.bin --train-data "shakespeare.txt"
main: seed: 1688730007
llama.cpp: loading model from models\ggml-vocab.bin
llama_model_load_internal: format = ggjt v1 (pre #1405)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 1 (mostly F16)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: model size = 7B
main: tokenize training data
main: number of training tokens: 27584
print_params: n_vocab: 32000
print_params: n_ctx: 128
print_params: n_embd: 256
print_params: n_mult: 256
print_params: n_head: 8
print_params: n_ff: 768
print_params: n_layer: 16
print_params: n_rot: 32
main: number of unique tokens: 3070
main: init model
load_checkpoint: Training iterations: 0.
load_checkpoint: Training samples: 0.
load_checkpoint: Training tokens: 0.
main: opt iter 0
used_mem model+cache: 244466304 bytes

D:\torrents\LlamaCppTest\llama.cpp>

My task (I need advice or an opinion on this; anything will be useful)

I need to fine-tune a model on my own data: I have a set of HTML files from which I need to extract information about the winner of tender purchases. Parsing tools are ineffective because the HTML is generated from arbitrary documents, and we know nothing about their structure in advance. I have a set of examples for which I know the correct answer; using these examples I want to train a new model or fine-tune the existing openassistant-llama-30b-ggml-q5_1.bin.

If you have any tips or information on how I can fine-tune an existing model on my data, please share them. Thank you.
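
For illustration, one possible shape for this data-preparation step (a sketch, not from the issue: the pages/ directory, winners.json answer file, and record layout are hypothetical, and BeautifulSoup is assumed for tag stripping since the HTML structure is unknown):

```python
# Sketch: pair each HTML document's plain text with its known answer,
# producing one JSON record per line for later fine-tuning.
import json
from pathlib import Path

from bs4 import BeautifulSoup  # pip install beautifulsoup4


def html_to_text(path: Path) -> str:
    # Strip all tags: the markup carries no reliable structure here,
    # so the model is trained on plain text only.
    soup = BeautifulSoup(path.read_text(encoding="utf-8"), "html.parser")
    return " ".join(soup.get_text(separator=" ").split())


# winners.json (hypothetical): {"page1.html": "ACME Ltd", ...}
answers = json.loads(Path("winners.json").read_text(encoding="utf-8"))

with Path("pairs.jsonl").open("w", encoding="utf-8") as out:
    for name, winner in answers.items():
        record = {"text": html_to_text(Path("pages") / name), "winner": winner}
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
```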


mqy commented Jul 9, 2023

I checked out commit 481f793, rebuilt on macOS:

rm -r build/*
cd build
cmake ..
cmake --build . --config Release
./bin/train-text-from-scratch  --vocab-model ../models/ggml-vocab.bin   --checkpoint-in  chk-shakespeare-256x16.bin  --checkpoint-out chk-shakespeare-256x16.bin --model-out ggml-shakespeare-256x16-f32.bin --train-data "../models/shakespeare.txt"

...
used_mem model+cache: 244466304 bytes
main: begin training
main: opt->params.adam.sched 0.00000

Suggest you double check that:

  1. You can run main.
  2. shakespeare.txt exists in llama.cpp/ and is not empty. The program aborts with an error if the train file does not exist.

Finally, try a clean rebuild. (A quick pre-flight check for the input files is sketched below.)
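
A minimal sketch of that pre-flight check (file names taken from the commands in this thread):

```python
# Sketch: verify the training data and vocab model exist and are
# non-empty before launching train-text-from-scratch.
from pathlib import Path

for f in (Path("shakespeare.txt"), Path("models/ggml-vocab.bin")):
    size = f.stat().st_size if f.exists() else 0
    print(f"{f}: {'OK' if size > 0 else 'MISSING OR EMPTY'} ({size} bytes)")
```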


SkibaSAY commented Jul 10, 2023

Thank you! After rebuilding the project the problem is gone and training is running =)

I think the problem was that I had skipped this command: cmake --build . --config Release

@SkibaSAY

If it's not too much trouble, could you tell me: do you know how to further train an existing model using llama.cpp?

As I understand it, train-text-from-scratch is not suitable, since it creates a model from scratch.


mqy commented Jul 10, 2023

> If it's not too much trouble, could you tell me: do you know how to further train an existing model using llama.cpp?

llama.cpp targets inference.

@SkibaSAY

OK, I found information about training my own model; maybe it will be useful to someone:
https://tproger.ru/articles/kak-sozdat-prilozhenie-s-nejrosetyu-na-baze-llm-alpaca-bystro-i-prosto/
https://github.com/tatsu-lab/stanford_alpaca/tree/main
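
For anyone following the same route: stanford_alpaca fine-tunes on a JSON list of instruction/input/output records (its alpaca_data.json format), so pairs like the ones sketched earlier in this thread could be converted along these lines (a sketch; pairs.jsonl and the output file name are hypothetical):

```python
# Sketch: convert (document text, known winner) pairs into the
# instruction/input/output record list that stanford_alpaca's
# training script consumes.
import json
from pathlib import Path

records = []
with Path("pairs.jsonl").open(encoding="utf-8") as f:
    for line in f:
        pair = json.loads(line)
        records.append({
            "instruction": "Extract the winner of the tender purchase from the document.",
            "input": pair["text"],
            "output": pair["winner"],
        })

Path("my_alpaca_data.json").write_text(
    json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
)
```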

I hope I won't be banned for the links; I want to help anyone else digging in this direction.
Thanks for the help =)
