add Alpaca #191
Conversation
model.eval()
if os.environ.get("TORCH_COMPILE") == "1":
    aio = '_aio_profiler_print' in dir(torch._C)
    model = apply_compile(model, aio)
maybe let's change that function to torch_compile_maybe() and check for TORCH_COMPILE==1 there?
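For illustration, a rough sketch of that shape (apply_compile is the PR's existing helper and is assumed to be importable from its utils; the function name is just the suggestion above):

```python
import os
import torch

def torch_compile_maybe(model):
    # No-op unless the user opted in via TORCH_COMPILE=1
    if os.environ.get("TORCH_COMPILE") != "1":
        return model
    # Same AIO availability check as in the snippet above
    aio = '_aio_profiler_print' in dir(torch._C)
    # apply_compile is the helper already defined in this PR
    return apply_compile(model, aio)
```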
def run_single_pass(pytorch_runner, dataset):
    inputs = encode(dataset.get_input_array())
    outputs = pytorch_runner.run(1, inputs=inputs.input_ids, max_new_tokens=100)
So, I think the number of tokens differs across runs? I.e. the model outputs 0 <= n <= 100 tokens.
This creates a need to reimplement our functions again, as the run() call is now expected to receive a variable workload length (in the place where you have hard-coded 1 at the moment). With this kind of model we won't know the magnitude of the workload before actually calling run(), so we need to add a way to pass the length of the generated output to the runner somehow post run.
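For illustration, a hedged sketch of what reporting the real size post run could look like in run_single_pass. set_task_size is the hypothetical post-run hook discussed below, and the token count assumes the runner returns a generate-style tensor of shape (batch, sequence_length):

```python
def run_single_pass(pytorch_runner, dataset):
    inputs = encode(dataset.get_input_array())
    # Placeholder task size of 1; the real number of generated tokens is not known yet
    outputs = pytorch_runner.run(1, inputs=inputs.input_ids, max_new_tokens=100)
    # The model may emit anywhere from 0 to 100 new tokens, so report the actual
    # workload size back to the runner once generation has finished.
    num_new_tokens = outputs.shape[1] - inputs.input_ids.shape[1]
    pytorch_runner.set_task_size(num_new_tokens)
```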
Otherwise we will have an inaccurate throughput calculation.
utils/pytorch.py
Outdated
A function replacing the last value in the self._workload_size with the new_task_size.
Useful for models where the task_size is unknown before finishing the run (eg. text generation models where the number of tokens generated has an upper bound but can be lower)
"""
self._workload_size[-1] = new_task_size
Works, but the overall mechanics are not too intuitive now.
I would change this a bit so that:
- the task_size arg in the runner.run() function has a default value of None
- if the task size is not specified at the runner.run() stage (the task size value remains None) and it is then not set by this function either, and execution proceeds to the next run() call, an error is thrown. I.e. the task size has to be set either prior to or just after each call.
- runner.run() uses this function as well underneath, passing the task size value to it for the convenience of the user. Setting the task size value both through runner.run(task_size=n) and then through a direct declare_task_size(task_size=n) call (my suggested naming in place of update_task_size()) results in an error.
- all runners (tf, torch, ...) should have this functionality - maybe it can be implemented at the root class level? (a rough sketch follows below)
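A minimal sketch of that flow at the base class level, assuming "prior" means passing task_size to run() itself; the names (Runner, declare_task_size, _run_impl) and the pending flag are placeholders, not the library's actual API:

```python
import time

class Runner:
    # Sketch only - framework runners (torch, tf, ort, ...) would subclass this
    def __init__(self):
        self._start_times = []
        self._finish_times = []
        self._workload_size = []
        self._task_size_pending = False  # True while the latest run still awaits its size

    def declare_task_size(self, task_size):
        if not self._task_size_pending:
            raise RuntimeError("Task size for the latest run has already been set")
        self._workload_size.append(task_size)
        self._task_size_pending = False

    def run(self, task_size=None, *args, **kwargs):
        # Refuse to start a new run while the previous one has no declared size
        if self._task_size_pending:
            raise RuntimeError("Task size of the previous run has not been declared")
        start = time.time()
        output = self._run_impl(*args, **kwargs)
        finish = time.time()
        self._start_times.append(start)
        self._finish_times.append(finish)
        self._task_size_pending = True
        # Convenience path: run(task_size=n) declares the size itself; declaring it
        # again afterwards via declare_task_size() then raises.
        if task_size is not None:
            self.declare_task_size(task_size)
        return output

    def _run_impl(self, *args, **kwargs):
        raise NotImplementedError
```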
How I see this is:
- by default, run() has task_size=None, as it already does
- if it is None, then the run() function doesn't append anything to the workload_size array - why maintain a list of something redundant and then write expressions to check whether it is an array of legit values or not, instead of just relying on its length
- if a task size value, a positive int, has been provided to run() - append it to workload_size
- when calculating perf metrics, check that the list of finish/start times is equal in length to workload_size; if your concern is that somebody will waste time waiting for results until the very end only to learn that the run failed because of that, then check after each run that workload_size has been updated either manually or by the run() func (a sketch of this follows below)
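I.e. roughly this, as a sketch of the run() side plus the metrics-time check; _run_impl stands in for the framework-specific call and calculate_throughput is an illustrative name, not the repo's actual method:

```python
def run(self, task_size=None, *args, **kwargs):
    start = time.time()
    outputs = self._run_impl(*args, **kwargs)
    finish = time.time()
    self._start_times.append(start)
    self._finish_times.append(finish)
    # Append only real values - no None placeholders to filter out later
    if task_size is not None:
        assert isinstance(task_size, int) and task_size > 0
        self._workload_size.append(task_size)
    return outputs

def calculate_throughput(self):
    # Every finished run must have a recorded size by the time metrics are computed,
    # either via run(task_size=n) or via a post-run set_task_size(n) call
    assert len(self._workload_size) == len(self._finish_times), \
        "workload size missing for some runs"
    total_time = sum(f - s for f, s in zip(self._finish_times, self._start_times))
    return sum(self._workload_size) / total_time
```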
utils/benchmark.py
Outdated
A function appending new_task_size to the self._workload_size if not set yet or replacing the last value with the new_task_size if the last value is None.
Useful for models where the task_size is unknown before finishing the run (eg. text generation models where the number of tokens generated has an upper bound but can be lower)
"""
if len(self._finish_times) > len(self._workload_size):
Maybe it would be safer to just look for an exact difference of 1, which is what is expected.
If you did this by design so that the user can set an arbitrary number of task sizes lower than the number of runs completed at any moment, then maybe a better name for the function would be push_task_size().
@@ -34,7 +35,7 @@ def run(self, task_size, *args, **kwargs):

        self._start_times.append(start)
        self._finish_times.append(finish)
-       self._workload_size.append(task_size)
+       self.set_task_size(task_size)
If task_size is None, maybe don't push it at all and expect the user to take care of things manually.
utils/ctranslate.py
Outdated
@@ -25,7 +25,8 @@ def __init__(self, model, tokenizer, compute_type):

        print("\nRunning with CTranslate2\n")

-   def run(self, task_size, *args, **kwargs):
+   def run(self, task_size=None, *args, **kwargs):
+       assert len(self._workload_size) == 0 or self._workload_size[-1] is not None, "Task size for previous run has not been set"
Ok, so actually you don't expect the user to be able to push multiple task sizes post inference (re. "Maybe it would be safer to just look for an exact difference of 1, which is what is expected.").
Example command for PyTorch:

```
TORCH_COMPILE=1 AIO_SKIP_MASTER_THREAD=1 python3 run.py -m path/to/alpaca_recovered
```
Can we somehow help the user obtain the actual model?
OSError: LLaMA does not appear to have a file named config.json. Checkout 'https://huggingface.co/LLaMA/None' for available files.
root@78e24df80b17:/ampere/ampere_model_library/natural_language_processing/text_generation/alpaca# ls LLaMA
checklist.chk params.json tokenizer_checklist.chk
consolidated.00.pth tokenizer.model
utils/benchmark.py
Outdated
A function appending new_task_size to the self._workload_size if not set yet or replacing the last value with the new_task_size if the last value is None.
Useful for models where the task_size is unknown before finishing the run (eg. text generation models where the number of tokens generated has an upper bound but can be lower)
"""
if len(self._finish_times) - len(self._workload_size) == 1:
Currently this is vulnerable to the wrong order of execution. Why not also allow the task size to be set before the given execution?
I.e.:
assert abs(len(self._finish_times) - len(self._workload_size)) == 1
self._workload_size.append(new_task_size)
Please describe the process of running the model so that it's enough to run your snippets to get the model to work.
We also need the already converted model in our S3 for mzd use.
```bash
git clone https://huggingface.co/tatsu-lab/alpaca-7b-wdiff
cd alpaca-7b-wdiff
git lfs install
```
git lfs install
git: 'lfs' is not a git command. See 'git --help'.
Maybe add some hint on installing git-lfs first?
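For example, something along these lines (assuming a Debian/Ubuntu-based environment; other distros ship their own git-lfs package):

```bash
# install the git-lfs extension first, then enable it for the current user
sudo apt-get update && sudo apt-get install -y git-lfs
git lfs install
```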
3. Convert the LLaMA weights to HuggingFace format
```bash
git clone https://github.com/huggingface/transformers.git
python transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir <path_to_llama> --model_size 7B --output_dir <output_path>
```
RuntimeError: Failed to import transformers.models.llama.modeling_llama because of the following error (look up to see its traceback):
No module named 'safetensors'
python3 alpaca-7b-wdiff/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir alpaca_recovered --model_size 7B --output_dir output
FileNotFoundError: [Errno 2] No such file or directory: 'alpaca_recovered/7B/params.json'
Are you sure you're passing the correct path to this script? convert_llama_weights_to_hf.py is used to convert the original weights from Meta (the ones you have to request access to in order to download them). The params.json file should be there.
I can see that you have alpaca_recovered in your path, which suggests to me that you used this script on the final Alpaca weights (converted from the original weights to HF format and merged with the weight diff from Stanford). You should be able to use these weights to run inference with the run.py script without any further changes.
tl;dr Are you trying to run step 3 with weights that are the output of step 4?
Please add transformers with Karol's branch as a submodule so that the AIO-compatible version is used.
utils/benchmark.py
Outdated
A function appending new_task_size to the self._workload_size if not set yet or replacing the last value with the new_task_size if the last value is None.
Useful for models where the task_size is unknown before finishing the run (eg. text generation models where the number of tokens generated has an upper bound but can be lower)
"""
assert abs(len(self._finish_times) - len(self._workload_size)) == 1
This will throw an assertion error when setting the task size prior to execution. Make it so that one can use it either just before or just after a single inference pass.
utils/benchmark.py
Outdated
@@ -116,6 +116,14 @@ def abort_maybe(self):
def run(self, task_size: int, *args, **kwargs):
    raise NotImplementedError

def set_task_size(self, new_task_size):
    """
    A function appending new_task_size to the self._workload_size if not set yet or replacing the last value with the new_task_size if the last value is None.
I guess this is no longer an accurate description of what this function is doing.
utils/ort.py
Outdated
start = time.time()
outputs = self.session.run(self._output_names, self._feed_dict)
finish = time.time()

self._start_times.append(start)
self._finish_times.append(finish)
- self._workload_size.append(task_size)
+ if task_size is not None:
Maybe it would be better to handle this inside the set_task_size function? I.e. when passed None, ignore it?
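I.e. something like this sketch, which keeps the None check out of every framework runner (the assert is the one from the excerpt below):

```python
def set_task_size(self, new_task_size):
    # Ignore None so runners can call this unconditionally right after a run;
    # generative workloads then declare the real size later.
    if new_task_size is None:
        return
    assert abs(len(self._finish_times) - len(self._workload_size)) == 1
    self._workload_size.append(new_task_size)
```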
""" | ||
assert abs(len(self._finish_times) - len(self._workload_size)) == 1 | ||
self._workload_size.append(new_task_size) | ||
|
After appending, check whether len(self._finish_times) - len(self._workload_size) is in [-1, 0], i.e. whether workload_size is being kept on track with inference progress.
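Concretely, a sketch of that check (it allows declaring the size either just before or just after a run):

```python
def set_task_size(self, new_task_size):
    self._workload_size.append(new_task_size)
    # Sizes may run at most one entry ahead of the finished runs (declared up front)
    # and must never lag behind them.
    assert len(self._finish_times) - len(self._workload_size) in [-1, 0], \
        "workload size tracking is out of sync with completed runs"
```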
Is this ready for the re-review?
.gitmodules
Outdated
@@ -16,3 +16,7 @@
[submodule "speech_recognition/whisper/whisper"]
    path = speech_recognition/whisper/whisper
    url = https://github.com/AmpereComputingAI/whisper.git
[submodule "utils/nlp/transformers"]
    path = utils/nlp/transformers
This branch of transformers is dedicated to llama only, so maybe a better place for it would be nlp/text_gen/alpaca/.
We're pretty inconsistent in that matter. We have nnUNet, dlrm and DeepCTR submodules in their respective utils/<task> directories, but the segment_anything, nanoGPT and whisper submodules are in the <task>/<model> directories.
That depends on whether the datasets make use of the submodule facilities or just the model runner.
Yes, it's ready for re-review.
Seems like the transformers version mismatch between our submodule and the PyPI one will give us a hard time once we have to move to a >4.32.x version, but let's worry about that later.