Error running inference on Llama3 model #4000

Open
vinven7 opened this issue Apr 24, 2024 · 5 comments
Labels
bug Something isn't working llm Large Language Model related

Comments

@vinven7

vinven7 commented Apr 24, 2024

When I run inference on a Llama3 model finetuned using Ludwig, I keep getting this error:

set_cols, feature, missing_value_strategy, computed_fill_value, backend)
   1756         logger.warning(
   1757             f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
   1758             f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
   1759             f"samples will ultimately be dropped from the dataset."
   1760         )
   1761 else:
-> 1762     raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")

ValueError: Invalid missing value strategy fill_with_const

Here is my training script:

import logging

import yaml
from ludwig.api import LudwigModel

qlora_fine_tuning_config = yaml.safe_load(
"""
  model_type: llm
  base_model:   meta-llama/Meta-Llama-3-8B-Instruct


  input_features:
    - name: Prompt
      type: text
      preprocessing:
           max_sequence_length: 256

  output_features:
    - name: Response
      type: text
      preprocessing:
           max_sequence_length: 150

  prompt:
    template: >-

      ### Prompt: {Prompt}

      ### responses : 
      
  
  quantization:
    bits: 4

  generation:
    temperature: 0.1
    max_new_tokens: 150
    
  preprocessing:
    split:
       probabilities:
        - 1.0
        - 0.0
        - 0.0
  
  adapter:
    type: lora

  trainer:
    type: finetune
    epochs: 10
    batch_size: 1
    eval_batch_size: 1
    enable_gradient_checkpointing: true
    gradient_accumulation_steps: 16
    learning_rate: 0.00001
    optimizer:
      type: paged_adam
      params:
        eps: 1.e-8
        betas:
          - 0.9
          - 0.999
        weight_decay: 0
    learning_rate_scheduler:
      warmup_fraction: 0.03
      reduce_on_plateau: 0
  """
  )

new_model = LudwigModel(config=qlora_fine_tuning_config, logging_level=logging.INFO)
results = new_model.train(dataset=train_df)

And for inference:

new_model.predict(_test_df.loc[0:1])

Here is the full trace:

     4 def predict(index):
----> 5         test_predictions = new_model.predict(_test_df.loc[index:index])[0]
      7         completion = oclient.chat.completions.create(
      8         model="gpt-3.5-turbo",
      9         temperature = 0.1,
   (...)
     42         ]
     43         )
     44         results = completion.choices[0].message.content

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/api.py:1141, in LudwigModel.predict(self, dataset, data_format, split, batch_size, generation_config, skip_save_unprocessed_output, skip_save_predictions, output_directory, return_type, callbacks, **kwargs)
   1139 start_time = time.time()
   1140 logger.debug("Preprocessing")
-> 1141 dataset, _ = preprocess_for_prediction(  # TODO (Connor): Refactor to use self.config_obj
   1142     self.config_obj.to_dict(),
   1143     dataset=dataset,
   1144     training_set_metadata=self.training_set_metadata,
   1145     data_format=data_format,
   1146     split=split,
   1147     include_outputs=False,
   1148     backend=self.backend,
   1149     callbacks=self.callbacks + (callbacks or []),
   1150 )
   1152 logger.debug("Predicting")
   1153 with self.backend.create_predictor(self.model, batch_size=batch_size) as predictor:

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:2334, in preprocess_for_prediction(config, dataset, training_set_metadata, data_format, split, include_outputs, backend, callbacks)
   2332         training_set, test_set, validation_set, training_set_metadata = processed
   2333 else:
-> 2334     processed = data_format_processor.preprocess_for_prediction(
   2335         config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks
   2336     )
   2337     dataset, training_set_metadata, new_hdf5_fp = processed
   2338     training_set_metadata = training_set_metadata.copy()

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:276, in DataFramePreprocessor.preprocess_for_prediction(config, dataset, features, preprocessing_params, training_set_metadata, backend, callbacks)
    273 if isinstance(dataset, pd.DataFrame):
    274     dataset = backend.df_engine.from_pandas(dataset)
--> 276 dataset, training_set_metadata = build_dataset(
    277     config,
    278     dataset,
    279     features,
    280     preprocessing_params,
    281     mode="prediction",
    282     metadata=training_set_metadata,
    283     backend=backend,
    284     callbacks=callbacks,
    285 )
    286 return dataset, training_set_metadata, None

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1271, in build_dataset(config, dataset_df, features, global_preprocessing_parameters, mode, metadata, backend, random_seed, skip_save_processed_input, callbacks)
   1269 for feature_config in feature_configs:
   1270     preprocessing_parameters = feature_name_to_preprocessing_parameters[feature_config[NAME]]
-> 1271     handle_missing_values(dataset_cols, feature_config, preprocessing_parameters, backend)
   1273 # Happens after missing values are handled to avoid NaN casting issues.
   1274 logger.debug("cast columns")

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1703, in handle_missing_values(dataset_cols, feature, preprocessing_parameters, backend)
   1701 missing_value_strategy = preprocessing_parameters["missing_value_strategy"]
   1702 computed_fill_value = preprocessing_parameters.get("computed_fill_value")
-> 1703 _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)

File ~/anaconda3/envs/vineeth_10/lib/python3.10/site-packages/ludwig/data/preprocessing.py:1762, in _handle_missing_values(dataset_cols, feature, missing_value_strategy, computed_fill_value, backend)
   1756         logger.warning(
   1757             f"DROP_ROW missing value strategy applied. Dropped {len_before_dropped_rows - len_after_dropped_rows} "
   1758             f"samples out of {len_before_dropped_rows} from column {feature[COLUMN]}. The rows containing these "
   1759             f"samples will ultimately be dropped from the dataset."
   1760         )
   1761 else:
-> 1762     raise ValueError(f"Invalid missing value strategy {missing_value_strategy}")

ValueError: Invalid missing value strategy fill_with_const
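
The last two frames read `missing_value_strategy` and `computed_fill_value` from each feature's preprocessing parameters, so it may be worth checking what those hold at prediction time. The following is only a diagnostic sketch, not a confirmed explanation of the error: it assumes the metadata is a dict shaped like `{feature_name: {"preprocessing": {...}}}`, and the helper name `features_missing_fill_value` is hypothetical (it is not part of Ludwig).

```python
from typing import Any, Dict, List

# Prefix shared by the "fill_with_*" strategy names, such as the
# "fill_with_const" value that appears in the error message.
FILL_PREFIX = "fill_with_"


def features_missing_fill_value(metadata: Dict[str, Any]) -> List[str]:
    """Return features that use a fill_with_* strategy but have no fill value.

    Assumes (not verified against Ludwig) that `metadata` maps each feature
    name to a dict containing a "preprocessing" sub-dict.
    """
    flagged = []
    for name, entry in metadata.items():
        # Skip entries that are not per-feature dicts.
        if not isinstance(entry, dict):
            continue
        params = entry.get("preprocessing")
        if not isinstance(params, dict):
            continue
        strategy = str(params.get("missing_value_strategy", ""))
        # Same two keys that handle_missing_values reads in the traceback.
        if strategy.startswith(FILL_PREFIX) and params.get("computed_fill_value") is None:
            flagged.append(name)
    return flagged


# Made-up metadata, for illustration only.
example_metadata = {
    "Prompt": {
        "preprocessing": {
            "missing_value_strategy": "fill_with_const",
            "computed_fill_value": None,
        }
    },
    "Response": {
        "preprocessing": {
            "missing_value_strategy": "fill_with_const",
            "computed_fill_value": "<UNK>",
        }
    },
}
print(features_missing_fill_value(example_metadata))
```

With a trained model, the same check could be tried as `features_missing_fill_value(new_model.training_set_metadata)`; if the metadata turns out to be laid out differently, the helper simply returns an empty list.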

-3.0

  • 0.10.3
@aliarabat

I encountered the same error as reported above

@vinven7
Author

vinven7 commented May 5, 2024

I didn't quite solve the problem, but I found a workaround:
since the model completed fine-tuning,
I loaded the model into Hugging Face and then ran inference on it (base model + fine-tuned PEFT adapter).
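
A workaround along these lines could be sketched as follows, using `transformers` and `peft` directly. This is a minimal sketch under assumptions, not the exact code used in this thread: `ADAPTER_PATH` is a hypothetical placeholder for wherever the fine-tuned adapter weights were saved, the 4-bit loading mirrors the `quantization: bits: 4` setting from the training config, and `build_prompt` only approximates the prompt template (the exact whitespace Ludwig produces may differ).

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # base_model from the training config
ADAPTER_PATH = "path/to/adapter"  # hypothetical: directory holding the saved adapter


def build_prompt(prompt: str) -> str:
    """Approximate the prompt template from the training config."""
    return f"### Prompt: {prompt}\n\n### responses : "


def load_finetuned(base_model: str = BASE_MODEL, adapter_path: str = ADAPTER_PATH):
    """Load the 4-bit base model and attach the fine-tuned PEFT adapter."""
    # Imported lazily so the heavy dependencies are only needed when loading.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    base = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.float16,
        ),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, adapter_path)
    model.eval()
    return tokenizer, model


def generate(tokenizer, model, prompt: str, max_new_tokens: int = 150) -> str:
    """Generate a completion and return only the newly generated text."""
    import torch

    inputs = tokenizer(build_prompt(prompt), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,  # generation.max_new_tokens in the config
            do_sample=True,
            temperature=0.1,  # generation.temperature in the config
        )
    # Drop the prompt tokens so only the completion is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Usage (requires a GPU, bitsandbytes, and access to the gated base model):
# tokenizer, model = load_finetuned()
# print(generate(tokenizer, model, "your prompt text"))
```

This bypasses Ludwig's preprocessing entirely, so the missing-value handling that raises the error is never reached; the trade-off is that the prompt formatting has to be reproduced by hand.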

@aliarabat

Sounds great! Could you please explain whether you fine-tuned the model using Ludwig or used a different approach?
Thanks in advance.

@shankarrajus

Hi, I am also facing a similar error during inference: "ValueError: Invalid missing value strategy fill_with_const". Are there any workarounds?

@rlleshi

rlleshi commented Jul 5, 2024

Same issue. @arnavgarg1, a kind ping to perhaps take a look at this too.

Env

Ludwig==0.9.1
Transformers==4.42.3
peft==0.11.1

Python 3.8.0

@alexsherstinsky alexsherstinsky added the llm Large Language Model related label Jul 26, 2024
@mhabedank mhabedank assigned mhabedank and unassigned mhabedank Oct 21, 2024
@mhabedank mhabedank added the bug Something isn't working label Oct 21, 2024