ONNXRuntimeError Windows #187

102-97-98-105 · 2024-07-15T19:38:09Z

Running the microphone example on Windows produces the following error.
I could not figure out the problem or find any fixes.

Stacktrace:

WARNING:root:Tried to import the tflite runtime, but it was not found. Trying to switching to onnxruntime instead, if appropriate models are available.
start
analyse
2024-07-15 21:34:57.7567718 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_3' Status Message: Invalid input shape: {1}
Exception ignored from cffi callback <function _StreamBase.__init__.<locals>.callback_ptr at 0x000001F5D4591E40>:
Traceback (most recent call last):
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\sounddevice.py", line 854, in callback_ptr
    return _wrap_callback(callback, data, frames, time, status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\sounddevice.py", line 2710, in _wrap_callback
    callback(*args)
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 20, in callback
    print(f'prediction: {model.predict(indata)}')
                         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\model.py", line 275, in predict
    n_prepared_samples = self.preprocessor(x)
                         ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 463, in __call__
    return self._streaming_features(x)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 434, in _streaming_features
    self._streaming_melspectrogram(self.accumulated_samples)
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 397, in _streaming_melspectrogram
    (self.melspectrogram_buffer, self._get_melspectrogram(list(self.raw_data_buffer)[-n_samples-160*3:]))
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 202, in _get_melspectrogram
    outputs = self.melspec_model_predict(x)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 87, in <lambda>
    self.melspec_model_predict = lambda x: self.melspec_model.run(None, {'input': x})
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Conv node. Name:'Conv_3' Status Message: Invalid input shape: {1}
Traceback (most recent call last):
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 31, in <module>
    wake_word_detection()
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 26, in wake_word_detection
    while True:
          ^^^^
KeyboardInterrupt

Process finished with exit code -1073741510 (0xC000013A: interrupted by Ctrl+C)

Code:

# Imports
import argparse

import numpy as np
import pyaudio
from openwakeword.model import Model

# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument(
    "--chunk_size",
    help="How much audio (in number of samples) to predict on at once",
    type=int,
    default=1280,
    required=False
)
parser.add_argument(
    "--model_path",
    help="The path of a specific model to load",
    type=str,
    default="",
    required=False
)
parser.add_argument(
    "--inference_framework",
    help="The inference framework to use (either 'onnx' or 'tflite')",
    type=str,
    default='tflite',
    required=False
)

args = parser.parse_args()

# Get microphone stream
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
CHUNK = args.chunk_size
audio = pyaudio.PyAudio()
mic_stream = audio.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, frames_per_buffer=CHUNK)

# Load pre-trained openwakeword models
if args.model_path != "":
    owwModel = Model(wakeword_models=[args.model_path], inference_framework=args.inference_framework)
else:
    owwModel = Model(inference_framework=args.inference_framework)

n_models = len(owwModel.models.keys())

# Run capture loop continuously, checking for wakewords
if __name__ == "__main__":
    # Generate output string header
    print("\n\n")
    print("#" * 100)
    print("Listening for wakewords...")
    print("#" * 100)
    print("\n" * (n_models * 3))

    while True:
        # Get audio
        audio = np.frombuffer(mic_stream.read(CHUNK), dtype=np.int16)

        # Feed to openWakeWord model
        prediction = owwModel.predict(audio)

        # Column titles
        n_spaces = 16
        output_string_header = """
            Model Name         | Score | Wakeword Status
            --------------------------------------
            """

        for mdl in owwModel.prediction_buffer.keys():
            # Add scores in formatted table
            scores = list(owwModel.prediction_buffer[mdl])
            curr_score = format(scores[-1], '.20f').replace("-", "")

            output_string_header += f"""{mdl}{" " * (n_spaces - len(mdl))}   | {curr_score[0:5]} | {"--" + " " * 20 if scores[-1] <= 0.5 else "Wakeword Detected!"}
            """

        # Print results table
        print("\033[F" * (4 * n_models + 1))
        print(output_string_header, "                             ", end='\r')

Pip List:

certifi            2024.7.4
cffi               1.16.0
charset-normalizer 3.3.2
colorama           0.4.6
coloredlogs        15.0.1
flatbuffers        24.3.25
humanfriendly      10.0
idna               3.7
joblib             1.4.2
mpmath             1.3.0
numpy              1.26.4
onnxruntime        1.18.1
openwakeword       0.6.0
packaging          24.1
pip                23.2.1
protobuf           5.27.2
PyAudio            0.2.14
pycparser          2.22
pyreadline3        3.4.1
requests           2.32.3
scikit-learn       1.5.1
scipy              1.14.0
sounddevice        0.4.7
sympy              1.13.0
threadpoolctl      3.5.0
tqdm               4.66.4
urllib3            2.2.2

dscripka · 2024-08-25T00:09:28Z

This is likely due to how PyAudio is getting the audio from your microphone. openWakeWord requires the input audio to be single-channel, 16-bit, 16 kHz audio, so you can confirm that PyAudio is producing the correct data by inspecting the result of this line: audio = np.frombuffer(mic_stream.read(CHUNK), dtype=np.int16).

It should produce an array with 1280 16-bit integers. If it does not, then you may need to adjust your PyAudio settings.
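
One way to run that check is sketched below. It is a rough illustration rather than part of openWakeWord: `check_chunk` and `to_mono_int16` are hypothetical helper names, and the 1280-sample default is taken from the `--chunk_size` default in the script above. Note also that the traceback shows the failing call coming from a sounddevice callback (`model.predict(indata)`), not from the PyAudio loop. sounddevice hands its callback 2-D blocks of shape (frames, channels), float32 by default, so the second helper assumes that layout and flattens it to 1-D int16. Whether this is the actual cause of the `Invalid input shape` error here is a guess.

```python
import numpy as np

EXPECTED_SAMPLES = 1280  # assumption: default --chunk_size from the script above


def check_chunk(chunk, expected_samples=EXPECTED_SAMPLES):
    """Return a list of problems found with one audio chunk (empty if none)."""
    problems = []
    chunk = np.asarray(chunk)
    if chunk.ndim != 1:
        problems.append(f"expected a 1-D array, got shape {chunk.shape}")
    if chunk.dtype != np.int16:
        problems.append(f"expected dtype int16, got {chunk.dtype}")
    if chunk.size != expected_samples:
        problems.append(f"expected {expected_samples} samples, got {chunk.size}")
    return problems


def to_mono_int16(block):
    """Convert a (frames, channels) or 1-D block to a 1-D int16 array.

    Float input is assumed to be normalized to [-1.0, 1.0].
    """
    block = np.asarray(block)
    if block.ndim == 2:
        block = block[:, 0]  # keep only the first channel
    if np.issubdtype(block.dtype, np.floating):
        block = np.clip(block, -1.0, 1.0) * 32767  # scale to the int16 range
    return block.astype(np.int16)
```

In the capture loop or callback, something like `print(check_chunk(audio))` just before `predict` would show whether the chunk has the expected layout. If it does not, passing `to_mono_int16(indata)` to `predict` (or opening the stream with an int16 dtype and one channel) may be worth trying.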
