ONNXRuntimeError Windows #187

Open
102-97-98-105 opened this issue Jul 15, 2024 · 1 comment

Comments

@102-97-98-105

Running the microphone example on Windows produces the following error.
I could not figure out the problem or find any fixes.

Stacktrace:

WARNING:root:Tried to import the tflite runtime, but it was not found. Trying to switching to onnxruntime instead, if appropriate models are available.
start
analyse
2024-07-15 21:34:57.7567718 [E:onnxruntime:, sequential_executor.cc:516 onnxruntime::ExecuteKernel] Non-zero status code returned while running Conv node. Name:'Conv_3' Status Message: Invalid input shape: {1}
Exception ignored from cffi callback <function _StreamBase.__init__.<locals>.callback_ptr at 0x000001F5D4591E40>:
Traceback (most recent call last):
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\sounddevice.py", line 854, in callback_ptr
    return _wrap_callback(callback, data, frames, time, status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\sounddevice.py", line 2710, in _wrap_callback
    callback(*args)
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 20, in callback
    print(f'prediction: {model.predict(indata)}')
                         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\model.py", line 275, in predict
    n_prepared_samples = self.preprocessor(x)
                         ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 463, in __call__
    return self._streaming_features(x)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 434, in _streaming_features
    self._streaming_melspectrogram(self.accumulated_samples)
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 397, in _streaming_melspectrogram
    (self.melspectrogram_buffer, self._get_melspectrogram(list(self.raw_data_buffer)[-n_samples-160*3:]))
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 202, in _get_melspectrogram
    outputs = self.melspec_model_predict(x)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\openwakeword\utils.py", line 87, in <lambda>
    self.melspec_model_predict = lambda x: self.melspec_model.run(None, {'input': x})
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\stuff\source\repos\poc_wake_word\.venv\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 220, in run
    return self._sess.run(output_names, input_feed, run_options)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Non-zero status code returned while running Conv node. Name:'Conv_3' Status Message: Invalid input shape: {1}
Traceback (most recent call last):
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 31, in <module>
    wake_word_detection()
  File "C:\Users\stuff\source\repos\poc_wake_word\main.py", line 26, in wake_word_detection
    while True:
          ^^^^
KeyboardInterrupt

Process finished with exit code -1073741510 (0xC000013A: interrupted by Ctrl+C)

Code:

# Imports
import argparse

import numpy as np
import pyaudio
from openwakeword.model import Model

# Parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument(
    "--chunk_size",
    help="How much audio (in number of samples) to predict on at once",
    type=int,
    default=1280,
    required=False
)
parser.add_argument(
    "--model_path",
    help="The path of a specific model to load",
    type=str,
    default="",
    required=False
)
parser.add_argument(
    "--inference_framework",
    help="The inference framework to use (either 'onnx' or 'tflite')",
    type=str,
    default='tflite',
    required=False
)

args = parser.parse_args()

# Get microphone stream
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
CHUNK = args.chunk_size
audio = pyaudio.PyAudio()
mic_stream = audio.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, frames_per_buffer=CHUNK)

# Load pre-trained openwakeword models
if args.model_path != "":
    owwModel = Model(wakeword_models=[args.model_path], inference_framework=args.inference_framework)
else:
    owwModel = Model(inference_framework=args.inference_framework)

n_models = len(owwModel.models.keys())

# Run capture loop continuously, checking for wakewords
if __name__ == "__main__":
    # Generate output string header
    print("\n\n")
    print("#" * 100)
    print("Listening for wakewords...")
    print("#" * 100)
    print("\n" * (n_models * 3))

    while True:
        # Get audio
        audio = np.frombuffer(mic_stream.read(CHUNK), dtype=np.int16)

        # Feed to openWakeWord model
        prediction = owwModel.predict(audio)

        # Column titles
        n_spaces = 16
        output_string_header = """
            Model Name         | Score | Wakeword Status
            --------------------------------------
            """

        for mdl in owwModel.prediction_buffer.keys():
            # Add scores in formatted table
            scores = list(owwModel.prediction_buffer[mdl])
            curr_score = format(scores[-1], '.20f').replace("-", "")

            output_string_header += f"""{mdl}{" " * (n_spaces - len(mdl))}   | {curr_score[0:5]} | {"--" + " " * 20 if scores[-1] <= 0.5 else "Wakeword Detected!"}
            """

        # Print results table
        print("\033[F" * (4 * n_models + 1))
        print(output_string_header, "                             ", end='\r')

Pip List:

certifi            2024.7.4
cffi               1.16.0
charset-normalizer 3.3.2
colorama           0.4.6
coloredlogs        15.0.1
flatbuffers        24.3.25
humanfriendly      10.0
idna               3.7
joblib             1.4.2
mpmath             1.3.0
numpy              1.26.4
onnxruntime        1.18.1
openwakeword       0.6.0
packaging          24.1
pip                23.2.1
protobuf           5.27.2
PyAudio            0.2.14
pycparser          2.22
pyreadline3        3.4.1
requests           2.32.3
scikit-learn       1.5.1
scipy              1.14.0
sounddevice        0.4.7
sympy              1.13.0
threadpoolctl      3.5.0
tqdm               4.66.4
urllib3            2.2.2
@dscripka
Owner

This is likely due to an issue with how PyAudio is getting the audio from your microphone. openWakeWord requires the input audio to be single-channel, 16-bit, 16 kHz audio. You can confirm that PyAudio is producing the correct data by inspecting the result of this line: audio = np.frombuffer(mic_stream.read(CHUNK), dtype=np.int16).

It should produce a one-dimensional array of 1280 16-bit integers (the default chunk size). If it does not, then you may need to adjust your PyAudio settings.
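
The check described above can be sketched as a small helper that reports what is wrong with a chunk before it is passed to `predict`. This is only an illustration under some assumptions: `describe_chunk` and `to_mono_int16` are hypothetical names and are not part of openWakeWord, and the float-to-int16 scaling assumes samples in the range [-1.0, 1.0]. The second helper is included because the traceback above goes through a sounddevice callback, and sounddevice hands the callback a two-dimensional `(frames, channels)` array that is float32 unless the stream is configured otherwise, which may be the source of the shape error.

```python
import numpy as np

EXPECTED_SAMPLES = 1280  # default --chunk_size in the script above


def describe_chunk(chunk, expected_samples=EXPECTED_SAMPLES):
    """Return a list of problems found in one audio chunk (empty if it looks fine).

    Hypothetical helper for debugging, not an openWakeWord function.
    """
    chunk = np.asarray(chunk)
    problems = []
    if chunk.dtype != np.int16:
        problems.append(f"dtype is {chunk.dtype}, expected int16")
    if chunk.ndim != 1:
        problems.append(f"shape is {chunk.shape}, expected a 1-D array")
    if chunk.size != expected_samples:
        problems.append(f"got {chunk.size} samples, expected {expected_samples}")
    return problems


def to_mono_int16(indata):
    """Convert a (frames, channels) array to 1-D int16, keeping the first channel.

    Assumption: floating-point audio is scaled to [-1.0, 1.0].
    """
    data = np.asarray(indata)
    if data.ndim == 2:
        data = data[:, 0]  # keep only the first channel
    if np.issubdtype(data.dtype, np.floating):
        data = np.clip(data, -1.0, 1.0) * 32767
    return data.astype(np.int16)
```

Calling `describe_chunk(audio)` right after the `np.frombuffer(...)` line, or `describe_chunk(indata)` inside a sounddevice callback, should show whether the dtype, dimensionality, or length differs from what the maintainer describes. If it does, passing `to_mono_int16(indata)` to `predict` instead is one possible workaround to try.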
