Micro Speech:How to run audio_preprocessor.py independently? #2926

Open
ctwillson opened this issue Nov 23, 2024 · 2 comments
@ctwillson

I have read the docs, but I don't know how to run audio_preprocessor.py directly (i.e. "python audio_preprocessor.py") instead of going through Bazel.
Any ideas?
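One possible workaround (a sketch, not an official flow — the checkout path and the tflite-micro pip package name are assumptions) is to run the script straight from a repo checkout with the repo root on PYTHONPATH so its package-relative imports resolve:

```shell
# Hypothetical direct invocation from a tflite-micro checkout;
# adjust the checkout path for your machine.
cd /path/to/tflite-micro
pip install tensorflow tflite-micro   # deps the script imports (assumed)
PYTHONPATH="$PWD" python \
  tensorflow/lite/micro/examples/micro_speech/audio_preprocessor.py
```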

@ctwillson

ctwillson commented Nov 25, 2024

@ddavis-2015 I tried to generate the feature using audio_preprocessor_float32.tflite (or audio_preprocessor_int8.tflite) with the code below:

import numpy as np
import tensorflow as tf
from tflite_micro.python.tflite_micro import runtime

_tflm_interpreter = runtime.Interpreter.from_file('audio_preprocessor_float.tflite')

def generate_feature_using_tflm_float(audio_frame: tf.Tensor) -> tf.Tensor:
    """Generate a single feature for a single audio frame.

    Uses TensorFlow graph execution and the TensorFlow model converter to
    generate a TFLM-compatible model.  This model is then used by the TFLM
    MicroInterpreter to execute a single inference operation.

    Args:
      audio_frame: tf.Tensor, a single audio frame
        (self.params.window_size_ms) with shape (1, audio_samples_count)

    Returns:
      tf.Tensor, a tensor containing a single audio feature with shape
      (self.params.filter_bank_number_of_channels,)
    """
    global _tflm_interpreter
    # NOTE: input_details is taken from the word-detection model's
    # interpreter (see the training snippet below); it is not defined
    # inside this function.
    input_scale, input_zero_point = input_details["quantization"]
    _tflm_interpreter.set_input(audio_frame, 0)
    _tflm_interpreter.invoke()
    result = _tflm_interpreter.get_output(0)
    # Quantize the float feature to the detection model's input type:
    # q = r / scale + zero_point
    result = result / input_scale + input_zero_point
    result = result.astype(input_details["dtype"])
    return tf.convert_to_tensor(result)
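For reference, TFLite's affine quantization convention is q = round(r / scale) + zero_point, and dequantization is r = (q - zero_point) * scale; the snippet above applies the forward (quantize) direction to the preprocessor output. A minimal NumPy sketch of the round trip (the scale and zero-point values here are made up for illustration):

```python
import numpy as np

def quantize(r, scale, zero_point, dtype=np.int8):
    # real -> quantized: q = round(r / scale) + zero_point, clipped to range
    q = np.round(r / scale) + zero_point
    info = np.iinfo(dtype)
    return np.clip(q, info.min, info.max).astype(dtype)

def dequantize(q, scale, zero_point):
    # quantized -> real: r = (q - zero_point) * scale
    return (q.astype(np.float32) - zero_point) * scale

scale, zero_point = 0.5, -10          # made-up example parameters
r = np.array([0.0, 1.0, -2.5], dtype=np.float32)
q = quantize(r, scale, zero_point)    # -> [-10, -8, -15]
back = dequantize(q, scale, zero_point)
```

If the two pipelines disagree, checking whether each side applies quantize vs. dequantize in the same direction is a cheap first step.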

However, the generated feature does not match the feature produced during training with the audio_microfrontend op provided by TensorFlow:

import numpy as np
import tensorflow as tf
from tensorflow.lite.experimental.microfrontend.python.ops import (
    audio_microfrontend_op as frontend_op)

micro_frontend = frontend_op.audio_microfrontend(
    int16_input,  # int16 PCM audio samples
    sample_rate=16000,
    window_size=30,
    window_step=10,
    num_channels=40,
    upper_band_limit=7600,
    lower_band_limit=20,
    out_scale=1,
    out_type=tf.float32)
# output dims: [frames, num_channels]
data = tf.multiply(micro_frontend, (10.0 / 256.0))

data = np.expand_dims(data, axis=1).astype(np.float32)
input_details = interpreter.get_input_details()[0]
output_details = interpreter.get_output_details()[0]
input_scale, input_zero_point = input_details["quantization"]

# Quantize to the model's input type: q = r / scale + zero_point
data = data / input_scale + input_zero_point
data = data.astype(input_details["dtype"])
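The training path above applies two transforms after the microfrontend: a fixed 10/256 post-scale, then affine quantization to the model's input dtype. A standalone NumPy sketch of just those two steps (the frontend output values and the quantization parameters are placeholders, not real model values):

```python
import numpy as np

# Placeholder microfrontend output (log-mel energies) for one frame
micro_frontend = np.array([[120.0, 300.0, 55.0]], dtype=np.float32)

# Step 1: fixed post-scale used in training
data = micro_frontend * (10.0 / 256.0)
data = np.expand_dims(data, axis=1).astype(np.float32)

# Step 2: quantize to the model's input type
# (placeholder quantization parameters)
input_scale, input_zero_point = 0.1, -128
data = data / input_scale + input_zero_point
data = data.astype(np.int8)
```

When comparing against the audio_preprocessor.py output, it is worth verifying that the 10/256 scaling is present (or absent) on both sides, since the preprocessor model may already bake an equivalent scaling into its graph.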

"This issue is being marked as stale due to inactivity. Remove label or comment to prevent closure in 5 days."

@github-actions github-actions bot added the Stale label Dec 20, 2024