[cli/paraformer] ali-paraformer inference #2067

Merged: 21 commits, Oct 30, 2023

@Mddct Mddct commented Oct 20, 2023

TODO (in this PR):

  • get JIT export working
  • simplify the code
  • model.decode
  • support the CLI
  • verify accuracy

TODO (future PR):

NOTE: streaming paraformer is not in the current plan

@Mddct changed the title from "[cli/paraformer] ali-paraformer load and infer work" to "[cli/paraformer] ali-paraformer load and infer work [WIP]" on Oct 20, 2023

Mddct commented Oct 24, 2023

[Screenshot 2023-10-24 at 18 37 25]

The CLI works.

@Mddct changed the title from "[cli/paraformer] ali-paraformer load and infer work [WIP]" to "[cli/paraformer] ali-paraformer inference" on Oct 24, 2023

Mddct commented Oct 30, 2023

[Image: mmexport1698658547189.jpg]

recognize.py works.


Mddct commented Oct 30, 2023

decode info: batch_size=100, beam_size=10
aishell:

| model | greedy_search | beam_search |
|---|---|---|
| wenet-ali-paraformer | 1.96 | 1.96 |

decode info: batch_size=1, beam_size=10
aishell:

| model | greedy_search | beam_search |
|---|---|---|
| wenet-ali-paraformer | 1.95 | 1.95 |
| funasr-paraformer | 1.95 | / |
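The comment does not name the metric behind these numbers. Assuming they are character error rates in percent (the usual metric for aishell, but an assumption here), a minimal sketch of how such a score could be computed from reference and hypothesis transcripts follows. The function names `edit_distance` and `cer` are illustrative and are not taken from WeNet or FunASR.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference tokens matches an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def cer(refs: List[str], hyps: List[str]) -> float:
    """Corpus-level character error rate in percent (assumed metric)."""
    errors = sum(edit_distance(list(r), list(h)) for r, h in zip(refs, hyps))
    total_chars = sum(len(r) for r in refs)
    return 100.0 * errors / total_chars


# Toy transcripts for illustration only; these are not data from the PR.
toy_refs = ["abcd", "ab"]
toy_hyps = ["abxd", "ab"]
toy_score = cer(toy_refs, toy_hyps)  # one substitution over six characters
```

Scoring the greedy_search and beam_search outputs with the same function on the same test set is one way to reproduce a side-by-side comparison like the tables above.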


Mddct commented Oct 30, 2023

[Screenshot 2023-10-30 at 22 34 12]

Confidence in search.py works.

TODO (next):

  • resolve conflicts so the branch can merge with main


Mddct commented Oct 30, 2023

[Screenshot 2023-10-30 at 23 01 01]

It works!

@robin1001 (Collaborator)

Are the assets required? Could they be placed inside the model instead?

Review thread on wenet/cli/transcribe.py (outdated, resolved)

Mddct commented Oct 30, 2023

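> Are the assets required? Could they be placed inside the model instead?

CMVN and similar items are already inside the model, but the FunASR config and the WeNet config use different formats.

Exporting the model needs the files in assets; for now I converted them with my own script.

Once the FunASR config can be converted to the WeNet format automatically, the assets can be deleted.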

@robin1001 robin1001 merged commit af1315c into main Oct 30, 2023
5 of 6 checks passed
@robin1001 robin1001 deleted the Mddct-cli-paraformer branch October 30, 2023 15:39
@Mddct Mddct mentioned this pull request Jan 18, 2024
3 tasks
@donstang

Is there any plan to support streaming paraformer?


Mddct commented Apr 29, 2024

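> Is there any plan to support streaming paraformer?

No plans for now. The streaming paraformer is somewhat worse on the metrics, and it is not the kind of streaming model we have in mind.

If you are interested, please follow the progress of wenetspeech2.0.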

def transcribe(self, audio_file: str, tokens_info: bool = False) -> dict:
    waveform, sample_rate = torchaudio.load(audio_file, normalize=False)
    waveform = waveform.to(torch.float)
    feats = kaldi.fbank(waveform,

Contributor commented:

The FunASR frontend uses a Hamming window by default (details were linked in the original comment), whereas kaldi.fbank defaults to the povey window (also linked there). This difference in window type may cause a slight feature mismatch. As noted at line 44 of the linked document:

"povey" is a window I made to be similar to Hamming but to go to zero at the edges, it's pow((0.5 - 0.5*cos(n/N*2*pi)), 0.85) I just don't think the Hamming window makes sense as a windowing function.

Collaborator Author replied:

pr welcome
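To make the window mismatch raised in this review thread concrete, here is a small sketch that evaluates both window shapes in plain Python and reports how far apart they are. It assumes the symmetric form with an N-1 denominator and a 400-sample frame (25 ms at 16 kHz); both are assumptions for illustration, and the helper names are hypothetical rather than part of WeNet, FunASR, or torchaudio.

```python
import math
from typing import List


def hamming_window(n: int) -> List[float]:
    """Symmetric Hamming window: 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    a = 2.0 * math.pi / (n - 1)
    return [0.54 - 0.46 * math.cos(a * i) for i in range(n)]


def povey_window(n: int) -> List[float]:
    """Povey window: (0.5 - 0.5*cos(2*pi*i/(n-1))) ** 0.85, zero at the edges."""
    a = 2.0 * math.pi / (n - 1)
    # max() guards against a tiny negative base caused by rounding.
    return [max(0.0, 0.5 - 0.5 * math.cos(a * i)) ** 0.85 for i in range(n)]


def max_abs_diff(x: List[float], y: List[float]) -> float:
    """Largest pointwise gap between two equal-length windows."""
    return max(abs(p - q) for p, q in zip(x, y))


FRAME_LENGTH = 400  # assumed: 25 ms frames at 16 kHz
hamming = hamming_window(FRAME_LENGTH)
povey = povey_window(FRAME_LENGTH)

print("edge values   :", hamming[0], povey[0])
print("max difference:", max_abs_diff(hamming, povey))
```

The two windows differ most at the frame edges, where Hamming stays near 0.08 and povey goes to zero. `torchaudio.compliance.kaldi.fbank` accepts a `window_type` argument, so passing `window_type="hamming"` is one possible way to align the frontends; whether that alone reproduces the FunASR features is not verified in this thread.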

4 participants