Skip to content

Support for realtime audio input #10

Closed
@biemster

Description

@biemster

Noting that the processing time is considerably shorter than the length of speech, is it possible to feed the models real time microphone output? Or does the inference run on the complete audio stream, instead of sample by sample?

This would greatly reduce the latency for voice assistants and the like, that the audio does not need to be fully captured and only after that fed to the models. Basically the same as I did here with SODA: https://github.com/biemster/gasr, but then with an open source and multilang model.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions