10-4 End-to-End Speech Separation Technology and Applications (39 pages)
Hui Song, 2020.07.25
An Overview of Deep Learning Based Speech Separation Technology

CONTENTS
01 Formulation
  State of the art
  An encoder-separator-decoder formulation
  Frequency-domain or time-domain?
02 Monaural Speech Separation
  Frequency-domain speech separation/extraction methods
  Time-domain speech separation/extraction methods
  Some interesting variants
03 Array-based Speech Separation
  Separation-based methods
  Beamforming-based methods
04 Conclusions and Future Challenges

1 Formulation: State of the art
Encoder - Transforms the input signal into a domain (latent space) suitable for source separation.
Separator (+ Extractor) - Estimates a mask for each source in the latent space, and outputs an estimate of each source in the latent space by mask multiplication or beamforming.
Decoder - Transforms the extracted source signals back to the time domain.

Fig. 1. Generic view of a source separation system

Observed signal in time-domain:
Encoder transformation:
Separator (masking) netwo