Issues · BinWang28/audio-ai-hub · GitHub

[Auto-suggested] ReNikud: Audio-Supervised Hebrew Grapheme-to-Phoneme Conversion

#92 · github-actions opened on Jun 22, 2026

[Auto-suggested] Low-Burden Data Augmentation for Dysarthric ASR via Zero-Shot Voice Cloning

#91 · github-actions opened on Jun 22, 2026

[Auto-suggested] Transcript-Free Flow-Matching Text-to-Speech via Speech Feature Conditioning

#90 · github-actions opened on Jun 22, 2026

[Auto-suggested] Exploring Pre-training Benefits on Phoneme Addition through Fine-tuning in Speech Synthesis

#89 · github-actions opened on Jun 22, 2026

[Auto-suggested] Systematic Study of Dysarthric Speech Recognition: Spectral Features and Acoustic Models

#88 · github-actions opened on Jun 22, 2026

[Auto-suggested] Improving End-to-End Speech Recognition for Dysarthric Speech through In-Domain Data Augmentation

#87 · github-actions opened on Jun 22, 2026

[Auto-suggested] Investigating Human-Model Discrepancies in Speech Quality Assessment via Acoustic and Prosodic Perturbations

#86 · github-actions opened on Jun 22, 2026

[Auto-suggested] PASQA: Pitch-Accent-Focused Speech Quality Assessment Model Trained on Synthetic Speech with Accent Errors

#85 · github-actions opened on Jun 22, 2026

[Auto-suggested] BayLing-Duplex: Native Full-Duplex Speech Dialogue with a Single Autoregressive LLM

#84 · github-actions opened on Jun 15, 2026

[Auto-suggested] Mask, Sample, Revise: A Revisable CTMC Inference Stack for Guided Discrete Flow Matching Text-to-Speech

#83 · github-actions opened on Jun 15, 2026

[Auto-suggested] FoleyGenEx: Unified Video-to-Audio Generation with Multi-Modal Control, Temporal Alignment, and Semantic Precision

#82 · github-actions opened on Jun 15, 2026

[Auto-suggested] Spatio-Temporal Audio Language Modeling for Dynamic Sound Sources

#81 · github-actions opened on Jun 15, 2026