My Journey into Learning Audio ML

Deep-unlearning / Learning-Audio-ML

About

This repository documents my journey into learning Audio ML, bridging the gap in my knowledge as I transition from a more general deep learning background. The exploration is not strictly linear and may branch into broader deep learning topics. It serves as a space for experimentation, implementation of research papers, writing blog posts, and creating tutorials on related subjects.

What I am currently working on

  • ASR:

    • Fast-Conformer: Implementing the Fast-Conformer paper (https://arxiv.org/pdf/2305.05084), and possibly other variants and optimizations.

    • Fine-tuning script for the Voxtral model

    • [WIP] Study Slam

    • [WIP] Blog post about Audio LMs

    • [WIP] Blog post about distil-whisper + tutorial

  • TTS:

  • Audio Codecs:

    • Transformers integration of XCodec2: PR

Useful Resources

  • List of Awesome AudioLM Datasets

  • Torchaudio tutorials: https://pytorch.org/audio/main/index.html

  • WAVLab Lectures on Speech Recognition and Understanding: https://www.youtube.com/@wavlab3016/videos

  • Training recipe for Speech LMs: https://github.com/slp-rl/slamkit

  • Conversational AI Reading Group: https://poonehmousavi.github.io/rg
