Generative AI Music. In the last year or so, generative AI music has improved massively. Although it is still early days, today you can generate some pretty decent, short-duration music of all kinds with AI. If you like creating music and AI, here is a list of interesting generative AI music stuff.
Facebook AI Research MusicGen. Probably one of the pioneering models in quality AI music generation. MusicGen has sparked a whole universe of derivative models of all kinds, and it’s the model behind many musicgen apps. The model is a single-stage, auto-regressive Transformer, and unlike Google MusicLM, MusicGen doesn’t require a self-supervised semantic representation. Repo and demos here: MusicGen: Simple and Controllable AI Music Generation
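If "auto-regressive" sounds abstract, here is a minimal toy sketch of the idea: tokens are sampled one at a time, each conditioned on everything generated so far. This is not MusicGen's code. The vocabulary size, the `toy_next_token_probs` rule and the `generate` helper are all made up for illustration, and the real model is a Transformer conditioned on a text prompt that predicts compressed audio tokens.

```python
import random

VOCAB_SIZE = 8  # hypothetical, tiny "audio token" vocabulary for illustration


def toy_next_token_probs(history):
    """Stand-in for the Transformer: a distribution over the next token.

    Hypothetical rule: favour the token one step above the previous one.
    """
    weights = [1.0] * VOCAB_SIZE
    if history:
        weights[(history[-1] + 1) % VOCAB_SIZE] += 4.0
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_tokens, seed=0):
    """Sample tokens one by one, each conditioned on all previous tokens."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(num_tokens):
        probs = toy_next_token_probs(tokens)
        next_token = rng.choices(range(VOCAB_SIZE), weights=probs, k=1)[0]
        tokens.append(next_token)
    return tokens


if __name__ == "__main__":
    print(generate(16))
```

In a real system the sampled tokens would then be decoded back into a waveform; that step is left out here.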
Mubert. One of the early AI musicgen startups, Mubert is an app for generating high-quality, royalty-free music with AI. Try this Mubert text-to-music notebook and get the app here.
Stable Audio 2.0. Recently introduced by Stability AI, Stable Audio 2.0 lets you generate high-quality, full tracks from text & audio with coherent musical structure, up to three minutes in length at 44.1kHz stereo. Check out the blogpost, demos, trial: Introducing Stable Audio 2.0
MusicLang is an app for controllable music generation with AI, mostly oriented to artists and music producers. The MusicLang team recently released MusicLang Predict, your controllable music copilot (repo). You can try MusicLang here.
The MusicLang Tokeniser. An interesting post explaining how tokenization works inside MusicLang and how it gives users fine-grained control over the musical content generated by transformer models. The MusicLang tokenizer: Toward controllable symbolic music generation.
Glicol. A foundation for some specialised musicgen models. If you love coding and music this is pretty cool. Glicol is an open source, next-gen language for generating music with code. Get Glicol from this repo.
RaveForce Agent is a Python package under MIT license that allows you to define your musical tasks in Python with Glicol syntax, and train an agent to do the task with APIs similar to the OpenAI Gym. Get RaveForce here.
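For readers who haven't used OpenAI Gym, the pattern is a `reset()` / `step(action)` loop between an agent and an environment. Below is a minimal sketch of that pattern only. It is not RaveForce's API: `ToyPitchEnv`, `greedy_policy` and `run_episode` are hypothetical names, and the "musical task" (walk a MIDI pitch up or down until it hits a target note) is invented for illustration.

```python
class ToyPitchEnv:
    """Hypothetical Gym-style environment: reach a target MIDI pitch."""

    def __init__(self, target=69, start=60, max_steps=32):
        self.target = target
        self.start = start
        self.max_steps = max_steps
        self.pitch = start
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.pitch = self.start
        self.steps = 0
        return self.pitch

    def step(self, action):
        """Apply an action: 1 = up one semitone, anything else = down one."""
        self.pitch += 1 if action == 1 else -1
        self.steps += 1
        reward = -abs(self.target - self.pitch)  # closer to target is better
        done = self.pitch == self.target or self.steps >= self.max_steps
        return self.pitch, reward, done, {}


def greedy_policy(observation, target):
    """Hand-written stand-in for a trained agent: always move toward target."""
    return 1 if observation < target else 0


def run_episode(env):
    """Classic Gym-style loop: observe, act, collect reward, repeat."""
    observation = env.reset()
    total_reward = 0
    done = False
    while not done:
        action = greedy_policy(observation, env.target)
        observation, reward, done, _info = env.step(action)
        total_reward += reward
    return observation, total_reward


if __name__ == "__main__":
    print(run_episode(ToyPitchEnv()))
```

A trained agent would replace `greedy_policy` with a policy learned from the rewards.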
Google MusicFX is powered by Google MusicLM and AudioLM. Simple, with no frills, but good quality. A neat feature is DJ Mode, which lets you generate a real-time stream of music by adding and adjusting musical prompts to evolve the music live. You can try Google MusicFX here.
Suno AI V3. The latest version of Suno enables you to generate two-minute, radio-quality music from text prompts in just a few seconds. The model behind Suno combines a proprietary AI musicgen model and ChatGPT for the lyrics. Suno has some cool features and you can get some decent outputs. Try Suno.ai V3 here.
Udio. Recently released, it’s super trending in the musicgen scene now. Udio enables you to create music from simple text prompts by specifying topics, genres, and other descriptors, which are then transformed into professional-quality tracks. Udio is pretty impressive and you can generate some amazing music output. I really like it! Try Udio here. You can also watch this tutorial on how to use Udio.
Have a nice week.
- Building an AI Coach to Tame My Monkey Mind
- Diffusion & Latent Consistency Models, Explained
- Building Reliable Systems Out of Unreliable AI Agents
- Attention in Transformers, Visually Explained
- Understanding Multimodal AI Models
- The Perfect Prompt: A Prompt Engineering Cheat Sheet
- How to: Text2SQL Tasks with DuckDB-NSQL-7B Model
- “Surprising” Lessons after Churning Half-billion GPT Tokens
- An Overview of the Latest Google Gemma Family of Models
- Rerank 3: A New Foundation Model for Efficient Enterprise Search & Retrieval
- Parler- An Open TTS Model for HQ Natural Sounding Speech
- nanoLLaVA- A “Small but Mighty” 1B Vision-Language Model
- aiXcoder-7B – A New SOTA LM Model for All Things Coding
- Making Deep Learning Go Brrrr From First Principles
- Notes on Stanford CS224 ML with Large-scale Graphs
- How I Got into Deep Learning from Knowing Nothing
- AutoCodeRover: Autonomous Program Improvement
- MS Rho-1: Not All Tokens Are What You Need (paper, repo)
- LLM2Vec: LLMs are Secretly Powerful Text Encoders (paper, repo, tutorial)
- Volga – An Opensource Feature Engine for Real-time AI/ML
- Airbnb Opensources Chronon ML Feature Platform
- LLMOps Tutorial: Five Ways to Serve LLMs
- NVIDIA Audio Dialogues Dataset for Audio & Music
- Anthropic Persuasion Dataset: Human vs AI Claims
- Generate Financial Q&A Datasets with AI
Tips? Suggestions? Feedback? email Carlos
Curated by @ds_ldn in the middle of the night.