10. Voice Cloning, Eleven Labs API, OpenAI Embeddings, ChatGPT API, Whisper API



I will be starting a spinoff channel on AI in music, art, and gaming in 2023. Subscribe at: https://youtube.com/@parttimeai

In this video, we build a question-answering voice assistant that responds in a more realistic voice cloned from audio samples. To accomplish this, we combine multiple tools and techniques from the previous videos in the series: OpenAI Embeddings, ChatGPT API, Whisper API, Eleven Labs API, Gradio UI, and Midjourney images.

0:00 Project Description: Q&A + Voice Cloning
1:45 The movie “Her” and the Idea of Smarter Assistants
2:27 Voice Sampling
3:15 Demo Voice #1 (Samantha Voice)
4:00 Demo Voice #2 (Jay-Z Voice), Rhyming Responses
5:20 Hip Hop Music and Sampling Analogy
6:26 Hip Hop Production, Rick Rubin, Taste and Technical Ability Clip
8:10 Recap of OpenAI For Finance Series So Far, Prerequisites
9:16 Building a Q&A Corpus, Vector Embeddings, Cosine Similarity Review
17:52 Building a User Interface with Gradio, Starter Code from Video #9
20:05 Voice Cloning with Eleven Labs API
24:57 Python Code Walkthrough – config.py constants, voice ID, custom prompts
25:52 Eleven Labs API – Example Request and Response Payloads
26:28 Avatars and AI Art Generation with Midjourney, Nvidia Stock Win
30:37 Python Code Walkthrough – advisor.py, requirements.txt
31:44 Gradio User Interface Development, Microphone Input, Avatar Display
35:15 UI Launch, Debugging Mode, Sharing Your App, Mobile Devices
36:32 Transcribe Function, OpenAI Whisper API
38:40 Incorporating Word Embeddings, Question Vector, Cosine Similarity, Answers
40:25 ChatGPT API, Conversation History, Stuffing the Prompt with Context
42:57 Eleven Labs API Request with Python, Text to Speech, Voice Synthesis Settings
44:25 Outputting Binary Response / MP3 to Audio Output
46:00 Final Words of Advice from Jay-Z
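
For readers who want a feel for the retrieval step before watching (chapters 9:16, 38:40, and 40:25), here is a minimal sketch of ranking passages by cosine similarity and stuffing the best match into a prompt. It is not the code from the video: the function names, the prompt wording, and the tiny 3-dimensional vectors are all made up for illustration. In the real project the vectors would come from an embeddings model, and the prompt would be sent to the ChatGPT API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(question_vector, corpus, k=1):
    """Return the k passages whose vectors are most similar to the question."""
    scored = [(cosine_similarity(question_vector, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


def build_prompt(question, context_passages):
    """Hypothetical prompt template: place retrieved passages ahead of the question."""
    context = "\n".join(context_passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy corpus: (passage, vector) pairs. The vectors are invented stand-ins
# for real embeddings, which would have many more dimensions.
corpus = [
    ("Passage about interest rates.", [0.9, 0.1, 0.0]),
    ("Passage about chip makers.", [0.1, 0.9, 0.2]),
    ("Passage about housing.", [0.0, 0.2, 0.9]),
]

question_vector = [0.2, 0.8, 0.1]  # stand-in for the embedded question
best = top_matches(question_vector, corpus, k=1)
prompt = build_prompt("What about chip makers?", best)
print(prompt)
```

Raising k would stuff more passages into the prompt, at the cost of a longer request.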

Her Movie Clip: https://www.youtube.com/watch?v=GV01B5kVsC0
Knowledge Base / Transcriptions From: https://www.youtube.com/@TheCompoundNews
Twitter: https://twitter.com/parttimelarry
Buy Me a Coffee: https://buymeacoffee.com/parttimelarry

Code:
Video #5 Word Embeddings:
https://www.youtube.com/watch?v=xzHhZh7F25I
Video #5 Notebook:
https://colab.research.google.com/drive/1tttDqgnWL9yJtmlOFXJqA-BjQ1Pyfpax?usp=sharing
Video #6 Financial Q&A:
https://www.youtube.com/watch?v=hR8xhJgKcJ0
Video #6 Notebook:
https://colab.research.google.com/drive/1cVQNg2-zGQb7qZXFECG6kyq5yVIHyf5o?usp=sharing
Video #9 ChatGPT API + Whisper API and Gradio:
https://www.youtube.com/watch?v=Si0vFx_dJ5Y
Video #9 Code:
https://github.com/hackingthemarkets/chatgpt-api-whisper-api-voice-assistant
Video #10 Code (This Video):
https://github.com/hackingthemarkets/qa-assistant-eleven-labs-voice-cloning

Since I am starting to do more content featuring AI + art and sound, I will be starting a spinoff channel on AI in music, art, and gaming in 2023. Subscribe at: https://youtube.com/@parttimeai
