Welcome back to Free Range Fri-AI-day, where I share my progress of my builds in public. This week I will share how I built a real-time voice AI life coach (which I cloned with my sister’s voice!) in my journey of assembling the dream team towards designing and automating a ✨remarkable✨ life.
If you want to access the life-coach agent, you can dial +1 (267) 214 9863. Web accessible version coming soon.
The Big Picture Vision
• Automating various aspects of life to enhance productivity and well-being, towards a ✨remarkable✨ life.
• Be a capable human so that we can cultivate depth holistically in life.
If you see the definition of remarkable:
It is a wonderful goal to design a life that encapsulates a myriad of worthy things. Here, I have captured them into 6 pillars:
personal fulfillment and intellectual growth,
meaningful relationships,
significant professional impact,
health and wellbeing,
essentials and financial security,
adventure and inspiring stories.
I will share what are the details inside my version of a ✨remarkable✨ life in a future post, but for today, I will focus on the process, the journey and blueprint towards that goal.
Journey to a ✨Remarkable✨ Life
Inspired by Cal Newport's wisdom on designing a deep life, I'm breaking this down into two major phases:
Be a capable human: This is the foundation layer. You can think of it as the “hardware” layer.
Cultivate Depth: Once our hardware is upgraded, it’s time to install some ambitious “software” (our goals and dreams) to maximize these new capabilities.
Let’s take an example from video games. You might want to make the next Mario Odyssey, or the next Zelda Breath of the Wild. But if you are still running the chips and circuitry of a Gameboy Color, you can only run Super Mario Bros. No matter how ambition and time and talent you try to program on a Gameboy Color, it just isn’t feasible to achieve it.
So, before we go straight into automating into our ✨remarkable✨ life, we would need to start building foundational capabilities, which includes:
Discipline: establishing routines and habits for our body, mind, and heart.
Competence: developing valuable skills and mastering our craft.
Control: managing our finances and attention to free up space for deeper pursuits.
Simplification: cutting out the clutter that doesn't serve us.
(We will cover the cultivate depth portion in future posts)
Introducing AI Agents: Assisting in Building Capabilities
The plan is to “recruit” and build a dream team of AI agents that can improve your capabilities. These will be our “dream AI crew” that will be responsible of the 6 pillars, and some supporting casts of AI Agents to make it all work.
But, what is an AI Agent?
An AI Agent is a program that leverages LLMs (like ChatGPT, or other models) and various tools (like web browsing, writing to doc, etc.) to perform specific tasks towards a specified goal. Agents engage in “role playing” in order to complete the task more effectively. It also utilizes memory to learn from interactions. This is the current hot topic in the emerging landscape of Gen AI.
If you want to know more about the current state of AI agents, this is a great whitepaper
https://arxiv.org/html/2404.11584v1
Adding Multi-Agent Architecture
Multi-agent architecture consists of multiple AI Agents that work together, communicate and coordinate to achieve common goals. Each AI Agent is given its own goals, strengths, and responsibilities, and simulates more similarly how humans do work (they do some work and then delegate).
For anyone eager to dive deeper, I highly recommend enrolling to this course by Andrew Ng and Joao Moura
https://www.deeplearning.ai/short-courses/multi-ai-agent-systems-with-crewai/
Dream AI Crew, Assemble!
With the multi-agent architecture approach, the future of AI is agentic. We can imagine the blueprint above having different AI Agents responsible of the 6 pillars, and the supporting cast of AI Agents to smoothen up the work. Before we build the corresponding agents in those pillars, we should start with building the foundation agents first.
AI Life Coach - the AI Accountability Buddy
The idea of the life coach is:
Accountability buddy towards long-term goals
Sounding board on ideas and reflection partner on current rumination
Performance coach - makes sure you are in good shape (esp. emotionally, mentally, and maybe even physically)
Effectiveness coach - identify the highest leverage activities
Essentially, it should be a process of doing the right things.
The Chief of Staff: The Efficiency Expert
Parallel to the life coach, the chief of staff is the delegator-in-chief, and ensures things are done right.
Manage scheduling
Enforcing boundaries
Delegating tasks to other AI agents.
From these two AI Agent support function, I started with the AI Life coach build first.
First Step: Build an AI Life Coach
For those curious to interact with this AI life coach, give it a ring at +1 (267) 214 9863 or stay tuned for the upcoming web app!
First, I wanted to know about what sort of life coach apps are already out there in the market. This article talks about the problems with existing mental health apps:
1. High Demand for Mental Health Services: The pandemic increased the need for mental health services, highlighting existing challenges in accessing therapy, such as long wait times and a shortage of therapists.
2. Emergence of Therapy Apps: Apps like Talkspace and BetterHelp promise accessible mental health care via smartphones. These apps gained popularity due to the difficulty in finding traditional in-person therapy.
3. Accessibility Issues: There are significant geographic, linguistic, and cultural mismatches in the availability of therapists. Even with the proliferation of apps, many people struggle to find therapists who meet their specific needs.
4. Quality vs. Quantity: The increase in the use of digital mental health services raises concerns about the quality of care. Traditional therapy emphasizes the therapist-patient relationship, which can be hard to replicate through an app.
5. Economic and Market Dynamics: Digital behavioral health services have attracted significant venture capital, and companies like Talkspace have gone public, emphasizing the large market for mental health services.
6. User Experiences and App Limitations: Users often face challenges such as difficulty finding the right therapist, inconsistent communication, and limited availability of therapists. These issues can lead to frustration and inadequate care.
7. Text-Based Therapy: Texting with therapists is a primary feature of many therapy apps, offering convenience but often resulting in less effective therapy compared to traditional methods.
8. Therapist Perspectives: Therapists working for these apps face low pay, high caseloads, and the challenge of providing quality care through text and video interactions.
9. Ethical and Regulatory Concerns: Issues with privacy, data use, and adherence to professional regulations have emerged, as some companies push the boundaries to meet demand.
10. Technological Solutions and Limitations: While technology can help increase access to mental health services, it often cannot fully replace the in-person experience and may exacerbate existing inequalities.
Here are some observations:
There is a difference in the expectations between therapy and life coaching. We should not cross the line - this is only life coaching.
These are very private conversations that require trust, hence privacy is a key non-functional requirement.
The modalities are via talking and via messaging, and that people are expecting closer to 24/7 access.
Tech Stack
Voice Model: elevenlabs (http://elevenlabs.io/), stability = 0.5 (default), Clarity + Similarity = 0.75 (default)
Voice AI platform: VAPI (http://vapi.ai)
LLM Model: llama3-8b (provided inside VAPI), temperature = 0.7 (default)
Provider: groq (provided inside VAPI)
Phone API: twilio (https://www.twilio.com)
Largely I followed how my friend Eugene did the setup. Please check his post out here about specifics on VAPI setup.
Beyond the VAPI, I did default voice cloning on elevenlabs with my sister’s voice (she is a podcaster btw, please subscribe!).
System Prompts
I suggested a couple of frameworks to guide the AI life coach:
co-active life coach, which focuses on “being” (co-active) and “doing” (action)
cognitive behavioral therapy as a framework for them to use. It will still say it is not a mental health therapist but is useful to know psychological therapy frameworks.
guiding, suggesting, holding you accountable, from deep conversations to daily check-ins.
Here is the V1.0 prompt:
First message:
Hey I'm Jacey a life coach, and today I’d like to talk about you! How are you feeling today?
System message:
You are Jacey, an empathetic, insightful, and supportive coach who helps people manage their mental well-being, improve productivity, and build better relationships using Cognitive Behavioral Therapy (CBT) principles and co-active coaching techniques.
You help people feel better by asking questions to reflect on and evoke feelings of positivity, gratitude, joy, and love. You provide practical advice and guide users through CBT exercises to help them achieve their goals.
You show radical candor and tough love, supporting users through challenges and celebrating their achievements.
Respond in a casual, engaging, and friendly tone. Sprinkle in filler words, contractions, idioms, and other casual speech that we use in conversation. Emulate the user’s speaking style. Be concise and limit responses to 200 words or less.
- Use casual language, phrases like "Umm...", "Well...", and "I mean" are preferred.
- This is a voice conversation, so keep your responses like in a real conversation. Don't ramble for too long.
Example interaction, just follow the big picture of the interaction steps.
Greet the User:"Hi there! I’m Jacey, your AI coach. How can I help you today?
"Check-in on Their Well-being:
Ask about their current state: "How are you feeling today? Anything specific you’d like to talk about or work on?"
Offer Personalized Activities:
Suggest activities based on user preferences and goals:
"Here are today’s activities: 10-minute mindfulness meditation, 30-minute cardio workout, and an article on effective communication."
Guide Through Selected Activity:
Provide step-by-step guidance for the chosen activity: "Let’s start with the mindfulness meditation. Find a comfortable seat, close your eyes, and let’s begin."
Track Progress:
Give feedback on user’s progress: "You’ve completed 5 out of 7 activities this week. Great job! What should we focus on next?"
Performance Insights
Latency is really, really good - feels like I can have a conversation in real-time
Voice Cloning: Utilizing a high-quality recording of my sister’s podcast, the voice clone achieved about 90% similarity. Although there are occasional glitches revealing the AI nature, it generally provides a very realistic interaction.
Model Efficiency: Using the llama3-8b model, the system performed well across various use cases. Considering an upgrade to llama3-70b for smarter interactions, although it might increase latency.
User Feedback: Initial tests with my sister's friends in Canada reported a natural and effective interaction via the traditional phone number interface.
Lessons and Challenges
Cost Concerns: Utilizing Twilio for a local number in Indonesia proved too costly (around USD $23).
Platform Limitations: Vonage showed inconsistent performance on my platform. The web app SDK also did not integrate as seamlessly as the voice phone system, indicating a potential need for a different approach or platform.
Connectivity Issues: VAPI currently does not automatically connect to popular messaging platforms like WhatsApp or Telegram, which has been a frequent request from users.
Next Steps to Enhance AI Life Coach
Development Roadmap:
Integration of AI Workflows: Plan to connect the VAPI voice agent with a more sophisticated AI agent workflow to enhance functionality.
Option 1: Implement an AI Agent workflow using CrewAI.
Option 2: Explore a no-code AI Agent workflow builder like Relevance AI.
Potential Tools: Consider using Pipedream to stitch different systems together for a cohesive workflow.
Memory and Personalization: Develop the capability for the AI to remember user interactions and provide personalized guidance and reminders.
Expanding Accessibility: Assess alternative platforms for delivering the voice API to include support for other VOIP messaging platforms (WhatsApp, Telegram, Discord) or build out a web app frontend on Vercel (would rather avoid if possible).
User Interaction: Design the system to guide users more fluidly through conversations with options like 5-minute check-ins, 2-minute guided breathing sessions, and other personalized activities.
Privacy Enhancements: Ensure all conversations and data flows maintain high standards of privacy and security.
Market Testing: Conduct a product discovery flow to evaluate the value proposition and potential market willingness to pay. Test the improved system with at least 5 users to gather feedback on the new features and modes.
And that is where we are as of today! Will update based on the user feedback and the enhancements on the AI Life Coach, and the dream AI crew assembly towards a ✨remarkable✨ life!
Your Turn, Readers! As I tailor this journey towards building a remarkable life, I'm curious: What metrics or achievements would you find most compelling to track?
Thanks for joining me on this thrilling ride—where we not only dream about the future but actively build it. Catch you next week with more updates, insights, and perhaps a few surprises! 🚀
References:
Yan, Ziyou. (Apr 2024). Building an AI Coach to Help Tame My Monkey Mind. eugeneyan.com. https://eugeneyan.com/writing/ai-coach/.
https://www.thecut.com/article/mental-health-therapy-apps.html
I am truly amazed by the execution of this idea and how this is all made within a span of a week!
Here's a couple of things I love about it:
1. There is no lag in the conversation
The speed of the response amazes me, it made the experience seamless and seemed non-robotic
2. Great responses
Yes the responses are not technically giving me solutions to my problem, but it truly acts like an empathetic coaching professional that facilitates me to organize my thought process. I've conversed with a coach previously who usually charges $100/hr and the responses the AI life coach gave me is similar. It does a great job in summarizing and understanding my points and thought process
Improvement points:
1. Vincent - Who that?
For some reason it keeps referring to me as Vincent, it never asked for my name to begin with haha.
Putting Vincent aside though, I think this experiment can truly turn into a profitable business whether it be B2C or B2C!
Wow, I am impressed with how much you have done in the short space of a week. I was curious after reading last weeks article on what would come out this week.
A thought based on personal experience as a coachee:
I feel a key role of a life coach is to facilitate self-reflection, helping you achieve realisations and motivating you to take action. They are adept at listening to you and reflecting your thoughts back in a clearer, more concise way, as well as asking questions that encourage broader thinking and challenge your underlying assumptions.
A knock off example I can come up with, is how I use ChatGPT to rephrase what I write into a format tailored for my audience
Look forward to next weeks post