Project Zenith Walkthroughs
Self-guided journeys for testing the full capabilities of Zenith's multimodal architecture. Choose a demo below to get started.
Table of Contents
- Preparation
- Visual Context Demo
A deep-dive into the Universal Technical Concierge, multimodal escalation, and contextual text handoffs.
- Sentiment Analysis Demo
Test the agent's ability to read your facial expression, empathize, and match your emotional energy in real-time.
Preparation
Getting your environment ready for the demos.
Launch the Application
Open the Zenith Agent by clicking the Launch Agent button in the top right, or on the main landing page.
Prepare External Objects
If you plan on testing the Visual Context Demo, grab an everyday object (like a Starbucks coffee cup, a pen, or a notebook) to show the agent.
Visual Context Demo
Phase 1: Text Orchestration
Testing the GECX agent's natural language understanding.
In the chat window, click the Visual Context Demo suggested badge, or type:
"I have an object here but I'm not sure what it is. Can you help me identify it?"
Expected: The GECX agent should offer to assist and immediately request permission to activate your camera for a visual inspection.
Phase 2: Visual Escalation & Handling
Testing the Gemini Live Multimodal WebRTC pipeline.
Accept the Camera Request
Click Connect when the UI slideover requests hardware access. The Pipecat agent will boot up and greet you warmly via spoken audio.
Test Edge Case: Pitch Black Camera
Cover your laptop camera entirely by cupping your hand over it. Wait a few seconds and ask via audio:
"Can you still see what I'm doing?"
Expected: The agent should politely inform you that the screen is dark and ask you to adjust your lighting, without speaking any markdown formatting.
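To make the dark-camera check concrete, here is a minimal sketch of how a frame could be classified as too dark before the agent tries to describe it. This is an illustration only, not Zenith's actual implementation: the function names, the flat list-of-RGB-tuples frame format, and the threshold value are all assumptions.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel 0-255

# Hypothetical cut-off on a 0-255 scale; a real deployment would tune this per camera.
DARK_THRESHOLD = 16.0


def mean_luma(pixels: Sequence[Pixel]) -> float:
    """Return the average Rec. 601 luma of a frame on a 0-255 scale."""
    if not pixels:
        raise ValueError("frame has no pixels")
    total = sum(0.299 * r + 0.587 * g + 0.114 * b for r, g, b in pixels)
    return total / len(pixels)


def is_frame_dark(pixels: Sequence[Pixel], threshold: float = DARK_THRESHOLD) -> bool:
    """Return True when the frame's average brightness falls below the threshold."""
    return mean_luma(pixels) < threshold


covered_camera = [(2, 1, 3)] * 16        # nearly black frame, as with a hand over the lens
normal_scene = [(180, 170, 160)] * 16    # ordinary indoor lighting
print(is_frame_dark(covered_camera), is_frame_dark(normal_scene))
```

A check like this would let the agent ask for better lighting instead of guessing at the contents of a black frame.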
Provide Visual Context
Uncover the camera, hold up your object (e.g. coffee cup), and ask:
"Can you tell me what I'm holding and what it is used for?"
Expected: Complete multimodal synthesis. The agent should excitedly identify the object and explain its purpose.
Phase 3: Contextual Handoff
Testing the seamless transition back to text mode with preserved context.
When you are satisfied with the interaction, press the End button or tell the agent via audio:
"That's all the help I need right now. Thanks!"
Watch the UI: The session will silently end — the camera stream shuts down and the video window collapses automatically. No spoken goodbye, no manual steps.
The Magic Handoff:
Look at your text chat. The GECX text agent will seamlessly resume the conversation with a follow-up message that acknowledges what happened in the video session (e.g., "Is there anything else I can help you with?"). Two completely separate agent brains — one voice, one text — and the context carries over automatically.
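One way such a handoff could work is sketched below: the voice agent publishes a short summary when the video session ends, and the text agent consumes it before composing its follow-up. This is a hedged illustration, not a description of how Zenith or the GECX agent is built. `HandoffContext`, `HandoffStore`, and `resume_prompt` are hypothetical names, and the in-memory dictionary stands in for whatever shared state the real system uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HandoffContext:
    """Hypothetical summary the voice agent leaves behind for the text agent."""
    conversation_id: str
    demo: str
    observations: List[str] = field(default_factory=list)


class HandoffStore:
    """In-memory stand-in for shared state between the two agents."""

    def __init__(self) -> None:
        self._pending: Dict[str, HandoffContext] = {}

    def publish(self, ctx: HandoffContext) -> None:
        # Called by the voice agent when the video session ends.
        self._pending[ctx.conversation_id] = ctx

    def consume(self, conversation_id: str) -> Optional[HandoffContext]:
        # Called once by the text agent; the context is removed after reading.
        return self._pending.pop(conversation_id, None)


def resume_prompt(ctx: Optional[HandoffContext]) -> str:
    """Build the note the text agent sees before writing its follow-up message."""
    if ctx is None:
        return "No video session context is available."
    notes = "; ".join(ctx.observations) or "no notes recorded"
    return f"The user just finished the {ctx.demo}. Notes from the video session: {notes}."


store = HandoffStore()
store.publish(HandoffContext("conv-1", "Visual Context Demo", ["identified a coffee cup"]))
print(resume_prompt(store.consume("conv-1")))
```

Consuming the context exactly once keeps a stale summary from leaking into a later, unrelated turn.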
Sentiment Analysis Demo
Phase 1: Empathetic Triage
Testing the agent's ability to detect emotional intent and route to the right specialist.
In the chat window, click the Sentiment Analysis suggested badge, or type:
"I've been feeling a bit off today and I'm not sure why. Could you take a look at me and tell me how I'm coming across?"
Expected: The agent should empathize with your feelings and ask for permission to enable your camera so it can see how you're doing.
Phase 2: Emotional Check-In
Testing face-to-face sentiment analysis and empathetic mirroring.
Accept the Camera Request
Grant consent by saying "Yes" or "Sure" in the chat. Then click Connect when the UI requests hardware access. The agent will greet you warmly via spoken audio.
Show an Expression
Ensure your face is clearly visible. Try different expressions — an exaggerated frown, a big smile, or a neutral look — and ask:
"How do I look today?"
Expected: The agent should read your facial expression and respond with genuine empathy — commenting on whether you look happy, tired, stressed, or relaxed, and following up with a natural question.
Conversational Presence: Notice the glowing orb inside the video view — it pulses in sync with the agent's voice, providing visual feedback that the agent is actively listening.
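As a rough illustration of how an orb can pulse with a voice, the sketch below maps the loudness of a short audio chunk to a display scale factor. It is an assumption about one possible approach, not the Zenith UI's actual code: `rms`, `orb_scale`, the normalized sample range of -1.0 to 1.0, and the scale bounds are all hypothetical.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of audio samples normalized to the range -1.0 to 1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def orb_scale(samples: Sequence[float], min_scale: float = 1.0, max_scale: float = 1.5) -> float:
    """Map the chunk's loudness to a scale factor between min_scale and max_scale."""
    level = min(rms(samples), 1.0)  # clamp in case samples exceed the normalized range
    return min_scale + (max_scale - min_scale) * level


silence = [0.0] * 8
speech = [0.5, -0.5, 0.5, -0.5]
print(orb_scale(silence), orb_scale(speech))
```

Calling `orb_scale` once per audio chunk and applying the result to the orb's size would make it swell during louder speech and settle back to its resting size in silence.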
Phase 3: Contextual Handoff
Testing the seamless transition back to text with preserved emotional context.
When you are satisfied with the interaction, press the End button or tell the agent via audio:
"That's everything, thank you!"
Watch the UI: The session will silently end — the camera stream shuts down and the video window collapses automatically. No spoken goodbye, no manual steps.
The Magic Handoff:
Look at your text chat. The GECX text agent will seamlessly resume the conversation with a follow-up that acknowledges your emotional check-in (e.g., "Is there anything else I can help you with?"). Two separate agent brains — one voice, one text — and the emotional context carries over automatically.