Amazon Brings Emotion to AI Voice
Plus: Use Deep Research on Gemini for better information synthesis, The State of AI 2025: Smarter, Cheaper, Riskier, and more!

Hello there! Ready to dive into another upgrading, mind-boggling, and value-filled Shift?
Today we have:
Amazon's Nova Sonic Brings Emotion to AI Voice
Use Deep Research on Gemini!
The State of AI 2025: Smarter, Cheaper, Riskier
Tools and Shifts you Cannot Miss
Amazon's Nova Sonic Brings Emotion to AI Voice
Insights from Amazon
Amazon has unveiled Nova Sonic, a new foundation model that unifies speech understanding and generation, allowing AI to respond not just to what you say, but to how you say it. With faster response times and deeper acoustic awareness, this model marks a new phase in voice-first AI experiences.

The Decode:
1. A single model that listens and speaks - Nova Sonic merges speech-to-text, LLM understanding, and voice generation into one model. It adapts in real time to tone, pauses, and emotional cues, creating conversations that sound more human and less robotic.
2. Fast, accurate, and nuanced across use cases - Nova Sonic delivers responses with just 1.09 seconds of latency and outperforms GPT-4o in noisy environments with 46.7% better accuracy. It excels at dynamic, multi-turn interactions, helping AI agents in industries like travel and enterprise respond appropriately to emotional shifts in conversation.
3. Reel 1.1 unlocks longer, smarter video creation - Amazon also rolled out Nova Reel 1.1, boosting video generation lengths to 2 minutes. Users can now storyboard with shot-by-shot control or generate a full clip from a single prompt.
4. Lower cost, higher value for developers - Available through Amazon Bedrock, both models cost significantly less than OpenAI equivalents; Nova Sonic is reportedly 80% cheaper. Along with browser automation tools like the Act SDK, Amazon is building an integrated AI ecosystem with developer utility at the core (a minimal Bedrock sketch follows below).
The shift toward emotional, real-time interaction marks a key leap in how AI will integrate into daily tools, from customer service to creative content.
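For developers who want to poke around, here is a minimal sketch (assuming boto3 is installed and your AWS credentials have Bedrock access in a region where Nova is offered) that simply lists the Nova models visible to your account. Real-time speech with Nova Sonic itself runs over Bedrock's streaming interfaces, which need more plumbing than this snippet shows.

```python
# Minimal sketch: discover which Amazon Nova models your account can reach
# through Bedrock. The region and credential setup are assumptions; adjust
# them to your own environment.
import boto3

# The "bedrock" client exposes model metadata; inference goes through the
# separate "bedrock-runtime" client.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# List Amazon-provided foundation models and keep the Nova family.
response = bedrock.list_foundation_models(byProvider="Amazon")
nova_models = [
    summary["modelId"]
    for summary in response["modelSummaries"]
    if "nova" in summary["modelId"].lower()
]

for model_id in nova_models:
    print(model_id)
```

If the list comes back empty, the usual culprits are region availability or model access that has not been enabled in the Bedrock console.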
Use Deep Research on Gemini!
Gemini Advanced now has Deep Research with 2.5 Pro, and it's insane. Each report includes the sources cited, the sources read but not used, and the model's thought flow.
This gives you a good sense of how trustworthy the information is and makes it easier to manually pull details from the right sources if you want to dig deeper.
It also allows you to export the whole research report as a doc and create an audio overview.
Here's a step-by-step guide on how to use it (a minimal API sketch follows the steps):
1. Log into your Gemini account.
2. Select Deep Research with 2.5 Pro as the model.
3. Prompt your research topic.
4. A document is created in the chat where you can track the progress of the research.
5. Once the report is ready, click Export report to save it as a Doc, or use the dropdown to generate an audio overview.
Try it out now!
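Deep Research itself lives in the Gemini app rather than the public API, but if you want to script a rough approximation of this workflow, here is a minimal sketch using the google-generativeai Python SDK. The API key placeholder, the "gemini-2.5-pro" model identifier, and the prompt are assumptions for illustration, not part of the Deep Research feature.

```python
# Minimal sketch: send a research-style prompt to Gemini 2.5 Pro via the
# google-generativeai SDK. This only approximates Deep Research, which
# (with its source lists and thought flow) is a Gemini app feature.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")          # placeholder credential
model = genai.GenerativeModel("gemini-2.5-pro")  # assumed model identifier

prompt = (
    "Research the current state of voice-first AI assistants. "
    "Summarize the key findings and list the sources you relied on."
)

response = model.generate_content(prompt)
print(response.text)
```

The app's Deep Research flow still gives you the richer artifacts (source breakdowns, exportable Docs, audio overviews), so treat this as a scripting convenience rather than a replacement.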
The State of AI 2025: Smarter, Cheaper, Riskier
Stanford's 2025 AI Index Report shows how artificial intelligence is accelerating at breakneck speed: massive gains in capability, plunging costs, booming business use, and rising global competition. But that rapid growth comes with safety blind spots, soaring training costs, and a narrowing margin for error.
The Decode:
1. AI performance exploded, but reasoning still lags - AI systems crushed major benchmarks in 2024: SWE-bench scores jumped from 4.4% to 71.7%, and in coding tasks, agents outperformed humans. Yet models still struggle on complex reasoning benchmarks like PlanBench, with top scores under 9%.
2. Business adoption surged as AI got cheaper - 78% of organizations used AI in 2024 (up from 55%), with generative AI adoption doubling to 71%. GPT-3.5-level inference costs dropped over 280x in two years. Still, most firms only see modest ROI so far: under 10% in cost savings and 5% in revenue gains, according to recent studies.
3. Global race intensifies, led by the U.S., but China's close - The U.S. released 40 top models vs. China's 15, but Chinese models nearly closed the quality gap (down to 1.7% on key benchmarks). Open-weight models are approaching closed models in performance. Training compute now doubles every 5 months, tightening the innovation gap across companies and countries.
4. Costs, regulation, and safety are lagging - Training costs for frontier models crossed $170M (Llama 3.1), with future models projected to hit $1B. Reported AI incidents rose 56% last year, and only 4 of 221 proposed U.S. AI laws passed. New safety benchmarks like HELM Safety and AIR-Bench are emerging, but standardized adoption is still rare.
AI is quickly becoming foundational infrastructure, like electricity or the internet, but its development is outpacing safeguards. Businesses need to adopt fast to stay competitive.
AI Tools for the Shift
Motion Expert Agents – Analyze, optimize, and improve ad performance with AI workflows crafted by the world's best creative strategists. Get your early access now!
Cerebro – Turn scattered data into clear, connected insights you can act on.
Simplify decision-making with AI-powered information mapping.
Victoria by VersionSeven – Execute complex workflows across your entire toolstack via simple chat.
VideoInsight – Unlock key insights from your video content with advanced AI analysis.
Automate video review, save time, and scale smarter.
Alpha Vision – Next-gen physical AI built to boost security and ROI, enhancing surveillance and safety with intelligent, real-time detection.
Quick Shifts
Microsoft is expanding Copilot Vision beyond the Edge browser, enabling the AI assistant to "see" and interact with any app or screen on Windows. In testing with U.S. users, Copilot can now help with in-app guidance, file search, and more.
67% of Americans don't trust AI to make life-or-death decisions, with most preferring humans for tasks like jury duty, medicine, and legislation. Despite growing use, AI still lacks public confidence in handling high-stakes responsibilities or complex, nuanced judgment.
Nvidia's new open-source model, Llama-3.1 Nemotron Ultra, delivers top-tier performance with just 253B parameters, less than half of DeepSeek R1's. It scored 97% on MATH500 and 66.31% on LiveCodeBench while running cost-effectively on a single 8x H100 node.
Samsung is integrating Google's Gemini AI into its home robot Ballie, enabling it to process audio and video and deliver personalized suggestions. From outfit feedback to sleep tips, Ballie is evolving into a multimodal, AI-powered home companion.
Google's Gemini Code Assist now includes "agentic" capabilities, enabling it to take multi-step actions like building apps from Docs or converting code across languages. These AI agents can plan tasks, add features, review code, and generate tests.
That's all for today's edition. See you tomorrow as we track down and get you all that matters in the daily AI Shift!
If you loved this edition, let us know how much:
How good and useful was today's edition?
Forward it to your pal to give them a daily dose of the Shift too!