A few weeks ago, I was catching up with a pal, and he brought up a not-so-distant future where teddy bears with LLM capabilities will be the norm. At first, I was horrified, but then I loved the thought. I couldn’t help but think about the Seth MacFarlane masterpiece, “Ted”, and the idea brought me real joy and excitement, so I decided to dig a bit more into it.
Some History
1902: The teddy bear was invented, named after President Theodore “Teddy” Roosevelt. That's a nice little factoid.
1950s-1960s: Early talking toys emerged during this period. These were often pull-string dolls that played a pre-recorded message when a string was pulled. One of the earliest talking dolls was “Chatty Cathy” by Mattel, introduced in 1960. Super creepy…
1985: Teddy Ruxpin became the first commercially successful talking teddy bear. Teddy Ruxpin used a cassette tape player built into its back to tell stories. The bear’s eyes and mouth moved in sync with the audio, creating a lifelike experience. Also notoriously creepy…
2000s: Voice recognition and digital audio storage replaced older mechanical and cassette-based systems. Toys like “Furby,” which first launched in 1998, incorporated digital speech and interactivity. Less creepy.
2010s: The introduction of AI and smart technology brought a new generation of talking teddy bears. An impressive example for its time is the “CogniToys Dino,” which used IBM Watson’s AI to have intelligent conversations with kids. It feels slow and clunky today, but it was definitely ahead of its time.
2020s: Only time will tell, but it's clear that the advent of LLMs and improvements in speech recognition and text-to-speech all make for an interesting case that the timing is now. The age of responsive, low-latency, clever, and hopefully not-so-creepy plushies is upon us.
SR + LLM + TTS = AI Plushy
Speech Recognition (SR): The program inside the bear needs to be able to hear speech and put it into text. It needs to do so quickly and with little to no errors. There are lots of options on the market; notably, OpenAI’s Whisper is available to the open-source community, and it's blazing fast.
LLM: That speech the SR program brilliantly transcribed must then be processed by a large language model. This can be GPT-3.5 or GPT-4, but that really depends on what you want the plushy to accomplish. Some open-source models could also do the trick. The tricky part, which will take a nice chunk of data, is fine-tuning said model. Fine-tuning helps ensure that your plush won’t be teaching a kid how to build an atomic bomb or about the birds and the bees.
Text-to-Speech (TTS): TTS is what takes the output from the LLM and brings it to life. The last thing you want is a plushy that sounds like Siri or worse, like HAL 9000 from 2001: A Space Odyssey. You’re going to want your plush to sound fun, excited, or cute, and maybe down the line you’ll need it to sound like Mickey Mouse or Buzz Lightyear. There is so much work being done all throughout Silicon Valley on TTS that it is hard to keep track. Some impressive, customizable solutions include ElevenLabs, Cartesia AI, and Google TTS, which is a bit more on the HAL 9000 end of the spectrum.
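To make the SR + LLM + TTS chain concrete, here is a minimal Python sketch of how the three stages could be wired together. Everything in it is hypothetical: the PlushPipeline class and the three stand-in functions are placeholders made up for illustration, not any vendor's API. In a real plush, each stage would be swapped for an actual speech recognizer, language model, and voice engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlushPipeline:
    """Hypothetical wiring of the three stages; not a real library."""
    transcribe: Callable[[bytes], str]   # SR: audio in, text out
    respond: Callable[[str], str]        # LLM: child's words in, reply out
    synthesize: Callable[[str], bytes]   # TTS: reply text in, audio out

    def handle_utterance(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)    # 1. hear the child
        reply = self.respond(text)       # 2. decide what to say
        return self.synthesize(reply)    # 3. say it out loud


# Stand-in stages so the sketch runs without any model or network access.
# Here "audio" is just UTF-8 text, which is an assumption for demo purposes.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(reply: str) -> bytes:
    return reply.encode("utf-8")


plush = PlushPipeline(fake_transcribe, fake_respond, fake_synthesize)
print(plush.handle_utterance(b"hello bear"))
```

Keeping each stage behind a plain function makes it easy to try a different SR, LLM, or TTS option without touching the other two.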
To be Online or to be Offline? That is the question
The billion-dollar question here is: Should an AI plush connect to the internet?
The quick answer is, “Yes, of course!” You want the ability to collect data, push software and firmware updates, and have access to the best LLMs, TTS, and SR options. These are all fair considerations.
But the case can definitely be made for creating an offline plush. This means it can be used anywhere without Wi-Fi. The different components run locally on the edge, as opposed to in the cloud. Small LLMs exist that can run on tiny computer chips, and most of the above components can be found open source. The TTS aspect would take considerable engineering prowess and time, but it’s doable.
The real advantage of going the offline direction is that you would no longer need to depend on OpenAI, Anthropic, or any of the big model providers for answers.
To put it simply: If you use GPT-3.5 and a TTS API like the ones mentioned above, you are paying $0.00158 per word. Sounds like chump change, but we are not building one plush. We want this thing to be in the hands of hundreds of thousands, if not millions, of happy kids.
I ran a quick scenario. Let's say 100,000 kids have the plush and they get 100 words a day out of it (which is conservative). The costs would look as follows:
100,000 kids x 100 words per day = 10 million words per day
10 million words/day x $0.00158 per word = $15,800 per day
Which comes out to:
- Daily Cost: $15,800
- Monthly Cost: $474,000
- Annual Cost: $5,767,000
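For anyone who wants to play with the assumptions, the same arithmetic can be sketched in a few lines of Python. The function name projected_costs and its parameters are made up for this example. The 30-day month and 365-day year are the conventions that reproduce the figures above, and the per-child number is simply the monthly cost divided by the number of kids.

```python
from decimal import Decimal

# Blended LLM + TTS cost per word, taken from the estimate in this post.
COST_PER_WORD = Decimal("0.00158")


def projected_costs(kids: int, words_per_day: int,
                    cost_per_word: Decimal = COST_PER_WORD) -> dict:
    """Return daily, monthly (30 days) and annual (365 days) API costs in dollars."""
    daily = kids * words_per_day * cost_per_word
    monthly = daily * 30
    return {
        "daily": daily,
        "monthly": monthly,
        "annual": daily * 365,
        "per_kid_monthly": monthly / kids,
    }


costs = projected_costs(kids=100_000, words_per_day=100)
for label, value in costs.items():
    print(f"{label}: ${value:,.2f}")
```

With the numbers above, that works out to about $4.74 per child per month in API costs alone, which is a useful floor when thinking about pricing.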
The answer isn't so clear-cut, but my guess is that both offline and online will have a place in the market. An online version will need to support some kind of monthly subscription fee, which means the plush will need to have real value-add for users/parents. It could be an AI Plush Tutor that justifies monthly costs. Other AI toys could just be for fun, and for those an offline approach could make more sense.
This is fun. I know! But is this a good business to get into?
Could be. But most signs point to the sad fact that toy companies are really, really hard.
LEGO, the gold standard in the toy industry, boasts $9 billion in revenue with 25-30% operating margins and, although it is privately held, is estimated to be worth 3-4x revenue. Hasbro trades at 1.5-2x revenue with 15-18% margins, and Mattel at 1-1.5x revenue with 10-12% margins.
Toy companies face lower margins and revenue multiples due to high production costs, intense competition, and seasonal sales patterns. Licensing fees, rapid obsolescence, and pressure from major retailers squeeze profitability even further. They’re not SaaS companies with 90% gross margins. They’re toy companies.
It’s not all gloomy; it’s just not going to be easy. I fundamentally believe an insane team is going to take on this challenge and find the right combination of technical and educational magic to build an enduring business, one whose subscription and monthly recurring revenue model can be more profitable than the toy companies of the past.
If you’re building in this space, don’t hesitate to reach out.