AUDIO-VISUAL ENHANCEMENT TO SPEECH RECOGNITION SYSTEMS
RealSpeaker 2.0 uses deep learning technology to help professionals convert speech to text without a keyboard. The company offers audio-visual enhancements for speech recognition systems that analyze audio and video files and output text transcriptions.
Most speech recognition solutions have severe limitations,
such as inaccuracy due to background noise, difficulty
recognizing individual words, and sensitivity to variation in the
way individuals speak (accent and dialect). Today’s commercial
natural language systems, such as Siri, Cortana, and Google Now,
understand only commands, not conversations. The
problem is that people can’t express complex ideas with
Unique Value Proposition
Our patent-pending solution uses video information in addition
to audio information to improve voice recognition accuracy by
20-30 percent compared to competitors.
RealSpeaker 2.0 addresses the problems identified above by
raising speech recognition accuracy to 95% through
video analysis of a user’s lip movements. With RealSpeaker 2.0,
users can enter text of any length into any text editor or
website using voice and video, without the need for a keyboard.
The software also analyzes audio and video files and
transcribes them into text.
US Patent 13/942,689: “System of video enhancement for
audio speech recognition solutions to improve the accuracy of
audio speech recognition due to the analysis of speaker lip
movements” and a Russian patent, “Lips tracker”.
Since its founding, RealSpeaker has released a beta
product for Windows that is currently used by 3,500 paying
users and 4 paying B2B clients.
We have also received a scientific grant from Microsoft’s Seed
Fund and have been accelerated by Startupbootcamp. We
were selected to participate in the Startup Chile program. Enterprise Ireland is also one of our investors.
Business model: Subscriptions & license sales
We currently offer a 14-day free trial of our service.
Afterwards, users can subscribe to the service for a
monthly fee. APIs and an SDK are available to
developers for third-party integration. We
already have 4 paying B2B clients.
Our current investors are Startupbootcamp and Enterprise Ireland.
Moreover, we have signed NDAs with Samsung, LG,
Toyota, Itochu, and Google.
Currently, we are looking for partners as well as
investors for our Series A financing round, which will
be used to further our R&D efforts and increase our
marketing budget to ramp up to our projected
According to TechNavio, only 15-20% of the speech
recognition market potential is currently utilized.
The total global market for speech recognition
technologies was estimated at $1 bn. in 2010 and
will grow to $2.5 bn. (+25% per year) by 2016.
– CEO & Founder: Viktor Osetrov
Our team consists of experts in the field of speech