1
Speech recognition technologies have already become a common part of our daily lives.
We use them while driving, and voice assistants live in the smart speakers in our homes. Smart home control systems are becoming more popular every day.
Some people probably even use the voice assistants on their smartphones.
4
Speech recognition and synthesis technologies certainly make life easier for people who cannot use other input interfaces or cannot see the information displayed on the screen.
That could be our elderly parents or we ourselves, so interface accessibility deserves proper attention.
6
The Web Speech API enables you to incorporate voice data into web apps. It has two parts: speech synthesis (the SpeechSynthesis interface) and speech recognition (the SpeechRecognition interface).
Speech recognition is accessed via the SpeechRecognition interface, which provides the ability to recognize voice context from an audio input (normally via the device's default speech recognition service) and respond appropriately. Generally you'll use the interface's constructor to create a new SpeechRecognition object, which has a number of event handlers available for detecting when speech is input through the device's microphone.
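As a rough illustration of the constructor-plus-handlers pattern described above, here is a minimal sketch. The helper name `createRecognizer` and its two callback parameters are made up for this example; `lang`, `onresult`, `onerror` and the `event.results[0][0].transcript` shape are part of the SpeechRecognition interface.

```javascript
// Hypothetical helper: creates a recognition object and wires basic handlers.
// `Recognition` is the constructor to use; in a browser that would be
// window.SpeechRecognition || window.webkitSpeechRecognition.
function createRecognizer(Recognition, onTranscript, onFailure) {
  const recognition = new Recognition();
  recognition.lang = 'en-US';

  // Fired when the service returns a result; take the top alternative
  // of the first result.
  recognition.onresult = (event) => {
    onTranscript(event.results[0][0].transcript);
  };

  // Fired on errors such as a denied microphone permission.
  recognition.onerror = (event) => {
    onFailure(event.error);
  };

  return recognition;
}
```

In a page this could be used as `createRecognizer(window.SpeechRecognition || window.webkitSpeechRecognition, console.log, console.warn).start()`, assuming the browser exposes one of the two constructors.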
9
This API has a set of standard properties, methods, and events.
Here are some of them:
SpeechRecognition.start()
Starts the speech recognition service listening to incoming audio with intent to recognize grammars associated with the current SpeechRecognition.
SpeechRecognition.stop()
Stops the speech recognition service from listening to incoming audio, and attempts to return a SpeechRecognitionResult using the audio captured so far.
SpeechRecognition.abort()
Stops the speech recognition service from listening to incoming audio, and doesn't attempt to return a SpeechRecognitionResult.
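To make the difference between the three methods more concrete, here is a small sketch of a push-to-talk style controller. The helper `createListeningToggle` is hypothetical and not part of the API; it only calls `start()`, `stop()` and `abort()` and listens for the `end` event. Note that it takes over the `onend` handler of the object passed in.

```javascript
// Hypothetical helper illustrating start(), stop() and abort().
function createListeningToggle(recognition) {
  let active = false;

  // The `end` event fires when the service disconnects, whether that
  // happens after stop(), after abort(), or on its own.
  recognition.onend = () => {
    active = false;
  };

  return {
    isActive: () => active,

    // Start listening, or stop and let the service return a result
    // based on the audio captured so far.
    toggle() {
      if (active) {
        recognition.stop();
      } else {
        recognition.start();
        active = true;
      }
    },

    // Give up without asking for a result.
    cancel() {
      if (active) recognition.abort();
    },
  };
}
```

Tracking the `active` flag also avoids calling `start()` on a session that is already running, which the API rejects with an error.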
10
This is an experimental technology, so its browser support is very limited
(in fact, only Chromium supports it right now).
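Given the limited support, it is probably worth checking for the API before offering any voice feature. The sketch below shows one way to do that; `findSpeechRecognition` and `guardSpeechButton` are hypothetical helper names, and the global object is passed in as a parameter only so the logic can be tried outside a browser.

```javascript
// Returns the recognition constructor exposed by `globalObject`
// (standard or webkit-prefixed name), or null if there is none.
function findSpeechRecognition(globalObject) {
  return (
    globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null
  );
}

// Hypothetical helper: disables a control when recognition is unavailable.
function guardSpeechButton(globalObject, button) {
  const supported = findSpeechRecognition(globalObject) !== null;
  button.disabled = !supported;
  if (!supported) {
    button.title = 'Speech recognition is not supported in this browser';
  }
  return supported;
}
```

In a page, the call would look like `guardSpeechButton(window, someButtonElement)`.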
12
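<script type="text/javascript">
  // Chromium exposes the constructor under the webkit prefix.
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new Recognition();
  recognition.onresult = (event) => {
    if (event.results.length > 0) {
      // Put the top transcript into the search field and submit the form.
      q.value = event.results[0][0].transcript;
      q.form.submit();
    }
  };
</script>
<form action="https://www.example.com/search">
  <input type="search" id="q" name="q" size="60">
  <input type="button" value="Click to Speak" onclick="recognition.start()">
</form>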
13
Speech synthesis is now widely used for various tasks, such as reading text aloud in online translators (I am writing this text in one of them right now) or browsing the web with screen readers.
Speech recognition, however, is used mainly in native applications and is not very common in the browser. This is probably because browsers restrict microphone use for privacy and security reasons and, of course, because of the low browser support.
This makes the API a little difficult to use in practice, but I tried to find some uses of the Speech Recognition API that a web developer might find interesting to implement in one of their pet projects.
14
with the ability to switch between light and dark themes using speech recognition
Let's start by adding markup and some content to the page.
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  ...
  <title>Karasique's Page</title>
</head>
<body class="page theme theme_dark1">
  <button class="button toggler" id="speech-btn">😜</button>
  <h1 class="heading">most famous cat in web</h1>
  <p class="output"></p>
  <section class="section">
    <h2 class="heading"> Meow </h2>
    <div class="wrapper">
      <ul class="list">
        <li class="list__item">
          <button class="button button_red">Meow!</button>
        </li>
        <li class="list__item">
          <button class="button button_green">Meow?</button>
        </li>
      </ul>
      <p class="paragraph"> Meow meow meow meow meow meow meow meoweow meow meow ... meow meow meow meow. </p>
    </div>
  </section>
  <img class="photo" src="/image.png" alt="Karasique" width="700">
  <script src="script.js"></script>
</body>
</html>
15
Now let's add some nice styling
(with CSS custom properties, I really love them).
It's a small fragment of code, but it does the main job:
:root {
  --transition: all 0.4s ease-in-out;
  --image-width: 70rem;
  margin: 0;
  font-size: 62.5%;
  overflow: hidden;
}

.theme_dark {
  --background-default: #232946;
  --text-default: #b8c1ec;
  --text-heading: #fffffe;
  --background-danger: #b011e0;
  --background-success: #3886ec;
}

.theme {
  background-color: var(--background-default, #abd1c6);
  color: var(--text-default, #0f3433);
  transition: var(--transition);
}
16
We got this. Pretty nice.
It's time to do some serious things!
Here we connect and configure our Speech Recognition API:
// Fall back to the webkit-prefixed names used by Chromium.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const SpeechGrammarList = window.SpeechGrammarList || window.webkitSpeechGrammarList;
const SpeechRecognitionEvent = window.SpeechRecognitionEvent || window.webkitSpeechRecognitionEvent;

// The words we want to recognize, described as a JSGF grammar.
const colors = ['dark', 'light'];
const grammar = '#JSGF V1.0; grammar colors; public <color> = ' + colors.join(' | ') + ' ;';

const recognition = new SpeechRecognition();
const speechRecognitionList = new SpeechGrammarList();
speechRecognitionList.addFromString(grammar, 1);

recognition.grammars = speechRecognitionList;
recognition.lang = 'en-US';
// Only deliver final results, not intermediate guesses.
recognition.interimResults = false;
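The grammar passed to `addFromString` is just a string in JSGF format, so it can be built from any word list. Here is a minimal sketch; the helper name `buildJsgfGrammar` is hypothetical, and it produces the same shape of string as the one used in this example.

```javascript
// Hypothetical helper: builds a one-rule JSGF grammar from a list of words.
function buildJsgfGrammar(grammarName, ruleName, words) {
  const alternatives = words.join(' | ');
  return `#JSGF V1.0; grammar ${grammarName}; public <${ruleName}> = ${alternatives} ;`;
}

const colorGrammar = buildJsgfGrammar('colors', 'color', ['dark', 'light']);
console.log(colorGrammar);
// → #JSGF V1.0; grammar colors; public <color> = dark | light ;
```

Adding a new command then only means adding a word to the list.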
17
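And here we listen for a click on the button and start the recognition.
The API will do the rest for us 👍
// Assumed lookups, based on the markup above.
const button = document.getElementById('speech-btn');
const page = document.querySelector('.page');
// `ding` is assumed to be an audio object created elsewhere in the script.

let micOn = false;

const onClick = () => {
  if (!micOn) {
    ding.play();
    recognition.start();
    micOn = true;
  }
};
button.addEventListener('click', onClick);

// Let the next click start listening again once recognition has ended.
recognition.onend = () => {
  micOn = false;
};

recognition.onresult = (event) => {
  // Normalize the transcript: it may contain capitals or extra whitespace.
  const result = event.results[0][0].transcript.trim().toLowerCase();
  if (colors.includes(result)) {
    switch (result) {
      case 'dark':
        page.classList.add('theme_dark');
        break;
      default:
        page.classList.remove('theme_dark');
        break;
    }
  }
};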
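The result handling can also be split into two small functions, which makes it easier to try out without a microphone. This is only a sketch under a few assumptions: the names `themeFromTranscript` and `applyTheme` are hypothetical, and the trimming, lowercasing and stripping of trailing punctuation are a guess at what a recognition service might return.

```javascript
const THEMES = ['dark', 'light'];

// Maps a raw transcript to a known theme name, or null if it is not one.
function themeFromTranscript(transcript, themes = THEMES) {
  const word = transcript.trim().toLowerCase().replace(/[.!?]+$/, '');
  return themes.includes(word) ? word : null;
}

// Adds or removes the `theme_dark` class; does nothing for unknown words.
function applyTheme(classList, theme) {
  if (theme === null) return;
  classList.toggle('theme_dark', theme === 'dark');
}
```

Inside `recognition.onresult` this would be called as `applyTheme(page.classList, themeFromTranscript(event.results[0][0].transcript))`.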