As developers, we adapt as technologies move from the realm of science fiction into readily available SDKs. That's certainly, or perhaps especially, true for speech technologies. In the past five years, devices have become more personal and demanding of new forms of interaction.
In Windows 10, speech is front-and-center with the Cortana personal assistant, and the Universal Windows Platform (UWP) gives us several ways to plug into that "Hey, Cortana" experience. But there's much more that we can do when working with speech from a UWP app, and that's true whether working locally on the device or remotely via the cloud.
In this 3-part series, we will dig in to some of those speech capabilities and show that speech can be both a powerful and a relatively easy addition to an app. This series will look at:
- the basics of getting speech recognized
- how speech recognition can be guided
- how we can synthesize speech
- additional capabilities in the cloud for our UWP apps
In today's post, we'll start with the basics.
Just because we can doesn't always mean we should
Using a "natural" interaction mechanism like speech requires thought and depends on understanding users' context:
- What are they trying to do?
- What device are they using?
- What does sensor information tell us about their environment?
As an example, delivering navigation directions via speech when users are driving is helpful because their hands and eyes are tied up doing other things. It's less of a binary decision, though, if users are walking down a city street with their devices held at arm's length; speech might not be what they are looking for in that context.
Context is king, and it's not always easy to get it right, even with a modern device that's packed with sensors. Consider your scenarios carefully and look at our guidance
around these types of interactions before getting started.