I recently completed the Conversational Design course with UX Academy, and wanted to share my experience designing and building for Voice User Interfaces (VUI). I took the course to gain exposure to this fast-growing technology and better understand how it can be used to solve customers’ problems and improve relationships with them.

So, if you’re even remotely curious about designing for voice-first experiences, this is for you. I’ve also included free resources at the end, to help you get started.

Finding the right use case

Like any other UX project, user research is a key part of designing an effective voice experience. In my case, I was designing for Hopper, the mobile-only travel booking app. I started by conducting user interviews to better understand users’ experience of planning a holiday trip.

User insights

  • The flight prices shown on most travel aggregators aren’t necessarily accurate, because baggage costs are often not included. Additionally, each airline has its own baggage rules, so users have to dig up this information on the airlines’ websites, which isn’t a great experience.
  • Another pain point with these aggregators is that they don’t factor the cost of transport to and from the airport into the total flight price, which makes it harder for users to get a realistic price and accurately compare their flight options.

Through context mapping, I then started framing these insights into “How Might We” (HMW) opportunities.

I then sketched some ideas to address the chosen HMW (with the green dot), and with the help of fellow course participants we role-played what the conversation between the voice assistant and the user might look like. That’s when I realised that asking users about their flight preferences, so they could more accurately compare flight options, wasn’t the right use case for voice.

Limitations of Voice

It turns out that voice as a medium can be quite limiting because, unlike websites and apps:

  • You can’t skim through the information provided.
  • There is a higher reliance on recall than on recognition, which adds to users’ cognitive load.
  • There is low discoverability of affordances and constraints, and
  • People can’t listen as quickly as they can read.

So, to find the right use case, I had to be mindful of these limitations and play to voice’s strengths. As Nandini Stocker said: “Users make conscious trade-offs when choosing what type of interface to use”. Therefore, users will only use voice if they can get through the conversation faster than by typing, if the information presented is concise and easy to retain, and if the interactions needed to achieve their goals are short and frequent.

User Journey

With the above in mind, I mapped a user journey that uses voice to enhance the existing journey and meet users’ need to stay up to date with the latest price changes for flights they are currently “watching” (i.e. their saved trips) via the Hopper app.

In the outlined scenario, I assumed that the typical user would have created an account, uploaded a payment card and saved at least one trip in the app.

But if any of these assumptions didn’t hold, the experience would be compromised. To mitigate this, I created an onboarding screen that would both raise awareness of the Hopper Skill and educate the user on how best to use it.

I also designed the account-linking screens to prompt the user to link their Hopper account with Alexa, after they’ve signed up.

FYI: if you’re wondering what the hell a “Skill” is and how it works, at least conceptually, the next section will help. Otherwise, feel free to skip it.

Voice Fundamentals

A skill is essentially something Alexa is capable of doing. There are three types of skills:

  1. Enabled by default (Amazon Music, Weather, Alarm)
  2. Internet of Things (Turning on/off lights) and,
  3. Third Party Skills (Vodafone, Fitbit, Uber, Spotify)

If you guessed that the Hopper skill is a third-party skill, you’d be correct.

The interaction model of a skill is made up of 3 elements:

  • Intent — what the user can do with the skill.
  • Utterance — what the user can say to invoke an intent.
  • Slot — an argument to an intent, which gives Alexa more information about that intent.

This can be better understood through an example:
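To make the three elements concrete, here is a minimal Python sketch of an interaction model for the price-watching use case. Everything in it is an assumption made for illustration: the intent name `CheckTripPriceIntent`, the sample utterances, the `destination` slot and the `resolve` helper are all hypothetical, not Hopper’s or Alexa’s real model. It also matches utterances with simple patterns, whereas a real voice assistant uses language understanding that generalises beyond the samples you give it.

```python
import re

# Hypothetical interaction model, for illustration only. The intent name,
# sample utterances and slot are assumptions, not a real Hopper skill.
INTERACTION_MODEL = {
    "CheckTripPriceIntent": {                      # intent: what the user can do
        "utterances": [                            # utterances: ways to say it
            "what is the latest price for my trip to {destination}",
            "has the price changed for my {destination} trip",
        ],
        "slots": ["destination"],                  # slot: an argument to the intent
    },
}


def utterance_to_pattern(utterance):
    """Turn a sample utterance into a regex with one named group per slot."""
    # Splitting on {slot} placeholders puts literal text at even positions
    # and slot names at odd positions.
    parts = re.split(r"\{(\w+)\}", utterance)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += f"(?P<{part}>.+)"
    return re.compile(f"^{pattern}$", re.IGNORECASE)


def resolve(spoken_text):
    """Return (intent_name, slot_values) for the first matching utterance."""
    cleaned = spoken_text.strip().rstrip("?.!")
    for intent_name, intent in INTERACTION_MODEL.items():
        for utterance in intent["utterances"]:
            match = utterance_to_pattern(utterance).match(cleaned)
            if match:
                return intent_name, match.groupdict()
    # Nothing matched: the skill would need a fallback or a re-prompt.
    return None, {}
```

With this sketch, `resolve("Has the price changed for my New York trip?")` returns the intent `CheckTripPriceIntent` with the slot value `{"destination": "New York"}`, while a phrase the model doesn’t cover, such as “play some music”, returns no intent at all.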

Prototyping

This was probably the most exciting part. I prototyped the Hopper skill using Voiceflow, which is fairly straightforward and makes it easy to put together a robust conversation that accounts for the various ways users might express their intents.

Since I didn’t pay for the pro version of the tool, I wasn’t able to share my skill, so I thought I’d record it and upload it to YouTube instead (see below). Included are three short conversations:

Conversation 1: First-time User, Happy Path

Conversation 2: Regular User, Happy Path

Conversation 3: Regular User, Unhappy Path

While I got pretty close to simulating what the real solution could look like, the main difference is that, in reality, the user would be able to link their Hopper account with Alexa. The skill could then request access to user info such as their name and notifications, making the conversation feel even more natural and intuitive.

Hope you found this useful, and feel free to reach out if you want to find out more about the course or the project!

Resources:

As promised, here are some free resources to learn more about conversational design:

Articles and keynotes

What is VUI?

Conversational Interfaces Explained

Conversations with Machines — Keynote by Nate Clinton

Books

Designing Voice User Interfaces

Podcasts

VUX World