September 24th, 2015

Add a Conversation to your Android App with Voice Interactions

James Montemagno
Principal Manager, Tech PM

High app engagement rates don’t have to be difficult to achieve. By adding a few additional touches to your app, your users will want to come back time and time again. Google introduced Voice Actions to the Android Google app, enabling users to speak to their device to easily launch apps and allowing developers to set their app to perform specific system actions such as taking pictures, playing music, and more. With Android Wear, these voice actions became extremely important since they were the only real way to interact with apps on the wearable device. However, they were a bit lacking in functionality because there was no additional context given to the developer and no way to carry on a conversation with users. This has all changed with the introduction of Voice Interactions in Android Marshmallow.

Voice Interactions introduce a simple API to extend system Voice Actions and ask follow-up questions that get your users to the right place. For instance, if your user decides to check their step count in a pedometer app, you may want to follow up by asking if they want to see the day, week, or month recap. Voice Interactions can also be used for approval, for example, when your user wants to book a taxi. Your app may report back the time of arrival and estimated cost, and with a simple spoken “Yes,” your user can confirm the action.

Getting Started

If you have already implemented a system or custom Voice Action, you’re ahead of the game. If you haven’t, take a look through the supported system Voice Actions to see if one fits your app. For the example in this post, I have a photo-taking app that any user can launch by saying “OK Google, take a picture.” This is great, but I want to give my users the power to flip between the front-facing and back-facing camera and also give them a keyword to take the picture with their voice.

Adding Voice to Intent Filter

Voice Actions are added as an Intent Filter on top of the Activity that you want launched. For instance, my picture-taking Activity has the following Intent Filters:

[Activity(Label = "Voice Camera", LaunchMode = Android.Content.PM.LaunchMode.SingleTop)]  
[IntentFilter(new []{MediaStore.ActionImageCapture}, Categories = new []{Intent.CategoryDefault})]
[IntentFilter(new []{MediaStore.ActionImageCaptureSecure}, Categories = new []{Intent.CategoryDefault})]
public class VoiceActivity : BaseActivity
{
}

To define these as Voice Interaction activities, all I need to do is add the new category, Intent.CategoryVoice:

[IntentFilter(new []{MediaStore.ActionImageCapture}, Categories = new []{Intent.CategoryDefault, Intent.CategoryVoice})]
[IntentFilter(new []{MediaStore.ActionImageCaptureSecure}, Categories = new []{Intent.CategoryDefault, Intent.CategoryVoice})]

If you have other activities flagged with these intent filters, the Google app will always prefer an activity that has been flagged with the Voice Interaction category.
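For readers more familiar with a hand-written manifest, the attributes above correspond roughly to the following AndroidManifest.xml entry. This is a sketch rather than the exact generated output: the `android:name` value is a hypothetical placeholder, since the real name depends on how the project registers the Activity.

```xml
<!-- Sketch only: android:name is a hypothetical placeholder -->
<activity android:name=".VoiceActivity"
          android:label="Voice Camera"
          android:launchMode="singleTop">
  <!-- MediaStore.ActionImageCapture -->
  <intent-filter>
    <action android:name="android.media.action.IMAGE_CAPTURE" />
    <category android:name="android.intent.category.DEFAULT" />
    <!-- Intent.CategoryVoice: marks the Activity as voice-interaction capable -->
    <category android:name="android.intent.category.VOICE" />
  </intent-filter>
  <!-- MediaStore.ActionImageCaptureSecure -->
  <intent-filter>
    <action android:name="android.media.action.IMAGE_CAPTURE_SECURE" />
    <category android:name="android.intent.category.DEFAULT" />
    <category android:name="android.intent.category.VOICE" />
  </intent-filter>
</activity>
```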

Receiving Voice Interactions

To check whether the Activity was launched by a voice interaction, check the IsVoiceInteraction property in OnResume(). When it is true, the VoiceInteractor is available and requests can be submitted through it. There are several types of requests that can be sent, such as Abort, Command, PickOption, Complete, and Confirmation. A ConfirmationRequest comes in handy when a simple yes/no confirmation is necessary, such as in the case of booking a taxi.

protected override void OnResume()
{
    base.OnResume();
    if (!IsVoiceInteraction)
      return;

    var prompt = new VoiceInteractor.Prompt("A taxi is about 5 minutes away. Do you want to be picked up?");
    var request = new ConfirmTaxiRequest(prompt);
    VoiceInteractor.SubmitRequest(request); 
}

class ConfirmTaxiRequest : VoiceInteractor.ConfirmationRequest
{
    public ConfirmTaxiRequest(VoiceInteractor.Prompt prompt) 
      :base(prompt, null)
    {
    }

    public override void OnConfirmationResult(bool confirmed, Bundle result)
    {
      base.OnConfirmationResult(confirmed, result);
      if (confirmed)
      {
        //Finalize taxi confirmation
        Toast.MakeText(Activity, "Your taxi has been confirmed.", ToastLength.Long).Show();
      }
      else
      {
        Toast.MakeText(Activity, "No taxi ordered.", ToastLength.Long).Show();
      }

      Activity.Finish();
    }

    public override void OnCancel()
    {
      base.OnCancel();
      Activity.Finish();
    }
}

Notice that ConfirmTaxiRequest is an implementation of VoiceInteractor.ConfirmationRequest and holds on to the original Activity, allowing it to be closed when the confirmation is completed.

Multiple Choice Interactions

In my photo-taking app, I want the user to pick the front or rear camera first. This is where a PickOptionRequest comes in handy, since it lets the user speak back one of several options. It’s important to also provide a user interface as a fallback, since users shouldn’t be required to speak commands.

protected override void OnResume()
{
    base.OnResume();
    if (!IsVoiceInteraction)
        return;

    //Send our first request asking for front or rear facing camera to use.
    //Allow multiple synonyms to be accepted
    var front = new VoiceInteractor.PickOptionRequest.Option("Front Camera", 0);
    front.AddSynonym("Front");
    front.AddSynonym("Selfie");
    front.AddSynonym("Forward");

    var rear = new VoiceInteractor.PickOptionRequest.Option("Rear Camera", 1);
    rear.AddSynonym("Rear");
    rear.AddSynonym("Back");
    rear.AddSynonym("Normal");

    var prompt = new VoiceInteractor.Prompt("Which camera would you like to use?");
    //CameraChoiceRequest takes only the prompt and the options
    var request = new CameraChoiceRequest(prompt, new [] { front, rear });
    
    VoiceInteractor.SubmitRequest(request);    
}

protected class CameraChoiceRequest : VoiceInteractor.PickOptionRequest
{
    public CameraChoiceRequest(VoiceInteractor.Prompt prompt, Option[] choices) 
        : base(prompt, choices, null)
    {
    }

    public override void OnPickOptionResult(bool finished, Option[] selections, Bundle result)
    {
        base.OnPickOptionResult(finished, selections, result);

        if (!finished || selections.Length != 1)
            return;
        //User has selected an option and we can specify the front or rear facing camera and add in the camera fragment.
        var fragment = CameraFragment.NewInstance();
        Activity.Intent.PutExtra("android.intent.extra.USE_FRONT_CAMERA", selections[0].Index == 0);
        //pass down the intent extras as arguments.
        fragment.Arguments = Activity.Intent.Extras;
        Activity.FragmentManager.BeginTransaction().Replace(Resource.Id.container, fragment).Commit();
    }
}

Say Cheese!

When the CameraFragment is shown, another voice request can prompt the user to say “cheese” to snap the photo. Adding synonyms allows additional keywords to be accepted.

void StartVoiceTrigger()
{
    var option = new VoiceInteractor.PickOptionRequest.Option("Cheese", 1);
    option.AddSynonym("Ready");
    option.AddSynonym("Go");
    option.AddSynonym("Take it");
    option.AddSynonym("Ok");

    var prompt = new VoiceInteractor.Prompt("Say Cheese");
    Activity.VoiceInteractor.SubmitRequest(new ChoiceRequest(this, prompt, new []{ option }));
}
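The ChoiceRequest class used above is not shown in this post. A minimal sketch of what such a class could look like follows. It is an assumption, not the sample's actual code: for simplicity it takes a callback (the hypothetical `onPicked` parameter) as its first argument instead of the fragment itself, and it is meant to compile inside a Xamarin.Android project targeting Marshmallow.

```csharp
using System;
using Android.App;
using Android.OS;

// Hypothetical sketch: runs a callback once the user speaks the keyword or a synonym.
public class ChoiceRequest : VoiceInteractor.PickOptionRequest
{
    readonly Action onPicked;

    public ChoiceRequest(Action onPicked, VoiceInteractor.Prompt prompt, Option[] choices)
        : base(prompt, choices, null)
    {
        this.onPicked = onPicked;
    }

    public override void OnPickOptionResult(bool finished, Option[] selections, Bundle result)
    {
        base.OnPickOptionResult(finished, selections, result);

        // Ignore partial results and ambiguous selections.
        if (!finished || selections.Length != 1)
            return;

        // The single option ("Cheese" or one of its synonyms) was matched.
        onPicked?.Invoke();
    }

    public override void OnCancel()
    {
        base.OnCancel();
        // Close the Activity if the user cancels the interaction.
        Activity?.Finish();
    }
}
```

With this variant, the fragment would pass its own capture method as the callback, for example `new ChoiceRequest(TakePicture, prompt, new [] { option })`, where `TakePicture` is a hypothetical method on the fragment.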

There you have it! With just a few lines of code, you can have full Voice Interactions for hands-free functionality in Android Marshmallow!

Learn More

To learn more about getting started with Android Marshmallow, be sure to read through the getting started documentation and browse full samples of the latest features of Marshmallow. Google’s Voice Interaction documentation is also worth reading, and I’ve provided a full sample of the photo-taking app on my GitHub.

Author

James Montemagno is a Principal Lead Program Manager for Developer Community at Microsoft. He has been a .NET developer since 2005, working in a wide range of industries including game development, printer software, and web services. Prior to becoming a Principal Program Manager, James was a professional mobile developer and has now been crafting apps since 2011 with Xamarin. In his spare time, he is most likely cycling around Seattle or guzzling gallons of coffee at a local coffee shop. He co-hosts the weekly development podcast Merge Conflict http://mergeconflict.fm.
