Windows PowerShell and the Text-to-Speech REST API (Part 5)

Summary: Send and receive content to the Text-to-Speech API with PowerShell.

Q: Hey, Scripting Guy!

Could you give a buddy a hand in getting the last pieces together for the Text-to-Speech API?

—SR

A: Hello SR,

No problem at all. In the last few posts, we dealt with the “heavy lifting” (which really wasn’t that heavy):

Authentication
Creating the headers
Defining the SSML body

Now at this point, we only need to call up the REST API to have it do the magic. So just like our Authentication API, we have the following pieces:

Endpoint
Method
Authentication
Headers
Body

We have the final three already. In a previous post, we built the headers (Application ID, GUID, and audio format), and we’ve just finished building the body. The authentication token is contained within the headers.

What we need to know now is the endpoint, and what Invoke-RestMethod needs. This can all be found in the Bing Text to Speech API documentation, under the “Authorization Token” section.

The endpoint is https://speech.platform.bing.com/synthesize. To determine the method required, just glance at the “Example: voice output request.” It shows you the method as the first line.

In many REST API examples, a sample request starts with a request line, and the first word on that line is the method (GET, POST, and so on). You can use this trick in other scenarios with REST APIs in general.
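If you want to see that trick in script form, here is a minimal sketch. The function name Get-RequestMethod and the abbreviated sample request are made up for illustration; they are not taken from the API documentation.

```powershell
# Hypothetical helper: returns the HTTP method from the first line of a raw sample request
function Get-RequestMethod {
    param([Parameter(Mandatory)][string]$SampleRequest)

    # The first non-empty line is the request line, for example "POST /synthesize HTTP/1.1"
    $RequestLine = $SampleRequest -split "`r?`n" |
        Where-Object { $_.Trim() } |
        Select-Object -First 1

    # The method is the first whitespace-separated word on that line
    ($RequestLine.Trim() -split '\s+')[0].ToUpperInvariant()
}

# Illustrative sample only; a real example in the documentation will carry more headers
$Sample = @'
POST /synthesize HTTP/1.1
Host: speech.platform.bing.com
Content-Type: application/ssml+xml
'@

Get-RequestMethod -SampleRequest $Sample   # POST
```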

All we need to do now is assemble the pieces, and call up Invoke-RestMethod. To avoid you running back and forth, I’ve assembled the entire script:

Try
{
    [string]$Token=$NULL

    # Rest API Method
    [string]$Method='POST'

    # Rest API Endpoint
    [string]$Uri='https://api.cognitive.microsoft.com/sts/v1.0/issueToken'

    # Authentication Key
    [string]$AuthenticationKey='13775361233908722041033142028212'

    # Headers to pass to Rest API
    $Headers=@{'Ocp-Apim-Subscription-Key' = $AuthenticationKey }

    # Get Authentication Token to communicate with Text to Speech Rest API
    [string]$Token=Invoke-RestMethod -Method $Method -Uri $Uri -Headers $Headers
}
Catch [System.Net.Webexception]
{
    Write-Output 'Failed to Authenticate'
}

$AudioOutputType='riff-16khz-16bit-mono-pcm'
$XSearchAppID='dccd93ecb3cf4535aac9350c9b5fb2f8'
$XSearchClientID='45b403b6ae0d4f9ca13ca05f61a58ab2'
$UserAgent='PowerShellTextToSpeechApp'

# Headers for the Text to Speech request (this replaces the authentication headers above)
$Headers=@{
    'Content-Type' = 'application/ssml+xml'
    'X-Microsoft-OutputFormat' = $AudioOutputType
    'X-Search-AppId' = $XSearchAppID
    'X-Search-ClientId' = $XSearchClientID
    # The token returned by the authentication call, sent as a Bearer token
    'Authorization' = 'Bearer ' + $Token
}

# New content from this post below here

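$Locale='en-US'
$ServiceNameMapping='Microsoft Server Speech Text to Speech Voice (en-US, JessaRUS)'
$Content='Hello everyone, this is Azure Text to Speech'

# SSML body: the content wrapped in speak and voice elements
# (element layout assumed from the Bing Text to Speech SSML format)
$Body='<speak version="1.0" xml:lang="'+$Locale+'"><voice xml:lang="'+$Locale+'" xml:gender="Female" name="'+$ServiceNameMapping+'">'+$Content+'</voice></speak>'

$Endpoint= 'https://speech.platform.bing.com/synthesize'
$Method='POST'
$ContentType='application/ssml+xml'
$Filename='output.wav'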

Invoke-RestMethod -Uri $Endpoint -Method $Method `
    -Headers $Headers -ContentType $ContentType `
    -Body $Body -UserAgent $UserAgent `
    -OutFile $Filename

Notice that Invoke-RestMethod has an -OutFile parameter, which allows you to store received output directly as a file on the file system.
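If the backtick line continuations bother you, the same call can be written with splatting instead. This is only a sketch: the variable name $SpeechRequest is my own, the header and body values are placeholders standing in for the ones built in the script, and the actual call is commented out so nothing is sent until you are ready.

```powershell
# Placeholders standing in for the values built earlier in the script
$Headers = @{ 'X-Microsoft-OutputFormat' = 'riff-16khz-16bit-mono-pcm' }
$Body    = '<speak>placeholder</speak>'

# Hypothetical variable name; each key is a parameter of Invoke-RestMethod
$SpeechRequest = @{
    Uri         = 'https://speech.platform.bing.com/synthesize'
    Method      = 'POST'
    Headers     = $Headers
    ContentType = 'application/ssml+xml'
    Body        = $Body
    UserAgent   = 'PowerShellTextToSpeechApp'
    OutFile     = 'output.wav'
}

# Uncomment to send the request (needs a valid token in $Headers)
# Invoke-RestMethod @SpeechRequest
```

Each key in the hashtable becomes a named parameter, so adding or removing a parameter is a one-line change.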

When this process is complete, you will have a WAV file (which is what we chose). It can be launched in your choice of audio application.
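Before opening the file, you may want a quick sanity check that what came back really is audio. The sketch below assumes the standard RIFF layout (the text RIFF in the first four bytes and WAVE in bytes 8 to 11), which matches the riff-16khz-16bit-mono-pcm format chosen above. Test-WavHeader is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: checks for the 'RIFF' and 'WAVE' markers at the start of a file
function Test-WavHeader {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path because .NET and PowerShell can disagree on the current directory
    $FullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $Bytes = [System.IO.File]::ReadAllBytes($FullPath)

    # A RIFF/WAVE header needs at least 12 bytes
    if ($Bytes.Length -lt 12) { return $false }

    $Riff = [System.Text.Encoding]::ASCII.GetString($Bytes, 0, 4)
    $Wave = [System.Text.Encoding]::ASCII.GetString($Bytes, 8, 4)

    # -ceq makes the comparison case-sensitive
    ($Riff -ceq 'RIFF') -and ($Wave -ceq 'WAVE')
}

# Usage, once output.wav exists:
# Test-WavHeader -Path 'output.wav'
```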

The reason the API has regions (the locale, such as en-US in the script) is that it accepts text in a local language (such as French). If you target the appropriate locale and a matching voice, the speaking voice is tuned to speak with the appropriate accent and inflections of the text.
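To try another language, the locale and the voice name need to change together. Here is a rough sketch of a helper that builds the SSML body for any locale. The function name New-SsmlBody is hypothetical, the speak and voice element layout is assumed from the Bing Text to Speech SSML format, and the French voice name is an assumption that you should check against the voice list in the API documentation.

```powershell
# Hypothetical helper: builds an SSML body for a given locale and voice
function New-SsmlBody {
    param(
        [Parameter(Mandatory)][string]$Locale,
        [Parameter(Mandatory)][string]$VoiceName,
        [Parameter(Mandatory)][string]$Content,
        [string]$Gender = 'Female'
    )

    # Escape &, <, > and quotes so the spoken text cannot break the XML
    $SafeContent = [System.Security.SecurityElement]::Escape($Content)

    "<speak version='1.0' xml:lang='$Locale'>" +
    "<voice xml:lang='$Locale' xml:gender='$Gender' name='$VoiceName'>" +
    "$SafeContent</voice></speak>"
}

# Assumed voice name for fr-FR; verify it against the documented voice list
$FrenchVoice = 'Microsoft Server Speech Text to Speech Voice (fr-FR, Julie, Apollo)'
$Body = New-SsmlBody -Locale 'fr-FR' -VoiceName $FrenchVoice -Content 'Bonjour tout le monde'
```

The resulting $Body can be passed to Invoke-RestMethod in place of the English one.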

Pretty cool, eh? If you’re tripping over any of the pieces, don’t forget to review the earlier four parts of this series to get it all ironed out!

That’s all for the moment, but keep a lookout for more in the way of PowerShell here.

I invite you to follow the Scripting Guys on Twitter and Facebook. If you have any questions, send email to them at scripter@microsoft.com, or post your questions on the Official Scripting Guys Forum.

Sean Kearney, Premier Field Engineer, Microsoft

Frequent contributor to Hey, Scripting Guy!
