Universal Windows Platform – Speaking App

Speaking App demonstrates how to create an application that opens text files and reads them aloud in different Voices using text-to-speech. The text can also be saved back as a Text file, or the spoken version saved as an Audio file.

Step 1

If you have not already done so, follow Setup and Start to Install and get Started with Visual Studio 2017. Otherwise, in Windows 10 choose Start, and then from the Start Menu find and select Visual Studio 2017.

vs2017

Step 2

Once Visual Studio Community 2017 has started, from the Menu choose File, then New then Project…

vs2017-file-new-project

Step 3

From New Project choose Visual C# from Installed, Templates, then choose Blank App (Universal Windows). Type in a Name, select a Location, and then select OK to create the Project.
vs2017-new-project-window

Step 4

Then in New Universal Windows Project select the Target Version. This should be at least Windows 10, version 1803 (10.0; Build 17134), which is the April 2018 Update, and the Minimum Version should be set to the same.

vs2017-target-platform

The Target Version controls which Windows 10 features your application can use, so picking the most recent version lets you take advantage of the newest ones. To make sure you always have the most recent version, in Visual Studio 2017 select Tools, then Extensions and Updates… and then see if there are any Updates.

Step 5

Once done select from the Menu, Project, then Add New Item…

vs2017-project-add-new-item

Step 6

From the Add New Item window select Visual C#, then Code from Installed, then select Code File from the list. Type in the Name as Library.cs before selecting Add to add the file to the Project.

vs2017-add-new-item-library

Step 7

Once in the Code View for Library.cs the following should be entered:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Windows.Media.SpeechSynthesis;
using Windows.Storage;
using Windows.Storage.Pickers;
using Windows.Storage.Streams;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Media;

public class Library
{
    private const string extension_txt = ".txt";
    private const string extension_wav = ".wav";

    private SpeechSynthesizer synth = new SpeechSynthesizer();

    private async Task<string> OpenAsync()
    {
        try
        {
            FileOpenPicker picker = new FileOpenPicker()
            {
                SuggestedStartLocation = PickerLocationId.ComputerFolder
            };
            picker.FileTypeFilter.Add(extension_txt);
            StorageFile open = await picker.PickSingleFileAsync();
            if (open != null)
            {
                return await FileIO.ReadTextAsync(open);
            }
        }
        finally
        {
        }
        return null;
    }

    private async void SaveAsync(string contents)
    {
        try
        {
            FileSavePicker picker = new FileSavePicker()
            {
                SuggestedStartLocation = PickerLocationId.DocumentsLibrary,
                DefaultFileExtension = extension_txt,
                SuggestedFileName = "Document"
            };
            picker.FileTypeChoices.Add("Text File", new List<string>() { extension_txt });
            picker.FileTypeChoices.Add("Wave File", new List<string>() { extension_wav });
            StorageFile save = await picker.PickSaveFileAsync();
            if (save != null)
            {
                if (save.FileType == extension_txt)
                {
                    await FileIO.WriteTextAsync(save, contents);
                }
                else if (save.FileType == extension_wav)
                {
                    using (SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync(contents))
                    {
                        using (DataReader reader = new DataReader(stream))
                        {
                            await reader.LoadAsync((uint)stream.Size);
                            IBuffer buffer = reader.ReadBuffer((uint)stream.Size);
                            await FileIO.WriteBufferAsync(save, buffer);
                        }
                    }
                }
            }
        }
        finally
        {
        }
    }

    private async void Speak(string text, MediaElement media)
    {
        try
        {
            if (media.CurrentState == MediaElementState.Playing)
            {
                media.Stop();
            }
            else
            {
                SpeechSynthesisStream stream = await synth.SynthesizeTextToStreamAsync(text);
                media.AutoPlay = true;
                media.SetSource(stream, stream.ContentType);
                media.Play();
            }
        }
        finally
        {
        }
    }

    public Dictionary<string, string> Voices()
    {
        Dictionary<string, string> results = new Dictionary<string, string>();
        foreach (VoiceInformation voice in SpeechSynthesizer.AllVoices.OrderBy(o => o.DisplayName))
        {
            results.Add(voice.Id, voice.DisplayName);
        }
        return results;
    }

    public void Voice(string id)
    {
        synth.Voice = SpeechSynthesizer.AllVoices.First(f => f.Id == id);
    }

    public void New(ref TextBox text, ref MediaElement media)
    {
        media.Source = null;
        text.Text = string.Empty;
    }

    public async void Open(TextBox text)
    {
        string content = await OpenAsync();
        if (content != null)
        {
            text.Text = content;
        }
    }

    public void Save(ref TextBox text)
    {
        SaveAsync(text.Text);
    }

    public void Play(ref TextBox text, ref MediaElement media)
    {
        Speak(text.Text, media);
    }
}

The Library class defines two constants for the “.txt” and “.wav” file extensions, followed by a SpeechSynthesizer, which is the core of the text-to-speech functionality. The OpenAsync method uses a FileOpenPicker filtered to the “.txt” extension: PickSingleFileAsync returns the chosen file and ReadTextAsync returns its contents as a string, or the method returns null if no file was picked. The SaveAsync method takes the contents as a string and uses a FileSavePicker with “.txt” as the default extension and two file type choices, “.txt” and “.wav”, to get a StorageFile from PickSaveFileAsync. If the extension is “.txt” then the text is saved with WriteTextAsync. If the extension is “.wav” then SynthesizeTextToStreamAsync produces a SpeechSynthesisStream, a DataReader with LoadAsync reads that stream into an IBuffer, and the buffer is written with WriteBufferAsync.
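The extension check in SaveAsync compares strings with ==, which is case-sensitive. As a minimal sketch, assuming the picked file's FileType can come back in a different case (for example “.TXT”), the decision could be moved into a small helper that ignores case. The names SaveKind and SaveKinds are hypothetical and are not part of the Library class above.

```csharp
using System;

// Hypothetical names for illustration; not part of the original Library class.
public enum SaveKind { Text, Wave, Unknown }

public static class SaveKinds
{
    // Maps a file extension such as ".txt" or ".WAV" to the kind of save
    // to perform, ignoring upper and lower case.
    public static SaveKind FromFileType(string fileType)
    {
        if (string.Equals(fileType, ".txt", StringComparison.OrdinalIgnoreCase))
        {
            return SaveKind.Text;
        }
        if (string.Equals(fileType, ".wav", StringComparison.OrdinalIgnoreCase))
        {
            return SaveKind.Wave;
        }
        // Anything else (including null) is left for the caller to handle.
        return SaveKind.Unknown;
    }
}
```

SaveAsync could then branch on SaveKinds.FromFileType(save.FileType) instead of comparing save.FileType directly, and decide what to do when the result is Unknown.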

The Speak method takes the text to read and a MediaElement to use for playback. If the MediaElement is already Playing then Stop is called, otherwise a SpeechSynthesisStream is obtained from SynthesizeTextToStreamAsync using the text, AutoPlay is set to true on the MediaElement, and SetSource is called with the stream and its content type before Play starts playback. Voices returns a Dictionary of all the voices available to the SpeechSynthesizer, keyed by Id with the display name as the value, and Voice sets the SpeechSynthesizer to use the voice with a given Id. New is used to reset the TextBox and MediaElement, Open reads in a Text file, Save writes to a Text File or Wave File and Play reads the Text contents using text-to-speech.
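The Voice method uses First, which throws if no voice has the requested Id. A minimal sketch of one way to guard against that, working only on the Dictionary that Voices returns, is shown here. VoiceChoice and Resolve are hypothetical names, and falling back to the first enumerated entry is an assumption made for illustration rather than something the original code does.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original Library class.
public static class VoiceChoice
{
    // Returns requestedId when it is a known voice Id. Otherwise returns the
    // first Id the dictionary enumerates, or null when there are no voices.
    public static string Resolve(IDictionary<string, string> voices, string requestedId)
    {
        if (requestedId != null && voices.ContainsKey(requestedId))
        {
            return requestedId;
        }
        return voices.Keys.FirstOrDefault();
    }
}
```

A caller could pass the result to library.Voice only when it is not null, for example after resolving the ComboBox's SelectedValue against library.Voices().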

Step 8

In the Solution Explorer select MainPage.xaml

vs2017-mainpage-library

Step 9

From the Menu choose View and then Designer

vs2017-view-designer

Step 10

The Design View will be displayed along with the XAML View. In the XAML View, between the Grid and /Grid elements, enter the following XAML:

<Grid Margin="50" Name="Display" Loaded="Display_Loaded">
	<Grid.RowDefinitions>
		<RowDefinition Height="Auto"/>
		<RowDefinition Height="*"/>
	</Grid.RowDefinitions>
	<ComboBox Grid.Row="0" Name="Voice" HorizontalAlignment="Stretch"
		SelectedValuePath="Key" DisplayMemberPath="Value"
		SelectionChanged="Voice_SelectionChanged"/>
	<TextBox Grid.Row="1" Name="Input" AcceptsReturn="True" TextWrapping="Wrap"/>
	<MediaElement Name="Media" AutoPlay="False"/>
</Grid>
<CommandBar VerticalAlignment="Bottom">
	<AppBarButton Icon="Page2" Label="New" Click="New_Click"/>
	<AppBarButton Icon="OpenFile" Label="Open" Click="Open_Click"/>
	<AppBarButton Icon="Save" Label="Save" Click="Save_Click"/>
	<AppBarButton Icon="Play" Label="Play" Click="Play_Click"/>
</CommandBar>

The MainPage has a Grid with two rows. The first row has the ComboBox for selecting a Voice and the second row has a TextBox for entering text to be read out with text-to-speech. There is also a MediaElement for playback. The second block of XAML is the CommandBar, which contains New to start a new Document, Open to get a Document, Save to store a Document and Play to initiate the text-to-speech.

Step 11

From the Menu choose View and then Code

vs2017-view-code

Step 12

Once in the Code View, below the end of public MainPage() { … } the following Code should be entered:

Library library = new Library();

private void Display_Loaded(object sender, RoutedEventArgs e)
{
	Voice.ItemsSource = library.Voices();
	Voice.SelectedIndex = 0;
}

private void Voice_SelectionChanged(object sender, SelectionChangedEventArgs e)
{
	library.Voice((string)Voice.SelectedValue);
}

private void New_Click(object sender, RoutedEventArgs e)
{
	library.New(ref Input, ref Media);
}

private void Open_Click(object sender, RoutedEventArgs e)
{
	library.Open(Input);
}

private void Save_Click(object sender, RoutedEventArgs e)
{
	library.Save(ref Input);
}

private void Play_Click(object sender, RoutedEventArgs e)
{
	library.Play(ref Input, ref Media);
}

Below the MainPage() Method an instance of the Library Class is created. Display_Loaded fills the ComboBox with the Voices available for text-to-speech and selects the first one, Voice_SelectionChanged applies the selected Voice, and New_Click, Open_Click, Save_Click and Play_Click call the related Methods in the Library Class.

Step 13

That completes the Universal Windows Platform Application, so Save the Project, then in Visual Studio select Local Machine to run the Application.

vs2017-local-machine

Step 14

After the Application has started running you can select New to reset the TextBox, use Open to load the content of a Text Document, or use Save to store the content as a Text Document or the Speech as a Wave File. Use Play to convert the content from text to speech. You can choose the Voice to speak with from the ComboBox.

ran-speaking-app

Step 15

To Exit the Application select the Close button in the top right of the Application.

vs2017-close

Text-to-speech is an interesting technology to use, as it opens up options for working with text beyond what can be done visually. By building it in, you can add spoken output to any application with just a few lines of code that turn text into speech. This example shows how easy it is to use the SpeechSynthesizer to turn simple text into speech. It also supports Speech Synthesis Markup Language (SSML), which allows finer control over how the content is spoken if needed.
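As a rough illustration of the SSML option, the sketch below wraps plain text in a minimal SSML document with a prosody element that sets the speaking rate. Ssml and Wrap are hypothetical names, and the defaults of “slow” and “en-US” are assumptions chosen for the example. XElement is used so that characters such as & and < in the text are escaped.

```csharp
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of the original Library class.
public static class Ssml
{
    // Wraps plain text in a minimal SSML 1.0 document that sets the speaking rate.
    // rate is an SSML prosody rate such as "slow", "medium" or "fast".
    public static string Wrap(string text, string rate = "slow", string language = "en-US")
    {
        XNamespace ns = "http://www.w3.org/2001/10/synthesis";
        XElement speak = new XElement(ns + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", language),
            // XElement escapes the text content when it is serialized.
            new XElement(ns + "prosody", new XAttribute("rate", rate), text));
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

In the Speak method, the string returned by Ssml.Wrap(text) could be passed to synth.SynthesizeSsmlToStreamAsync in place of the call to SynthesizeTextToStreamAsync. The rest of the playback code would stay the same.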

Creative Commons License
