What Is the REST API?

A REST API (Representational State Transfer Application Programming Interface) is an interface that allows external programs to access the functionality of an application. It enables communication via HTTP requests and responses, allowing you to automate tasks such as sending speech synthesis requests, checking their status, and retrieving results.
By enabling the REST API in Voisona Talk, you can control speech synthesis from various programming languages such as Python, C++, or JavaScript. This makes it possible to integrate Voisona Talk’s synthesized voices into your own applications, web services, chatbots, or games.
In the following tutorial, we’ll walk through how to enable the REST API in Voisona Talk and execute speech synthesis using Python.
*The REST API functionality is currently available as a beta version.

Enabling the REST API

  1. Launch Voisona Talk.
  1. If you have not yet authenticated, please do so:
    1. From the ☰ (three-line menu) at the top right, select Help → Authentication, then enter the following:
      1. Mail: your registered email address
      2. Password: the password you set during registration
  1. Ensure that at least one voice library has been downloaded.
  1. Enable the REST API:
    1. From the ☰ (three-line menu) at the top right, select Edit → Preferences, and open the API tab.
      1. Set any desired port number for the API listener (default: 32766).
      2. Verify that the user name matches your registered email address (this field cannot be changed).
      3. Set an API password of your choice. This password does not need to match your authentication password. However, it must not be empty.
      4. Check “Enable REST API”. Once enabled, the port number and password fields become read-only. To change them, uncheck the box, modify the values, and re-enable it.
      5. When enabled, the “Talk API Reference” link next to the checkbox becomes active. Refer to it for more detailed API specifications.
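
The port, user name, and API password set above are what a client needs in order to reach the server. As a minimal sketch, assuming the server listens on localhost and accepts HTTP Basic authentication (neither is stated in this tutorial, so confirm both in the Talk API Reference), these settings could be turned into a base URL and a request header as follows. The helper names `base_url` and `basic_auth_header` are hypothetical.

```python
import base64

DEFAULT_PORT = 32766  # default API listener port shown in the API tab


def base_url(port: int = DEFAULT_PORT, host: str = "localhost") -> str:
    """Build the root URL of the API server (the host is an assumption)."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"http://{host}:{port}"


def basic_auth_header(username: str, password: str) -> dict:
    """Encode credentials as an HTTP Basic header (assumed auth scheme)."""
    if not password:
        # The API password field must not be empty, as noted in the steps above.
        raise ValueError("the API password must not be empty")
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
```

The user name passed here would be your registered email address, and the password would be the API password, not the authentication password.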

Example: Using the API with Python

Full Sample Code

Below is the complete sample code. Each part of it is explained in the sections that follow.
To run the sample code, you need to install the requests package (for example, with pip install requests).
Example of running the sample:
Replace the username and password according to your API settings.

Retrieving Available Voice Libraries

You can get the list of installed voice libraries as follows:
Example output:
If no voice library is downloaded in the editor, the result will be empty.
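
As a rough illustration of the retrieval step, the sketch below prepares a GET request and parses a JSON body. It uses only the Python standard library so that it stays self-contained; the sample code itself relies on requests. The endpoint path and the `items` key are assumptions and should be checked against the Talk API Reference, and all function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint path; confirm it in the Talk API Reference.
VOICES_PATH = "/api/talk/v1/voices"


def build_voices_request(base_url: str, auth_header: dict) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the installed voice libraries."""
    return urllib.request.Request(base_url + VOICES_PATH, headers=auth_header, method="GET")


def parse_voices(body: str) -> list:
    """Return the voice libraries listed in a JSON response body.

    Assumes the libraries sit under an "items" key. An empty list means
    that no voice library has been downloaded in the editor.
    """
    return json.loads(body).get("items", [])


def fetch_voices(base_url: str, auth_header: dict) -> list:
    """Send the request and parse the result (needs a running API server)."""
    with urllib.request.urlopen(build_voices_request(base_url, auth_header)) as resp:
        return parse_voices(resp.read().decode("utf-8"))
```

Keeping request construction and response parsing separate from the network call makes it easy to check both without a running editor.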

Synthesizing Speech

The following sends a speech synthesis request to the API server:
The sample code uses the first voice library in the retrieved list for synthesis.
When synthesis completes, you will hear “こんにちは” played through your default audio device.
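Note that the server limits the number of requests it accepts. If the limit is reached, new requests may fail. The synthesis request has a boolean option that, when set to true, automatically removes older requests to make room for new ones; see the Talk API Reference for its exact name.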
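
A synthesis request body along these lines might be assembled as shown below. This is only a sketch: every field name in it (`text`, `language`, `voice_name`, `force_enqueue`) and the shape of a voice library entry are assumptions that this tutorial does not confirm, and `build_synthesis_payload` is a hypothetical helper.

```python
import json


def build_synthesis_payload(text: str, voices: list, force_enqueue: bool = False) -> bytes:
    """Build a JSON request body that uses the first voice library in `voices`.

    All field names are assumptions; check them in the Talk API Reference.
    """
    if not voices:
        raise ValueError("no voice library is downloaded")
    voice = voices[0]  # first voice library in the list, as in the sample code
    payload = {
        "text": text,
        "language": voice["languages"][0],
        "voice_name": voice["voice_name"],
        # Assumed name of the option that drops older requests when the limit is hit.
        "force_enqueue": force_enqueue,
    }
    # ensure_ascii=False keeps Japanese text readable in the encoded body.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")
```

The resulting bytes would be sent as the body of a POST request with a JSON content type.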
You can check the status of the submitted request using the UUID obtained by executing the synthesize_text function.
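The response reports the state of the request: one value indicates that the request is still waiting to be processed, and another indicates that synthesis has completed successfully. The exact field name and values are listed in the Talk API Reference.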
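
Checking the status repeatedly until synthesis finishes is a simple polling loop. The sketch below shows one way to write it, under stated assumptions: the state names `queued`, `running`, and `succeeded` are guesses, and `fetch_state` is a hypothetical callable that performs the status request for one UUID and returns the state string.

```python
import time
from typing import Callable

# Assumed state names; the Talk API Reference has the authoritative list.
WAITING_STATES = {"queued", "running"}
DONE_STATE = "succeeded"


def wait_for_synthesis(fetch_state: Callable[[str], str], request_id: str,
                       interval: float = 0.5, timeout: float = 30.0) -> str:
    """Poll `fetch_state(request_id)` until the request leaves the waiting states.

    Returns the final state; raises TimeoutError if the request is still
    waiting once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state(request_id)
        if state not in WAITING_STATES:
            return state  # DONE_STATE on success; anything else is a failure
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request {request_id} is still {state}")
        time.sleep(interval)
```

Passing the status lookup in as a function keeps the loop independent of the HTTP library used for the actual request.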
You can also delete a request by specifying its UUID:
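
A deletion of this kind could be prepared as in the sketch below, which builds (but does not send) an HTTP DELETE request for one UUID. The resource path is an assumption, and `build_delete_request` is a hypothetical helper; the UUID is validated first so that a malformed value never reaches the server.

```python
import urllib.request
import uuid

# Assumed resource path for synthesis requests; confirm it in the Talk API Reference.
REQUESTS_PATH = "/api/talk/v1/speech-syntheses"


def build_delete_request(base_url: str, request_id: str,
                         auth_header: dict) -> urllib.request.Request:
    """Prepare a DELETE request for the synthesis request with the given UUID."""
    uuid.UUID(request_id)  # raises ValueError if the string is not a valid UUID
    url = f"{base_url}{REQUESTS_PATH}/{request_id}"
    return urllib.request.Request(url, headers=auth_header, method="DELETE")
```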
To save the synthesized result as a file instead of playing it, specify an absolute path:
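
Because the path must be absolute, it is worth checking it on the client before sending the request. The following sketch does that and returns the extra request fields; the field names `destination` and `output_file_path` are assumptions not confirmed by this tutorial, and `file_output_fields` is a hypothetical helper.

```python
import os


def file_output_fields(path: str) -> dict:
    """Return request fields that ask for file output instead of playback.

    The field names are assumptions; check them in the Talk API Reference.
    """
    if not os.path.isabs(path):
        raise ValueError(f"output path must be absolute: {path!r}")
    return {"destination": "file", "output_file_path": path}
```

A relative path can be converted beforehand with `os.path.abspath`, which resolves it against the current working directory of the client, so this only makes sense when the client and the editor run on the same machine.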

Controlling Voice Expression

To modify voice expression, include an object holding the expression parameters in the request (its name is given in the Talk API Reference). For example, to double the speaking speed:
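
As a sketch of the idea, the helper below adds a speed setting to an existing request payload without modifying the original. The key names `global_parameters` and `speed`, and the reading of the value as a multiplier where 1.0 is normal speed, are assumptions; `with_speed` is a hypothetical helper.

```python
def with_speed(payload: dict, speed: float) -> dict:
    """Return a copy of `payload` with the speaking speed set.

    Key names are assumptions; `speed` is treated as a multiplier,
    so 2.0 would mean twice the normal speed.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    updated = dict(payload)  # shallow copy so the caller's dict is untouched
    params = dict(updated.get("global_parameters", {}))
    params["speed"] = speed
    updated["global_parameters"] = params
    return updated
```

For example, `with_speed({"text": "こんにちは"}, 2.0)` yields a payload that carries the text together with the speed setting.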

Detailed Control of Speech Attributes

You can perform fine-grained control by modifying analyzed text data. First, send a text analysis request:
Example response:
By modifying the analyzed text and sending it back to the server, you can adjust the accent and pronunciation. The following is an example using an XML parser.
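
The editing step can be sketched with Python's standard XML parser. The analysis result used here is hypothetical: the element names (`tsml`, `acoustic_phrase`, `word`) and the attributes (`pronunciation`, `hl`) are illustrative assumptions, and the real structure should be taken from an actual response or from the Talk API Reference. `set_word_attribute` is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical analysis result; real element and attribute names may differ.
ANALYZED = (
    '<tsml><acoustic_phrase>'
    '<word pronunciation="コンニチワ" hl="lhhhh">こんにちは</word>'
    '</acoustic_phrase></tsml>'
)


def set_word_attribute(analyzed_xml: str, surface: str, name: str, value: str) -> str:
    """Set one attribute on every <word> element whose text equals `surface`."""
    root = ET.fromstring(analyzed_xml)
    for word in root.iter("word"):
        if word.text == surface:
            word.set(name, value)
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


# Change the (assumed) accent pattern of one word, leaving everything else intact.
modified = set_word_attribute(ANALYZED, "こんにちは", "hl", "hllll")
```

The modified string would then be sent back to the server in place of plain text.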
For more details, please refer to the “Talk API Reference” via the link in the API tab of the Preferences window in the Voisona Talk editor.