Quick start

Attention.

You need an API key to get started. To request one, write to us at speechkit@support.yandex.ru.

To recognize Russian speech with SpeechKit, send a short audio clip (for example, speech.wav) in the body of a POST request:

POST /asr_xml?uuid=<user ID>&key=<API key>&topic=queries HTTP/1.1
Host: asr.yandex.net
Content-Type: audio/x-wav

... (binary content of the audio file)
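The request above can be assembled with Python's standard library. This is a minimal sketch rather than an official client: the function names build_asr_request and send_asr_request are hypothetical, and the https scheme is an assumption, since the example gives only the host and path.

```python
import urllib.parse
import urllib.request

# Assumption: the scheme is not stated in the example, only the host and path.
ASR_URL = "https://asr.yandex.net/asr_xml"


def build_asr_request(audio: bytes, uuid: str, key: str,
                      topic: str = "queries") -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the whole audio file."""
    query = urllib.parse.urlencode({"uuid": uuid, "key": key, "topic": topic})
    return urllib.request.Request(
        url=f"{ASR_URL}?{query}",
        data=audio,  # binary content of the audio file
        headers={"Content-Type": "audio/x-wav"},
        method="POST",
    )


def send_asr_request(request: urllib.request.Request) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

To try it, read speech.wav in binary mode, pass its bytes together with your user ID and API key to build_asr_request, and hand the result to send_asr_request.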

The response body contains several recognition variants of the utterance in XML format. The variant at the top of the list is considered the closest match to the original speech.

<recognitionResults success="1">
        <variant confidence="0.89">твой номер 212-85-06</variant>
        <variant confidence="0">твой номер 213-85-06</variant>
</recognitionResults>
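A response like the one above can be read with Python's xml.etree.ElementTree. The sketch below assumes the element and attribute names shown in the example; the function name parse_recognition_results is hypothetical, and treating any success value other than "1" as "no results" is an assumption, since this section does not describe failed responses.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def parse_recognition_results(xml_text: str) -> List[Tuple[str, float]]:
    """Return (text, confidence) pairs in the order the server listed them."""
    root = ET.fromstring(xml_text)
    # Assumption: anything other than success="1" means there is nothing to read.
    if root.get("success") != "1":
        return []
    return [
        (variant.text or "", float(variant.get("confidence", "0")))
        for variant in root.iter("variant")
    ]
```

Because the list keeps the server's order, the first pair is the variant considered closest to the original.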

In the example above, the entire audio file is included in the request body. This approach has some limitations:

  • The maximum size of the audio file is 1 MB.

  • Speech recognition begins only after all the audio data has been transmitted to the server.

To avoid these limitations, you can send data in chunks: see Chunked transfer encoding and Data streaming mode.
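Splitting the audio into pieces is the first step of any chunked approach. The sketch below shows only that step; how the pieces are framed and sent is covered in the sections referenced above and is not reproduced here. The function name split_into_chunks and the default chunk size are illustrative assumptions, not values prescribed by SpeechKit.

```python
from typing import Iterator

# Assumption: an arbitrary illustrative size, not a SpeechKit requirement.
DEFAULT_CHUNK_SIZE = 64 * 1024


def split_into_chunks(audio: bytes,
                      chunk_size: int = DEFAULT_CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes (chunk_size must be positive)."""
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]
```

A generator is used so that each piece can be handed to the network layer as soon as it is produced, instead of building the full list first.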

Note that SpeechKit Cloud converts the received audio data to mono 16-bit PCM at 16 kHz.
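To see how a local WAV file compares with that target format, its header can be inspected with Python's wave module. This is an optional client-side check, not part of the SpeechKit API; the names WavFormat, read_wav_format, matches_target and make_silent_wav are hypothetical helpers introduced for illustration.

```python
import io
import wave
from typing import NamedTuple


class WavFormat(NamedTuple):
    channels: int
    sample_width_bits: int
    sample_rate_hz: int


# The format the service converts to: mono, 16 bit, 16 kHz.
TARGET_FORMAT = WavFormat(channels=1, sample_width_bits=16, sample_rate_hz=16000)


def read_wav_format(source) -> WavFormat:
    """Read the format from a WAV file path or a binary file-like object."""
    with wave.open(source, "rb") as wav:
        return WavFormat(wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate())


def matches_target(fmt: WavFormat) -> bool:
    """True if the audio is already in the target format."""
    return fmt == TARGET_FORMAT


def make_silent_wav(channels: int, sample_rate_hz: int, frames: int = 160) -> bytes:
    """Produce a short silent 16-bit WAV in memory, for trying the check without a real file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00" * (frames * channels * 2))
    return buffer.getvalue()
```

For a real file, read_wav_format("speech.wav") returns its channel count, bit depth and sample rate, which matches_target then compares against the target.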