Class OnlineRecognizer

ru.yandex.speechkit

java.lang.Object ⇽ OnlineRecognizer

public class OnlineRecognizer implements Recognizer

All Implemented Interfaces:
Recognizer

Class for online speech recognition.

An object of this class manages the entire recognition process, including voice activity detection, server communication, and so on. OnlineRecognizer uses RecognizerListener to notify of the main events that occur during recognition.

Before working with an object of the OnlineRecognizer class, configure the library using the init(Context, String) or init(Context, String, LocationProvider) method.

Note.

An object of the OnlineRecognizer class uses the android.os.Handler system class to notify subscribed listeners (RecognizerListener). Therefore, listeners can only be created in a thread that has an android.os.Looper; otherwise, an exception is thrown.

Nested Classes

Methods

synchronized void cancel()

Cancels the recognition request.

synchronized void destroy()

Explicitly releases resources used by the component.

void finalize()

AudioSource getAudioSource()

Returns the audio source (AudioSource).

int getEncodingBitrate()

Returns the bitrate of the selected audio codec (bps).

int getEncodingComplexity()

Returns the complexity of the compression algorithm.

String getGrammar()

Returns a string for creating a custom model.

long getInactiveTimeoutMs()

Returns the time to wait for speech, in milliseconds, after starting an object of the OnlineRecognizer class.

Language getLanguage()

Returns the language (Language) of the voice query.

OnlineModel getModel()

Returns an object of the OnlineModel class.

float getNewEnergyWeight()

Returns the weight of each new chunk of audio data when calculating the speech power.

OnlineModel getOnlineModel()

Returns a model (OnlineModel) that is used for recognition.

long getReachabilityTimeoutMs()

Returns the network connection timeout in milliseconds.

long getRecordingTimeoutMs()

Returns the maximum length of a voice query record in milliseconds.

static Language[] getSKRecognitionLanguages()

Returns a list of recognition languages supported by the library.

UniProxySession getSession()

Returns an object of the UniProxySession class that will be used for a network connection.

long getSilenceBetweenUtterancesMs()

Returns the minimum silence time between utterances, in milliseconds.

SoundFormat getSoundFormat()

Returns the audio format (SoundFormat).

long getWaitForResultTimeoutMs()

Returns the server response timeout, in milliseconds, after sending the last fragment.

boolean isDisableAntimat()

Returns the flag that controls the skipping of obscene words.

boolean isEnableManualPunctuation()

Returns the flag indicating whether spoken punctuation words are replaced with punctuation marks.

boolean isEnableMusicRecognition()

Returns the flag indicating that the ASR server will send music recognition results.

boolean isEnablePunctuation()

Returns the punctuation flag.

boolean isFinishAfterFirstUtterance()

Returns the flag indicating whether recognition ends after the first utterance.

boolean isRequestBiometry()

Returns the flag indicating a request for the user's estimated biometrics (for example, age, gender, or age group).

boolean isUsePlatformRecognizer()

Returns the flag indicating whether platform recognition (android.speech.SpeechRecognizer) is used.

boolean isVadEnabled()

Returns the flag indicating whether the VAD is enabled.

boolean isWaitForConnection()

Returns the flag indicating that audio recording starts after a network connection is established.

synchronized void prepare()

Prepares an object of the class that implements the Recognizer interface for speech recognition.

static void requestPlatformRecognitionLanguages(@NonNull final android.content.Context context, @NonNull GetPlatformLanguagesListener listener)

Requests a list of languages that platform recognition supports.

synchronized void startRecording()

Starts the recognition process.

synchronized void stopRecording()

Stops audio recording.

String toString()

Method Detail

cancel

public synchronized void cancel ()

Cancels the recognition request.

This method cancels the recognition request at any stage of the operation. Audio recording stops, and the network connection is terminated if necessary. The method cancels the request synchronously. After it is called, the RecognizerListener interface methods are no longer invoked.

Note.

If Recognizer is required only while Activity is running, don't forget to stop it in the onPause() method of the Activity system class.
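
The lifecycle advice in the note can be sketched as follows. This is only an illustration under stated assumptions: CancellableRecognizer is a hypothetical stand-in for the two Recognizer methods used here, and RecognizerLifecycle is a hypothetical helper, not part of the library. In an Android application, the onPause() call would be made from the onPause() method of the Activity.

```java
/** Hypothetical stand-in for the subset of the Recognizer interface used in this sketch. */
interface CancellableRecognizer {
    void startRecording();
    void cancel();
}

/** Hypothetical helper that ties a recognizer to a screen's visible lifetime. */
final class RecognizerLifecycle {
    private final CancellableRecognizer recognizer;
    private boolean active;

    RecognizerLifecycle(CancellableRecognizer recognizer) {
        this.recognizer = recognizer;
    }

    /** Called when the user requests recognition, for example by tapping a button. */
    void start() {
        recognizer.startRecording();
        active = true;
    }

    /** Called from the screen's onPause(): cancels recognition if it was started. */
    void onPause() {
        if (active) {
            recognizer.cancel();
            active = false;
        }
    }

    boolean isActive() {
        return active;
    }
}
```

Tracking the active flag keeps repeated onPause() calls from cancelling a request that was never started.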

destroy

public synchronized void destroy ()

Explicitly releases resources used by the component.

Once the resources are released, the Recognizer can no longer operate correctly.

finalize

public void finalize ()

getAudioSource

public AudioSource getAudioSource ()

Returns the audio source (AudioSource).

Returns:

Audio source (AudioSource).

getEncodingBitrate

public int getEncodingBitrate ()

Returns the bitrate of the selected audio codec (bps).

Returns:

Bitrate of the selected audio codec (bps).

getEncodingComplexity

public int getEncodingComplexity ()

Returns the complexity of the compression algorithm.

The complexity of the algorithm affects audio quality and, therefore, recognition quality. At a fixed bitrate, increasing the complexity improves audio quality but also increases the processor load and encoding time. The dependency is nonlinear: at a bitrate of 24000 bps, high recognition quality is achieved with any complexity value, while at 12000 bps, good quality can only be achieved with a complexity value of 10. When the PCM format is selected, the complexity value is not used.

Returns:

Complexity of the compression algorithm.
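
The relationship between bitrate and complexity described above could be captured in a small helper like the one below. This is a sketch, not library code: ComplexityAdvisor and suggestComplexity are hypothetical names, and treating every bitrate below 24000 bps like the 12000 bps case is an assumption, since only those two bitrates are documented.

```java
/** Hypothetical helper; not part of the SpeechKit API. */
final class ComplexityAdvisor {
    /** Bitrate at which any complexity value is documented to give high quality. */
    static final int HIGH_BITRATE_BPS = 24000;
    /** Complexity value documented as necessary for good quality at 12000 bps. */
    static final int COMPLEXITY_FOR_LOW_BITRATE = 10;

    /**
     * Suggests a complexity value for the given bitrate.
     * Assumption: bitrates below 24000 bps are treated like the 12000 bps case.
     */
    static int suggestComplexity(int bitrateBps, int preferredComplexity) {
        if (bitrateBps >= HIGH_BITRATE_BPS) {
            // Any complexity works here, so keep the caller's (cheaper) preference.
            return preferredComplexity;
        }
        return COMPLEXITY_FOR_LOW_BITRATE;
    }
}
```

For example, suggestComplexity(24000, 3) keeps the preferred value 3, while suggestComplexity(12000, 3) returns 10.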

getGrammar

public String getGrammar ()

Returns a string for creating a custom model.

Returns:

String for creating a custom model.

getInactiveTimeoutMs

public long getInactiveTimeoutMs ()

Returns the time to wait for speech, in milliseconds, after starting an object of the OnlineRecognizer class.

Returns:

Time to wait for speech, in milliseconds, after starting the OnlineRecognizer class object.

getLanguage

public Language getLanguage ()

Returns the language (Language) of the voice query.

Returns:

Language (Language) of the voice query.

getModel

public OnlineModel getModel ()

Returns an object of the OnlineModel class.

Returns:

An object of the OnlineModel class.

getNewEnergyWeight

public float getNewEnergyWeight ()

Returns the weight of each new chunk of audio data when calculating the speech power.

Returns:

The weight of each new chunk of audio data when calculating the speech power.

getOnlineModel

public OnlineModel getOnlineModel ()

Returns a model (OnlineModel) that is used for recognition.

Returns:

Model (OnlineModel) used for recognition.

getReachabilityTimeoutMs

public long getReachabilityTimeoutMs ()

Returns the network connection timeout in milliseconds.

Returns:

Network connection timeout in milliseconds.

getRecordingTimeoutMs

public long getRecordingTimeoutMs ()

Returns the maximum length of a voice query record in milliseconds.

Returns:

Maximum length of a voice query record in milliseconds.

getSKRecognitionLanguages

public static Language[] getSKRecognitionLanguages ()

Returns a list of recognition languages supported by the library.

Returns:

List of recognition languages supported by the library.

getSession

public UniProxySession getSession ()

Returns an object of the UniProxySession class that will be used for a network connection.

Returns:

An object of the UniProxySession class that will be used for a network connection.

getSilenceBetweenUtterancesMs

public long getSilenceBetweenUtterancesMs ()

Returns the minimum silence time between utterances, in milliseconds.

Returns:

Minimum silence time between utterances, in milliseconds.

getSoundFormat

public SoundFormat getSoundFormat ()

Returns the audio format (SoundFormat).

Returns:

Audio format (SoundFormat).

getWaitForResultTimeoutMs

public long getWaitForResultTimeoutMs ()

Returns the server response timeout, in milliseconds, after sending the last fragment.

Returns:

Server response timeout, in milliseconds, after sending the last fragment.

isDisableAntimat

public boolean isDisableAntimat ()

Returns the flag that controls the skipping of obscene words.

Returns:

Flag that controls the skipping of obscene words.

isEnableManualPunctuation

public boolean isEnableManualPunctuation ()

Returns the flag indicating whether spoken punctuation words are replaced with punctuation marks.

For example, the utterance "Hello comma how are you question mark" can be recognized as "Hello, how are you?".

Returns:

Flag indicating whether spoken punctuation words are replaced with punctuation marks.
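
To make the effect of this flag concrete, the sketch below reproduces the transformation from the example on the client side. It is purely illustrative: when the flag is set, the recognition result is expected to contain the marks already, and the SpokenPunctuationDemo class and its word list are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not how the recognizer is implemented. */
final class SpokenPunctuationDemo {
    // Assumed word-to-mark pairs, taken from the documented example sentence.
    private static final Map<String, String> MARKS = new LinkedHashMap<>();
    static {
        MARKS.put("comma", ",");
        MARKS.put("question mark", "?");
    }

    static String replaceSpokenPunctuation(String utterance) {
        String result = utterance;
        for (Map.Entry<String, String> entry : MARKS.entrySet()) {
            // Match the spoken word as a whole word, together with the whitespace before it.
            String regex = "\\s+" + Pattern.quote(entry.getKey()) + "\\b";
            result = result.replaceAll(regex, Matcher.quoteReplacement(entry.getValue()));
        }
        return result;
    }
}
```

Calling replaceSpokenPunctuation("Hello comma how are you question mark") yields "Hello, how are you?".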

isEnableMusicRecognition

public boolean isEnableMusicRecognition ()

Returns the flag indicating that the ASR server will send music recognition results.

Returns:

Flag indicating that the ASR server will send music recognition results.

isEnablePunctuation

public boolean isEnablePunctuation ()

Returns the punctuation flag.

Returns:

Punctuation flag.

isFinishAfterFirstUtterance

public boolean isFinishAfterFirstUtterance ()

Returns the flag indicating whether recognition ends after the first utterance.

Returns:

Flag indicating whether recognition ends after the first utterance.

isRequestBiometry

public boolean isRequestBiometry ()

Returns the flag indicating a request for the user's estimated biometrics (for example, age, gender, or age group).

Returns:

Flag indicating a request for the user's estimated biometrics (for example, age, gender, or age group).

isUsePlatformRecognizer

public boolean isUsePlatformRecognizer ()

Returns the flag indicating whether platform recognition (android.speech.SpeechRecognizer) is used.

Returns:

Flag indicating whether platform recognition (android.speech.SpeechRecognizer) is used.

isVadEnabled

public boolean isVadEnabled ()

Returns the flag indicating whether the VAD is enabled.

If the VAD is disabled, the OnlineRecognizer class object does not detect the end of speech automatically. In this case, call the stopRecording() method to stop recording and switch to recognition.

Returns:

Flag indicating whether the VAD is enabled.
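
The manual-stop case described above could look like the following push-to-talk sketch. ManualStopRecognizer is a hypothetical stand-in for the two methods involved, and PushToTalk is a hypothetical helper. Stopping only when the VAD is disabled is one possible policy, not a library requirement.

```java
/** Hypothetical stand-in for the subset of OnlineRecognizer used in this sketch. */
interface ManualStopRecognizer {
    boolean isVadEnabled();
    void stopRecording();
}

/** Hypothetical push-to-talk helper; not part of the SpeechKit API. */
final class PushToTalk {
    /**
     * Called when the user releases the talk button.
     * Returns true if recording was stopped manually.
     */
    static boolean onButtonReleased(ManualStopRecognizer recognizer) {
        if (!recognizer.isVadEnabled()) {
            // Without the VAD, the end of speech is not detected automatically,
            // so recording must be stopped explicitly to switch to recognition.
            recognizer.stopRecording();
            return true;
        }
        // With the VAD enabled, the end of speech is left to automatic detection.
        return false;
    }
}
```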

isWaitForConnection

public boolean isWaitForConnection ()

Returns the flag indicating that audio recording starts after a network connection is established.

Returns:

Flag indicating that audio recording starts after a network connection is established.

prepare

public synchronized void prepare ()

Prepares an object of the class that implements the Recognizer interface for speech recognition.

Prepares for speech recognition in advance. If the method is not called explicitly, it is called automatically in the startRecording() method. The method runs asynchronously.

Note.

We recommend calling this method before starting an object of the class that implements the Recognizer interface.

requestPlatformRecognitionLanguages

public static void requestPlatformRecognitionLanguages (@NonNull final android.content.Context context, @NonNull GetPlatformLanguagesListener listener)

Requests a list of languages that platform recognition supports.

Parameters:
context

An object of the Context system class.

listener

Interface that notifies of getting a list of supported languages.

startRecording

public synchronized void startRecording ()

Starts the recognition process.

The method can be called multiple times for the same object of the class. The method runs asynchronously: the onRecordingBegin(Recognizer) callback is invoked when recording starts.

stopRecording

public synchronized void stopRecording ()

Stops audio recording.

The method does not cancel the recognition process: recognition continues until all results are received. In most cases, this method does not need to be called, since the Voice Activity Detector (VAD) automatically detects the end of speech. However, the method may be useful when the VAD is disabled or incorrectly detects the end of speech.

Note.

If Recognizer is required only while Activity is running, don't forget to stop it in the onPause() method of the Activity system class.

toString

public String toString ()