Development and practice of language recognition programs like Siri in iOS (Nuance technology)

title: "Development and Practice of Siri-like Language Recognition Program in iOS (Nuance Technology)"
date: "2012-06-20"
categories:
- "mobileinternet"
- "sourceandcoding"
tags:
- "ios"
- "nuance"
- "siri"
- "internet"

The release of the iPhone 4S brought Siri, and speech recognition in general, back into the spotlight. Siri's success, however, is owed not only to Apple under Steve Jobs but also to Nuance, the provider of its speech recognition technology. Nuance is one of the largest companies specializing in the development and sale of speech recognition, image processing, and input method software. Besides the technology behind Siri, Nuance has another record-breaking product, the T9 input method, which will be familiar to anyone who used a Nokia or similar phone before the iPhone became popular. The speech recognition feature implemented here is based on Nuance's ASR (automatic speech recognition) technology. Siri is just one application of it, and Nuance's speech recognition reaches much further: besides English, French, German, and other Western languages, it also covers East Asian languages such as Chinese, Cantonese, and Japanese.

Before getting started, register a developer account with Nuance, obtain an application key, and download the SDK. The SDK is described here: http://www.nuance.com/for-developers/dragon-mobile-sdk/index.htm

With those in place, you can start developing speech recognition programs. The SDK comes in two main versions, iOS and Android. The iOS version was released earlier and is currently the more stable of the two, and many iOS apps are built on it. Nuance offers both a free and a commercial edition; for commercial development, purchasing the commercial solution is recommended.

The following are the basic steps and code snippets for developing speech recognition:

Import the speech recognition framework SpeechKit.framework.
Set the SpeechKitApplicationKey.

// Copy this key from the email sent by Nuance.
const unsigned char SpeechKitApplicationKey[] =
{0x47, 0xbe, 0x50, 0x57, 0x05, 0xde, 0x0f, 0x0e,
0x70, 0x63, 0x10, 0x4b, 0xb2, 0xad, 0xfb, 0xab,
0x14, 0x96, 0x99, 0x0d, 0x8e, 0x50, 0x2c, 0x1a,
0xb2, 0x5b, 0xf6, 0x76, 0x7d, 0xd8, 0xd5, 0xc5,
0x97, 0x25, 0x1c, 0x9c, 0x03, 0x2c, 0xaa, 0x74,
0x8f, 0xba, 0xbf, 0x42, 0x67, 0xba, 0xed, 0x7b,
0x50, 0x87, 0x88, 0xde, 0xd7, 0xb4, 0xf8, 0x89,
0x10, 0xef, 0xff, 0x8d, 0xc7, 0xd5, 0x52, 0x51};

NSString *const SpeechKitID = @"XXXX";
NSString *const SpeechHost = @"sandbox.nmdp.nuancemobility.net";
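
With the key, ID, and host defined, the framework still has to be initialized before a recognizer is created. The following is a minimal sketch, assuming the `setupWithID:host:port:useSSL:delegate:` class method found in Dragon Mobile SDK releases of this period (older releases may use a different signature). The port `443`, the `useSSL:NO` setting, the `nil` delegate, and the placement in `viewDidLoad` are assumptions, so check them against your SDK version and the values in Nuance's email.

```objc
#import <SpeechKit/SpeechKit.h>

// Inside the view controller's @implementation.
// Hypothetical placement: any point before the first SKRecognizer is created.
- (void)viewDidLoad {
    [super viewDidLoad];

    // SpeechKitID and SpeechHost are the constants defined above.
    // Port and SSL setting are assumptions; use the values Nuance provides.
    [SpeechKit setupWithID:SpeechKitID
                      host:SpeechHost
                      port:443
                    useSSL:NO
                  delegate:nil];
}
```

The application key is not passed here: the array has to keep the exact name `SpeechKitApplicationKey`, because the framework looks it up by that symbol.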

Instantiate the speech recognition object and set parameters.

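// detectionType (SKEndOfSpeechDetection), recoType, and langType (NSString *)
// are assumed to be declared elsewhere, e.g. as instance variables.

// End-of-speech detection type: decides when recording stops (after a short pause, under manual control, etc.).
detectionType = SKShortEndOfSpeechDetection;
// Recognition type: usually either search or dictation mode.
recoType = SKSearchRecognizerType;
// Language to recognize: Japanese here; can also be en, de, fr, zh_CN, etc.
langType = @"ja_JP";

// Release any previous recognizer before creating a new one (manual reference counting).
if (_voiceSearch) [_voiceSearch release];

_voiceSearch = [[SKRecognizer alloc] initWithType:recoType
                                        detection:detectionType
                                         language:langType
                                         delegate:self];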
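
If the detection type is set to manual control rather than automatic end-of-speech detection, the app has to end the recording itself. The following is a minimal sketch, assuming the recognizer's `stopRecording` and `cancel` methods and the `SKNoEndOfSpeechDetection` constant from the same SDK. The two action method names are hypothetical.

```objc
// Hypothetical button handlers inside the class that owns _voiceSearch.

// Stop recording and let recognition proceed on the audio captured so far.
// Needed when the recognizer was created with SKNoEndOfSpeechDetection.
- (IBAction)stopButtonTapped:(id)sender {
    [_voiceSearch stopRecording];
}

// Abandon the current recording or recognition request.
- (IBAction)cancelButtonTapped:(id)sender {
    [_voiceSearch cancel];
}
```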

Implement delegate methods.

// Called when recording starts.
- (void)recognizerDidBeginRecording:(SKRecognizer *)recognizer;

// Called when speech input is complete and recording has finished.
- (void)recognizerDidFinishRecording:(SKRecognizer *)recognizer;

// Called when recognition completes: the server request has finished and results are available.
- (void)recognizer:(SKRecognizer *)recognizer didFinishWithResults:(SKRecognition *)results;

// Called when a recognition error occurs.
- (void)recognizer:(SKRecognizer *)recognizer didFinishWithError:(NSError *)error suggestion:(NSString *)suggestion;
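
To show how the four callbacks fit together, here is a minimal sketch of a delegate implementation. It assumes that `SKRecognition` exposes a `results` array of candidate strings and a `firstResult` convenience method, as in the SDK's sample code. The logging and the cleanup of `_voiceSearch` are illustrative choices, not the only way to handle these events.

```objc
// Inside the @implementation of the class passed as the recognizer's delegate.

- (void)recognizerDidBeginRecording:(SKRecognizer *)recognizer {
    NSLog(@"Recording started");
}

- (void)recognizerDidFinishRecording:(SKRecognizer *)recognizer {
    NSLog(@"Recording finished, waiting for the server");
}

- (void)recognizer:(SKRecognizer *)recognizer didFinishWithResults:(SKRecognition *)results {
    // results.results is assumed to hold the candidate strings.
    if ([results.results count] > 0) {
        NSLog(@"Best match: %@", [results firstResult]);
    } else {
        NSLog(@"No result returned");
    }
    // This request is finished, so drop the recognizer (manual reference counting).
    [_voiceSearch release];
    _voiceSearch = nil;
}

- (void)recognizer:(SKRecognizer *)recognizer didFinishWithError:(NSError *)error suggestion:(NSString *)suggestion {
    NSLog(@"Recognition failed: %@", [error localizedDescription]);
    if (suggestion) {
        NSLog(@"Suggestion: %@", suggestion);
    }
    [_voiceSearch release];
    _voiceSearch = nil;
}
```

In a real app the `NSLog` calls would typically be replaced by UI updates, for example putting the best match into a search field.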