Summary of the Microsoft Speech Recognition SDK

Date: 2021-03-01 05:43:23

Code

CComPtr<ISpRecognizer>  m_pSREngine;    // Speech recognition engine interface.
CComPtr<ISpRecoContext> m_pSRContext;   // Recognition context interface.
CComPtr<ISpRecoGrammar> m_pSRGrammar;   // Recognition grammar interface.
CComPtr<ISpStream>      m_pInputStream; // Stream interface.
CComPtr<ISpObjectToken> m_pToken;       // Object token interface.
CComPtr<ISpAudio>       m_pAudio;       // Audio interface (holds the original default input stream).
ULONGLONG ullGrammerID;
HRESULT hr;

CoInitialize(NULL);
hr = m_pSREngine.CoCreateInstance(CLSID_SpInprocRecognizer);
hr = m_pSREngine->CreateRecoContext(&m_pSRContext); // Create the context.

// Set up event notification. WM_RECORD is an application-defined window message.
HWND hwnd = GetSafeHwnd();
hr = m_pSRContext->SetNotifyWindowMessage(hwnd, WM_RECORD, 0, 0);
hr = m_pSRContext->SetInterest(SPFEI(SPEI_RECOGNITION), SPFEI(SPEI_RECOGNITION));

// Use the default audio input.
hr = SpCreateDefaultObjectFromCategoryId(SPCAT_AUDIOIN, &m_pAudio);
hr = m_pSREngine->SetInput(m_pAudio, TRUE);

// Load the default grammar rules.
ullGrammerID = 1000;
hr = m_pSRContext->CreateGrammar(ullGrammerID, &m_pSRGrammar);
WCHAR wszXMLFile[MAX_PATH] = L"";
// Change the XML path here. The size argument must match the buffer size.
MultiByteToWideChar(CP_ACP, 0, "cmd.xml", -1, wszXMLFile, MAX_PATH);
hr = m_pSRGrammar->LoadCmdFromFile(wszXMLFile, SPLO_DYNAMIC);

// Start speech recognition.
hr = m_pSRGrammar->SetRuleState(NULL, NULL, SPRS_ACTIVE);
hr = m_pSREngine->SetRecoState(SPRST_ACTIVE);

Brief Introduction

ISpRecognizer

There are two implementations of the ISpRecognizer and ISpRecoContext in SAPI. One is for recognition "in-process" (InProc), where the SR engine is created in the same process as the application. Only this application can connect to this recognizer. The other implementation is the "shared-recognizer," where the SR engine is created in a separate process. There will only be one shared engine running on a system, and all applications using the shared engine connect to the same recognizer. This allows several speech applications to work simultaneously, and allows the user to speak to any application, as recognition is done from the grammars of all applications. For desktop-based speech applications it is recommended to use the shared recognizer because of the way it allows multiple SAPI applications to work at once. For other types of application, such as recognizing from wave files or a telephony server application where multiple SR engines will be required, the InProc recognizer should be used.
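The guidance above can be written down as a small decision helper. This is only a sketch: AppNeeds, RecognizerKind, ChooseRecognizerKind and CreateRecognizer are hypothetical names introduced for illustration and are not part of SAPI. The Windows-only part maps the choice to the two CLSIDs and assumes COM has already been initialized on the calling thread.

```cpp
// Hypothetical description of what the application needs (not a SAPI type).
struct AppNeeds {
    bool desktopApp;            // interactive desktop speech application
    bool readsWaveFiles;        // recognizes from wave files instead of the microphone
    bool needsMultipleEngines;  // e.g. a telephony server with one engine per line
};

enum class RecognizerKind { Shared, InProc };

// Encodes the recommendation: wave-file or multi-engine scenarios use InProc,
// ordinary desktop applications use the shared recognizer.
inline RecognizerKind ChooseRecognizerKind(const AppNeeds& needs) {
    if (needs.readsWaveFiles || needs.needsMultipleEngines) {
        return RecognizerKind::InProc;
    }
    return needs.desktopApp ? RecognizerKind::Shared : RecognizerKind::InProc;
}

#ifdef _WIN32  // SAPI is only available on Windows.
#include <windows.h>
#include <sapi.h>

// Creates the recognizer object that matches the chosen kind.
// Assumes CoInitialize/CoInitializeEx was already called on this thread.
inline HRESULT CreateRecognizer(RecognizerKind kind, ISpRecognizer** recognizer) {
    const CLSID& clsid = (kind == RecognizerKind::Shared)
                             ? CLSID_SpSharedRecognizer
                             : CLSID_SpInprocRecognizer;
    return CoCreateInstance(clsid, NULL, CLSCTX_ALL, IID_PPV_ARGS(recognizer));
}
#endif
```

An InProc recognizer still needs an audio input set with SetInput before recognition can start, as the code at the top of this article does.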

When to Use

Call methods of the ISpRecognizer interface to configure or retrieve the attributes of the SR engine.

How Created

There are two objects that implement this interface. These are created by applications by creating a COM object with either of the following CLSIDs:

SpInprocRecognizer (CLSID_SpInprocRecognizer)

SpSharedRecognizer (CLSID_SpSharedRecognizer)

Alternatively, the shared recognizer can be created by creating a SpSharedRecoContext (CLSID_SpSharedRecoContext), and then calling ISpRecoContext::GetRecognizer on this object to get a reference to the SpSharedRecognizer object.
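The alternative route through a shared recognition context might look like the following. It is a minimal sketch, assuming COM is already initialized on the calling thread and that both CComPtr arguments are empty on entry; GetSharedRecognizerViaContext is a hypothetical helper name.

```cpp
#ifdef _WIN32  // SAPI is only available on Windows.
#include <windows.h>
#include <atlbase.h>
#include <sapi.h>

// Creates a shared recognition context first, then asks it for the
// shared recognizer that it is attached to.
inline HRESULT GetSharedRecognizerViaContext(CComPtr<ISpRecoContext>& context,
                                             CComPtr<ISpRecognizer>& recognizer) {
    HRESULT hr = context.CoCreateInstance(CLSID_SpSharedRecoContext);
    if (FAILED(hr)) {
        return hr;  // e.g. no speech recognition engine is installed
    }
    return context->GetRecognizer(&recognizer);
}
#endif
```

The context obtained this way can be used directly to create grammars, so an application that only needs the shared engine does not have to create the recognizer object itself.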

Methods in Vtable Order

ISpRecognizer Methods: Description

SetRecognizer Specifies the SR engine to be used.

GetRecognizer Retrieves which SR engine is currently being used.

SetInput Specifies which input stream the SR engine should use.

GetInputObjectToken Retrieves the input token object for the stream.

GetInputStream Retrieves the input stream.

CreateRecoContext Creates a recognition context for this instance of an SR engine.

GetRecoProfile Retrieves the current recognition profile token.

SetRecoProfile Sets the recognition profile to be used by the recognizer.

IsSharedInstance Determines if the recognizer is the shared or InProc implementation.

GetRecoState Retrieves the state of the recognition engine.

SetRecoState Sets the state of the recognition engine.

GetStatus Retrieves current status information for the engine.

GetFormat Retrieves the format of the current audio input.

IsUISupported Checks if the SR engine supports a particular user interface component.

DisplayUI Displays a user interface component.

EmulateRecognition Emulates a recognition from a text phrase rather than from spoken audio.
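As an illustration of how a few of these methods fit together, the sketch below uses GetRecoState and SetRecoState to pause recognition temporarily, and IsSharedInstance to tell the two implementations apart. ScopedRecoPause and IsSharedRecognizer are hypothetical helpers written for this article, not SAPI types; the recognizer pointer is assumed to be valid and to outlive the helper.

```cpp
#ifdef _WIN32  // SAPI is only available on Windows.
#include <windows.h>
#include <sapi.h>

// IsSharedInstance returns S_OK for the shared recognizer and S_FALSE for InProc.
inline bool IsSharedRecognizer(ISpRecognizer* recognizer) {
    return recognizer->IsSharedInstance() == S_OK;
}

// Hypothetical RAII helper: sets the engine to SPRST_INACTIVE on construction
// and restores the previous state on destruction.
class ScopedRecoPause {
public:
    explicit ScopedRecoPause(ISpRecognizer* recognizer)
        : recognizer_(recognizer), previous_(SPRST_ACTIVE), paused_(false) {
        if (recognizer_ && SUCCEEDED(recognizer_->GetRecoState(&previous_))) {
            paused_ = SUCCEEDED(recognizer_->SetRecoState(SPRST_INACTIVE));
        }
    }
    ~ScopedRecoPause() {
        if (paused_) {
            recognizer_->SetRecoState(previous_);  // restore the saved state
        }
    }
    ScopedRecoPause(const ScopedRecoPause&) = delete;
    ScopedRecoPause& operator=(const ScopedRecoPause&) = delete;

private:
    ISpRecognizer* recognizer_;  // not owned
    SPRECOSTATE previous_;       // state captured before pausing
    bool paused_;                // true only if the pause actually took effect
};
#endif
```

A typical use would be to wrap a block that should not be interrupted by recognition events, for example `{ ScopedRecoPause pause(m_pSREngine); /* reload the grammar here */ }`.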

Recognition Context

ISpRecognizer::CreateRecoContext creates a recognition context for this instance of an SR engine. The recognition context is used to load recognition grammars, start and stop recognition, and receive events and recognition results.
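Receiving results from the context could be sketched as follows. The function is meant to be called from the window procedure when the application-defined WM_RECORD message registered in the code at the top of this article arrives. DrainRecognizedPhrases is a hypothetical helper name; CSpEvent and CSpDynamicString come from the sphelper.h helper header that ships with the Speech SDK.

```cpp
#ifdef _WIN32  // SAPI is only available on Windows.
#include <windows.h>
#include <atlbase.h>
#include <sapi.h>
#include <sphelper.h>
#include <string>
#include <vector>

// Pulls every pending event from the context and returns the text of each
// recognized phrase. Events other than SPEI_RECOGNITION are skipped.
inline std::vector<std::wstring> DrainRecognizedPhrases(ISpRecoContext* context) {
    std::vector<std::wstring> phrases;
    CSpEvent event;  // releases the event's payload automatically
    while (event.GetFrom(context) == S_OK) {
        if (event.eEventId != SPEI_RECOGNITION) {
            continue;
        }
        CSpDynamicString text;  // frees the CoTaskMem string on destruction
        HRESULT hr = event.RecoResult()->GetText(SP_GETWHOLEPHRASE, SP_GETWHOLEPHRASE,
                                                 TRUE, &text, NULL);
        if (SUCCEEDED(hr) && text.m_psz != NULL) {
            phrases.push_back(text.m_psz);
        }
    }
    return phrases;
}
#endif
```

In an MFC application the handler for WM_RECORD would then simply call `DrainRecognizedPhrases(m_pSRContext)` and act on each returned phrase.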