DialogServiceConnector, Connect/Disconnect impl (#90)
* Implemented DialogConnectorFactory.
* WIP
* begin add dialog tests
* connect impl and first test
* headers for browser
* expose connect
* initial listenOnce
* replace websocket use
* listenOnce with browser connection support
* listenOnce with activity responses
* expose get/setter props
* cleanup
* generic send fnc
* extracting functionality from base reco
* Save the day
* First incarnation of the BaseAudioPlayer class. (Merge conflicts: tests/DialogServiceConnectorTests.ts)
* Fix bugs in BaseAudioPlayer and added manual tests for it: server.js (web socket streaming of a long audio file; run `node server.js` from the folder) and AudioPlayerBaseTest.html (launch http-server in the folder, then load the page)
* lint cleanup
* Added missed server.js file for audio player manual tests
* Added audio output stream
* TSLint fixes
* AudioOutputStreamTests
* update message loop
* AudioOutputStream fixes: JSONification of activity, added AudioOutputStream to activity, ActivityPayloadResponse class
* refactor message loop and add sendActivity
* Turn management classes
* Add integration with turn management
* A few fixes: turn.end is not coming. (Merge branch 'brandom/dialog' of https://github.com/microsoft/cognitive-services-speech-sdk-js into gghizila/TTSStreamIntegration; conflicts: src/common.speech/DialogServiceAdapter.ts)
* Fixed the traces for turns to report "debugturn" to help with the missing turn.end investigation
* Added 2s timeouts as a workaround for messages that are not routed; added debugturn logging for investigating them; added timestamps to console.info
* remove promisehelper wait, rm cancel recos
* minor code restructuring
* The last audio message is null; when we encounter it we close the audio stream. Added isClosed to the audio output stream impl. Changed tests to defer the shutdown to avoid random red herrings.
* Added try / catch for audio message handling
* restrict config to once per connection
* Upgrade https-proxy-agent to 2.2.3 after the last alert
* fix reco end, remove speech context on turn.end
* Fix machine-in-the-middle vulnerability for "https-proxy-agent" (#97): changes generated by `npm audit fix --force`; upgraded "https-proxy-agent" to 3.0.0, as reports are inconsistent about which version fixes the vulnerability
* fix for mic staying open
* split config into BF and Commands
* add temporary ApplicationId to tests
* add reconnect on websocket timeout, config parity
* remove authHeader from botFramework config
* merge
* restore code coverage
* extend gitattributes
This commit is contained in:
Parent: 6c6134687d
Commit: 2053c62454
@@ -11,6 +11,7 @@ LICENSE text
 *.ts text
 *.txt text
 *.yml text
+*.html text
 
 # Bash only with Unix line endings
 *.sh text eol=lf
@@ -3,6 +3,7 @@
 # ignore auto-generated (transpiled) js files
 src/**/*.js
 tests/**/*.js
+!tests/AudioPlayerTests/server/server.js
 distrib/*
 secrets/*
 /**/speech.key
@@ -21,7 +21,7 @@
 "request": "launch",
 "name": "Jest Current File",
 "program": "${workspaceFolder}/node_modules/.bin/jest",
-"args": ["--runInBand", "--coverage", "false", "${relativeFile}"],
+"args": ["--runInBand", "--coverage", "false", "${fileBasename}"],
 "console": "integratedTerminal",
 "internalConsoleOptions": "neverOpen",
 "windows": {
@@ -1108,9 +1108,9 @@
 }
 },
 "asn1.js": {
-"version": "4.10.1",
-"resolved": "https://registry.npmjs.org/asn1.js/-/asn1.js-4.10.1.tgz",
-"integrity": "sha512-p32cOF5q0Zqs9uBiONKYLm6BClCoBCM5O9JfeUSlnQLBTxYdTK+pW+nXflm8UkKd2UYlEbYz5qEi0JuZR9ckSw==",
+"version": "5.0.0",
+"resolved": "https://registry.npmjs.org/asn1.js/-/asn1.js-5.0.0.tgz",
+"integrity": "sha512-Y+FKviD0uyIWWo/xE0XkUl0x1allKFhzEVJ+//2Dgqpy+n+B77MlPNqvyk7Vx50M9XyVzjnRhDqJAEAsyivlbA==",
 "dev": true,
 "requires": {
 "bn.js": "^4.0.0",
@@ -6565,6 +6565,19 @@
 "evp_bytestokey": "^1.0.0",
 "pbkdf2": "^3.0.3",
 "safe-buffer": "^5.1.1"
 },
+"dependencies": {
+"asn1.js": {
+"version": "4.10.1",
+"resolved": "https://registry.npmjs.org/asn1.js/-/asn1.js-4.10.1.tgz",
+"integrity": "sha512-p32cOF5q0Zqs9uBiONKYLm6BClCoBCM5O9JfeUSlnQLBTxYdTK+pW+nXflm8UkKd2UYlEbYz5qEi0JuZR9ckSw==",
+"dev": true,
+"requires": {
+"bn.js": "^4.0.0",
+"inherits": "^2.0.1",
+"minimalistic-assert": "^1.0.0"
+}
+}
+},
 },
 "parse-filepath": {
@@ -49,6 +49,7 @@
 "@types/node": "^12.6.8",
 "@types/request": "^2.48.2",
 "@types/ws": "^6.0.1",
+"asn1.js": "^5.0.0",
 "dts-bundle-webpack": "^1.0.2",
 "gulp": "^4.0.2",
 "gulp-rename": "^1.4.0",
@@ -68,7 +69,7 @@
 },
 "scripts": {
 "build": "gulp compress && gulp build2015",
-"test": "npm run lint && npm run jest",
+"test": "npm run lint && npm run jest --coverage",
 "jest": "jest",
 "lint": "tslint -p tsconfig.json",
 "civersion": "node ci/version.js",
@@ -85,7 +85,7 @@ export class ReplayableAudioNode implements IAudioStreamNode {
     }
 
     public replay(): void {
-        if (0 !== this.privBuffers.length) {
+        if (this.privBuffers && 0 !== this.privBuffers.length) {
             this.privReplay = true;
             this.privReplayOffset = this.privLastShrinkOffset;
         }
@@ -75,7 +75,8 @@ export class WebsocketConnection implements IConnection {
             this.privUri,
             this.id,
             this.privMessageFormatter,
-            proxyInfo);
+            proxyInfo,
+            headers);
     }
 
     public dispose = (): void => {
@@ -51,6 +51,7 @@ export class WebsocketMessageAdapter {
     private privConnectionId: string;
     private privUri: string;
     private proxyInfo: ProxyInfo;
+    private privHeaders: { [key: string]: string; };
 
     public static forceNpmWebSocket: boolean = false;
 
@@ -58,7 +59,8 @@ export class WebsocketMessageAdapter {
         uri: string,
         connectionId: string,
         messageFormatter: IWebsocketMessageFormatter,
-        proxyInfo: ProxyInfo) {
+        proxyInfo: ProxyInfo,
+        headers: { [key: string]: string; }) {
 
         if (!uri) {
             throw new ArgumentNullError("uri");
|
@ -74,6 +76,7 @@ export class WebsocketMessageAdapter {
|
|||
this.privMessageFormatter = messageFormatter;
|
||||
this.privConnectionState = ConnectionState.None;
|
||||
this.privUri = uri;
|
||||
this.privHeaders = headers;
|
||||
}
|
||||
|
||||
public get state(): ConnectionState {
|
||||
|
@@ -117,7 +120,7 @@ export class WebsocketMessageAdapter {
             }
 
             const httpProxyAgent: HttpsProxyAgent = new HttpsProxyAgent(httpProxyOptions);
-            const httpsOptions: http.RequestOptions = { agent: httpProxyAgent };
+            const httpsOptions: http.RequestOptions = { agent: httpProxyAgent, headers: this.privHeaders };
 
             this.privWebsocketClient = new ws(this.privUri, httpsOptions as ws.ClientOptions);
 
@@ -150,7 +153,7 @@ export class WebsocketMessageAdapter {
                 this.privCertificateValidatedDeferral.resolve(true);
 
                 const ocspAgent: ocsp.Agent = new ocsp.Agent({});
-                const options: ws.ClientOptions = { agent: ocspAgent };
+                const options: ws.ClientOptions = { agent: ocspAgent, headers: this.privHeaders };
                 this.privWebsocketClient = new ws(this.privUri, options);
             }
         }
@@ -0,0 +1,34 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT license.
+
+/**
+ * Represents the JSON used in the agent.config message sent to the speech service.
+ */
+export class AgentConfig {
+    private iPrivConfig: IAgentConfig;
+
+    public toJsonString(): string {
+        return JSON.stringify(this.iPrivConfig);
+    }
+
+    public get(): IAgentConfig {
+        return this.iPrivConfig;
+    }
+
+    /**
+     * Setter for the agent.config object.
+     * @param value a JSON serializable object.
+     */
+    public set(value: IAgentConfig): void {
+        this.iPrivConfig = value;
+    }
+}
+
+export interface IAgentConfig {
+    botInfo: {
+        commType: string,
+        connectionId: string,
+        conversationId: string
+    };
+    version: number;
+}
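The AgentConfig class added above is small enough to exercise standalone. The sketch below copies it out of the diff and round-trips a made-up IAgentConfig value — the field values are illustrative only, not service-defined:

```typescript
// Minimal copy of the AgentConfig class from the diff above, exercised
// with an illustrative IAgentConfig value.
interface IAgentConfig {
    botInfo: { commType: string; connectionId: string; conversationId: string };
    version: number;
}

class AgentConfig {
    private iPrivConfig!: IAgentConfig;
    public toJsonString(): string { return JSON.stringify(this.iPrivConfig); }
    public get(): IAgentConfig { return this.iPrivConfig; }
    public set(value: IAgentConfig): void { this.iPrivConfig = value; }
}

const config = new AgentConfig();
config.set({ botInfo: { commType: "Default", connectionId: "", conversationId: "" }, version: 0.5 });

// The conversation id can be patched in later, the way DialogServiceAdapter
// does when a "response" message carries one.
const updated = config.get();
updated.botInfo.conversationId = "abc123";
config.set(updated);
```

Keeping the config behind get/set (rather than exposing the raw object) is what lets the adapter rewrite botInfo.conversationId mid-session without re-sending the whole agent.config message.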
@@ -0,0 +1,81 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT license.
+
+import {
+    ProxyInfo,
+    WebsocketConnection,
+} from "../common.browser/Exports";
+import { IConnection, IStringDictionary } from "../common/Exports";
+import { PropertyId } from "../sdk/Exports";
+import { ConnectionFactoryBase } from "./ConnectionFactoryBase";
+import { AuthInfo, RecognizerConfig, WebsocketMessageFormatter } from "./Exports";
+import { QueryParameterNames } from "./QueryParameterNames";
+
+const baseUrl: string = "convai.speech.microsoft.com";
+
+interface IBackendValues {
+    authHeader: string;
+    resourcePath: string;
+    version: string;
+}
+
+const botFramework: IBackendValues = {
+    authHeader: "X-DLS-Secret",
+    resourcePath: "",
+    version: "v3"
+};
+
+const speechCommands: IBackendValues = {
+    authHeader: "X-CommandsAppId",
+    resourcePath: "commands",
+    version: "v1"
+};
+
+const pathSuffix: string = "api";
+
+function getDialogSpecificValues(dialogType: string): IBackendValues {
+    switch (dialogType) {
+        case "speech_commands": {
+            return speechCommands;
+        }
+        case "bot_framework": {
+            return botFramework;
+        }
+    }
+    throw new Error(`Invalid dialog type '${dialogType}'`);
+}
+
+export class DialogConnectionFactory extends ConnectionFactoryBase {
+
+    public create = (
+        config: RecognizerConfig,
+        authInfo: AuthInfo,
+        connectionId?: string): IConnection => {
+
+        const applicationId: string = config.parameters.getProperty(PropertyId.Conversation_ApplicationId, "");
+        const dialogType: string = config.parameters.getProperty(PropertyId.Conversation_DialogType);
+        const region: string = config.parameters.getProperty(PropertyId.SpeechServiceConnection_Region);
+
+        const language: string = config.parameters.getProperty(PropertyId.SpeechServiceConnection_RecoLanguage, "en-US");
+
+        const queryParams: IStringDictionary<string> = {};
+        queryParams[QueryParameterNames.LanguageParamName] = language;
+
+        const {resourcePath, version, authHeader} = getDialogSpecificValues(dialogType);
+
+        const headers: IStringDictionary<string> = {};
+        headers[authInfo.headerName] = authInfo.token;
+        headers[QueryParameterNames.ConnectionIdHeader] = connectionId;
+
+        let endpoint: string;
+        // ApplicationId is only required for CustomCommands
+        if (applicationId === "") {
+            endpoint = `wss://${region}.${baseUrl}/${pathSuffix}/${version}`;
+        } else {
+            endpoint = `wss://${region}.${baseUrl}/${resourcePath}/${pathSuffix}/${version}`;
+            headers[authHeader] = applicationId;
+        }
+
+        return new WebsocketConnection(endpoint, queryParams, headers, new WebsocketMessageFormatter(), ProxyInfo.fromRecognizerConfig(config), connectionId);
+    }
+}
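The endpoint selection in the new DialogConnectionFactory can be illustrated standalone. The sketch below restates just the URL construction from the factory; `buildEndpoint` is a hypothetical helper for illustration, not an SDK export:

```typescript
// Standalone restatement of the endpoint logic in DialogConnectionFactory.
// buildEndpoint is illustrative only; the factory builds the URL inline.
const baseUrl = "convai.speech.microsoft.com";
const pathSuffix = "api";

interface BackendValues { resourcePath: string; version: string; }

const backends: { [dialogType: string]: BackendValues } = {
    bot_framework: { resourcePath: "", version: "v3" },
    speech_commands: { resourcePath: "commands", version: "v1" },
};

// ApplicationId is only required for Custom Commands; when it is empty
// the resource path is omitted from the URL, matching the factory.
function buildEndpoint(region: string, dialogType: string, applicationId: string): string {
    const { resourcePath, version } = backends[dialogType];
    return applicationId === ""
        ? `wss://${region}.${baseUrl}/${pathSuffix}/${version}`
        : `wss://${region}.${baseUrl}/${resourcePath}/${pathSuffix}/${version}`;
}

console.log(buildEndpoint("westus2", "bot_framework", ""));
// wss://westus2.convai.speech.microsoft.com/api/v3
```

This is why the "split config into BF and Commands" change in the commit message matters: Bot Framework and Speech Commands differ in resource path, protocol version, and auth header, and only the latter carries an application id.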
@ -0,0 +1,784 @@
|
|||
// Copyright (c) Microsoft Corporation. All rights reserved.
|
||||
// Licensed under the MIT license.
|
||||
|
||||
import { ReplayableAudioNode } from "../common.browser/Exports";
|
||||
import {
|
||||
ConnectionEvent,
|
||||
ConnectionMessage,
|
||||
ConnectionOpenResponse,
|
||||
ConnectionState,
|
||||
createGuid,
|
||||
createNoDashGuid,
|
||||
Deferred,
|
||||
IAudioSource,
|
||||
IAudioStreamNode,
|
||||
IConnection,
|
||||
IStreamChunk,
|
||||
MessageType,
|
||||
Promise,
|
||||
PromiseHelper,
|
||||
PromiseResult,
|
||||
} from "../common/Exports";
|
||||
import { PullAudioOutputStreamImpl } from "../sdk/Audio/AudioOutputStream";
|
||||
import { AudioStreamFormatImpl } from "../sdk/Audio/AudioStreamFormat";
|
||||
import {
|
||||
ActivityReceivedEventArgs,
|
||||
AudioOutputStream,
|
||||
CancellationErrorCode,
|
||||
CancellationReason,
|
||||
DialogServiceConnector,
|
||||
PropertyCollection,
|
||||
PropertyId,
|
||||
PullAudioOutputStream,
|
||||
RecognitionEventArgs,
|
||||
ResultReason,
|
||||
SessionEventArgs,
|
||||
SpeechRecognitionCanceledEventArgs,
|
||||
SpeechRecognitionEventArgs,
|
||||
SpeechRecognitionResult,
|
||||
} from "../sdk/Exports";
|
||||
import { DialogServiceTurnStateManager } from "./DialogServiceTurnStateManager";
|
||||
import {
|
||||
AgentConfig,
|
||||
CancellationErrorCodePropertyName,
|
||||
EnumTranslation,
|
||||
ISpeechConfigAudioDevice,
|
||||
RecognitionStatus,
|
||||
RequestSession,
|
||||
ServiceRecognizerBase,
|
||||
SimpleSpeechPhrase,
|
||||
SpeechDetected,
|
||||
SpeechHypothesis,
|
||||
} from "./Exports";
|
||||
import { AuthInfo, IAuthentication } from "./IAuthentication";
|
||||
import { IConnectionFactory } from "./IConnectionFactory";
|
||||
import { RecognitionMode, RecognizerConfig } from "./RecognizerConfig";
|
||||
import { ActivityPayloadResponse } from "./ServiceMessages/ActivityResponsePayload";
|
||||
import { SpeechConnectionMessage } from "./SpeechConnectionMessage.Internal";
|
||||
|
||||
export class DialogServiceAdapter extends ServiceRecognizerBase {
|
||||
private privDialogServiceConnector: DialogServiceConnector;
|
||||
private privDialogConnectionFactory: IConnectionFactory;
|
||||
private privDialogAuthFetchEventId: string;
|
||||
private privDialogIsDisposed: boolean;
|
||||
private privDialogAuthentication: IAuthentication;
|
||||
private privDialogAudioSource: IAudioSource;
|
||||
private privDialogRequestSession: RequestSession;
|
||||
|
||||
// A promise for a configured connection.
|
||||
// Do not consume directly, call fetchDialogConnection instead.
|
||||
private privConnectionConfigPromise: Promise<IConnection>;
|
||||
|
||||
// A promise for a connection, but one that has not had the speech context sent yet.
|
||||
// Do not consume directly, call fetchDialogConnection instead.
|
||||
private privDialogConnectionPromise: Promise<IConnection>;
|
||||
|
||||
private privSuccessCallback: (e: SpeechRecognitionResult) => void;
|
||||
private privConnectionLoop: Promise<IConnection>;
|
||||
private terminateMessageLoop: boolean;
|
||||
private agentConfigSent: boolean;
|
||||
|
||||
// Turns are of two kinds:
|
||||
// 1: SR turns, end when the SR result is returned and then turn end.
|
||||
// 2: Service turns where an activity is sent by the service along with the audio.
|
||||
private privTurnStateManager: DialogServiceTurnStateManager;
|
||||
|
||||
public constructor(
|
||||
authentication: IAuthentication,
|
||||
connectionFactory: IConnectionFactory,
|
||||
audioSource: IAudioSource,
|
||||
recognizerConfig: RecognizerConfig,
|
||||
dialogServiceConnector: DialogServiceConnector) {
|
||||
|
||||
super(authentication, connectionFactory, audioSource, recognizerConfig, dialogServiceConnector);
|
||||
|
||||
this.privDialogServiceConnector = dialogServiceConnector;
|
||||
this.privDialogAuthentication = authentication;
|
||||
this.receiveMessageOverride = this.receiveDialogMessageOverride;
|
||||
this.privTurnStateManager = new DialogServiceTurnStateManager();
|
||||
this.recognizeOverride = this.listenOnce;
|
||||
this.connectImplOverride = this.dialogConnectImpl;
|
||||
this.configConnectionOverride = this.configConnection;
|
||||
this.fetchConnectionOverride = this.fetchDialogConnection;
|
||||
this.disconnectOverride = this.privDisconnect;
|
||||
this.privDialogAudioSource = audioSource;
|
||||
this.privDialogRequestSession = new RequestSession(audioSource.id());
|
||||
this.privDialogConnectionFactory = connectionFactory;
|
||||
this.privDialogIsDisposed = false;
|
||||
this.agentConfigSent = false;
|
||||
}
|
||||
|
||||
public isDisposed(): boolean {
|
||||
return this.privDialogIsDisposed;
|
||||
}
|
||||
|
||||
public dispose(reason?: string): void {
|
||||
this.privDialogIsDisposed = true;
|
||||
if (this.privConnectionConfigPromise) {
|
||||
this.privConnectionConfigPromise.onSuccessContinueWith((connection: IConnection) => {
|
||||
connection.dispose(reason);
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
public sendMessage = (message: string): void => {
|
||||
const interactionGuid: string = createGuid();
|
||||
const requestId: string = createNoDashGuid();
|
||||
|
||||
const agentMessage: any = {
|
||||
context: {
|
||||
interactionId: interactionGuid
|
||||
},
|
||||
messagePayload: message,
|
||||
version: 0.5
|
||||
};
|
||||
|
||||
const agentMessageJson = JSON.stringify(agentMessage);
|
||||
|
||||
this.fetchDialogConnection().onSuccessContinueWith((connection: IConnection) => {
|
||||
connection.send(new SpeechConnectionMessage(
|
||||
MessageType.Text,
|
||||
"agent",
|
||||
requestId,
|
||||
"application/json",
|
||||
agentMessageJson));
|
||||
});
|
||||
}
|
||||
|
||||
protected privDisconnect(): void {
|
||||
this.cancelRecognition(this.privDialogRequestSession.sessionId,
|
||||
this.privDialogRequestSession.requestId,
|
||||
CancellationReason.Error,
|
||||
CancellationErrorCode.NoError,
|
||||
"Disconnecting",
|
||||
undefined);
|
||||
|
||||
this.terminateMessageLoop = true;
|
||||
this.agentConfigSent = false;
|
||||
if (this.privDialogConnectionPromise.result().isCompleted) {
|
||||
if (!this.privDialogConnectionPromise.result().isError) {
|
||||
this.privDialogConnectionPromise.result().result.dispose();
|
||||
this.privDialogConnectionPromise = null;
|
||||
}
|
||||
} else {
|
||||
this.privDialogConnectionPromise.onSuccessContinueWith((connection: IConnection) => {
|
||||
connection.dispose();
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
protected processTypeSpecificMessages(
|
||||
connectionMessage: SpeechConnectionMessage,
|
||||
successCallback?: (e: SpeechRecognitionResult) => void,
|
||||
errorCallBack?: (e: string) => void): void {
|
||||
|
||||
const resultProps: PropertyCollection = new PropertyCollection();
|
||||
if (connectionMessage.messageType === MessageType.Text) {
|
||||
resultProps.setProperty(PropertyId.SpeechServiceResponse_JsonResult, connectionMessage.textBody);
|
||||
}
|
||||
|
||||
let result: SpeechRecognitionResult;
|
||||
|
||||
switch (connectionMessage.path.toLowerCase()) {
|
||||
case "speech.phrase":
|
||||
const speechPhrase: SimpleSpeechPhrase = SimpleSpeechPhrase.fromJSON(connectionMessage.textBody);
|
||||
|
||||
this.privDialogRequestSession.onPhraseRecognized(this.privDialogRequestSession.currentTurnAudioOffset + speechPhrase.Offset + speechPhrase.Duration);
|
||||
|
||||
if (speechPhrase.RecognitionStatus === RecognitionStatus.Success) {
|
||||
const args: SpeechRecognitionEventArgs = this.fireEventForResult(speechPhrase, resultProps);
|
||||
if (!!this.privDialogServiceConnector.recognized) {
|
||||
try {
|
||||
this.privDialogServiceConnector.recognized(this.privDialogServiceConnector, args);
|
||||
/* tslint:disable:no-empty */
|
||||
} catch (error) {
|
||||
// Not going to let errors in the event handler
|
||||
// trip things up.
|
||||
}
|
||||
}
|
||||
|
||||
// report result to promise.
|
||||
if (!!this.privSuccessCallback) {
|
||||
try {
|
||||
this.privSuccessCallback(args.result);
|
||||
} catch (e) {
|
||||
if (!!errorCallBack) {
|
||||
errorCallBack(e);
|
||||
}
|
||||
}
|
||||
// Only invoke the call back once.
|
||||
// and if it's successful don't invoke the
|
||||
// error after that.
|
||||
this.privSuccessCallback = undefined;
|
||||
errorCallBack = undefined;
|
||||
}
|
||||
}
|
||||
break;
|
||||
case "speech.hypothesis":
|
||||
const hypothesis: SpeechHypothesis = SpeechHypothesis.fromJSON(connectionMessage.textBody);
|
||||
const offset: number = hypothesis.Offset + this.privDialogRequestSession.currentTurnAudioOffset;
|
||||
|
||||
result = new SpeechRecognitionResult(
|
||||
this.privDialogRequestSession.requestId,
|
||||
ResultReason.RecognizingSpeech,
|
||||
hypothesis.Text,
|
||||
hypothesis.Duration,
|
||||
offset,
|
||||
undefined,
|
||||
connectionMessage.textBody,
|
||||
resultProps);
|
||||
|
||||
this.privDialogRequestSession.onHypothesis(offset);
|
||||
|
||||
const ev = new SpeechRecognitionEventArgs(result, hypothesis.Duration, this.privDialogRequestSession.sessionId);
|
||||
|
||||
if (!!this.privDialogServiceConnector.recognizing) {
|
||||
try {
|
||||
this.privDialogServiceConnector.recognizing(this.privDialogServiceConnector, ev);
|
||||
/* tslint:disable:no-empty */
|
||||
} catch (error) {
|
||||
// Not going to let errors in the event handler
|
||||
// trip things up.
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case "audio":
|
||||
{
|
||||
const audioRequestId = connectionMessage.requestId.toUpperCase();
|
||||
const turn = this.privTurnStateManager.GetTurn(audioRequestId);
|
||||
try {
|
||||
// Empty binary message signals end of stream.
|
||||
if (!connectionMessage.binaryBody) {
|
||||
turn.endAudioStream();
|
||||
} else {
|
||||
turn.audioStream.write(connectionMessage.binaryBody);
|
||||
}
|
||||
} catch (error) {
|
||||
// Not going to let errors in the event handler
|
||||
// trip things up.
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
case "response":
|
||||
{
|
||||
const responseRequestId = connectionMessage.requestId.toUpperCase();
|
||||
const activityPayload: ActivityPayloadResponse = ActivityPayloadResponse.fromJSON(connectionMessage.textBody);
|
||||
const turn = this.privTurnStateManager.GetTurn(responseRequestId);
|
||||
|
||||
// update the conversation Id
|
||||
if (activityPayload.conversationId) {
|
||||
const updateAgentConfig = this.agentConfig.get();
|
||||
updateAgentConfig.botInfo.conversationId = activityPayload.conversationId;
|
||||
this.agentConfig.set(updateAgentConfig);
|
||||
}
|
||||
|
||||
const pullAudioOutputStream: PullAudioOutputStreamImpl = turn.processActivityPayload(activityPayload);
|
||||
const activity = new ActivityReceivedEventArgs(activityPayload.messagePayload, pullAudioOutputStream);
|
||||
if (!!this.privDialogServiceConnector.activityReceived) {
|
||||
try {
|
||||
this.privDialogServiceConnector.activityReceived(this.privDialogServiceConnector, activity);
|
||||
/* tslint:disable:no-empty */
|
||||
} catch (error) {
|
||||
// Not going to let errors in the event handler
|
||||
// trip things up.
|
||||
}
|
||||
}
|
||||
}
|
||||
break;
|
||||
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
// Cancels recognition.
|
||||
protected cancelRecognition(
|
||||
sessionId: string,
|
||||
requestId: string,
|
||||
cancellationReason: CancellationReason,
|
||||
errorCode: CancellationErrorCode,
|
||||
error: string,
|
||||
cancelRecoCallback: (e: SpeechRecognitionResult) => void): void {
|
||||
|
||||
this.terminateMessageLoop = true;
|
||||
|
||||
if (!!this.privDialogRequestSession.isRecognizing) {
|
||||
this.privDialogRequestSession.onStopRecognizing();
|
||||
}
|
||||
|
||||
if (!!this.privDialogServiceConnector.canceled) {
|
||||
const properties: PropertyCollection = new PropertyCollection();
|
||||
properties.setProperty(CancellationErrorCodePropertyName, CancellationErrorCode[errorCode]);
|
||||
|
||||
const cancelEvent: SpeechRecognitionCanceledEventArgs = new SpeechRecognitionCanceledEventArgs(
|
||||
cancellationReason,
|
||||
error,
|
||||
errorCode,
|
||||
undefined,
|
||||
sessionId);
|
||||
|
||||
try {
|
||||
this.privDialogServiceConnector.canceled(this.privDialogServiceConnector, cancelEvent);
|
||||
/* tslint:disable:no-empty */
|
||||
} catch { }
|
||||
|
||||
if (!!cancelRecoCallback) {
|
||||
const result: SpeechRecognitionResult = new SpeechRecognitionResult(
|
||||
undefined, // ResultId
|
||||
ResultReason.Canceled,
|
||||
undefined, // Text
|
||||
undefined, // Druation
|
||||
undefined, // Offset
|
||||
error,
|
||||
undefined, // Json
|
||||
properties);
|
||||
try {
|
||||
cancelRecoCallback(result);
|
||||
/* tslint:disable:no-empty */
|
||||
} catch { }
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
protected listenOnce = (
|
||||
recoMode: RecognitionMode,
|
||||
successCallback: (e: SpeechRecognitionResult) => void,
|
||||
errorCallback: (e: string) => void
|
||||
): any => {
|
||||
this.privRecognizerConfig.recognitionMode = recoMode;
|
||||
|
||||
this.privDialogRequestSession.startNewRecognition();
|
||||
this.privDialogRequestSession.listenForServiceTelemetry(this.privDialogAudioSource.events);
|
||||
|
||||
// Start the connection to the service. The promise this will create is stored and will be used by configureConnection().
|
||||
this.dialogConnectImpl();
|
||||
|
||||
this.sendPreAudioMessages();
|
||||
|
||||
this.privSuccessCallback = successCallback;
|
||||
|
||||
return this.privDialogAudioSource
|
||||
.attach(this.privDialogRequestSession.audioNodeId)
|
||||
.continueWithPromise<boolean>((result: PromiseResult<IAudioStreamNode>) => {
|
||||
let audioNode: ReplayableAudioNode;
|
||||
|
||||
if (result.isError) {
|
||||
this.cancelRecognition(this.privDialogRequestSession.sessionId, this.privDialogRequestSession.requestId, CancellationReason.Error, CancellationErrorCode.ConnectionFailure, result.error, successCallback);
|
||||
return PromiseHelper.fromError<boolean>(result.error);
|
||||
} else {
|
||||
audioNode = new ReplayableAudioNode(result.result, this.privDialogAudioSource.format as AudioStreamFormatImpl);
|
||||
this.privDialogRequestSession.onAudioSourceAttachCompleted(audioNode, false);
|
||||
}
|
||||
|
||||
return this.privDialogAudioSource.deviceInfo.onSuccessContinueWithPromise<boolean>((deviceInfo: ISpeechConfigAudioDevice): Promise<boolean> => {
|
||||
this.privRecognizerConfig.SpeechServiceConfig.Context.audio = { source: deviceInfo };
|
||||
|
||||
return this.configConnection()
|
||||
.on((_: IConnection) => {
|
||||
const sessionStartEventArgs: SessionEventArgs = new SessionEventArgs(this.privDialogRequestSession.sessionId);
|
||||
|
||||
if (!!this.privRecognizer.sessionStarted) {
|
||||
this.privRecognizer.sessionStarted(this.privRecognizer, sessionStartEventArgs);
|
||||
}
|
||||
|
||||
const audioSendPromise = this.sendAudio(audioNode);
|
||||
|
||||
// /* tslint:disable:no-empty */
|
||||
audioSendPromise.on((_: boolean) => { /*add? return true;*/ }, (error: string) => {
|
||||
this.cancelRecognition(this.privDialogRequestSession.sessionId, this.privDialogRequestSession.requestId, CancellationReason.Error, CancellationErrorCode.RuntimeError, error, successCallback);
|
||||
});
|
||||
|
||||
}, (error: string) => {
|
||||
this.cancelRecognition(this.privDialogRequestSession.sessionId, this.privDialogRequestSession.requestId, CancellationReason.Error, CancellationErrorCode.ConnectionFailure, error, successCallback);
|
||||
}).continueWithPromise<boolean>((result: PromiseResult<IConnection>): Promise<boolean> => {
|
||||
if (result.isError) {
|
||||
return PromiseHelper.fromError(result.error);
|
||||
} else {
|
||||
return PromiseHelper.fromResult<boolean>(true);
|
||||
}
|
||||
});
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
    protected sendAudio = (
        audioStreamNode: IAudioStreamNode): Promise<boolean> => {
        // NOTE: Home-baked promises crash iOS Safari during the invocation
        // of the error callback chain (looks like the recursion is way too deep, and
        // it blows up the stack). The following construct is a stop-gap that does not
        // bubble the error up the callback chain and hence circumvents this problem.
        // TODO: rewrite with ES6 promises.
        const deferred = new Deferred<boolean>();

        // The time we last sent data to the service.
        let nextSendTime: number = Date.now();

        const audioFormat: AudioStreamFormatImpl = this.privDialogAudioSource.format as AudioStreamFormatImpl;

        // Max amount to send before we start to throttle
        const fastLaneSizeMs: string = this.privRecognizerConfig.parameters.getProperty("SPEECH-TransmitLengthBeforThrottleMs", "5000");
        const maxSendUnthrottledBytes: number = audioFormat.avgBytesPerSec / 1000 * parseInt(fastLaneSizeMs, 10);
        const startRecogNumber: number = this.privDialogRequestSession.recogNumber;

        const readAndUploadCycle = () => {

            // If speech is done, stop sending audio.
            if (!this.privDialogIsDisposed &&
                !this.privDialogRequestSession.isSpeechEnded &&
                this.privDialogRequestSession.isRecognizing &&
                this.privDialogRequestSession.recogNumber === startRecogNumber) {
                this.fetchDialogConnection().on((connection: IConnection) => {
                    audioStreamNode.read().on(
                        (audioStreamChunk: IStreamChunk<ArrayBuffer>) => {
                            // We have a new audio chunk to upload.
                            if (this.privDialogRequestSession.isSpeechEnded) {
                                // If the service has already recognized audio end, don't send any more audio.
                                deferred.resolve(true);
                                return;
                            }

                            let payload: ArrayBuffer;
                            let sendDelay: number;

                            if (audioStreamChunk.isEnd) {
                                payload = null;
                                sendDelay = 0;
                            } else {
                                payload = audioStreamChunk.buffer;
                                this.privDialogRequestSession.onAudioSent(payload.byteLength);

                                if (maxSendUnthrottledBytes >= this.privDialogRequestSession.bytesSent) {
                                    sendDelay = 0;
                                } else {
                                    sendDelay = Math.max(0, nextSendTime - Date.now());
                                }
                            }

                            // Are we ready to send, or need we delay more?
                            setTimeout(() => {
                                if (payload !== null) {
                                    nextSendTime = Date.now() + (payload.byteLength * 1000 / (audioFormat.avgBytesPerSec * 2));
                                }

                                const uploaded: Promise<boolean> = connection.send(
                                    new SpeechConnectionMessage(
                                        MessageType.Binary, "audio", this.privDialogRequestSession.requestId, null, payload));

                                if (!audioStreamChunk.isEnd) {
                                    uploaded.continueWith((_: PromiseResult<boolean>) => {

                                        // Regardless of success or failure, schedule the next upload.
                                        // If the underlying connection was broken, the next cycle will
                                        // get a new connection and re-transmit missing audio automatically.
                                        readAndUploadCycle();
                                    });
                                } else {
                                    // The audio stream has been closed; no need to schedule the next
                                    // read-upload cycle.
                                    this.privDialogRequestSession.onSpeechEnded();
                                    deferred.resolve(true);
                                }
                            }, sendDelay);
                        },
                        (error: string) => {
                            if (this.privDialogRequestSession.isSpeechEnded) {
                                // For whatever reason, Reject is used to remove queue subscribers inside
                                // the Queue.DrainAndDispose invoked from DetachAudioNode down below, which
                                // means that sometimes things can be rejected in normal circumstances, without
                                // any errors.
                                deferred.resolve(true); // TODO: remove the argument, it's completely meaningless.
                            } else {
                                // Only reject if there was a proper error.
                                deferred.reject(error);
                            }
                        });
                }, (error: string) => {
                    deferred.reject(error);
                });
            }
        };

        readAndUploadCycle();

        return deferred.promise();
    }

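The throttling in `readAndUploadCycle` above sends freely until a fast-lane byte budget is exhausted, then paces uploads against a scheduled send time that advances by half of each chunk's real-time duration. A standalone sketch of that math, with illustrative function names that are not part of the SDK:

```typescript
// Bytes allowed before pacing starts: bytes-per-millisecond times the fast-lane window,
// matching `avgBytesPerSec / 1000 * parseInt(fastLaneSizeMs, 10)` in sendAudio.
function maxUnthrottledBytes(avgBytesPerSec: number, fastLaneSizeMs: number): number {
    return avgBytesPerSec / 1000 * fastLaneSizeMs;
}

// Inside the fast lane, send immediately; afterwards, wait until the scheduled send time.
function nextSendDelayMs(bytesSent: number, unthrottledLimit: number, nextSendTime: number, now: number): number {
    return unthrottledLimit >= bytesSent ? 0 : Math.max(0, nextSendTime - now);
}

// After each chunk, the next send time advances by half the chunk's real-time duration,
// mirroring the `audioFormat.avgBytesPerSec * 2` divisor in sendAudio.
function advanceSendTime(now: number, chunkBytes: number, avgBytesPerSec: number): number {
    return now + (chunkBytes * 1000 / (avgBytesPerSec * 2));
}
```

With 16 kHz, 16-bit mono audio (32000 bytes/s) and the default 5000 ms fast lane, the budget is 160000 bytes before pacing kicks in.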
    // Establishes a websocket connection to the endpoint.
    private dialogConnectImpl(isUnAuthorized: boolean = false): Promise<IConnection> {
        if (this.privDialogConnectionPromise) {
            if (this.privDialogConnectionPromise.result().isCompleted &&
                (this.privDialogConnectionPromise.result().isError
                    || this.privDialogConnectionPromise.result().result.state() === ConnectionState.Disconnected)) {
                this.agentConfigSent = false;
                this.privDialogConnectionPromise = null;
            } else {
                return this.privDialogConnectionPromise;
            }
        }

        this.privDialogAuthFetchEventId = createNoDashGuid();

        // Keep the connectionId for reconnect events.
        if (this.privConnectionId === undefined) {
            this.privConnectionId = createNoDashGuid();
        }

        this.privDialogRequestSession.onPreConnectionStart(this.privDialogAuthFetchEventId, this.privConnectionId);

        const authPromise = isUnAuthorized ? this.privDialogAuthentication.fetchOnExpiry(this.privDialogAuthFetchEventId) : this.privDialogAuthentication.fetch(this.privDialogAuthFetchEventId);

        this.privDialogConnectionPromise = authPromise
            .continueWithPromise((result: PromiseResult<AuthInfo>) => {
                if (result.isError) {
                    this.privDialogRequestSession.onAuthCompleted(true, result.error);
                    throw new Error(result.error);
                } else {
                    this.privDialogRequestSession.onAuthCompleted(false);
                }

                const connection: IConnection = this.privDialogConnectionFactory.create(this.privRecognizerConfig, result.result, this.privConnectionId);

                this.privDialogRequestSession.listenForServiceTelemetry(connection.events);

                // Attach to the underlying event. No need to hold onto the detach pointers, as in the event the connection goes away,
                // it'll stop sending events.
                connection.events.attach((event: ConnectionEvent) => {
                    this.connectionEvents.onEvent(event);
                });

                return connection.open().onSuccessContinueWithPromise((response: ConnectionOpenResponse): Promise<IConnection> => {
                    if (response.statusCode === 200) {
                        this.privDialogRequestSession.onPreConnectionStart(this.privDialogAuthFetchEventId, this.privConnectionId);
                        this.privDialogRequestSession.onConnectionEstablishCompleted(response.statusCode);

                        return PromiseHelper.fromResult<IConnection>(connection);
                    } else if (response.statusCode === 403 && !isUnAuthorized) {
                        return this.dialogConnectImpl(true);
                    } else {
                        this.privDialogRequestSession.onConnectionEstablishCompleted(response.statusCode, response.reason);
                        return PromiseHelper.fromError<IConnection>(`Unable to contact server. StatusCode: ${response.statusCode}, ${this.privRecognizerConfig.parameters.getProperty(PropertyId.SpeechServiceConnection_Endpoint)} Reason: ${response.reason}`);
                    }
                });
            });

        this.privConnectionLoop = this.startMessageLoop();
        return this.privDialogConnectionPromise;
    }

    private receiveDialogMessageOverride = (
        successCallback?: (e: SpeechRecognitionResult) => void,
        errorCallBack?: (e: string) => void
    ): Promise<IConnection> => {

        // We won't rely on the cascading promises of the connection, since we want to be continually available to receive messages.
        const communicationCustodian: Deferred<IConnection> = new Deferred<IConnection>();

        this.fetchDialogConnection().on((connection: IConnection): Promise<IConnection> => {
            return connection.read()
                .onSuccessContinueWithPromise((message: ConnectionMessage): Promise<IConnection> => {
                    const isDisposed: boolean = this.isDisposed();
                    const terminateMessageLoop = (!this.isDisposed() && this.terminateMessageLoop);
                    if (isDisposed || terminateMessageLoop) {
                        // We're done.
                        communicationCustodian.resolve(undefined);
                        return PromiseHelper.fromResult<IConnection>(undefined);
                    }

                    if (!message) {
                        return this.receiveDialogMessageOverride();
                    }

                    const connectionMessage = SpeechConnectionMessage.fromConnectionMessage(message);

                    switch (connectionMessage.path.toLowerCase()) {
                        case "turn.start":
                            {
                                const turnRequestId = connectionMessage.requestId.toUpperCase();
                                const audioSessionReqId = this.privDialogRequestSession.requestId.toUpperCase();

                                // A turn started by the service.
                                if (turnRequestId !== audioSessionReqId) {
                                    this.privTurnStateManager.StartTurn(turnRequestId);
                                }
                            }
                            break;
                        case "speech.startdetected":
                            const speechStartDetected: SpeechDetected = SpeechDetected.fromJSON(connectionMessage.textBody);

                            const speechStartEventArgs = new RecognitionEventArgs(speechStartDetected.Offset, this.privDialogRequestSession.sessionId);

                            if (!!this.privRecognizer.speechStartDetected) {
                                this.privRecognizer.speechStartDetected(this.privRecognizer, speechStartEventArgs);
                            }

                            break;
                        case "speech.enddetected":

                            let json: string;

                            if (connectionMessage.textBody.length > 0) {
                                json = connectionMessage.textBody;
                            } else {
                                // If the request was empty, the JSON returned is empty.
                                json = "{ Offset: 0 }";
                            }

                            const speechStopDetected: SpeechDetected = SpeechDetected.fromJSON(json);

                            this.privDialogRequestSession.onServiceRecognized(speechStopDetected.Offset + this.privDialogRequestSession.currentTurnAudioOffset);

                            const speechStopEventArgs = new RecognitionEventArgs(speechStopDetected.Offset + this.privDialogRequestSession.currentTurnAudioOffset, this.privDialogRequestSession.sessionId);

                            if (!!this.privRecognizer.speechEndDetected) {
                                this.privRecognizer.speechEndDetected(this.privRecognizer, speechStopEventArgs);
                            }
                            break;

                        case "turn.end":
                            {
                                const turnEndRequestId = connectionMessage.requestId.toUpperCase();

                                const audioSessionReqId = this.privDialogRequestSession.requestId.toUpperCase();

                                // The end of a turn that was started by the service.
                                if (turnEndRequestId !== audioSessionReqId) {
                                    this.privTurnStateManager.CompleteTurn(turnEndRequestId);
                                } else {
                                    // Audio session turn.

                                    const sessionStopEventArgs: SessionEventArgs = new SessionEventArgs(this.privDialogRequestSession.sessionId);
                                    this.privDialogRequestSession.onServiceTurnEndResponse(false);

                                    if (this.privDialogRequestSession.isSpeechEnded) {
                                        if (!!this.privRecognizer.sessionStopped) {
                                            this.privRecognizer.sessionStopped(this.privRecognizer, sessionStopEventArgs);
                                        }
                                    }
                                }
                            }
                            break;

                        default:
                            this.processTypeSpecificMessages(
                                connectionMessage,
                                successCallback,
                                errorCallBack);
                    }

                    return this.receiveDialogMessageOverride();
                });
        }, (error: string) => {
            this.terminateMessageLoop = true;
        });

        return communicationCustodian.promise();
    }

    private startMessageLoop(): Promise<IConnection> {

        this.terminateMessageLoop = false;

        const messageRetrievalPromise = this.receiveDialogMessageOverride();

        return messageRetrievalPromise.on((r: IConnection) => {
            return true;
        }, (error: string) => {
            this.cancelRecognition(this.privDialogRequestSession.sessionId, this.privDialogRequestSession.requestId, CancellationReason.Error, CancellationErrorCode.RuntimeError, error, this.privSuccessCallback);
        });
    }

    // Takes an established websocket connection to the endpoint and sends speech configuration information.
    private configConnection(): Promise<IConnection> {
        if (this.privConnectionConfigPromise) {
            if (this.privConnectionConfigPromise.result().isCompleted &&
                (this.privConnectionConfigPromise.result().isError
                    || this.privConnectionConfigPromise.result().result.state() === ConnectionState.Disconnected)) {

                this.privConnectionConfigPromise = null;
                return this.configConnection();
            } else {
                return this.privConnectionConfigPromise;
            }
        }

        this.privConnectionConfigPromise = this.dialogConnectImpl().onSuccessContinueWithPromise((connection: IConnection): Promise<IConnection> => {
            return this.sendSpeechServiceConfig(connection, this.privDialogRequestSession, this.privRecognizerConfig.SpeechServiceConfig.serialize())
                .onSuccessContinueWithPromise((_: boolean) => {
                    return this.sendAgentConfig(connection).onSuccessContinueWith((_: boolean) => {
                        return connection;
                    });
                });
        });

        return this.privConnectionConfigPromise;
    }

    private fetchDialogConnection = (): Promise<IConnection> => {
        return this.configConnection();
    }

    private sendPreAudioMessages(): void {
        this.fetchDialogConnection().onSuccessContinueWith((connection: IConnection): void => {
            this.sendAgentContext(connection);
        });
    }

    private sendAgentConfig = (connection: IConnection): Promise<boolean> => {
        if (this.agentConfig && !this.agentConfigSent) {
            const agentConfigJson = this.agentConfig.toJsonString();

            this.agentConfigSent = true;

            return connection.send(new SpeechConnectionMessage(
                MessageType.Text,
                "agent.config",
                this.privDialogRequestSession.requestId,
                "application/json",
                agentConfigJson));
        }

        return PromiseHelper.fromResult(true);
    }

    private sendAgentContext = (connection: IConnection): Promise<boolean> => {
        const guid: string = createGuid();

        const agentContext: any = {
            channelData: "",
            context: {
                interactionId: guid
            },
            version: 0.5
        };

        const agentContextJson = JSON.stringify(agentContext);

        return connection.send(new SpeechConnectionMessage(
            MessageType.Text,
            "speech.agent.context",
            this.privDialogRequestSession.requestId,
            "application/json",
            agentContextJson));
    }

    private fireEventForResult(serviceResult: SimpleSpeechPhrase, properties: PropertyCollection): SpeechRecognitionEventArgs {
        const resultReason: ResultReason = EnumTranslation.implTranslateRecognitionResult(serviceResult.RecognitionStatus);

        const offset: number = serviceResult.Offset + this.privDialogRequestSession.currentTurnAudioOffset;

        const result = new SpeechRecognitionResult(
            this.privDialogRequestSession.requestId,
            resultReason,
            serviceResult.DisplayText,
            serviceResult.Duration,
            offset,
            undefined,
            JSON.stringify(serviceResult),
            properties);

        const ev = new SpeechRecognitionEventArgs(result, offset, this.privDialogRequestSession.sessionId);
        return ev;
    }
}

@@ -0,0 +1,68 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { AudioOutputStream, PullAudioOutputStreamImpl } from "../sdk/Audio/AudioOutputStream";
import { DialogServiceTurnStateManager } from "./DialogServiceTurnStateManager";
import { ActivityPayloadResponse, MessageDataStreamType } from "./ServiceMessages/ActivityResponsePayload";

export class DialogServiceTurnState {
    private privRequestId: string;
    private privIsCompleted: boolean;
    private privAudioStream: PullAudioOutputStreamImpl;
    private privTimeoutToken: any;
    private privTurnManager: DialogServiceTurnStateManager;

    constructor(manager: DialogServiceTurnStateManager, requestId: string) {
        this.privRequestId = requestId;
        this.privIsCompleted = false;
        this.privAudioStream = null;
        this.privTurnManager = manager;
        this.resetTurnEndTimeout();
        // tslint:disable-next-line:no-console
        // console.info("DialogServiceTurnState debugturn start:" + this.privRequestId);
    }

    public get audioStream(): PullAudioOutputStreamImpl {
        // Called when the audio stream is needed.
        this.resetTurnEndTimeout();
        return this.privAudioStream;
    }

    public processActivityPayload(payload: ActivityPayloadResponse): PullAudioOutputStreamImpl {
        if (payload.messageDataStreamType === MessageDataStreamType.TextToSpeechAudio) {
            this.privAudioStream = AudioOutputStream.createPullStream() as PullAudioOutputStreamImpl;
            // tslint:disable-next-line:no-console
            // console.info("Audio start debugturn:" + this.privRequestId);
        }
        return this.privAudioStream;
    }

    public endAudioStream(): void {
        if (this.privAudioStream !== null && !this.privAudioStream.isClosed) {
            this.privAudioStream.close();
        }
    }

    public complete(): void {
        if (this.privTimeoutToken !== undefined) {
            clearTimeout(this.privTimeoutToken);
        }
        this.endAudioStream();
    }

    private resetTurnEndTimeout(): void {
        if (this.privTimeoutToken !== undefined) {
            clearTimeout(this.privTimeoutToken);
        }
        // tslint:disable-next-line:no-console
        // console.info("Timeout reset debugturn:" + this.privRequestId);

        this.privTimeoutToken = setTimeout((): void => {
            // tslint:disable-next-line:no-console
            // console.info("Timeout complete debugturn:" + this.privRequestId);

            this.privTurnManager.CompleteTurn(this.privRequestId);
            return;
        }, 2000);
    }
}

@@ -0,0 +1,39 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { InvalidOperationError } from "../common/Error";
import { AudioOutputStream, PullAudioOutputStreamImpl } from "../sdk/Audio/AudioOutputStream";
import { DialogServiceTurnState } from "./DialogServiceTurnState";
import { ActivityPayloadResponse } from "./ServiceMessages/ActivityResponsePayload";

export class DialogServiceTurnStateManager {
    private privTurnMap: Map<string, DialogServiceTurnState>;

    constructor() {
        this.privTurnMap = new Map<string, DialogServiceTurnState>();
        return;
    }

    public StartTurn(id: string): DialogServiceTurnState {
        if (this.privTurnMap.has(id)) {
            throw new InvalidOperationError("Service error: There is already a turn with id:" + id);
        }
        const turnState: DialogServiceTurnState = new DialogServiceTurnState(this, id);
        this.privTurnMap.set(id, turnState);
        return this.privTurnMap.get(id);
    }

    public GetTurn(id: string): DialogServiceTurnState {
        return this.privTurnMap.get(id);
    }

    public CompleteTurn(id: string): DialogServiceTurnState {
        if (!this.privTurnMap.has(id)) {
            throw new InvalidOperationError("Service error: Received turn end for an unknown turn id:" + id);
        }
        const turnState = this.privTurnMap.get(id);
        turnState.complete();
        this.privTurnMap.delete(id);
        return turnState;
    }
}

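The turn-map invariants enforced by DialogServiceTurnStateManager (a turn id may be started only once, and must exist when completed) can be exercised in isolation. This reduced sketch omits the timers and audio streams of the real class; the `TurnTracker` name is illustrative, not part of the SDK:

```typescript
// Reduced model of the manager's bookkeeping: duplicate starts and unknown
// completions are service-protocol errors.
class TurnTracker {
    private turns: Set<string> = new Set<string>();

    public startTurn(id: string): void {
        if (this.turns.has(id)) {
            throw new Error("There is already a turn with id:" + id);
        }
        this.turns.add(id);
    }

    public completeTurn(id: string): void {
        if (!this.turns.has(id)) {
            throw new Error("Received turn end for an unknown turn id:" + id);
        }
        this.turns.delete(id);
    }
}
```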
@@ -33,6 +33,8 @@ export * from "./RequestSession";
export * from "./SpeechContext";
export * from "./DynamicGrammarBuilder";
export * from "./DynamicGrammarInterfaces";
export * from "./DialogServiceAdapter";
export * from "./AgentConfig";

export const OutputFormatPropertyName: string = "OutputFormat";
export const CancellationErrorCodePropertyName: string = "CancellationErrorCode";

@@ -103,6 +103,7 @@ export class RequestSession {

    public onAudioSourceAttachCompleted = (audioNode: ReplayableAudioNode, isError: boolean, error?: string): void => {
        this.privAudioNode = audioNode;
        this.privIsAudioNodeDetached = false;

        if (isError) {
            this.onComplete();

@@ -0,0 +1,43 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.
// response

export interface IActivityPayloadResponse {
    conversationId: string;
    messageDataStreamType: number;
    messagePayload: any;
    version: number;
}

export class ActivityPayloadResponse implements IActivityPayloadResponse {
    private privActivityResponse: IActivityPayloadResponse;

    private constructor(json: string) {
        this.privActivityResponse = JSON.parse(json);
    }

    public static fromJSON(json: string): ActivityPayloadResponse {
        return new ActivityPayloadResponse(json);
    }

    public get conversationId(): string {
        return this.privActivityResponse.conversationId;
    }

    public get messageDataStreamType(): number {
        return this.privActivityResponse.messageDataStreamType;
    }

    public get messagePayload(): any {
        return this.privActivityResponse.messagePayload;
    }

    public get version(): number {
        return this.privActivityResponse.version;
    }
}

export enum MessageDataStreamType {
    None = 0,
    TextToSpeechAudio = 1,
}

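An activity message body deserializes into the shape defined by IActivityPayloadResponse above, and a consumer opens a TTS audio stream only when `messageDataStreamType` is `TextToSpeechAudio`. A small sketch of that flow; the sample JSON is invented for illustration and `parseActivityPayload` is a stand-in for `ActivityPayloadResponse.fromJSON`:

```typescript
enum MessageDataStreamType {
    None = 0,
    TextToSpeechAudio = 1,
}

interface IActivityPayloadResponse {
    conversationId: string;
    messageDataStreamType: number;
    messagePayload: any;
    version: number;
}

// The payload arrives as plain JSON text on the connection.
function parseActivityPayload(json: string): IActivityPayloadResponse {
    return JSON.parse(json) as IActivityPayloadResponse;
}

const sample = '{"conversationId":"conv-1","messageDataStreamType":1,"messagePayload":{"type":"message"},"version":0.5}';
const payload = parseActivityPayload(sample);

// Decide whether this activity carries a TTS audio stream.
const expectsTtsAudio = payload.messageDataStreamType === MessageDataStreamType.TextToSpeechAudio;
```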
@@ -9,6 +9,7 @@ import {
    ConnectionMessage,
    ConnectionOpenResponse,
    ConnectionState,
    createGuid,
    createNoDashGuid,
    Deferred,
    EventSource,

@@ -33,6 +34,7 @@ import {
    SpeechRecognitionResult,
} from "../sdk/Exports";
import {
    AgentConfig,
    DynamicGrammarBuilder,
    ISpeechConfigAudio,
    ISpeechConfigAudioDevice,

@@ -56,22 +58,23 @@ export abstract class ServiceRecognizerBase implements IDisposable {
    private privSpeechServiceConfigConnectionId: string;

    // A promise for a configured connection.
    // Do not consume directly, call fethConnection instead.
    // Do not consume directly, call fetchConnection instead.
    private privConnectionConfigurationPromise: Promise<IConnection>;

    // A promise for a connection, but one that has not had the speech context sent yet.
    // Do no consume directly, call fetchConnection insted.
    // Do not consume directly, call fetchConnection instead.
    private privConnectionPromise: Promise<IConnection>;
    private privConnectionId: string;
    private privAuthFetchEventId: string;
    private privIsDisposed: boolean;
    private privRecognizer: Recognizer;
    private privMustReportEndOfStream: boolean;
    private privConnectionEvents: EventSource<ConnectionEvent>;
    private privSpeechContext: SpeechContext;
    private privDynamicGrammar: DynamicGrammarBuilder;
    private privAgentConfig: AgentConfig;
    protected privRequestSession: RequestSession;
    protected privConnectionId: string;
    protected privRecognizerConfig: RecognizerConfig;
    protected privRecognizer: Recognizer;

    public constructor(
        authentication: IAuthentication,

@@ -107,6 +110,7 @@ export abstract class ServiceRecognizerBase implements IDisposable {
        this.privConnectionEvents = new EventSource<ConnectionEvent>();
        this.privDynamicGrammar = new DynamicGrammarBuilder();
        this.privSpeechContext = new SpeechContext(this.privDynamicGrammar);
        this.privAgentConfig = new AgentConfig();
    }

    public get audioSource(): IAudioSource {

@@ -121,6 +125,10 @@ export abstract class ServiceRecognizerBase implements IDisposable {
        return this.privDynamicGrammar;
    }

    public get agentConfig(): AgentConfig {
        return this.privAgentConfig;
    }

    public isDisposed(): boolean {
        return this.privIsDisposed;
    }

@@ -142,12 +150,18 @@ export abstract class ServiceRecognizerBase implements IDisposable {
        return this.privRecognizerConfig.recognitionMode;
    }

    protected recognizeOverride: (recoMode: RecognitionMode, sc: (e: SpeechRecognitionResult) => void, ec: (e: string) => void) => any = undefined;

    public recognize(
        recoMode: RecognitionMode,
        successCallback: (e: SpeechRecognitionResult) => void,
        errorCallBack: (e: string) => void,
    ): Promise<boolean> {

        if (this.recognizeOverride !== undefined) {
            return this.recognizeOverride(recoMode, successCallback, errorCallBack);
        }

        // Clear the existing configuration promise to force a re-transmission of config and context.
        this.privConnectionConfigurationPromise = null;
        this.privRecognizerConfig.recognitionMode = recoMode;

@@ -225,7 +239,14 @@ export abstract class ServiceRecognizerBase implements IDisposable {
        this.connectImpl().result();
    }

    protected disconnectOverride: () => any = undefined;

    public disconnect(): void {
        if (this.disconnectOverride !== undefined) {
            this.disconnectOverride();
            return;
        }

        this.cancelRecognitionLocal(CancellationReason.Error,
            CancellationErrorCode.NoError,
            "Disconnecting",

@@ -248,6 +269,8 @@ export abstract class ServiceRecognizerBase implements IDisposable {
    public static telemetryData: (json: string) => void;
    public static telemetryDataEnabled: boolean = true;

    public sendMessage(message: string): void {}

    protected abstract processTypeSpecificMessages(
        connectionMessage: SpeechConnectionMessage,
        successCallback?: (e: SpeechRecognitionResult) => void,

@@ -309,101 +332,18 @@ export abstract class ServiceRecognizerBase implements IDisposable {
        }
    }

    private fetchConnection = (): Promise<IConnection> => {
        return this.configureConnection();
    }
    protected receiveMessageOverride: (sc?: (e: SpeechRecognitionResult) => void, ec?: (e: string) => void) => any = undefined;

    // Establishes a websocket connection to the end point.
    private connectImpl(isUnAuthorized: boolean = false): Promise<IConnection> {
        if (this.privConnectionPromise) {
            if (this.privConnectionPromise.result().isCompleted &&
                (this.privConnectionPromise.result().isError
                    || this.privConnectionPromise.result().result.state() === ConnectionState.Disconnected)) {
                this.privConnectionId = null;
                this.privConnectionPromise = null;
                return this.connectImpl();
            } else {
                return this.privConnectionPromise;
            }
        }

        this.privAuthFetchEventId = createNoDashGuid();
        this.privConnectionId = createNoDashGuid();

        this.privRequestSession.onPreConnectionStart(this.privAuthFetchEventId, this.privConnectionId);

        const authPromise = isUnAuthorized ? this.privAuthentication.fetchOnExpiry(this.privAuthFetchEventId) : this.privAuthentication.fetch(this.privAuthFetchEventId);

        this.privConnectionPromise = authPromise
            .continueWithPromise((result: PromiseResult<AuthInfo>) => {
                if (result.isError) {
                    this.privRequestSession.onAuthCompleted(true, result.error);
                    throw new Error(result.error);
                } else {
                    this.privRequestSession.onAuthCompleted(false);
                }

                const connection: IConnection = this.privConnectionFactory.create(this.privRecognizerConfig, result.result, this.privConnectionId);

                this.privRequestSession.listenForServiceTelemetry(connection.events);

                // Attach to the underlying event. No need to hold onto the detach pointers as in the event the connection goes away,
                // it'll stop sending events.
                connection.events.attach((event: ConnectionEvent) => {
                    this.connectionEvents.onEvent(event);
                });

                return connection.open().onSuccessContinueWithPromise((response: ConnectionOpenResponse): Promise<IConnection> => {
                    if (response.statusCode === 200) {
                        this.privRequestSession.onPreConnectionStart(this.privAuthFetchEventId, this.privConnectionId);
                        this.privRequestSession.onConnectionEstablishCompleted(response.statusCode);

                        return PromiseHelper.fromResult<IConnection>(connection);
                    } else if (response.statusCode === 403 && !isUnAuthorized) {
                        return this.connectImpl(true);
                    } else {
                        this.privRequestSession.onConnectionEstablishCompleted(response.statusCode, response.reason);
                        return PromiseHelper.fromError<IConnection>(`Unable to contact server. StatusCode: ${response.statusCode}, ${this.privRecognizerConfig.parameters.getProperty(PropertyId.SpeechServiceConnection_Endpoint)} Reason: ${response.reason}`);
                    }
                });
            });

        return this.privConnectionPromise;
    }

    // Takes an established websocket connection to the endpoint and sends speech configuration information.
    private configureConnection(): Promise<IConnection> {
        if (this.privConnectionConfigurationPromise) {
            if (this.privConnectionConfigurationPromise.result().isCompleted &&
                (this.privConnectionConfigurationPromise.result().isError
                    || this.privConnectionConfigurationPromise.result().result.state() === ConnectionState.Disconnected)) {

                this.privConnectionConfigurationPromise = null;
                return this.configureConnection();
            } else {
                return this.privConnectionConfigurationPromise;
            }
        }

        this.privConnectionConfigurationPromise = this.connectImpl().onSuccessContinueWithPromise((connection: IConnection): Promise<IConnection> => {
            return this.sendSpeechServiceConfig(connection, this.privRequestSession, this.privRecognizerConfig.SpeechServiceConfig.serialize())
                .onSuccessContinueWithPromise((_: boolean) => {
                    return this.sendSpeechContext(connection).onSuccessContinueWith((_: boolean) => {
                        return connection;
                    });
                });
        });

        return this.privConnectionConfigurationPromise;
    }

    private receiveMessage = (
    protected receiveMessage = (
        successCallback: (e: SpeechRecognitionResult) => void,
        errorCallBack: (e: string) => void,
    ): Promise<IConnection> => {
        return this.fetchConnection().on((connection: IConnection): Promise<IConnection> => {
            return connection.read()
                .onSuccessContinueWithPromise((message: ConnectionMessage) => {
                    if (this.receiveMessageOverride !== undefined) {
                        return this.receiveMessageOverride();
                    }
                    if (this.privIsDisposed || !this.privRequestSession.isRecognizing) {
                        // We're done.
                        return PromiseHelper.fromResult(undefined);

@ -496,7 +436,90 @@ export abstract class ServiceRecognizerBase implements IDisposable {
|
|||
});
|
||||
}
|
||||
|
||||
private sendSpeechServiceConfig = (connection: IConnection, requestSession: RequestSession, SpeechServiceConfigJson: string): Promise<boolean> => {
|
||||
protected sendSpeechContext = (connection: IConnection): Promise<boolean> => {
|
||||
const speechContextJson = this.speechContext.toJSON();
|
||||
|
||||
if (speechContextJson) {
|
||||
                return connection.send(new SpeechConnectionMessage(
                    MessageType.Text,
                    "speech.context",
                    this.privRequestSession.requestId,
                    "application/json",
                    speechContextJson));
            }
            return PromiseHelper.fromResult(true);
        }

        protected connectImplOverride: (isUnAuthorized: boolean) => any = undefined;

        // Establishes a websocket connection to the end point.
        protected connectImpl(isUnAuthorized: boolean = false): Promise<IConnection> {
            if (this.connectImplOverride !== undefined) {
                return this.connectImplOverride(isUnAuthorized);
            }

            if (this.privConnectionPromise) {
                if (this.privConnectionPromise.result().isCompleted &&
                    (this.privConnectionPromise.result().isError
                        || this.privConnectionPromise.result().result.state() === ConnectionState.Disconnected)) {
                    this.privConnectionId = null;
                    this.privConnectionPromise = null;
                    return this.connectImpl();
                } else {
                    return this.privConnectionPromise;
                }
            }

            this.privAuthFetchEventId = createNoDashGuid();
            this.privConnectionId = createNoDashGuid();

            this.privRequestSession.onPreConnectionStart(this.privAuthFetchEventId, this.privConnectionId);

            const authPromise = isUnAuthorized ? this.privAuthentication.fetchOnExpiry(this.privAuthFetchEventId) : this.privAuthentication.fetch(this.privAuthFetchEventId);

            this.privConnectionPromise = authPromise
                .continueWithPromise((result: PromiseResult<AuthInfo>) => {
                    if (result.isError) {
                        this.privRequestSession.onAuthCompleted(true, result.error);
                        throw new Error(result.error);
                    } else {
                        this.privRequestSession.onAuthCompleted(false);
                    }

                    const connection: IConnection = this.privConnectionFactory.create(this.privRecognizerConfig, result.result, this.privConnectionId);

                    this.privRequestSession.listenForServiceTelemetry(connection.events);

                    // Attach to the underlying event. No need to hold onto the detach pointers as in the event the connection goes away,
                    // it'll stop sending events.
                    connection.events.attach((event: ConnectionEvent) => {
                        this.connectionEvents.onEvent(event);
                    });

                    return connection.open().onSuccessContinueWithPromise((response: ConnectionOpenResponse): Promise<IConnection> => {
                        if (response.statusCode === 200) {
                            this.privRequestSession.onPreConnectionStart(this.privAuthFetchEventId, this.privConnectionId);
                            this.privRequestSession.onConnectionEstablishCompleted(response.statusCode);

                            return PromiseHelper.fromResult<IConnection>(connection);
                        } else if (response.statusCode === 403 && !isUnAuthorized) {
                            return this.connectImpl(true);
                        } else {
                            this.privRequestSession.onConnectionEstablishCompleted(response.statusCode, response.reason);
                            return PromiseHelper.fromError<IConnection>(`Unable to contact server. StatusCode: ${response.statusCode}, ${this.privRecognizerConfig.parameters.getProperty(PropertyId.SpeechServiceConnection_Endpoint)} Reason: ${response.reason}`);
                        }
                    });
                });

            return this.privConnectionPromise;
        }

        protected configConnectionOverride: () => any = undefined;

        protected fetchConnectionOverride: () => any = undefined;

        protected sendSpeechServiceConfig = (connection: IConnection, requestSession: RequestSession, SpeechServiceConfigJson: string): Promise<boolean> => {
            // filter out anything that is not required for the service to work.
            if (ServiceRecognizerBase.telemetryDataEnabled !== true) {
                const withTelemetry = JSON.parse(SpeechServiceConfigJson);
@@ -523,37 +546,7 @@ export abstract class ServiceRecognizerBase implements IDisposable {
            return PromiseHelper.fromResult(true);
        }

        private sendSpeechContext = (connection: IConnection): Promise<boolean> => {
            const speechContextJson = this.speechContext.toJSON();

            if (speechContextJson) {
                return connection.send(new SpeechConnectionMessage(
                    MessageType.Text,
                    "speech.context",
                    this.privRequestSession.requestId,
                    "application/json",
                    speechContextJson));
            }
            return PromiseHelper.fromResult(true);
        }

        private sendFinalAudio(): Promise<boolean> {
            const deferred = new Deferred<boolean>();

            this.fetchConnection().on((connection: IConnection) => {
                connection.send(new SpeechConnectionMessage(MessageType.Binary, "audio", this.privRequestSession.requestId, null, null)).on((_: boolean) => {
                    deferred.resolve(true);
                }, (error: string) => {
                    deferred.reject(error);
                });
            }, (error: string) => {
                deferred.reject(error);
            });

            return deferred.promise();
        }

-       private sendAudio = (
+       protected sendAudio = (
            audioStreamNode: IAudioStreamNode): Promise<boolean> => {
            // NOTE: Home-baked promises crash ios safari during the invocation
            // of the error callback chain (looks like the recursion is way too deep, and
@@ -654,4 +647,58 @@ export abstract class ServiceRecognizerBase implements IDisposable {

            return deferred.promise();
        }

        private sendFinalAudio(): Promise<boolean> {
            const deferred = new Deferred<boolean>();

            this.fetchConnection().on((connection: IConnection) => {
                connection.send(new SpeechConnectionMessage(MessageType.Binary, "audio", this.privRequestSession.requestId, null, null)).on((_: boolean) => {
                    deferred.resolve(true);
                }, (error: string) => {
                    deferred.reject(error);
                });
            }, (error: string) => {
                deferred.reject(error);
            });

            return deferred.promise();
        }

        private fetchConnection = (): Promise<IConnection> => {
            if (this.fetchConnectionOverride !== undefined) {
                return this.fetchConnectionOverride();
            }

            return this.configureConnection();
        }

        // Takes an established websocket connection to the endpoint and sends speech configuration information.
        private configureConnection(): Promise<IConnection> {
            if (this.configConnectionOverride !== undefined) {
                return this.configConnectionOverride();
            }

            if (this.privConnectionConfigurationPromise) {
                if (this.privConnectionConfigurationPromise.result().isCompleted &&
                    (this.privConnectionConfigurationPromise.result().isError
                        || this.privConnectionConfigurationPromise.result().result.state() === ConnectionState.Disconnected)) {

                    this.privConnectionConfigurationPromise = null;
                    return this.configureConnection();
                } else {
                    return this.privConnectionConfigurationPromise;
                }
            }

            this.privConnectionConfigurationPromise = this.connectImpl().onSuccessContinueWithPromise((connection: IConnection): Promise<IConnection> => {
                return this.sendSpeechServiceConfig(connection, this.privRequestSession, this.privRecognizerConfig.SpeechServiceConfig.serialize())
                    .onSuccessContinueWithPromise((_: boolean) => {
                        return this.sendSpeechContext(connection).onSuccessContinueWith((_: boolean) => {
                            return connection;
                        });
                    });
            });

            return this.privConnectionConfigurationPromise;
        }
    }

@@ -113,7 +113,7 @@ export class StreamReader<TBuffer> {
        return this.privReaderQueue
            .dequeue()
            .onSuccessContinueWith((streamChunk: IStreamChunk<TBuffer>) => {
-               if (streamChunk.isEnd) {
+               if (streamChunk === undefined || streamChunk.isEnd) {
                    this.privReaderQueue.dispose("End of stream reached");
                }
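The one-line change above guards against a dequeue that completes with no chunk at all (for example, when the queue is drained during shutdown), treating it the same as an explicit end marker. The guard in isolation (illustrative stand-in types, not SDK code):

```typescript
// Treat both "no chunk" and an explicit end marker as end-of-stream,
// as the StreamReader change above does. IStreamChunk stand-in:
interface StreamChunk<T> {
    buffer: T;
    isEnd: boolean;
}

function isEndOfStream<T>(chunk: StreamChunk<T> | undefined): boolean {
    return chunk === undefined || chunk.isEnd;
}
```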
@@ -0,0 +1,38 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { PullAudioOutputStream } from "./Audio/AudioOutputStream";

/**
 * Defines contents of received message/events.
 * @class ActivityReceivedEventArgs
 */
export class ActivityReceivedEventArgs {
    private privActivity: any;
    private privAudioStream: PullAudioOutputStream;

    /**
     * Creates and initializes an instance of this class.
     * @constructor
     * @param {any} activity - The activity.
     * @param {PullAudioOutputStream} audioStream - An optional audio stream associated with the activity.
     */
    public constructor(activity: any, audioStream?: PullAudioOutputStream) {
        this.privActivity = activity;
        this.privAudioStream = audioStream;
    }

    /**
     * Gets the received activity.
     * @member ActivityReceivedEventArgs.prototype.activity
     * @function
     * @public
     * @returns {any} the received activity.
     */
    public get activity(): any {
        return this.privActivity;
    }

    /**
     * Gets the audio stream associated with the activity, if any.
     * @member ActivityReceivedEventArgs.prototype.audioStream
     * @function
     * @public
     * @returns {PullAudioOutputStream} the associated audio stream.
     */
    public get audioStream(): PullAudioOutputStream {
        return this.privAudioStream;
    }
}

@@ -0,0 +1,186 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { createNoDashGuid } from "../../../src/common/Guid";
import {
    ChunkedArrayBufferStream,
    IStreamChunk,
    Promise,
    PromiseHelper,
    Stream,
    StreamReader,
} from "../../common/Exports";
import { AudioStreamFormat } from "../Exports";
import { AudioStreamFormatImpl } from "./AudioStreamFormat";

export const bufferSize: number = 4096;

/**
 * Represents an audio output stream used for custom audio output configurations.
 * @class AudioOutputStream
 */
export abstract class AudioOutputStream {

    /**
     * Creates and initializes an instance.
     * @constructor
     */
    protected constructor() { }

    /**
     * Creates a memory backed PullAudioOutputStream with the specified audio format.
     * @member AudioOutputStream.createPullStream
     * @function
     * @public
     * @param {AudioStreamFormat} format - The format of the audio data the stream will carry
     *        (currently only 16 kHz 16-bit mono PCM is supported).
     * @returns {PullAudioOutputStream} The audio output stream being created.
     */
    public static createPullStream(format?: AudioStreamFormat): PullAudioOutputStream {
        return PullAudioOutputStream.create(format);
    }

    /**
     * Explicitly frees any external resource attached to the object.
     * @member AudioOutputStream.prototype.close
     * @function
     * @public
     */
    public abstract close(): void;
}

/**
 * Represents a memory backed pull audio output stream used for custom audio output configurations.
 * @class PullAudioOutputStream
 */
// tslint:disable-next-line:max-classes-per-file
export abstract class PullAudioOutputStream extends AudioOutputStream {

    /**
     * Creates a memory backed PullAudioOutputStream with the specified audio format.
     * @member PullAudioOutputStream.create
     * @function
     * @public
     * @param {AudioStreamFormat} format - The format of the audio data the stream will carry
     *        (currently only 16 kHz 16-bit mono PCM is supported).
     * @returns {PullAudioOutputStream} The pull audio output stream being created.
     */
    public static create(format?: AudioStreamFormat): PullAudioOutputStream {
        return new PullAudioOutputStreamImpl(bufferSize, format);
    }

    /**
     * Reads audio data from the internal buffer.
     * @member PullAudioOutputStream.prototype.read
     * @function
     * @public
     * @returns {Promise<ArrayBuffer>} Audio buffer data.
     */
    public abstract read(): Promise<ArrayBuffer>;

    /**
     * Closes the stream.
     * @member PullAudioOutputStream.prototype.close
     * @function
     * @public
     */
    public abstract close(): void;
}

/**
 * Represents a memory backed pull audio output stream used for custom audio output configurations.
 * @private
 * @class PullAudioOutputStreamImpl
 */
// tslint:disable-next-line:max-classes-per-file
export class PullAudioOutputStreamImpl extends PullAudioOutputStream {

    private privFormat: AudioStreamFormatImpl;
    private privId: string;
    private privStream: Stream<ArrayBuffer>;
    private streamReader: StreamReader<ArrayBuffer>;

    /**
     * Creates and initializes an instance with the given values.
     * @constructor
     * @param {number} chunkSize - The size, in bytes, of the chunks the stream is split into.
     * @param {AudioStreamFormat} format - The audio stream format.
     */
    public constructor(chunkSize: number, format?: AudioStreamFormat) {
        super();
        if (format === undefined) {
            this.privFormat = AudioStreamFormatImpl.getDefaultInputFormat();
        } else {
            this.privFormat = format as AudioStreamFormatImpl;
        }

        this.privId = createNoDashGuid();
        this.privStream = new ChunkedArrayBufferStream(chunkSize);
        this.streamReader = this.privStream.getReader();
    }

    /**
     * Format information for the audio.
     */
    public get format(): AudioStreamFormat {
        return this.privFormat;
    }

    /**
     * Checks if the stream is closed.
     * @member PullAudioOutputStreamImpl.prototype.isClosed
     * @property
     * @public
     */
    public get isClosed(): boolean {
        return this.privStream.isClosed;
    }

    /**
     * Gets the id of the stream.
     * @member PullAudioOutputStreamImpl.prototype.id
     * @property
     * @public
     */
    public get id(): string {
        return this.privId;
    }

    /**
     * Reads data from the buffer.
     * @member PullAudioOutputStreamImpl.prototype.read
     * @function
     * @public
     * @returns {Promise<ArrayBuffer>} The next chunk of audio data.
     */
    public read(): Promise<ArrayBuffer> {
        return this.streamReader.read()
            .onSuccessContinueWithPromise<ArrayBuffer>((chunk: IStreamChunk<ArrayBuffer>) => {
                return PromiseHelper.fromResult(chunk.buffer);
            });
    }

    /**
     * Writes the audio data specified by making an internal copy of the data.
     * @member PullAudioOutputStreamImpl.prototype.write
     * @function
     * @public
     * @param {ArrayBuffer} dataBuffer - The audio buffer of which this function will make a copy.
     */
    public write(dataBuffer: ArrayBuffer): void {
        this.privStream.writeStreamChunk({
            buffer: dataBuffer,
            isEnd: false,
            timeReceived: Date.now()
        });
    }

    /**
     * Closes the stream.
     * @member PullAudioOutputStreamImpl.prototype.close
     * @function
     * @public
     */
    public close(): void {
        this.privStream.close();
    }
}
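The pull stream above buffers incoming write() calls into fixed-size chunks that a consumer later drains with read(). That producer/consumer chunking, reduced to a self-contained sketch (independent of the SDK's ChunkedArrayBufferStream; all names here are illustrative):

```typescript
// Illustrative chunked FIFO: write() appends bytes, read() drains fixed-size
// chunks, mirroring how PullAudioOutputStreamImpl buffers audio. Not SDK code.
class ChunkedFifo {
    private pending: number[] = [];

    constructor(private chunkSize: number) { }

    public write(data: Uint8Array): void {
        for (let i = 0; i < data.length; i++) {
            this.pending.push(data[i]);
        }
    }

    // Returns a full chunk, a short final chunk, or null when empty.
    public read(): Uint8Array | null {
        if (this.pending.length === 0) {
            return null;
        }
        const take = Math.min(this.chunkSize, this.pending.length);
        return Uint8Array.from(this.pending.splice(0, take));
    }
}
```

The SDK variant additionally timestamps each chunk and carries an `isEnd` flag, so a null final message from the service can be translated into closing the stream.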

@@ -0,0 +1,137 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { InvalidOperationError } from "../../common/Error";
import { AudioStreamFormat, PullAudioInputStreamCallback } from "../Exports";
import { AudioStreamFormatImpl } from "./AudioStreamFormat";

type AudioDataTypedArray = Int8Array | Uint8Array | Int16Array | Uint16Array | Int32Array | Uint32Array;

/**
 * Base audio player class.
 * TODO: Plays only PCM for now.
 * @class
 */
export class BaseAudioPlayer {

    private audioContext: AudioContext = null;
    private gainNode: GainNode = null;
    private audioFormat: AudioStreamFormatImpl;
    private autoUpdateBufferTimer: any = 0;
    private samples: Float32Array;
    private startTime: number;

    /**
     * Creates and initializes an instance of this class.
     * @constructor
     * @param {AudioStreamFormat} audioFormat - The format of the audio to play.
     */
    public constructor(audioFormat: AudioStreamFormat) {
        this.init(audioFormat);
    }

    /**
     * Plays an audio sample.
     * @param newAudioData The audio data to be played.
     */
    public playAudioSample(newAudioData: ArrayBuffer): void {
        this.ensureInitializedContext();
        const audioData = this.formatAudioData(newAudioData);
        const newSamplesData = new Float32Array(this.samples.length + audioData.length);
        newSamplesData.set(this.samples, 0);
        newSamplesData.set(audioData, this.samples.length);
        this.samples = newSamplesData;
    }

    /**
     * Stops audio playback and clears the buffers.
     */
    public stopAudio(): void {
        if (this.audioContext !== null) {
            this.samples = new Float32Array();
            clearInterval(this.autoUpdateBufferTimer);
            this.audioContext.close();
            this.audioContext = null;
        }
    }

    private init(audioFormat: AudioStreamFormat): void {
        this.audioFormat = audioFormat as AudioStreamFormatImpl;
        this.samples = new Float32Array();
    }

    private ensureInitializedContext(): void {
        if (this.audioContext === null) {
            this.createAudioContext();
            const timerPeriod = 200;
            this.autoUpdateBufferTimer = setInterval(() => {
                this.updateAudioBuffer();
            }, timerPeriod);
        }
    }

    private createAudioContext(): void {
        // new ((window as any).AudioContext || (window as any).webkitAudioContext)();
        this.audioContext = new AudioContext();

        // TODO: Various examples show this gain node; it does not seem to be needed unless we plan
        // to control the volume, which is not likely.
        this.gainNode = this.audioContext.createGain();
        this.gainNode.gain.value = 1;
        this.gainNode.connect(this.audioContext.destination);
        this.startTime = this.audioContext.currentTime;
    }

    private formatAudioData(audioData: ArrayBuffer): Float32Array {
        switch (this.audioFormat.bitsPerSample) {
            case 8:
                return this.formatArrayBuffer(new Int8Array(audioData), 128);
            case 16:
                return this.formatArrayBuffer(new Int16Array(audioData), 32768);
            case 32:
                return this.formatArrayBuffer(new Int32Array(audioData), 2147483648);
            default:
                throw new InvalidOperationError("Only WAVE_FORMAT_PCM (8/16/32 bps) format supported at this time");
        }
    }

    private formatArrayBuffer(audioData: AudioDataTypedArray, maxValue: number): Float32Array {
        const float32Data = new Float32Array(audioData.length);
        for (let i = 0; i < audioData.length; i++) {
            float32Data[i] = audioData[i] / maxValue;
        }
        return float32Data;
    }

    private updateAudioBuffer(): void {
        if (this.samples.length === 0) {
            return;
        }

        const channelCount = this.audioFormat.channels;
        const bufferSource = this.audioContext.createBufferSource();
        const frameCount = this.samples.length / channelCount;
        const audioBuffer = this.audioContext.createBuffer(channelCount, frameCount, this.audioFormat.samplesPerSec);

        // TODO: Should we do the conversion in the pushAudioSample instead?
        for (let channel = 0; channel < channelCount; channel++) {
            // Fill in individual channel data.
            let channelOffset = channel;
            const audioData = audioBuffer.getChannelData(channel);
            for (let i = 0; i < frameCount; i++, channelOffset += channelCount) {
                audioData[i] = this.samples[channelOffset];
            }
        }

        if (this.startTime < this.audioContext.currentTime) {
            this.startTime = this.audioContext.currentTime;
        }

        bufferSource.buffer = audioBuffer;
        bufferSource.connect(this.gainNode);
        bufferSource.start(this.startTime);

        // Make sure we play the next sample after the current one.
        this.startTime += audioBuffer.duration;

        // Clear the samples for the next pushed data.
        this.samples = new Float32Array();
    }
}
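formatArrayBuffer above normalizes integer PCM samples into the [-1, 1) float range that the Web Audio API expects, dividing each sample by the magnitude of the type's most negative value. The same conversion in standalone form, specialized to 16-bit PCM (the function name is illustrative):

```typescript
// Converts 16-bit PCM samples to Float32 in [-1, 1), as BaseAudioPlayer does
// before handing samples to an AudioBuffer. 32768 is the magnitude of the
// most negative int16, so -32768 maps to exactly -1.
function pcm16ToFloat32(samples: Int16Array): Float32Array {
    const out = new Float32Array(samples.length);
    for (let i = 0; i < samples.length; i++) {
        out[i] = samples[i] / 32768;
    }
    return out;
}
```

Dividing by 32768 rather than 32767 means the positive extreme lands just short of 1.0, which avoids a conditional per sample at the cost of one unused quantization step.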

@@ -0,0 +1,65 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { Contracts } from "./Contracts";
import { DialogServiceConfigImpl } from "./DialogServiceConfig";
import { PropertyId } from "./Exports";

/**
 * Class that defines configurations for the dialog service connector object for using a Bot Framework backend.
 * @class BotFrameworkConfig
 */
export class BotFrameworkConfig extends DialogServiceConfigImpl {

    /**
     * Creates an instance of BotFrameworkConfig.
     */
    public constructor() {
        super();
    }

    /**
     * Creates a bot framework config instance with the specified subscription key and region.
     * @member BotFrameworkConfig.fromSubscription
     * @function
     * @public
     * @param subscription Subscription key associated with the bot.
     * @param region The region name (see the <a href="https://aka.ms/csspeech/region">region page</a>).
     * @returns {BotFrameworkConfig} A new bot framework config.
     */
    public static fromSubscription(subscription: string, region: string): BotFrameworkConfig {
        Contracts.throwIfNullOrWhitespace(subscription, "subscription");
        Contracts.throwIfNullOrWhitespace(region, "region");

        const botFrameworkConfig: BotFrameworkConfig = new BotFrameworkConfig();
        botFrameworkConfig.setProperty(PropertyId.Conversation_DialogType, "bot_framework");
        botFrameworkConfig.setProperty(PropertyId.SpeechServiceConnection_Key, subscription);
        botFrameworkConfig.setProperty(PropertyId.SpeechServiceConnection_Region, region);
        return botFrameworkConfig;
    }

    /**
     * Creates a bot framework config instance with the specified authorization token and region.
     * Note: The caller must ensure that the authorization token is valid. Before the token
     * expires, the caller must refresh it by calling this setter with a new valid token.
     * As configuration values are copied when creating a new recognizer, the new token value will not apply
     * to recognizers that have already been created. For recognizers that were created earlier, set the
     * authorization token on the corresponding recognizer to refresh it; otherwise, those recognizers will
     * encounter errors during recognition.
     * @member BotFrameworkConfig.fromAuthorizationToken
     * @function
     * @public
     * @param authorizationToken The authorization token associated with the bot.
     * @param region The region name (see the <a href="https://aka.ms/csspeech/region">region page</a>).
     * @returns {BotFrameworkConfig} A new bot framework config.
     */
    public static fromAuthorizationToken(authorizationToken: string, region: string): BotFrameworkConfig {
        Contracts.throwIfNullOrWhitespace(authorizationToken, "authorizationToken");
        Contracts.throwIfNullOrWhitespace(region, "region");

        const botFrameworkConfig: BotFrameworkConfig = new BotFrameworkConfig();
        botFrameworkConfig.setProperty(PropertyId.Conversation_DialogType, "bot_framework");
        botFrameworkConfig.setProperty(PropertyId.SpeechServiceAuthorization_Token, authorizationToken);
        botFrameworkConfig.setProperty(PropertyId.SpeechServiceConnection_Region, region);
        return botFrameworkConfig;
    }
}
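Both factories above follow the same shape: validate the inputs, create a fresh config, and stamp a fixed set of properties onto its property bag. That pattern, reduced to a self-contained sketch (all names illustrative, not SDK API):

```typescript
// Minimal property-bag config with a validating static factory, mirroring the
// shape of BotFrameworkConfig.fromSubscription. Not SDK code.
class MiniConfig {
    private props = new Map<string, string>();

    public setProperty(name: string, value: string): void {
        this.props.set(name, value);
    }

    public getProperty(name: string, def?: string): string | undefined {
        const value = this.props.get(name);
        return value !== undefined ? value : def;
    }

    public static fromSubscription(subscription: string, region: string): MiniConfig {
        // Stand-in for Contracts.throwIfNullOrWhitespace.
        if (!subscription || !subscription.trim()) { throw new Error("subscription"); }
        if (!region || !region.trim()) { throw new Error("region"); }

        const config = new MiniConfig();
        config.setProperty("DialogType", "bot_framework");
        config.setProperty("Key", subscription);
        config.setProperty("Region", region);
        return config;
    }
}
```

Keeping everything in a generic property bag is what lets the connector later clone the whole configuration in one step, as DialogServiceConnector does with `properties.clone()`.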

@@ -0,0 +1,203 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { Contracts } from "./Contracts";
import { PropertyCollection, PropertyId, ServicePropertyChannel, SpeechConfigImpl } from "./Exports";

/**
 * Class that defines base configurations for the dialog service connector.
 * @class DialogServiceConfig
 */
export abstract class DialogServiceConfig {

    /**
     * Creates an instance of DialogServiceConfig.
     * @constructor
     */
    protected constructor() { }

    /**
     * Sets an arbitrary property.
     * @member DialogServiceConfig.prototype.setProperty
     * @function
     * @public
     * @param {string} name - The name of the property to set.
     * @param {string} value - The new value of the property.
     */
    public abstract setProperty(name: string, value: string): void;

    /**
     * Returns the current value of an arbitrary property.
     * @member DialogServiceConfig.prototype.getProperty
     * @function
     * @public
     * @param {string} name - The name of the property to query.
     * @param {string} def - The value to return in case the property is not known.
     * @returns {string} The current value, or provided default, of the given property.
     */
    public abstract getProperty(name: string, def?: string): string;

    /**
     * Sets a property value that will be passed to the service using the specified channel.
     * @member DialogServiceConfig.prototype.setServiceProperty
     * @function
     * @public
     * @param {string} name - The name of the property.
     * @param {string} value - The value to set.
     * @param {ServicePropertyChannel} channel - The channel used to pass the specified property to the service.
     */
    public abstract setServiceProperty(name: string, value: string, channel: ServicePropertyChannel): void;

    /**
     * Sets the proxy configuration.
     * Only relevant in Node.js environments.
     * Added in version 1.4.0.
     * @param proxyHostName The host name of the proxy server.
     * @param proxyPort The port number of the proxy server.
     */
    public abstract setProxy(proxyHostName: string, proxyPort: number): void;

    /**
     * Sets the proxy configuration.
     * Only relevant in Node.js environments.
     * Added in version 1.4.0.
     * @param proxyHostName The host name of the proxy server, without the protocol scheme (http://).
     * @param proxyPort The port number of the proxy server.
     * @param proxyUserName The user name of the proxy server.
     * @param proxyPassword The password of the proxy server.
     */
    public abstract setProxy(proxyHostName: string, proxyPort: number, proxyUserName: string, proxyPassword: string): void;

    /**
     * Returns the configured language.
     * @member DialogServiceConfig.prototype.speechRecognitionLanguage
     * @function
     * @public
     */
    public abstract get speechRecognitionLanguage(): string;

    /**
     * Sets the input language.
     * @member DialogServiceConfig.prototype.speechRecognitionLanguage
     * @function
     * @public
     * @param {string} value - The language to use for recognition.
     */
    public abstract set speechRecognitionLanguage(value: string);

    /**
     * Not used in DialogServiceConfig.
     * @member DialogServiceConfig.applicationId
     */
    public applicationId: string;
}

/**
 * Dialog service configuration.
 * @class DialogServiceConfigImpl
 */
// tslint:disable-next-line:max-classes-per-file
export class DialogServiceConfigImpl extends DialogServiceConfig {

    private privSpeechConfig: SpeechConfigImpl;

    /**
     * Creates an instance of DialogServiceConfigImpl.
     */
    public constructor() {
        super();
        this.privSpeechConfig = new SpeechConfigImpl();
    }

    /**
     * Provides access to custom properties.
     * @member DialogServiceConfigImpl.prototype.properties
     * @function
     * @public
     * @returns {PropertyCollection} The properties.
     */
    public get properties(): PropertyCollection {
        return this.privSpeechConfig.properties;
    }

    /**
     * Gets the speech recognition language.
     * @member DialogServiceConfigImpl.prototype.speechRecognitionLanguage
     * @function
     * @public
     */
    public get speechRecognitionLanguage(): string {
        return this.privSpeechConfig.speechRecognitionLanguage;
    }

    /**
     * Sets the speech recognition language.
     * @member DialogServiceConfigImpl.prototype.speechRecognitionLanguage
     * @function
     * @public
     * @param {string} value - The language to set.
     */
    public set speechRecognitionLanguage(value: string) {
        Contracts.throwIfNullOrWhitespace(value, "value");
        this.privSpeechConfig.speechRecognitionLanguage = value;
    }

    /**
     * Sets a named property's value.
     * @member DialogServiceConfigImpl.prototype.setProperty
     * @function
     * @public
     * @param {PropertyId | string} name - The property to set.
     * @param {string} value - The value.
     */
    public setProperty(name: string | PropertyId, value: string): void {
        this.privSpeechConfig.setProperty(name, value);
    }

    /**
     * Gets a named property's value.
     * @member DialogServiceConfigImpl.prototype.getProperty
     * @function
     * @public
     * @param {PropertyId | string} name - The property to get.
     * @param {string} def - The default value to return in case the property is not known.
     * @returns {string} The current value, or provided default, of the given property.
     */
    public getProperty(name: string | PropertyId, def?: string): string {
        return this.privSpeechConfig.getProperty(name, def);
    }

    /**
     * Sets the proxy configuration.
     * Only relevant in Node.js environments.
     * Added in version 1.4.0.
     * @param proxyHostName The host name of the proxy server, without the protocol scheme (http://).
     * @param proxyPort The port number of the proxy server.
     * @param proxyUserName The user name of the proxy server.
     * @param proxyPassword The password of the proxy server.
     */
    public setProxy(proxyHostName: string, proxyPort: number, proxyUserName?: string, proxyPassword?: string): void {
        this.setProperty(PropertyId.SpeechServiceConnection_ProxyHostName, proxyHostName);
        this.setProperty(PropertyId.SpeechServiceConnection_ProxyPort, `${proxyPort}`);
        if (proxyUserName) {
            this.setProperty(PropertyId.SpeechServiceConnection_ProxyUserName, proxyUserName);
        }
        if (proxyPassword) {
            this.setProperty(PropertyId.SpeechServiceConnection_ProxyPassword, proxyPassword);
        }
    }

    public setServiceProperty(name: string, value: string, channel: ServicePropertyChannel): void {
        this.privSpeechConfig.setServiceProperty(name, value, channel);
    }

    /**
     * Dispose of associated resources.
     * @member DialogServiceConfigImpl.prototype.close
     * @function
     * @public
     */
    public close(): void {
        return;
    }
}

@@ -0,0 +1,246 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { DialogConnectionFactory } from "../common.speech/DialogConnectorFactory";
import {
    DialogServiceAdapter,
    IAgentConfig,
    IAuthentication,
    IConnectionFactory,
    RecognitionMode,
    RecognizerConfig,
    ServiceRecognizerBase,
    SpeechServiceConfig,
} from "../common.speech/Exports";
import { ActivityReceivedEventArgs } from "./ActivityReceivedEventArgs";
import { AudioConfigImpl } from "./Audio/AudioConfig";
import { Contracts } from "./Contracts";
import { DialogServiceConfig, DialogServiceConfigImpl } from "./DialogServiceConfig";
import {
    AudioConfig,
    PropertyCollection,
    Recognizer,
    SpeechRecognitionCanceledEventArgs,
    SpeechRecognitionEventArgs,
    SpeechRecognitionResult
} from "./Exports";
import { PropertyId } from "./PropertyId";

/**
 * Dialog Service Connector
 * @class DialogServiceConnector
 */
export class DialogServiceConnector extends Recognizer {
    private privIsDisposed: boolean;

    /**
     * Initializes an instance of the DialogServiceConnector.
     * @constructor
     * @param {DialogServiceConfig} dialogConfig - Set of properties to configure this recognizer.
     * @param {AudioConfig} audioConfig - An optional audio config associated with the recognizer.
     */
    public constructor(dialogConfig: DialogServiceConfig, audioConfig?: AudioConfig) {
        Contracts.throwIfNull(dialogConfig, "dialogConfig");
        const dialogServiceConfigImpl = dialogConfig as DialogServiceConfigImpl;

        super(audioConfig, dialogServiceConfigImpl.properties, new DialogConnectionFactory());

        this.privIsDisposed = false;
        this.privProperties = dialogServiceConfigImpl.properties.clone();

        const agentConfig = this.buildAgentConfig();
        this.privReco.agentConfig.set(agentConfig);
    }

    /**
     * The event recognizing signals that an intermediate recognition result was received.
     * @member DialogServiceConnector.prototype.recognizing
     * @function
     * @public
     */
    public recognizing: (sender: DialogServiceConnector, event: SpeechRecognitionEventArgs) => void;

    /**
     * The event recognized signals that a final recognition result was received.
     * @member DialogServiceConnector.prototype.recognized
     * @function
     * @public
     */
    public recognized: (sender: DialogServiceConnector, event: SpeechRecognitionEventArgs) => void;

    /**
     * The event canceled signals that an error occurred during recognition.
     * @member DialogServiceConnector.prototype.canceled
     * @function
     * @public
     */
    public canceled: (sender: DialogServiceConnector, event: SpeechRecognitionCanceledEventArgs) => void;

    /**
     * The event activityReceived signals that an activity has been received.
     * @member DialogServiceConnector.prototype.activityReceived
     * @function
     * @public
     */
    public activityReceived: (sender: DialogServiceConnector, event: ActivityReceivedEventArgs) => void;

    /**
     * Starts a connection to the service.
     * Users can optionally call connect() to manually set up a connection in advance, before starting interactions.
     *
     * Note: On return, the connection might not be ready yet. Please subscribe to the Connected event to
     * be notified when the connection is established.
     * @member DialogServiceConnector.prototype.connect
     * @function
     * @public
     */
    public connect(): void {
        this.privReco.connect();
    }

    /**
     * Closes the connection to the service.
     * Users can optionally call disconnect() to manually shut down the connection of the associated DialogServiceConnector.
     *
     * If disconnect() is called during a recognition, recognition will fail and be canceled with an error.
     */
    public disconnect(): void {
        this.privReco.disconnect();
|
||||
}
|
||||
|
||||
/**
|
||||
* Gets the authorization token used to communicate with the service.
|
||||
* @member DialogServiceConnector.prototype.authorizationToken
|
||||
* @function
|
||||
* @public
|
||||
* @returns {string} Authorization token.
|
||||
*/
|
||||
public get authorizationToken(): string {
|
||||
return this.properties.getProperty(PropertyId.SpeechServiceAuthorization_Token);
|
||||
}
|
||||
|
||||
/**
|
||||
* Sets the authorization token used to communicate with the service.
|
||||
* @member DialogServiceConnector.prototype.authorizationToken
|
||||
* @function
|
||||
* @public
|
||||
* @param {string} token - Authorization token.
|
||||
*/
|
||||
public set authorizationToken(token: string) {
|
||||
Contracts.throwIfNullOrWhitespace(token, "token");
|
||||
this.properties.setProperty(PropertyId.SpeechServiceAuthorization_Token, token);
|
||||
}
|
||||
|
||||
/**
|
||||
* The collection of properties and their values defined for this DialogServiceConnector.
|
||||
* @member DialogServiceConnector.prototype.properties
|
||||
* @function
|
||||
* @public
|
||||
* @returns {PropertyCollection} The collection of properties and their values defined for this DialogServiceConnector.
|
||||
*/
|
||||
public get properties(): PropertyCollection {
|
||||
return this.privProperties;
|
||||
}
|
||||
|
||||
/**
|
||||
* Starts recognition and stops after the first utterance is recognized.
|
||||
* @member DialogServiceConnector.prototype.listenOnceAsync
|
||||
* @function
|
||||
* @public
|
||||
* @param cb - Callback that received the result when the reco has completed.
|
||||
* @param err - Callback invoked in case of an error.
|
||||
*/
|
||||
public listenOnceAsync(cb?: (e: SpeechRecognitionResult) => void, err?: (e: string) => void): void {
|
||||
try {
|
||||
Contracts.throwIfDisposed(this.privIsDisposed);
|
||||
|
||||
this.connect();
|
||||
|
||||
this.implRecognizerStop();
|
||||
|
||||
this.implRecognizerStart(
|
||||
RecognitionMode.Conversation,
|
||||
(e: SpeechRecognitionResult) => {
|
||||
this.implRecognizerStop();
|
||||
|
||||
if (!!cb) {
|
||||
cb(e);
|
||||
}
|
||||
},
|
||||
(e: string) => {
|
||||
this.implRecognizerStop();
|
||||
if (!!err) {
|
||||
err(e);
|
||||
}
|
||||
});
|
||||
} catch (error) {
|
||||
if (!!err) {
|
||||
if (error instanceof Error) {
|
||||
const typedError: Error = error as Error;
|
||||
err(typedError.name + ": " + typedError.message);
|
||||
} else {
|
||||
err(error);
|
||||
}
|
||||
}
|
||||
|
||||
// Destroy the recognizer.
|
||||
this.dispose(true);
|
||||
}
|
||||
}
|
||||
|
||||
public sendActivityAsync(activity: string): void {
|
||||
this.privReco.sendMessage(activity);
|
||||
}
|
||||
|
||||
/**
|
||||
* closes all external resources held by an instance of this class.
|
||||
* @member DialogServiceConnector.prototype.close
|
||||
* @function
|
||||
* @public
|
||||
*/
|
||||
public close(): void {
|
||||
Contracts.throwIfDisposed(this.privIsDisposed);
|
||||
|
||||
this.dispose(true);
|
||||
}
|
||||
|
||||
protected dispose(disposing: boolean): boolean {
|
||||
if (this.privIsDisposed) {
|
||||
return;
|
||||
}
|
||||
|
||||
if (disposing) {
|
||||
this.implRecognizerStop();
|
||||
this.privIsDisposed = true;
|
||||
super.dispose(disposing);
|
||||
}
|
||||
}
|
||||
|
||||
protected createRecognizerConfig(speechConfig: SpeechServiceConfig): RecognizerConfig {
|
||||
return new RecognizerConfig(speechConfig, this.privProperties);
|
||||
}
|
||||
|
||||
protected createServiceRecognizer(
|
||||
authentication: IAuthentication,
|
||||
connectionFactory: IConnectionFactory,
|
||||
audioConfig: AudioConfig,
|
||||
recognizerConfig: RecognizerConfig): ServiceRecognizerBase {
|
||||
|
||||
const audioSource: AudioConfigImpl = audioConfig as AudioConfigImpl;
|
||||
|
||||
return new DialogServiceAdapter(authentication, connectionFactory, audioSource, recognizerConfig, this);
|
||||
}
|
||||
|
||||
private buildAgentConfig(): IAgentConfig {
|
||||
const communicationType = this.properties.getProperty("Conversation_Communication_Type", "Default");
|
||||
|
||||
return {
|
||||
botInfo: {
|
||||
commType: communicationType,
|
||||
connectionId: this.properties.getProperty(PropertyId.Conversation_ApplicationId),
|
||||
conversationId: undefined
|
||||
},
|
||||
version: 0.2
|
||||
};
|
||||
}
|
||||
}
|
|
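For reference, `buildAgentConfig` above assembles a payload of roughly this shape. The sketch below is standalone and illustrative only: `IAgentConfigSketch` and `buildAgentConfigSketch` are hypothetical stand-ins for the SDK's `IAgentConfig` and the private method, and the values are examples.

```typescript
// Standalone sketch of the agent-config payload buildAgentConfig assembles.
// IAgentConfigSketch is a hypothetical stand-in for the SDK's IAgentConfig.
interface IAgentConfigSketch {
    botInfo: {
        commType: string;
        connectionId: string;
        conversationId?: string;
    };
    version: number;
}

function buildAgentConfigSketch(commType: string, connectionId: string): IAgentConfigSketch {
    return {
        botInfo: {
            commType,           // "Conversation_Communication_Type" property, defaulting to "Default"
            connectionId,       // PropertyId.Conversation_ApplicationId
            conversationId: undefined,
        },
        version: 0.2,
    };
}

// Example: the communication type property defaults to "Default".
const agentConfig = buildAgentConfigSketch("Default", "my-application-id");
console.log(JSON.stringify(agentConfig.botInfo));
```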
@@ -4,6 +4,7 @@
 export { AudioConfig } from "./Audio/AudioConfig";
 export { AudioStreamFormat } from "./Audio/AudioStreamFormat";
 export { AudioInputStream, PullAudioInputStream, PushAudioInputStream } from "./Audio/AudioInputStream";
+export { AudioOutputStream, PullAudioOutputStream } from "./Audio/AudioOutputStream";
 export { CancellationReason } from "./CancellationReason";
 export { PullAudioInputStreamCallback } from "./Audio/PullAudioInputStreamCallback";
 export { KeywordRecognitionModel } from "./KeywordRecognitionModel";

@@ -22,7 +23,7 @@ export { TranslationSynthesisEventArgs } from "./TranslationSynthesisEventArgs";
 export { TranslationRecognitionResult } from "./TranslationRecognitionResult";
 export { TranslationSynthesisResult } from "./TranslationSynthesisResult";
 export { ResultReason } from "./ResultReason";
-export { SpeechConfig } from "./SpeechConfig";
+export { SpeechConfig, SpeechConfigImpl } from "./SpeechConfig";
 export { SpeechTranslationConfig } from "./SpeechTranslationConfig";
 export { PropertyCollection } from "./PropertyCollection";
 export { PropertyId } from "./PropertyId";

@@ -40,5 +41,11 @@ export { CancellationErrorCode } from "./CancellationErrorCodes";
 export { ConnectionEventArgs } from "./ConnectionEventArgs";
 export { Connection } from "./Connection";
 export { PhraseListGrammar } from "./PhraseListGrammar";
+export { DialogServiceConfig } from "./DialogServiceConfig";
+export { BotFrameworkConfig } from "./BotFrameworkConfig";
+export { SpeechCommandsConfig } from "./SpeechCommandsConfig";
+export { DialogServiceConnector } from "./DialogServiceConnector";
+export { ActivityReceivedEventArgs } from "./ActivityReceivedEventArgs";
 export { ServicePropertyChannel } from "./ServicePropertyChannel";
 export { ProfanityOption } from "./ProfanityOption";
+export { BaseAudioPlayer } from "./Audio/BaseAudioPlayer";
@@ -258,4 +258,27 @@
     */
    SpeechServiceResponse_TranslationRequestStablePartialResult,

+    /**
+     * Identifier used to connect to the backend service.
+     * @member PropertyId.Conversation_ApplicationId
+     */
+    Conversation_ApplicationId,
+
+    /**
+     * Type of dialog backend to connect to.
+     * @member PropertyId.Conversation_DialogType
+     */
+    Conversation_DialogType,
+
+    /**
+     * Silence timeout for listening.
+     * @member PropertyId.Conversation_Initial_Silence_Timeout
+     */
+    Conversation_Initial_Silence_Timeout,
+
+    /**
+     * From id to add to speech recognition activities.
+     * @member PropertyId.Conversation_From_Id
+     */
+    Conversation_From_Id
 }
@@ -96,7 +96,7 @@ export abstract class Recognizer {
    /**
     * @Internal
     * Internal data member to support fromRecognizer* pattern methods on other classes.
-    * Do not use externally, object returned will change without warning or notive.
+    * Do not use externally, object returned will change without warning or notice.
     */
    public get internalData(): object {
        return this.privReco;

@@ -168,7 +168,7 @@ export abstract class Recognizer {
        audioConfig: AudioConfig,
        recognizerConfig: RecognizerConfig): ServiceRecognizerBase;

-    // Does the generic recognizer setup that is common accross all recognizer types.
+    // Does the generic recognizer setup that is common across all recognizer types.
    protected implCommonRecognizerSetup(): void {

        let osPlatform = (typeof window !== "undefined") ? "Browser" : "Node";
@@ -0,0 +1,94 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import { Contracts } from "./Contracts";
import { DialogServiceConfigImpl } from "./DialogServiceConfig";
import { PropertyId } from "./Exports";

/**
 * Class that defines configurations for the dialog service connector object for using a SpeechCommands backend.
 * @class SpeechCommandsConfig
 */
export class SpeechCommandsConfig extends DialogServiceConfigImpl {

    /**
     * Creates an instance of SpeechCommandsConfig.
     */
    public constructor() {
        super();
    }

    /**
     * Creates an instance of the speech commands config with the specified application id, subscription and region.
     * @member SpeechCommandsConfig.fromSubscription
     * @function
     * @public
     * @param applicationId Speech Commands application id.
     * @param subscription Subscription key associated with the application.
     * @param region The region name (see the <a href="https://aka.ms/csspeech/region">region page</a>).
     * @returns {SpeechCommandsConfig} A new speech commands config.
     */
    public static fromSubscription(applicationId: string, subscription: string, region: string): SpeechCommandsConfig {
        Contracts.throwIfNullOrWhitespace(applicationId, "applicationId");
        Contracts.throwIfNullOrWhitespace(subscription, "subscription");
        Contracts.throwIfNullOrWhitespace(region, "region");

        const speechCommandsConfig: SpeechCommandsConfig = new SpeechCommandsConfig();
        speechCommandsConfig.setProperty(PropertyId.Conversation_DialogType, "custom_commands");
        speechCommandsConfig.setProperty(PropertyId.Conversation_ApplicationId, applicationId);
        speechCommandsConfig.setProperty(PropertyId.SpeechServiceConnection_Key, subscription);
        speechCommandsConfig.setProperty(PropertyId.SpeechServiceConnection_Region, region);
        return speechCommandsConfig;
    }

    /**
     * Creates an instance of the speech commands config with the specified Speech Commands application id, authorization token and region.
     * Note: The caller needs to ensure that the authorization token is valid. Before the authorization token
     * expires, the caller needs to refresh it by calling this setter with a new valid token.
     * As configuration values are copied when creating a new recognizer, the new token value will not apply to recognizers that have already been created.
     * For recognizers that have been created before, you need to set the authorization token of the corresponding recognizer
     * to refresh the token. Otherwise, the recognizers will encounter errors during recognition.
     * @member SpeechCommandsConfig.fromAuthorizationToken
     * @function
     * @public
     * @param applicationId Speech Commands application id.
     * @param authorizationToken The authorization token associated with the application.
     * @param region The region name (see the <a href="https://aka.ms/csspeech/region">region page</a>).
     * @returns {SpeechCommandsConfig} A new speech commands config.
     */
    public static fromAuthorizationToken(applicationId: string, authorizationToken: string, region: string): SpeechCommandsConfig {
        Contracts.throwIfNullOrWhitespace(applicationId, "applicationId");
        Contracts.throwIfNullOrWhitespace(authorizationToken, "authorizationToken");
        Contracts.throwIfNullOrWhitespace(region, "region");

        const speechCommandsConfig: SpeechCommandsConfig = new SpeechCommandsConfig();
        speechCommandsConfig.setProperty(PropertyId.Conversation_DialogType, "custom_commands");
        speechCommandsConfig.setProperty(PropertyId.Conversation_ApplicationId, applicationId);
        speechCommandsConfig.setProperty(PropertyId.SpeechServiceAuthorization_Token, authorizationToken);
        speechCommandsConfig.setProperty(PropertyId.SpeechServiceConnection_Region, region);
        return speechCommandsConfig;
    }

    /**
     * Sets the corresponding backend application identifier.
     * @member SpeechCommandsConfig.prototype.applicationId
     * @function
     * @public
     * @param {string} value - The application identifier to set.
     */
    public set applicationId(value: string) {
        Contracts.throwIfNullOrWhitespace(value, "value");
        this.setProperty(PropertyId.Conversation_ApplicationId, value);
    }

    /**
     * Gets the corresponding backend application identifier.
     * @member SpeechCommandsConfig.prototype.applicationId
     * @function
     * @public
     * @returns {string} The application identifier.
     */
    public get applicationId(): string {
        return this.getProperty(PropertyId.Conversation_ApplicationId);
    }
}
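Both factory methods above follow the same pattern: create a fresh config, seed its property bag, and return it. A minimal self-contained sketch of that pattern — `PropertyBag` and `fromSubscriptionSketch` are hypothetical stand-ins for the SDK's `PropertyCollection` and the real factory, and the string keys mirror the `PropertyId` names only for illustration:

```typescript
// Hypothetical stand-in for the SDK's PropertyCollection.
class PropertyBag {
    private props: Map<string, string> = new Map();

    public setProperty(key: string, value: string): void {
        this.props.set(key, value);
    }

    public getProperty(key: string, def?: string): string | undefined {
        const value = this.props.get(key);
        return value !== undefined ? value : def;
    }
}

// Mirrors how fromSubscription seeds the configuration.
function fromSubscriptionSketch(applicationId: string, subscription: string, region: string): PropertyBag {
    const bag = new PropertyBag();
    bag.setProperty("Conversation_DialogType", "custom_commands");
    bag.setProperty("Conversation_ApplicationId", applicationId);
    bag.setProperty("SpeechServiceConnection_Key", subscription);
    bag.setProperty("SpeechServiceConnection_Region", region);
    return bag;
}
```

The same shape applies to `fromAuthorizationToken`, with the subscription key replaced by the authorization token property.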
@@ -295,7 +295,7 @@ export abstract class SpeechConfig {
}

/**
- * @private
+ * @public
 * @class SpeechConfigImpl
 */
// tslint:disable-next-line:max-classes-per-file
@@ -0,0 +1,137 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import {
    IAudioStreamNode,
    IStreamChunk,
} from "../src/common/Exports";
import {
    bufferSize,
    PullAudioOutputStreamImpl,
} from "../src/sdk/Audio/AudioOutputStream";
import { Settings } from "./Settings";

beforeAll(() => {
    // Override inputs, if necessary
    Settings.LoadSettings();
});

// Test cases are run linearly; the only other mechanism to demarcate them in the output is to put a console line in each case and
// report the name.
// tslint:disable-next-line:no-console
beforeEach(() => console.info("---------------------------------------Starting test case-----------------------------------"));

test("PullAudioOutputStreamImpl basic test", (done: jest.DoneCallback) => {
    const size: number = 256;
    const ps: PullAudioOutputStreamImpl = new PullAudioOutputStreamImpl(size);
    const ab: ArrayBuffer = new ArrayBuffer(size);

    const abView: Uint8Array = new Uint8Array(ab);
    for (let i: number = 0; i < size; i++) {
        abView[i] = i % 256;
    }
    ps.write(abView);

    let bytesRead: number = 0;
    ps.read().onSuccessContinueWith((audioBuffer: ArrayBuffer) => {
        try {
            expect(audioBuffer.byteLength).toBeGreaterThanOrEqual(size);
            expect(audioBuffer.byteLength).toBeLessThanOrEqual(size);
            const readView: Uint8Array = new Uint8Array(audioBuffer);
            for (let i: number = 0; i < audioBuffer.byteLength; i++) {
                expect(readView[i]).toEqual(bytesRead++ % 256);
            }
        } catch (error) {
            done.fail(error);
        }
        done();
    });
});

test("PullAudioOutputStreamImpl multiple writes read after close", (done: jest.DoneCallback) => {
    const ps: PullAudioOutputStreamImpl = new PullAudioOutputStreamImpl(bufferSize);

    const ab: ArrayBuffer = new ArrayBuffer(bufferSize * 4);
    const abView: Uint8Array = new Uint8Array(ab);
    for (let i: number = 0; i < bufferSize * 4; i++) {
        abView[i] = i % 256;
    }

    let j: number = 0;
    for (j = 0; j < bufferSize * 4; j += 100) {
        ps.write(ab.slice(j, j + 100));
    }
    ps.close();

    let bytesRead: number = 0;

    const readLoop = () => {
        ps.read().onSuccessContinueWith((audioBuffer: ArrayBuffer) => {
            try {
                if (audioBuffer == null) {
                    expect(bytesRead).toBeGreaterThanOrEqual(bufferSize * 4);
                    expect(bytesRead).toBeLessThanOrEqual(bufferSize * 4);
                } else {
                    expect(audioBuffer.byteLength).toBeGreaterThanOrEqual(bufferSize);
                    expect(audioBuffer.byteLength).toBeLessThanOrEqual(bufferSize);
                    const readView: Uint8Array = new Uint8Array(audioBuffer);
                    for (let i: number = 0; i < audioBuffer.byteLength; i++) {
                        expect(readView[i]).toEqual(bytesRead++ % 256);
                    }
                }
            } catch (error) {
                done.fail(error);
            }

            if (audioBuffer != null) {
                readLoop();
            } else {
                done();
            }
        });
    };

    readLoop();
});

test("PullAudioOutputStreamImpl multiple writes and reads", (done: jest.DoneCallback) => {
    const ps: PullAudioOutputStreamImpl = new PullAudioOutputStreamImpl(bufferSize);

    const ab: ArrayBuffer = new ArrayBuffer(bufferSize * 4);
    const abView: Uint8Array = new Uint8Array(ab);
    for (let i: number = 0; i < bufferSize * 4; i++) {
        abView[i] = i % 256;
    }

    let j: number = 0;
    for (j = 0; j < bufferSize * 4; j += 100) {
        ps.write(ab.slice(j, j + 100));
    }
    ps.write(ab.slice(j));

    let bytesRead: number = 0;

    const readLoop = () => {
        ps.read().onSuccessContinueWith((audioBuffer: ArrayBuffer) => {
            try {
                expect(audioBuffer.byteLength).toBeGreaterThanOrEqual(bufferSize);
                expect(audioBuffer.byteLength).toBeLessThanOrEqual(bufferSize);
                const readView: Uint8Array = new Uint8Array(audioBuffer);
                for (let i: number = 0; i < audioBuffer.byteLength; i++) {
                    expect(readView[i]).toEqual(bytesRead++ % 256);
                }
            } catch (error) {
                done.fail(error);
            }

            if (bytesRead < bufferSize * 4) {
                readLoop();
            } else {
                done();
            }
        });
    };

    readLoop();
});
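The contract these tests exercise can be summarized in a small synchronous sketch: write() buffers chunks, read() drains up to bufferSize bytes at a time, and a stream that has been closed and fully drained yields null. This is only an illustration — the real PullAudioOutputStreamImpl is promise-based, and `PullStreamSketch` is a hypothetical name:

```typescript
// Synchronous sketch of the pull-stream contract the tests above exercise.
// (Hypothetical class; the real PullAudioOutputStreamImpl reads asynchronously.)
class PullStreamSketch {
    private chunks: Uint8Array[] = [];
    private closed: boolean = false;

    constructor(private readonly bufferSize: number) {}

    public write(data: ArrayBuffer): void {
        this.chunks.push(new Uint8Array(data));
    }

    public close(): void {
        this.closed = true;
    }

    // Returns up to bufferSize bytes, or null once the stream is closed and drained.
    public read(): ArrayBuffer | null {
        if (this.chunks.length === 0) {
            return this.closed ? null : new ArrayBuffer(0);
        }
        const out = new Uint8Array(this.bufferSize);
        let filled = 0;
        while (filled < this.bufferSize && this.chunks.length > 0) {
            const head = this.chunks[0];
            const take = Math.min(head.length, this.bufferSize - filled);
            out.set(head.subarray(0, take), filled);
            filled += take;
            if (take === head.length) {
                this.chunks.shift();           // chunk fully consumed
            } else {
                this.chunks[0] = head.subarray(take); // keep the remainder
            }
        }
        return out.buffer.slice(0, filled);
    }
}
```

The "multiple writes read after close" test above is exactly this shape: several writes, a close(), then a read loop that stops when read() reports the terminal null.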
@@ -0,0 +1,144 @@
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  <meta name="viewport" content="width=device-width">

  <title>decodeAudioData example</title>

  <link rel="stylesheet" href="">
  <!--[if lt IE 9]>
    <script src="//html5shiv.googlecode.com/svn/trunk/html5.js"></script>
  <![endif]-->
</head>

<body>
  <div id="warning">
    <h1 style="font-weight:500;">Speech Recognition Speech SDK not found (microsoft.cognitiveservices.speech.sdk.bundle.js missing).</h1>
  </div>
  <div id="content" style="display:none">
    <h1>decodeAudioData example</h1>

    <button class="play">Play</button>
    <button class="stop">Stop</button>

    <h2>Set playback rate</h2>
    <input class="playback-rate-control" type="range" min="0.25" max="3" step="0.05" value="1">
    <span class="playback-rate-value">1.0</span>

    <h2>Set loop start and loop end</h2>
    <input class="loopstart-control" type="range" min="0" max="20" step="1" value="0">
    <span class="loopstart-value">0</span>

    <input class="loopend-control" type="range" min="0" max="20" step="1" value="0">
    <span class="loopend-value">0</span>

    <pre></pre>
  </div>
</body>
<!-- Speech SDK REFERENCE -->
<script src="microsoft.cognitiveservices.speech.sdk.bundle.js"></script>
<!-- Speech SDK USAGE -->
<script>
  // On document load, resolve the Speech SDK dependency
  function Initialize(onComplete) {
    if (!!window.SpeechSDK) {
      document.getElementById('content').style.display = 'block';
      document.getElementById('warning').style.display = 'none';
      onComplete(window.SpeechSDK);
    }
  }
</script>

<script>
  var SpeechSDK;
  Initialize(function (speechSdk) {
    SpeechSDK = speechSdk;
  });

  var audioFormat = SpeechSDK.AudioStreamFormat.getDefaultInputFormat();
  var audioPlayer = new SpeechSDK.BaseAudioPlayer(audioFormat);

  // define variables
  var socketURL = 'ws://localhost:8080';
  var ws;

  var pre = document.querySelector('pre');
  var myScript = document.querySelector('script');
  var play = document.querySelector('.play');
  var stop = document.querySelector('.stop');

  var playbackControl = document.querySelector('.playback-rate-control');
  var playbackValue = document.querySelector('.playback-rate-value');
  playbackControl.setAttribute('disabled', 'disabled');

  var loopstartControl = document.querySelector('.loopstart-control');
  var loopstartValue = document.querySelector('.loopstart-value');
  loopstartControl.setAttribute('disabled', 'disabled');

  var loopendControl = document.querySelector('.loopend-control');
  var loopendValue = document.querySelector('.loopend-value');
  loopendControl.setAttribute('disabled', 'disabled');

  // use web sockets to load audio chunks and push them into the BaseAudioPlayer
  function getData() {
    ws = new WebSocket(socketURL);
    ws.binaryType = 'arraybuffer';
    ws.addEventListener('message', function(event) {
      audioPlayer.playAudioSample(event.data);
    });
  }

  // wire up buttons to stop and play audio, and the range slider controls
  play.onclick = function() {
    getData();
    play.setAttribute('disabled', 'disabled');
    playbackControl.removeAttribute('disabled');
    loopstartControl.removeAttribute('disabled');
    loopendControl.removeAttribute('disabled');
  }

  stop.onclick = function() {
    if (ws !== null && ws !== undefined) {
      ws.close(1000);
    }

    audioPlayer.stopAudio();

    play.removeAttribute('disabled');
    playbackControl.setAttribute('disabled', 'disabled');
    loopstartControl.setAttribute('disabled', 'disabled');
    loopendControl.setAttribute('disabled', 'disabled');
  }

  playbackControl.oninput = function() {
    playbackValue.innerHTML = playbackControl.value;
  }

  loopstartControl.oninput = function() {
    loopstartValue.innerHTML = loopstartControl.value;
  }

  loopendControl.oninput = function() {
    loopendValue.innerHTML = loopendControl.value;
  }

  // dump script to pre element
  pre.innerHTML = myScript.innerHTML;
</script>
</html>
Binary file not shown.
@@ -0,0 +1,50 @@
const WebSocket = require('ws');
const fs = require('fs');

const pcm_file = './ObamaInterview1.wav';
let interval = 0,
    sampleRate = 16000,
    bytePerSample = 2,
    channels = 1,
    bytesChunk = (sampleRate * bytePerSample * channels),
    offset = 0,
    pcmData,
    wss;

fs.readFile(pcm_file, (err, data) => {
    if (err) throw err;
    pcmData = data;
    openSocket();
});

function openSocket() {
    wss = new WebSocket.Server({ port: 8080 });
    console.log('Server ready...');
    wss.on('connection', function connection(ws) {
        console.log('Socket connected. sending data...');
        if (interval) {
            clearInterval(interval);
        }
        interval = setInterval(function() {
            sendData();
        }, 500);
    });
}

function sendData() {
    let payload;
    if (offset >= pcmData.length) {
        clearInterval(interval);
        offset = 0;
        return;
    }

    payload = pcmData.subarray(offset, (offset + bytesChunk));
    offset += bytesChunk;
    wss.clients.forEach(function each(client) {
        if (client.readyState === WebSocket.OPEN) {
            client.send(payload);
        }
    });
}
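The chunk size and timer interval in the server above determine the streaming pace: each tick sends one second of 16 kHz, 16-bit mono PCM, and ticks fire every 500 ms, so audio is delivered at twice real time and the client buffers ahead. A quick check of that arithmetic (constants copied from the script):

```typescript
// Streaming pace implied by server.js above.
const sampleRate = 16000;   // samples per second
const bytesPerSample = 2;   // 16-bit PCM
const channels = 1;         // mono
const tickMs = 500;         // setInterval period

const chunkBytes = sampleRate * bytesPerSample * channels;       // bytes sent per tick
const audioSecondsPerTick = chunkBytes / (sampleRate * bytesPerSample * channels);
const deliverySpeedup = audioSecondsPerTick / (tickMs / 1000);   // audio seconds per wall-clock second

console.log(chunkBytes, deliverySpeedup); // → 32000 2
```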
File diff not shown because of its size.
@@ -0,0 +1,84 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT license.

import * as fs from "fs";
import * as sdk from "../microsoft.cognitiveservices.speech.sdk";
import {
    ConsoleLoggingListener,
    WebsocketMessageAdapter,
} from "../src/common.browser/Exports";
import {
    Events,
    EventType,
} from "../src/common/Exports";
import { BaseAudioPlayer } from "../src/sdk/Audio/BaseAudioPlayer";
import { AudioStreamFormat } from "../src/sdk/Exports";
import { Settings } from "./Settings";
import { validateTelemetry } from "./TelemetryUtil";
import WaitForCondition from "./Utilities";
import { WaveFileAudioInput } from "./WaveFileAudioInputStream";

let objsToClose: any[];

beforeAll(() => {
    // Override inputs, if necessary
    Settings.LoadSettings();
    Events.instance.attachListener(new ConsoleLoggingListener(EventType.Debug));
});

// Test cases are run linearly; the only other mechanism to demarcate them in the output is to put a console line in each case and
// report the name.
beforeEach(() => {
    objsToClose = [];
    // tslint:disable-next-line:no-console
    console.info("---------------------------------------Starting test case-----------------------------------");
});

afterEach(() => {
    // tslint:disable-next-line:no-console
    console.info("End Time: " + new Date(Date.now()).toLocaleString());
    objsToClose.forEach((value: any, index: number, array: any[]) => {
        if (typeof value.close === "function") {
            value.close();
        }
    });
});

test("Play TTS from a file", (done: jest.DoneCallback) => {
    // tslint:disable-next-line:no-console
    console.info("Name: Play audio from a file");

    const audioFormat = AudioStreamFormat.getDefaultInputFormat();
    const audioPlayer: BaseAudioPlayer = new BaseAudioPlayer(audioFormat);

    let countData: number = 0;
    let countClose: number = 0;
    let error: number = 0;
    let lastError: string;
    fs.createReadStream(Settings.WaveFile, { highWaterMark: 4800 })
        .on("data", (buffer: Buffer) => {
            audioPlayer.playAudioSample(buffer);
            countData++;
            return;
        }).on("end", () => {
            countClose++;
            return;
        }).on("error", (err: any) => {
            lastError = err;
            error++;
            return;
        });

    WaitForCondition(() => {
        return (countClose > 0);
    }, () => {
        // tslint:disable-next-line:no-console
        console.info("countData: " + countData, " countClose: " + countClose);

        done();
        return;
    });

    // expect(() => sdk.DialogServiceConfig.fromBotSecret(null, null, null)).toThrowError();
});
@@ -12,7 +12,7 @@ export class Settings {

    // Endpoint and key for timeout testing.
    // Endpoint should reduce standard speech timeout to value specified in SpeechServiceTimeoutSeconds
-    // If undefined, production timeout of 10 seconds will be used, but at the cost of greatly incrased test
+    // If undefined, production timeout of 10 seconds will be used, but at the cost of greatly increased test
    // duration.
    public static SpeechTimeoutEndpoint: string;
    public static SpeechTimeoutKey: string;

@@ -22,6 +22,8 @@ export class Settings {
    public static LuisRegion: string = "<<YOUR_LUIS_REGION>>";
    public static LuisAppEndPointHref: string = "<<YOUR_LUIS_APP_URL>>";

+    public static BotSecret: string = "<<YOUR_BOT_SECRET>>";
+
    public static InputDir: string = "tests/input/audio/";

    public static ExecuteLongRunningTests: string = "false";

@@ -31,7 +33,7 @@ export class Settings {
    }

    /*
-     * The intent behing this setting is that at test execution time the WaveFile below will contain speech
+     * The intent behind this setting is that at test execution time the WaveFile below will contain speech
     * that the LUIS app above will recognize as an intent with this ID.
     *
     * Since the default wave file asks "What's the weather like?", an intent with the Id of "Weather" seems reasonable.
@@ -1444,9 +1444,9 @@ describe.each([true, false])("Service based tests", (forceNodeWebSocket: boolean
        postCall = true;
    });

-    test("InitialSilenceTimeout Continous", (done: jest.DoneCallback) => {
+    test("InitialSilenceTimeout Continuous", (done: jest.DoneCallback) => {
        // tslint:disable-next-line:no-console
-        console.info("Name: InitialSilenceTimeout Continous");
+        console.info("Name: InitialSilenceTimeout Continuous");
        const s: sdk.SpeechConfig = BuildSpeechConfig();
        objsToClose.push(s);
Binary file not shown.
Binary file not shown.