OK, so after a week of learning Lua and hacking and slashing, I have a working script for IBM Watson speech-to-text voicemail transcription for FusionPBX.
Honestly it's not that hard, but I didn't know how to script in Lua so it was painful. Anyway, here goes:
First go get an account and API key from the IBM Watson speech-to-text site here: https://cloud.ibm.com/login
Next, in the FusionPBX GUI go to Advanced => Default Settings and scroll down to the voicemail section.
Enter the following new keys:
Sub: api_key Type: text Value: [your API key] Enabled: true
Sub: json_enabled Type: boolean Value: true Enabled: true
Sub: transcribe_language Type: text Value: en-US Enabled: true
Sub: transcribe_provider Type: text Value: watson Enabled: true
Sub: transcription_server Type: text Value: https://stream.watsonplatform.net/speech-to-text/api/v1/recognize Enabled: true
Now at the top of the page click the "Reload" button. Then navigate to Status => SIP Status and click the "Flush Cache" button and the "Reload XML" button.
Also make sure transcription is set to true in your voicemail box.
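A side note on transcribe_language: the function further down reads that setting but never passes it to Watson, so the service uses its default model. Here is a minimal sketch of how the setting could be turned into a `model` query parameter. The helper name build_recognize_url and the "_NarrowbandModel" suffix are assumptions for illustration, not part of the original script (narrowband models are aimed at 8 kHz telephone audio); check the model list for your own Watson instance before relying on it.

```lua
-- Hypothetical helper: append a Watson model name derived from the
-- transcribe_language setting (for example "en-US") to the recognize URL.
local function build_recognize_url(transcription_server, transcribe_language)
	-- Assumption: a narrowband model exists for this language
	local model = transcribe_language .. "_NarrowbandModel";
	-- use "&" if the configured URL already carries a query string
	local separator = transcription_server:find("?", 1, true) and "&" or "?";
	return transcription_server .. separator .. "model=" .. model;
end

local url = build_recognize_url("https://stream.watsonplatform.net/speech-to-text/api/v1/recognize", "en-US");
print(url);
```

The result could be passed to curl in place of the plain transcription_server value.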
Make a backup of /usr/share/freeswitch/scripts/app/voicemail/resources/functions/record_message.lua by renaming it.
Upload the attached file and make sure the ownership and permissions are correct.
You can also just modify the current record_message.lua by inserting the relevant bits into the transcribe function, as below:
Code:
local function transcribe(file_path,settings,start_epoch)
	--transcription variables
	if (os.time() - start_epoch > 2) then
		local transcribe_provider = settings:get('voicemail', 'transcribe_provider', 'text') or '';
		transcribe_language = settings:get('voicemail', 'transcribe_language', 'text') or 'en-US';
		if (debug["info"]) then
			freeswitch.consoleLog("notice", "[voicemail] transcribe_provider: " .. transcribe_provider .. "\n");
			freeswitch.consoleLog("notice", "[voicemail] transcribe_language: " .. transcribe_language .. "\n");
		end
		--Watson: post the recording to the recognize endpoint with curl
		if (transcribe_provider == "watson") then
			local api_key = settings:get('voicemail', 'api_key', 'text') or '';
			local transcription_server = settings:get('voicemail', 'transcription_server', 'text') or '';
			if (api_key ~= '') then
				--the file path is quoted so paths containing spaces survive the shell
				transcribe_cmd = [[ curl -X POST -u "apikey:]]..api_key..[[" --header "Content-type: audio/wav" --data-binary @"]]..file_path..[[" "]]..transcription_server..[[" ]]
				local handle = io.popen(transcribe_cmd);
				local transcribe_result = handle:read("*a");
				handle:close();
				if (debug["info"]) then
					freeswitch.consoleLog("notice", "[voicemail] CMD: " .. transcribe_cmd .. "\n");
					freeswitch.consoleLog("notice", "[voicemail] RESULT: " .. transcribe_result .. "\n");
				end
				local transcribe_json = JSON.decode(transcribe_result);
				--pick the first alternative only if every level of the response exists,
				--so an error response or an empty results list does not raise a Lua error
				local alternative = nil;
				if (type(transcribe_json) == "table" and type(transcribe_json["results"]) == "table") then
					local result = transcribe_json["results"][1];
					if (type(result) == "table" and type(result["alternatives"]) == "table") then
						alternative = result["alternatives"][1];
					end
				end
				if (alternative == nil or alternative["transcript"] == nil) then
					if (debug["info"]) then
						freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: (null) \n");
					end
					return '';
				end
				transcription = alternative["transcript"];
				confidence = alternative["confidence"];
				if (debug["info"]) then
					freeswitch.consoleLog("notice", "[voicemail] TRANSCRIPTION: " .. transcription .. "\n");
					--confidence may be missing, so convert it before concatenating
					freeswitch.consoleLog("notice", "[voicemail] CONFIDENCE: " .. tostring(confidence) .. "\n");
				end
				return transcription;
			end
		end
	else
		if (debug["info"]) then
			freeswitch.consoleLog("notice", "[voicemail] message too short for transcription.\n");
		end
	end
	return '';
end
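One thing to watch: the function above only reads the first entry of results. As far as I understand the Watson response format, a longer recording with pauses can come back as several entries in results, so a long voicemail may be cut short. Below is a sketch of joining every entry, assuming the decoded JSON has the same results / alternatives / transcript shape used above. The function name join_transcripts and the sample table are made up for illustration.

```lua
-- Hypothetical helper: concatenate the first alternative of every result.
local function join_transcripts(transcribe_json)
	local parts = {};
	-- anything that is not a decoded response with a results list yields ''
	if (type(transcribe_json) ~= "table" or type(transcribe_json["results"]) ~= "table") then
		return '';
	end
	for _, result in ipairs(transcribe_json["results"]) do
		local alternatives = result["alternatives"];
		if (type(alternatives) == "table" and type(alternatives[1]) == "table"
			and type(alternatives[1]["transcript"]) == "string") then
			-- strip leading and trailing whitespace from each chunk
			parts[#parts + 1] = alternatives[1]["transcript"]:match("^%s*(.-)%s*$");
		end
	end
	return table.concat(parts, " ");
end

-- Made-up sample data in the shape the script expects
local sample = {
	results = {
		{ alternatives = { { transcript = "hi this is a test ", confidence = 0.9 } } },
		{ alternatives = { { transcript = "call me back ", confidence = 0.8 } } },
	},
};
print(join_transcripts(sample));
```

In the transcribe function, the returned string could be assigned to transcription instead of the single transcript field.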
Now we just have to see how much Watson will cost when the rubber hits the road, but so far the accuracy has been very good.