...

Code Block
# Read the wav file
wavefile = wave.open('office_top_short.wav', 'rb')
samplerate = wavefile.getframerate()

print("Audio file specs:")
print("  sample rate (Hz):", samplerate)
print("  length (frames):", wavefile.getnframes())
print("  sample width in bytes:", wavefile.getsampwidth())
print("  number of channels:", wavefile.getnchannels())
print()
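
To make the meaning of these values concrete, here is a minimal, self-contained sketch using only Python's standard wave module. It builds a short silent clip in memory (so no file is needed) and derives the duration from the frame count and sample rate. The helper name describe_wav and the clip parameters are illustrative assumptions, not part of the framework.

```python
import io
import wave


def describe_wav(wav):
    """Return basic properties of an open wave.Wave_read object."""
    frames = wav.getnframes()
    rate = wav.getframerate()
    return {
        "sample_rate": rate,
        "frames": frames,
        "sample_width_bytes": wav.getsampwidth(),
        "channels": wav.getnchannels(),
        # duration follows from the number of frames and frames per second
        "duration_seconds": frames / rate,
    }


# Build a 2-second silent, mono, 16-bit clip in memory (illustrative values).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)       # 2 bytes per sample = 16-bit audio
    writer.setframerate(16000)
    writer.writeframes(b"\x00\x00" * 32000)

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    specs = describe_wav(reader)

print(specs)
```

Note that getsampwidth() reports the size of a single sample in bytes, not the size of the whole file.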

...

Now we get to the more interesting part. The dialogflow component sends back a lot of information, so we need to handle its messages and extract the transcription from them.

First, we’ll create an event. We’ll set this event whenever dialogflow has detected the end of a sentence, so that we can immediately ask dialogflow to listen for the next one. It’s easiest to use a threading.Event, because dialogflow will signal the end of a sentence at an arbitrary point while we are still sending audio.

The on_dialog function handles setting this event. It also prints each partial transcript as it arrives and, once dialogflow has settled on a final transcript, appends it to the transcripts list.

Code Block
# set up the callback and variables to contain the transcript results
# Dialogflow is not made for transcribing, so we'll have to work around this by "faking" a conversation

dialogflow_detected_sentence = threading.Event()
transcripts = []


def on_dialog(message):
    if message.response:
        t = message.response.recognition_result.transcript
        print("\r Transcript:", t, end="")

        if message.response.recognition_result.is_final:
            transcripts.append(t)
            dialogflow_detected_sentence.set()
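
The hand-off between a background producer and the main loop can be tried out in isolation. The sketch below uses only the standard library. The function fake_recognizer is a hypothetical stand-in for the speech service, and it waits for each event to be acknowledged purely to keep the demonstration deterministic; the real component does not do this.

```python
import threading
import time

sentence_done = threading.Event()
results = []


def fake_recognizer(sentences):
    """Hypothetical stand-in for the speech service, running in its own thread."""
    for sentence in sentences:
        time.sleep(0.05)              # pretend to listen for a while
        results.append(sentence)
        sentence_done.set()           # signal: a sentence has ended
        # Demo-only simplification: wait until the main loop has reacted.
        while sentence_done.is_set():
            time.sleep(0.01)


worker = threading.Thread(target=fake_recognizer, args=(["hello", "world"],))
worker.start()

acknowledged = 0
while worker.is_alive() or sentence_done.is_set():
    if sentence_done.is_set():
        # This is the point where the tutorial asks for the next sentence.
        sentence_done.clear()
        acknowledged += 1
    time.sleep(0.01)                  # the main loop keeps doing its own work

worker.join()
print(acknowledged, results)
```

Polling is_set() instead of calling wait() keeps the main loop free to continue with other work, which is what the audio-sending loop later in this tutorial relies on.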

...

Now we can set up dialogflow. We do this by first reading in our JSON key:

Code Block
# read your keyfile and connect to dialogflow
# (the with-statement makes sure the file is closed again after reading)
with open("your_keyfile_here.json") as keyfile:
    keyfile_json = json.load(keyfile)
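
If loading the key fails later on, it can help to check the file up front. The following is a rough sketch under the assumption that the keyfile is a Google service-account key, which normally contains the fields listed in REQUIRED_FIELDS. The helper load_keyfile and the demo values are hypothetical and only serve to make the example runnable.

```python
import json
import os
import tempfile

# Assumption: a Google service-account key normally contains these fields.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_keyfile(path):
    """Load a JSON keyfile and check that the assumed fields are present."""
    with open(path) as keyfile:
        data = json.load(keyfile)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    return data


# Demonstration with a throwaway file holding placeholder values (not a real key).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_keyfile.json")
    with open(demo_path, "w") as f:
        json.dump({"type": "service_account",
                   "project_id": "demo-project",
                   "private_key": "placeholder",
                   "client_email": "demo@example.com"}, f)
    demo_key = load_keyfile(demo_path)

print(demo_key["project_id"])
```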

...

And then we can create a configuration for the dialogflow component. Make sure to set the proper sample rate!

Code Block
conf = DialogflowConf(keyfile_json=keyfile_json,
                      sample_rate_hertz=samplerate, )
dialogflow = Dialogflow(conf=conf)

...

We’ll direct the output messages produced by dialogflow to the on_dialog function by registering it as a callback.

Code Block
dialogflow.register_callback(on_dialog)
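
To see what the callback does with partial and final results without connecting to the service, it can be fed hand-made messages. This is a sketch only: fake_message builds stand-in objects with types.SimpleNamespace, mirroring just the fields that on_dialog reads (response, recognition_result, transcript, is_final). The real message type has more to it.

```python
import threading
from types import SimpleNamespace

detected = threading.Event()
transcripts = []


def on_dialog(message):
    """Same logic as the tutorial callback, minus the progress printing."""
    if message.response:
        result = message.response.recognition_result
        if result.is_final:
            transcripts.append(result.transcript)
            detected.set()


def fake_message(text, is_final):
    """Hypothetical stand-in exposing only the attributes on_dialog uses."""
    result = SimpleNamespace(transcript=text, is_final=is_final)
    return SimpleNamespace(response=SimpleNamespace(recognition_result=result))


# Two partial results followed by the final one for the same sentence.
for msg in (fake_message("good", False),
            fake_message("good morning", False),
            fake_message("good morning everyone", True)):
    on_dialog(msg)

print(transcripts)
```

Only the final result ends up in the list, and the event is set exactly when that happens.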

...

To get a sense of what dialogflow is hearing, we’ll also play the sound on our own speakers.

Code Block
# OPTIONAL: set up output device to play audio along transcript

...

p = pyaudio.PyAudio()
output = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=samplerate,
                output=True)

With everything set up, we can ask dialogflow to detect a sentence! We do this using dialogflow.request(GetIntentRequest(), block=False). The non-blocking call is important here: we need to keep sending audio, and a blocking call would wait for a result that never arrives because no audio is being sent. Every time dialogflow detects a sentence, we ask it to listen for the next one!

Code Block


# To make dialogflow listen to the audio, we need to ask it to "listen for intent".
# This means it will try to determine what the intention is of what is being said by the person speaking.
# Instead of using this intent, we simply store the transcript and ask it to listen for intent again.

print("Listening for first sentence")
dialogflow.request(GetIntentRequest(), block=False)

# send the audio in chunks of one second
# (the chunk count is rounded up so a trailing partial second is sent as well)
for i in range((wavefile.getnframes() + samplerate - 1) // samplerate):

    if dialogflow_detected_sentence.is_set():
        print()
        dialogflow.request(GetIntentRequest(), block=False)

        dialogflow_detected_sentence.clear()

    # grab one second of audio data
    chunk = wavefile.readframes(samplerate)

    output.write(chunk)  # replace with time.sleep to not send audio too fast if not playing audio

    message = AudioMessage(sample_rate=samplerate, waveform=chunk)
    dialogflow.send_message(message)
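
The chunking itself can be checked without dialogflow or a sound card. A minimal sketch, using only the standard wave module: the generator iter_chunks (a hypothetical helper, not part of the framework) keeps reading until readframes returns no more data, so a final chunk shorter than one second is included too.

```python
import io
import wave


def iter_chunks(wav, seconds=1.0):
    """Yield raw audio from an open wave.Wave_read in chunks of `seconds`."""
    frames_per_chunk = int(wav.getframerate() * seconds)
    while True:
        chunk = wav.readframes(frames_per_chunk)
        if not chunk:          # empty bytes means the end of the file
            break
        yield chunk


# Build a 2.5-second silent, mono, 16-bit clip at 8 kHz in memory (illustrative values).
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(8000)
    writer.writeframes(b"\x00\x00" * 20000)

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    chunk_sizes = [len(chunk) for chunk in iter_chunks(reader)]

print(chunk_sizes)
```

Each full chunk holds one second of audio (8000 frames of 2 bytes each), and the last one holds the remaining half second. When the audio is not played back, sleeping for the duration of each chunk after sending it keeps the stream at roughly real-time speed.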

When we’re done we’ll write the output to a file and clean up dialogflow.

Code Block
dialogflow.send_message(StopListeningMessage())

print("\n\n")
print("Final transcript")
print(transcripts)

with open('transcript.txt', 'w') as f:
    for line in transcripts:
        f.write(f"{line}\n")

output.close()
p.terminate()
wavefile.close()

That’s the code! Run your file like so:

Code Block
cd sic_framework/tests
python3 demo_transcribe_with_dialogflow.py

...