Audio Transcriptions with Google

Record ideas on the move, preprocess at home, generate transcript in the cloud

Christian Prior-Mamulyan

Make a bunch of audio recordings

Write a bash script (in WSL)

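A first version of such a script might look like the sketch below. It is not the script from the slide: the function names are made up for illustration, and it assumes the recordings sit in the current directory with names starting with "2" (for example a date-based name), GNU find, and ffmpeg on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: collect the recordings into an ffmpeg concat list, then combine them.
# Assumption: recordings live in the current directory and start with "2".

# Write one "file '<path>'" line per recording, sorted by name.
# (Hypothetical helper name; -printf requires GNU find.)
make_concat_list() {
    local list="$1"
    find . -iname '2*wav' -printf "file '%p'\n" | sort > "$list"
}

# Concatenate everything named in the list into a single FLAC file.
# (Hypothetical helper name; -safe 0 lets the concat demuxer accept
# paths such as "./name.wav".)
combine_to_flac() {
    local list="$1" out="$2"
    ffmpeg -hide_banner -f concat -safe 0 -i "$list" -c:a flac "$out"
}
```

After sourcing the file, `make_concat_list mylist.txt && combine_to_flac mylist.txt _combined.flac` would produce the combined recording.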
Run the script on *.wav files

Upload to a Google bucket

console.cloud.google.com/speech/transcriptions
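The upload can also be done from the command line instead of the web console. The sketch below assumes the Google Cloud CLI is installed and authenticated; the bucket name is a placeholder and the function name is made up for illustration. By default it only prints the command so it can be checked first.

```shell
#!/usr/bin/env bash
# Sketch: upload the combined recording to a Cloud Storage bucket.
# Assumption: "my-transcription-bucket" is a placeholder, set BUCKET to
# a bucket you own.

# Print the upload command; pass "run" as second argument to execute it.
# (Hypothetical helper name.)
upload_recording() {
    local file="$1" mode="${2:-print}"
    local bucket="${BUCKET:-my-transcription-bucket}"
    set -- gcloud storage cp "$file" "gs://${bucket}/"
    if [ "$mode" = run ]; then
        "$@"
    else
        echo "$*"
    fi
}
```

For example, `BUCKET=my-bucket upload_recording _combined.flac run` would perform the actual upload.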

Download transcription

Improved script with ffmpeg audio filters

The improved code


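combine-wav-files-to-flac:
        # Start clean; -f keeps rm quiet when a file does not exist yet.
        rm -f _combined.flac mylist.txt
        # List all recordings, quoted so that unusual file names survive.
        find . -iname "2*wav" -printf "file '%p'\n" | sort > mylist.txt
        # Concatenate, cut silence, normalize loudness, encode as FLAC.
        ffmpeg -hide_banner -f concat -safe 0 -i mylist.txt -c:a flac -af \
        "silenceremove=start_periods=1:start_duration=0:start_threshold=-50dB:stop_periods=-1:stop_duration=2:stop_threshold=-40dB,loudnorm=i=-14.0" \
        -ar 48000 _combined.flac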

The documentation for the filters can be found at
https://ffmpeg.org/ffmpeg-filters.html#silenceremove and
https://ffmpeg.org/ffmpeg-filters.html#loudnorm

The silenceremove filter cuts silent passages based on a set of threshold and duration parameters, and the loudnorm filter normalizes the loudness of what remains.
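To make the individual settings easier to tune, the filter chain can be assembled from named values. This is a sketch only: the function name and the argument order are made up for illustration, and the defaults are the values used in the command above.

```shell
#!/usr/bin/env bash
# Sketch: assemble the -af filter chain from named settings.
# (Hypothetical helper; defaults mirror the values used in this deck.)
build_filter_chain() {
    local start_threshold="${1:--50dB}"  # leading audio below this level is trimmed
    local stop_threshold="${2:--40dB}"   # later audio below this level counts as silence
    local stop_duration="${3:-2}"        # silence must last this many seconds before it is cut
    local loudness="${4:--14.0}"         # loudnorm integrated loudness target in LUFS

    # start_periods=1 trims silence at the very beginning;
    # stop_periods=-1 also removes silence in the middle of the audio.
    local silence="silenceremove=start_periods=1:start_duration=0"
    silence="${silence}:start_threshold=${start_threshold}"
    silence="${silence}:stop_periods=-1:stop_duration=${stop_duration}"
    silence="${silence}:stop_threshold=${stop_threshold}"

    printf '%s,loudnorm=i=%s\n' "$silence" "$loudness"
}
```

The result can be passed straight to ffmpeg, for example `-af "$(build_filter_chain -45dB -35dB 3)"` for a less aggressive cut.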

The process steps visualized

Copyright

Copyright © 2024 Christian Prior-Mamulyan.
Except where otherwise noted, this work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License.