#DailyBloggingChallenge (331/365)
The idea that
> people might enjoy listening to a #podcast like approach of evaluating various #books
has been suggested to me.
To keep everything in the #Fediverse with the power of #ActivityPub, the goal is to publish the content on #FunkWhale.
#DailyBloggingChallenge (332/365)
The main way that I evaluate #books, specifically #AudioBooks, is by taking a #VoiceRecording after each chapter, section, or idea.
I have noticed that with #NonFiction books, I can easily listen to them at twice the speed. On the other hand, #fiction books need to be listened to at normal speed.
#DailyBloggingChallenge (362/365)
Originally I wanted to use #VOSK for the #SpeechToText transcription. I first tried it out in #KdenLive with its 'Speech Recognition' tool.
This took quite a while to set up, since it is not clear what kind of file format, if any, the VOSK model should have. Additionally, the recommended virtual #Python environment didn't work as expected, so I went with the global approach.
In the end I scrapped the whole approach once I realized that transcribing a 26 min audio clip was taking longer than 10 min.
#DailyBloggingChallenge (364/365)
The 'Quick Start' section in the whisper.cpp Readme sufficed for setting up.
The only change I had to make in the `./models/download-ggml-model.sh` script (1) was to remove the option `--show-progress` on line 105. It seems that GNU Wget2 2.1.0 doesn't have that option.
Alternatively, one can replace the option with
`--progress=bar --force-progress`
- 1: https://github.com/ggerganov/whisper.cpp/blob/master/models/download-ggml-model.sh
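Instead of editing the script by hand, the choice between the two options could be made at run time. This is only a minimal sketch: the function name `pick_progress_flags` is hypothetical, and it assumes that a wget build which lacks `--show-progress` also does not list it in its `--help` output.

```shell
#!/bin/sh
# Hypothetical helper: reads `wget --help` output on stdin and prints
# the progress flags that the installed wget presumably accepts.
pick_progress_flags() {
    if grep -q -e '--show-progress'; then
        # Option is listed in the help text, so keep the script's default.
        printf '%s\n' '--show-progress'
    else
        # Replacement mentioned above for GNU Wget2 2.1.0.
        printf '%s\n' '--progress=bar --force-progress'
    fi
}

# Possible usage (not executed here):
#   flags=$(wget --help 2>&1 | pick_progress_flags)
#   wget $flags <model-url>
```

The helper reads the help text from stdin rather than calling wget itself, so it can be tried out without wget installed.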
#DailyBloggingChallenge (365/365)
The only caveat of the #Whisper project is that it only works on 16-bit #WAV files.
An #FFMPEG command can do the conversion via the #terminal:
`ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav`
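With one voice recording per chapter, the same conversion has to be repeated for many files. The following is a rough sketch of how that might be batched. The function names `to_wav_name` and `convert_all` are made up for illustration, and it assumes the output should sit next to the input with a `.wav` extension.

```shell
#!/bin/sh
# Hypothetical helper: swap the last file extension for .wav.
to_wav_name() {
    printf '%s\n' "${1%.*}.wav"
}

# Hypothetical helper: convert each argument with the same settings as the
# command above (16 kHz sample rate, mono, signed 16-bit little-endian PCM).
convert_all() {
    for src in "$@"; do
        dst=$(to_wav_name "$src")
        # Skip inputs that already carry the target name.
        if [ "$src" = "$dst" ]; then
            continue
        fi
        # -nostdin: do not read from the terminal; -n: never overwrite files.
        ffmpeg -nostdin -n -i "$src" -ar 16000 -ac 1 -c:a pcm_s16le "$dst"
    done
}

# Possible usage (not executed here):
#   convert_all recordings/*.mp3
```

Files that already end in `.wav` are skipped, so recordings that are WAV but not 16 kHz mono would still need a different output name.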
#DailyBloggingChallenge (363/365)
Instead, I opted for #Whisper, which also works with #KdenLive.
Although Whisper was originally written in #Python, there is a #CPP project that makes transcribing very fast. It took less than 2 min to transcribe the 26 min audio clip.
https://github.com/ggerganov/whisper.cpp