Deepgram is by far the easiest speech-to-text API I’ve tried. However, they don’t seem to have an API that lets you look up how their content IDs map to your filenames. If you need that mapping, you can pull it out of the page’s links from the console in the Chrome developer tools:
$('a').valueOf().map( (v, x) => [[x.href, x.text]] ).filter( (i, t) => { return t[0].indexOf("content?id") >= 0 } ).map( (i, t) => t[0].split("?")[1] + " > deepgram-json/" + t[1].split(".")[0] + ".json" ).toArray().join("\n ")
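
The one-liner above uses jQuery's `.map()`, `.filter()`, and `.toArray()`, so it only works if the page itself exposes jQuery as `$`. Without jQuery, Chrome's built-in `$` is just a shortcut for `document.querySelector` and the snippet fails. Here is a rough plain-JavaScript sketch of the same idea. The helper name `mapContentIds` is made up for illustration, and it assumes, as the original does, that the relevant links contain `content?id` in their URL and use the filename as their link text. Unlike the original, it trims whitespace around the link text and joins lines with a plain newline.

```javascript
// Hypothetical helper (not part of any Deepgram API): turns link-like
// objects ({ href, text }) into "<query string> > deepgram-json/<name>.json"
// lines, one per link whose URL contains "content?id".
function mapContentIds(links) {
  return links
    .filter((a) => a.href.includes("content?id"))
    .map((a) => {
      const query = a.href.split("?")[1];        // everything after the "?"
      const name = a.text.trim().split(".")[0];  // filename up to the first dot
      return `${query} > deepgram-json/${name}.json`;
    })
    .join("\n");
}

// In the browser console, feed it every <a> element on the page.
// The guard keeps this line from running where there is no DOM.
if (typeof document !== "undefined") {
  console.log(mapContentIds(Array.from(document.querySelectorAll("a"))));
}
```

Like the original, this keeps only the part of the filename before the first dot, so files whose names differ only after that dot (say `call.part1.mp3` and `call.part2.mp3`) would map to the same `.json` name.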