Azure Cognitive Services - Speech

Spoiler alert, as a formula: 16 kHz * 16 bits * 600 seconds + 44 = 16000 * 2 * 600 + 44 = 19200044 bytes. (I will open this one up later, maybe.)

So I started to try this and that, first with PowerShell. After generating an Azure Speech API key (two keys are generated, but only one is needed for authentication), I got it working with a WAV file under 15 seconds long at 16 kHz. It converted the speech to text well enough, but the allowed length was too short.

A hint for using Invoke-RestMethod with the API key in PowerShell: you have to put the key inside {} to get it working. For some reason Azure rejected the request without the braces.

    $Headers = @{
        'Ocp-Apim-Subscription-Key' = {Speech API KEY}
        'Transfer-Encoding' = 'chunked'
        'Content-type' = 'audio/pcm; codec=audio/pcm; samplerate=16000'
    }

Microsoft says in the documentation that the subscription key for the REST API has to be given in 'key' format, but nope. Just no.
Microsoft's documentation also disagrees with itself about the length limit: the first picture says files under 15 seconds, but in the next document it is only 10.
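To put numbers on those length limits and on the spoiler formula from the top of this post, here is a small sketch in JavaScript (Node.js). It assumes uncompressed mono PCM and the plain 44-byte WAV header with no extra chunks. The function names wavFileSizeBytes and wavDurationSeconds are made up for this example and are not part of any Azure SDK.

```javascript
// Size of the canonical RIFF/WAVE header for plain PCM audio, in bytes.
// Assumption: no extra chunks (LIST, fact, ...) are present in the file.
const WAV_HEADER_BYTES = 44;

// Expected size in bytes of an uncompressed PCM WAV file.
function wavFileSizeBytes(sampleRateHz, bitsPerSample, channels, seconds) {
  const bytesPerSample = bitsPerSample / 8; // 16 bits -> 2 bytes
  return sampleRateHz * bytesPerSample * channels * seconds + WAV_HEADER_BYTES;
}

// The inverse: duration in seconds estimated from a file size in bytes.
function wavDurationSeconds(fileSizeBytes, sampleRateHz, bitsPerSample, channels) {
  const bytesPerSecond = sampleRateHz * (bitsPerSample / 8) * channels;
  return (fileSizeBytes - WAV_HEADER_BYTES) / bytesPerSecond;
}

// 10 minutes of 16 kHz, 16-bit, mono audio (the spoiler formula).
console.log(wavFileSizeBytes(16000, 16, 1, 600)); // 19200044

// A 15-second clip at the same settings.
console.log(wavFileSizeBytes(16000, 16, 1, 15)); // 480044

// Going back from the file size to the duration.
console.log(wavDurationSeconds(19200044, 16000, 16, 1)); // 600
```

So a 15-second clip at these settings is under half a megabyte, while ten minutes is a bit over 19 MB.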
OK, I get that this is new stuff to them too. There was Bing Speech before, but this is still new and exciting. By the way, Google also released new functions for its speech services. And yes, I also tried Go, the language Google invented, for this, but that is a different story to be told, maybe.

When you make the request, you have to use the speech-to-text endpoint for the same region that your resource group is in. So all your stuff in Azure has to be in the same resource group, or you will get funny (it wasn't funny then) errors from your queries. Mine was:

northeurope.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1

Here is the list of the services: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text

Eteenpäin sano mummo lumessa. (I just had to use this one. It's Finnish, and it translates to ... but hey, wait, you can use Azure for that after this blog. Not telling.)

For codeless apps, you can use Azure Logic Apps or Google App Engine. There is a nice write-up from Abhishek about Azure Logic Apps and batch transcription from Blob storage.

First, you want to download the Azure Speech SDK, either for other languages or as the JavaScript browser package. A new version, 1.16, was released at the beginning of March, and it now supports the MP3 format like Google does. I tried this one with WSL and the Node.js libraries:

https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/cognitive-services/Speech-Service/includes/how-to/speech-to-text-basics/speech-to-text-basics-javascript.md

You can also use Visual Studio to build a solution to test this, but I prefer the Node.js-based solution.

https://github.com/Azure-Samples/Cognitive-Speech-STT-Windows

And here is a sample repository for all the different languages:

https://github.com/Azure-Samples/cognitive-services-speech-sdk

Happy speeching all, this article will be continued in part 2.