# media & entertainment
smart transcription-as-a-service platform
4000 journalists & researchers
119 languages supported
1 week of lead time
23k minutes of audio since launch
problems to solve
Happy Scribe is a high-end speech-to-text transcription platform for researchers, journalists, podcasters, and media production companies. It uses state-of-the-art machine-learning models to automatically transcribe the audio and video files that users upload. Researchers and journalists spend a significant part of their time transcribing interviews, so the goal is to make their lives easier. Happy Scribe reinvents the way they work.
Merixstudio helped Happy Scribe with the dynamic growth of the product. We were tasked with advanced JavaScript development of a set of features and were in charge of making the transcribed text as clear and user-friendly as possible.
solutions
Merixstudio mainly contributed to enhancing the audio and video player. Apart from simply playing the material, users can click on a word in the transcript to jump to the corresponding point in the audio or video. Once the media player was taken care of, the next thing we were responsible for was making the transcribed text as clear for users as possible. That’s why we introduced speaker identification, which lets users label speakers so they can tell them apart in the transcript.
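As an illustration only, click-to-seek can be sketched as follows. This is not the production code: the word shape (`text`, `start`, `end` in seconds) and the function name `seekToWord` are assumptions made for the example. The only real browser API it relies on is the writable `currentTime` property of an `<audio>` or `<video>` element.

```javascript
// Hypothetical word shape: { text: "hello", start: 0.5, end: 0.9 } (times in seconds).

// Move playback to the start of the clicked word.
// `player` is any object with a writable `currentTime` in seconds,
// for example an HTMLMediaElement (<audio> or <video>).
function seekToWord(player, word) {
  // Guard against negative timestamps coming from the transcript data.
  player.currentTime = Math.max(0, word.start);
  return player.currentTime;
}

// In a browser this might be wired up roughly like this (not executed here):
//   wordSpan.addEventListener("click", () => seekToWord(videoElement, word));
```

Keeping the function independent of the DOM makes it easy to test with a plain object standing in for the media element.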
Another issue we addressed was helping users navigate the text easily while the transcribed file plays. We achieved that by highlighting the text in sync with the spoken word, similar to karaoke. In addition, we introduced comments, which are also based on highlighting particular parts of the text.
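One possible way to drive karaoke-style highlighting is to look up the word being spoken at the current playback time. The sketch below assumes words are sorted by start time and carry hypothetical `start` and `end` fields in seconds. The function name `activeWordIndex` is likewise invented for the example and does not come from the original project.

```javascript
// Return the index of the word being spoken at `time` (seconds),
// or -1 if playback is in a gap between words or outside the transcript.
// Assumes `words` is sorted by `start` and words do not overlap.
function activeWordIndex(words, time) {
  let lo = 0;
  let hi = words.length - 1;
  let candidate = -1;

  // Binary search for the last word whose start is <= time.
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (words[mid].start <= time) {
      candidate = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }

  // The candidate only counts if playback has not yet passed its end.
  if (candidate !== -1 && time < words[candidate].end) {
    return candidate;
  }
  return -1;
}
```

In a browser, this lookup could be called from a `timeupdate` listener on the media element, moving a highlight class onto the word at the returned index.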
One of the most interesting features we implemented was a confidence heatmap, which shows how confident the algorithm is about each word it transcribed from the audio or video file. Once a user clicks the heatmap button, the words are coloured according to their confidence score, so the words most likely to have been transcribed incorrectly are coloured red.
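The colouring step can be sketched as a mapping from a confidence score to a colour. The example assumes each word has a confidence between 0 and 1, which the source does not specify, and it uses an HSL hue ramp from red (0) to green (120) as one plausible choice. The function name `confidenceColor` is hypothetical.

```javascript
// Map a confidence score to a CSS colour string.
// Assumption: confidence lies in [0, 1]; values outside are clamped.
// Hue 0 is red (least confident), hue 120 is green (most confident).
function confidenceColor(confidence) {
  const clamped = Math.min(1, Math.max(0, confidence));
  const hue = Math.round(clamped * 120);
  return `hsl(${hue}, 80%, 45%)`;
}

// In a browser, heatmap mode might then apply it per word (not executed here):
//   wordSpan.style.color = confidenceColor(word.confidence);
```

A continuous ramp lets users spot the weakest words at a glance. A fixed threshold with just two colours would be a simpler alternative.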
Finally, all user work is private by default, but the application allows users to share a preview of the transcript. A user can export the final result to Word, TXT or PDF, or to SRT and VTT for use as subtitles.
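To make the subtitle export concrete, here is a minimal sketch of serialising timed cues to SRT and WebVTT. The cue shape (`start`, `end`, `text`, with times in seconds) and the function names are assumptions for illustration. The output follows the standard layout of both formats: SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a dot.

```javascript
// Format seconds as HH:MM:SS<sep>mmm, where <sep> is "," for SRT and "." for WebVTT.
function formatTimestamp(seconds, sep) {
  const totalMs = Math.round(seconds * 1000);
  const pad = (n, width) => String(n).padStart(width, "0");
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}${sep}${pad(ms, 3)}`;
}

// Serialise cues to SRT: a 1-based index, a timing line, the text, then a blank line.
function toSrt(cues) {
  return cues
    .map((cue, i) =>
      `${i + 1}\n${formatTimestamp(cue.start, ",")} --> ${formatTimestamp(cue.end, ",")}\n${cue.text}\n`)
    .join("\n");
}

// Serialise cues to WebVTT: a header line, then cues separated by blank lines.
function toVtt(cues) {
  const body = cues
    .map((cue) =>
      `${formatTimestamp(cue.start, ".")} --> ${formatTimestamp(cue.end, ".")}\n${cue.text}\n`)
    .join("\n");
  return `WEBVTT\n\n${body}`;
}
```

Sharing one timestamp formatter keeps the two exporters consistent, since the formats differ mainly in the header, the cue numbering, and the millisecond separator.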
key features
- speaker identification: recognising a speaker’s switch
- custom timestamps: adding timestamps in the text
- share publicly: share a view-only page of the transcript
- highlight & comment: adding comments when collaborating with multiple teams
- export transcript: exporting transcript in Word, TXT and PDF or in SRT for subtitles
- heatmap mode: fast correction by looking only at the places where the algorithm struggled