HappyScribe

# internet & media

smart transcription-as-a-service platform

backend: Ruby on Rails
frontend: React, SlateJS editor, Immutable.js

4000 journalists & researchers

119 languages supported

1 week of lead time

23k minutes of audio since launch

problems to solve

Happy Scribe is a high-end speech-to-text transcription platform for researchers, journalists, podcasters, and media production companies. It uses state-of-the-art machine-learning models to automatically transcribe the audio and video files that users upload. Researchers and journalists spend a significant part of their time transcribing interviews, so the goal is to make that work easier. Happy Scribe reinvents the way they work.

Merixstudio supported Happy Scribe during the dynamic growth of the product. We were tasked with the advanced JavaScript development of a set of features and were in charge of making the transcribed text as clear and user-friendly as possible.

solutions

Merixstudio mainly contributed to enhancing the audio and video player. Apart from simply playing the material, users can control it by clicking on a piece of text to jump to the corresponding point in the audio or video. Once the media player was taken care of, our next responsibility was making the transcribed text as clear for users as possible. That’s why we introduced speaker identification, which lets users set speaker labels to differentiate between speakers.
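To illustrate the click-to-jump behaviour, here is a minimal sketch rather than the code used in the product. It assumes each transcribed word carries a start time in seconds; the `TimedWord` shape, the `Seekable` interface, and the `seekToWord` helper are hypothetical names chosen for this example. In a browser, an `HTMLMediaElement` satisfies `Seekable` through its `currentTime` property.

```typescript
// Hypothetical shape of one transcribed word; the field names are assumptions.
interface TimedWord {
  text: string;
  start: number; // seconds from the beginning of the media
  end: number;   // seconds from the beginning of the media
}

// Anything with a writable currentTime, e.g. an <audio> or <video> element.
interface Seekable {
  currentTime: number;
}

// Moves the player to the start of the clicked word and returns that time.
function seekToWord(player: Seekable, words: TimedWord[], index: number): number {
  const word = words[index];
  if (word === undefined) {
    throw new RangeError(`No word at index ${index}`);
  }
  player.currentTime = word.start;
  return word.start;
}
```

A click handler on a rendered word would then only need to know the index of that word and call `seekToWord(player, words, index)`.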

Another issue we addressed was helping users navigate the text easily while the transcribed file is playing. We achieved that by adding karaoke-style highlighting of the text, synchronized with the spoken word. In addition, we introduced comments, which are also based on highlighting particular parts of the text.
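The karaoke-style highlighting comes down to finding which word is being spoken at the current playback time. One possible way to do that is sketched below. It assumes the words are sorted by start time and do not overlap; `WordTiming` and `activeWordIndex` are hypothetical names, and the binary search is our choice for the example, not necessarily what the product uses.

```typescript
// Hypothetical timing record for one word, in seconds.
interface WordTiming {
  start: number;
  end: number;
}

// Returns the index of the word spoken at `time`, or -1 when the time
// falls in a pause between words or outside the transcript.
// Assumes `words` is sorted by start time and the intervals do not overlap.
function activeWordIndex(words: WordTiming[], time: number): number {
  let lo = 0;
  let hi = words.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const word = words[mid];
    if (time < word.start) {
      hi = mid - 1; // the active word, if any, is earlier
    } else if (time >= word.end) {
      lo = mid + 1; // the active word, if any, is later
    } else {
      return mid;   // start <= time < end
    }
  }
  return -1;
}
```

Calling this from the player's `timeupdate` event (or an animation frame loop) with `player.currentTime` gives the index of the word to highlight.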

One of the most interesting features we implemented was a confidence heatmap, which shows how confident the algorithm is in each word it transcribed from the audio/video file. Once a user clicks the heatmap button, the words are coloured according to their confidence score, so the words most likely to have been transcribed incorrectly by the algorithm are coloured red.
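As a rough sketch of how a confidence score could be turned into a colour, the function below maps a score to an HSL value. It assumes scores lie between 0 and 1, and the red-to-green gradient with fixed saturation and lightness is an illustrative choice; the only thing taken from the description above is that the least certain words end up red. `confidenceToColour` is a hypothetical name.

```typescript
// Maps a confidence score to a CSS colour.
// Assumption: scores are in [0, 1]; out-of-range values are clamped.
// Hue 0 is red (least confident), hue 120 is green (most confident).
function confidenceToColour(confidence: number): string {
  const clamped = Math.min(1, Math.max(0, confidence));
  const hue = Math.round(clamped * 120);
  return `hsl(${hue}, 90%, 45%)`;
}
```

In heatmap mode, each word's style could then be set to `confidenceToColour(word.confidence)`, and reset to the default text colour when the mode is switched off.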

Finally, all user work is private by default, but the application allows users to share a preview of the transcript. A user can export the final results to Word, TXT, or PDF, or to SRT and VTT for use as subtitles.
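For readers unfamiliar with subtitle formats, the sketch below shows what an SRT export could look like. It is an illustration under assumptions, not the product's exporter: the `Cue` shape and the `formatSrtTime` and `toSrt` helpers are hypothetical, and the transcript is assumed to be already split into timed segments. An SRT file is a sequence of numbered blocks, each with a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line followed by the text.

```typescript
// Hypothetical subtitle segment; times are in seconds.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Left-pads a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Formats seconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Builds the SRT document: numbered cues separated by a blank line.
function toSrt(cues: Cue[]): string {
  return cues
    .map(
      (cue, i) =>
        `${i + 1}\n${formatSrtTime(cue.start)} --> ${formatSrtTime(cue.end)}\n${cue.text}\n`
    )
    .join("\n");
}
```

A VTT export would follow the same structure, with a `WEBVTT` header line and a period instead of a comma before the milliseconds.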


key features

  • speaker identification

    recognising when the speaker changes

  • custom timestamps

    adding timestamps in the text

  • share publicly

    share a view-only page of the transcript

  • highlight & comment

    adding comments when collaborating with multiple teams

  • export transcript

    exporting the transcript to Word, TXT, or PDF, or to SRT or VTT for subtitles

  • heatmap mode

    fast correction by looking only at the places where the algorithm struggled

