Subtitling videos with the help of speech recognition technology

Subtitling online videos is a growing practice, and with good reason. YouTube offers automatic subtitling, even though the result is not 100% accurate, and that is exactly the point we want to focus on:

How does the technology that allows us to obtain the subtitles work?

What other tools are there to subtitle videos, including in other languages?

Why should I subtitle my videos?

Before getting into the subject, let’s quickly set the context. It is essential for understanding market trends and staying ahead of the curve.

In short, the future of brands is video marketing. The use of video in marketing plans is nothing new; what is new is the central role it now holds. Video can and must be distributed on all platforms, especially social networks: Facebook, Instagram, Twitter, Snapchat and others.

Add to this the tendency to watch videos without sound (85% of Facebook users tend to watch videos in silence), and it is obvious why many companies and media outlets are investing in subtitling. But let’s not go into more detail on this for now.

Let’s return to our focus: to obtain subtitles we can turn to voice recognition technology. It automates and speeds up the manual work of transcription, whose output is then turned into subtitles; in other words, it converts audio or video into text, although with some limitations. How does it work?

Voice recognition technology

We can define it as a set of computing methods that convert an audio or video file into text. Voice dictation, commercialized since the 1980s, is the most popular application of this type of technology.

Fortunately, recent technological advances now let us apply this tool in other ways, namely:

  • Automatic subtitling and translation of video.
  • Indexing and extraction of information from audiovisual documents.
  • Human-machine voice interfaces such as Siri.

We focus on the first two points. The limits of speech recognition technology include inaccuracy in recognizing certain words or ways of speaking (jargon, accents and so on), difficulty transcribing more than one speaker, and incomplete punctuation. It is also affected by environmental factors: if the ambient noise is loud, recognition will be poor.

The good news is that these shortcomings are constantly being corrected. Companies that offer this type of service make it a priority to keep improving their platforms so that they become smarter and smarter. Siri already understands us better 😉

Among platforms that offer subtitles, some provide only automatic subtitling, such as YouTube, while others add manual review on top of it, such as the online platform Authôt.

Free and paid tools to get subtitles

One option is YouTube, which is a decent choice because it lets us obtain automatic subtitles or upload our own. It is an easy solution if you are not bothered by some transcription errors and the videos will only be published there.

Other options are dictation tools, such as Google Docs, Dictation or Speechnotes, which, as the name implies, are not designed to produce subtitles with timecodes, but many people use them because they are free.

With a little inventiveness, many users turn on the microphone in these tools and immediately play the video so that it “dictates” what must be transcribed. Does it work? If the audio is good, the result will be too.

The catch with these dictation tools is that the process is slow. Transcription happens in real time: if the video lasts one hour, you will have to wait an hour for the tool to transcribe it. Also, you cannot export the text in .srt format, which means you have to synchronize the subtitles manually.
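To give an idea of what that manual synchronization involves, here is a minimal Python sketch of the .srt layout: a running number, a start and end timecode, and the text, with a blank line between entries. The function names and the sample cues are hypothetical, made up for illustration; they are not part of any of the tools mentioned.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timecode (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(cues) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Entries are separated by one blank line.
    return "\n".join(blocks)


# Hypothetical cues: the timings are what you would have to add by hand
# to a plain dictation transcript.
sample_cues = [
    (0, 2.5, "Welcome to our channel."),
    (2.5, 5, "Today we talk about subtitles."),
]
srt_text = build_srt(sample_cues)
```

Each tuple holds the moment a sentence starts and ends in the video, which is exactly the information a dictation tool does not record.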

On the other hand, there are paid tools for obtaining automatic subtitles, some of which include review, embedding and even translation services. One example is the online platform Authôt, which has a free version that lets you upload 10 minutes of video or audio.

The interesting thing about this platform is the speed of its speech recognition technology. The audio or video is converted into text in a few minutes, with 97% accuracy when the audio quality is good; for example, a 1-hour video is transcribed in 12 minutes.
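As a rough illustration of that difference, the sketch below compares a real-time dictation tool with the figure of 12 minutes per hour of video quoted above. It assumes that processing time scales linearly with video length, which is a simplification and not something the platform guarantees; the function and constant names are hypothetical.

```python
def estimated_minutes(video_minutes: float, speed_factor: float) -> float:
    """Estimate transcription time, assuming it scales linearly with length."""
    return video_minutes / speed_factor


# A dictation tool listens in real time: 1 minute of video takes 1 minute.
REAL_TIME_FACTOR = 1.0
# Figure from the article: 60 minutes of video in 12 minutes, i.e. 5x real time.
QUOTED_FACTOR = 60 / 12

for length in (10, 60, 90):
    dictation = estimated_minutes(length, REAL_TIME_FACTOR)
    automatic = estimated_minutes(length, QUOTED_FACTOR)
    print(f"{length} min video: dictation {dictation:.0f} min, automatic {automatic:.0f} min")
```

Under that assumption, a 90-minute video would take an hour and a half with a dictation tool and about 18 minutes with the faster service.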

In addition, there are multiple export formats: subtitles with timecodes (.srt), of course, and others (.txt, .mp3, .mp4, and more). Manual correction and translation services are also available.

Why should I subtitle my videos?

We can mention at least three reasons:

  1. The habit of watching videos without sound on social networks.
  2. The higher engagement of subtitled videos, which reach a wider audience (accessibility).
  3. Their usefulness for internationalizing content when the subtitles are translated.

A study carried out by the company Animoto in 2017 confirmed the great value of video on the Internet, especially on Facebook, where 85% of videos are played without sound. It also found that 39% of users watch a video to the end if it has subtitles. The preference for watching pre-recorded videos (52%) is worth noting too.

In addition, companies that use subtitled videos are promoting digital accessibility. Content becomes more accessible to people who are deaf or hard of hearing, in compliance with Spain's Royal Legislative Decree of November 29, 2013, on the rights of people with disabilities and their social inclusion.


Likewise, if the videos have subtitles in other languages, their reach will undoubtedly be greater. This translation practice is used by major media such as the US press group Condé Nast, which publishes magazines such as Vogue, GQ and Vanity Fair.

Finally, you can go further by exporting the subtitles as text for reuse online. That content can become an article for the company blog or, why not, an e-book. Reusing subtitles is an excellent practice, and even better if it benefits the SEO positioning of your website.
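As a small illustration of that reuse, the Python sketch below strips the numbering and timecodes from .srt content and returns the spoken text as a single paragraph, ready to be edited into an article. The function name and the sample subtitles are hypothetical, and real files may need extra cleanup (formatting tags, speaker labels).

```python
import re

# Matches an SRT timing line such as "00:00:02,500 --> 00:00:05,000".
TIMECODE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def srt_to_text(srt: str) -> str:
    """Return only the subtitle text from SRT content, joined with spaces."""
    pieces = []
    # Subtitle entries are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if lines and lines[0].isdigit():
            lines = lines[1:]  # drop the entry number
        if lines and TIMECODE.match(lines[0]):
            lines = lines[1:]  # drop the timing line
        pieces.extend(lines)
    return " ".join(pieces)


# Hypothetical subtitle content used only for this example.
sample = """1
00:00:00,000 --> 00:00:02,500
Welcome to our channel.

2
00:00:02,500 --> 00:00:05,000
Today we talk about subtitles.
"""
article_draft = srt_to_text(sample)
```

The result is plain running text, which still needs a human edit (headings, paragraphs, corrections) before it reads like a blog post.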

