Voice sampling is the process of recording samples of audio, including speech, and piecing them together to form other words. It shows up in various places, and you may already have used it yourself. Examples include popular translation and text-to-speech tools you probably know, such as Google Translate, Microsoft Translator, Google Text-to-Speech, Apple's VoiceOver and more.

An example of voice sampling used to read text aloud in Google Translate (translate.google.com).

How does it work?

As you may know, you can type in any phrase you like and the software behind these sites will process it and read it back to you. Since many text-to-speech voices were recorded by a human at some point, it would be impractical and inefficient, if not outright impossible, to have someone read every possible word and phrase. That's why voice sampling exists. Voice actors only have to read a limited set of material, such as common sound combinations like vowels that often appear together, and the software then stitches those recordings into whatever needs to be said. This is especially useful for names that aren't well known or aren't real words, like foreign company names. These text-to-speech engines will still make mistakes, but they keep getting better over time.

The same technique can also be used for harm, because people can sample the voice of a politician or other popular figure and use it to mislead the public. For example, a celebrity's sampled voice could announce a private concert that people have to pay for online, or a president's sampled voice could be used to fake threats against other countries (especially those with little to no access to the internet), announce deals, or urge citizens to take some action.
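To make the stitching step concrete, here is a minimal sketch in Python. It is not how Google Translate or any other real engine works internally; it only illustrates the idea of building a new word out of a small set of pre-recorded pieces. The names `UNIT_BANK`, `split_into_units` and `synthesize` are made up for this example, the "recordings" are just short lists of numbers standing in for audio, and the units are letter groups rather than the phonetic units a real system would use.

```python
from typing import Dict, List

# Hypothetical unit bank: each key is a sound unit the voice actor recorded
# once, each value stands in for that recording's audio samples.
# Assumption: real systems key on phonetic units, not spelling.
UNIT_BANK: Dict[str, List[float]] = {
    "go": [0.1, 0.2],
    "gle": [0.5, 0.6, 0.7],
    "g": [0.4],
    "o": [0.3],
    "l": [0.8],
    "e": [0.9],
}


def split_into_units(word: str, bank: Dict[str, List[float]]) -> List[str]:
    """Greedily split a word into the longest units found in the bank."""
    units: List[str] = []
    longest = max(len(unit) for unit in bank)
    i = 0
    while i < len(word):
        # Try the longest possible unit first, then shorter ones.
        for size in range(min(longest, len(word) - i), 0, -1):
            candidate = word[i:i + size]
            if candidate in bank:
                units.append(candidate)
                i += size
                break
        else:
            # No recorded unit starts at this position.
            raise KeyError(f"no recorded unit covers {word[i]!r}")
    return units


def synthesize(word: str, bank: Dict[str, List[float]]) -> List[float]:
    """Concatenate the recordings of each unit into one audio sequence."""
    audio: List[float] = []
    for unit in split_into_units(word.lower(), bank):
        audio.extend(bank[unit])
    return audio


if __name__ == "__main__":
    print(split_into_units("google", UNIT_BANK))
    print(synthesize("Google", UNIT_BANK))
```

Nobody ever recorded the word "google" here, yet it can still be produced from the pieces "go", "o" and "gle". The greedy longest-match rule is just one simple choice for this sketch; a word containing a sound that was never recorded raises an error instead of guessing.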

Use cases.

So far we have talked about the bad things that could happen, but there is also a good side. Apart from translation and easier communication between people who speak different languages, there are several tools for those who are visually impaired. Phones, for example, ship with programs like VoiceOver that are already in active use and make everyday tasks easier. Some apps let people scan a photo and have the text read out using technologies like OCR (Optical Character Recognition), which may also help people with conditions like dyslexia.

There are also programs that let people with speaking difficulties practice with a text-to-speech engine by following a set instruction sheet, whether as homework or simply when the speech therapist isn't around. This can also help people in areas with a lot of poverty, such as developing countries, or in places where there is a shortage of such specialists.

So let's review some of the pros and cons of voice sampling.

Pros:

  • It makes communication easier.
  • It can replace text content.
  • It is good for education (for example, e-books).
  • It makes things easier for the visually impaired.
  • It can help people with speaking difficulties.
  • It can help people learn the pronunciation of a new language.

Cons:

  • It can be used to mislead people.
  • It is still not very reliable and can make errors.
  • When it makes an error, it may say something unwanted.

Those are the major pros and cons we could think of. Please let us know if you enjoyed reading this article so we can make more like it.
