Active recall with AWS Polly

September 17, 2022 | javascript | David Poza Suárez

Have you ever heard of active recall or meaningful learning? I'm not an expert on the subject, but what matters here is that they are effective memorization methods, backed by science, and I've prepared a simple script that creates audio files so you can apply these techniques to learning English.

I came up with this idea while I was studying phrasal verbs. It is intended to partially mimic the flashcard mechanism of Anki, but in a simpler, unattended form. When you are given a cue (a sentence in Spanish or a question, for instance) and you have to think of the English version (the answer), you are using active recall, because you are making an effort to remember the translation rather than simply seeing both "question" and "answer" at the same time. You are also applying meaningful learning, because you're putting the word in context with a real example sentence.

How does it work?

First, I created a set of example sentences in English containing all the mandatory phrasal verbs for the First Certificate. You can find such examples in regular dictionaries or in context engines like Linguee or Reverso Context.

Secondly, the list has to be provided as a file in CSV (comma-separated values) format. You can use LibreOffice or MS Excel to create it.
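As a rough illustration of this input, here is a minimal sketch of reading such a file in JavaScript. The column layout (English sentence first, optional Spanish translation second), the sample sentences and the `parseCsv` helper are all assumptions made for this example; the actual script may well rely on a CSV library instead.

```javascript
// Minimal CSV parser sketch: handles quoted fields, escaped quotes ("")
// and commas inside quotes. Hypothetical helper, not the script's own code.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++; // treat CRLF as one break
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else {
      field += ch;
    }
  }
  // Flush the last line when the file has no trailing newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  // Drop empty lines.
  return rows.filter((r) => r.some((cell) => cell.trim() !== ""));
}

// Made-up sample: the second sentence contains a comma, so it is quoted.
const sample = [
  "I ran into an old friend yesterday.",
  '"If you don\'t know the word, look it up."',
].join("\n");

const sentences = parseCsv(sample).map(([english, spanish = ""]) => ({ english, spanish }));
console.log(sentences);
```

Sentences that contain commas must be wrapped in double quotes, which is what LibreOffice and Excel do when exporting to CSV.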

Next, the script parses the CSV file and translates every sentence with DeepL, which I consider the best translator out there. It is AI-based, which is why its translations are so reliable.
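The translation step could look roughly like the sketch below, which calls the DeepL REST API directly. It assumes the free-plan endpoint, Node 18 or later (for the global `fetch`), and a hypothetical `DEEPL_AUTH_KEY` environment variable; the function names are made up for this example and the script itself may use a client library instead.

```javascript
// Sketch of an English -> Spanish translation request against the DeepL REST API.
// Assumptions: free-plan endpoint, Node 18+ (global fetch), and a hypothetical
// DEEPL_AUTH_KEY environment variable.
const DEEPL_URL = "https://api-free.deepl.com/v2/translate";

// Pure helper: builds the fetch options for a batch of sentences.
function buildTranslateRequest(sentences, authKey) {
  return {
    method: "POST",
    headers: {
      Authorization: `DeepL-Auth-Key ${authKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text: sentences, // array of strings, translated in one call
      source_lang: "EN",
      target_lang: "ES",
    }),
  };
}

async function translateAll(sentences, authKey = process.env.DEEPL_AUTH_KEY) {
  const response = await fetch(DEEPL_URL, buildTranslateRequest(sentences, authKey));
  if (!response.ok) {
    throw new Error(`DeepL request failed with status ${response.status}`);
  }
  const { translations } = await response.json();
  // DeepL returns one translation per input text, in the same order.
  return translations.map((t) => t.text);
}
```

A call such as `translateAll(["I ran into an old friend yesterday."])` resolves to an array with the Spanish versions, which can then be stored next to the English column.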

Once it has both "columns", the English one and the Spanish one, it is time to synthesize the speech for them. To achieve this, I'm using AWS Polly with its new neural voices, which sound really natural.
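For reference, a synthesis call with the AWS SDK for JavaScript v3 might look like the following sketch. It assumes the `@aws-sdk/client-polly` package is installed and that credentials are picked up from the environment; the helper names and the output path are hypothetical, and the script may organize this differently.

```javascript
// Sketch of speech synthesis with AWS Polly neural voices (AWS SDK for JavaScript v3).
// Assumptions: @aws-sdk/client-polly is installed, credentials come from the
// environment, and the function names are made up for this example.
const fs = require("node:fs/promises");

// Pure helper: parameters for one SynthesizeSpeech call.
function buildSpeechParams(text, voiceId) {
  return {
    Engine: "neural", // request a neural voice instead of a standard one
    OutputFormat: "mp3",
    Text: text,
    VoiceId: voiceId,
  };
}

async function synthesizeToFile(text, voiceId, outputPath, region = "eu-west-2") {
  // Required lazily so the helper above can be used without the SDK installed.
  const { PollyClient, SynthesizeSpeechCommand } = require("@aws-sdk/client-polly");
  const client = new PollyClient({ region });
  const { AudioStream } = await client.send(
    new SynthesizeSpeechCommand(buildSpeechParams(text, voiceId))
  );
  // In Node.js the SDK exposes the audio as a stream with a byte-array helper.
  const bytes = await AudioStream.transformToByteArray();
  await fs.writeFile(outputPath, bytes);
  return outputPath;
}
```

For example, `synthesizeToFile("I ran into an old friend yesterday.", "Matthew", "answer.mp3")` would write the English answer, and a second call with a Spanish voice of your choice would produce the cue.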

Finally, the script gives you the option to concatenate all the files into one, with silences in between, so you can copy it to your mobile device and practise conveniently during your commute.
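One way to do this kind of concatenation is ffmpeg's concat demuxer, sketched below. This is only an illustration under assumptions: ffmpeg is on the PATH, a silence clip already exists, every clip shares the same codec and sample rate (otherwise re-encoding is needed instead of `-c copy`), and all file names are hypothetical.

```javascript
// Sketch of the concatenation step using ffmpeg's concat demuxer.
// Assumptions: ffmpeg is on the PATH, every clip (including the silence clip)
// shares the same codec and sample rate, and all file names are hypothetical.
const { execFileSync } = require("node:child_process");
const fs = require("node:fs");

// Pure helper: cue, pause to think, then the answer repeated `repetitions` times.
function buildPlaylist(pairs, silenceFile, repetitions = 1) {
  const files = [];
  for (const { cue, answer } of pairs) {
    files.push(cue, silenceFile);
    for (let i = 0; i < repetitions; i++) {
      files.push(answer, silenceFile);
    }
  }
  return files;
}

// The concat demuxer reads a text file with one "file '<path>'" line per clip.
function toConcatList(files) {
  return files.map((f) => `file '${f.replace(/'/g, "'\\''")}'`).join("\n") + "\n";
}

function concatenate(pairs, silenceFile, outputFile, repetitions = 1) {
  const listPath = "playlist.txt";
  fs.writeFileSync(listPath, toConcatList(buildPlaylist(pairs, silenceFile, repetitions)));
  // -c copy joins the clips without re-encoding them.
  execFileSync("ffmpeg", [
    "-y", "-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", outputFile,
  ]);
}
```

The silence after each cue is what gives you time to recall the answer before hearing it, so its length is worth tuning to your own pace.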

How can I use it?

Step one is cloning the repo and installing all the dependencies with npm i. Ah! It also requires the ffmpeg libraries; if you are on a Debian-based distro this is straightforward, since there is a package with the same name.

The next step is filling in the .env file with your credentials for the DeepL API (it has a free plan) and for AWS. The cost of Polly is acceptable: it cost me about $0.25 for the whole list of phrasal verb sentences. Bear in mind that neural voices are not available in every region; I'm using eu-west-2 (London).

The environment variables file should look like this:
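The exact variable names are defined by the script itself. Purely as an illustration, the sketch below checks for the kind of values involved: `DEEPL_AUTH_KEY` is a hypothetical name, while the `AWS_*` names are the standard variables read by the AWS SDK. The script's real .env file may use different names.

```javascript
// Sketch of reading the credentials from the environment.
// DEEPL_AUTH_KEY is a hypothetical name; the AWS_* names are the standard
// variables read by the AWS SDK. The script's real .env may use other names.
const REQUIRED_VARIABLES = [
  "DEEPL_AUTH_KEY",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "AWS_REGION", // for example eu-west-2 (London)
];

function loadConfig(env = process.env) {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    deeplAuthKey: env.DEEPL_AUTH_KEY,
    awsRegion: env.AWS_REGION,
  };
}
```

Failing early with the list of missing names is friendlier than letting the first API call fail with an authentication error.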


You can now launch the script from the command line. It accepts several arguments:

  • concat: merges all the resulting audio files into a single one.
  • voice-id: accepts any AWS Polly voice identifier, for example "Matthew".
  • repetitions: expects a number, which is how many times you want to listen to the solution. This is useful if you find it hard to understand spoken English at normal speed.
  • random-order: plays the sentences in a random order instead of the one specified in the CSV file. If you grouped the sentences by meaning, that grouping may be an unwanted hint at the answer.
  • invert-columns: if you prefer working the other way round, listening in English and guessing the answer in Spanish, you don't have to modify the whole file; just add this parameter.
  • disable-translation: since some people may prefer to translate on their own, this option disables DeepL and takes the sentences from the second column of the file.
  • disable-synthesizer: conversely, if you are just testing the translation, you can disable the AWS Polly step.
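To make the options above more concrete, here is a hedged sketch of how they could be parsed and applied with Node's built-in `util.parseArgs` (Node 18.3 or later). The `--flag` syntax, the default values and the helper names are assumptions for this example, not necessarily what the script does.

```javascript
// Sketch of parsing the options above with Node's built-in util.parseArgs.
// The "--flag" syntax, the defaults and the helper names are assumptions.
const { parseArgs } = require("node:util");

function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      concat: { type: "boolean" },
      "voice-id": { type: "string" },
      repetitions: { type: "string" },
      "random-order": { type: "boolean" },
      "invert-columns": { type: "boolean" },
      "disable-translation": { type: "boolean" },
      "disable-synthesizer": { type: "boolean" },
    },
  });
  const repetitions = Number.parseInt(values.repetitions ?? "1", 10);
  if (!Number.isInteger(repetitions) || repetitions < 1) {
    throw new Error("repetitions must be a positive integer");
  }
  return {
    concat: values.concat ?? false,
    voiceId: values["voice-id"] ?? "Matthew",
    repetitions,
    randomOrder: values["random-order"] ?? false,
    invertColumns: values["invert-columns"] ?? false,
    disableTranslation: values["disable-translation"] ?? false,
    disableSynthesizer: values["disable-synthesizer"] ?? false,
  };
}

// Fisher-Yates shuffle on a copy, used for random-order.
function shuffle(items, random = Math.random) {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Applies random-order and invert-columns to a list of { cue, answer } pairs.
function arrangePairs(pairs, options) {
  const ordered = options.randomOrder ? shuffle(pairs) : pairs;
  return options.invertColumns
    ? ordered.map(({ cue, answer }) => ({ cue: answer, answer: cue }))
    : ordered;
}
```

With this sketch, `parseCliOptions(process.argv.slice(2))` would turn an invocation ending in `--concat --repetitions 2 --random-order` into a plain options object that the rest of the pipeline can consume.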

Real example

As I said, I created an audio file with all the phrasal verbs for the FCE, starting from a list of sentences gathered from different sources. Here is the list, and here is the audio in video format:

What do you think of the result? Pretty impressive, huh? A far cry from the robot-like voices of Google Translate...

I believe this is the best way of using repetition: putting the vocabulary in context, so that you create many neural connections to the memorized word and consequently improve your chances of recalling it at the right moment during a conversation. It is also crucial to practise recall starting from the word in both Spanish and English.

See you on my next freak idea 😉