The Word Brain (2)

The conceptual framework of Italian with Elisa is published as a free 76-page PDF. The two key chapters are Chapter 1: Words and Chapter 2: Listening (see below).

The full edition of The Word Brain is available in English, Spanish and Serbian. Short editions are available in German, Portuguese, Farsi and Italian.

 Chapter 2: Listening

Have you recently listened to people speaking unfamiliar languages? If you haven’t, turn on your radio or TV set, select a station from another country, and within minutes you will hit a broadcast with loquacious individuals talking all the time. Alternatively, if you live in a metropolis, go down onto the streets and spot groups of animated people speaking foreign languages. Listen attentively. You will soon notice that humans produce continuous streams of uninterrupted speech. The overall impression? Phonological porridge, polenta, bouillie. For the non-initiated listener, it is hard to grasp that there is much structure to such seemingly random proliferation of sound. The reality is different, of course. Any single language you come across on Earth is as differentiated, distinguished, beautiful, and funny as your native language. Impenetrable as foreign languages appear to be, on the scale of a human lifetime, they are just around the corner – give them two or three years, and any of them is yours. It is a refreshing thought that all humans are brothers and sisters in language.

A porridge-like sense of unintelligibility prevails even after years of language classes at school. You are able to decipher a restaurant menu and order a dish of spaghetti, but comprehension vanishes as soon as the waiter starts talking. The same happens with bakers, taxi drivers, and hotel employees – again polenta and pea soup. It seems as if years of classes studying grammar and learning long lists of vocabulary produce little or no effect. You can read Goethe, Shakespeare, Sartre, Cervantes, or Dante, and yet you don’t understand their descendants. Many of us conclude that we are inept at learning other languages and never try again.

The apparent ease with which humans learn their native language during the first years of life is intriguing. Not only do young children readily soak up any of the thousands of possible human languages, but they also learn to understand a huge variety of radically different pronunciations – mum and dad, the neighbours, the fisherman at the street corner, people speaking other dialects, stuttering infants, and toothless grandparents. To date, there is no machine capable of this level of speech recognition.

How do young children outperform the most sophisticated machines? How do they structure linguistic input into meaningful units so rapidly? To answer these questions, look at how you spent the first 6 months of your life. As a physiological preterm primate, your interactions with the world were pretty limited – eating, digesting, looking, and listening. With such a limited repertoire of actions, every single action necessarily received an immense share of your attention. Once digestion was settled, you mutated into an ear-and-eye monster, capturing shapes and movements around you and soaking in every single sound you heard. You didn’t lose a minute setting about the most important task of your life: putting structure into the sound produced by the people who inhabited your life. The first hurdle was determining the word boundaries within the language of your ancestors. Where do single words begin; where do they end?

As you see from Figure 2.1, the sound wave per se does not confer information about the boundaries between single words. To show the magnitude of the task you face in a new language, try to delimit the word boundaries:



Figure 2.1: Sound wave pattern of ‘Putting structure into the porridge of sound produced by the people who inhabited your life.’



Delimiting word boundaries in a speech stream is no easier than trying to determine them in the previous paragraph. So how do young infants crack the sound code? They perform frequency analyses. Take for example the sound sequence ‘What a pretty baby you are’. Through continuous exposure to human language – babbling humans produce 10,000 words and more in a single hour! – infants progressively understand that syllables which are part of the same word tend to follow one another predictably (pret-ty, ba-by), whereas syllables that follow one another less frequently mark word boundaries (a#pret, ty#ba).[i]
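The frequency analysis can be illustrated with a toy sketch. It rests on assumptions that are ours, not the book's: ‘following one another predictably’ is modelled as a transitional probability (how often syllable B follows syllable A, divided by how often A is followed by anything), and a boundary is placed wherever that probability falls below a threshold. The six-utterance corpus, the hand-made syllable splits, the threshold value, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus: each utterance is a list of hand-split syllables.
UTTERANCES = [
    ["what", "a", "pret", "ty", "ba", "by", "you", "are"],
    ["what", "is", "a", "ba", "by"],
    ["you", "pret", "ty", "dog"],
    ["a", "ba", "by", "is", "pret", "ty"],
    ["the", "dog", "you", "are"],
    ["are", "you", "a", "ba", "by"],
]


def transitional_probabilities(utterances):
    """Return P(next syllable | current syllable) for every adjacent pair."""
    pair_counts = Counter()
    first_counts = Counter()
    for syllables in utterances:
        for current, following in zip(syllables, syllables[1:]):
            pair_counts[(current, following)] += 1
            first_counts[current] += 1
    return {
        pair: count / first_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def segment(syllables, probs, threshold=0.9):
    """Join syllables into words, cutting where predictability drops.

    The threshold of 0.9 is an arbitrary choice for this toy corpus.
    """
    words = []
    current_word = [syllables[0]]
    for previous, following in zip(syllables, syllables[1:]):
        if probs.get((previous, following), 0.0) < threshold:
            # Unpredictable transition: treat it as a word boundary.
            words.append("".join(current_word))
            current_word = []
        current_word.append(following)
    words.append("".join(current_word))
    return words


probs = transitional_probabilities(UTTERANCES)
print(segment(UTTERANCES[0], probs))
# prints ['what', 'a', 'pretty', 'baby', 'you', 'are']
```

In this corpus ‘pret’ is always followed by ‘ty’ and ‘ba’ by ‘by’ (probability 1.0), while ‘a’ is followed by ‘pret’ only once in four occurrences (0.25) and ‘ty’ by ‘ba’ once in two (0.5), so the cuts fall at a#pret and ty#ba. With so little data, rare sequences such as ‘the dog’ also look perfectly predictable and would be fused into one word, which hints at why the real task needs so much exposure.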

This type of frequency analysis depends on a well-functioning memory that accumulates an ever-growing number of words and, of course, on extensive training. The problem is speed. As human speech can produce three or more words per second, there is little time for either childish astonishment or for adult considerations such as ‘What exactly does that word mean?’, ‘Is the verb in the present or past tense?’, ‘What the hell is that grammatical structure?’, etc. At full speed, speech is unforgiving – a single instant of indecision makes you stumble, and by the time you are back on your feet, the sentence is gone. Speech comprehension is therefore a triple challenge: slicing human speech into digestible units, endowing them with meaning by matching the segments with thousands of existing words stored in your brain dictionary, and, finally, doing all this without giving it a second thought. Fortunately, our word brain is genetically programmed to do these mental acrobatics, and as you have already done it once – when you learned your native language – you can do it again with other languages as often as you want. To see what it looks like when your auditory brain cortex works at full speed, put your brain into a PET scanner (Figure 2.2).[ii]



Figure 2.2: Listening to words: High activity in the auditory brain cortex. Adapted from Raichle, 1988.[iii] Used with permission.


Thorough training is paramount. In my experience, it took around 1,500 to 2,000 hours of intense listening to achieve ‘semi-perfect sequencing abilities’, both in French and Italian. Amazingly, the results were similar for Arabic, a language totally different from everything I had learned before. This seems counterintuitive because in Arabic, I needed to learn at least three times as many words as in Italian, and it raises a couple of questions: Could the time of exposure that is needed to achieve full sequencing abilities (1,500 hours would translate into roughly 6, 4, and 2 hours per day over a period of 9, 12, and 24 months, respectively) be a human constant? Should our speech recognition abilities be independent of the type of language we learn? Perhaps even relatively immune to the effect of ageing? And are young children truly superior to adults in word segmenting, or do they simply dedicate more time to listening than adults? Some of these questions will be answered by future research, but I am inclined to accept that there is a physiological threshold for human brains to get wired to the ability of dissecting the sounds of new languages. You would need a minimum of time to perform this task, but you wouldn’t need much longer than that.
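The daily-hours figures can be checked with a few lines of arithmetic. This is only a sketch: the average month length of 30.4 days is our assumption, and the function names are illustrative.

```python
DAYS_PER_MONTH = 30.4  # assumption: average month length (about 365 / 12)


def total_hours(hours_per_day, months):
    """Listening hours accumulated at a steady daily rate."""
    return hours_per_day * months * DAYS_PER_MONTH


def months_needed(target_hours, hours_per_day):
    """Months required to reach a target at a steady daily rate."""
    return target_hours / (hours_per_day * DAYS_PER_MONTH)


# The three schedules mentioned in the text.
for hours_per_day, months in [(6, 9), (4, 12), (2, 24)]:
    hours = total_hours(hours_per_day, months)
    print(f"{hours_per_day} h/day for {months} months: about {hours:.0f} hours")

# How long 1,500 hours takes at one hour per day.
print(f"1 h/day: about {months_needed(1500, 1):.0f} months")
```

Under this assumption the three schedules come to roughly 1,640, 1,460, and 1,460 hours, so each lands near the 1,500-hour mark, while a single hour per day would stretch the same exposure over about four years.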

You are now able to solve the close-to-zero-understanding-after-years-of-school problem that we raised at the beginning of this chapter. If teenagers are frustrated when they put their school knowledge into practice, it is because school teaching is insufficient to get you anywhere near the 1,500-hour exposure minimum. Even if your teachers teach exclusively in the foreign language, you will rarely total more than 500 hours of attentive listening in a typical 5-year course. Thus, you discover that your teachers were innocent – they simply did not have enough time to get you through your speech segmentation task.

So, if private and public schools are not in a position to provide us with sufficient exposure to human speech, where can we go to get it? The best school, of course, is life. Emigrate, either permanently or for just one study year, and take a linguistic bath in a new language environment. The younger you are, the more flexible your brain, and the easier it will be to find yourself in groups of people who never stop talking. Add an intense love affair, and your daily listening quota of 8, 10, or even 12 hours will soon be a reality. Within a year, you are a perfect speech segmenter.

If you choose to stay at home, you will need speech surrogates. With a workload of 500 to 1,500 hours from the previous chapter, you may find it demanding to accommodate another 1,500 hours of training in your time schedule. You are lucky. As listening can easily be done in parallel to other activities – commuting, doing sport, cooking, etc. – you will manage to dissolve the bulk of your speech recognition programme in daily life (like a murderer who dissolves a corpse in an acid bath!). Thereafter, you just have to change your TV habits (more about that below), and the true extra study time can be reduced to around 100 hours. Just remember these two important pieces of advice: 1) During the first year of your training, never read a text without hearing the sound. 2) Only listen to audio sources if you have the corresponding text at hand.

The immediate consequence is that your first language manual must come with a CD-ROM (CD). During the 100 hours of extra study just mentioned, listen to the CD. As expected, even with the text in front of your eyes, comprehension of the audio files is not always immediate. In these cases, take single sentences or even single words, put them in an audio loop and listen to them 5, 10, or 15 times. Some audio devices come with a convenient button to define the beginning and the end of the loop. This sledgehammer method cracks every sentence within minutes. More importantly, don’t feel uncomfortable if you listen to a language CD for the 54th time. This is anything but dishonourable, and after all, you did exactly that with your favourite music when you were young.

A bout of insomnia, too, is an excellent moment for donning your earphones. Some people will discover that the incomprehensible sounds lull them to sleep. Finally, don’t be afraid of unconventional behaviour. If you are used to having a siesta, put your earphones on and activate the loop mode. It is certainly impossible to learn words during sleep, but the sound and music of the new language will enter your brain all the same.

Once you have digested your first (and maybe second) language manual, you will discover that the Internet offers extraordinary tools for second-language acquisition: audio files plus transcripts! Scientists will appreciate the excellent transcripts of the Nature podcasts published before July 2014.

The final surrogate for speech in real life is TV. Apart from high-quality documentaries, which are rare, TV is a poor source of content, and most of us would prefer reading books or scientific journals. TV is also mostly irrelevant. Suicide attacks in remote countries; minor earthquakes, tsunamis, or volcanic eruptions; old, helpless people murdered by drug-intoxicated gangs of youths; drug-intoxicated gangs of youths slain by paramilitary troops; paramilitary groups killed in an ambush by guerrilleros, etc. – all this has little or no impact on your personal and professional life, and watching TV is basically tantamount to killing precious time. Imperfect though TV may be, some broadcasts, for example TV news programmes, nonetheless have the makings of outstanding speech trainers. The journalists talk continuously, there is no background music to spoil the sound of the speech, the language is standardised with only a few slang words, and the images provide you with important clues for understanding what’s going on. In addition, TV news provides all the ingredients of a classical soap opera: the players (politicians) and the content (political crises) are well known, and you already know half of the story, and even if you don’t, it really doesn’t matter.

My advice: Stop watching TV in your native language and start watching TV in your future language. The TV genres that serve your purpose best are the news and documentaries if you wish to become familiar with the language of the media and the language of science, and soap operas if you are interested in more colloquial language. Watch your new TV programme for 15 to 60 minutes every day, starting on the very first day that you begin studying another language. Persist, even if you don’t understand a single word. Remember: it is all about word boundaries, so try to discover your first words. As you will see later, identifying these boundaries is partly independent of knowing the meaning of the words.


Let us summarise:

  • Human speech is a continuous sound stream. To understand the meaning, your built-in speech-recognition system cuts human speech into single words, matches them with your vast brain dictionary, and does all this more or less unconsciously at a rate of three words per second.
  • To ensure extensive exposure to human speech, emigrate or find surrogates for real life: 1) Language manuals + CDs; 2) Internet audio sources + transcripts; 3) TV.
  • If you cannot emigrate, dissolve your training into your daily life by listening to audio files during cooking, commuting, doing sport, etc. Change your TV habits and watch TV exclusively in your new language. Use earphones for enhanced comprehension.
  • Unless you emigrate, speech recognition training is as lonely a task as word learning. No one can do the job for you. Again, teachers are of almost no help (see also the Teachers chapter below).
  • During the first year of your training, never read a text without hearing the sound; and listen to audio sources only if you have the corresponding text at hand.
  • If you are an insomniac, plug your earphones in and listen to your audio material.
  • Allow 15 to 60 minutes for speech recognition every day.


Week after week, the sound pattern of words will flow into your brain. Again, your brain will be acting as a huge sponge, as cracking the code to a human language is not a reserved hunting ground for infants and young children. With time, as comprehension sets in, British porridge slowly mutates into French cuisine. So far, so good, you might think, but you will have noticed something rather curious. You have been told to learn 5,000 to 15,000 words and complete a 1,500-hour speech recognition course, but nobody has asked you to say a single word. Legitimately, you wonder if you will one day be authorised to pronounce some of the words you have learned and to communicate your precious thoughts to other people.

There are good reasons to restrain your desire to communicate. As you are a virgin – linguistically speaking – you might prefer to stay that way for a while. If you accept patience, my favourite prescription is a monastic ‘3-month silence’. Remember: you are not at school, there are no exams on the horizon, and you may therefore take a comfortable route when starting your new language. Concentrate on absorbing words, sounds and sentences, and, day after day, let the sound of the new language slowly sink in. Of course, you are too old for an exclusive baby approach to language learning, but for now, listen passively as young children do. Good pronunciation comes as a bonus of patient and attentive listening. So before you open your mouth, see in the next chapter what your eyes can do.


 Workload after Chapters 1–2

Speech-recognition training, typically 1,500 hours and more, can mostly be integrated into daily activities. Only about 100 hours of extra study time are needed while you become familiar with one or two language manuals. Added to the workload defined in the previous chapter, your total workload is now

600 to 1,600 hours


Chapter 3

Please find Chapter 3 of The Word Brain at