From tapes to AI: navigating the evolution of technology in pronunciation teaching

By Ana Paula Biazon Rocha

Pronunciation teaching and learning have long benefited from technology. From the very first cassette players and tapes for drilling and repetition in the 1960s to the current and emerging applications of AI (Artificial Intelligence), technological advances have brought unprecedented changes and innovations. These developments have significantly impacted learners’ training in the perception and production of core pronunciation features, namely segmentals and suprasegmentals, and the subsequent automatization of their L2 speech (Chun & Jiang, 2022; Levis & Rehman, 2022). In preparation for PronSIG’s online conference, ‘Will the AI revolution remove the need for pronunciation teachers?’, on 12 Oct 2024 (Sat), this post will analyse some key points about the role of technology in pronunciation teaching and learning, discuss things to consider when selecting technology tools from both teacher and learner perspectives, and help you get ready for our conference (register here).

1. Some key points about technology and pronunciation

A. ‘Pronunciation, like all aspects of spoken language, is usually ephemeral’ (Levis & Rehman, 2022, p. 296). This means that the way someone speaks cannot be reproduced exactly, sound by sound. Thus, the possibility of recording speech to play it back later allows teachers and learners to have concrete pronunciation examples to analyse and work with. Coursebook CDs or online audio tracks, for instance, have been pivotal in language learning and teaching, offering valuable aural input and practice. In addition, with mobile phones and computers, teachers and students can easily create their own pronunciation recordings, making pronunciation practice more accessible (here’s a free online recorder that I regularly use and recommend to my learners).

B. Unlike written text, speech cannot be seen with the naked eye. In other words, we cannot see the airflow from the lungs, the movement of the vocal cords, or the precise positioning of the tongue during sound production. Websites such as Seeing Speech (University of Glasgow and Queen Margaret University, Edinburgh) and Tools for Clear Speech (Baruch College, New York), to name just two, help us better understand how sounds are articulated. Software such as Praat and Audacity also helps make abstract phonetic concepts such as pitch movement visible through spectrograms, which can greatly aid in teaching stress and intonation patterns.

C. Technology use in pronunciation teaching promotes learner autonomy. With the resources mentioned earlier, students can engage in self-directed practice anytime and anywhere. Of course, they need to be shown how to use those tools effectively, but once they get the hang of it, they can extend their pronunciation learning independently. An interesting example of this is automatic speech recognition (ASR), which converts speech into text. Whether dictating a message on their mobile phone or dictating into a Word document or Google Doc, learners can test and refine their pronunciation and receive immediate feedback. If the ASR system fails to recognise their speech or produces unintended text, learners can reflect on what aspects of their pronunciation might need adjustment.
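For readers who like to tinker, the comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the learner has pasted in the text their dictation tool produced, the function names (`words`, `find_mismatches`) are hypothetical, and the sample sentences are invented. It uses only Python's standard library and does not call any ASR service.

```python
import difflib
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def find_mismatches(target, transcript):
    """Return (expected, heard) pairs where the transcript differs from the target."""
    expected = words(target)
    heard = words(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join the differing stretch of words on each side for display.
            mismatches.append((" ".join(expected[i1:i2]), " ".join(heard[j1:j2])))
    return mismatches


# Invented example: what the learner meant to say vs. what the tool typed.
target = "I think the ship is leaving at three"
transcript = "I sink the sheep is leaving at tree"

for expected, heard in find_mismatches(target, transcript):
    print(f"expected: {expected!r}  heard: {heard!r}")
```

A list like this only shows where the transcript diverged from the intended sentence. It cannot say whether the cause was the learner's pronunciation or the ASR system itself, so it is best treated as a prompt for reflection rather than a verdict.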

D. The internet has made authentic speech and diverse English varieties widely accessible. As Walker and Archer (2024, p. 53) note, ‘students can hear English spoken by hundreds of competent speakers in multiple different accents’. However, one of the main challenges for pronunciation technologies is shifting the focus from native-like models to intelligible and comprehensible ones. To put it simply, using the example above, an ASR system might fail to recognise a learner’s pronunciation because it has not been trained on different varieties of English, regardless of how clear the learner’s speech is. Therefore, it is crucial to raise students’ awareness of such limitations to ensure they do not misinterpret feedback from these systems.

2. So many choices: where to start?

With numerous technological resources available, it can be overwhelming for teachers to decide which tools to integrate into their lessons and how to use them effectively (Yoshida, 2018). Teachers should also guide students in identifying effective tools to support their independent practice (Walker & Archer, 2024). To navigate these options, it is essential to consider several factors when selecting tools for classroom use or recommending apps for students to practise pronunciation:

Table 1: Factors for teachers and learners to consider when selecting technology tools

Another crucial point is that different teaching contexts have different levels of access to technology tools. So far, we have discussed using mobile phones, computers, and the internet to promote pronunciation teaching and learning. But what about contexts where technology and resources are more limited? In these cases, alternative tools – or even teachers themselves – can serve as the ‘technology’ used to facilitate learning (you can read more about it in this previous blog post).

3. Pronunciation and AI

AI has recently been transforming the language teaching and learning landscape. Its rapid and widespread development can be daunting for many teachers, as we are still unsure about its long-term effects and implications. However, pretending that it does not exist or that our learners do not use it is not advisable. At the forefront of this discussion, PronSIG’s October online conference, ‘Will the AI revolution remove the need for pronunciation teachers?’, will help us reflect on this new reality.

The programme features various sessions focusing on pronunciation teaching and AI, including the opening plenary, ‘What Would It Take for AI to Replace Pronunciation Teachers?’, and the closing plenary, ‘Surveying the Gap Between CAPT and STT Programs: Can AI Chatbots Fill the Void?’. These sessions will be led by two experts in pronunciation and technology: Beata Walesiak and Shannon McCrocklin, respectively. Most of the sessions will also be run by women, highlighting PronSIG’s commitment to inclusion and diversity – a noteworthy aspect that aligns with our values. This will be an unmissable conference (register now)!

You can find out more about teaching pronunciation using technology by checking this previous blog post, or by reading chapter 6 of Walker and Archer (2024, pp. 53–60).

You can also explore other blog posts to expand your knowledge of pronunciation instruction. Don’t forget to leave your comments below and follow PronSIG on social media!

References

Chun, D. & Jiang, J. (2022). Using technology to explore L2 pronunciation. In J. Levis, T. Derwing & S. Sonsaat-Hegelheimer (Eds.), Second Language Pronunciation: Bridging the Gap Between Research and Teaching (129–150). Wiley Blackwell.

Levis, J. & Rehman, I. (2022). Pronunciation and technology. In E. Hinkel (Ed.), Handbook of Practical Second Language Teaching and Learning (296–311). Routledge.

Walker, R. & Archer, G. (2024). Teaching English Pronunciation for a Global World. Oxford University Press.

Yoshida, M. T. (2018). Choosing technology tools to meet pronunciation teaching and learning goals. The CATESOL Journal, 30(1), 195–212. Retrieved from https://files.eric.ed.gov/fulltext/EJ1174226.pdf
