Technology changes over the last 10‑15 years have added a huge range of new tools that learners and teachers can use in their classrooms. These include tools not even designed with teaching in mind, such as Padlet, in addition to a range of edtech tools that have been adapted for English language learning, including apps such as Memrise, Kahoot and Quizlet. Furthermore, advances in Learning Management Systems and the rise of Google Education have enabled teachers and learners to use educational tools in a simplified way. Teachers can set homework, give feedback and generally keep on top of what their learners are doing across a range of different platforms. This has led to more learner autonomy, more opportunities to ‘flip’ traditional classroom models, and more variation in the ways in which learning takes place in and between lessons. Perhaps the most mould‑breaking use of these technologies is by online tutoring providers such as VIPKID and Speexx, who combine large amounts of online practice content with real‑time lessons delivered remotely by teachers who can be on the other side of the world. In addition to the rise of these apps and services, the last 15 years have seen a significant rise in self‑study language learning services, with Babbel and Duolingo among the most prominent. Local products such as Hello English in India and South Asia are also attracting large numbers of subscribers. However, self‑study apps remain largely the preserve of the occasional learner. For committed language learners, teachers, and often the classroom, remain a key part of the experience.
The changes outlined above have increased access to English‑language learning and have increased learner autonomy. However, it has been argued that the promise of the AI revolution to deliver ‘personalised learning pathways’ for learners is not really delivering in the context of English language learning (Kerr, 2019). Developing communicative language competence is a multi‑faceted and complex affair that relies on evidence from a complex series of outputs, many of which are not easily captured as data points. As such, the judgement of a teacher is still the most important ongoing measure of how learners are progressing and what they should do next. Assessment for Learning is an essentially human process.
The impact of technology on summative assessment is also felt in many ways. For decades, assessment organisations have used adaptive testing, and AI is making these approaches ever more refined. In addition, advances in technology have made tests easier to access through remote proctoring and the use of multiple devices. Finally, AI has had an impact on marking: automated marking of writing and speaking is now established in many, particularly lower‑stakes, assessments, bringing down the cost of assessment for users and allowing both greater reliability and quicker turnaround of marks. However, the fundamentals of Assessment of Learning also remain the same: the proficiency test, used to create a generalisable score that learners and others can use to make judgements about a learner's place on a scale, sometimes in a particular context.
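The core idea of adaptive testing mentioned above can be illustrated with a minimal sketch. This is not any real provider's algorithm; the item bank, difficulty scale and fixed step size are all invented for illustration. After each answer, the ability estimate moves up or down and the next question chosen is the unused one whose difficulty sits closest to the new estimate.

```python
def next_item(items, ability, used):
    """Pick the unused item whose difficulty best matches the current estimate."""
    candidates = [i for i in items if i["id"] not in used]
    return min(candidates, key=lambda i: abs(i["difficulty"] - ability))

def update_ability(ability, correct, step=0.5):
    """Nudge the ability estimate toward the learner's demonstrated level."""
    return ability + step if correct else ability - step

# Hypothetical item bank: difficulties on a logit-style scale.
bank = [{"id": n, "difficulty": d}
        for n, d in enumerate([-2.0, -1.0, 0.0, 1.0, 2.0])]

ability, used = 0.0, set()
for answer in [True, True, False]:   # simulated learner responses
    item = next_item(bank, ability, used)
    used.add(item["id"])
    ability = update_ability(ability, answer)
```

Real adaptive tests use far more sophisticated item-response models, but the loop structure (estimate, select, observe, re-estimate) is the same.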
Jones and Saville, in their conceptualisation of Learning Oriented Assessment, describe the difference between Assessment for Learning and Assessment of Learning as follows:
Assessment for Learning, on the right, is primarily a matter of observation and interpretation by the teacher, which she will then use to refine her strategy, give feedback to learners, update class objectives and decide what to do next. Teachers are experts in this form of assessment, which provides the horizontal view of where learners are. Teachers have a detailed picture of strengths, weaknesses, learning preferences, motivation and much, much more that Assessment of Learning, usually standards‑based summative assessment, cannot provide.
Assessment of Learning can instead be used to validate learning against broader standards such as the CEFR. If the summative assessment conforms to principles of validity, reliability and impact, while being practical (for VRIP see Principles of Good Practice), that test is also learning oriented because success in the test represents successful language learning as defined by the construct on which the test is based.
The two complementary approaches, using teacher judgement for breadth and proficiency testing to measure against standards, are effective when the syllabus (the goals of learning) is common to both. However, there are difficulties to which technology can provide solutions.
On the right‑hand side of the above diagram, teachers are often keeping disparate records of large numbers of learners. Teacher workload is an enormous issue all over the world, and a major cause of heavy workload is the time taken to mark students’ work, collate data from multiple sources and plan lessons that meet the disparate needs of their classes of students.
In terms of marking, tools such as Cambridge Write and Improve and Grammarly can automatically mark learners' writing, picking up on surface‑level errors and encouraging learners to self‑correct before handing their work to the teacher for a final check, focussing on discourse‑level features and the effect on the target reader. In addition to saving teacher time, this also encourages learner autonomy, as learners can correct their work immediately and many times without having to wait for it to be marked by a human.
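The surface‑level end of this workflow can be sketched very simply. The following toy checker bears no resemblance to the machine‑learning models behind tools like Write and Improve; the rules and hints are invented. It shows only the shape of the idea: mechanical patterns are flagged with a hint so the learner can self‑correct before the teacher looks at discourse‑level features.

```python
import re

# Invented rules: each maps a surface-error pattern to a hint for the learner.
RULES = [
    (re.compile(r"\b(a)\s+([aeiou]\w*)", re.I), "use 'an' before a vowel sound"),
    (re.compile(r"\bmore\s+\w+er\b", re.I), "double comparative"),
]

def flag(text):
    """Return (matched text, hint) pairs for the learner to review."""
    return [(m.group(0), hint) for rx, hint in RULES for m in rx.finditer(text)]

print(flag("It is a easy and more better way."))
```

Pattern matching like this catches only the most mechanical errors; judging coherence, register and effect on the reader remains with the teacher, which is exactly the division of labour described above.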
But more might be possible. In knowledge‑based subjects such as maths and the sciences, there are personalised learning tools (Isaac Physics, Century Tech) that provide personalised learning to students and pass performance data to teachers. They also recommend areas for the teacher to focus on, based on the strengths and weaknesses of the whole class. The fact that this does not exist to the same extent for English demonstrates the difficulty of applying these approaches to a subject whose aim is to teach communicative language competence.
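The class‑level recommendation step these tools perform can be sketched in a few lines. The learner names, skill labels and scores here are invented sample data, and real tools would weight far more evidence; the sketch simply averages each skill across the class and surfaces the weakest areas as candidate lesson focuses.

```python
from statistics import mean

# Invented sample data: proportion correct per skill area, per learner.
class_results = {
    "Ana":  {"past tenses": 0.9, "articles": 0.4, "linking words": 0.7},
    "Ben":  {"past tenses": 0.8, "articles": 0.5, "linking words": 0.6},
    "Caro": {"past tenses": 0.7, "articles": 0.3, "linking words": 0.8},
}

def focus_areas(results, threshold=0.6):
    """Return skills whose class average falls below the threshold, weakest first."""
    skills = results[next(iter(results))].keys()
    averages = {s: mean(r[s] for r in results.values()) for s in skills}
    return sorted((s for s, a in averages.items() if a < threshold),
                  key=lambda s: averages[s])

print(focus_areas(class_results))
```

For knowledge‑based subjects the skill labels map cleanly onto curriculum topics; the difficulty for English, as the paragraph above notes, is that communicative performance does not decompose so neatly.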
At Cambridge, we are experimenting with various approaches across Cambridge Assessment, Cambridge University Press and departments of the university. To create successful personalised learning solutions, we need to overcome the following methodological constraints:
Proficiency testing does not by its nature test the details of what has been learned and how progress has been achieved. Everything from the design of tasks to the specification of tests is about eliciting a generalised view of performance. Also, although built around a socio‑linguistic construct, even good proficiency tests do not aim to overtly test the processes that lead to successful outcomes. However, this is crucial to what happens in classroom assessment. Teachers help their students with learning strategies, editing skills and explicit focus on the grammatical and lexical building blocks of English. We need to see how we can build tools that can burrow into narrow corners of the construct, to give better diagnostic information to learners and teachers. This is daunting, as all assessment tasks in language, no matter how narrow they are, draw on a range of knowledge and competencies. Different learners get questions wrong for different reasons, and we need to build up a sophisticated picture of learner behaviour across multiple tasks.
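The point that different learners get the same question wrong for different reasons can be made concrete with a small sketch. The error log and cause labels below are invented, and attributing a cause to a wrong answer is itself the hard research problem; the sketch only shows why evidence must be accumulated across many tasks before a diagnostic pattern emerges.

```python
from collections import Counter

# Invented log of wrong answers: (learner, task, hypothesised cause).
errors = [
    ("Dina", "gap-fill-1", "tense choice"),
    ("Dina", "gap-fill-2", "tense choice"),
    ("Dina", "reorder-1",  "word order"),
    ("Emil", "gap-fill-1", "unknown vocabulary"),
    ("Emil", "gap-fill-2", "tense choice"),
]

def profile(learner, log):
    """Tally hypothesised error causes for one learner across all tasks."""
    return Counter(cause for who, _, cause in log if who == learner)

print(profile("Dina", errors).most_common(1))
```

Dina and Emil both missed the same gap‑fill items, but only the accumulated profile suggests that tense choice is Dina's recurring difficulty while Emil's first error was lexical; no single item could show that.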
One of the most important motivating factors for learners is evidence that they are making progress. However, progress is a difficult thing to measure, especially in the small increments that learners want. Also, as discussed above, learner progress is not always immediately reflected in better performance in tasks. In addition to designing different tools, as discussed above, we also need to find ways of conceptualising what progress is and passing that information on to learners in a meaningful way. This is not primarily about tech tools such as gamification, although that helps; this is about finding ways to tell learners that they are getting better as a result of their studies.
Feedback must be specific and actionable. As we collect the data, we need to be sure that we are adding the value that learners and teachers want, and that they are able to use the advice they receive to change the focus of their study. To this end, telling learners that their listening skills would improve with ‘more listening’ is neither helpful nor useful.
We need to be clear that we are not replacing the role of the teacher, because this isn’t possible. As with tools such as Write and Improve, we can use technology to supplement teacher judgement, improve the level of data they can use and support them in the choices that they make. Performance data is not, and is unlikely ever to be, better than teacher judgement. It is simply different.
There is a long list of challenges, and we know we are at the beginning of the journey of addressing them. Concrete steps we have taken include working with colleagues at Cambridge University Press to build a joint curriculum framework that covers our assessments and coursebooks. We are using this curriculum to create a metadata framework, which we will use to tag every piece of content we create. This metadata, together with a unified data store, will give us the data we need to start building a more holistic picture of learner progress throughout the language learning journey. We can then use this information to design products that meet the needs outlined above. We will also collaborate with the computer science, engineering and computational linguistics departments at the University to learn how we can use machine learning tools to build AI‑led solutions.
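What such tagging makes possible can be sketched as follows. The content IDs, tag vocabulary and flat list structure are all invented for illustration and do not reflect the actual framework; the point is simply that once coursebook exercises and test items carry metadata from one shared curriculum, content can be linked across products.

```python
# Hypothetical curriculum tags: skill / CEFR level / learning objective.
content = [
    {"id": "CB-U3-EX2",  "source": "coursebook",
     "tags": {"writing", "B1", "linking-words"}},
    {"id": "TEST-W-017", "source": "assessment",
     "tags": {"writing", "B1", "linking-words"}},
    {"id": "CB-U5-EX1",  "source": "coursebook",
     "tags": {"reading", "B2", "inference"}},
]

def related(item_id, items):
    """Find content from any product sharing this item's curriculum tags."""
    target = next(i for i in items if i["id"] == item_id)
    return [i["id"] for i in items
            if i["id"] != item_id and i["tags"] == target["tags"]]

print(related("TEST-W-017", content))
```

A learner who struggles on the hypothetical test item here could be routed straight to the coursebook exercise covering the same objective, which is one small instance of the holistic picture of progress described above.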
We are working on diagnostic testing solutions and automated speaking and writing practice tools. We know there is a long way to go before a tech‑led transformation of English language teaching and learning. In order to get there, we believe we need a strong theoretical grounding in the principles of language learning and assessment, experience in turning that theoretical grounding into high‑quality products and, most importantly, an ability and willingness to rethink and reimagine how we can better help the learners and teachers of today and tomorrow.
Glyn Hughes is Cambridge Head of Assessment Quality and Validity. In his role, Glyn oversees all of the work that happens in Assessment. Glyn has recently been working on a number of projects that are looking at how assessment is changing due to changes in technology and how assessment can do even more to support learning in different ways in the future. Glyn is also leading the efforts to improve our procedures for producing high-quality assessment material as quickly and efficiently as possible. In his time at Cambridge, Glyn has worked on a number of products and services. Before joining the organisation, Glyn gained extensive experience in language teaching, language school management and teacher training. Glyn holds a master's degree in Applied Linguistics from Aston University.