
Since Baker (1993) envisaged Corpus-based Translation Studies (CBTS), the past thirty years have produced a rich body of corpus-based research and ground-breaking advances in contrastive linguistics and translation studies. Recently, the field of corpus-based translation and interpreting studies has gradually expanded in methodology, theory, analysis and applications. It is within this context that the present volume, co-edited by Sylviane Granger (Professor of English Language and Linguistics, University of Louvain) and Marie-Aude Lefer (Associate Professor of Translation Studies and English-French translation, University of Louvain), offers a timely and in-depth survey of the latest developments and trends and outlines potential future directions for CBTS.

Besides an introduction and an index, this volume includes four main sections, entitled: Corpus-based Translation Studies: Current challenges and future perspectives (Part I), Recent methodological and theoretical developments in CBTS (Part II), Corpus-based empirical studies (Part III), and Corpus use in translator training (Part IV).

The volume opens with a section focusing on the trends, the challenges and the future of CBTS. One paper focuses on the current state of the field; the other expounds its gaps and challenges. The first paper, “Corpus-based translation and interpreting studies: A forward-looking review,” is contributed by Sylviane Granger and Marie-Aude Lefer. They carry out a thorough survey of corpus-based translation and interpreting studies, based on 186 recent articles (2012-19) written in English and published in 12 scientific journals. Automatic extraction and manual filtering of the data reveal an upward trend in this field (p. 20), with empirical studies accounting for two thirds of the articles while applied and methodological-theoretical ones lag far behind (p. 21). A detailed analysis shows that translation universals, namely explicitation, normalisation and simplification, are the most investigated topics. The studies surveyed rely heavily on parallel corpora and basic techniques such as frequency counts and concordances, while more advanced techniques, such as multivariate methods, have received much less attention. The authors point out that the survey is only partial, as it is limited to journal articles written in English (p. 37), but that it can still inform suggestions for future developments in CBTS.

Federico Gaspari follows with an overview and analysis in “Expanding the reach of corpus-based translation studies: The opportunities that lie ahead.” The author reviews the development of CBTS over the last three decades, particularly in the key areas of translation theory and corpus methodology. In terms of translation theory, the focus of CBTS has extended from translation universals (explicitation in particular) to mediation universals, among which directionality plays a key role. Regarding methodology, the author encourages CBTS scholars to leave their comfort zone. In digital times, it is imperative for CBTS to look beyond its traditional, well-established methods based on comparable and parallel corpora and to employ novel approaches that extend the field to data produced on the Internet, in apps and on streaming TV (p. 50).

Part II deals with recent methodological and theoretical developments in CBTS. In the first paper, Haidee Kotze constructs a constrained-language framework to explore the similarities and differences between constrained-language varieties and native varieties. After providing an overview of the rationale for the constrained-language framework, the author constructs a model with macro-level and micro-level constraints, aiming to generalise the factors that shape language use in constrained language or communication contexts. The case study examines the complementizer that in three varieties of English (English original, translated English from Afrikaans and South African English), drawing on a corpus of five registers. Random forest analysis and conditional inference trees suggest that translators are more likely to opt for the explicit that than writers across the three varieties (p. 89). The shift from a single-feature approach to multivariable approaches is an advance, and it helps to explain the linguistic choices of translators bound by language-internal and social constraints.

Next, Stella Neumann, Jonas Freiwald and Arndt Heilmann take subject identifiability as an example to demonstrate how observational and experimental research can be combined, investigating data from both the translation product and the translation process in order to increase explanatory power in CBTS. Focusing on English original declarative clauses and their aligned German translations in the register of popular scientific writing from the CroCo Corpus, they collect experimental data from 73 participants (29 trained translators, 13 translation students, 31 untrained participants) by means of keystroke logging and eye-tracking. Binomial generalised mixed regression models show that subjects in German translations are more likely to be identifiable than those in English originals, and that the distribution of identifiable and non-identifiable subjects in German does not change noticeably between the pre- and post-verbal position, which may be due to the text material (p. 113). Moreover, most identifiable English subjects either stay completely intact or undergo small formal changes, while the non-identifiable subjects are changed more.

In Part III, issues related to corpus-based empirical studies are examined. Ilmari Ivaska, Adriano Ferraresi and Silvia Bernardini explore syntactic properties in two types of constrained language use, second-language use and translation, across three registers (argumentative writing, political speeches and tourism-related communication). Taking POS dependency bigrams as an example, the authors use corpus-driven methods, namely keyness analysis and multidimensional analysis, within the constrained-language framework. They identify 15 (out of 1,000) dependency bigram patterns that are key in over half of the 12 pairwise keyness analyses, irrespective of the L1/SLs involved. According to the findings, non-native-language users generally make more use of clausal elaboration/verbality than of phrasal elaboration/nominality. Syntactic differences concerning post-nominal modification and determiners reflect register differences: these features are most common in political discourse and least common in tourism-related discourse. Proper nouns also pattern by register, and among the unconstrained varieties, proper nouns are less common in argumentative writing than they are in political or tourism-related discourse (pp. 144-47).

Taking translation shifts from grammatically metaphorical of-constructions as an example, Arndt Heilmann, Tatiana Serbina, Jonas Freiwald and Stella Neumann illustrate that multivariate statistical analyses can help to address different factors that influence translation phenomena. The study uses data in the registers of popular scientific writing and tourism brochures from the parallel English-German CroCo Corpus. After annotating the data with respect to de-metaphorisation and classifying the data into nine semantic categories (one of which is subsequently deleted), the authors use a binomial linear mixed-effect model for statistical analysis. The findings show that the distribution of the eight semantic categories of the of-construction differs significantly across the two registers, with the categories POSSESSION, QUALIFICATION and ENGAGEMENT among the most common types. Also, popular science has distinctly more cases of ENGAGEMENT, whereas tourism has more POSSESSION and QUANTIFICATION of-constructions (p. 170). A generalised binomial linear mixed regression model shows that de-metaphorisation occurs significantly less often than retention of the level of metaphoricity (p. 172). This might be due to the structural equivalence of the of-construction in English and the comparable von-construction in German, and translating literally might be a strategy to save translation effort. It is therefore worthwhile to investigate more registers and different translation directions to explore this phenomenon further.

Ekaterina Lapshinova-Koltunski addresses normalisation and shining-through in novice and professional translations to analyse the influence of translation competence and register, taking lexico-grammatical patterns as an example. Based on seven registers from the CroCo Corpus, the author applies text classification with support vector machines (SVM) and finds more shining-through in ESSAY, INSTR and TOU in professional translations, but more normalisation in FICTION, POPSCI and SPEECH. The findings also show that the translationese effect is not evenly distributed across registers and that it differs between student and professional translators.

Part IV shifts the spotlight from theory and methodology in CBTS to the application of corpora in translation teaching and translator training. Heidi Verplaetse reports an experiment on the impact of translation resources on the quality of student translation products, measured in terms of acceptability and adequacy errors, under two translation conditions: a monolingual target language corpus (MOC) vs. Linguee. The findings show that students make more errors when translating with the MOC than with Linguee, which appears to point to slightly better general translation performance with bilingual resources than with monolingual resources. This confirms that the use of a bilingual concordancer improves adequacy in translation, while the use of the MOC does not improve grammar significantly (p. 223). In terms of error subtype frequencies, lexical errors are predominant both with the MOC and with Linguee, but even more prominently with the latter, which indicates that the MOC compensates for the absence of a bilingual resource and even provides an advantage at this target text level (p. 224).

In the last chapter, Natalie Kübler, Alexandra Mestivier and Mojca Pecman conduct two experiments on the application of corpora in translator training, taking complex noun phrases (complex NPs) as an example. They first identify and analyse the annotated errors made by students while translating with and without the help of corpora. The most frequent errors are incorrect analysis of the structure and dependency of complex NP constituents, incorrect modifier attachment within a complex NP involving noun coordination, misidentification of the head, and forbidden insertion within a term embedded in a complex NP. The authors then add a post-editing task to examine to what extent machine translation (MT) systems influence students’ translational choices. This task reveals three types of errors made by trainee translators: over- or under-correction of MT output and failure to correct the MT solution (pp. 250-51). The findings show no significant difference between the translation experiments with and without corpus use. A reason for this might be that complex NPs involve such a high degree of complexity that students cannot correctly query corpora for these lexical items. The authors point out that more evidence, for instance from remedial classroom activities, is needed to assess the usefulness of corpora in translation teaching.

Overall, the edited volume is reader-friendly and enlightening. A detailed list of references and a set of key readings, each with a brief introduction, are attached to every paper and will direct the reader to vital further resources. The volume can therefore serve as a helpful reference and essential reading both for established scholars devoted to CBTS and for newcomers who hope to understand the latest developments in the field: the book offers a worthwhile snapshot of the development and trends of CBTS and points the way forward for further research from theoretical, methodological and applied perspectives. Theoretically, it covers the latest developments, extending from translation universals to mediation universals and covering not only translated language but also constrained language more broadly, such as varioversals and second language. Methodologically, it moves away from basic text processing operations (Baker 1995: 226), such as frequency counts and concordances, to more advanced statistical testing (De Sutter and Lefer 2020: 2), such as multi-method designs, multivariable approaches and multifactorial statistical techniques. Moreover, the present book showcases the rise of an interdisciplinary approach to CBTS. Translation is a complex and multifactorial phenomenon, and only if many variables are considered (such as a combination of linguistic, cognitive and sociocultural factors, as well as registers, translation directions and more language pairs) can the constraints be determined. Finally, the discussions of applied research in CBTS will encourage scholars to explore translation teaching and translator training through a new lens. These advances provide valuable guidance for addressing the current challenges faced by CBTS. For this reason, the volume is timely.

While the volume overall achieves a good balance of subfields in CBTS, some readers might still lament the absence of two separate chapters, one dedicated specifically to corpus-based interpreting studies and the other to advanced techniques. Translating and interpreting are sister disciplines (Shlesinger 1998; Defrancq et al. 2020: 1), yet corpus-based interpreting studies remain under-represented in the present volume. In addition, the advanced techniques require more knowledge of software and statistics, without which progress in replicating CBTS research could be hindered. Only when the methodologies are documented clearly and in detail, and the software is easily accessible, can the research be readily replicated with other data or other language pairs. Ultimately, the field of corpus-based translation studies can attain full maturity only if the techniques are mastered by most scholars and the scope of both corpus-based translation and interpreting studies is accordingly extended and strengthened.