EP-Poland: Building A Bilingual Parallel Corpus For Interpreting Research

Magdalena Bartłomiejczyk, Ewa Gumul, Danijel Koržinek

Abstract


This paper reports on the construction of the EP-Poland corpus and on its first empirical applications. This extensive bidirectional English-Polish corpus pairs original parliamentary contributions with their professional simultaneous interpretations and covers 11 European Parliament debates held between January 2016 and February 2020. The main topic of these debates is the rule of law crisis triggered by the Law and Justice government in Poland. The corpus contains over 157,000 tokens and about 20 h 45 min of recordings, counting both source and target texts. The two interpreting directions (English-Polish and Polish-English) are represented almost evenly. The annotation completed so far comprises mark-up information, POS tagging, and the labelling of disfluency phenomena and of all forms of explicitating shifts. Manual annotation for personal deixis is in progress. An additional feature of interest is speaker identification by means of the X-vector method, which allowed us to identify 36 interpreters. We begin with an overview of existing interpreting corpora. We then explain the design features of EP-Poland and report on two completed empirical studies analysing idiosyncratic interpreting behaviour. We conclude by outlining future development pathways and offering some remarks on the significance of the corpus and its limitations.

 


Keywords


interpreting corpus; parallel corpus; simultaneous interpreting; political discourse; parliamentary interpreting

DOI: http://dx.doi.org/10.17576/gema-2022-2201-06

eISSN : 2550-2131

ISSN : 1675-8021