A Corpus Analysis of Frequently Occurring Words and their Collocations in High-Impact Research Articles in Education

Sharmine Farahin Bahtiar, Arshad Abd Samad, Abu Bakar Mohamed Razali, Yoong Wei Ho


In research writing, the importance of formulaic sequences (FS), or collocations, and the need for writers to adapt and utilise these phrasal constructions cannot be denied. However, writers of English as a second or foreign language (ESL/EFL) often have difficulty finding the right words or phrases when writing academic research papers because of limited vocabulary and a lack of native-like fluency. They may also be unaware of the rhetorical structure of academic research papers, which can hamper the organisation of ideas and the flow of writing and lead to unclear, disorganised prose. Formulaic sequences and collocations are useful phrasal constructions that writers use to express their intended communicative purposes fluently and efficiently. This study therefore uses corpus analysis to list collocations commonly used in articles published in high-impact journals in the field of education. The study also categorises these collocations according to their communicative purposes, referred to as moves and steps, in the rhetorical structure of the Introduction section of research articles. The final list, which focussed on ten node words from 40 high-impact journal articles, consists of 3- to 12-word phrases found in the Introduction sections of these articles. These phrases were then categorised according to their specific functions, based on an adapted Introduction move framework drawing on the Create a Research Space (CARS) schema (Swales, 2004) and the common moves for the Introduction section listed in the Academic Phrasebank (Morley, 2014).


Keywords: Rhetorical structure; ESL and EFL; Corpus studies; Collocations; Academic writing



Ackermann, K., & Chen, Y. H. (2013). Developing the Academic Collocation List – corpus-driven and expert-judged approach. Journal of English for Academic Purposes, 12, 235–247.

Ädel, A., & Erman, B. (2012). Recurrent word combinations in academic writing by native and non-native speakers of English. English for Specific Purposes, 31(2), 81-92.

Afshar, H. S., Doosti, M., & Movassagh, H. (2018). A Genre Analysis of the Introduction Section of Applied Linguistics and Chemistry Research Articles. Iranian Journal of Applied Linguistics, 21(1), 163-214.

Altenberg, B. (1998). On the Phraseology of Spoken English: The Evidence of Recurrent Word Combinations. In A. Cowie (Ed.), Phraseology: Theory, Analysis and Applications (pp. 101-122). Oxford: OUP.

Ang, L. H., & Tan, K. H. (2019). From Lexical Bundles to Lexical Frames: Uncovering the Extent of Phraseological Variation in Academic Writing. 3L: The Southeast Asian Journal of English Language Studies, 25(2), 99-112.

Biber, D. (2006). University language: A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.

Biber, D., Conrad, S., & Cortes, V. (2004). If you look at…: Lexical bundles in university teaching and textbooks. Applied Linguistics, 25, 371–405.

Chang, C. F. & Kuo, C. H. (2011). A corpus-based approach to online materials development for writing research articles, English for Specific Purposes, 30(3), 222-

Cortes, V. (2013). The purpose of this study is to: Connecting lexical bundles and moves in research article introductions. Journal of English for Academic Purposes, 12(1), 33-43.

Cotos, E., Huffman, S., & Link, S. (2015). Furthering and applying move/step constructs: Technology-driven marshalling of Swalesian genre theory for EAP pedagogy. Journal of English for Academic Purposes, 19, 52-72.

Cowie, A. (1992). Multiword lexical units and communicative language teaching. In P. Arnaud & H. Bejoint (Eds.), Vocabulary and Applied Linguistics (pp. 1-12). London: Macmillan.

Coxhead, A. (2000). Coxhead’s Academic Word List. Retrieved from http://www.cal.org/create/conferences/2012/pdfs/handout-4-vaughn-reutebuch-cortez.pdf

Davis, M., & Morley, J. (2015). Phrasal Intertextuality: The Responses of Academics from Different Disciplines to Students’ Re-Use of Phrases. Journal of Second Language Writing, 28, 20–3.

Dobakhti, L. (2011). The Discussion Section of Research Articles in Applied Linguistics: Generic Structure and Stance Features (Master’s thesis, University of Malaya). Retrieved from http://studentsrepo.um.edu.my/5641/1/LEILA_DOBAKHTI.pdf

Ellis, N. C. (2002). Reflections on frequency effects in language processing. Studies in Second Language Acquisition, 24, 297-339.

Erman, B., & Warren, B. (2000). The Idiom Principle and the Open Choice Principle. Text, 20, 29–62.

Flowerdew, J., & Li, Y. (2007). Language re-use among Chinese apprentice scientists writing for publication. Applied Linguistics, 28, 440–465.

Gablasova, D., Brezina, V., & McEnery, T. (2017). Collocations in Corpus-Based Language Learning Research: Identifying, Comparing, and Interpreting the Evidence. Language Learning, 67(S1), 155-179.

Gilquin, G., & Paquot, M. (2008). Too chatty: Learner academic writing and register variation. English Text Construction, 3(1), 41-61.

Gledhill, C. J. (2011). The ‘lexicogrammar’ approach to analysing phraseology and collocation in ESP texts. ASp [Online], 59, 5-23.

Hadi Kashiha & Chan, S. H. (2014). Discourse functions in formulaic sequences in academic speech across two disciplines. GEMA Online® Journal of Language Studies, (2), 15-27.

Hirano, E. (2009). Research Article Introductions in English for Specific Purposes: A Comparison between Brazilian Portuguese and English. English for Specific Purposes, 7, 113-121.

Hsu, J. (2007). Lexical Collocations and their Relation to the Online Writing of Taiwanese College English Majors and Non-English Majors. Electronic Journal of Foreign Language Teaching, 4(2), 192–209.

Hong, A. L., Hua, T. K., & Mengyu, H. (2017). A Corpus-based Collocational Analysis of Noun Premodification Types in Academic Writing. 3L: The Southeast Asian Journal of English Language Studies, 23(1), 115–131.

Howarth, P. (1998). The Phraseology of Learners’ Academic Writing. In A. Cowie (Ed.), Phraseology: Theory, Analysis and Applications (pp. 161–186). Oxford: OUP.

Hyland, K. (2008). Academic clusters: text patterning in published and postgraduate writing. International Journal of Applied Linguistics, 18(1), 41–62.

Hyland, K., & Tse, P. (2007). Is there an “academic vocabulary”? TESOL Quarterly, 41(2), 235-253.

Joseph, R., Lim, J. M., & Mohd Nor, N. A. (2014). Communicative moves in Forestry Research Introductions: Implications for the design of learning materials. Procedia – Social and Behavioural Sciences, 135:15, 53-69.

Kanoksilapatham, B. (2011). Civil Engineering Research Article Introductions: Textual Structure and Linguistic Characterization. The Asian ESP Journal, 7(2), 55-84.

Kanoksilapatham, B. (2007). Rhetorical moves in biochemistry research articles. In D. Biber, U. Connor, & T. A. Upton (Eds.), Discourse on the move: Using corpus analysis to describe discourse structure. Amsterdam: John Benjamins.

Kjellmer, G. (1991). A mint of phrases. In K. Aijmer & B. Altenberg (Eds.), English Corpus Linguistics: Studies in Honour of Jan Svartvik (pp. 111-127). Harlow,


Laufer, B., & Waldman, T. (2011). Verb-noun collocations in second language writing: A corpus analysis of learners’ English. Language Learning, 61(2), 647-672.

Leech, G. (1994). Students' grammar - teachers' grammar - learners' grammar. In M. Bygate, A. Tonkyn, & E. Williams (Eds.), Grammar and the language teacher (pp. 17-30). New York: Prentice Hall.

Lim, J. M. H. (2006). Method sections of management research articles: A pedagogically motivated qualitative study. English for Specific Purposes, 25(3),


Lim, J. M. H. (2010). Commenting on research results in applied linguistics and education: A comparative genre-based investigation. Journal of English for Academic Purposes, 9(4), 280-294.

Meunier, F. (2012). Formulaic language and language teaching. Annual Review of Applied Linguistics, 32, 111-129.

Morley, J. (2014). Academic Phrasebank. University of Manchester. Retrieved 20 March 2020 from www.kfs.edu.eg/com/pdf/2082015294739.pdf

Nguyen, T. T. L. (2018). Rhetorical Structures and Linguistic Features of English Abstracts in Thai Rajabhat University Journals. 3L: The Southeast Asian Journal of English Language Studies, 24(4), 71-84.

Nguyen, T. T. L., & Pramoolsook, I. (2014). Rhetorical Structure of Introduction Chapters written by Novice Vietnamese TESOL postgraduates. 3L: The Southeast Asian Journal of English Language Studies, 20(1), 61-74.

Nwogu, K. N. (1997). The medical research paper: Structure and functions. English for Specific Purposes, 16(2), 119-138.

Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–49.

Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: Nativelike selection and nativelike fluency. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication (pp. 191–226). New York: Longman.

Peters, A. M. (1983). The units of language acquisition. Cambridge: CUP.

Peters, E., & Pauwels, P. (2015). Learning Academic Formulaic Sequences. Journal of English for Academic Purposes, 20, 28-39.

Rahman, M., Darus, S., & Amir, Z. (2017). Rhetorical structure of introduction in applied linguistics research articles. International Journal of Interdisciplinary Educational Studies, 9(2), 69-84.

Samraj, B. (2002). Introductions in research articles: Variations across disciplines. English for Specific Purposes, 21(1), 1-17.

Simpson-Vlach, R., & Ellis, N. C. (2010). An academic formula list: New methods in phraseology research. Applied Linguistics, 31(4), 487–512.

Sinclair, J. (1991). Corpus, concordance, collocation: Describing English language. Oxford: OUP.

Smadja, F. A., & McKeown, K. R. (1990). Automatically extracting and representing collocations for language generation. Proceedings of the 28th Annual Meeting of the Association for Computational Linguistics, pp. 252.

Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge: CUP.

Swales, J. M. (2004). Research genres: Explorations and applications. Cambridge: CUP.

Swales, J. M., & Feak, C. B. (2012). Academic Writing for Graduate Students: Essential Tasks and Skills (3rd ed.). Ann Arbor: University of Michigan Press.

Yang, R., & Allison, D. (2003). Research articles in applied linguistics: moving from results to conclusions. English for Specific Purposes, 22(4), 365-385.

DOI: http://dx.doi.org/10.17576/3L-2020-2604-08



eISSN : 2550-2247

ISSN : 0128-5157