What Do Different Word Lists Reveal about the Lexical Features of a Specialised Language?

Noorli Khamis, Imran Ho Abdullah


Most corpus-based investigations draw on word list analyses (frequency, keyword and key-keyword lists) to profile the lexical features of a specialised language. Although all three word lists have been used in many corpus-based language studies, they have not been compared with one another to identify what salient information each can reveal about the target language. This paper compares three types of word list for Engineering English: frequency, keyword and key-keyword lists. The purpose is to identify the lexical information revealed by the groups of words that each type of word list produces. To conduct the analyses, a corpus of Engineering English (E2C) is created. All the word lists are extracted from the corpus using the WordSmith software. Further analyses of the distribution of the vocabulary components, namely function vs. content words, and of the word categories, i.e. GSL, AWL and Others, are then conducted on all three word lists. The findings reveal that different word lists yield different ranges of words, and that analyses of these words reveal distinct features of the specialised language at different levels. Given such differences, this study provides insights into which word lists should be considered in a lexical study for language description purposes. The study thus further confirms the importance of corpus-based lexical investigations in providing empirical evidence for language description.
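To make the distinction between the three word lists concrete, the following is a minimal Python sketch, not the procedure used in this study or by the WordSmith software. It assumes a log-likelihood keyness statistic of the kind commonly used in keyword analysis, positive keywords only, and a cut-off of 3.84 (p < 0.05 at one degree of freedom). The function names and the cut-off are illustrative assumptions.

```python
import math
from collections import Counter


def frequency_list(tokens):
    """Frequency list: raw count of every word in a text or corpus."""
    return Counter(tokens)


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a, b: frequency of the word in the study and reference corpus.
    c, d: total number of tokens in the study and reference corpus.
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keyword_list(study_tokens, ref_freq, ref_size, threshold=3.84):
    """Keyword list: words unusually frequent relative to a reference corpus.

    The threshold is an assumed cut-off chosen for illustration.
    """
    study = Counter(study_tokens)
    n = len(study_tokens)
    keywords = {}
    for word, a in study.items():
        b = ref_freq.get(word, 0)
        # Keep positive keywords only: relatively more frequent in the study text.
        if a / n > b / ref_size:
            score = log_likelihood(a, b, n, ref_size)
            if score >= threshold:
                keywords[word] = score
    return keywords


def key_keyword_list(texts, ref_freq, ref_size, threshold=3.84):
    """Key-keyword list: number of individual texts in which each word is key."""
    counts = Counter()
    for tokens in texts:
        counts.update(keyword_list(tokens, ref_freq, ref_size, threshold).keys())
    return counts
```

In this sketch a frequency list is dominated by whatever is most common (typically function words), a keyword list keeps only the words that stand out against the reference corpus, and a key-keyword list further favours words that are key in many texts rather than in a single one.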


Keywords: corpus; lexical features; specialised corpus; language description; word lists analysis



DOI: http://dx.doi.org/10.17576/3L-2018-2403-03






eISSN : 2550-2247

ISSN : 0128-5157