Chapter 4
Presentation of Findings and Discussion
This section presents an overview of the results of the four corpus analyses and discusses their pedagogical value. The creation of the teaching materials is also discussed. Due to space constraints, only sample outputs are presented in Tables 3-6; full details are available in Appendices 1-4.
4.1 The fifty most frequent content words
Frequency analysis underpins the majority of the analytical work carried out within the remit of CL. Its results are word lists, compiled from frequency counts of each word in a corpus, which can be further used to derive keyword lists and collocational data. Creating frequency word lists is typically the starting point of any corpus-based analysis (Baker et al., 2006:76).
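As an illustration of how such a list is compiled, the following minimal Python sketch counts content words in a text. It is not the procedure used in this study: the stop list `FUNCTION_WORDS`, the simple alphabetic tokeniser and the function names are hypothetical and chosen only for demonstration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop list; a real study would use a fuller one.
FUNCTION_WORDS = {"the", "of", "and", "a", "in", "to", "for", "on", "with",
                  "is", "are", "by", "at", "as"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def content_word_frequencies(text, top_n=50):
    """Return the top_n most frequent tokens that are not function words."""
    counts = Counter(t for t in tokenize(text) if t not in FUNCTION_WORDS)
    return counts.most_common(top_n)

sample = "The Group reviewed the financial statements of the Group for the financial year."
top_words = content_word_frequencies(sample, top_n=2)
```

Applied to a full corpus with `top_n=50`, a routine of this kind would yield a ranked list comparable in form to the one reported in Appendix 1.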
What can be immediately observed from the current frequency analysis is that the most frequent words of the target corpus are highly topic-specific, that is, they all relate strictly to business, which corroborates Nelson’s (2006) findings discussed in section 2.2.1 as regards the specificity of lexis in BE. This link between the vocabulary and phrases used and the discipline of business demonstrates that business-specific vocabulary dominates in business texts and confirms the need for topic-specific language instruction.
In terms of grammatical classification, the most frequent content words of the analysed corpus are predominantly nouns (group, year, risk, company, board, committee), which reveals that nominalisation is extensively used in ARs. Adjectives are the next most common part of speech (financial, net, fair, key, capital, annual, strategic, tax, income), followed by participles (continued, including, consolidated, based). As can be seen in Table 3, which lists the 20 most frequent words (the full list of 50 words is available in Appendix 1), the entries are strongly associated with the field of business.
Table 3: The 20 most frequent content words.
The rationale behind frequency analysis and the creation of the frequency list follows the logic of Milton’s Frequency Model of vocabulary learning (Milton, 2009:25), founded on evidence that “there is a strong relationship between a word’s frequency and the likelihood that a learner will encounter it and learn it” and that “the more frequent a word is, the more likely it is to be learned, as a general rule” (Milton, 2009:27). More generally, the usefulness of frequency data in corpus analysis is that “it identifies patterns of use that otherwise often go unnoticed by researchers” (Biber et al., 2005:376).
O’Keeffe et al. (2007:33-37) demonstrate that the most frequent words constitute an essential part of the core vocabulary. They further argue that this type of information may offer a potential for pedagogy organised around lexis, adding that: “The single word has served us well, and will continue to do so” (O’Keeffe et al., 2007:58-59).
Teachers, syllabus designers and materials writers armed with the information a frequency list provides can produce and apply a more practical vocabulary pedagogy which focuses on the specific educational needs of the students. In effect, this approach results in a naturally learner-centred methodology and can be applied even at elementary levels (O’Keeffe et al., 2007:47).
4.2 Keywords
Keywords are defined as “those words which are identified by statistical comparison of a ‘target’ corpus with another, larger corpus, which is referred to as the ‘reference’ or ‘benchmark’ corpus” (Evison, 2010:127). A keyword, according to Scott (2010:149), is a word which is found to occur with unusual frequency in a given text or set of texts. As such, it may occur much more frequently than would otherwise be expected (a positive keyword) or much less frequently (a negative keyword). To establish what is expected, a reference corpus has to be employed. A reference corpus is the basis of comparison in keyword analysis and, according to Baker et al. (2006:137), would typically be “a larger set of texts drawn from a wider range of genres and/or sources”. For the investigation of the English language, they recommend using the Brown family of corpora or sections of the British National Corpus. For the purpose of this research, the Brown corpus of one million tokens (Francis and Kucera, 1979) was used as a reference.
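The comparison of a target corpus with a reference corpus can be sketched as follows. This section does not name the keyness statistic that was applied, so the sketch assumes log-likelihood (G2), one commonly used measure; the function names and the figures in the usage lines are hypothetical.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood (G2) for one word observed in a target and a reference corpus."""
    total = size_target + size_ref
    combined = freq_target + freq_ref
    # Expected counts if the word were equally likely in both corpora.
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    g2 = 0.0
    for observed, expected in ((freq_target, expected_target), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def keyness(freq_target, size_target, freq_ref, size_ref):
    """Signed G2: positive for positive keywords, negative for negative keywords."""
    g2 = log_likelihood(freq_target, size_target, freq_ref, size_ref)
    relative_target = freq_target / size_target
    relative_ref = freq_ref / size_ref
    return g2 if relative_target >= relative_ref else -g2

# Hypothetical counts: a word seen 100 times in 10,000 target tokens
# and 10 times in 10,000 reference tokens.
positive_example = keyness(100, 10_000, 10, 10_000)
negative_example = keyness(10, 10_000, 100, 10_000)
```

Ranking all words of the target corpus by such a score, in descending order, would give a list of positive keywords of the kind shown in Table 4.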
Similarly to the frequency analysis, the top keywords identified here (Table 4) are highly topic-specific, i.e. they are strongly associated with the domain of business. The second observation is that the keyword list shares 82% (41) of its entries with the list of the 50 most frequent content words, the difference being in their frequency of occurrence. As in the previous analysis, the entries in Table 4 have been limited to the top twenty, and the full list is available in Appendix 2. The remaining 9 (18%) of the 50 top keywords do not appear among the 50 most frequent content words. These are, ranked by frequency: lease, risk, strategic, shareholders, asset, accounting, corporate, impairment, strategy. This corroborates the previous finding that the corpus contains highly specialised and business-specific vocabulary.
Table 4: The top twenty (positive) keywords.
As keyword analysis provides insights into which vocabulary is typical of the genre, keywords should also be prioritised when planning the syllabus and teaching. Ideally, words from both the frequency and keyword lists should be taken into consideration when designing teaching materials.
4.3 Collocations
This study identified the twenty most frequent collocates of the 50 most frequent content words. The list contains 1,000 entries in total, covering the nodes/search words and their collocates. Table 5 provides a sample output from the analysis; the full list is available in Appendix 3. Both the nodes/search words and their respective collocates are ranked in descending order of frequency. As discussed in section 4.1, frequency plays a significant role in vocabulary learning.
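Frequency-ranked collocate extraction of this kind can be sketched in Python as follows. The span of words considered around the node is not stated in this section, so the `span` parameter and its default are assumptions, as are the function name and the optional `exclude` set used to filter out function words.

```python
from collections import Counter

def collocates(tokens, node, span=4, top_n=20, exclude=frozenset()):
    """Count words occurring within `span` tokens either side of each `node` occurrence."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = max(0, i - span)
        # Neighbouring tokens on both sides, without the node itself.
        window = tokens[left:i] + tokens[i + 1:i + 1 + span]
        counts.update(w for w in window if w not in exclude)
    return counts.most_common(top_n)

tokens = "the fair value of the financial assets and the fair value of the lease".split()
lexical_collocates = collocates(tokens, "value", span=1, exclude={"the", "of", "and"})
```

Collocates are ranked here by raw co-occurrence frequency only, in line with the frequency-based approach of this study; association measures would require additional calculation.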
What can be immediately noticed is that grammatical/function words appear as the most frequent collocates of the most frequent content words, combining into what are referred to as colligations (Baker et al., 2006:36), i.e. grammatical word combinations, for instance noun/verb + preposition, such as statement of, year of.
Considering that the focus of this research is on the lexical features of ARs, function words such as articles (a, the) and prepositions (of, to, in, for, on, with), and hence colligations, were excluded from this analysis. The full list of collocations, including the colligates, i.e. the 20 collocates of each of the top 50 content words, is provided in Appendix 3, whereas Table 5 presents a sample output of the lexical collocations extracted for the top 20 content words.
Lexical collocations, understood as collocations that contain no grammatical elements (Bahns, 1993:57) but two lexical words (noun + noun/verb/adjective) which both contribute to the meaning, appear to be much more diverse: group statement, financial risk, committee audit, strategic report, business group, fair value. The nodes examined in this work, i.e. the 50 most frequent content words of the corpus, are predominantly nouns (exceptions: financial, net, total, annual, new, including, recognised, fair). This shows that nominalisation is a key feature of ARs. The frequent noun + noun and noun + adjective combinations may also indicate that the nodes/search words to some extent influence their collocates.
What can be immediately observed is that the vast majority of the lexical collocates of analysed nodes are notably associated with the world of business, and specifically linked with the disciplines of finance and accounting (such as: financial statement, business report, company directors, net value, remuneration governance, share earnings, restricted cash, cash equivalents, internal audit, credit risk, profit tax, etc.).
Therefore, similarly to the previous investigations, the lexical collocations are highly discipline- and theme-specific. What is more, the majority of the lexical collocations form two-word compounds (compound nouns) that appear to carry a very specific meaning in the business context, one that can be inferred only from this specific setting. The node is described by its collocate, and together they yield a new concept. This corresponds with Bauer’s (2019:1) explanation of compounding as the “word formation process of combining two or more elements, each of which is used elsewhere in the language as a word of its own right to form a new concept”. For most compounds in English, the first constituent is the modifier, whereas the second is referred to as the head, which tends to determine the class to which the compound belongs, while the modifier adds the meaning of “specialisation”, e.g. executive salary is a category of salary (Dhar & van der Plas, 2019).
Examples in the analysed corpus include financial statement, remuneration committee, ordinary shares, chief executive, share earnings, interest amortisation, executive salary, financial assets, directors report, business group, share dividend, cash equivalents, income statement.
Table 5: The lexical collocates extracted from the top twenty collocates of the 20 most frequent words.
Collocations, as described by Baker et al. (2006:36), refer to “the phenomenon surrounding the fact that certain words are more likely to occur in combination with other words in certain contexts”, and a ‘collocate’ is essentially a word occurring within the neighbourhood of another word. Collocations, also described “as units of formulaic language” (Gablasova et al., 2017:155), have recently gained significance in the field of language learning and use (Gablasova et al., 2017). The focus on identifying collocations in this MA research project is dictated by the fact that “collocations describe the way individual words co-occur with others” (Lewis, 1993:93), and the published literature links collocations with “fluent and natural production associated with native speakers of the language” (Gablasova et al., 2017:156). The observation that words exist “in the company of other words” (McEnery and Wilson, 2001:71) implies that knowledge and awareness of these combinations is essential for the appropriate use of words.
One of the crucial developments in vocabulary research has been the Firthian approach to word meaning (Firth, 1935, as cited in O’Keeffe et al., 2006:59), which argues that the meaning of a word can be inferred from the way it is combined with other words in actual use, more so than from the meaning it possesses in itself. While Bahns (1993:56) calls collocations “a neglected aspect of vocabulary teaching”, researchers have become increasingly interested in how words combine as pairs of collocations and how groupings of more than one word “have unitary meanings and specialised functions” (O’Keeffe et al., 2006:59).
The emergence of Corpus Linguistics and corpus investigation allowed linguists to verify these mainly intuition-based concepts “in actual, attested language use on a larger scale” (ibid.:59). Collocations have been rather prominent in corpus linguistics research, and the rationale behind this is that “corpora represent a rich source of information about the regularity, frequency, and distribution of formulaic patterns in language” (Gablasova et al., 2017:156). Ackerman and Chen (2013:3) add that while “collocations can be instantly recognized by native speakers, they often remain difficult for learners to acquire and use properly”, possibly because collocations can contain “some element of grammatical or lexical unpredictability or inflexibility” (Nation, 2001:324, as cited by Ackerman and Chen, 2013). Therefore, corpus investigation can act as “an objective frame of reference” (Bowker and Pearson, 2002:19).
The main conclusion from the current analysis is that these highly technical and domain-specific (i.e. business-specific) collocations may directly inform the content of specialised BE vocabulary teaching. The fact that the identified collocations are highly specific may signify the importance of specialised terminology in BE pedagogy. What is more, in accordance with Chung and Nation (2003), learning a word’s meanings along with its common collocations significantly facilitates the acquisition of technical vocabulary; collocations should therefore be incorporated into the BE syllabus. As O’Keeffe et al. summarise:
“For the learner of any second/foreign language learning the collocations of that language is not a luxury if anything above a survival level mastery of the language is desired, since collocation permeates even the most basic frequent words” (O’Keeffe et al, 2007:60).
4.4 Four-word clusters
This investigation identified the 50 most frequent word chunks typical of the genre of Irish ARs. This section presents these and discusses their pedagogical application in the BE classroom. Word clusters can be described as multi-word sequences or ‘lexical bundles’ (Biber et al., 2004:371); they are also referred to as “any group of words in sequence” (Baker et al., 2006:34), and thus extend beyond single words or collocations. Similarly to vocabulary and collocations, lexical bundles play an instrumental role in successful L2 learning (Biber et al., 2004). For the purpose of this thesis, the term lexical bundles, or more specifically four-word clusters, was adopted (Biber et al., 2004).
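The extraction of four-word clusters can be illustrated with a short Python sketch that slides a window of four tokens across the text and counts each sequence. The function name and the sample sentence are hypothetical; in addition, the sketch treats the text as one continuous token stream, whereas a fuller implementation would probably avoid forming clusters across sentence boundaries.

```python
from collections import Counter

def word_clusters(tokens, n=4, top_n=50):
    """Return the top_n most frequent contiguous n-word sequences."""
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams).most_common(top_n)

tokens = "in accordance with the policy and in accordance with the law".split()
top_cluster = word_clusters(tokens, n=4, top_n=1)
```

With `n=4` and `top_n=50`, a routine of this kind would produce a ranked list comparable in form to the one given in Appendix 4.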
What can be immediately observed from the investigation of the clusters in the target corpus is that they contain most of the highly frequent content words. Examples include words such as group, financial, statement, year, value, company and results in four-word expressions such as: the consolidated financial statements, for the year ended, the fair value of, for the financial year, cash and cash equivalents. The most evident conclusion that can be drawn from this investigation is that the most frequent content words also form the most frequent four-word clusters. This is illustrative of how language systematically clusters into combinations of words, referred to as ‘chunks’ (O’Keeffe et al., 2007:13), and may indicate that language is pre-patterned.
Table 6: The twenty most frequent 4 word clusters (full list in Appendix 4)
Secondly, the presence of some less specialised expressions also needs to be noted. The examples are predominantly prepositional phrases such as: in accordance with the, in respect of the, in relation to the, as a result of, set out in the, at the end of, at the date of, in line with the, are set out in, as part of the. Although these phrases do not contain strictly business-specific content words, they still bear quite an official tone and belong to the formal, conventional and legal register of business language, which also corroborates Bhatia’s (2010) findings on the presence of different types of discourse in the genre of Annual Reports.
Based on concordance lines below, it can be observed that these expressions are predominantly used to express agreement and compliance with rules, provisions, policies and terms:
Transactions involving derivatives are carried out in accordance with the Treasury policy.
We challenged Management on the disclosures, in particular, whether they are sufficiently clear in highlighting the exposures that remain, the significant uncertainties that exist in respect of the provisions and the sensitivity of the provisions to changes in the underlying assumptions.
The Second Line of Defence sets the frameworks and policies for managing specific risk areas, provides advice and guidance in relation to the risk and provides independent review and challenge and reporting on the company’s risk profile.
The estimated minimum time commitment set out in the terms of appointment is 30 to 60 days per annum including attendance at Committee meetings.
Additionally, they tend to express security obligations, and cause and effect:
The Group has a low risk appetite for loss of confidentiality, integrity or availability of our information assets as a result of cyber events.
This was primarily driven by the ROI portfolio as a result of post model adjustments i.e. management adjustments as outlined on pages 97 and 98, resulting in a charge of € 82 million.
Most of these positions arise as a result of activity generated by corporate customers while the remainder represent trading decisions of the Group’s derivative and foreign exchange traders with a view to generating incremental income.
O’Keeffe et al. (2007:13) emphasise that “language systematically clusters into combinations of word ‘chunks’” and that this can provide researchers with insights for teaching vocabulary and into how “learners approach the task of acquiring vocabulary and developing fluency”. The recent body of evidence challenges the tradition that “language is strictly compositional, arguing instead that much of common everyday language use is composed of prefabricated expressions” (ibid.:376) and that “lexical bundles are stored as unanalysed multi-word chunks, rather than as productive grammatical constructions and do not present production or comprehension difficulties for speakers and listeners in classroom teaching” (Biber et al., 2005:400).
This view is also endorsed by Lewis (1993:96), who states that “correctly identified lexical phrases can be presented to L2 learners in identifiable contexts, mastered as learned wholes, and thus become an important resource (…).” Previous and current corpus investigations (Biber et al., 2004; McEnery and Xiao, 2006; Liu, 2012; O’Keeffe et al., 2007; O’Donnell et al., 2012) have revealed that much linguistic output is composed of multi-word units rather than single words or two-word forms, and it is evident that the collocational functioning of words extends beyond two-word units. It is only through empirical corpus analysis that linguists came to understand the extent of the significance and prominence of chunks/multi-word clusters. As O’Keeffe et al. (2007:60) state, “language is available for use in ready-made chunks to far greater extent than could ever be accommodated by theory of language that rested on the primacy of syntax.”
4.5 Materials informed by ARIC and contributions
The current thesis provides a description of the lexical items of the ARIC corpus and thereby aims to contribute to the current state of knowledge on BE in the Irish context by offering empirical insights into the examined lexical features. Corpus-informed teaching materials for immediate use in the classroom are also provided. They aim to address the need for context-specific BE instruction and were one of the core deliverables of the current study. The main focus here is on written materials presented in the form of worksheets containing language activities informed by the lexical analysis of the ARIC corpus. The worksheets, available in Appendix 6, can be used immediately in the BE classroom, or can serve as templates for teachers and researchers to create comparable exercises and activities informed by other corpora.
The worksheets are organised around the lexical features that were investigated in this research, and are titled accordingly: (a) ‘Vocabulary Focus’ – this worksheet comprises activities designed to promote the acquisition of the most frequent content words and keywords in context, (b) ‘Collocations Focus’ – concentrates on the top lexical collocations of the top 50 content words, (c) ‘Lexical Bundles/Cluster Focus’ – promotes the awareness and acquisition of the top fifty four-word clusters, and finally, (d) ‘Genre Focus’ – this worksheet combines elements of the genre approach with specific instruction on language items introduced in the texts that were extracted from the corpus.
By organising the activities in this manner, the corpus-informed materials are designed to promote the lexical approach, and they aim to achieve this by: (a) focusing on frequency information, i.e. all materials are organised around the top 50 most frequent lexical features, (b) introducing the corpus-derived lexical features in their original BE context by presenting concordance lines from the corpus, also referred to as Key Word In Context (KWIC), and (c) demonstrating the role of lexical chunks by introducing the concept of bundles/clusters and collocations in language activities.
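The Key Word In Context display mentioned in (b) can be sketched in Python as follows. The function name, the `width` parameter and the whitespace tokenisation are assumptions made for illustration; the sample sentence is one of the concordance lines quoted in section 4.4.

```python
def kwic(tokens, node, width=3):
    """Return (left context, node, right context) for every occurrence of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

sentence = ("Transactions involving derivatives are carried out "
            "in accordance with the Treasury policy")
concordance = kwic(sentence.split(), "accordance", width=2)

# Print each line with the node aligned in the centre column.
for left, node, right in concordance:
    print(f"{left:>20}  {node}  {right}")
```

Concordance lines produced in this way could be pasted into a worksheet so that learners see each target item with its immediate context.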
In addition, the ARIC corpus can be further explored to create more specific topic-based modules organised around the specialised vocabulary of specific sectors (banking, construction, retail, etc.). Furthermore, the raw corpus data, i.e. the lists, can also constitute a basis for direct application of corpus investigation by learners in Data Driven Learning (DDL), depending on circumstances such as teachers’ corpus knowledge, the availability of specialised software and learners’ readiness to avail of corpus methods.
Additionally, the raw corpus data can serve as a foundation for the design of BE language testing. Corpus-based assessment tools and vocabulary tests customised specifically to the educational needs of BE students could be created. Such tests would focus predominantly on the learner’s specialised lexical knowledge and would incorporate features examined in the current research i.e. vocabulary, collocations, and bundles/clusters. Informed by empirical evidence, corpus-based tests would ensure the authenticity of language to be tested.
It should be noted that the present research aims to offer multiple contributions to the BE teaching field. Firstly, it provides empirical evidence on the business language of ARs in the Irish context. To the author’s knowledge, it is the first analysis of this kind. One major contribution is the delivery of extensive insights into the most prominent lexical features of Irish ARs. The corpus-derived, frequency-driven observations offer empirical information on lexical items that are representative of the language of business in Ireland.
Secondly, through the analysis of these functional, complex and ‘intertextual’ (Bhatia, 2010) documents, the present study addresses the lack of authentic language from the Irish business environment in current BE pedagogy. The findings and the proposed materials offer an opportunity to expand practitioners’ empirically based lexical knowledge and can support non-native English-speaking professionals working in Ireland in achieving fluency. By offering insights into the language of Irish ARs, this research can be considered an initial step towards addressing the lack of comprehensive information on the nature of the language that non-native professionals encounter in the Irish workplace.
Finally, another important implication of this study is that it could potentially contribute to the area of teacher development. The findings can equally be applied to any specific domain of ELT. The study’s descriptions, the sample materials that can be considered as model templates, and the specialised corpus could be of benefit in teachers’ continuous professional development (CPD). The study can raise awareness of the convenience and approachability of corpus pedagogy, which can be used in any specific language instruction. It has the potential to help BE/ESP teachers enhance their teaching practice and can provide direction for novice BE/ESP instructors.
4.6 Pedagogical Implications and other applications
The corpus analysis, as described in Chapter 4, revealed an array of language features characteristic of the language of business used in the genre of Annual Reports. The investigation of the Annual Reports of the twenty Irish-listed companies points to the conclusion that the vocabulary used in the analysed genre is notably discipline-specific, i.e. it is evidently associated with different areas of the domain of business (e.g. finance and accounting). This section provides a short summary of the findings, suggesting their relevance and applicability to BE pedagogy.
First of all, it appears that the practical findings of this project may have direct application in BE instruction, perhaps the most obvious one being the design and development of teaching resources. The corpus-based results can readily serve as a foundation for the development of highly relevant and authentic resources, programmes, and syllabi customised to the language needs of the learners (i.e. corporate clients who would order such courses).
In general, all the linguistic features identified and investigated in this project may be beneficial in BE pedagogy. Content words are essential for effective language use in general; they are the building blocks of language, or “the core or heart of language”, as Lewis (1993:89) remarked. Notably, the content words examined in the present research, that is, the most frequent words and the keywords, appear to be highly associated with the discipline of business. Therefore, they could constitute a solid basis for a very specialised BE syllabus.
Acquiring and understanding collocations can be particularly beneficial when learning a specialised use of a word that often involves a form that differs from its other uses. This appears to be the case particularly in specialised business texts, where lexical items often appear to acquire specialised meanings explainable from context alone. As Chung and Nation (2003) remark, “most technical vocabulary needs to be learned productively by learners specializing in that area and learning common collocations and grammatical patterns helps this.” Moreover, Lewis’s statement that “collocation is integrated as an organising principle within syllabuses” (Lewis, 1993:vi), which also constitutes one of the key principles of the lexical approach, illustrates its pedagogical value.
In addition to the benefits of collocations, the significant four-word clusters/bundles appear to be highly applicable in BE pedagogy: their use not only demonstrates users’ appropriate lexical knowledge, it also contributes markedly to the more natural, native-like flow of language use described as fluency. The frequency learning model can also be applied to word clusters; therefore, the principal recommendation resulting from this analysis is to incorporate the findings and the analysed four-word clusters into BE language teaching. Awareness and practical use of the clusters from the list produced by the investigation, as well as functional familiarity with the way in which they are used in authentic documentation, may greatly assist learners and professionals in more authentic use of language in professional settings (Liu, 2012). As O’Keeffe et al. (2007:61) remark, “chunks are ready for use at any moment, and do not need re-assembling every time they are used”, and so they contribute considerably to fluency, understood as smooth and effortless performance in a language.
Another observation from this study is that the most frequent lexical words regularly appear in collocations and in the most frequent four-word constructions. This points towards, and perhaps confirms, the pre-patterned nature of the English language. This in turn supports the case for a lexical approach in BE teaching which highlights ‘pedagogical chunking’ (Lewis, 1993:120). Language systematically clusters into combinations of words, which has implications for teaching. It may indicate how to deliver meaningful vocabulary lessons and how learners approach the task of acquiring vocabulary and developing fluency (O’Keeffe et al., 2007:13). Collocations and bundles/clusters may therefore be a useful starting point in language teaching and learning in the context of BE, and in broader contexts.
This allows the conclusion that the corpus data and the features of the language examined in the ARs of Irish companies, as revealed by this empirical examination, can inform the content of BE instruction. The teaching materials that were developed from corpus data extracted from the ARIC corpus are focused on BE lexis, which can help learners and non-native speaking professionals improve their vocabulary use. The above findings and materials could equally be incorporated in a lexical syllabus, as suggested by Sinclair and Renouf (1988, as cited in Baker et al., 2006), built around frequency-based information such as that derived from this corpus.
Additionally, the researcher intends to make the results public on a dedicated website that will be developed as a post-research project. BE teachers and materials creators could freely avail of the lists to create activities promoting the awareness and practical use of the content words, collocations and four-word clusters, which can help learners improve the language they use in annual reports.
It is hoped that these materials can support teachers and learners alike in their BE learning journey, and that by offering this contribution to the BE field this project helps to bridge the gap between the workplace and the classroom.
4.7 Limitations
This study concentrated on 20 Annual Reports of Irish companies published for the year 2019. It is therefore restricted to a single year, and the sample of business text is somewhat limited.
An additional limitation may stem from the fact that only one genre of BE is examined, and rather generally, without division into subgenres such as CEO’s statements and without separate analysis by business sector.
In terms of the analysis, this research relies solely on frequency, mostly in the interest of consistency, and also because this approach has the advantage “of being methodologically straightforward” (Liu, 2016). This methodology nevertheless serves the purpose of this thesis, also taking into account its time constraints. Possible lines of future work include a deeper analysis of collocations in terms of statistical significance and perhaps a further combined comparative analysis.
The results are intended to be published on a dedicated website, to be created specifically for this project, where the sample teaching materials will also be included as downloadable files. Due to time constraints, the website development will take place as the last step of this project and is planned for a later date. In addition, feedback from non-native speaking professionals and BE teachers is yet to be sought to gain insights into the effectiveness and practicality of the developed materials, which bears potential for designing a corpus-informed BE course/programme in the future. Despite the limitations, it is hoped that the empirical findings of this research constitute a considerable amount of pedagogically valuable information that could be utilised in BE instruction.
