
Degree Project

Master’s Level

Anaphoric demonstratives in student academic writing

A cross-disciplinary study of (un)attended this and these

Author: Elisabete Ferreira

Supervisor: Annelie Ädel

Examiner: Jonathan White

Subject/main field of study: English for Academic Purposes

Course code: EN3077

Credits: 15

Date of examination: 13/06/19

At Dalarna University it is possible to publish the student thesis in full text in DiVA.

The publishing is open access, which means the work will be freely accessible to read and download on the internet. This will significantly increase the dissemination and visibility of the student thesis.

Open access is becoming the standard route for spreading scientific and academic information on the internet. Dalarna University recommends that both researchers as well as students publish their work open access.

I give my/we give our consent for full text publishing (freely accessible on the internet, open access):

Yes ☒ No ☐

Dalarna University – SE-791 88 Falun – Phone +4623-77 80 00


Abstract

Cohesive devices such as anaphoric reference play an important role in written discourse. This thesis investigates the extent to which the anaphoric demonstratives this and these are used as determiners (‘attended’) or pronouns (‘unattended’) by first-year undergraduate students from four different academic disciplines. Data extracted from the British Academic Written English (BAWE) corpus were analysed quantitatively to determine the frequency of use of attended and unattended this/these across disciplines, as well as qualitatively to examine the types of nominal and verbal structures that follow the demonstratives. When compared to findings from previous studies, novice student writers were found to employ this/these as pronouns to a larger extent than both students at a more advanced level and research article writers. It was also observed that the determiners this and these pattern differently, selecting distinct attending nouns to a great extent. In addition, comparison of the results for each subcorpus shows that even though there are some differences between the four disciplines, these differences are not as great as might be expected and do not indicate a clear distinction between ‘hard’ and ‘soft’ sciences.

While the influence of genre has not been scrutinised, other possible explanations proposed relate to the educational context and level of study in association with the range of lexical choices available to novice student writers.

Keywords: cohesion, demonstratives, anaphoric reference, disciplinary variation, student writing


Table of Contents

1 Introduction

1.1 Aim and Research Questions

2 Review of the Literature

2.1 Academic Discourse(s)

2.1.1 English for general or specific academic purposes

2.1.2 Disciplinary variation in academic writing

2.1.2.1 ‘Common core’ versus discipline-specific vocabulary

2.2 Discourse Grammar: Cohesion in Written Discourse

2.2.1 Cohesive devices and reference patterns

2.2.2 Demonstratives as determiners or pronouns

2.2.3 Attended and unattended this/these in academic writing

3 Material and Methodology

3.1 Data

3.2 Method of Analysis

3.2.1 Models of classification

4 Results and Discussion

4.1 Overall Frequencies of Attended and Unattended this/these

4.2 Most Frequent Nouns and Noun Types Following this/these

4.3 Most Frequent Verbs and Verb Types Following this/these

4.4 Distribution of Attended and Unattended this/these across Disciplines

4.5 Distribution of Nouns/Verbs and Types across Disciplines

5 Conclusion

References

Appendix 1 List of nouns following this/these

Appendix 2 List of nouns per type

Appendix 3 List of verbs following this/these

Appendix 4 Complete list of nouns per discipline

Appendix 5 Complete list of verbs per discipline


1 Introduction

In order to construct a rhetorically effective text, writers adopt several strategies. One of them is the use of cohesive devices to structure the text, connect ideas and maintain a clear flow of information, which facilitates understanding and helps build a strong and convincing argument.

The importance of these devices is reflected in the attention paid in the literature to specific linguistic items such as linking adverbials or connectors (e.g. Charles, 2011; Hinkel, 2001; Granger & Tyson, 1996). While connectors have been researched to some extent, cohesive items such as the demonstratives this/these and that/those have been less explored. As markers of anaphoric reference, these help to establish “contextual ties between ideas” (Hinkel, 2001, p. 112) and play an important role in text cohesion.

Whether it is appropriate to leave an anaphoric demonstrative (in particular this) ‘unattended’, that is, not followed by a noun, has been a matter of debate in academic writing for decades (Swales, 2005; Geisler, Kaufer & Steinberg, 1985). The circumstances that lead writers to choose between the pronominal form of a demonstrative and the determiner form followed by a noun have received considerable attention in the North American approach to composition and writing instruction (e.g. Moskovit, 1983; Geisler et al., 1985). In addition, style manuals and textbooks (e.g. American Psychological Association, 2010; Glenn & Gray, 2013; Swales & Feak, 2012) continue to draw attention to the potential ambiguity and confusion that writers can create by using an unattended demonstrative, placing an unnecessary burden on the reader. Teachers too tend to be adamant that students should make the anaphoric reference clear to avoid disrupting the cohesion of a text, and often “scribble in the margin ‘this what?’” (Swales, 2005, p. 2).

Rather than taking a prescriptive approach to the use of (un)attended this/these, the research focus has shifted in recent years to descriptive examinations of how these forms are actually used in patterns of anaphoric reference, especially in academic writing. This development has been strongly supported by the increasing use of corpora and corpus tools.

Much of the emphasis has been placed, however, on published research articles (Gray & Cortes, 2011; Gray, 2010; Swales, 2005) and comparatively few studies have investigated how demonstrative determiners and pronouns are used by student writers; even fewer have targeted a non-North American context (e.g. Petch-Tyson, 2000). Those studies that have focused on student writing have prioritised advanced or upper-level students (e.g. Wulff, Römer & Swales, 2012; Römer & Wulff, 2010). A comparison of their findings with results from an investigation of attended and unattended demonstratives in novice writing could provide useful information about their usage at different levels of study.

In comparison with the relative scarcity of studies on anaphoric demonstratives, a phenomenon that has been somewhat more attended to in recent years is the systematic variation within academic discourse. An increased awareness of disciplinary variation and the importance of a discipline-specific identity have been widely acknowledged, as research has demonstrated that academic discourse varies to a large extent across genres and disciplinary communities (e.g. Hyland, 2004, 2006; Bhatia, 2002; Becher & Trowler, 2001). Students at both undergraduate and postgraduate level are also expected to follow certain linguistic conventions and develop disciplinary discourses in order to successfully progress through tertiary education (Hyland, 2006, p. 39). The disciplinary perspective is thus also relevant to explore in student writing (e.g. Nesi & Gardner, 2012; Staples, Egbert, Biber & Gray, 2016; Gardner, Nesi & Biber, 2018).

1.1 Aim and Research Questions

The overarching aim of this study is to investigate the use of the demonstratives this/these in a representative collection of texts produced by first-year undergraduate students from different academic disciplines. It seeks to determine how frequently this/these are used as determiners or pronouns, as well as the most common nouns and verbs that follow these demonstratives, and whether any disciplinary variation can be found. For these purposes, frequencies of use and the most frequent nouns and verbs and respective types will be analysed, followed by a comparison of the results across disciplines. The study will be guided by the following research questions:

1) To what degree are attended and unattended this/these used in first-year undergraduate student writing?

2) Which nouns and noun types (concrete, deictic, shell, other abstract nouns) most frequently follow attended this/these?

3) Which verbs and verb types (lexical, primary, modal) most frequently follow unattended this/these?

4) To what extent, if any, do student writers in different disciplines differ with respect to the frequency of use of attended and unattended this/these?

5) How are the most frequently co-occurring nouns/verbs and respective types distributed across academic disciplines?

To address these research questions, the next section will review literature on academic writing related to disciplinary discourses and cohesion, as well as provide an overview of previous corpus-based research on the use of demonstratives as cohesive resources. Section 3 will then describe the material and methodology employed in this study, after which the results of the analysis of the data will be presented and discussed in section 4. Finally, a summary of the main findings, as well as limitations and suggestions for future research will follow in the concluding section.
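As a rough illustration of how the first research question could be operationalised, the Python sketch below counts attended and unattended this/these in a short piece of text. It is not the procedure used in this study. The function name `count_demonstratives`, the small verb list, and the rule that a demonstrative directly followed by a listed verb, a punctuation mark or the end of the text counts as unattended are all assumptions made for illustration; a real analysis would rely on part-of-speech tagging and manual checking of the concordance lines.

```python
import re
from collections import Counter

# Hypothetical, deliberately small list of verb forms (assumption for illustration);
# a real study would use a part-of-speech tagger instead.
FOLLOWING_VERBS = {
    "is", "are", "was", "were", "has", "have", "had", "does", "do", "did",
    "can", "could", "may", "might", "must", "shall", "should", "will", "would",
    "means", "shows", "suggests", "allows", "leads",
}
PUNCTUATION = set(".,;:!?()")

# Words (optionally with an internal apostrophe) or single punctuation marks.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|[.,;:!?()]")


def count_demonstratives(text):
    """Return a Counter keyed by (demonstrative, 'attended' or 'unattended')."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text)]
    counts = Counter()
    for i, token in enumerate(tokens):
        if token not in ("this", "these"):
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Assumed heuristic: a listed verb, punctuation or end of text right
        # after the demonstrative signals pronominal (unattended) use.
        if nxt is None or nxt in FOLLOWING_VERBS or nxt in PUNCTUATION:
            counts[(token, "unattended")] += 1
        else:
            counts[(token, "attended")] += 1
    return counts


sample = ("This is a common problem. This approach solves it, and these "
          "results confirm this. These are discussed below.")
result = count_demonstratives(sample)
for key in sorted(result):
    print(key, result[key])
```

The heuristic errs on the side of 'attended': any word outside the verb list, including adjectives and adverbs, is treated as the start of a noun phrase, so counts obtained this way would only be a starting point for manual classification.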


2 Review of the Literature

2.1 Academic Discourse(s)

Academic discourse broadly concerns “the ways of thinking and using language which exist in the academy” (Hyland, 2009, p. 1). It is through discourse, in its different realisations, that academics collaborate, communicate, create and disseminate knowledge. Academic discourse is used by members of different social groups, often referred to as ‘discourse communities’, within which they develop specialised knowledge and competence in the communication practices of their particular disciplines (Paltridge, 2002, p. 15; see also Becher & Trowler, 2001). According to Swales (1990, p. 9), “discourse communities are sociorhetorical networks that form in order to work towards sets of common goals”. A discourse community is further identified by a particular way of communicating, its members possessing a certain level of expertise and familiarity with the specialised vocabulary and genres that are relevant for their communicative purposes (Swales, 1990, pp. 24-27). Different academic discourse communities thus have distinct ways of using language and doing things. While it has been somewhat contested, the concept of discourse community nevertheless “proves useful in identifying how writers’ rhetorical choices depend on purposes, setting and audience” (Hyland, 2009, p. 66) of any given discourse, which in turn can help to understand how those communicative purposes or goals are achieved by different groups.

Research on academic discourse in recent years has demonstrated that “the discourses of the academy are enormously diverse” (Hyland, 2004, p. x) with respect to both genre and discipline (e.g. Bhatia, 2002; Hyland, 2004), resulting in an increasing awareness of multiple ‘academic discourses’ that contrasts with the previously-held perception of “a single, monolithic ‘academic English’” (Hyland, 2009, p. ix). The implications and value of this research, for language teaching in general and English for Academic Purposes pedagogy in particular, are widely recognised (Flowerdew, 2002; Hyland, 2006).

Within academic discourse, particular attention has been paid to written discourse. As the “main way scholarship is transmitted” (Flowerdew, 2016, p. 6), writing is essential for the dissemination of knowledge through publication by professional academics. At the same time, writing plays an equally essential role in the display of knowledge for assessment purposes by students and “is probably the single most important skill necessary for academic success” (Nesi & Gardner, 2012, p. 3). In spite of, or because of, this role, it represents a challenge for novices in academia. Academic writing in English specifically is usually thought to be a complex, elaborated and rather abstract form of discourse, characterised by long and unfamiliar words (Biber & Gray, 2010). Certain features that are typically associated with academic texts include high lexical density and nominalisation (compressed nominal or phrasal structures); a formal, detached and impersonal style; source use and intertextuality (McCarthy, Matthiessen, & Slade, 2010, p. 55; Paltridge, 2002, pp. 136-137; see also Biber, 2006; Biber & Gray, 2010). Some of these features make academic language not only difficult to read and understand but also intimidating for students, especially novice writers (Flowerdew, 2016, p. 7; Biber, 2006).

Among the multiple approaches to the study of academic discourse, findings from corpus-based investigations have contributed considerably to the description and characterisation of academic discourse in general, and academic writing in particular. In addition to research on the distinctive features of academic discourse, the different registers of university language have been examined (e.g. Biber, 2006), as have the diversity of academic genres and their interplay with the various disciplinary discourses (e.g. Hyland, 2004). Large corpora of academic texts, such as the British Academic Written English (BAWE) corpus, have also been used to investigate different genres and features of student academic writing. Nesi and Gardner (2012), for instance, have identified, from the wide variety of university assignment genres across several disciplines, five different social functions, which are linked to the different stages of a degree: ‘demonstrating knowledge and understanding’; ‘critical evaluation and developing arguments’; ‘developing research skills’; ‘preparing for professional practice’; and ‘writing for oneself and others’. These functions help to gain a better understanding of the type of knowledge that students need to demonstrate to meet tutors’ expectations for different tasks, as well as of the various linguistic features that are characteristic of each. Also using data from the BAWE corpus, Staples et al. (2016) have examined how students’ writing develops in terms of grammatical complexity through levels of study, mediated by genre and discipline, and found that their writing becomes more complex as students advance in their studies. Taking a different approach, Gardner et al. (2018) have also explored the interaction between several situational variables in student writing. Using a multidimensional analysis, they identified the way certain linguistic features are differently clustered or dispersed along four dimensions across disciplines, levels of study and genres in the BAWE corpus.

2.1.1 English for general or specific academic purposes

The increasingly pervasive use of English in academia and research in the past decades has led to the emergence of an area of study in Applied Linguistics designated English for Academic Purposes (EAP). One of the main branches of English for Specific Purposes (ESP), EAP specifically “covers language research and instruction that focuses on the communicative needs and practices of individuals working in academic contexts” (Hyland & Shaw, 2016, p. 1). The main focus of EAP instruction is the learner and their needs in preparation for tasks or work in academic environments, while taking into account the demands of specific academic disciplines. One of the goals of EAP is thus to equip students with skills in and awareness of different disciplinary conventions, which will help them produce more genre-oriented and discipline-specific writing in English (Flowerdew, 2002).


The growing importance of English as a medium of university instruction alongside the globalisation of higher education has led to the expansion of the role and focus areas of EAP, which in turn generated the “tension between a general EAP addressing a generalized academic discourse and dedicated disciplinary-discourse instruction” (Hyland & Shaw, 2016, p. 6). As a result, two main approaches to EAP instruction have emerged, English for General Academic Purposes (EGAP) and English for Specific Academic Purposes (ESAP). EGAP is concerned with the general academic skills and ‘common core’ features of academic language that are needed by students in all disciplines, whereas ESAP addresses the specific needs of students of particular disciplines (Flowerdew, 2016, p. 7). Often associated with different levels of study, EGAP courses provide undergraduates and taught graduates with the generic academic skills they require, while research postgraduates and academics have more discipline-specific needs that can be met through ESAP instruction (Flowerdew, 2016, p. 8). The fact that many fields of study are becoming more interdisciplinary, leading to higher demands on students to acquire competence in more than one discipline, adds however another layer of complexity (Bhatia, 2002, p. 27; Flowerdew, 2016, p. 8).

The ongoing debate over the issue of specificity in ESP/EAP has centred around the question of whether there are transferable or common skills and language features that justify a general approach, or whether the focus should lie on providing students with discipline-specific instruction (Hyland, 2002, p. 385; see also Shutz, 2013). As discussed in more detail in Section 2.1.2.1, a main area of focus of EAP where this question is particularly relevant and remains central is that of academic vocabulary (generic versus discipline-specific), namely the creation of word lists based on frequency, for which the use of corpora has been instrumental (Nesi, 2016). Corpora have been increasingly used by EAP researchers and practitioners as sources of authentic language use for the analysis of large samples of academic text in different registers and contexts (e.g. Biber et al., 1999; Biber, 2006; Hyland, 2004; Nesi & Gardner, 2012; Gardner et al., 2018). In addition to being used as research tools, corpora are also useful resources for EAP writing instruction and materials development (Nesi, 2016; Hunston, 2002).

2.1.2 Disciplinary variation in academic writing

The concept of variation is not only relevant to distinguish between academic and other types of discourse, but also within academic discourse, which varies according to different parameters such as genre, register, mode (spoken/written), and discipline. Of particular interest here, discipline is a complex concept to define, not least because it varies over time and geographical location due to the “changing nature of knowledge domains” (Becher & Trowler, 2001, p. 41).

Hyland (2004, p. 1) suggests that disciplines are primarily defined by disciplinary-approved practices that members of a local discourse community adopt and are competent in. Along the same lines, for Becher and Trowler (2001) academic communities (or ‘tribes’) and disciplinary knowledge (the ‘territories’) are “inseparably intertwined” (p. 23). In other words, belonging to a discipline means sharing common ways of producing and communicating knowledge.

Disciplines differ in terms of preferred genres, how knowledge and arguments are constructed, as well as the vocabulary or terminology used. For instance, the sciences typically emphasise the construction of knowledge through proof in an objective way and value precision, whereas the humanities favour the strength of argument and an interpretive discourse that attempts to describe and understand familiar human experiences, while the social sciences are seen as falling somewhere in between (Hyland, 2009, p. 63). It follows that disciplinary knowledge mainly “reflect[s] real-world differences in subject matter” (Becher & Trowler, 2001). The distinction between ‘hard’ (sciences/technology) and ‘soft’ (humanities/social sciences) disciplines that generally reflects common perceptions does not, however, imply a clear-cut division between disciplinary groupings (Hyland, 2009, p. 64).

Gaining an awareness of discipline-specific conventions alongside developing disciplinary knowledge is therefore essential for successful student academic writing (Flowerdew, 2016, p. 8; Hyland & Tse, 2007, pp. 248-249). As university students are exposed to and learn to master the different communication skills that are specific to their target disciplines, they also become socialised into the discourse communities of the various academic disciplines (e.g. Nesi & Gardner, 2012; Hyland, 2006; Becher & Trowler, 2001).

Corpus research has contributed to revealing the extent of disciplinary variation in academic writing both within and across disciplines (e.g. Hyland, 2004, 2006; Hyland & Tse, 2007). Advanced student writing in particular has been found to reflect to some extent the differences in disciplinary discourses that students gradually come to recognise and use as they progress through their studies (Staples et al., 2016; Gardner et al., 2018).

2.1.2.1 ‘Common core’ versus discipline-specific vocabulary

The fact that academic discourse can be distinguished from other types of discourse by the prominent use of certain linguistic features (e.g. Biber, 2006) could suggest that there is a common set of general academic skills or language features that cuts across disciplines.

Different disciplines will, however, vary in the extent to which they adhere to those features and in how they use them (e.g. Hyland & Tse, 2007), especially considering the variety of academic writing tasks and assignments that, for instance, university students are expected to do in different disciplines (Nesi & Gardner, 2012). In addition, in terms of terminology or vocabulary specifically, it is not clear whether there is a large enough ‘common core’ vocabulary which is characteristic of academic discourse and similarly used in all disciplines (Hyland & Tse, 2007, p. 243).

Research on academic writing has underlined the distinction between technical or specialised (discipline-specific) terminology and general academic vocabulary, that is, those “items which are reasonably frequent in a wide range of academic genres but are relatively uncommon in other kinds of texts” (Hyland & Tse, 2007, p. 235; see also Hyland, 2002). The latter in particular has been the focus of lists of words specific to academic discourse, such as the Academic Word List (Coxhead, 2000), the Academic Vocabulary List (Gardner & Davies, 2013), and the Academic Formulas List (Simpson-Vlach & Ellis, 2010), which attempt to capture the most important vocabulary that university students should master (Nesi, 2016).

The usefulness of such vocabulary lists has been debated over time and the question remains “whether it is useful for learners to possess a general academic vocabulary […] because it may involve considerable learning effort with little return” (Hyland & Tse, 2007, p. 236).

Different positions have been taken on this point, with some authors arguing for the importance of making lists of general academic vocabulary accessible to meet university students’ needs (e.g. Simpson-Vlach & Ellis, 2010; Gardner & Davies, 2013), while others claim that many words or phrases have different meanings or phraseological patterns in different disciplines and that it is more important for students to be made aware of those distinctive uses in their own target disciplines (e.g. Hyland & Tse, 2007; Hyland, 2008).

Hyland and Tse (2007), for instance, question “the assumption that a single inventory can represent the vocabulary of academic discourse and be so valuable to all students irrespective of their field of study” (2007, p. 238) on the basis that their investigation shows that the extent to which lexical items in the Academic Word List (AWL) are used across disciplines varies considerably. They also highlight the differences in meaning and collocational environments that particular items (e.g. volume, attribute) exhibit in different disciplines. Hyland (2008) further adds that disciplines show different preferences in their use of lexical bundles (or highly frequent collocations), with less than half of the top 50 identified bundles being common to the four disciplines under investigation (Biology, Electrical Engineering, Applied Linguistics, and Business Studies). In support of a discipline-specific approach to EAP teaching, Hyland concludes that these findings undermine “the widely held assumption that there is a single core vocabulary needed for academic study” (p. 20). In contrast, Simpson-Vlach and Ellis (2010, p. 509) identify a “common core of academic formulas that do transcend disciplinary boundaries” (the Academic Formulas List, AFL) by using a ‘formula teaching worth’ measure that combines frequency and mutual information (MI) statistics in the ranking of formulas. While acknowledging the need for further research on disciplinary variation, they argue that differences in operationalisation in Hyland’s (2008) study could explain their contrasting results. Simpson-Vlach and Ellis (2010) maintain that the AFL is an “empirically derived, pedagogically useful list” of frequently recurrent formulaic sequences across academic genres and disciplines that occur significantly more in (spoken and written) academic discourse than in non-academic discourse, and hence could be considered representative of an academic style of discourse. Other studies (Shutz, 2013; Römer & Wulff, 2012) lend further support to the relevance of a general approach to teaching academic vocabulary by demonstrating that there are a number of verbs (related to the research activity, e.g. reporting information, describing data and results) and nouns (mainly metadiscoursal and related to methodology) that are highly frequent in and common to several disciplines.

Based on the increasing literature on general versus discipline-specific EAP instruction, it seems that it would be more adequate to say that one does not exclude the other; rather, a combination would suit a larger number of students, depending on specific teaching contexts and stages. In addition to the specific uses of vocabulary (both individual words and multiword units) in different disciplines that students need to be exposed to and acquire, there is still an important set of academic words or phrases that are shared across disciplines, which reflect the activities and functions that are typical of ‘academic’ work.

2.2 Discourse Grammar: Cohesion in Written Discourse

A text can be defined as “the verbal record of a communicative act” (Brown & Yule, 1983, p. 6), meaning that it is an instance of language in use rather than language as an abstract system of meanings and grammatical relations. A stretch of language of any length can be identified as a unit of meaning, or text, if it constitutes a unified whole, made up of certain resources that create texture, within a specific communicative context (Halliday & Hasan, 1976). Cohesion is what ties a text together, that is, the network of grammatical and lexical relations that connect various parts of a text (Halliday & Hasan, 1976):

Cohesion occurs where the INTERPRETATION of some element in the discourse is dependent on that of another. The one PRESUPPOSES the other, in the sense that it cannot be effectively decoded except by recourse to it. (p. 4; authors’ emphases).

These surface meaning relations enable the reader to understand and interpret the content of the text. By analysing these relations or ties between dependent elements, the purpose and structure of the text and its component parts can be more easily identified.

Cohesion also makes writing flow, or “mov[e] from one statement in a text to the next” (Swales & Feak, 2012, p. 30), by creating and reinforcing connections at different structural levels: sentence, paragraph and discourse. Within these, cohesion in writing can be established through various language features, including information structure and thematic progression at a macro-level, and lexicogrammatical cohesive devices, such as repetition, linking words, and pronominal reference, at a more micro-level.

The notion of information structure typically concerns the development from ‘given’ (or old) to ‘new’, which is the unmarked pattern of organisation of information in English. Given information is that which is already known to the hearer/reader or otherwise “recoverable either anaphorically or situationally” (Halliday, 1967, p. 211, as cited in Brown & Yule, 1983, p. 179), whereas new information is not. In other words, given information represents shared knowledge and provides a reference point to which new information can be related (Bloor & Bloor, 2004, p. 66). This given-to-new pattern facilitates understanding of the information conveyed by the speaker/writer and clearly indicates what they consider the most important information. The early placement of given information (i.e. in the subject position) “establishes a content connection backward and provides a forward content link that establishes the context” (Swales & Feak, 2012, p. 31).

Information structure is closely related to the thematic structure established within a clause by its constituents, ‘theme’ and ‘rheme’. In the context of writing, the former is what the clause is about, or its “starting point”, and the latter what is said about the theme, that is, the new element or piece of information being introduced (Paltridge, 2012, p. 129; Bloor & Bloor, 2004, p. 73). The theme then serves both as a point of orientation by connecting back to previous stretches of text and as a point of departure by connecting forward and contributing to the development of later stretches (Bloor & Bloor, 2004, p. 73). The relationship between theme and rheme, for example the way a theme develops a topic introduced by a previous rheme, also contributes to the texture of a text. Thematic progression, a “key way in which information flow is created in a text” (Paltridge, 2012, p. 131; italics in original), allows the writer to maintain a continuity of ideas and the reader to follow the ideas in a text.

Writing cohesively thus implies ensuring a smooth flow of information as well as a clear connection between ideas and dependent elements within a text. The lexicogrammatical devices that can be used to establish cohesive ties will be the focus of the following section.

2.2.1 Cohesive devices and reference patterns

In their classic model of cohesion in English, Halliday and Hasan (1976) identify five main cohesive devices which contribute to the texture of a text: reference, substitution, ellipsis, conjunction, and lexical cohesion. Of particular interest for this study, reference and ellipsis require the reader to elicit information at a certain point of the text, by retrieving it from or relating it to a relevant part of the text. While ellipsis involves retrieval of information that can be presupposed, reference creates textual cohesion by linking elements (referents) that enable recovery of information from the immediately preceding or subsequent context. More specifically, reference can be defined as the relationship of identity which enables the reader to trace entities or events in a text. It comprises a set of grammatical and discoursal resources that allow the writer to refer back (anaphorically) to something that appeared before in the text or forward to something that is yet to be introduced (Halliday & Hasan, 1976):

[…] the specific nature of the information that is signalled for retrieval. In the case of reference the information to be retrieved is the referential meaning, the identity of the particular thing or class of things that is being referred to; and the cohesion lies in the continuity of reference, whereby the same thing enters into the discourse a second time. (p. 31)

Maintaining continuity of reference, or chains of reference, enables the reader to interpret and follow the flow of information as intended by the writer. As “sequences of noun phrases all referring to the same thing […] in a relation of co-reference” (Biber, Johansson, Leech, Conrad & Finegan, 1999, p. 234; emphasis in original), chains of reference contribute to a great extent to text cohesion. Co-reference can be expressed through different linguistic forms.

Halliday and Hasan (1976) distinguish between three types of reference: personal, demonstrative, and comparative. Personal reference involves using personal pronouns, possessive determiners and possessive pronouns to “refer to something by specifying its function or role in the speech situation” (p. 44). Through demonstrative reference, the referent is identified by means of adverbial and nominal demonstratives in terms of location (in space or time) and proximity. Comparative reference is indirect and includes two types of comparison, general (likeness or unlikeness) and particular (quantity or quality), which are expressed by adjectives and adverbs.

These types of reference can refer to the context of the situation (exophorically) or to entities mentioned within a text (endophorically). Endophoric reference is established in two main ways, namely through the use of anaphoric and cataphoric expressions (Brown & Yule, 1983, p. 192; Bloor & Bloor, 2004, p. 96). Anaphora refers to the pattern of reference by which a word or phrase is used to refer back to someone or something (antecedent) that has already been mentioned earlier on in the text. In contrast, cataphora is the process by which some linguistic items refer forward to someone or something coming later in the text. In this study, only anaphoric reference is of interest as textual cataphoric cohesion is much less common (Halliday & Hasan, 1976, p. 68).

The demonstratives (this, these, that and those) are an important means of establishing cohesive and referential relations in discourse, as will be discussed in more detail below.

2.2.2 Demonstratives as determiners or pronouns

Described as forms of “verbal pointing” (Halliday & Hasan, 1976, p. 57), the demonstratives this, these, that and those constitute a category of deictics or deictic expressions, that of spatial deixis. Meaning “pointing” through language in ancient Greek, the term “deixis” generally refers to those linguistic forms (e.g. this and that, here and now) that are tied to the situational or textual context shared by the speaker/hearer or writer/reader, which means that context is essential for their identification and interpretation. As important cohesive resources, deictic expressions serve not only to point to something but also to establish anaphoric or cataphoric reference to a preceding or following part of the discourse. Even though deictic expressions are typically associated with spoken discourse, in writing the text itself can provide the context that is needed to interpret these “pointing” expressions. In fact, deictics such as this are common not only in conversation registers, but also in academic writing (Swales, 2005, p. 1; Biber et al., 1999, p. 349; see also Biber, 2006, p. 15), their referential meaning being determined by the textual context in which they occur: relative proximity (this, these) and relative remoteness (that, those).

In establishing anaphoric reference between a referring expression and an antecedent, demonstratives can be used either independently as pronouns or ‘Heads’ (e.g. This means that…) or dependently as determiners or ‘Modifiers’ preceding a head noun (e.g. This explanation…). What Halliday and Hasan (1976) call selective nominal demonstratives (this, these, that, and those) are distinguished in terms of number (singular versus plural) and proximity (near versus distant), as shown in Table 1 below.

Table 1. Distinctive features of the demonstratives

          Singular   Plural
Near      this       these
Distant   that       those

According to Biber et al. (1999, p. 349), “proximity is insufficient to account for the distribution of the demonstrative pronouns”, which varies depending on register: that is more common in conversation, whereas this, these and those are more often employed in academic prose. The high frequency of the demonstratives this and these as determiners and as pronouns in academic writing can be explained by “their use in marking immediate textual reference” (Biber et al., 1999, p. 349). One further distinction between these forms of demonstratives relates to the anaphoric distance that is normally associated with each form: the shorter anaphoric distance of demonstrative pronouns contrasts with the larger anaphoric distance of demonstrative determiners. This in turn is related to the “relationship between explicitness and anaphoric distance” (Biber et al., 1999, p. 240) that further distinguishes the demonstrative forms when used pronominally as Heads or followed by a noun as Modifiers: the greater the distance between the referring expression and its antecedent, the greater the possibility of creating ambiguity. Halliday and Hasan also highlight the notion that singular demonstrative Heads (this and that) are associated with extended text reference (cf. ‘situation reference’ in Petch-Tyson, 2000) in addition to referring anaphorically to a single referential point, but “in either case the effect is cohesive” (Halliday & Hasan, 1976, p. 67).

This study focuses on the ‘near’ demonstratives this and these, which are characteristic of academic writing (Biber, 2006, p. 15), and in particular on their distinctive anaphoric uses as determiners or as pronouns in establishing connections between parts of a text. More specifically, this and these are often used in given-to-new patterns to refer back to a part or entirety of the preceding sentence (or even several previous sentences), thus contributing to textual cohesion. Phrases or clauses beginning with this/these followed by a noun (or ‘summary word’) “summarize what has already been said and pick up where the previous sentence has ended” (Swales & Feak, 2012, p. 43), as illustrated in example (1) below. Nesi and Gardner (2012, p. 110) also draw attention to the summarising role of the anaphoric demonstrative this followed by a lexical verb, such as This clearly shows that or This illustrates that, in essays across disciplines in the BAWE corpus. The following examples illustrate how this can be used to connect pieces of information and create text cohesion both as a determiner (1) and as a pronoun (2).¹

(1) When children reach school, they are required to have a metalinguistic awareness and understanding of the language. This ability demands that they think about language, its uses and the rules that govern it, in order to read and write. (LING_6045a)

(2) Without the vocabulary it has learnt during the acquisition of its first 50 words, a child could not progress to this stage, and these words could not have been acquired without the experimentation of sound production in the babbling phase. This shows that all stages in the speech development process are important and should be viewed as equally significant. (LING_6067b)

2.2.3 Attended and unattended this/these in academic writing

As previously noted, the anaphoric demonstratives this and these are important elements in given-to-new information structuring of texts, offering a convenient way of cohesively “getting out of a sentence and into another” (Swales, 2005, p. 6). They can, however, serve other functions. As pronouns (or “unattended”), they can be an economical and effective means of offering an explanation or conveying information in reference to preceding discourse of varying lengths. When supported by a noun (or “attended”), they can pinpoint the focal point of a proposition, while simultaneously allowing the writer to add clarity or interpretation of the referent (Swales & Feak, 2012, pp. 43-48; Geisler et al., 1985, p. 151). These different functions and the “long and unfinished story” (Swales, 2005, p. 4) of the attended and unattended forms of the anaphoric demonstratives can be summarised in terms of trade-offs between economy and clarity and rhetorical opportunities (Geisler et al., 1985):

¹ All examples are taken from the data used for this study and are identified by an abbreviation of the subcorpus (e.g. LING) followed by the Document ID number (e.g. 6045a) as referenced in the BAWE corpus documentation. Any typographical or grammatical errors in the original text were kept.

Out of control, the unattended this points everywhere and nowhere; under control, it is the lanuage’s [sic] routine for creating a topic out of a central prediction [sic], pointing to it, bringing it in to focus, and discussing it; all done in one stroke, gracefully, economically, and without names. (p. 153)

While Swales (2005) argues that “tacit sense of the tradeoff between economy and clarity […] probably only comes with considerable writing experience” (p. 14), the choice between clear and economical reference is particularly relevant for university students, who are expected to learn how to write clear, unambiguous texts using an ‘academic’ style of discourse that is typically compressed and inexplicit (Biber & Gray, 2010, p. 19).

Despite the potential effectiveness of both determiner and pronominal forms of the demonstratives, academic style guides and textbooks often advise against or recommend the careful use of unattended this (e.g. American Psychological Association, 2010; Glenn & Gray, 2013; Swales & Feak, 2012). The APA manual, for instance, refers to the pronominal forms this, that, these, and those as “the most troublesome” and recommends that writers “[e]liminate ambiguity by writing, for example, this test, that trial, these participants, and those reports” (2010, p. 68). In a section called “This and Summary Phrases”, Swales and Feak (2012) also recommend as best practice that the demonstrative this be followed by a noun whenever “there is a possibility your reader will not understand what this is referring to […] so that your meaning is clear” (p. 43). They add, however, in a commentary at the end of the section which was not included in previous editions of the textbook, that “there are occasions when “unattended” this (no following noun) is perfectly reasonable” (p. 48). This additional commentary, as noted by the authors, is based on the findings of a corpus-based study on advanced student academic writing (Wulff et al., 2012).

Recent research on published academic prose (Gray, 2010; Gray & Cortes, 2011; Swales, 2005) corroborates these findings by reporting on the common use of this and these as pronouns in empirical research articles across academic disciplines. One possible explanation relates to requirements on academics during the editorial and revision process to submit shorter texts, and a “simple way of doing this is to ‘de-attend’ instances of this” (Swales, 2005, p. 14). It could also be argued that this increasing awareness and use of the pronominal demonstratives (and in particular of this) in academic writing might be related to a shift in academic English towards a less explicit expression of meaning that has been found in recent studies (e.g. Biber & Gray, 2010). In contrast to the (mostly implicit) reference to elements in the situational context in spoken discourse where pronouns (including deictics) abound, academic written discourse is often claimed to be “maximally explicit in meaning” (Biber & Gray, 2010, p. 11). In writing, deictic pronouns are used for textual reference, pointing to a specific part of the text (or proposition). Failure to use referring expressions that can identify antecedents clearly could create ambiguity and cause confusion, particularly for non-experts (Biber & Gray, 2010).

The demonstratives this and these have also been associated with a perceived increase of informality in research articles, which is not as prevalent as generally thought (Hyland & Jiang, 2017; Biber & Gray, 2010, p. 17). From a list of ten features typically considered ‘informal’ and proscribed in academic writing, Hyland and Jiang (2017) found that unattended reference was one of the three main features that influenced the overall results the most. Despite their high frequency, anaphoric unattended pronouns (this, these, that, those, it) showed a declining trend over time in the four disciplines chosen to represent the hard and soft sciences (Applied Linguistics, Sociology, Engineering, and Biology). It is unclear, however, to what extent each of the anaphoric pronouns declined (or increased). Adopting the same list of ten categories of informality, Lee, Bychkovska and Maxwell (2019) compared the use of informal language in argumentative essays by native English and non-native undergraduate students using data extracted from the MICUSP and COLTE corpora of student writing.² They found that both groups frequently use features of informality, in particular anaphoric unattended pronouns, which represented over 47% and 55% of all informal features in the native and non-native student corpora, respectively. Mixed patterns were found, however, concerning the individual pronouns preferred by each group, with native students using unattended this significantly more than non-native students, while the remaining pronouns (these, that, those, it) occurred significantly more frequently in the non-native corpus. Lee et al. (2019) argue that the native students use “a broader range of informal features, particularly those that have become relatively legitimized in academic writing such as […] unattended this” (p. 152). It could be added that not only first language but also level of study/expertise is an important explanatory factor, since the native students, as senior undergraduates, are presumably more exposed to and more aware of research writing practices than non-native students from first-year writing courses. Further studies would help “to determine whether it is the L1, writing experience, reader (i.e., subject-matter or composition instructor), writing task, or a combination of these factors that affect undergraduate students’ stylistic choices” (Lee et al., 2019, p. 152).

² The Michigan Corpus of Upper-Level Student Papers (MICUSP) contains 830 A-graded, upper-level papers of different genres from 16 academic disciplines. The Corpus of Ohio Learner and Teacher English (COLTE) is a large collection of English as a second language (ESL) student writing and teacher written feedback, compiled at Ohio University.

3 Material and Methodology

This chapter presents the material used for the study and describes the quantitative research methods employed and the models of classification used in the qualitative analysis.

3.1 Data

The material for this study was extracted from the British Academic Written English (BAWE) corpus, which comprises texts in a wide range of university genres (e.g. essays, research reports, case studies) from four levels of study (first-year undergraduate to taught master’s level) across 30 main academic disciplines, totaling approximately 6.5 million words.³ The corpus includes coursework assignments produced by native and non-native English speakers from four British universities which were awarded ‘merit’ and ‘distinction’ grades, and hence the writing can be considered to meet the standards set by subject tutors (Nesi & Gardner, 2012, p. 6).

The documentation provided with the BAWE corpus includes a spreadsheet with metadata about its content (e.g. discipline, grade, first language of the author). By using the metadata filters, the corpus data was restricted to texts by first-year native English speaker students from four disciplines, one for each of the four broad disciplinary groups (Arts and Humanities, Social Sciences, Life Sciences, and Physical Sciences).⁴ The choice of discipline was based on the largest amount of data available for the first-year level of study: Linguistics (LING), Law (LAW), Biological Sciences (BIO) and Engineering (ENG). The composition of the four subcorpora that were created is shown in Table 2.

³ “The data in this study come from the British Academic Written English (BAWE) corpus, which was developed at the Universities of Warwick, Reading and Oxford Brookes under the directorship of Hilary Nesi and Sheena Gardner (formerly of the Centre for Applied Linguistics, Warwick), Paul Thompson (formerly of the Department of Applied Linguistics, Reading) and Paul Wickens (School of Education, Oxford Brookes), with funding from the ESRC (RES-000-23-0800).” (https://warwick.ac.uk/fac/soc/al/research/collections/bawe/how_to_cite_bawe). Additional information about the corpus is available at www.coventry.ac.uk/BAWE

⁴ Two main groups can be distinguished corresponding to the general distinction between ‘hard’ and ‘soft’ sciences: the natural and physical sciences (Biological Sciences and Engineering) versus the humanities and social sciences (Linguistics and Law).

Table 2. Overview of the subcorpora

Disciplinary group    Discipline            Number of words   Number of texts   Average text length
Arts and Humanities   Linguistics           40,812            25                1,632
Social Sciences       Law                   56,630            26                2,178
Life Sciences         Biological Sciences   72,492            42                1,726
Physical Sciences     Engineering           58,857            36                1,635
Total                                       228,791           129

Note: The original texts were not edited; any footnotes or reference lists are included in the word count.

As can be seen, there is a certain imbalance between the four subcorpora, with the Biological Sciences subcorpus being larger than the rest. In addition, genre could not be controlled for due to the wide variation of the assignments and the unequal distribution of data available across disciplines and genres (see Gardner and Nesi, 2013, for details on the classification of texts in the BAWE corpus into 13 ‘genre families’ according to their purpose and generic structure). Each subcorpus comprises different types of texts, but predominantly: essay in Linguistics; essay and critique in Law; methodology recount and explanation in the Biological Sciences; and methodology recount in Engineering. This closely matches the general distribution of these genres across disciplines and levels of study in the BAWE corpus (Nesi & Gardner, 2012, pp. 51-52), and means that the results of the present study could be mediated not only by discipline but also by genre. Despite these limitations, the selected data can nevertheless be considered representative of proficient writing by native English-speaking students in each of the selected disciplines. It is acknowledged, however, that the generalizability of the findings is limited by the small size of the data sets.

3.2 Method of Analysis

Corpus linguistics has been used by researchers from different fields as a methodology to study language in real-life contexts through corpora of authentic texts and the use of computer software (e.g. concordance tools). A corpus can be defined as “an electronically stored collection of samples of naturally occurring language” (Hunston, 2006, p. 234) designed for a particular purpose. Several tools and methods, such as word lists, keywords and lists of collocates, can be used to explore corpora and investigate specific linguistic features or recurrent phraseology (Nesi, 2016; Hunston, 2002). While predominantly quantitative in nature (based on frequency data), corpus-based analyses are often combined with a close examination of concordance lines that show the context in which words co-occur, allowing researchers to test hypotheses and identify patterns of language use in large amounts of data that could go unnoticed if relying only on intuition or introspection.

This study employed quantitative corpus-based methods to determine the frequencies of occurrence and the extent of variation or similarity in the use of the demonstratives this and these in the data sets detailed in the previous section. A further analysis of the context in which this and these occur as a pronoun or as a determiner was carried out to determine the main types of verbs and nouns that follow the demonstratives.

As a first step, each of the four corpora was searched for the terms this and these using the concordance tool AntConc (Anthony, 2019) and the total number of occurrences per corpus was recorded (see Section 4.1). The output of each search was saved as a text file and imported into an Excel spreadsheet for subsequent analysis and annotation of the data.
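The search step described above can be approximated outside a dedicated concordancer. The following Python sketch is an illustration only and not the procedure used in this study, which relied on AntConc: the function name concordance, the context width, the sentence-initial heuristic and the sample text (adapted from examples (1) and (6)) are all assumptions introduced here.

```python
import re

def concordance(text, terms=("this", "these"), width=40):
    """Return one dict per hit: left context, keyword, right context,
    and a rough sentence-initial flag (hypothetical helper)."""
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    hits = []
    for match in pattern.finditer(text):
        before = text[:match.start()].rstrip()
        hits.append({
            "left": text[max(0, match.start() - width):match.start()],
            "keyword": match.group(0),
            "right": text[match.end():match.end() + width],
            # Heuristic only: start of text, or preceded by ., ! or ?
            "sentence_initial": before == "" or before[-1] in ".!?",
        })
    return hits

sample = (
    "When children reach school, they are required to have a metalinguistic "
    "awareness and understanding of the language. This ability demands that "
    "they think about language. Again, these are typical signs of Wernicke's "
    "aphasia."
)

hits = concordance(sample)
for hit in hits:
    print(f"{hit['left']:>40} | {hit['keyword']} | {hit['right']}")
```

Each hit would still need to be inspected and coded manually; the sentence-initial flag in particular would misfire on abbreviations and quoted material.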

Next, the context of each concordance line was analysed to distinguish and code each this and these as a pronoun or a determiner, after which their frequencies (raw and normalised) and percentages were calculated and their distribution across disciplines determined (see Sections 4.1 and 4.4). Some instances were excluded from further analysis, including (3) below, where the demonstrative this was used for cataphoric reference. Additionally, concordance lines where this or these were part of quoted text or linguistic examples, such as (4) and (5), were also excluded.


(3) Alan Freeman in "Truth and Mystification in Legal Scholarship" says this: […] (LAW_0086d)

(4) Denning effectively overruled this, disregarding the doctrine of precedent, and stating that "beneficial interests in the matrimonial, or furniture, belongs to one or the other absolutely, or it is clear that they intended to hold it in definite shares. The court will give effect to these intentions;" striking an accommodating approach. (LAW_0209a)

(5) In a phrasal exchange we get I got into this guy with a discussion for I got into a discussion with this guy. Perhaps the speaker meant to say I got into this discussion with a guy. (LING_6010c)

The subsequent step was to identify and classify the verbs and nouns that follow this/these as pronouns or determiners, respectively, according to the models of classification described in the next section. Figure 1 illustrates the coding of this in a sample of the data.

Figure 1. Example of coding of a data sample from the Engineering corpus

In the case of verbs, these were classified only when the pronoun served as the subject of the verb, which means that, for instance, are in example (6) below was coded (verb: be; type of verb: primary), whereas are in (7) was not analysed further since its subject is the noun implications.

(6) However, this patient speaks fluently with quite long utterances and few pauses. Again, these are typical signs of Wernicke's aphasia (LING_6067f)

(7) It makes sense that this could work vice versa with words altering the meaning of pictures. The implications of this are huge. Moving away from the sphere of music videos and the potential of multi-modal texts is huge and important (LING_6018a)

With regard to nouns, the frequency counts and classification of types apply to head nouns only, where the demonstrative was part of a noun phrase with pre- or post-modification, except for compound nouns. For instance, in examples (8) and (9) below, only the head nouns module and dilution were counted and coded. In the case of coordinated noun phrases with two head nouns, only the first one was recorded. In addition, nouns that were misspelled were excluded from annotation.

(8) If there was a fire in the house then this module will receive a high value in the heat section and so call for the fire brigade. This software module would have different levels of action. (ENG_0228f)

(9) This end point dilution gives the HA titre of the virus, which was 6400 HA units/ml and there was 2.88 x 10^10 virus particles/ml. (BIO_0006b)

To make the coding of the noun types more manageable, for this part of the analysis only a subset of the data was used, namely those instances of this/these in sentence-initial position, which constitute approximately 40% of the total amount of data (see Table 4 in Section 4.1). The specific challenges related to the classification of noun types will be further discussed in the following section.

Finally, after the data coding was completed, the frequencies of the nouns and verbs that follow this/these and respective types were calculated for each corpus and for the total, after which their usage across disciplines was compared (see Sections 4.2, 4.3 and 4.5).

3.2.1 Models of classification

For pronominal uses of this/these, the first verb following the demonstrative was classified as lexical (e.g. give, explain), primary (be, have, and do), or modal (e.g. may, will), according to the three main verb classes in the Longman Grammar of Spoken and Written English (Biber et al., 1999, p. 358). This framework was employed to distinguish between the different types of verbs found in the data and to determine which types are most used with unattended instances of this/these overall and whether any disciplinary preferences can be found.
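The three-way verb classification described above can be expressed as a simple lookup. The sketch below is a hypothetical illustration rather than the coding procedure used in this study, which was carried out manually in a spreadsheet: the name verb_class and the word lists are assumptions, with the modal set limited to the nine central modal auxiliaries, and the function assumes that the word passed in has already been identified as the verb whose subject is this/these.

```python
# Inflected forms of the three primary verbs (be, have, do).
PRIMARY_FORMS = {
    "be": {"be", "am", "is", "are", "was", "were", "been", "being"},
    "have": {"have", "has", "had", "having"},
    "do": {"do", "does", "did", "doing", "done"},
}

# The nine central modal auxiliaries (assumed list for this sketch).
MODALS = {"can", "could", "may", "might", "shall", "should",
          "will", "would", "must"}

def verb_class(verb_form):
    """Classify a verb form as 'modal', 'primary' or 'lexical'."""
    word = verb_form.lower()
    if word in MODALS:
        return "modal"
    if any(word in forms for forms in PRIMARY_FORMS.values()):
        return "primary"
    # Everything else is treated as a lexical verb.
    return "lexical"

for form in ("are", "will", "shows"):
    print(form, "->", verb_class(form))
```

A lookup of this kind cannot decide whether a form such as will or can is functioning as a verb at all, so it could only support, not replace, the reading of each concordance line in context.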

The noun taxonomy in Gray’s (2010) study was initially adopted to classify the nouns modified by this/these as concrete, shell, abstract, and species/quantifier. To these, a category used by Gray and Cortes (2011), deictic nouns, was later added to account for the prevalence in the analysed data of nouns referring to the text itself or to parts of it (e.g. this essay, this paper, this graph), as could be expected in university assignments where students often refer to their own writing. In contrast, species/quantifier nouns were too infrequent in the data (less than half a percent of the total occurrences) to represent a separate category and were thus grouped under “other abstract nouns”. Table 3 below shows the four noun types adopted in this study (concrete, deictic, shell, other abstract nouns), based on a combination of the categories used by Gray (2010) and Gray and Cortes (2011).

Table 3. Definitions and examples of noun types, adapted from Gray (2010) and Gray and Cortes (2011)

Concrete nouns
  Nouns that represent physical entities or objects that can be touched, heard or seen.
  e.g. apparatus, card, kit, student, specimen

Deictic nouns
  Nouns that orient the reader by pointing to the overall text or a specific part of it, or to extralinguistic elements.
  e.g. study, article, figure, section

Shell nouns
  Abstract nouns that summarise or encompass preceding information and carry it into the next clause or sentence.
  e.g. method, result, model, finding, issue, analysis

Other abstract nouns
  Nouns that refer to concepts or ideas that cannot be measured or observed.
  e.g. requirement, value, level, range, type, kind

The above definitions of each category are useful to a certain extent; however, it became clear early on in the coding that there is no straightforward one-to-one correspondence between form and meaning (see examples (16) to (18) in Section 4.5). Shell nouns in particular (also referred to in the literature as ‘general nouns’, ‘anaphoric nouns’, ‘carrier nouns’ or ‘signalling nouns’) could only be coded as such if they performed the function of “encapsulation of meanings expressed in prior discourse” (Gray & Cortes, 2011, p. 34). Distinguishing between shell and other abstract nouns thus proved to be the most time-consuming and complex part of the classification, which mostly rested on the interpretation of the specific meaning of each noun based on its use in context. This contrasts with Gray’s operationalisation of shell nouns as those that had been identified in previous research, which could have had some influence on the results reported (2010, p. 181). Given the overlap between these noun types and the degree of subjectivity involved in their analysis, it remains to be determined whether such distinction can be objectively and effectively operationalised.

4 Results and Discussion

This section presents the main findings of this study as follows: in Section 4.1 the overall frequencies of this/these as determiners and pronouns are analysed; the nouns, verbs, and respective types that follow the demonstratives are then identified in Sections 4.2 and 4.3; finally, Sections 4.4 and 4.5 compare the distribution across academic disciplines of pronominal and determiner uses of this/these, as well as of the most frequent nouns, verbs and types of nouns and verbs.

4.1 Overall Frequencies of Attended and Unattended this/these

A total of 2,547 instances of this (2,046) and these (501) were found in the corpora of student writing analysed in this study, as can be seen in Table 4 below. The singular form is much more frequent, occurring on average nearly nine times per thousand words in the data, as might be expected given the predominant reference to abstract entities and concepts in academic prose. This contrasts with the average of six times per thousand words reported for published research articles (Swales, 2005, p. 1). The number of texts further reinforces this difference between the singular and plural forms: this is present at least once in each of the texts that comprise the corpus, whereas these can be found in fewer texts. In contrast, the two forms are similar in terms of the position they take in the sentence, with both this and these occurring more often in a medial or final position in the sentence than sentence-initially.

Table 4. Number of occurrences of this and these (percentages rounded to the nearest whole number)

Demonstrative   Raw frequency   Frequency per 1,000 words   Number of texts   Sentence-initial (%)
this            2,046           8.94                        129/129           870 (43)
these           501             2.19                        116/129           174 (35)
Total           2,547                                                         1,044 (41)
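The normalised frequencies and sentence-initial percentages in Table 4 follow directly from the raw counts and the total corpus size given in Table 2. As a minimal sketch of that arithmetic (the variable and function names are assumptions introduced here, and the figures are those reported in the two tables):

```python
CORPUS_WORDS = 228_791  # total words across the four subcorpora (Table 2)

# form -> (raw frequency, sentence-initial occurrences), from Table 4
COUNTS = {"this": (2046, 870), "these": (501, 174)}

def per_thousand(raw, total_words):
    """Normalised frequency: occurrences per 1,000 words."""
    return raw / total_words * 1000

def percentage(part, whole):
    """Share of `part` in `whole`, rounded to the nearest whole number."""
    return round(part / whole * 100)

summary = {
    form: (round(per_thousand(raw, CORPUS_WORDS), 2), percentage(initial, raw))
    for form, (raw, initial) in COUNTS.items()
}
print(summary)
```

Normalising per 1,000 words is what makes the figures comparable with frequencies reported for corpora of different sizes, such as the rate for published research articles cited above.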

A further distinction between this and these is found in their use as determiners or pronouns. The figures in Table 5 show that this is employed fairly evenly as determiner and as pronoun, which contrasts with the overwhelming use of these followed by a noun in the data.

Table 5. Frequencies of this/these as determiners and pronouns (percentages rounded to the nearest whole number)

Demonstrative   Determiner   %    Pronoun   %
this            1,063        52   970       47
these           390          78   108       21
Total           1,453             1,078

Similarly to other studies of student academic writing (Wulff et al., 2012; Petch-Tyson, 2000; Römer & Wulff, 2010), these results show that first-year undergraduates are not observing the prescriptive guidelines that generally advise against leaving a demonstrative unattended. They also indicate that student writers use pronominal this more frequently than research article writers (e.g. Gray, 2010; Gray & Cortes, 2011). On the other hand, the results for this differ somewhat from those reported previously by Wulff et al. (2012) and Petch-Tyson (2000), even though no direct comparison can be made due to differences in operationalisation.

Wulff et al. (2012) identified more cases of attended than unattended this (57% versus 43%) in papers by advanced-level students (MICUSP corpus), but they counted only those instances in sentence-initial position rather than in all positions in the sentence. A much higher percentage of demonstrative determiners (64%) than of pronouns (36%) was also reported by Petch-Tyson (2000), who compared the use of both this/these and that/those by American university students (a subset of the LOCNESS corpus) and English as a foreign language (EFL) students.⁵

It could be argued that the educational context (North American in the case of the above-mentioned studies, or British in the case of the material used for the present study) might play a role in the degree to which novice student writers use this/these as determiners or pronouns. More exposure in North American contexts to style manuals, writing handbooks and textbooks (e.g. American Psychological Association, 2010; Glenn & Gray, 2013; Swales & Feak, 2012) that offer explicit guidance on what constitutes ‘good’ academic writing could then provide at least part of the explanation for the differences observed in the frequencies of use of attended versus unattended this/these in this study. The high percentage of unattended this could also be the result of a more limited range of academic vocabulary or fewer lexical options at the disposal of first-year undergraduates when compared to more advanced-level students, which could potentially lead first-year students to resort to somewhat less complex structures when constructing their arguments. This possible explanation might be further supported by analysing the most frequent nominal and verbal structures following this and these, which will be reported on in the next sections.

5 The Louvain Corpus of Native English Essays (LOCNESS) comprises general argumentative essays written by American and British students.

4.2 Most Frequent Nouns and Noun Types Following this/these

Overall, 1,423 head nouns (lemmas) were identified in the data (the complete list is provided in Appendix 1), a substantial number of which occur only once (272 nouns).6 Table 6 shows the most common nouns (ten or more instances) that follow this and these combined, which account for approximately one third of the total number of occurrences. The majority are methodology- or results-related nouns (e.g. method, process, result) and nouns that relate to student activity or type of task (e.g. essay, report), but nouns with general meaning are also found (e.g. case, point, way), which take on specific meanings in context. The latter include nouns that occur in specific prepositional phrases with anaphoric function, such as “in this case” and “in this way”, which can be considered semi-fixed expressions and occur relatively frequently in the data (for example, there are 36 instances of in this case). It is also interesting to note that as few as seven nouns occur 20 or more times: case, experiment, essay, point, way, theory and value. The specificity of this short list confirms the main uses of the demonstratives to refer not only to elements of the real world, but also to discourse-level elements.

6 The results reported on throughout Section 4 refer to lemmas for both nouns and verbs.


These findings are generally consistent with those of other studies on (un)attended this both in student and professional academic writing. Just over a third of the nouns in Table 6 appear in the top 25 head nouns attending this in MICUSP papers, as reported by Römer and Wulff (2010). A number of those are also part of the top 50 attendant nouns that Swales (2005) found to be most frequent in a corpus of research articles representing ten different disciplines.

Both studies also point out the prominence in their data of metadiscoursal nouns, as well as nouns referring to methodology, which seem to be not only frequent but also common to several disciplines.

Table 6. The most frequent nouns following this/these (cut-off frequency of 10)

Noun            Frequency
case            56
experiment      47
essay           37
point           29
way             25
theory          24
value           20
method          17
model           17
process         17
laboratory      16
result          16
data            14
investigation   14
reaction        14
report          14
approach        13
organism        13
stage           13
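The observation that only seven nouns occur 20 or more times can be checked against the rows of Table 6 listed here. This is a small illustrative sketch; the dictionary name `table6` and the threshold variable are assumptions introduced for the example, and only the rows shown above are included.

```python
# Noun frequencies copied from the rows of Table 6 shown above.
# The names table6 and threshold are illustrative assumptions.
table6 = {
    "case": 56, "experiment": 47, "essay": 37, "point": 29, "way": 25,
    "theory": 24, "value": 20, "method": 17, "model": 17, "process": 17,
    "laboratory": 16, "result": 16, "data": 14, "investigation": 14,
    "reaction": 14, "report": 14, "approach": 13, "organism": 13, "stage": 13,
}

threshold = 20

# Nouns at or above the threshold, in table order.
frequent = [noun for noun, freq in table6.items() if freq >= threshold]

print(len(frequent), frequent)
```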
