
Örebro University

Department of Humanities, Education and Social Sciences
English

Assessing Linguistic Proficiency

The Issue of Syntactic Complexity

Author: Patrik Rönnkvist
Degree Project Essay, Spring 2021
Supervisor: Dr. Hayo


Abstract

This study investigates the syntactic complexity of the example texts used as guides for assessment in the national tests of the Swedish upper secondary school courses English 5 and English 6. It is guided by two research questions: (1) Is there a progression of increased complexity between the grades assigned to the example texts, and, if so, is any specific measure of syntactic complexity more strongly linked to a higher grade than the rest? (2) Is there a progression of increased complexity between the two courses, and, if so, how does this progression manifest itself? A set of 14 quantitative measures of syntactic complexity as identified by the L2 Syntactic Complexity Analyzer (L2SCA) are examined to answer these questions. The majority of the differences between the grades and/or courses represented are shown to be statistically insignificant, and the few instances of statistical significance likely occurred either due to a small sample size or due to a questionable tendency of L2SCA when dealing with run-on sentences. In the end, syntactic complexity as expressed through the 14 measures seems to be a poor indicator of why a text received a certain grade in either of the two represented courses.

Keywords: linguistic proficiency, syntactic complexity, national tests, nationella prov, Swedish upper secondary school, L2 Syntactic Complexity Analyzer.


1. Introduction
2. Research questions
3. Theoretical background
3.1 Defining complexity
3.2 Defining syntactic complexity
3.3 Developmental trajectories
4. Material and method
4.1 Material
4.2 Method
4.2.1 Human rater or computer program
4.2.2 Measures of syntactic complexity
4.2.3 Statistical analysis
5. Results
6. Discussion
7. Conclusion
References
Appendix


1. Introduction

For the annual national tests in the English subject, the Swedish National Agency for Education (henceforth, Skolverket) provides example texts written by students for previous tests as guides for teachers’ marking. The texts have all received different grades (from low to high: E, C, and A) and holistically represent the level of linguistic output indicative of said grade.¹

The national tests in general are supposed to facilitate equitable marking amongst teachers so that the same level of knowledge is assessed the same no matter which school in the country the student attends (see Bonnevier, Borgström & Yassin Falk, 2017; Lundahl, 2009). This makes the example texts a vital normative instrument by highlighting to all the teachers in the country what level of output is required for the different grades. What, then, do they say about the level of syntactic complexity best indicated by each grade?

There is “a strong link between the (syntactic) complexity of learners’ [second language (L2)] production and their overall level of L2 development and/or L2 proficiency” (Bulté & Housen, 2018, p. 148), so more syntactically complex texts should receive higher grades than less complex ones. We have yet to verify this, however. By examining the example texts, we aim to get a clearer picture of (1) what is syntactically expected of the students at different grading levels, (2) whether teachers are exposed to such complexity as an overall indicator of (higher) proficiency, and (3) whether certain kinds of complexity might be more indicative of a higher grade than others. Not only would this help us understand how teachers are

(implicitly) being led to consider syntactic complexity, but it could also help teachers decide whether to spend more lesson time on strengthening students’ syntactic complexity, either in general or by targeting specific measures of complexity (e.g. clausal coordination as a means of enriching arguments), in order to strengthen students’ chances of receiving a higher (or passing) grade.

The example texts from the national tests for upper secondary school are relevant research material for this. This stage includes three English courses - English 5, 6, & 7 - but the national tests are only held for the first two. Students must also complete these two courses to be eligible for university studies, whereas the advanced English 7 is not required. This means that the example texts for the national tests in English 5 and English 6 can be seen as representative of the base level and expected/desired progression of English skills required for higher education, i.e. what skill level is needed to pass English 6 - which

acts as a proxy of the skills required for higher education - and what progression in writing ability is expected/desired between the two courses. So how does this pertain to syntactic complexity? As the link between it and “overall level of L2 development and/or L2 proficiency” is strong (Bulté & Housen, 2018), syntactic complexity acts as a quantifiable measure of the skill level represented by each grade in the two courses. If we then find a link between syntactic complexity and the progression between both/either the courses and/or the grades, this could act as a signal to teachers to spend some (more) of their lesson time on developing/strengthening students’ syntactic skills. If, on the other hand, the link is weak or not apparent, that could help encourage teachers to consider spending their lesson time on other aspects of language teaching.

¹ The teachers do receive more than just the holistic score to guide their assessment, but this is

Under the heading of “Ämnets syfte” ‘Aim of the subject’² in the English subject syllabus,

it is stated that the students should “ges möjlighet att utveckla [… en] förmåga att uttrycka sig med variation och komplexitet” ‘be given the opportunity to develop [… an] ability to express themselves with variation and complexity’ (Skolverket, n.d.i, my emphasis). The term

complexity here is vague and not really defined³; even when it is mentioned in the

commentary material (Skolverket, n.d.j), the material designed to address and explain

complex terminology in the syllabus, it is never connected to students’ linguistic output. This leaves the term open to interpretation, but syntactic complexity should certainly not be excluded from any such interpretation. How words are strung together is, after all, one of the

foundations of language, and stringing them together in ever more intricate configurations is how you create precision and are able to communicate ever more complex ideas; that is, increasingly complex syntax allows for and enables the communication of more complex ideas. How, then, is the idea of complexity expressed further along in the syllabus?

Whenever the term complexity is used in relation to the students’ linguistic output in any of the courses, it never refers to the output itself, but rather highlights the students’ ability to maneuver themselves, both orally and in writing, “i formella och komplexa sammanhang” ‘in formal and complex contexts’ (Skolverket, n.d.i, my emphasis). This absence, this lack of any explicit mention of complexity with regard to the output itself,⁴ is important to note since it

likely affects the teaching; if the syllabus never explicitly connects complexity to the students’ output itself, teachers might not spend any lesson time focusing on it. Since

complexity is not highlighted as something teachers are required to touch upon in the courses, one might think this undermines the need to research syntactic complexity in the example texts from the national tests. However, it is just the opposite. Although complexity is not afforded any explicit attention in the courses, different aspects of it influence other terms that are. In English 5, the students should work through their texts to, among other things, “variera […] och precisera” ‘vary […] and specify’ (Skolverket, n.d.i). To specify entails creating precision, and, as stated earlier, increasingly complex syntax allows for and enables more precise writing that can carry more complex ideas. The importance of variation in your writing is highlighted even further in the knowledge requirements in both English 5 & 6, in that students need to phrase themselves “relativt varierat” ‘with some variation’ or “varierat” ‘with variation’ (Skolverket, n.d.i). The commentary material explains the term as, among other things, being about “språkstrukturer och förmågan att använda dessa i […] skrift” ‘language structures and the ability to use these in […] writing,’ of which a possible purpose is to “undvika att språket i en framställning blir monotont” ‘avoid making the language in a product monotonous’ (Skolverket, n.d.j). Syntax is, of course, part of language structures, and putting forth your ideas through sentences with differing degrees of syntactic complexity is one way of achieving variation in your writing. It is then interesting to see just how the results of teaching syntactic complexity through these other terms show up in the example texts and whether the received grades highlight any specific measure of syntactic complexity as more desirable.

² All translations of quotations from the national curriculum, syllabus or commentary material - for

³ This is, sadly enough, a portent of things to come.

⁴ This is addressed in the revised edition of the syllabus (SKOLFS 2010:261), applicable from 1 July 2021, though only for English 7, where the national test is not taken. I will return to this later.

The structure of this essay is as follows: Section 2 states the research questions guiding our study. Section 3 presents our theoretical background, first by covering the issue of defining complexity, followed by the issue of defining syntactic complexity. We also discuss systemic functional linguistics, which posits a specific developmental trajectory for syntactic complexity. The previous research in the field is discussed alongside all of these. Section 4 covers the material and the method by which it was analyzed. Section 5 covers the results of our study, which are then discussed in section 6. Finally, section 7 concludes and summarizes our essay, while also suggesting directions for future research.

2. Research questions

This study will focus on the syntactic complexity of the example texts for the national tests in Swedish upper secondary school and sets out to answer the following research questions:


1. Is there a progression of increased complexity between the grades assigned to the example texts, and, if so, is any specific measure of syntactic complexity more strongly linked to a higher grade than the rest?

2. Is there a progression of increased complexity between the two courses, and, if so, how does this progression manifest itself?

3. Theoretical background

3.1 Defining complexity

Before we can discuss syntactic complexity, we must first consider the term complexity itself. It was mentioned in the introduction that the term is vague and not really defined in the English subject syllabus, and that, sadly, is fairly reflective of the research, where the term lacks a “commonly accepted definition” (Bulté & Housen, 2012). Many studies,

especially the older ones, either do not define what they mean by complexity or they “do so in general, vague or even circular terms” (Bulté & Housen, 2012, see example on p. 22). This means that “complexity has been used mainly in an intuitive manner and more time has been spent on developing new measures of language complexity than on thinking about what complexity in language actually entails” (Bulté & Housen, 2012), which has led to a difficulty in comparing “the results of individual studies” (Bulté & Housen, 2014; see also Norris & Ortega, 2009; Pallotti, 2009). Recently, however, attempts have been made to address this issue (e.g. Bulté & Housen, 2012; Norris & Ortega, 2009; Pallotti, 2009, 2015), specifically by highlighting previous assumptions, creating a taxonomy to show the levels which can be/have been studied, and/or discussing how the facets of complexity can/should be operationalized. A common denominator amongst these is to view complexity as “a highly complex construct, consisting of several sub-constructs, dimensions, levels, and components” (Bulté & Housen, 2014), i.e. as a multifaceted construct that can show itself in many different ways.⁵ This leads us to the first distinction we can make: that between absolute and relative

complexity. Absolute complexity refers to “inherent properties of linguistic units and/or systems thereof” (Bulté & Housen, 2014), which can be seen in “quantitative terms as the number of discrete components that a language feature or a language system consists of, and as the number of connections between the different components” (Bulté & Housen, 2012, emphasis in original). A study focusing on absolute complexity could then, for example, be


interested in what kinds of clauses a text contains, how many there are, and how they are connected. Relative complexity, on the other hand, is a lot broader, because it also looks at the language user. Basically, here “a language feature or system of features is seen as complex if it is somehow costly or taxing for language users and learners,” especially with regards to “the mental effort or resources that they have to invest in processing or

internalizing the feature(s)” (Bulté & Housen, 2012). This makes relative complexity highly subjective, since different learners will struggle with different features (Bulté & Housen, 2012), and here a study might focus more on learners’ mental processes. Given that we are looking at anonymous example texts in this essay, absolute complexity is more relevant to us.

Bulté & Housen (2012) further divide absolute complexity into three components:

propositional, discourse-interactional, and linguistic complexity. Of these, the most attention has been afforded linguistic complexity (Bulté & Housen, 2012, 2014), which will be the focus of this essay. Linguistic complexity can, in turn, focus on either (1) the language system as a whole (or its major subsystems) – i.e. “the number, range, variety or diversity of different structures and items that [the learner] knows or uses” (Bulté & Housen, 2012) – known as system complexity, or (2) the individual linguistic features making up such (sub)systems, known as structure complexity (Bulté & Housen, 2012, 2014). Further divisions can be made (see Bulté & Housen, 2012), but at this point it would not help our analysis so we will stop there. As distinguishable as all of the facets of complexity thus far discussed may appear, they can be a lot harder to differentiate outside the theoretical domain (Bulté & Housen, 2012). What can be said, though, is that when we are looking at one specific text and all the measures of syntactic complexity found therein, we are investigating the system complexity of the text. Conversely, when we are looking at one specific measure either within one or across several texts, we are looking at structure complexity. Finally, all the facets of

complexity thus far discussed “can be studied across various domains of language such as the lexicon, syntax, and morphology” (Bulté & Housen, 2014), of which syntax obviously

interests us the most.

In summary, complexity in this essay refers to absolute complexity – the occurrences of specific components of language – seen through the linguistic complexity of several texts. The specific components are discussed in section 4.2.2, but first, we must define what we mean by syntactic complexity.


3.2 Defining syntactic complexity

What is syntactic complexity? In general terms, it refers to “the variety and degree of

sophistication of the syntactic structures deployed in written production” (Lu, 2017), but what does that mean for our study? What does it mean to say that one text is more syntactically complex than another? This straightforward question does not have as straightforward an answer, though, because, just as with the term complexity, there is “a lack of agreement about the specific meaning of syntactic complexity,” both with regard to how it is conceptualized and how it is operationalized (Lu, 2017).

Bulté & Housen (2014) helpfully identify five related assumptions underpinning the current measures used in linguistic complexity research. (1) “[M]ore is more complex” (Bulté & Housen, 2014). Regarding syntax, this could mean that more occurrences of certain

linguistic units equate to more complexity, and these are usually measured in ratios. For example, when looking at the first and last essays of 45 randomly selected learners in a semester-long English for Academic Purposes course, Bulté & Housen (2014) found a significant increase in, among other measures, the use of compound sentences and the number of coordinated clauses per sentence. As the occurrences increased, the texts were seen as more syntactically complex. Interestingly, however, these two measures had little to no significant impact on the perception of the students’ overall writing quality, indicating that a more complex text does not necessarily equate to a better perceived text, which is a

sentiment echoed in Ortega (2003). At the same time, Bulté & Housen (2014) did find a significant decrease in the number of simple sentences, which did coincide with an increase in perceived writing quality. This seems to indicate that the interaction of different measures in relation to perceived writing quality is not quite so simple, which Biber, Gray & Poonpon (2011) actually emphasize quite a bit. Looking at what grammatical features “are most strongly characteristic of advanced academic writing” they found that these differed quite a bit “from the complexity common in conversation,” leading them to state that “complexity is not a single unified construct, and it is therefore not reasonable to suppose that any single measure will adequately represent this construct” (Biber, et al., 2011). Particularly, they were referring to the T-unit,⁶ seemingly the most popular unit of measurement in other studies (see

Ortega, 2003), but what they say does stress the importance of keeping a broad perspective when choosing measures for identifying (syntactic) complexity, which this essay also did (see section 4.2.2).

⁶ Discussed further in section 4.2.2, but, briefly, a T-unit consists of an independent clause and all its dependent clauses.


(2) “[L]onger linguistic units are more complex” (Bulté & Housen, 2014). Just as

straightforward as it seems, it is assumed that the longer a phrase, clause, sentence, etc. (see Figure 1 for the linguistic units targeted) is, the more complex it is. This assumption

underpins next to all studies researching syntactic complexity, but it is best exemplified by those focusing on linguistic development. Bulté & Housen (2014) found that their students on average produced sentences around 1.4 words longer and T-units 1 word longer at the end of a course. In a later study (Bulté & Housen, 2018), they collected data from 11 writing tasks written by 10 beginning learners of English over a 19-month period and found “a clear

upward trend” of longer T-units and noun phrases, but the mean length of finite clauses showed no clear pattern. Crossley & McNamara (2014) analyzed the writing samples of 57 university students collected at the beginning, middle and end of a semester and found an increase in, among other measures, the number of words used before the main verb of a sentence. Interestingly, although that allowed them to call the later texts more syntactically complex in that regard, the same measure did not correlate to perceived writing quality. Broadening our perspective a little, Lahuerta Martínez (2018) examined the writings of 188 students from two adjacent years in the Spanish secondary education system, and found that learners from the grade above on average produced longer sentences (grade 3: 12.16 words; grade 4: 17.93 words), while girls also wrote longer sentences than boys (girls: 15.76 words; boys: 13.32 words). In fact, looking at all the measures Lahuerta Martínez (2018) used, students from the grade above significantly outperformed students from the grade below on virtually all measures, while the girls similarly outperformed the boys, though not quite to the same statistical extent. Lu & Ai (2015) and Ai & Lu (2013) have both shown a discrepancy regarding the length of the linguistic units produced by non-native and native speakers, where the latter produced longer clauses in Lu & Ai (2015) and longer clauses, T-units, and

sentences in Ai & Lu (2013). Ai & Lu (2013) also showed that the length of the linguistic units produced by learners at a higher proficiency better approximated the length produced by native speakers than did the units produced by learners at a lower proficiency.


(3) “[M]ore and/or more deeply embedded is more complex” (Bulté & Housen, 2014). For our purposes, this is best expressed through subordination. Bulté & Housen (2014) found no statistically significant changes in their learners’ development when looking at these

measures, but as we shall see in the next section, this might not actually be surprising.

Lahuerta Martínez (2018) found the more senior students to be producing more subordination than the younger students, and Bulté & Housen (2018) found their students to be producing an overall higher subclause ratio by the end of their 19-month study. Similarly positive increases in subordination were found in, for example, Mazgutova & Kormos (2015),

Rosmawati (2014) and Schenker (2016). On the other hand, Knoch, Rouhshad, Oon & Storch (2015) found no such significant increase while studying 31 non-native speakers at an

English medium university, and Lu (2011) found a negative correlation between

subordination measures and his learners’ language development. As with Bulté & Housen (2014), though, this will be discussed more in depth in the next section.

(4) “[M]ore varied or diverse is more complex” (Bulté & Housen, 2014). For syntax, this would mean that the less similar the syntactic units of a text are to each other, the more complex the text is. Crossley & McNamara (2014) used a measure specifically targeting this – syntactic similarity score – and found that their 57 students overall produced less

syntactically similar sentences by the end of the semester, indicating that the measure was a sign of their development, but they also found that the same measure had no significant correlation to human judgment of writing quality. McNamara, Crossley & McCarthy (2010) likewise found no correlation between said measure and perceived writing quality.

Mazgutova & Kormos (2015), interestingly, found their students to be writing more

syntactically similar sentences by the end of their study. Other than these studies and others using the same program to identify the measure, Coh-Metrix, this fourth assumption is

actually quite difficult to discuss with regards to syntax. This is largely due to the tendency of early studies to focus on a rather small number of syntactic complexity measures in general. For example, in Ortega’s (2003) research synthesis of 25 studies, she found that only four studies examined four or more measures, with the majority only focusing on one (most often: mean length of T-unit). This is understandable to some extent, given that for a long time no computer programs existed to aid in the identification of measures and that manual identification is labor-intensive - both factors which also contributed to a tendency to use small sample sizes, which further complicates the studies’ generalizability (Ortega, 2003) - but it does make it very difficult to discuss, among other things, this fourth assumption. In fact, even though more recent studies (e.g. Bulté & Housen, 2018; Lahuerta


Martínez, 2018; Lu, 2011) have started using more measures, syntactic variety/diversity is very rarely touched upon, so we cannot really keep exploring the fourth assumption at this juncture.

(5) “[M]ore marked, infrequent, sophisticated, semantically abstract, costly, cognitively difficult or later acquired features are more complex” (Bulté & Housen, 2014). The last part of this assumption is foundational for our next section, but the other parts merit a brief discussion. When Crossley & McNamara (2014) utilized the number of words before the main verb of a sentence as a measure, they argued for its inclusion “under the hypothesis that the main verb controls the arguments in the sentence and the longer it takes to access the main verb, the more complex the sentence is.” As we can see, a longer time to access the main verb is thereby linked to an increase in cognitive difficulty, which is what makes it more complex. At the same time, this highlights the fact that the assumptions are very much interconnected, since the cognitive difficulty (complexity as seen through assumption 5) of said measure increased as the unit measured became longer

(complexity as seen through assumption 2). A similar link can be drawn when Ai & Lu (2013) and Lu & Ai (2015) talk about a higher degree of phrasal sophistication in texts by native speakers as opposed to those by non-native speakers: an increase in phrasal

sophistication (assumption 5) means an increase in the occurrences (assumption 1) of complex nominals.

Finally, let us return to the beginning of this section. If syntactic complexity refers to “the variety and degree of sophistication of the syntactic structures deployed in written

production” (Lu, 2017), what does that mean for our study? It means that ratios of selected measures – targeting, e.g., subordination and phrasal sophistication (see section 4.2.2) – and the lengths of different linguistic units are what determine a text’s syntactic complexity.

3.3 Developmental trajectories

One of the problems of earlier studies’ usage of few complexity measures is that this did not allow them to measure syntactic complexity multidimensionally. This is problematic since syntactic L2 proficiency and, especially, syntactic L2 development cannot be fully captured unless multiple dimensions are examined (Norris & Ortega, 2009). Norris & Ortega (2009) argue for this need through the theoretical lens of systemic functional linguistics, which posits that linguistic development follows a specific trajectory. The basic idea is that development moves from “the expression of ideas first by means of mostly” coordination,


through “an expansion by which hypotaxis (i.e. subordination) is added as a resource to express the logical connection of ideas via grammatically intricate texts,” and finally arrives at “the emergence of and reliance on grammatical metaphor (achieved through

nominalization, among other processes)” (Norris & Ortega, 2009). The developmental trajectory then moves from a global level (e.g. linking sentences and clauses through coordination), to a slightly more local level (e.g. using subordinate clauses to create a bit more precision), to finally a very local level (e.g. creating even stronger precision through longer noun phrases) (for an in-depth exploration of how phrasal complexity might develop, see Biber, et al., 2011; Parkinson & Musgrave, 2014). Interestingly, a trade-off is expected to happen as one moves through these, meaning that the language produced by the end should be “advanced language that actually exhibits lower levels of subordination or grammatical intricacy but much higher levels of lexical density and more complex phrases (as opposed to more clauses)” (Norris & Ortega, 2009; see also Ortega, 2003). This trade-off also signals why it is so important to measure multidimensionally when studying syntactic complexity: a decrease in complexity through one measure might actually be a sign of positive development at a certain level. If one only focuses on, say, subordination, one can miss the complexity that is developing/has been developed at the phrasal level. Norris & Ortega (2009) therefore advise researchers to “at a minimum[, …] measure global or general complexity, complexity by subordination, and complexity via phrasal elaboration, as well as possibly coordination if early proficiency data are also included.” We need to note, though, that this proposed

trajectory should not be assumed to be linear for every individual learner. As Bulté & Housen (2018) show, different students’ syntactic development can vary quite a bit (see also

Larsen-Freeman, 2006).

After the publication of Norris & Ortega (2009), it has seemingly become common practice to heed their advice, leading to interesting, but mixed, results. Lu (2011) strengthens the theory, since his more advanced learners produced less subordination but more phrasal elaboration than the less advanced ones, and Taguchi, Crawford & Zawodny Wetzel (2013) also lend some credence to it. Bulté & Housen (2014), on the other hand, found no

significant change in their students’ usage of subordination, but did find an increase in the mean length of their noun phrases. Mazgutova & Kormos (2015) found no discrimination between subordination and phrasal complexity; instead, their less proficient students

increased on both levels. Nevertheless, with regard to research question 2 of this study, it will be interesting to see whether the trajectory is somehow represented in the comparison between the example texts from English 5 and those from English 6. No strong conclusions


should be drawn as to the validity of the proposed trajectory, though, as we shall explain in the following section.

4. Material and method

4.1 Material

In any one given year, teachers receive one example text representing each of the different grades (E, C, & A), in each of the courses taking part in the national tests, as a guide for their marking. As a large portion of the tests is still bound by secrecy (in case Skolverket wants to reuse any assignments), they could not be analyzed for this essay. Each year not bound by secrecy could of course have been identified and the example texts (possibly) tracked down, yet that would have been outside of the feasible time frame of this essay. Skolverket does, however, link to some older tests and example texts through their website (Skolverket, 2020). The courses English 5 & English 6 are here both represented by three texts for each of the three grades, giving us a total of 18 texts. As mentioned above, though, one of the problems with early studies around syntactic complexity was their tendency to use small sample sizes (Ortega, 2003). Since 18 texts certainly counts as a contextually small sample size, this is an indicator of how careful we need to be when drawing conclusions. We will not get a clear overall picture of Skolverket’s underlying view of syntactic complexity, so no broad statements regarding it can be made, yet our results could at least hint at said view. At the same time, 18 texts is at least three times the number of example texts received by teachers in any given year (assuming the teacher actually teaches both courses), so our conclusions will at the very least have a stronger foundation than any one given year would allow for.

The texts representing English 5 (student age: 16-17) were all written around the same topic, with the instructions leaving it open for the students to write around said topic in a variety of ways (Skolverket, n.d.a). One text per assigned grade has also been supplied with a detailed comment in which the assessment is motivated. Such comments are of course helpful when convincing teachers why a text deserves a certain grade, yet since our research focuses on the example texts we will not be analyzing these. If the comments somehow relate to the results of our quantitative analysis, it will be mentioned, but nothing more.

The texts representing English 6 (student age: 17-18) were all written around a different topic from English 5, and this time the students were given the choice of either writing an


investigative or an argumentative text (Skolverket, n.d.h). This means that one investigative and one argumentative text is represented at each grading level alongside one extra text (E & C: investigative; A: argumentative). The extra text is the only one that has not been supplied with a comment, and as with English 5, these comments will only be mentioned when

relevant. Now, recent studies have indicated that “both topic and genre can have a significant impact on complexity scores” (Bulté & Housen, 2018; e.g. Beers & Nagy, 2009; Polio & Yoon, 2018; Yang, Lu & Weigle, 2015; Yoon & Polio, 2017). Any difference between or within the courses might then be influenced by this. However, given the limited dataset we are working with, such differences could also be due to the individual ability of the writers, meaning that no grand conclusions should be drawn with regard to the impact of topic/genre differences. Apart from the comparison between the courses that will already be made, a comparison between the investigative and the argumentative texts in English 6 should not really be made, given both that the investigative texts make up a larger portion of the exhibits and that the argumentative texts are better represented at the highest grade; any comparison therein should be called into serious question. For the sake of transparency, though, the results of the analysis of each individual text can be found in the Appendix, where the investigative and argumentative texts are clearly marked as either “Discuss” or “Argue.”

4.2 Method

4.2.1 Human rater or computer program

The first decision that had to be made regarding our methodological approach was how to identify the measures of syntactic complexity in the example texts. The single most common approach in the literature was to have human raters go through the texts manually, which provided several advantages, not least access to a wider range of syntactic complexity measures to choose from as deemed appropriate to the study, instead of being limited by what a computer program would allow. This approach could, sadly, not be taken for this essay. As Crossley & McNamara (2014) point out, “human raters are prone to subjectivity and require training, time to score, and monitoring,” making it both a temporally demanding task and one in which subjective interpretations, or even errors, should be discussed and ironed out. All of the reviewed studies which utilized human raters to identify the measures (e.g. Bulté & Housen, 2014; Bulté & Housen, 2018; Lahuerta Martínez,


2018) had the advantage of more than one rater. This allowed the researchers to discuss differing interpretations and come to an informed conclusion regarding how those differences should be interpreted, e.g. if a particular run-on sentence should be counted as one or two T-units (explained below in 4.2.2). Unfortunately, this option was unavailable to me. Performing the identification alone, then, would have meant both relying on only one

subjective interpretation and leaving the process more vulnerable to human error, which was deemed too great a methodological weakness. A computer program was therefore used instead.

As far as this author is aware, two programs for analyzing syntactic complexity, whose validity has been tested a number of times, are available to the public. Coh-Metrix seems to be more popular, and its validity has thus far stood up to scrutiny, yet information regarding exactly how it performs its calculations is not readily available, and the measures it uses have rarely been used by studies not using the program (Polio & Yoon, 2018; see also Lu, 2017). The L2 Syntactic Complexity Analyzer (L2SCA) provided a better alternative. Developed by Lu (2010), the program was later made more accessible through a web-based interface by Ai (n.d.), which is the interface used here. The program uses 14 measures to calculate syntactic complexity, all of which were used in this essay and all of which had either already been shown to have some “correlation with or effect for proficiency” or been recommended by other researchers as worth pursuing further (Lu, 2010; i.e. Wolfe-Quintero, Inagaki & Kim, 1998). Each measure gauges one of five dimensions of syntactic complexity: length of production unit, (overall) sentence complexity, (amount of) subordination, (amount of) coordination, and particular structures, i.e. degree of phrasal sophistication (Ai & Lu, 2013; Lu, 2010, 2011; Lu & Ai, 2015). Together, they cover all levels of syntactic complexity that Norris & Ortega (2009) advise should be represented when studying the topic. We will go over the measures in a bit more detail soon, but we must first address the reliability of the instrument.

Lu (2010) tested the reliability of L2SCA by comparing its results to those of two trained human raters when analyzing 10 texts and found that they strongly correlated. Yoon & Polio (2017) performed a similar test for 12 of the 14 measures (the reason for excluding clauses per sentence and complex T-units per T-unit will be touched upon when said measures are discussed below), but this time on 30 texts, and found a reliability score of 0.81 on all but one measure (T-unit per sentence = 0.74). Polio & Yoon (2018) then replicated that study with 30 different texts and once again found a high degree of reliability where only one measure was below 0.8 (T-unit per sentence = 0.745). The results thus seem to indicate that L2SCA is fairly reliable.
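To make this kind of reliability check concrete, the sketch below correlates a tool's counts with a human rater's counts for the same ten texts, in the spirit of the comparison Lu (2010) describes. All numbers are invented for illustration; only the general procedure (a correlation over paired counts) is taken from the studies cited.

```python
# A minimal sketch of an inter-rater reliability check: correlating an
# automatic tool's T-unit counts with a human rater's counts for the same
# ten texts. All counts below are invented for illustration only.
from scipy.stats import pearsonr

tool_counts = [12, 18, 9, 22, 15, 11, 20, 14, 17, 10]   # hypothetical program output
human_counts = [12, 17, 9, 23, 15, 12, 20, 13, 17, 10]  # hypothetical rater output

r, p = pearsonr(tool_counts, human_counts)
print(f"Pearson r = {r:.3f}, p = {p:.4f}")  # a high r suggests strong agreement
```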


There is the slight issue of redundancy, though. There has been criticism (e.g. Bulté & Housen, 2014; Norris & Ortega, 2009) leveled at earlier studies for using several measures that, in essence, measure the same thing, e.g. subordination. Lu & Ai (2015) did note this for L2SCA, yet for this essay the overlapping measures were still deemed interesting. Identifying them did not add any significant amount of processing time, and the potential benefits outweighed “the concern for potentially superfluous information” (Lu & Ai, 2015), since it opened up more subtle means of comparing the different texts.

Finally, one weakness that impacts the validity of the program: it was “designed with advanced second language proficiency research in mind” (Lu, 2010, my emphasis), meaning that it is not optimal for the level of our learners (less so for English 5 than English 6). This is because the system works best when given grammatically complete sentences, which we cannot expect at that stage. With that said, (1) L2SCA is still better than one lone human rater, (2) it is difficult to determine whether the texts contain enough grammatical errors to matter without altering the texts - which would require interpreting and perhaps changing the author’s intent (see T-unit below) - and (3) of the problems the program faces - explored further alongside the affected measures - we can really only fault it for the obvious errors (see complex nominals below) and not the debatable ones (see T-unit below) - the ratio of which is unclear.

4.2.2 Measures of syntactic complexity

If we now turn to the measures themselves - summarized, alongside their definitions and the dimensions they gauge, in Table 1, adapted from Lu (2010) and Lu & Ai (2015) - we can start with the measures regarding what Norris & Ortega (2009) refer to as the “global or general” level, starting with the length of production unit dimension. Mean length of sentence (MLS) straightforwardly measures the number of words divided by the number of sentences, where a sentence is a “group of words delimited by one of the following punctuation marks […]: period, question mark, exclamation mark, quotation mark, or ellipsis” (Lu, 2010). Mean length of clause (MLC) does the same thing for clauses, where a clause is defined as “a structure with a subject and a finite verb,” which here includes “independent clauses,

adjective clauses, adverbial clauses, and nominal clauses” (Lu, 2010). Mean length of T-unit (MLT) does the same for T-units. “A T-unit consists of one independent clause with all of its dependent (subordinate) clauses” (Bulté & Housen, 2014; see Fig. 1 above), making it a unit bigger than a clause but smaller than a sentence. Though seemingly the most popular unit of measurement in previous research (see Ortega, 2003), the T-unit has drawn some criticism regarding its validity as a measurement of the syntactic complexity of advanced learners (i.e. Bardovi-Harlig, 1992), but those criticisms are better explored below. For now, suffice it to say that our texts do not represent a high enough level to disregard the T-unit.
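To make the three length measures concrete, here is a minimal sketch that computes MLS, MLC, and MLT from unit counts. The counts are supplied by hand for an invented text; L2SCA derives such counts automatically from a parse.

```python
# Mean length measures computed from hand-supplied unit counts
# (an invented two-sentence text, for illustration only).
words = 30      # total number of words in the text
sentences = 2   # delimited by ., ?, !, quotation mark, or ellipsis (Lu, 2010)
clauses = 4     # structures containing a subject and a finite verb
t_units = 3     # independent clauses together with their dependent clauses

mls = words / sentences  # mean length of sentence: 15.0
mlc = words / clauses    # mean length of clause:    7.5
mlt = words / t_units    # mean length of T-unit:   10.0
print(f"MLS = {mls}, MLC = {mlc}, MLT = {mlt}")
```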

Table 1. Syntactic complexity measures used. Adapted from Lu (2010) and Lu & Ai (2015).

Measure                         Code   Definition                               Dimension
Mean length of sentence         MLS    # of words / # of sentences              Length of production unit
Mean length of clause           MLC    # of words / # of clauses                Length of production unit
Mean length of T-unit           MLT    # of words / # of T-units                Length of production unit
Sentence complexity ratio       C/S    # of clauses / # of sentences            Sentence complexity
T-unit complexity ratio         C/T    # of clauses / # of T-units              Subordination
Complex T-unit ratio            CT/T   # of complex T-units / # of T-units      Subordination
Dependent clause ratio          DC/C   # of dependent clauses / # of clauses    Subordination
Dependent clauses per T-unit    DC/T   # of dependent clauses / # of T-units    Subordination
Coordinate phrases per clause   CP/C   # of coordinate phrases / # of clauses   Coordination
Coordinate phrases per T-unit   CP/T   # of coordinate phrases / # of T-units   Coordination
Sentence coordination ratio     T/S    # of T-units / # of sentences            Coordination
Complex nominals per clause     CN/C   # of complex nominals / # of clauses     Degree of phrasal sophistication
Complex nominals per T-unit     CN/T   # of complex nominals / # of T-units     Degree of phrasal sophistication
Verb phrases per T-unit         VP/T   # of verb phrases / # of T-units         Degree of phrasal sophistication

The (overall) sentence complexity dimension is also part of the global level. Here, we only find one measure, the sentence complexity ratio (C/S), which simply measures the number of clauses per sentence. This measure, and the soon-to-be-mentioned complex T-unit ratio, was excluded when Yoon & Polio (2017) and Polio & Yoon (2018) tested the validity of L2SCA because earlier studies (i.e. Ai & Lu, 2013; Lu, 2011) had deemed them “less valid as language development indicators” (Yoon & Polio, 2017). Though the validity of these two measures is not as well tested as that of the others, they were not excluded from this study for two


reasons. Firstly, they added no significant workload during the identification process. Secondly, it was deemed interesting to see whether these two measures, despite their lack of validity as indicators of language development, still managed to correlate with differences between the grades and/or courses. This did mean, however, that extra caution had to be exercised when interpreting the results related to these two measures.

The third dimension measures (the amount of) subordination, and is part of the

subordination level discussed by Norris & Ortega (2009). It is important to note here that the different types of subordination (e.g. spatial (‘where’), temporal (‘when’), conditional (‘if’)) are not identified. The T-unit complexity ratio (C/T) measures the number of clauses per T-unit, while the complex T-unit ratio (CT/T) measures the number of complex T-units per T-unit. “A complex T-unit is one that contains a dependent clause” (Lu, 2010). These two are interesting to compare with each other since the second one highlights how many of the clauses in the first one are dependent, i.e. how many of the T-units contain subordination. The dependent clause ratio (DC/C) measures the number of dependent clauses per number of clauses, and the dependent clause per T-unit (DC/T) does the same for T-units. A dependent clause is here “defined as a finite adjective, adverbial, or nominal clause” (Lu, 2010).

The fourth dimension measures (the amount of) coordination. Norris & Ortega (2009) highlight this level as especially appropriate when dealing with “early proficiency data.” Coordinate phrases per clause (CP/C) and coordinate phrases per T-unit (CP/T) measure the number of coordinate phrases per clause or T-unit, and “[o]nly adjective, adverb, noun, and verb phrases are counted in coordinate phrases” (Lu, 2010). The sentence coordination ratio (T/S) measures the number of T-units per sentence, and it is now appropriate to tackle the criticism of the T-unit. Bardovi-Harlig (1992) correctly points out the unit’s usefulness when dealing with run-on sentences - i.e. sentences with more than one independent clause where their relation is not marked by any coordinators or correct punctuation - and convincingly argues that it is less than ideal when dealing with syntactic complexity through coordination, especially in advanced learners. However, since we cannot call all our texts examples of advanced learners, this author does not see enough of a reason to disregard the T-unit as a whole. The specific measure of T/S, however, needs further justification. This was the measure that showed the least amount of reliability in both Yoon & Polio (2017) and Polio & Yoon (2018), and the reason is likely, as pointed out in the latter, connected to run-on sentences.


Example 1: “Friends are actually very important too a good friends is like a sisters/brothers.” (Skolverket, n.d.b)

Correction 1: Friends are actually very important too. A good friends is like a sisters/brothers. (2 Sentences, 2 T-units)

Correction 2: Friends are actually very important too, and a good friends is like a sisters/brothers. (1 Sentence, 2 T-units)

Correction 3: Friends are actually very important too, because a good friends is like a sisters/brothers. (1 Sentence, 1 T-unit)

Example 1 is obviously grammatically incorrect, but “[w]ith a run‐on sentence, it is not clear if the writer intended a coordinated sentence, and thus more a [sic] complex sentence, or if the writer simply made a punctuation error” (Polio & Yoon, 2018). That is, the reader has to interpret which of the corrections the student had intended. Notice, however, that the number of T-units does not change between Corrections 1 & 2, while Correction 3, using subordination, does change it. It can be argued which of the corrections is most valid, but what matters here is that L2SCA has a tendency to interpret run-on sentences as if Correction 3 was intended (see Polio & Yoon, 2018), which is likely what lowered the reliability of the T/S measure in the mentioned studies. Was the measure then to be excluded from this study? This author chose not to, since Correction 3 still counts as a valid interpretation and run-on sentences are not so frequent in the texts as to cause too big a problem. Caution, though, was taken when interpreting the results, since this tendency would skew them somewhat, which was always a danger given that L2SCA was designed for more advanced learners.
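To see how this tendency plays out numerically, the sketch below computes T/S for the three corrections of Example 1, using the sentence and T-unit counts given above; nothing beyond those counts is assumed.

```python
# Sentence coordination ratio (T/S) for the three corrections of Example 1;
# the (T-unit, sentence) counts follow the analysis in the text above.
corrections = {
    "Correction 1 (two sentences)":            (2, 2),
    "Correction 2 (coordination: 'and')":      (2, 1),
    "Correction 3 (subordination: 'because')": (1, 1),
}
for label, (t_units, sentences) in corrections.items():
    print(f"{label}: T/S = {t_units / sentences:.1f}")
# Correction 3 yields the same T/S as Correction 1 (1.0) but half that of
# Correction 2 (2.0), illustrating how L2SCA's reading of run-ons as
# subordination can depress the measure.
```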

The fifth and final dimension was initially called “Particular structures” (Lu, 2010), but “Degree of phrasal sophistication” (Ai & Lu, 2013) is more transparent and will therefore be used; it covers the phrasal level of complexity as discussed by Norris & Ortega (2009). Complex nominals per clause (CN/C) and complex nominals per T-unit (CN/T) measure the number of complex nominals per clause/T-unit. Lu (2010) defines complex nominals as comprising “(i) nouns plus adjective, possessive, prepositional phrase, relative clause, participle, or appositive, (ii) nominal clauses, and (iii) gerunds and infinitives in subject position.” There are some parsing errors in L2SCA (see Lu, 2010), though, which can make the program incorrectly identify a complex nominal as longer than it actually is. However, since the measures have been shown to be fairly reliable when compared to human raters (i.e. Polio & Yoon, 2018; Yoon & Polio, 2017), they were not excluded from this essay. Finally,


verb phrases per T-unit (VP/T) measures, as the name suggests, the number of verb phrases per T-unit, which includes both finite and non-finite verb phrases (Lu, 2010).
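Extending the length-measure sketch above, the following computes the remaining eleven ratio measures from unit counts for a single text. In practice L2SCA extracts such counts from a parse; the numbers here are invented for illustration only.

```python
# The eleven ratio measures, computed from invented unit counts for one text.
counts = {
    "S": 10,   # sentences
    "C": 22,   # clauses (subject + finite verb)
    "T": 14,   # T-units
    "CT": 8,   # complex T-units (T-units containing a dependent clause)
    "DC": 9,   # dependent clauses (finite adjective, adverbial, nominal)
    "CP": 6,   # coordinate phrases (adjective, adverb, noun, verb phrases)
    "CN": 12,  # complex nominals
    "VP": 25,  # verb phrases (finite and non-finite)
}

ratios = {
    "C/S":  counts["C"] / counts["S"],   # sentence complexity ratio
    "C/T":  counts["C"] / counts["T"],   # T-unit complexity ratio
    "CT/T": counts["CT"] / counts["T"],  # complex T-unit ratio
    "DC/C": counts["DC"] / counts["C"],  # dependent clause ratio
    "DC/T": counts["DC"] / counts["T"],  # dependent clauses per T-unit
    "CP/C": counts["CP"] / counts["C"],  # coordinate phrases per clause
    "CP/T": counts["CP"] / counts["T"],  # coordinate phrases per T-unit
    "T/S":  counts["T"] / counts["S"],   # sentence coordination ratio
    "CN/C": counts["CN"] / counts["C"],  # complex nominals per clause
    "CN/T": counts["CN"] / counts["T"],  # complex nominals per T-unit
    "VP/T": counts["VP"] / counts["T"],  # verb phrases per T-unit
}
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```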

4.2.3 Statistical analysis

To investigate whether there was any increase in complexity between the grades (research question 1) and/or the courses (research question 2) and whether any specific measure was more linked to a higher grade (research question 1), mean scores for each grade/course and standard deviations were calculated. In order to compare the differences between the assigned grades/the represented courses (research questions 1 & 2) for each of the 14 measures,

two-tailed independent sample t tests were performed. These were, of course, preceded by a set of F-tests to determine whether they were to assume equal or unequal variance

(significance level: α = 0.05). The t tests’ resulting p-values were deemed statistically significant at the 95 % confidence level (p ≤ 0.05). Tables containing all the complexity scores (both for the individual texts and the mean values of each grade and course), standard deviations, and the p-values of every t test can be found in the Appendix.
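As an illustration of the procedure just described, the sketch below runs an F-test for equal variances and then chooses between a Student’s and a Welch’s two-tailed t test accordingly. The scores are invented stand-ins for, e.g., the MLS values of three texts per grade.

```python
# F-test for equal variances, followed by a two-tailed independent-samples
# t test that assumes equal or unequal variance depending on the F-test.
# The scores below are invented stand-ins for per-text complexity values.
import numpy as np
from scipy import stats

grade_e = np.array([14.2, 16.8, 12.5])  # hypothetical scores, grade E texts
grade_a = np.array([17.1, 15.9, 19.4])  # hypothetical scores, grade A texts

# Two-tailed F-test: ratio of sample variances against the F distribution.
f_stat = np.var(grade_e, ddof=1) / np.var(grade_a, ddof=1)
df1, df2 = len(grade_e) - 1, len(grade_a) - 1
p_f = 2 * min(stats.f.cdf(f_stat, df1, df2), stats.f.sf(f_stat, df1, df2))

equal_var = p_f > 0.05  # fail to reject equal variances -> Student's t test
t_stat, p_t = stats.ttest_ind(grade_e, grade_a, equal_var=equal_var)

print(f"F-test: p = {p_f:.3f} -> equal_var = {equal_var}")
verdict = "significant" if p_t <= 0.05 else "not significant"
print(f"t = {t_stat:.3f}, two-tailed p = {p_t:.3f} ({verdict} at alpha = 0.05)")
```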

5. Results

We will start by looking at the average complexity scores for the example texts representing English 5. Just looking at the mean values, there appears to be a tendency for most measures to follow a U-shaped curve across the grades, where they decrease in complexity between grades E & C and then go back up again in grade A. Only three measures (CT/T, CN/C & CN/T) follow a linear trajectory of increasing/decreasing complexity between the grades, where CT/T decreases in complexity, while CN/C & CN/T increase. However, almost none of the differences are statistically significant. All differences in syntactic complexity between adjacent grades are negligible (though the increase in CN/C from grade C to A comes close with a p-value of 0.057), and looking at the differences between grades E & A, we find that only one measure (CN/C: from 0.623 to 0.913) is statistically significant (p ≤ 0.0002). It is interesting that, though the sheer mean number of complex nominals almost exactly doubles between grades E & A (see Appendix), the increase of complex nominals is only significant in the clauses and not the T-units. Admittedly, CN/T is relatively close to statistical significance (p ≤ 0.082), but the overall higher number of dependent clauses in the texts representing grade A appears to be what holds it back.


Turning to English 6, we find almost the exact same pattern with a general dip in

complexity in the texts representing grade C that is then recovered in grade A. This time, only two measures (CT/T & CN/T) follow a linear trajectory, showing a decrease in complexity as the grades increase. Once again, though, very few of the differences in syntactic complexity are statistically significant. Among the differences between grades E & C, two measures pass our set threshold (MLS: p ≤ 0.007; MLT: p ≤ 0.015). Both of these measures target the length of production unit dimension, and both decrease between the grades (MLS: from 19.061 to 14.667; MLT: from 16.264 to 13.141). Likewise, the subordination measure CT/T significantly (p ≤ 0.015) decreases between grades E & A (from 0.674 to 0.445). One measure (CP/T) is on the cusp of showing a statistically significant increase between grades C & A, but it falls just short (p ≤ 0.051) and is therefore not counted as significant.

Comparing the differences within the same grade between the different courses (i.e. English 5, Grade E vs. English 6, Grade E, etc.), we yet again find that most of the

differences are not statistically significant. In fact, only one difference passes the threshold of 95 %: CN/C within grade A (p ≤ 0.027). Comparing the measures between the courses, we here find a decrease from 0.913 in English 5 to 0.719 in English 6.

Finally, we can compare the differences between the different grades in the different courses (i.e. English 5, Grade E vs. English 6, Grade C, etc.). As with the other instances, the majority of the differences are statistically insignificant. This time, two are significant: the first (p ≤ 0.02) is for the subordination measure CT/T between English 5, Grade E and English 6, Grade A. Here, the ratio of complex T-units decreases from 0.671 to 0.445. The second (p ≤ 0.002) is for the phrasal complexity measure CN/C between English 5, Grade A and English 6, Grade E, which decreases from 0.913 to 0.645.

6. Discussion

For this section, we will start by discussing the statistically significant differences and then broaden our view to the insignificant ones. The difference between the texts representing English 5, Grade A – the one group of texts always represented as statistically significant for our first discussed measure – and English 6, Grade E deceptively appears the most

straightforward, so we will start with that one. Here we found a decrease in phrasal complexity through the measure CN/C. At first glance, we would infer from this that the students producing Grade A output in English 5 likely possess an ever so slightly higher


degree of linguistic proficiency than those producing Grade E output in English 6. This would make sense if we assume that students in the former group are more likely to produce C or A grade output in English 6, rather than going from producing A grade output in English 5 to E grade output in English 6. Whether or not this is true is actually irrelevant, though, for what complicates matters are the other significant differences related to the CN/C measure. The significant difference for CN/C between the Grade A texts in both English 5 & 6 also indicated a decrease in complexity. Can we then say that a high degree of syntactic

complexity through CN/C is a better indicator of a high grade in English 5 than in English 6? No, since by the principle of Occam’s razor the simplest explanation is more likely the correct one, which is that our small sample size just happened to include texts representing Grade A in English 5 possessing a very high degree of complexity through CN/C. This would also help explain the discrepancy in why the difference in CN/C complexity between said group and the only remaining group of texts in English 6 (Grade C) is statistically

insignificant – a discrepancy which otherwise would have been very difficult to explain. However, what can we then say about the remaining significant difference in CN/C? Between grades E & A in English 5, we found a significant increase in this measure – in fact, the most significant difference encountered in the entire study (p ≤ 0.0002) – and this would line up with the developmental trajectory discussed by Norris & Ortega (2009) – though, similar to Bulté & Housen (2014), the increase in phrasal complexity seemingly did not happen at the expense of complexity through subordination. Given the above, we have reason to mistrust the appearance of the trajectory in this measure, so we should not draw any larger conclusions. What we can say is that a larger study is needed to determine whether our sample (in general, but especially for Grade A in English 5) actually is representative of the example texts as a whole.

Moving from the phrasal level to the global level, the two length of production unit

measures MLS & MLT showed significant decreases between grades E & C within English 6. These two are a bit safer to discuss than CN/C, since no larger discrepancy points to the represented texts being outliers - although we, of course, cannot and should not completely disregard that possibility either. If the assumption that “longer linguistic units are more complex” (Bulté & Housen, 2014) holds true, the decreases in syntactic complexity appear odd, or, if one is so inclined, as if less complex units are prized over more complex units in relation to these two grades. That, however, would be looking at things too narrowly. Syntax is actually discussed in the available comments explaining the assessments for four of the six affected texts (Skolverket, n.d.e, n.d.f), and one of these does relate to the


measures MLS & MLT: the second text representing the E grade is noted to exhibit “[v]issa tendenser till satsradning” ‘[s]ome tendencies for run-on sentences’ (Skolverket, n.d.e). This obviously raised the MLS, since L2SCA does not count commas as the endpoint of sentences, but what about the MLT? As discussed in section 4.2.2, L2SCA has a particular tendency of dealing with run-on sentences as if dependent clauses were intended, and this tendency raises the MLT. Does this mean that run-on sentences completely account for the statistical

significance? That does seem likely, and it would line up with the rest of the results, but further analysis is needed to determine whether that is really the case.

The final measure that exhibited statistical significance was the subordination measure CT/T. It exhibited significance twice, both times as decreases: the texts with the E grade in both English 5 & 6 were more complex than the texts with the A grade in English 6. The astute reader will already have made a critical observation here, but before addressing that, it is worth first taking a brief look at the developmental trajectory discussed by Norris & Ortega (2009). On its own, the decrease might appear odd, but in relation to the developmental trajectory we could have found an interesting theoretical explanation for it, that is, if phrasal complexity had increased at the same time. Since the phrasal measures did not show any statistically significant differences for this same group of texts, we cannot assume that the decrease happened in line with the trajectory. At the same time, we cannot completely disregard it. It could be that the particular measures for phrasal complexity used did not register differences which others, e.g. the mean length of noun phrases, might have done. Until such a study is conducted, however, there is a more pressing concern that needs addressing, which the astute reader will already have noticed. We are, in fact, once again dealing with a T-unit measure in the same English 6 texts with the E grade that made us doubt the significance in the previous paragraph. Since a complex T-unit is identified as one containing a dependent clause (Lu, 2010), and L2SCA identifies run-on sentences as dependent clauses, this clearly casts doubt on the validity of the significance. Just as with the MLT measure, it seems likely that the significance would disappear if the run-on sentences were identified differently. However, that only explains one of the two instances in which the CT/T measure was significant, so what about the other? One of the texts representing the E grade in English 5 was the text we used in section 4.2.2 to explain L2SCA's tendency when dealing with run-on sentences, and that example was not the only instance of run-on sentences in that group of texts. Further analysis is of course needed to determine whether this was enough to account for the statistical significance, but that would both make sense and be more in line with the rest of the results. With this, we have now cast doubt on all the instances of statistical significance.
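The effect on CT/T can be shown with the same kind of toy example. The sketch below compares the ratio under the two possible treatments of a comma-spliced clause; the counts are invented for illustration and do not reproduce L2SCA's actual parsing.

# CT/T = complex T-units / T-units, where a complex T-unit is one that
# contains a dependent clause (Lu, 2010).

def ct_per_t(complex_t_units: int, t_units: int) -> float:
    return complex_t_units / t_units

# A toy text of five T-units, one of which genuinely contains a dependent
# clause, plus one comma-spliced run-on.
#
# If the run-on is split into two independent T-units (presumably what the
# writer intended):
print(ct_per_t(complex_t_units=1, t_units=6))   # ~0.17
#
# If the run-on is instead parsed as one T-unit whose second clause counts
# as "dependent", it both removes a T-unit and adds a complex one:
print(ct_per_t(complex_t_units=2, t_units=5))   # 0.40

The misparse thus pushes the ratio upwards from both directions at once, which is why a handful of run-on sentences in a small sample can plausibly produce a significant difference on their own.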


Broadening our view to the insignificant differences, we saw that these made up the majority of the comparisons. From these results, we can gather that syntactic complexity, as expressed through our 14 measures, most likely played little to no role in the assessments which led to the grades. This is understandable. The example texts do, after all, present a holistic explanation for why a particular text is indicative of a particular grade, and other facets would then serve as better indicators of different grades than syntactic complexity. However, it is still interesting that the only overarching trend that could be discerned was the absence of one. Because of the "strong link between the (syntactic) complexity of learners' L2 production and their overall level of L2 development and/or L2 proficiency" (Bulté & Housen, 2018, p. 148), one would have expected to find some sort of trend, but the null hypothesis was upheld. How are we to understand this? For one, we have the already mentioned holistic perspective of the example texts, which could well have hindered syntactic complexity from leaving a significant impact. In addition to this, or perhaps as a separate issue, the students writing the national tests may already possess a fairly high capacity for producing a certain level of syntactic complexity. To answer that, a larger study not limited to the example texts is of course required, but even if the students did possess said capacity, they would likely benefit more from improving other facets of their linguistic proficiency if they desire a high grade on the national tests, thereby leaving their syntactic complexity skills to develop slowly and organically in the background.

We might also understand the results as a consequence of the syllabus. As touched upon in the introduction, no explicit mention is made of complexity in the syllabus (see Skolverket, n.d.i). The example texts were thus neither produced by the students nor chosen as example texts by Skolverket with complexity in mind. However, the revised edition of the syllabus, applicable from 1 July 2021, might come to affect this. This new edition does explicitly mention complexity when it states that the teaching in English 7 should cover "Variation och anpassning som skapas genom komplex meningsbyggnad" 'Variation and adaptation that is created through complex sentence structure' (SKOLFS 2010:261). Although the national test is not taken in English 7, meaning that no example texts are produced for it, teachers simply knowing that they need to cover complexity in English 7 might affect their teaching in the other courses, as they might become more aware of complexity as a concept in general. This could in turn affect the texts produced for the national tests in English 5 & 6, though whether that would be enough to leave a significant impact on the syntactic complexity of the example texts is far from certain. It would still be interesting to replicate this study after the revised edition of the syllabus has been in effect for some time to see whether the explicit mention of complexity in English 7 affected the other courses in such a manner.

Finally, our results may be understood as a consequence of the fact that, as shown in section 3, syntactic complexity is a subset within the larger category of linguistic proficiency. It might simply be too small a subset to compete in the holistic assessment the example texts represent. More example texts will of course need to be studied before we can draw any strong conclusions, but this study seems to indicate that the null hypothesis would still be upheld. At the very least, it seems unlikely that the example texts of any one given year would show statistically significant differences for the measures used. In the end, one idea emerges from the results: syntactic complexity is a seemingly poor indicator of why a text received a certain grade in the national tests of either of the represented courses.

7. Conclusion

This study set out to investigate the syntactic complexity of the example texts used as guides for assessment in the national tests of the Swedish upper secondary school courses English 5 and English 6. Two research questions guided the study: (1) Is there a progression of increased complexity between the grades assigned to the example texts, and, if so, is any specific measure of syntactic complexity more strongly linked to a higher grade than the rest? (2) Is there a progression of increased complexity between the two courses, and, if so, how does this progression manifest itself? Fourteen measures as identified by the computer program L2SCA were used to answer these questions. For both questions, the answer is a resounding no. No overarching progression could be found, and therefore no specific measure was identified as standing above the rest. Only four measures (CN/C, MLS, MLT & CT/T) exhibited any statistical significance, and even then only for some of the comparisons. All instances of statistical significance, however, could be doubted as occurring either due to a small sample size or due to a questionable tendency of L2SCA when dealing with run-on sentences.

What, then, are the pedagogical implications of the results? Since teachers are seemingly not (implicitly) being led to consider syntactic complexity, as expressed through our 14 measures, while assessing the national tests, this would encourage them to consider spending their lesson time on other aspects of language teaching. One should not, however, view this as syntactic complexity being neglected. As previously mentioned, the students' syntactic complexity skills can develop in the background, which is far from an uncommon occurrence. For example, Mazgutova & Kormos (2015) found that their students produced more syntactically (and lexically) complex texts by the end of an English for Academic Purposes program despite the fact that such programs "are not, or only indirectly, focused on the syntactic and lexical aspects of writing." Focusing on aspects of language teaching other than syntactic complexity therefore seems justifiable. This does not mean that lessons focusing on syntax would not be valuable in and of themselves; as already stated, syntax is discussed in the commentaries to the example texts, but not in a way that relates much to the measures employed here. Teachers could, for example, focus on clausal coordination as a means of enriching arguments. However, since teachers know the needs of their students best, we will leave that up to them.

Before we conclude, we must acknowledge the limitations of this study, the first and most obvious being its small sample size. As we saw, the limited number of example texts most likely accounted for two instances of statistical significance (both regarding the CN/C measure), and it is difficult to draw any larger conclusions when the material analyzed comprises only 18 texts in total. Eighteen example texts is still a higher number than a teacher would encounter during any one given year, so our results do hint at the differences in syntactic complexity of the example texts in general, but we cannot say any more than that. To see whether the results of this study actually do point towards a larger tendency within the example texts of not showing any significant differences in syntactic complexity regardless of grade or course, and whether the instances of significance for the CN/C measure actually were caused by our sample size, a larger-scale study should be performed. Performing such a study would be challenging given the large number of texts still bound by secrecy, but it would not be impossible and should therefore be encouraged.

Another limitation is the specific measures used. Our results are only reflective of the 14 measures used, and there might be significant differences in other measures between the texts. For example, the mean length of noun phrases and the number of words before the main verb are two measures that would be interesting to examine (a possible automated approximation is sketched below). However, to do that one would need to identify the measures manually, as L2SCA is only able to identify the 14 measures used. This leads into another limitation, which is L2SCA itself. Though its validity has been tested (e.g. Lu, 2010; Polio & Yoon, 2018; Yoon & Polio, 2017) and seemingly holds strong, it was not designed for the proficiency level our texts exhibited. We saw this in its tendency to deal with run-on sentences as if subordination was intended, which accounted for the remaining instances of statistical significance. Had the measures been identified manually, these instances might not have been significant, and to determine whether that is the case, a replication study where L2SCA is swapped for human raters could be performed. However, such an approach should take the opportunity to expand upon this initial study in two ways. Firstly, other measures such as those suggested above would prove interesting additions, as they would provide a richer view of the syntactic complexity of the example texts. Secondly, and more importantly, a larger number of texts should be studied, for the same reasons as highlighted above. If a larger sample were to be studied, however, more consideration needs to be given to the topic and/or genre of the texts. We did not pay much attention to the differences between the investigative and argumentative texts in English 6, as our small sample did not justify it, but such a luxury cannot be afforded on a larger scale. Aside from these two recommendations, a different replication study could also, as mentioned in the discussion, focus on example texts produced once the revised edition of the syllabus has been in effect for some time, to see whether the explicit mention of complexity in English 7 had any effect on the example texts' syntactic complexity.
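As for the noun phrase measure suggested above, it could be approximated automatically with an off-the-shelf parser before committing to fully manual identification. The sketch below uses spaCy's noun_chunks as a rough stand-in for full noun phrases; this is our illustrative assumption, not a validated operationalization, and a manual analysis would still be needed for reliable results.

import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def mean_np_length(text: str) -> float:
    """Approximate mean noun-phrase length (in words) via noun chunks."""
    doc = nlp(text)
    chunks = list(doc.noun_chunks)  # base noun phrases only, not full NPs
    if not chunks:
        return 0.0
    return sum(len(chunk) for chunk in chunks) / len(chunks)

# Invented example sentences of differing phrasal elaboration.
print(mean_np_length("The dog barked."))
print(mean_np_length("The old dog with the torn ear barked at the mailman."))

Note that noun chunks exclude postmodification such as prepositional phrases, so this approximation systematically underestimates the measure; that trade-off between automation and fidelity is exactly the limitation discussed above.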

As a final note, further research could also employ a broader perspective than this study and investigate the syntactic complexity of the texts produced for the national tests in general, not just the ones chosen as example texts. This broader look would provide interesting information on the students' general level of syntactic complexity and could be studied from many perspectives, such as whether there are differences in syntactic complexity between texts produced by students of different genders (see Johansson & Geisler, 2011) or students with different first languages. One could also perform long-term case studies to see how syntactic complexity develops in individual students, though in order to comply with the secrecy of the national tests one could then focus on texts produced in class as preparation for said tests.


References

Ai, H. (n.d.). L2SCA. Aihaiyang. Retrieved 09-04-2021, from https://aihaiyang.com/software/l2sca/

Ai, H., & Lu, X. (2013). A corpus-based comparison of syntactic complexity in NNS and NS university students' writing. In Díaz-Negrillo, A., Ballier, N., & Thompson, P. (Eds.), Automatic treatment and analysis of learner corpus data (pp. 249-264). John Benjamins.

Bardovi-Harlig, K. (1992). A second look at T-unit analysis: Reconsidering the sentence. TESOL Quarterly, 26(2), 390-395. https://doi.org/10.2307/3587016

Beers, S. F., & Nagy, W. E. (2009). Syntactic complexity as a predictor of adolescent writing quality: Which measures? Which genre? Reading and Writing, 22, 185-200. https://doi.org/10.1007/s11145-007-9107-5

Biber, D., Gray, B., & Poonpon, K. (2011). Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly, 45(1), 5-35. https://www.jstor.org/stable/41307614

Bonnevier, J., Borgström, E., & Yassin Falk, D. (2017). Normers roll i ett mål- och kriterierelaterat bedömningssystem. Utbildning & Demokrati, 26(2), 21-47.

Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In Housen, A., Kuiken, F., & Vedder, I. (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 21-46). John Benjamins.

Bulté, B., & Housen, A. (2014). Conceptualizing and measuring short-term changes in L2 writing complexity. Journal of Second Language Writing, 26, 42-65. https://doi.org/10.1016/j.jslw.2014.09.005

Bulté, B., & Housen, A. (2018). Syntactic complexity in L2 writing: Individual pathways and emerging group trends. International Journal of Applied Linguistics, 28(1), 147-164.

Crossley, S. A., & McNamara, D. S. (2014). Does writing development equal writing quality? A computational investigation of syntactic complexity in L2 learners. Journal of Second Language Writing, 26, 66-79. https://doi.org/10.1016/j.jslw.2014.09.006

Johansson, C., & Geisler, C. (2011). Syntactic aspects of the writing of Swedish L2 learners of English. In Newman, J., Baayen, H., & Rice, S. (Eds.), Corpus-based studies in language use, language learning, and language documentation (pp. 139-155). Rodopi.

Knoch, U., Rouhshad, A., Oon, S. P., & Storch, N. (2015). What happens to ESL students' writing after three years of study at an English medium university? Journal of Second Language Writing, 28, 39-52. https://doi.org/10.1016/j.jslw.2015.02.005

Lahuerta Martínez, A. C. (2018). Analysis of syntactic complexity in secondary education EFL writers at different proficiency levels. Assessing Writing, 35, 1-11. https://doi.org/10.1016/j.asw.2017.11.002

Larsen-Freeman, D. (2006). The emergence of complexity, fluency and accuracy in the oral and written production of five Chinese learners of English. Applied Linguistics, 27(4), 590-619. https://doi.org/10.1093/applin/aml029

Lu, X. (2010). Automatic analysis of syntactic complexity in second language writing. International Journal of Corpus Linguistics, 15(4), 474-496. https://doi.org/10.1075/ijcl.15.4.02lu

Lu, X. (2011). A corpus-based evaluation of syntactic complexity measures as indices of college-level ESL writers' language development. TESOL Quarterly, 45(1), 36-62. https://www.jstor.org/stable/41307615

Lu, X. (2017). Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Language Testing, 34(4), 493-511.
