Chapter 4. Theoretical framework

6. Peer review and scientific quality judgement

This section turns to peer review and scientific quality judgement, and to the role that the disciplinary style of reasoning plays in relation to the boundaries of mainstream and heterodox economics and their variable permeability.

Boundary work draws our attention to the constructed and contested nature of symbolic and social boundaries. It lets us highlight the agency of actors involved in the rhetorical reproduction or transformation of bounded structures. But it is not only a rhetorical practice. For when the symbolic boundaries of what is considered to be proper economics, real science, are generally agreed upon and sufficiently established, symbolic boundaries may become social boundaries. That is, conceptions of what constitutes good science, of the boundaries of the accepted mode of reasoning, affect the distribution of resources and opportunities. In other words, resources in terms of both material reward (research grants and academic positions) and ever-important scientific recognition (publications in variably valued outlets and subsequent citations) depend on ongoing arguments about the ambiguous and contested boundaries of proper economic science. The central mechanism through which this transformation of symbolic into social boundaries takes place is the institutionalised practice of peer review, in its various locations.

In recent years, a general sociological research field has emerged around practices of valuation or evaluation, ranging from the economic valuation of commodities to the cultural sociology of tastes to the judgement of scientific quality (Beljean et al. 2016:201; Lamont 2012b). Here, I draw on insights from specific studies of peer review processes and from the recent general sociology of valuation and evaluation which, in Lamont’s (2012a:202) words, “can be useful for understanding the cultural or organizational dimensions of all forms of sorting processes and for connecting microdynamics of exclusion to macrodefinitions of symbolic community and patterns of boundary work”.

Evaluation practices are closely related to the notion of boundaries and boundary work. In both cases, we are dealing with social practices that establish some sort of boundary or sorting. However, I suggest that boundary work and evaluation practices can be considered analytically distinct phenomena which are often entangled in practice. Boundary work, in its ideal-typical form, deals with the establishment of a single dividing boundary between an inside and an outside of a symbolic category, like (proper) professional “economics”, and often includes a connection to a (more or less) macrolevel bounded social group. Evaluation practices, on the other hand, are first of all a wider class of phenomena, often involving a more formal and fine-grained classification and sorting. While boundary work may be an intended or unintended rhetorical device, establishing symbolic boundaries is seldom an explicit purpose in itself, but rather a by-effect. Research evaluation, by contrast, is a highly formalised practice with the explicit and purposive task of establishing a classification according to some set of quality criteria.

Evaluation practices can be said to be properly social processes, as distinct from merely psychological ones, on at least three grounds (Lamont 2012b:205). First, evaluation requires intersubjective agreement on a set of evaluation criteria or referents. In our specific case, these consist of disciplinary conceptions of good science, proper methodology, valid research questions and fields of study, and so on. Second, evaluation involves negotiations about both the choice and the interpretation of quality criteria, as well as about who counts as a legitimate evaluator. Third, evaluation relies on a relational or indexical process of comparison. In our setting, being competent to evaluate research requires, among other things, good knowledge of the discipline, its research fields and their state of the art, in order to have a yardstick for comparison. In all of these processes, power struggles, positioning and boundary work potentially play important roles.

To say that evaluation is a social practice highlights that the outcomes of evaluation processes are not predetermined by the object of evaluation, but that there is always a measure of contingency involved in the social process of evaluation. In research evaluation, this means that the quality of the object to be evaluated does not simply exist as an objective property to be read off like a table of numbers. Instead, the outcome of the evaluation process is an achievement that is underdetermined by its object. There is thus a clear parallel between the way that the outcomes of empirical investigations are understood as underdetermined by data in science studies, and the way that quality evaluation is underdetermined by the object of evaluation. In neither case does this mean that results are constructed out of thin air, but rather that, if there is no single necessary outcome of these epistemic processes, social factors may tip the balance in one direction or the other. Social aspects may thus enter into the evaluation process in different ways. In research evaluation, it may mean that certain interpretations of quality criteria or, say, reliance on a specific technique of categorisation, or the emphasis of some criteria above others as important to the field, may turn the final decisions in different directions.

It is useful to distinguish between two aspects of evaluation practices (Lamont 2012b:206). First, they entail some method or process for categorising. That is, evaluators may employ different strategies for determining how to categorise their object of evaluation. For example, they may rely only on their professional judgement and knowledge and their deep feel for quality (“I know it when I see it”). Or they may rely on more formalised or mechanised techniques, such as external ranking systems or metrics. Second, categorisations need legitimation or consecration. That is, the value of an entity or the validity of an evaluation needs to be justified so as to be recognised by other relevant actors as legitimate. The power of consecration has been a central topic in much cultural sociology following Bourdieu, and concerns the struggle over the ability to impose one’s taste or judgement on a field. In academic peer review, the selection of reviewers is thus highly important to the reproduction of the field and its style of reasoning. But legitimation is also a central aspect of every single evaluation practice, where the evaluator not only needs to arrive at a categorisation, but also to justify, beyond doubt, why a particular result should be accepted as the outcome of the evaluation process.

Scientific peer review, like any evaluation practice, relies to a great extent on the field-specific habitus of the reviewer, activated in the form of his or her trained judgement. Although peer review involves formal evaluation criteria about which reviewers are often explicit and reflexive, informal criteria also play a very important part in review practices, something that reviewers themselves may acknowledge (Gemzöe 2010; Lamont 2009). These informal criteria may range from the moral character of an applicant to the intangible qualities of a research proposal or paper as exciting, interesting or elegant. This is a site that well illustrates the activation of a scientific habitus: the semi-conscious dispositions and practical sense of what is reasonable, good, exciting and so on, which are seldom explicated but rather function as the mastery of a practice and its tacit dimensions, learnt through practice. But this also explains another prevalent feature of peer review processes, namely the tendency towards homophily and system-preserving conservatism.

However, while peer review processes may be imperfect and biased, for example in terms of old-boy networks or gender, it is crucial to distinguish social bias from cognitive bias (Gemzöe 2010; Travis and Collins 1991). Cognitive bias in peer review processes is not a bias against certain individuals or even social groups or categories. Rather, it is a bias in ways of thinking about and understanding science. In disciplines with a very high level of consensus, this is unlikely to be a problem. However, where there are marked intellectual boundaries and minorities, that is, scientific heterodoxies, the effects of what Travis and Collins (1991) call cognitive particularism will be significant. What matters then is from which side of an intellectual boundary evaluators are recruited. If such boundaries, as in the case of minority heterodoxies, are not well known or acknowledged, we can expect peer review processes to be marked by such cognitive particularism, and to tend to reinforce the dominant style of reasoning.

This insight has been developed in more recent research on peer review, which emphasises not only that scientific quality is a contingent outcome of evaluation processes (rather than a pre-existing objective property), but also that conceptions of scientific quality are diversified rather than universal (Gemzöe 2010). Furthermore, this diversification forms distinct disciplinary cultures and “epistemic styles” in evaluation, though as Lamont (2009) shows in her study of interdisciplinary panels, it is not as simple as a one-to-one mapping of epistemic styles onto disciplines. This translates well to the theoretical framework presented here, and to the imperative to look simultaneously at the level of disciplinary styles of reasoning and at the various ways in which Crombian styles of reasoning may act as barriers or bridges in the maintenance of boundaries with heterodoxy and other disciplines.

Objectivising metrics as judgement devices

The role of professional scientific judgement is central to peer review. But there are also techniques that can be used in evaluation that rely on external quantitative metrics as objectivising tools. The last decades have seen a rapid increase in the availability and use of bibliometric indicators in research assessment and evaluation, a veritable “metric tide” (Wilsdon et al. 2016). As argued in the literature review in chapter 3, recent research at the crossroads between bibliometrics and science studies has started to investigate the epistemic impact of the use of bibliometric indicators (Rijcke et al. 2016). In this vein, Hammarfelt and Rushforth have shown how indicators such as authors’ h-index, Google Scholar citation counts, JIF or journal rankings are used as aids in the evaluation of scientific oeuvres, drawing on Swedish expert evaluation reports from three different disciplines (biomedicine, history, and economics) (Hammarfelt 2017; Hammarfelt and Rushforth 2017). I will draw on their notion that this use of bibliometrics in evaluation reports can be understood as a judgement device, and on their suggestion that we should study how such devices are employed by evaluators and integrated into their judgement practices. The use of such judgement devices adds a very different social aspect to the evaluation practice, which potentially contributes to the determination of evaluation outcomes.

If the role of the habitus in peer review means that the individual evaluator’s judgement functions as a subjective mediator in the reproduction of disciplinary standards, the use of bibliometric indicators instead reallocates more of the judgement to the system of academic journals and their editors and reviewers. For example, imagine a reviewer of candidates for a professorship. This expert may rely solely on reading the applicants’ submitted texts, using his or her deep knowledge of the field and scholarly judgement to rank the candidates and provide arguments legitimising the ranking in terms of quality criteria. Alternatively, the reviewer may invoke an external quantitative indicator, like the JIF of the applicants’ publications, and use it both to categorise (that is, to base the ranking fully or partly on it) and to legitimise the ranking, where the measure becomes an indicator of quality, impact or similar evaluation criteria. In effect, the evaluation outcome then relies to a greater extent on previous evaluations distributed among reviewers and editors at various scientific journals. In this way, the evaluation becomes in a sense more objective, but it remains a social process, susceptible to cognitive particularism: a form of social objectivity where the outcomes of social processes, once categorised and quantified into numbers, acquire an air of inevitability and objectivity. This complex and distributed quantification of evaluation is an instance of what Fourcade and Healy (2017) have called a classification situation, where new and powerful technologies of quantification and categorisation come to take on a life of their own in the ordering of social life.
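
To make concrete what such a judgement device does, the following minimal sketch (written in Python purely for illustration; the candidate names, citation counts and journal impact factors are invented and are not drawn from the evaluation reports discussed here) shows how a scholarly oeuvre can be collapsed into an h-index and a mean JIF, numbers that package a long chain of prior, distributed evaluations by editors, reviewers and citing authors into a seemingly objective figure.

# Purely illustrative sketch: a toy bibliometric "judgement device".
# All names and figures below are hypothetical, invented for this example.

def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

# Hypothetical candidates: per-paper citation counts and the JIF of each paper's outlet.
candidates = {
    "Candidate A": {"citations": [45, 30, 12, 8, 3], "jifs": [4.2, 3.1, 1.8, 1.5, 0.9]},
    "Candidate B": {"citations": [20, 18, 15, 14, 10], "jifs": [2.0, 2.0, 1.9, 1.7, 1.6]},
}

# The whole oeuvre is reduced to two numbers; the editorial decisions, reviews and
# citing behaviour that produced those numbers disappear from view.
for name, record in candidates.items():
    mean_jif = sum(record["jifs"]) / len(record["jifs"])
    print(f"{name}: h-index = {h_index(record['citations'])}, mean JIF = {mean_jif:.2f}")

The sketch is only meant to show the compression at work: once the ranking is expressed in such figures, the distributed social process behind them is no longer visible in the evaluation report itself.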

The important shift emphasised in this literature is a transformation of the mechanisms by which scholarly standards or styles of reasoning are reproduced, where the relative weight has shifted from the judgement of the individual evaluator to the distributed evaluative capacity of an institutionalised system of scientific journals more or less closely connected to a scientific discipline. In my integration of this insight into the styles framework, I will particularly attend to how evaluators simultaneously reproduce detailed accounts of the disciplinary style and its boundaries while accomplishing categorisation and legitimation in their evaluations. Furthermore, the extent to which evaluation relies on judgement grounded in the scientific habitus on the one hand, and on various judgement devices on the other, and how these are integrated, should be a central question for investigation. On an overarching level, the question is: to what extent and in what ways are peer review processes involved in the reproduction of disciplinary (or Crombian) styles through cognitive particularism?