
NEW COMPUTERIZED STAGING METHOD TO ANALYZE MINK TESTICULAR TISSUE IN ENVIRONMENTAL RESEARCH

AZADEH FAKHRZADEH,† ELLINOR SPÖRNDLY-NEES,*‡ ELISABETH EKSTEDT,‡ LENA HOLM,‡ and CRIS L. LUENGO HENDRIKS†

†Centre for Image Analysis, Uppsala University, Uppsala, Sweden

‡Department of Anatomy, Physiology, and Biochemistry, Swedish University of Agricultural Sciences, Uppsala, Sweden

(Submitted 26 January 2015; Returned for Revision 7 March 2016; Accepted 4 June 2016)

Abstract: Histopathology of testicular tissue is considered to be the most sensitive tool to detect adverse effects on male reproduction. When assessing tissue damage, seminiferous epithelium needs to be classified into different stages to detect certain cell damages; but stage identification is a demanding task. The authors present a method to identify the 12 stages in mink testicular tissue. The staging system uses Gata-4 immunohistochemistry to visualize acrosome development and proved to be both intraobserver-reproducible and interobserver-reproducible with a substantial agreement of 83.6% (kappa = 0.81) and 70.5% (kappa = 0.67), respectively. To further advance and objectify this method, they present a computerized staging system that identifies these 12 stages. This program has an agreement of 52.8% (kappa = 0.47) with the consensus staging by 2 investigators. The authors propose a pooling of the stages into 5 groups based on morphology, stage transition, and toxicologically important endpoints. The computerized program then reached a substantial agreement of 76.7% (kappa = 0.69). The computerized staging tool uses local ternary patterns to describe the texture of the tubules and a support vector machine classifier to learn which textures correspond to which stages. The results have the potential to modernize the tedious staging process required in toxicological evaluation of testicular tissue, especially if combined with whole-slide imaging and automated tubular segmentation. Environ Toxicol Chem 2017;36:156–164. © 2016 The Authors. Environmental Toxicology and Chemistry Published by Wiley Periodicals, Inc. on behalf of SETAC.

Keywords: Male reproductive toxicology; Endocrine disruptor; Computational toxicology; Histopathology; Method

INTRODUCTION

In recent decades, there has been a growing concern for the increased frequency of reproductive disturbances seen in both nonhuman animals and humans. Evidence has accumulated over the years that the disturbances are linked to environmental pollutants that can disrupt the action of reproductive hormones [1]. To detect adverse effects on male reproduction, histopathology of testicular tissue has been considered the most sensitive tool [2]. Testicular tissue is a complex structure with seminiferous tubules where the cells differentiate from primitive germ cells to spermatozoa. The germ cells pass through a number of different steps during differentiation.

These steps are combined in 12 stages in the cycle of the seminiferous epithelium in the mink, defined in detail by Pelletier [3]. The Society of Toxicological Pathology recommends classifying the testicular epithelium into stages when assessing tissue damage, because this helps to determine if the dynamics in the spermatogenic cycle have been disturbed [2]. If a toxicant affects the testis, it might lead to a new combination of the cells in a specific stage. For example, a particular cell type can be missing or an inappropriate cell type can be present in a certain stage as a result of exposure to a cytotoxic agent [4]. These changes may be the single morphological sign of toxic damage and can only be detected by staging [4].

Studying hundreds of images of tissue manually is very time-consuming. Staging of testicular tissue also requires years of experience and is a subjective analysis. Computer-aided quantitative histomorphometric analysis is an emerging field that uses powerful computing to identify, characterize, and quantify histological features of tissue in a way that complements human evaluation. In the present study, we used image analysis techniques on microscopic images of testicular tissue from mink (Neovison vison). Mink is a semiaquatic top predator that accumulates certain chemicals and is sensitive to their toxic effects. It has therefore been suggested as a suitable sentinel species in environmental monitoring of endocrine disruptive chemicals [5,6]. Laboratory animals are widely used to detect adverse effects by single chemicals. But using wild animals such as the mink gives a better real-life picture of how the complex mixture of chemicals in the environment affects reproduction. The latest report on endocrine disruption from the World Health Organization [1] underlines the importance of studying how the chemical cocktail in the environment disturbs reproduction in humans and wildlife and urges more investigation. To assess how mink reproduction is affected, as a model for both human and wildlife reproduction, there is a need for valid and reliable methods to analyze mink testicular tissue.

Periodic acid-Schiff has traditionally been used to stain the acrosomes purple to facilitate the staging. However, periodic acid-Schiff staining differs between species [7] and cannot be used when developing a computerized method because of poor color contrast. An alternative stain has been suggested by McClusky et al. [8], who proposed immunolocalization of Gata-4 in rat testis to mark acrosomes and facilitate staging.

Gata-4 has been evaluated in mouse, human, pig, mink, and rat, where it is expressed in Sertoli and Leydig cells in the testis through adulthood [8–10]. McClusky et al. [8] showed that a specific polyclonal Gata-4 antibody stained the developing acrosome in rat very clearly, with a significant contrast between the stained acrosome in brown and the counterstained nuclei in blue, which is needed for a computerized staging method.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

* Address correspondence to Ellinor.Sporndly-Nees@slu.se

Published online 8 June 2016 in Wiley Online Library (wileyonlinelibrary.com).

DOI: 10.1002/etc.3517

Published 2016 SETAC. Printed in the USA.

Generally, a scoring method in histopathology should be reproducible [11,12]. The aim of the present study was 2-fold: first, to describe a staging method in mink based on the definition by Pelletier [3], using Gata-4 to visualize the development of the acrosome, and to test the reproducibility of this method; and second, to design and validate a computerized system that automates this staging. Our goal with the computerized system is to provide an objective tool for the assessment of a large number of testicular sections, which can, ultimately, be adjusted to function for the analysis of tissue from other species as well.

MATERIALS AND METHODS

Animals and dissection

Five healthy, sexually mature minks were collected at the annual culling on a mink farm. No ethical approval was required because of the use of routinely culled mink from a fur farm. The commercial fur farm approved the use of the mink for the study.

Transverse tissue slices (approximately 2 mm thick) were cut from the left testis post mortem and fixed in modified Davidson’s fluid for 24 h at 4 °C. Modified Davidson’s fluid has been suggested as a superior substitute to Bouin’s [13] and is recommended by the Society of Toxicological Pathology [2]. The modified Davidson’s fluid consists of 30% of 37% to 40% formaldehyde, 15% ethanol, 5% glacial acetic acid, and 50% distilled water [13]. After rinsing in phosphate buffer, tissue slices were trimmed, followed by dehydration in increasing concentrations of ethanol. Samples were then embedded in paraffin wax.

Immunohistochemistry

Immunohistochemical localization of Gata-4 was investigated in sections from all 5 mink. Paraffin-embedded tissues were cut into 4-µm-thick sections and mounted on Superfrost Plus Gold slides (Menzel-Glaser). The slides were deparaffinized in xylene, rehydrated in a graded series of ethanol, and rinsed with phosphate-buffered saline (PBS). Antigen retrieval was achieved by submerging the slides in 0.01 M sodium citrate buffer (pH 6.0), followed by pressure-heating for 20 min in a pressure boiler (21100 Retriever; Histolab Products). After cooling and rinsing in PBS, the endogenous peroxidase activity was blocked using 0.3% hydrogen peroxide diluted in methanol.

Gata-4 was immunolocalized using the ImmunoCruz goat ABC Staining System (sc-2023; Santa Cruz Biotechnology). In brief, tissue sections were treated with blocking goat serum for 30 min, and excess serum was blotted from the slides. Gata-4 (sc-1237; Santa Cruz Biotechnology) antibody was diluted 1:50, and the sections were incubated in the dark for 20 h at 4 °C. The sections were rinsed with PBS between each of the subsequent steps. Secondary antibody (donkey antigoat, sc-2023; Santa Cruz Biotechnology) was applied to each section and incubated for 30 min, followed by AB enzyme reagent for 30 min. Immunoreactivity was visualized using 3,3′-diaminobenzidine tetrahydrochloride (DAB Safe; Saveen Biotech), to which H2O2 was added to visualize the bound enzyme activity as a brown color. Sections were finally rinsed in H2O, dehydrated, and mounted with Pertex. Negative controls were run by excluding the primary antibody and by replacing the primary antibody with nonimmune serum from goat (goat immunoglobulin G Sc-2028; Santa Cruz Biotechnology). The slides were counterstained with hematoxylin for 10 s.

Imaging

Digital images of the sections were taken with a Nikon Eclipse 50i microscope and Nikon Digital Sight DS-2M camera, using the 20× objective lens. Images were stored as uncompressed TIFF files, at 1200 × 900 pixels and 0.4 µm per pixel, as red–green–blue images with 8 bits per channel. A total of 545 tubules were imaged (140, 93, 93, 93, and 126 from each mink, respectively) in 188 fields of view.

Manual staging

Two trained investigators (E. Spörndly-Nees and E. Ekstedt) each staged the seminiferous tubules in the collected images twice. After their individual staging, they discussed all tubules until a consensus staging was obtained. The A through E staging was obtained from the consensus staging by mapping each of the 12 original stages to the corresponding A through E stage.

Computerized staging

Tubules presented to the computerized staging tool, in the form of digital images, had to be delineated first. A few methods exist that attempt to delineate tubules automatically [14,15]; but for the purposes of the present study, we chose to use a semiautomatic approach that, with quick user intervention, produced a more consistent result than any known automatic method [10]. This approach is based on the Livewire algorithm [16,17] and allowed the user to delineate a tubule very precisely with only a few clicks.

The computerized staging tool described the Golgi shapes visible within each tubule using rotation invariant improved local ternary patterns [18,19]. The improved local ternary pattern was computed using the red channel of the red–green–blue color images, by sampling 8 points in a circle of radius 4 pixels (chosen based on the size of nuclei) and using a threshold of t = 5. Binary codes were grouped if they could be circularly shifted to the same code. The co-occurrences within rotation groups were Fourier-transformed, and one-half of the power spectrum was used as the feature vector [20]. Finally, a soft linear support vector machine classifier [21,22] was able to distinguish the stage of a tubule given its feature vector. The support vector machine was trained on a given set of tubules with known stages. The slack variable was optimized using a grid search.
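To make the texture descriptor more concrete, the following is a minimal sketch of a basic rotation-invariant local ternary pattern, using the 8 sampling points, radius of 4 pixels, and threshold t = 5 stated above. It is an illustration only and not the authors' implementation: it uses the plain local ternary pattern (comparison against the center pixel) rather than the improved variant, rounds the circular sampling positions to the nearest pixel instead of interpolating, and stops at a histogram of codes instead of the Fourier-transformed co-occurrence features. The function names and the nested-list image format are assumptions made for the example.

```python
import math


def circular_offsets(points=8, radius=4):
    """(dy, dx) offsets of `points` samples on a circle, rounded to whole pixels."""
    offsets = []
    for k in range(points):
        angle = 2.0 * math.pi * k / points
        offsets.append((round(-radius * math.sin(angle)),
                        round(radius * math.cos(angle))))
    return offsets


def rotation_invariant(code, bits=8):
    """Smallest value among all circular bit shifts of `code`.

    Codes that can be circularly shifted into each other map to the same value.
    """
    mask = (1 << bits) - 1
    best = code
    for _ in range(bits - 1):
        code = ((code << 1) & mask) | (code >> (bits - 1))
        best = min(best, code)
    return best


def ltp_codes(image, y, x, offsets, t=5):
    """Upper and lower binary codes of the ternary pattern at pixel (y, x)."""
    center = image[y][x]
    upper = lower = 0
    for bit, (dy, dx) in enumerate(offsets):
        value = image[y + dy][x + dx]
        if value >= center + t:      # clearly brighter than the center
            upper |= 1 << bit
        elif value <= center - t:    # clearly darker than the center
            lower |= 1 << bit
    return rotation_invariant(upper), rotation_invariant(lower)


def ltp_histogram(image, points=8, radius=4, t=5):
    """Count (upper, lower) code pairs over all pixels with a full neighborhood."""
    offsets = circular_offsets(points, radius)
    histogram = {}
    for y in range(radius, len(image) - radius):
        for x in range(radius, len(image[0]) - radius):
            key = ltp_codes(image, y, x, offsets, t)
            histogram[key] = histogram.get(key, 0) + 1
    return histogram
```

Here `image` stands for a single channel (for example, the red channel) given as a list of rows of integer intensities. In a full pipeline the histogram, or a richer feature vector derived from it, would be passed to the classifier.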

Ten-fold cross-validation [23] was used to assess the program’s accuracy: the set of tubules was randomly divided into 10 groups of equal size and stage frequencies. Nine groups were used to optimize the slack variable and train the classifier, and the 10th group was used to assess the classification result. The training was repeated 10 times, such that each of the groups was used once for assessment. The stage assigned to each tubule during assessment was that recorded for the confusion matrix and other statistics.
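The cross-validation procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: `stratified_folds` deals the tubules of each stage round-robin into the folds so that fold sizes and stage frequencies differ by at most one tubule, and `train_and_predict` is a hypothetical callable standing in for the classifier (including any slack-variable grid search, which would have to use the training indices only).

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with similar size and label frequencies."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    folds = [[] for _ in range(k)]
    next_fold = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)              # random assignment within each stage
        for index in indices:
            folds[next_fold % k].append(index)
            next_fold += 1                # keep dealing where the last stage stopped
    return folds


def cross_validate(labels, train_and_predict, k=10, seed=0):
    """Return one prediction per sample, each made while that sample was held out.

    `train_and_predict(train_indices, test_indices)` is a hypothetical callable
    that trains on the first list and returns predictions for the second list.
    """
    folds = stratified_folds(labels, k, seed)
    predicted = [None] * len(labels)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        for index, stage in zip(held_out, train_and_predict(train, held_out)):
            predicted[index] = stage
    return predicted
```

The list returned by `cross_validate` can be compared with the consensus staging to fill a confusion matrix, because every tubule receives exactly one held-out prediction.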

Statistics

We present intraobserver and interobserver agreement as confusion matrices, which give insight into which stages are more frequently confused. These matrices are constructed by taking the 2 stages assigned to each tubule (through 2 different processes, e.g., by 2 different investigators) and using them as a row and column index into a matrix. This matrix element is then increased by 1. The result is a number, $C_{i,j}$, at each matrix element (i, j) that indicates how many tubules were assigned stage i by the first process and stage j by the second. For display, each row is normalized to 100, and the numbers are rounded. This normalization converts the absolute number of tubules into a relative percentage of tubules, which can then more easily be compared across experiments.
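As an illustration of this construction, the sketch below builds a confusion matrix from two stagings of the same tubules and normalizes each row to percentages for display. The function names are chosen for the example, and stages are assumed to be encoded as integers from 1 to the number of stages.

```python
def confusion_matrix(first, second, n_stages=12):
    """Count tubules assigned stage i by the first process and stage j by the second."""
    matrix = [[0] * n_stages for _ in range(n_stages)]
    for i, j in zip(first, second):
        matrix[i - 1][j - 1] += 1   # stages are 1-based, list indices 0-based
    return matrix


def normalize_rows(matrix):
    """Express each row as rounded percentages of its row total (for display only)."""
    rows = []
    for row in matrix:
        total = sum(row)
        rows.append([round(100 * count / total) if total else 0 for count in row])
    return rows
```

For example, `normalize_rows(confusion_matrix([1, 1, 1, 2], [1, 1, 2, 2], n_stages=2))` gives `[[67, 33], [0, 100]]`. Summary statistics should be computed from the raw counts, not from the rounded percentages.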

The confusion matrices, C, are also summarized into descriptive statistics before the normalization of the rows. We computed the agreement (accuracy [a])

$$a = \frac{1}{N} \sum_{i} C_{i,i} \qquad (1)$$

where $N = \sum_{i,j} C_{i,j}$, and Cohen’s weighted kappa ($\kappa$) [24]

$$\kappa = 1 - \frac{1 - \frac{1}{N} \sum_{i,j} w_{i,j} C_{i,j}}{1 - \sum_{i,j} w_{i,j} p_i p_j} \qquad (2)$$

where $p_i = \frac{1}{N} \sum_{j} C_{i,j}$ and $p_j = \frac{1}{N} \sum_{i} C_{i,j}$ are the marginal proportions. With the weights (w)

$$w_{i,j} = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

Cohen’s weighted kappa yields the standard, unweighted Cohen’s kappa. We also used the weights

$$w_{i,j} = \begin{cases} 1, & \text{if } i = j \\ 0.5, & \text{if } i = j \pm 1 \pmod{12} \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

to reduce the penalty when disagreement was between consecutive stages. Note the modulo operation in this last expression, which we used because stages I and XII are to be considered consecutive. In the Discussion section, we translate kappa values to degrees of agreement following Viera and Garrett [25] and Watson and Petrie [11].
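The statistics of Equations 1 through 4 can be sketched in a few lines. This is a minimal illustration that assumes the confusion matrix is a square list of lists of raw counts; the function names are chosen for the example. Setting `neighbor_weight=0.0` reproduces the weights of Equation 3 (unweighted kappa), and `neighbor_weight=0.5` with `cyclic=True` reproduces Equation 4, where the first and last stages count as consecutive.

```python
def stage_weights(n, neighbor_weight=0.0, cyclic=True):
    """Agreement weights: 1 on the diagonal, `neighbor_weight` for consecutive stages."""
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distance = abs(i - j)
            if cyclic:
                distance = min(distance, n - distance)  # last and first stage are neighbors
            if distance == 0:
                weights[i][j] = 1.0
            elif distance == 1:
                weights[i][j] = neighbor_weight
    return weights


def accuracy(c):
    """Equation 1: fraction of tubules on the diagonal of the confusion matrix."""
    total = sum(map(sum, c))
    return sum(c[i][i] for i in range(len(c))) / total


def weighted_kappa(c, weights):
    """Equation 2: weighted kappa from raw counts and agreement weights."""
    size = len(c)
    total = sum(map(sum, c))
    row = [sum(c[i]) / total for i in range(size)]                         # row marginals
    col = [sum(c[i][j] for i in range(size)) / total for j in range(size)]  # column marginals
    observed = sum(weights[i][j] * c[i][j]
                   for i in range(size) for j in range(size)) / total
    expected = sum(weights[i][j] * row[i] * col[j]
                   for i in range(size) for j in range(size))
    return 1 - (1 - observed) / (1 - expected)
```

For the 2 × 2 matrix `[[20, 5], [10, 15]]`, the accuracy is 0.7 and the unweighted kappa is 0.4, because the agreement expected from the marginals is 0.5.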

RESULTS

Manual staging in 12 stages

Twelve stages in mink testicular tissue are described by Pelletier [3] and were adjusted in the present study to light microscopic evaluation using immunohistochemistry for Gata-4 as described. The 12 stages were based on morphologic evaluation of the polyclonal Gata-4 antibody staining the acrosome dark brown and highlighting differences in the shape of the developing acrosome. The morphology (chromatin appearance and tail), orientation, and position of the round and elongated spermatids were studied. The amount of cytoplasm from the elongated spermatids lining the lumen of the seminiferous epithelium was also observed. In stages I to VII (Figures 1 and 2A), 2 layers of spermatids were seen; and after spermiation in stage VII (Figure 2A), only 1 layer of elongated spermatids was present in stages VIII to XII (Figure 2B–F).

Stage I (Figure 1A). The Golgi complex formed acrosome vesicles, marked with brown stain and located next to the nucleus of the round spermatid (arrow and insert). At the same time, the acrosome of the elongated spermatids was clearly stained for Gata-4, and the head had an ellipsoidal shape, which made it easy to identify (arrowhead). Cytoplasm from the elongated spermatids occupied the border of the seminiferous lumen (star).

Stage II (Figure 1B). The acrosome vesicles were more clearly stained with Gata-4 and could be seen as a pronounced brown dot at the nuclear membrane pole of the round spermatid in Golgi phase (arrow and insert). The elongated spermatids started to lose their ellipsoid shape and became flatter and thinner (arrowhead). Cytoplasm from the elongated spermatids occupied the border of the seminiferous lumen (star).

Stage III (Figure 1C). The acrosome vesicles at the nuclei of the round spermatids were more heavily marked with brown stain (arrow and insert). The acrosome made contact with the nucleus and caused a transient impression. At the same time, the head of the elongated spermatids changed from ellipsoid to more slender, which made them much less visible (arrowhead). An increased amount of cytoplasm from the elongated spermatids occupied the border of the seminiferous lumen (star).

Stage IV (Figure 1D). At this stage, the stained acrosome became more flattened and started to spread further over the nucleus of the round spermatid (arrow and insert), and the small impression disappeared. The round spermatid now entered early cap phase. The elongated spermatids were thin and slender and therefore less visible (arrowhead). An increased amount of cytoplasm from the elongated spermatids occupied the border of the seminiferous lumen (star).

Stage V (Figure 1E). The spermatids were in cap phase, and the Gata-4–stained acrosome continued to spread and covered approximately one-third of the nucleus of the round spermatid (arrow and insert). The elongated spermatids were slender and less visible. The border lumen of the seminiferous tubule was still occupied by cytoplasm from the elongated spermatid (star).

Stage VI (Figure 1F). The acrosome of the round spermatids showed strong staining and now covered approximately one- half of the nucleus (arrow and insert). The elongated spermatids had started to back out toward the lumen, and some tails were clearly visible in the lumen (arrowhead). The cytoplasm lining the lumen was absorbed by the Sertoli cells.

Stage VII (Figure 2A). The acrosome covered more than one-half of the nucleus of the round spermatids in late cap phase, and the acrosomes were intensely stained (arrow and insert). Elongated spermatids ready to leave lined the border of the seminiferous lumen (arrowhead).

Stage VIII (Figure 2B). The transition from cap phase to acrosome phase was characterized by movement and reorientation of the spermatid nucleus and acrosome toward the basement membrane (arrow and insert). The elongated mature spermatids had left, and the lumen of the seminiferous tubule was empty.

Stage IX (Figure 2C). Round spermatids entered acrosome phase and started to spire. Elongation and flattening of the nucleus of the round spermatid was seen, and the stained acrosome proceeded into a pear shape (arrow and insert). The spermatid rostral pole pointed toward the basement membrane.

Stage X (Figure 2D). The spermatids were more tapered and narrow and had a conical shape. Progressive chromatin condensation appeared at the rostral pole of the nucleus and all along the nuclear membrane (arrow and insert). The elongated spermatids were directed toward the basement membrane. They started to cluster and move toward the basement membrane.

Stage XI (Figure 2E). The spermatids were further elongated and tapered. The acrosome had a conical shape, and the rostral end of the elongated spermatid appeared drop-shaped (arrow and insert). The chromatin condensation proceeded during the differentiation of the elongated spermatid. Elongated spermatids were directed toward the basement membrane. They clustered toward the basement membrane even more.

Stage XII (Figure 2F). The spermatids were now more slender with an ellipsoid shape (arrow and insert), and the rostral poles were pointed. There was meiotic division of primary spermatocytes to secondary spermatocytes, and meiotic figures were present (red arrow).

In total, 545 tubules were staged manually by 2 investigators (E. Spörndly-Nees and E. Ekstedt), using the criteria described; each investigator staged all tubules twice. Next, the 2 investigators examined all cases and obtained a consensus staging for each tubule. The 12 stages were distributed in the 5 mink, as seen in Table 1.

From the 4 independent stagings, 2 from each investigator, we determined intraobserver and interobserver agreement. The confusion matrix for intraobserver agreement (Figure 3) was computed by averaging together the confusion matrices for the 2 investigators. Agreement was 83.6%, Cohen’s kappa was 0.81, and Cohen’s weighted kappa [24] was 0.89 (Table 2). Cohen’s weighted kappa counts an error between consecutive stages as half an error; note that stages I and XII are considered consecutive.

The confusion matrix for interobserver agreement (Figure 4) was computed by averaging together 4 confusion matrices; each compared 1 of the stagings to the consensus staging. Agreement was 70.5%, Cohen’s kappa was 0.67, and Cohen’s weighted kappa was 0.80 (Table 2).

Computerized staging of 12 stages

The computerized staging tool was applied on the same 545 seminiferous tubules. These tubules were randomly divided into 10 equal-sized groups, such that each group had the same proportion of stages as the whole set. The program was trained using 9 groups, using the consensus staging, and then used to stage the 10th group. This training was repeated 10 times, using different combinations of groups, such that each of the 10 groups was staged once by the program. This staging was summarized in a confusion matrix (Figure 5). Agreement was 52.8%, Cohen’s kappa was 0.47, and Cohen’s weighted kappa was 0.61 (Table 2).

Pooling stages into 5 groups based on morphology (A–E)

The 12 stages described were pooled in 5 groups according to morphologic criteria based on the formation of the acrosome in the round spermatid and if the seminiferous epithelium consisted of 1 or 2 layers of spermatids. Stages important in toxicological evaluation were also considered when grouping the 12 stages into 5 groups. The main Gata-4 characteristics of the acrosome development are described in the 5 pooled groups.

Figure 1. Stages I to VI in mink seminiferous tubules showing Gata-4 immunolabeled acrosome development as brown staining. Round spermatids are seen in Golgi phase at stages I to III (A–C), and the acrosome vesicles are seen as a brown dot (arrow). The round spermatids then enter the cap phase in stages IV to VII, and the acrosomes are seen as u-shaped brown structures (D–F, arrow; see Figure 2A for stage VII), followed by a release of the elongated spermatids into the lumen at stage VIII (see Figure 2B). The arrow indicates the acrosome development in the round spermatocyte, and the arrowhead shows the acrosome in the elongated spermatids. Weak hematoxylin counterstain, bar = 50 µm. Insets show spermatids in stages I to VI, bar = 5 µm.

Group A includes stages I, II, and III. The round spermatids were in Golgi phase with acrosomes seen as a brown dot (Figure 1A–C).

Group B includes stages IV and V. The round spermatids had entered cap phase, and the acrosomes were seen as triangular, u-shaped brown structures covering up to one-third of the nuclear pole (Figure 1D,E).

Group C includes stages VI and VII. The brown-stained acrosomes continued to spread over the nuclei of the round spermatids and covered approximately one-half of the nuclear envelope (Figures 1F and 2A).

Group D includes stages VIII and IX. The pear-shaped spermatids were oriented toward the basement membrane and had entered acrosome phase with brown-stained acrosomes covering more than one-half of the nuclear membrane (Figure 2B,C).

Group E includes stages X, XI, and XII. The elongated spermatids became more tapered and narrow and changed from conical to more elongated and slender (Figure 2D–F).
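The pooling defined above amounts to a fixed lookup from stage to group. As a small worked example, the sketch below applies that lookup to the per-stage tubule counts reported in Table 1 to obtain the number and percentage of tubules per group; the variable and function names are chosen for the illustration.

```python
# Stage (1 = I ... 12 = XII) to pooled group, following the group definitions above.
GROUP_OF_STAGE = {
    1: "A", 2: "A", 3: "A",
    4: "B", 5: "B",
    6: "C", 7: "C",
    8: "D", 9: "D",
    10: "E", 11: "E", 12: "E",
}

# Number of tubules per stage in the consensus staging (Table 1).
STAGE_COUNTS = {1: 73, 2: 39, 3: 46, 4: 35, 5: 34, 6: 96,
                7: 77, 8: 63, 9: 25, 10: 10, 11: 18, 12: 29}


def pool_counts(stage_counts):
    """Sum per-stage tubule counts into the five pooled groups."""
    pooled = {}
    for stage, count in stage_counts.items():
        group = GROUP_OF_STAGE[stage]
        pooled[group] = pooled.get(group, 0) + count
    return pooled


def percentages(counts):
    """Rounded percentage of all tubules that falls into each group."""
    total = sum(counts.values())
    return {group: round(100 * count / total) for group, count in counts.items()}


group_counts = pool_counts(STAGE_COUNTS)
print(group_counts)               # {'A': 158, 'B': 69, 'C': 173, 'D': 88, 'E': 57}
print(percentages(group_counts))  # {'A': 29, 'B': 13, 'C': 32, 'D': 16, 'E': 10}
```

The same lookup can be applied to any individual staging, for example `[GROUP_OF_STAGE[s] for s in staging]`, before agreement statistics are recomputed on the pooled groups.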

Applying this pooling of stages to the 545 tubules resulted in a more even distribution of the stages, as seen in Table 3, although there is still a 3-fold difference between the largest group (C) and the smallest (E). We also translated the investigators’ stagings into the A through E groups and recomputed intraobserver and interobserver agreement (Table 4).

Figure 2. Stages VII to XII in mink seminiferous tubules showing Gata-4 immunolabeled acrosome development as brown staining. The round spermatids enter the cap phase in stages IV to VII, and the acrosomes are seen as u-shaped brown structures (A, arrow; see Figure 1D–F for stages IV, V, and VI), followed by a release of the elongated spermatids into the lumen at stage VIII (B). In stages VIII to XII, a new generation of elongated spermatids is seen in acrosome phase, and the acrosomes are seen as brown-stained, pear-shaped to elongated structures (B–F, arrow). The arrow indicates the acrosome development in the round spermatocyte, the arrowhead shows the acrosome in the elongated spermatids, and the red arrow shows the meiotic figure. Weak hematoxylin counterstain, bar = 50 µm. Insets show spermatids in stages VII to XII, bar = 5 µm.

Table 1. Distribution of tubular stages within the data set

Stage    No. of tubules    Percent of tubules
I        73                13
II       39                7
III      46                8
IV       35                6
V        34                6
VI       96                18
VII      77                14
VIII     63                12
IX       25                5
X        10                2
XI       18                3
XII      29                5

Computerized staging of the 5 groups (A–E)

The computerized staging tool was applied on the same 545 seminiferous tubules, following the same procedure but training the tool to stage tubules into the 5 groups (A–E). Figure 6 shows the confusion matrix obtained for the 5 groups. Agreement was 76.7%, Cohen’s kappa was 0.69, and Cohen’s weighted kappa was 0.77 (Table 4).

DISCUSSION

The suggested staging system was reproducible

The increased frequency of reproductive disturbances in humans and other animals and the linkage to endocrine-disrupting chemicals urge us to find new methods in mink to evaluate the testicular tissue for disrupted reproduction. We present a staging system of mink testicular tissue using Gata-4 antibody as an acrosome marker. The overall interobserver agreement was 70.5% with a kappa value of 0.67, which is considered substantial, and the intraobserver agreement was 83.6% with a kappa value of 0.81, which is considered almost perfect [11,25]. We found no intraobserver and interobserver agreement studies regarding staging of seminiferous tubules in any species, but many studies have been published on similarly difficult pathological tasks. For example, Allsbrook et al. [26] involved 40 pathologists for Gleason grading of prostate cancer into 4 classes. The pathologists agreed moderately (kappa = 0.435) with a consensus diagnostic in 60% of the cases on average. In the present study, the interobserver agreement and kappa value were better, but only 2 investigators did the staging compared with the 40 pathologists seeking agreement in Allsbrook et al. [26]. To advance the suggested method for staging further in terms of objectivity and reproducibility, we designed a computerized staging system. The computer program is deterministic and, thus, always yields the same result given the same input.

Table 2. Summary statistics for investigator and computer staging

                 Agreement (%)    Kappa    Weighted kappa
Intraobserver    83.6             0.81     0.89
Interobserver    70.5             0.67     0.80
Computer         52.8             0.47     0.61

Figure 3. Confusion matrix for intraobserver agreement. Numbers indicate how many tubules were assigned to stage Y (row number) in 1 staging and to stage X (column number) in another staging by the same investigator, as a percentage of all tubules assigned to stage Y. Thus, each row adds up to 100%. Along the diagonal, coded yellow (0%) to green (100%), are the tubules staged identically in both stagings by the same investigator. Other boxes, coded yellow (0%) to red (100%), are tubules where the investigator disagreed with herself. Empty boxes indicate 0%. Percentages are averaged for the 2 investigators. To the right of the matrix are the total number of tubules assigned to stage Y, which can be used to turn percentages into number of tubules. These numbers add up to double the number of tubules. Note that stages I and XII are to be considered consecutive.

Figure 4. Confusion matrix for interobserver agreement. Numbers indicate how many tubules were assigned to stage Y (row number) in the consensus staging (when both investigators staged the tubules together) and to stage X (column number) by an investigator, as a percentage of all tubules assigned to stage Y. Thus, each row adds up to 100%. Along the diagonal, coded yellow (0%) to green (100%), are the tubules staged identically by both investigator and consensus. Other boxes, coded yellow (0%) to red (100%), are tubules where the investigators disagreed. Empty boxes indicate 0%. Percentages are averaged over 4 repetitions (2 stagings per investigator, 2 investigators). To the right of the matrix are the number of tubules assigned to stage Y (consensus), which can be used to turn percentages into number of tubules. Note that stages I and XII are to be considered consecutive.

Figure 5. Confusion matrix for the computerized staging method with 12 stages. Numbers indicate how many tubules were assigned to stage Y (row number) in the consensus staging (when both investigators staged the tubules together) and to stage X (column number) by the computerized staging tool, as a percentage of all tubules manually assigned to stage Y. Thus, each row adds up to 100%. Along the diagonal, coded yellow (0%) to green (100%), are the tubules staged identically in the consensus and by the computer program. Other boxes, coded yellow (0%) to red (100%), are tubules where the program did not agree with the consensus. Empty boxes indicate 0%. To the right of the matrix are the number of tubules assigned to stage Y (consensus), which can be used to turn percentages into number of tubules. Note that stages I and XII are to be considered consecutive.

In the present study, stages IV to V, X to XI, and II to III were the most commonly misclassified by both investigators and by the computerized staging tool (see Figures 3–5). Because spermatogenesis is a transitional process, a clear-cut boundary between stages is sometimes difficult to draw. Hess [27] describes stages IV, V, VII, XI, and XII to be frequently found in transition in rat, either early or late. Even though it is difficult to compare stages between animal species, these stages correspond well to the stages that were frequently misclassified in the present study.

The texture pattern of Gata-4 stain was key to the computer program

Different techniques have been used to facilitate staging; periodic acid-Schiff is probably the most commonly used. However, the low color contrast of periodic acid-Schiff puts high requirements on the microscopic illumination and calibration for a computer program to reliably differentiate the colors. A staining protocol that yields high contrast and more obvious color differences significantly simplifies the image acquisition protocol and the computer program. Osuru et al. [28] describe a staging method to show acrosome development in mice. Muciaccia et al. [29] suggest a stage classification in humans based on the acrosome development visible by immunohistochemistry for (pro)acrosin, and McClusky et al. [8] propose immunolocalization of Gata-4 in rat testis to mark acrosomes and facilitate staging. In the present study, Gata-4 immunostaining produced a very distinctive dark brown stain of acrosome development in mink, with a high contrast to the counterstained blue nuclei, which proved to be the key to the computer program for staging in mink testicular tissue.

We have evaluated many different algorithms that can be used for the purpose of staging, including nucleus-based bag-of-words models, Gabor filters, and local ternary patterns as texture descriptors and random forests, AdaBoost, and support vector machines as classifiers [30]. The combination of local ternary pattern with a support vector machine performed best. The reason some algorithms work better than others on specific problems is difficult to discern. The local ternary pattern texture descriptor we used looks at the epithelium cross-section as a textured pattern, not relying on correct identification of nuclei or Golgi. The shapes of the high-contrast Gata-4 stain across a tubule were encoded by the local ternary pattern into a numeric descriptor that could be compared to those of other tubules. Local ternary patterns have not been used in histology previously, but local binary patterns, from which the local ternary pattern was derived, have been shown to work well in histological tissue preparations [31–33]. The support vector machine classifier used in the present study has also been shown to work well in a wide variety of histological applications [34–36]. Furthermore, it is one of the few classifiers that does not suffer from a poor ratio of training examples to feature dimensionality (what is commonly referred to as the “curse of dimensionality”). The present study had relatively few training samples per stage, compared with the 652 features produced by the local ternary pattern; thus, the support vector machine was one of the very few classifiers that we could use.
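To make the descriptor more concrete, the following Python sketch computes a basic 3×3 local ternary pattern histogram for a small grayscale patch. It is a minimal illustration under stated assumptions, not the implementation used in the present study: the threshold `t`, the 8-neighbor layout, the function names, and the toy patch are all hypothetical, and the resulting 512-bin histogram does not correspond to the 652 features reported above. In a full pipeline such a histogram would be passed to a support vector machine, which is omitted here.

```python
def ltp_codes(img, t=5):
    """Return (upper, lower) LTP codes for each interior pixel of a 2-D list."""
    # Offsets of the 8 neighbors, clockwise from the top-left pixel (assumed layout).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            center = img[r][c]
            upper = lower = 0
            for bit, (dr, dc) in enumerate(offsets):
                diff = img[r + dr][c + dc] - center
                if diff >= t:        # ternary value +1 goes to the upper pattern
                    upper |= 1 << bit
                elif diff <= -t:     # ternary value -1 goes to the lower pattern
                    lower |= 1 << bit
                # otherwise the ternary value is 0 and neither bit is set
            codes.append((upper, lower))
    return codes


def ltp_histogram(img, t=5):
    """Concatenate normalized 256-bin histograms of the upper and lower codes."""
    codes = ltp_codes(img, t)
    hist = [0.0] * 512
    for upper, lower in codes:
        hist[upper] += 1
        hist[256 + lower] += 1
    n = len(codes)
    return [h / n for h in hist] if n else hist


# Toy patch (hypothetical gray values): dark above, bright below the center pixel.
patch = [
    [10, 10, 10],
    [10, 50, 90],
    [90, 90, 90],
]
features = ltp_histogram(patch, t=5)
```

The tolerance band around the center value is what distinguishes the ternary from the binary pattern: small intensity differences are coded as 0 and ignored rather than being forced into one of two classes.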

The validity of the computerized method is enhanced by pooling

The validity of the computerized staging method defining the 12 stages in the present study was moderate, with an agreement of 52.8% and a kappa value of 0.47 between the consensus staging and the computer program. Both the investigators and the computerized staging tool confused consecutive stages much more frequently than more distant stages (Figures 3–5). This was expected because the development of spermatids is a continuous process that has been split into stages. Consecutive stages are most similar, and their boundary is fuzzy. The finer the division into stages, the more likely it is that a tubule is close to the boundary between stages and may be confused with neighboring stages.
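For readers less familiar with the agreement statistics quoted here, the following Python sketch shows how percent agreement and unweighted Cohen's kappa can be computed from a confusion matrix of counts. The function name and the 3-class toy matrix are assumptions made for illustration; the counts are invented and are not data from the present study.

```python
def agreement_and_kappa(matrix):
    """Return (observed agreement, Cohen's kappa) for a square count matrix.

    matrix[i][j] is the number of items assigned class i by rater 1
    and class j by rater 2.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Observed agreement: share of items on the diagonal.
    p_o = sum(matrix[i][i] for i in range(k)) / n
    # Agreement expected by chance from the marginal totals.
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e)
    return p_o, kappa


# Invented toy counts for three classes (not study data).
toy = [
    [20, 5, 0],
    [4, 15, 6],
    [1, 4, 25],
]
p_o, kappa = agreement_and_kappa(toy)
```

Kappa is lower than the raw agreement because it discounts the matches that two raters would produce by chance given how often each class is used.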

Table 3. Grouping of tubular stages and their distribution within the data set

Group   Stages         No. of tubules   Percent of tubules
A       I–III          158              29
B       IV and V       69               13
C       VI and VII     173              32
D       VIII and IX    88               16
E       X–XII          57               10

Table 4. Summary statistics for investigator and computer staging (A–E staging)

                Agreement (%)   Kappa   Weighted kappa
Intraobserver   93.2            0.91    0.94
Interobserver   89.4            0.86    0.91
Computer        76.7            0.69    0.77

Figure 6. Confusion matrix for the computerized staging method with 5 groups (A–E). Numbers indicate how many tubules were assigned to group Y (row number) in the consensus staging (when both investigators staged the tubules together) and to group X (column number) by the computerized staging tool, as a percentage of all tubules manually assigned to group Y. Thus, each row adds up to 100%. Along the diagonal, coded yellow (0%) to green (100%), are the tubules staged identically in the consensus and by the computer program. Other boxes, coded yellow (0%) to red (100%), are tubules where the program did not agree with the consensus. Empty boxes indicate 0%. To the right of the matrix are the numbers of tubules assigned to group Y (consensus), which can be used to convert percentages into numbers of tubules. Group A corresponds to stages I to III, group B to stages IV to V, group C to stages VI to VII, group D to stages VIII to IX, and group E to stages X to XII. Note that groups A and E are to be considered consecutive.

Twelve different stages were initially identified, a high number compared with staging of, for example, cancer, which often includes 3 to 4 classes [26,37,38]. We suggested pooling the 12 stages in mink into 5 groups (A–E). The pooling raised the agreement between the consensus staging and the computer program to a substantial 76.7%, with a kappa value of 0.69.

Pooling also raised the intraobserver and interobserver agreement. Pooling of stages is not new. In rat, McClusky et al. [39] pooled the stages into 7 different groups based on the 14 stages defined by Leblond and Clermont [40]. Hess et al. [41], on the other hand, pooled the 14 stages of the rat into 4 groups. If the stages were pooled into even fewer groups, the kappa value would probably be higher; but this would also lead to a marked reduction of information transmitted, as discussed in the review by Cross [42].
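The effect of pooling can be sketched in a few lines of Python. The stage-to-group mapping below follows Table 3, and the cyclic distance reflects that groups A and E are consecutive; the function names and the short example lists of consensus and predicted stages are hypothetical and serve only to show how confusions between neighboring stages disappear once stages are pooled.

```python
# Mapping of the 12 mink stages to the 5 pooled groups (from Table 3).
STAGE_TO_GROUP = {
    "I": "A", "II": "A", "III": "A",
    "IV": "B", "V": "B",
    "VI": "C", "VII": "C",
    "VIII": "D", "IX": "D",
    "X": "E", "XI": "E", "XII": "E",
}
GROUPS = ["A", "B", "C", "D", "E"]


def pool(stage):
    """Return the pooled group for a stage given as a Roman numeral."""
    return STAGE_TO_GROUP[stage]


def cyclic_group_distance(g1, g2):
    """Number of steps between two groups on the cycle A-B-C-D-E-A."""
    d = abs(GROUPS.index(g1) - GROUPS.index(g2))
    return min(d, len(GROUPS) - d)


def agreement(consensus, predicted, pooled=False):
    """Fraction of tubules staged identically, optionally after pooling."""
    if pooled:
        consensus = [pool(s) for s in consensus]
        predicted = [pool(s) for s in predicted]
    hits = sum(c == p for c, p in zip(consensus, predicted))
    return hits / len(consensus)


# Hypothetical example: most errors are between neighboring stages.
consensus = ["I", "II", "VI", "VII", "XII", "IX"]
predicted = ["II", "IV", "VII", "VII", "X", "VIII"]
unpooled = agreement(consensus, predicted)             # 1 of 6 identical
pooled = agreement(consensus, predicted, pooled=True)  # 5 of 6 identical
```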

Another reason why the 5-group classification was easier for the computer program was that the grouping of tubules resulted in more example tubules for each category. This led to a better generalization of these examples by the classifier. Kappa is dependent on the prevalence of the condition [11]. This was seen in the present study where tubules used for training were imbalanced between stages: some stages were heavily represented, such as stage VI with 18% of all tubules, whereas other stages were scarce, such as stage X with only 2% of all tubules. This is problematic because the classifier does not take this imbalance into account and is likely to be biased toward the more populous stages. The more examples for a stage, the better the program was at identifying that stage (Figure 5). However, the difference in prevalence of stages did not pose a problem in the manual staging system, where stage X showed intraobserver and interobserver agreement of 84% and 85%, respectively.
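One common way to counter such imbalance, which the present study does not report using, is to weight each class inversely to its frequency during training. The Python sketch below applies the widely used heuristic n_total / (n_classes × n_class) to the group counts in Table 3. The function name is hypothetical, and the weights are shown for illustration only; whether they would improve staging was not tested.

```python
def balanced_class_weights(counts):
    """Weight each class by n_total / (n_classes * n_class)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}


# Number of tubules per pooled group (Table 3).
group_counts = {"A": 158, "B": 69, "C": 173, "D": 88, "E": 57}
weights = balanced_class_weights(group_counts)
# Scarce groups such as E receive a larger weight than populous groups such as C.
```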

How to use the method in evaluation of toxicological damage

Staging is the approach recommended by the Society of Toxicologic Pathology and the regulatory guidelines when analyzing testicular tissue to detect adverse effects of drugs and chemicals on reproduction and fertility [2]. The purpose of staging, according to the recommendation, is to evaluate morphological changes in tubules of specific stages. Thus, the recommendation does not include collecting statistics on the distribution of tubules across stages. One reason for this might be the large number of tubules and animals that need to be examined to obtain a sufficiently precise estimate of the stage distribution for comparative statistics [27]. Computerized staging, as suggested in the present study, would remove the main barrier to obtaining a sufficiently precise estimate, enabling a new, less labor-intensive method of toxicological analysis.

Stage frequency can be altered by endocrine-disrupting chemicals, which supports the importance of this endpoint [39,43]. The frequency of the different stages in mink in the present study has, to our knowledge, not previously been reported. The number of stages differs between species. In mink, 12 stages are defined by Pelletier [3]. This description was based on the changes in the development of the acrosome in the spermatid, in accordance with the method proposed by Leblond and Clermont [40], who defined the 14 stages in rats. The frequency of stages is proportional to their duration. Stage VII, the most frequent stage in the rat, lasts for 58 h [7,41]; with a spermatogenic cycle of 12.9 d, 58 h corresponds to 18.7% of tubules. This frequency corresponds well to findings in the present study in mink, where the corresponding stage VI was found to be the most prevalent with 18% of tubules.
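The arithmetic behind the 18.7% figure can be written out as a short Python sketch. It assumes only that the fraction of tubules in a stage equals the stage duration divided by the cycle length, as stated above; the function name is hypothetical.

```python
def expected_stage_frequency(stage_duration_h, cycle_length_d):
    """Expected fraction of tubules in a stage: stage duration / cycle length."""
    return stage_duration_h / (cycle_length_d * 24.0)


# Rat stage VII: 58 h within a 12.9-d cycle (309.6 h).
rat_stage_vii = expected_stage_frequency(58, 12.9)
print(f"{rat_stage_vii:.1%}")  # prints 18.7%
```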

The association of cells, defining the stages, can also be disrupted if animals are exposed to chemicals affecting reproduction. Knowledge of staging is required to identify cell-specific and stage-specific effects [44]. Currently, toxicological pathology of the testis involves searching for tubules in specific stages [2]. There are numerous stage-specific chemical effects described [2,44]. For example, it is recommended to check tubules in stage VII for degenerated pachytene spermatocytes and round spermatids since decreased testosterone levels will lead to increased rate of degeneration of these cells in this stage [2]. It is time-consuming to manually find sufficient tubules of a certain stage. However, even if only one-half of the tubules marked by the program as stage VII are actually of that stage (Figure 5), the pathologist will have to review many fewer tubules to find a sufficient number of the desired stage. If the stages are pooled, group C will include stages VI and VII, which is found correctly by the computerized staging tool in 84% of cases (Figure 6). Another important endpoint that requires staging to be detected is sperm retention, where spermatids do not exit into the tubular lumen as normal during stage VII (or stage VIII in rat) [44]. To detect sperm retention, it is recommended to observe tubules in stages IX to XI in the rat (corresponds to stages VIII–IX in mink) to ensure that they only contain 1 population of elongated spermatids by the lumen. This would correspond to group D in the pooled staging system described in the present study.
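The workload argument above can be illustrated with a back-of-the-envelope Python sketch. It assumes that tubules are inspected one by one and that each inspected tubule is of the desired stage with a fixed probability, so that the expected number to inspect is the number needed divided by that probability. The 50% value is the pessimistic case mentioned in the text; the 14% prevalence for stage VII is an assumed value for illustration (roughly what remains of the 32% in group C after the 18% in stage VI), and the target of 20 tubules and the function name are hypothetical.

```python
import math


def tubules_to_review(n_needed, hit_rate):
    """Expected number of tubules to inspect to find n_needed of the target stage.

    hit_rate is the probability that an inspected tubule is of the target stage.
    """
    if not 0 < hit_rate <= 1:
        raise ValueError("hit_rate must be in (0, 1]")
    return math.ceil(n_needed / hit_rate)


# Searching all tubules, assuming about 14% are in stage VII (assumed value).
unassisted = tubules_to_review(20, 0.14)
# Searching only tubules the program marked as stage VII, half of them correct.
assisted = tubules_to_review(20, 0.50)
```

Under these assumptions the pathologist would inspect 40 pre-selected tubules instead of roughly 143 unselected ones to find 20 of the desired stage.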

The future use and enhancement of the computerized staging tool

The computerized staging tool proposed in the present study can find applications in the new world of digital pathology and whole-slide imaging [45,46]. With whole-slide imaging and automated tubule segmentation in place, the tool can be used to efficiently direct the investigator to tubules of the required stage. Increasing the number of example tubules for classifier training would significantly improve staging accuracy. This is something that could be accomplished as the tool is used: tubules identified by the investigator as wrongly staged can be added to the training set, steadily improving the computerized staging tool.

Immunohistochemical localization with polyclonal antibody Gata-4 shows a similar pattern marking the acrosome development in rats as in the present study [8], and the computerized staging tool would likely work on rats, after retraining with rat-specific data.

CONCLUSIONS

The presented staging method defining the 12 stages in mink testicular tissue using Gata-4 to visualize acrosome development proved to be both intraobserver-reproducible and interobserver-reproducible. To further advance and objectify this method, we propose a computerized staging system in which the 12 stages were pooled into 5 groups with a substantial agreement with the consensus staging by 2 investigators. These results have the potential to modernize the tedious staging process required in toxicological evaluation of testicular tissue, especially if combined with whole-slide imaging and automated tubular segmentation. Furthermore, a computerized method facilitates the handling of large amounts of data and collaboration between research groups.

Acknowledgment: Authors A. Fakhrzadeh and E. Spörndly-Nees contributed equally to the present study. We are grateful for the excellent technical assistance of G. Ericson-Forslund. The authors declare that no conflict of interest exists.

Data Availability—Data are available upon request from the corresponding author (Ellinor.Sporndly-Nees@slu.se).


REFERENCES

1. Bergman A, Heindel JJ, Jobling S, Kidd KA, Zoeller RT, eds. 2013. State of the Science of Endocrine Disrupting Chemicals 2012: An Assessment of the State of the Science of Endocrine Disruptors Prepared by a Group of Experts for the United Nations Environment Programme and World Health Organization. World Health Organization, Geneva, Switzerland.

2. Lanning LL, Creasy DM, Chapin RE, Mann PC, Barlow NJ, Regan KS, Goodman DG. 2002. Recommended approaches for the evaluation of testicular and epididymal toxicity. Toxicol Pathol 30:507–520.

3. Pelletier RM. 1986. Cyclic formation and decay of the blood-testis barrier in the mink (Mustela vison), a seasonal breeder. Am J Anat 175:91–117.

4. Creasy DM. 2003. Evaluation of testicular toxicology: A synopsis and discussion of the recommendations proposed by the Society of Toxicologic Pathology. Birth Defects Res B Dev Reprod Toxicol 68:408–415.

5. Basu N, Scheuhammer AM, Bursian SJ, Elliott J, Rouvinen-Watt K, Chan HM. 2007. Mink as a sentinel species in environmental health. Environ Res 103:130–144.

6. Persson S, Brunstrom B, Backlin B-M, Kindahl H, Magnusson U. 2012. Wild mink (Neovison vison) as sentinels in environmental monitoring. Acta Vet Scand 54:S9.

7. Russel LD, Ettlin RA, Hikim APS, Clegg ED. 1990. Histological and Histopathological Evaluation of the Testis. Cache River Press, St. Louis, MO, USA.

8. McClusky LM, Patrick S, Barnhoorn IE, van Dyk JC, de Jager C, Bornman MS. 2009. Immunohistochemical study of nuclear changes associated with male germ cell death and spermiogenesis. J Mol Histol 40:287–299.

9. LaVoie HA. 2003. The role of GATA in mammalian reproduction. Exp Biol Med 228:1282–1290.

10. Sporndly-Nees E, Ekstedt E, Magnusson U, Fakhrzadeh A, Luengo Hendriks CL, Holm L. 2015. Effect of pre-fixation delay and freezing on mink testicular endpoints for environmental research. PloS One 10:e0125139.

11. Watson PF, Petrie A. 2010. Method agreement analysis: A review of correct methodology. Theriogenology 73:1167–1179.

12. Gibson-Corley KN, Olivier AK, Meyerholz DK. 2013. Principles for valid histopathologic scoring in research. Vet Pathol 50:1007–1015.

13. Latendresse JR, Warbrittion AR, Jonassen H, Creasy DM. 2002. Fixation of testes and eyes using a modified Davidson’s fluid: Comparison with Bouin’s fluid and conventional Davidson’s fluid. Toxicol Pathol 30:524–533.

14. Fakhrzadeh A, Spörndly-Nees E, Holm L, Luengo Hendriks CL. 2012. Analyzing tubular tissue in histopathological thin sections. Proceedings, Digital Image Computing Techniques and Applications (DICTA), 2012 International Conference, Fremantle, Australia, December 3–5, 2012, pp 1–6.

15. Fakhrzadeh A, Spörndly-Nees E, Holm L, Luengo Hendriks CL. 2013. Epithelial cell segmentation in histological images of testicular tissue using graph-cut. In Petrosino A, ed, Image Analysis and Processing ICIAP 2013, Vol 8157, Lecture Notes in Computer Science. Springer, Berlin, Germany, pp 201–208.

16. Barrett WA, Mortensen EN. 1997. Interactive live-wire boundary extraction. Med Image Anal 1:331–341.

17. Chodorowski A, Mattsson U, Langille M, Hamarneh G. 2005. Color lesion boundary detection using live wire. In Fitzpatrick JM, Reinhardt JM, eds, Medical Imaging 2005: Image Processing, Vol 5747, Proceedings of SPIE. SPIE, San Diego, CA, USA, pp 1589–1596.

18. Nanni L, Brahnam S, Lumini A. 2010. A local approach based on a local binary patterns variant texture descriptor for classifying pain states. Expert Syst Appl 37:7888–7894.

19. Xiaoyang T, Triggs B. 2010. Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans Image Proc 19:1635–1650.

20. Tan TN. 1998. Rotation invariant texture features and their use in automatic script identification. IEEE Trans Pattern Anal Mach Intell 20:751–756.

21. Burges CJC. 1998. A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2:121–167.

22. Chang C-C, Lin C-J. 2011. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2:7.

23. Devijver PA, Kittler J. 1982. Pattern Recognition: A Statistical Approach. Prentice-Hall, London, UK.

24. Cohen J. 1968. Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70:213–220.

25. Viera AJ, Garrett JM. 2005. Understanding interobserver agreement: The kappa statistic. Fam Med 37:360–363.

26. Allsbrook WC Jr, Mangold KA, Johnson MH, Lane RB, Lane CG, Epstein JI. 2001. Interobserver reproducibility of Gleason grading of prostatic carcinoma: General pathologist. Hum Pathol 32:81–88.

27. Hess RA. 1990. Quantitative and qualitative characteristics of the stages and transitions in the cycle of the rat seminiferous epithelium: Light microscopic observations of perfusion-fixed and plastic-embedded testes. Biol Reprod 43:525–542.

28. Osuru HP, Monroe JE, Chebolu AP, Akamune J, Pramoonjago P, Ranpura SA, Reddi PP. 2014. The acrosomal protein SP-10 (Acrv1) is an ideal marker for staging of the cycle of seminiferous epithelium in the mouse. Mol Reprod Dev 81:896–907.

29. Muciaccia B, Boitani C, Berloco BP, Nudo F, Spadetta G, Stefanini M, de Rooij DG, Vicini E. 2013. Novel stage classification of human spermatogenesis based on acrosome development. Biol Reprod 89:60.

30. Fakhrzadeh A. 2015. Computerized cell and tissue analysis. PhD thesis. Uppsala University, Uppsala, Sweden.

31. Hegenbart S, Uhl A, Vecsei A. 2011. Impact of histogram subset selection on classification using multi-scale LBP-operators. In Handels H, Ehrhardt J, Deserno TM, Meinzer H-P, Tolxdorff T, eds, Bildverarbeitung für die Medizin 2011. Springer, Berlin, Germany, pp 359–363.

32. Alomari RS GS, Chaudhary V, Al-Kadi O. 2012. Local binary patterns for stromal area removal in histology images. In Medical Imaging 2012: Computer-Aided Diagnosis, Vol 8315. SPIE, San Diego, CA, USA, p 831524.

33. Linder N, Konsti J, Turkki R, Rahtu E, Lundin M, Nordling S, Haglund C, Ahonen T, Pietikainen M, Lundin J. 2012. Identification of tumor epithelium and stroma in tissue microarrays using texture analysis. Diagn Pathol 7:22.

34. Cosatto E, Miller M, Graf HP, Meyer JS. 2008. Grading nuclear pleomorphism on histological micrographs. Proceedings, 19th International Conference on Pattern Recognition: ICPR 2008, Tampa, FL, USA, December 8–11, 2008, pp 1–4.

35. Doyle S, Hwang M, Shah K, Madabhushi A, Feldman M, Tomaszeweski J. 2007. Automated grading of prostate cancer using architectural and textural image features. Proceedings, 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, April 12–15, 2007, Arlington, VA, USA, pp 1284–1287.

36. Yinhai W, Turner R, Crookes D, Diamond J, Hamilton P. 2007. Investigation of methodologies for the segmentation of squamous epithelium from cervical histological virtual slides. Proceedings, International Machine Vision and Image Processing Conference, September 5–7, 2007, Washington, DC, pp 83–90.

37. Costantini M, Sciallero S, Giannini A, Gatteschi B, Rinaldi P, Lanzanova G, Bonelli L, Casetti T, Bertinelli E, Giuliani O, Castiglione G, Mantellini P, Naldoni C, Bruzzi P. 2003. Interobserver agreement in the histologic diagnosis of colorectal polyps. The experience of the Multicenter Adenoma Colorectal Study (SMAC). J Clin Epidemiol 56:209–214.

38. Lytwyn A, Salit IE, Raboud J, Chapman W, Darragh T, Winkler B, Tinmouth J, Mahony JB, Sano M. 2005. Interobserver agreement in the interpretation of anal intraepithelial neoplasia. Cancer 103:1447–1456.

39. McClusky LM, de Jager C, Bornman MS. 2007. Stage-related increase in the proportion of apoptotic germ cells and altered frequencies of stages in the spermatogenic cycle following gestational, lactational, and direct exposure of male rats to p-nonylphenol. Toxicol Sci 95:249–256.

40. Leblond CP, Clermont Y. 1952. Definition of the stages of the cycle of the seminiferous epithelium in the rat. Ann N Y Acad Sci 55:548–573.

41. Hess RA, Schaeffer DJ, Eroschenko VP, Keen JE. 1990. Frequency of the stages in the cycle of the seminiferous epithelium in the rat. Biol Reprod 43:517–524.

42. Cross SS. 1998. Grading and scoring in histopathology. Histopathology 33:99–106.

43. Aravindakshan J, Gregory M, Marcogliese DJ, Fournier M, Cyr DG. 2004. Consumption of xenoestrogen-contaminated fish during lactation alters adult male reproductive function. Toxicol Sci 81:179–189.

44. Creasy DM. 1997. Evaluation of testicular toxicity in safety evaluation studies: The appropriate use of spermatogenic staging. Toxicol Pathol 25:119–131.

45. Al-Janabi S, Huisman A, Van Diest PJ. 2012. Digital pathology: Current status and future perspectives. Histopathology 61:1–9.

46. Madabhushi A. 2009. Digital pathology image analysis: Opportunities and challenges. Imaging Med 1:7–10.
