Assessing the Barley Genome Zipper and
Genomic Resources for Breeding Purposes
Cristina Silvar, Mihaela-Maria Martis, Thomas Nussbaumer, Nicolai Haag, Ruben Rauser, Jens Keilwagen, Viktor Korzun, Klaus F. X. Mayer, Frank Ordon and Dragan Perovic
Linköping University Post Print
N.B.: When citing this work, cite the original article.
Original Publication:
Cristina Silvar, Mihaela-Maria Martis, Thomas Nussbaumer, Nicolai Haag, Ruben Rauser, Jens Keilwagen, Viktor Korzun, Klaus F. X. Mayer, Frank Ordon and Dragan Perovic, Assessing the Barley Genome Zipper and Genomic Resources for Breeding Purposes, 2015, PLANT GENOME, (8), 3.
http://dx.doi.org/10.3835/plantgenome2015.06.0045 Copyright: Crop Science Society of America
https://www.crops.org/
Postprint available at: Linköping University Electronic Press http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-124523
Cristina Silvar1,2, Mihaela M. Martis3,4, Thomas Nussbaumer3,8, Nicolai Haag1,5, Ruben Rauser1, Jens Keilwagen6, Viktor Korzun7, Klaus F.X. Mayer3, Frank Ordon1, Dragan Perovic1*
Running Head: Assessment of barley genomic resources
1Julius Kühn-Institute (JKI), Federal Research Institute for Cultivated Plants, Institute for Resistance Research and Stress Tolerance, 06484-Quedlinburg, Germany
2Grupo de Investigación en Bioloxía Evolutiva, Departamento de Bioloxía Animal, Bioloxía Vexetal e Ecoloxía, Universidade da Coruña, 15071-A Coruña, Spain
3Plant Genome and System Biology (PGSB), Helmholtz Center Munich, 85764-Neuherberg, Germany
4BILS (Bioinformatics Infrastructure for Life Sciences), Division of Cell Biology, IKE, Faculty of Health Sciences, Linköping University, SE-581 85 Linköping, Sweden
5Julius Kühn-Institute (JKI), Federal Research Institute for Cultivated Plants, Institute for Plant Protection in Fruit Crops and Viticulture, 76833 Siebeldingen, Germany
6Julius Kühn-Institute (JKI), Federal Research Institute for Cultivated Plants, Institute for Biosafety in Plant Biotechnology, 06484-Quedlinburg, Germany
7KWS LOCHOW GmbH, 37574 Einbeck, Germany
8Division of Computational Systems Biology, Department of Microbiology and Ecosystem Science, University of Vienna, 1090, Vienna, Austria
Abstract
The aim of this study was to estimate the accuracy and convergence of newly developed barley
genomic resources, primarily GenomeZipper (GZ) and POPulation SEQuencing (POPSEQ), at the genome-wide level and to assess their usefulness in applied barley breeding by analysing seven
known loci. Comparison of barley GenomeZipper and POPSEQ maps to a newly developed
consensus genetic map constructed with data from thirteen individual linkage maps yielded an
accuracy of 97.8% (GenomeZipper) and 99.3% (POPSEQ), respectively, regarding the chromosome
assignment. The percentage of agreement in marker position indicates that on average only 3.7%
GenomeZipper and 0.7% POPSEQ positions are not in accordance with their cM coordinates in the
consensus map. The fine scale comparison involved seven genetic regions on chromosomes 1H, 2H,
4H, 6H and 7H, harboring major genes and quantitative trait loci (QTL) for disease resistance. In
total, 179 GZ loci were analyzed and 64 polymorphic markers were developed. Overall, 89.1% of
these were allocated within the targeted intervals and 84.2% followed the predicted order.
Forty-four markers showed a match to a POPSEQ-anchored contig, the percentage of collinearity being
93.2% on average. Forty-four markers allowed the identification of twenty-five fingerprinted
contigs (FPC) and a clearer delimitation of the physical regions containing the traits of interest.
Our results demonstrate that an increase in marker density of barley maps by using new genomic
data significantly improves the accuracy of GenomeZipper. In addition, the combination of different
barley genomic resources can be considered as a powerful tool to accelerate barley breeding.
Abbreviations: Genome Zipper, virtually ordered barley genes; POPSEQ, population sequencing;
Consensus map, integration of individual linkage maps; PCR, polymerase chain reaction; QTL, quantitative trait loci.
Barley (Hordeum vulgare L.) was domesticated in the Fertile Crescent about 10,000 years ago
(Badr et al., 2000) and independently in Tibet, as an adaptation to extreme environmental
conditions, about 3,500–4,000 years ago (Dai et al., 2012). Today, barley is one of the most
important cereal crops worldwide, ranking fourth in terms of total production (FAOSTAT, 2012).
Such relevance arises from its versatility to adapt to different stress conditions and from its essential
use in malting and brewing industries as well as for animal feed (Baik and Ullrich, 2008; Ceccarelli
et al., 2010; Verstegen et al., 2014). Recent reports on barley’s health benefits have also promoted a
renewed interest for this ancient food grain (Brockman et al., 2013; Sullivan et al., 2013). Apart
from this key role in agriculture, the diploid and inbreeding nature of barley makes it also a very
attractive model species for genetic studies within the Triticeae tribe (Bockelman and Valkoun,
2011). The major impediment for its full exploitation comes from the presence of a large and
complex repeat-rich genome of 5.1 Gbp (Dolezel et al., 1998). Nevertheless, progress in barley
genetics and genomics research has been continuously moving forward (Graner et al., 2011;
Kumlehn and Stein, 2014).
Since the pioneering work of Sturtevant (1913), who constructed the first genetic linkage map (in Drosophila), the
mapping of genes, morphological traits, and now molecular markers and sequences has been one of the
most challenging tasks for many generations of geneticists. In the meantime, many tools and
strategies for the ordering of markers and sequences have been developed, but all of them have
advantages and disadvantages and introduce a certain level of error (Romero et al.,
2009; He et al., 2001). Nowadays, with the decline in the costs of next-generation sequencing
(NGS) technologies and high throughput genotyping platforms, permitting the generation of
thousands of data points in a very short time, there is a genuine need for new methods for ordering
of genetic data and strategies that assess the accuracy of the order. Among the ordering approaches,
the barley GenomeZipper (GZ) (Mayer et al., 2011) and POPulation SEQuencing (POPSEQ)
(Mascher et al., 2013a) are the most advanced ones in barley genetics. Furthermore, single maps in
many thousands of markers and NGS data. However, up to now little is known about the error rate
and precision of constructed maps. In recent years, high-throughput techniques, e.g. the Illumina iSelect
platform and Genotyping by Sequencing (GBS), along with the flow-sorting of chromosomes, have
revolutionized barley genotyping (Simková et al., 2008; Muñoz-Amatriaín et al., 2011, 2014a). For
example, an Illumina 9K SNP chip based on sequence polymorphisms in ten diverse cultivated
barley genotypes and a Genotyping-by-Sequencing (GBS) approach for barley have been recently
developed (Comadran et al., 2012; Poland et al., 2012). In spite of the enormous progress in barley
genomics, these are of limited use without the availability of a draft genome sequence. In 2012, the
International Barley Sequencing Consortium (IBSC) generated a densely anchored physical map of
the barley genome comprising 9,265 fingerprinted BAC contigs spanning 4.98 Gb (IBSC, 2012).
Furthermore, Mayer et al. (2009; 2011) developed another genomic resource, which provided clues
on the barley genome composition. They constructed linear ordered virtual gene maps of barley by
using the so-called GenomeZipper approach. The barley GZ assembles 86% of the barley genes in a
putative linear order along the individual barley chromosomes by exploiting the high synteny
among three reference grass genomes, namely Brachypodium distachyon, Oryza sativa, and
Sorghum bicolor (Goff et al., 2002; Yu et al., 2002; Paterson et al., 2009; International Brachypodium Initiative, 2010). More recently, POPSEQ facilitated the development of genetically
ordered contigs from a whole genome shotgun (WGS) assembly of barley cv. Morex by genotyping
a mapping population with shallow genome coverage (Mascher et al., 2013a). Subsequently, the
new information provided by POPSEQ was employed to order and genetically anchor the barley
physical map, establishing a minimum tiling path (MTP) that comprises more than 65,000 BAC
clones (Ariyadasa et al., 2014). Recently, 15,622 BACs, representing the minimum tiling path of
72,052 physically mapped gene-bearing BACs, were identified and sequenced (Muñoz-Amatriaín et
al., 2015).
Undoubtedly, all these advancements will be extremely beneficial in a wide range of studies in both
context, the use of this information remains mostly unexploited. Only a few reports in recent years
have partly focused on the application of the barley GenomeZipper for marker saturation of
genetic intervals containing interesting traits, such as spike density, resistance to powdery mildew,
barley yellow/mild mosaic virus or barley yellow dwarf disease (Shahinnia et al., 2012; Silvar et al.,
2013; Ordon and Perovic, 2013; Yang et al., 2013; Lüpken et al., 2013, 2014). In barley breeding
there is an urgent need for tools mostly directed to the quick and efficient identification of sets of
molecular markers which are closely linked to the traits of interest. Such markers may be readily
applied to marker-assisted selection (MAS), marker-assisted backcrossing strategies or so-called
precision breeding (McCouch, 2004), which enable the selection of traits with greater accuracy and
their cost-effective deployment in new varieties (Collard and Mackill, 2008). Similarly, those
markers would be advantageous in accelerating map-based cloning approaches (Bolger et al., 2014;
Yang et al., 2014) followed by allele mining and exploration of natural genetic variation
(Muñoz-Amatriaín et al., 2014b). The novel genomic resources of barley are valuable tools in this respect.
Nevertheless, to take full advantage of these, it is essential to first evaluate and validate these
tools in a breeding context.
In the present work, thirteen linkage maps (Muñoz-Amatriaín et al., 2014a; Perovic et al., in
preparation) were used to construct a consensus map, which was compared to the barley GenomeZipper, POPSEQ and the barley physical map. Furthermore, seven well-defined loci were
employed to assess the same resources at a microsyntenic level to get information on the accuracy
and usefulness to accomplish fine mapping schemes and physical delimitation of genomic regions
MATERIALS AND METHODS
Construction of a consensus marker map
A set of 13 genetic linkage maps was used for the construction of a consensus map. Twelve of them were taken from Muñoz-Amatriaín et al. (2014a), while the thirteenth map was developed from the cross MBR1012×Scarlett (Perovic et al., in preparation). This latter population of 86 doubled haploid (DH) lines was genotyped with the barley iSelect 9K SNP chip (Comadran et al., 2012). The
consensus map was constructed using the R package LPmerge according to Endelman and Plomion
(2014). In short, a consensus map was computed independently for each chromosome using a range
of possible interval values from 1 to 5. These interval values specify the number of neighboring
markers that were used to compute the consensus map. Subsequently, the root mean square error
(RMSE) values between the corresponding consensus map and the individual maps were computed
for each possible interval and chromosome. Based on these RMSE values, the consensus map with the
smallest RMSE value was selected as the final map for each chromosome. All chromosomal maps were
manually verified and additional SNP markers, originating from different Illumina platforms, i.e. the 9K iSelect chip (Comadran et al., 2012), and a set of 459 BOPA markers (Close et al., 2009),
were included.
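The selection step described above can be sketched as follows. This is only a minimal illustration in Python and not the LPmerge implementation (LPmerge is an R package). The function names, marker names and positions are hypothetical, and averaging the RMSE over the individual maps is an assumption, since the text only states that the smallest value was chosen.

```python
from math import sqrt


def rmse(consensus, linkage_map):
    """RMSE (in cM) between a consensus map and one linkage map over shared markers."""
    shared = [m for m in linkage_map if m in consensus]
    if not shared:
        raise ValueError("no shared markers between the two maps")
    squared = sum((consensus[m] - linkage_map[m]) ** 2 for m in shared)
    return sqrt(squared / len(shared))


def select_interval(candidates, linkage_maps):
    """Pick the interval whose consensus map has the smallest mean RMSE.

    candidates: {interval value: {marker: cM position}} for one chromosome.
    Averaging over the individual maps is an assumption of this sketch.
    """
    mean_rmse = {}
    for interval, consensus in candidates.items():
        values = [rmse(consensus, lm) for lm in linkage_maps]
        mean_rmse[interval] = sum(values) / len(values)
    best = min(mean_rmse, key=mean_rmse.get)
    return best, mean_rmse


# Hypothetical toy data: two individual maps and two candidate consensus maps.
linkage_maps = [
    {"m1": 0.0, "m2": 5.0, "m3": 10.0},
    {"m1": 0.0, "m2": 6.0, "m3": 12.0},
]
candidates = {
    1: {"m1": 0.0, "m2": 5.5, "m3": 11.0},
    2: {"m1": 0.0, "m2": 8.0, "m3": 14.0},
}
best_interval, scores = select_interval(candidates, linkage_maps)
```

In the study the candidate maps themselves come from LPmerge for interval values 1 to 5; here they are simply given as input.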
Comparative analysis of the consensus map to the Barley GenomeZipper and the POPSEQ map
On the basis of common BOPA markers, the genetic positions between the consensus marker map of this study and the consensus map of Close et al. (2009) were identified and compared. The results of this comparison were visualized for each barley chromosome individually by generating dot plots and statistically evaluated by calculating Spearman correlations using the Python packages Matplotlib (Hunter, 2007) and NumPy, respectively.
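As an illustration of the statistic used here, Spearman's rank correlation is the Pearson correlation of the ranks of the two position lists, with tied positions (co-segregating markers) receiving their average rank. The sketch below is written in plain Python rather than with the NumPy calls used in the study, which are not detailed in the text; the function names and the toy positions are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical cM positions of five shared markers on one chromosome.
map_a = [0.0, 1.2, 1.2, 5.4, 9.8]
map_b = [0.0, 2.0, 1.5, 6.1, 11.0]
rho = spearman(map_a, map_b)
```

A coefficient close to 1 indicates that the shared markers appear in nearly the same order in both maps.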
The consensus map created in this work was used in assessment of the robustness of the
iSelect markers were aligned against the POPSEQ-anchored cv. Morex contigs (Mascher et al.,
2013a) and against the gene indices of the GenomeZipper for all seven barley chromosomes (Mayer
et al., 2011) by using BLASTN (Altschul et al., 1997). The barley 'zipper' data set consists of anchored barley fl-cDNAs, barley markers, and genes from the reference genomes of
Brachypodium, rice, and Sorghum. Only the first best hits with an alignment length of at least 100
bp and location on the same chromosomes were considered. The quality of the observed overlap between the three maps (consensus map, virtually ordered gene map (GZ), and POPSEQ map) was assessed by dividing each position by the total map length and allowing a 10% difference. The recombination frequency was computed in non-overlapping bins of 50 GenomeZipper loci. All consensus markers in a given bin were considered and the genetic distance between the markers with the highest and lowest positions was computed. To filter out wrongly assigned markers, all markers deviating by more than 10 cM from the median position of their bin were removed. The results were statistically evaluated through a non-parametric measure of correlation (Spearman's rank correlation coefficient, SRC) and visualized by using CIRCOS (Krzywinski et al., 2009).
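The two numerical criteria in this paragraph can be sketched in Python as shown below. This is one possible reading of the description and not the authors' code: positions are compared as fractions of their map length with a tolerance of 0.10, and within each bin of 50 loci the outlier filter is applied before the span is taken. All function names and toy values are hypothetical.

```python
from statistics import median


def positions_agree(pos_a, length_a, pos_b, length_b, tolerance=0.10):
    """True if the relative positions (position / map length) differ by at most `tolerance`."""
    return abs(pos_a / length_a - pos_b / length_b) <= tolerance


def bin_spans(marker_cm_by_locus, bin_size=50, max_deviation=10.0):
    """Genetic span (cM) of consensus markers per non-overlapping bin of loci.

    marker_cm_by_locus: list in GenomeZipper order; each item holds the
    consensus cM positions of the markers assigned to that locus.
    Markers deviating more than `max_deviation` cM from the bin median are
    dropped before the span is computed (the order of steps is assumed).
    """
    spans = []
    for start in range(0, len(marker_cm_by_locus), bin_size):
        loci = marker_cm_by_locus[start:start + bin_size]
        positions = [cm for locus in loci for cm in locus]
        if not positions:
            spans.append(None)
            continue
        mid = median(positions)
        kept = [cm for cm in positions if abs(cm - mid) <= max_deviation]
        spans.append(max(kept) - min(kept) if kept else None)
    return spans


# Hypothetical toy data with a bin size of 3 loci instead of 50.
toy_loci = [[1.0], [1.5, 40.0], [2.0], [], [5.0], [5.5]]
toy_spans = bin_spans(toy_loci, bin_size=3)
same_region = positions_agree(56.0, 112.0, 70.0, 140.0)
different_region = positions_agree(10.0, 100.0, 30.0, 100.0)
```

In the first toy bin the marker at 40.0 cM lies far from the bin median and is discarded, so it does not inflate the span.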
Microcollinearity - Comparative analysis of seven genetically mapped loci to GenomeZipper, POPSEQ and barley physical map
Seven loci or Quantitative Trait Loci (QTL) (subsequently termed L1 to L7) located on five
different barley chromosomes and genetically mapped in the context of other studies were used for
comparative purposes at the microsynteny level: L1 (chromosome 1HS) (RphMBR1012) (König et
al., 2012), L2 (rym7) (1H centromere) (Yang et al., 2013), L3 (2HL) (RydHb_2HL) (Perovic et al.,
2013), L4 (rym11) (4H centromere) (Lüpken et al., 2013), L5 (Bg_QTL_6HL) (6HL) (Silvar et al.,
2011a; 2013), L6 (Bg_QTL_7HS) (7HS) (Silvar et al., 2010; 2012; 2013) and L7 (Bg_QTL_7HL)
some of them (L1, L5, L6 and L7) are assumed to be less conserved (Leister et al.,
1998).
Firstly, the synteny-ordered virtual gene map of barley (GZ) was validated by using this set of
seven genetically mapped loci and developing markers from corresponding barley ‘zippers’
followed by mapping as described in the original publications. For the short arm of chromosome 1H
and the long arm of chromosome 2H (unpublished data), comparative analysis to barley ‘zippers’,
marker design and genotyping were essentially done according to Perovic et al. (2004) and Silvar et
al. (2013). Briefly, markers genetically flanking the regions of interest were used to select the target
intervals in the virtual linear gene map. Zipper-based markers were used for amplification in both
parental lines and amplicons were sequenced on an ABI377XL instrument using BigDye terminator
sequencing chemistry (ABI Perkin Elmer, Weiterstadt, Germany). Markers for which
polymorphisms were based on presence/absence of PCR fragments between parental lines were
directly mapped, whereas SNPs were converted to CAPS markers (Perovic et al., 2013) or
pyrosequencing markers using a biotin-labelled M13 primer (Silvar et al., 2011b). Linkage analyses
were performed with JoinMap 4.0 (van Ooijen, 2006). Secondly, 57 markers derived from the target
intervals of the seven loci were compared to the POPSEQ-anchored contigs of cultivar Morex
(Mascher et al., 2013a) by using BLAT (Kent, 2002), requiring an identity of 99% and a minimum
match length of 50 nucleotides. The whole-genome shotgun (WGS) contigs and the physical
fingerprinted contigs (FPC) were linked to each other by BLAST using stringent criteria: only
matches with at least 99% sequence identity and at least 90% sequence coverage of the shorter
sequence (either WGS contig or a physical contig) were considered. The genetic markers from the
seven loci were also compared to 15,622 recently sequenced barley clones (Muñoz-Amatriaín et al.,
2015). Marker sequences were assigned to a clone with the help of the Harvest BLAST
(http://www.harvest-blast.org/) when clones matched a marker with an e-value of ≤ 10e-5 and a match length of at least 100 nucleotides. Clones were further linked to physical contigs on the
(http://harvest-web.org/hweb/utilmenu.wc?job=RTRVFORM&db=MOREX_HV3_10.4.1) where a mapping of
RESULTS
Construction of a consensus genetic map
The consensus map of barley in this study consists of SNP markers with an average length of 125
nucleotides and it was constructed by using 13 mapping populations and different Illumina
platforms (9K Infinium iSelect high density custom genotyping bead chip and Illumina BeadXpress
Array) (Supplemental Table S1). The resulting consensus map holds a total of 6,405 markers in
1,978 unique positions (bins) (Table 1). The total length of the genetic map is 1,120.27 cM,
providing a theoretical density of one marker per 0.17 cM and one marker bin every 0.57 cM.
Chromosomes 2H and 5H had the highest number of markers and bins and their genetic maps were
also the largest in size, whereas chromosomes 1H, 4H and 6H were the smallest, containing the
lowest number of markers and bins (Table 1). Ordering conflicts among the set of linkage maps
ranged from zero for chromosome 6H to 18 for chromosome 2H (Table 1). The map displays two
gaps of 7.46 and 6 cM in the long arms of chromosomes 2H and 6H, respectively, with the
remaining gaps being smaller than 5 cM (Supplemental Table S1 and Supplemental Fig. S1).
Comparative analysis to the barley GenomeZipper and POPSEQ at the genome-wide scale
A map comprising 2,785 BOPA markers constituted the genome-wide framework along which the
barley genes were ordered and positioned for each individual barley chromosome in the
GenomeZipper (Mayer et al., 2011). A proportion of these BOPA markers is also represented in the
iSelect 9K chip and is segregating in the consensus map. The first comparison, which also served as a
positive control, consisted of corroborating the positions of BOPA markers shared
between both datasets. In total, 2,447 markers were in common between the consensus map and
chromosome 5H (Supplemental Fig. S2). The agreement in the order of shared BOPA markers
between the two maps at the individual chromosome level varied from 0.993 (4H) to 0.999 (2H,
7H), as measured by Spearman's rank correlation coefficient. On average, the agreement in the
intra-chromosomal location for the BOPA markers was 99.6% (Supplemental Fig. S2).
The marker order in the consensus map was employed to evaluate the collinearity of predicted
genes in the barley ‘zippers’ and in their sequence counterparts in the POPSEQ-anchored data.
Based on sequence homology, 689 markers (10.76%) from the consensus map found a
counterpart neither in the virtual map nor in the WGS contig-based map (Table 2). Additionally,
1,337 (20.9%) markers did not match any locus inferred by the GenomeZipper. Out of 6,405
markers, 4,379 (68.4%) were represented in the GenomeZipper (Table 2). A high percentage
(97.8%) of the marker loci is located on the same chromosome in both datasets and only 2.2% of
markers showed a hit to an erroneous chromosome (Table 2). Detailed analysis considering the
number of markers per chromosome revealed that chromosome 2H showed the highest amount of
disagreements regarding chromosome assignments (3.2%), while chromosome 5H displayed the
lowest level of misaligned markers (1.6%). The genetic coordinates of markers in the consensus
map and the inferred loci in ‘zippers’ were consistent at the genome-wide scale, although the
resolution of comparison was lower in genetic pericentromeric regions, where a large number of
genetic markers were co-segregating (Fig. 1). The percentage of agreement in marker position was
high (96.24% on average), varying from 94.19% (7H) to 97.63% (4H) (Table 2).
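The marker counts reported in this paragraph can be cross-checked with a short tally. The Python sketch below uses only the numbers given in the text; the variable names and category labels are illustrative.

```python
# Counts as reported in the text for the 6,405 consensus markers.
total_markers = 6405
categories = {
    "no counterpart in GenomeZipper or POPSEQ": 689,
    "additionally without a GenomeZipper match": 1337,
    "represented in the GenomeZipper": 4379,
}

# The three categories should partition the consensus map.
accounted_for = sum(categories.values())

# Share of each category in percent of all consensus markers.
percentages = {
    label: 100 * count / total_markers for label, count in categories.items()
}

for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```

The three categories add up to the full marker set, and the rounded shares match the percentages quoted above.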
The comparison of the consensus map to POPSEQ revealed that 941 (14.7%) loci did not show any
hit to a WGS contig. The remaining markers displayed a match to a POPSEQ contig, 99.30%
having congruent chromosome positions (Table 2). Screening of a 5 cM window within both maps
indicated that only 0.7% of positions on average disagreed in cM coordinates between the
consensus and the POPSEQ, varying from chromosome 5H (0.43%) to chromosome 2H
The number of misaligned markers on each chromosome was plotted against their genetic
coordinates to evaluate the putative existence of specific regions holding higher frequency of
misallocated markers (Supplemental Fig. S3). Misaligned markers between the consensus map and
GZ or POPSEQ were evenly distributed along each barley chromosome and a specific pattern of
misallocation associated with initial positions could not be clearly discerned (Supplemental Fig. S3).
In contrast, recombination frequencies were not uniformly distributed along the
centromere-telomere axis, but tended to be higher in the distal ends of each chromosome (Fig. 1, track f).
Comparative analysis of seven genetically mapped loci to GenomeZipper, POPSEQ and Barley physical map
Seven loci (L1-L7) located on barley chromosomes 1H, 2H, 4H, 6H and 7H were employed to
experimentally test the accuracy in virtually ordered genes of the barley GenomeZipper at a fine
scale (Fig. 1). L1 and L4 were genetically positioned previously in a high-resolution mapping
population, while the others were mapped at a lower resolution. The sequences of flanking markers
from all loci were used to survey the data on the barley GenomeZipper. The fourteen markers
matched the corresponding barley unigenes, spanning intervals in the GZ that varied in size from
0.90 (L5) to 7.49 cM (L3). The locus L2, at the centromeric region of chromosome 1H, did not
display any interval, since the flanking markers match barley unigenes which are co-segregating
(Table 3). The combination of intervals contained a total of 486 loci included in the GenomeZipper
ranging from 30 (L6) to 198 (L2). Among these, only 62 loci (12.76%) corresponded to originally
used BOPA markers, the other 424 were postulated according to the sequence homology to
Brachypodium, rice and Sorghum genes. In total, 39.7% of targeted GenomeZipper loci possessed an orthologous gene in all three reference genomes, while the positions of 16.2%, 10.1% and 7.3%
of selected gene models were based on their homology to Brachypodium, Sorghum or rice,
support from the above mentioned model grass species (Supplemental Table S2). A set of 179
barley predicted genes, varying from 10 on chromosome 7HL (L7) to 66 on chromosome 1H (L2),
putatively located within the target intervals were selected for further work (Table 3). Twenty-five
of them (13.9%) did not generate any PCR amplicon, even when different primer pairs were tested
at different positions in the gene. The majority of these were identified for the loci L1 (5.02%) and
L3 (5.02%) on chromosomes 1HS and 2HL (Table 4). The remaining 154 loci were amplified and
sequenced in the corresponding parental lines of the mapping population. Eighty-five markers
(55.2%) turned out to be monomorphic. The lowest level of polymorphism was observed in the
centromeric region of chromosome 1H, which showed 83.3% of monomorphic loci. In
contrast, the regions on the distal parts of chromosomes 6HL and 7HS turned out to be highly
polymorphic, the rate being 100% (Table 4). Among the remaining 69 polymorphic markers, 7 were
genotyped based on the presence/absence of the PCR amplicon in one of the parental lines, whereas
10 markers were mapped according to a size polymorphism. The other 47 loci containing SNPs
were converted to CAPS markers (Table 4). Eleven polymorphic markers were identified in the L2
interval, but five of them were not mapped in the original work (Yang et al., 2013), therefore they
were not considered in subsequent analyses.
In total, 8, 6, 13, 6, 11, 12 and 8 markers could be genetically mapped to chromosomes 1HS, 1H,
2HL, 4H, 6HL, 7HS and 7HL, respectively, in the corresponding mapping populations yielding 64
new markers. Seven markers (10.9%) were located outside of the target intervals, mainly on
chromosome 7HS (6.3%). Forty-eight (84.2%) markers out of 57 mapped in good collinearity with
their estimated positions in the barley ‘zippers’ (Table 5, Fig. 2), especially on chromosomes 4H
and 6HL, where 100% of zipper-based markers are located in the same position as those predicted
in the putative barley gene indexes (Table 5, Fig. 2).
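The marker development funnel reported in the two preceding paragraphs can be retraced with a few lines of Python. The sketch uses only counts stated in the text; the variable names are illustrative and the chromosome labels follow the order given above.

```python
# Counts as reported in the text.
selected_genes = 179
no_amplicon = 25
amplified = selected_genes - no_amplicon      # loci amplified and sequenced
monomorphic = 85
polymorphic = amplified - monomorphic         # remaining polymorphic markers
not_mapped_in_l2 = 5                          # L2 markers excluded from later analyses

mapped_per_chromosome = {
    "1HS": 8, "1H": 6, "2HL": 13, "4H": 6, "6HL": 11, "7HS": 12, "7HL": 8,
}
mapped = sum(mapped_per_chromosome.values())  # newly mapped markers

outside_interval = 7
inside_interval = mapped - outside_interval
collinear = 48                                # markers following the predicted order

# Percentages quoted in the text, recomputed from the counts.
pct_monomorphic = 100 * monomorphic / amplified
pct_outside = 100 * outside_interval / mapped
pct_inside = 100 * inside_interval / mapped
pct_collinear = 100 * collinear / inside_interval
```

With these counts, the 69 polymorphic markers minus the five unmapped L2 markers give the 64 newly mapped markers, and the 57 markers inside the target intervals form the denominator for the collinearity rate.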
The sequences from 57 consensus markers within the selected intervals from all seven target loci
were used to survey the POPSEQ (Mascher et al., 2013a), the IBSC (2012) and the
barley. In total, 44 (77.2%) newly developed markers showed a match to POPSEQ-anchored WGS
contigs. The highest number of hits was observed for locus L5 (6HL), where all newly developed
markers found a counterpart in POPSEQ. The lowest number of matches (3 out of 6) was detected
for the centromeric region of the chromosome 4H (Table 5, Supplemental Table S3). On average,
the percentage of collinearity between new zipper markers and contigs in POPSEQ compared to our
consensus map was 93.2% across all chromosomes (Table 5). The comparison to
POPSEQ allowed verifying the correct ordering for the majority of genome-zipper based markers in
their corresponding genetic map, but also the identification of markers with inconsistent location.
Thus, 5 out of 9 inaccurately predicted loci on chromosomes 1H and 7H were mapped at the correct
position according to POPSEQ (data not shown).
The comparison of the evaluated intervals against the genetically anchored physical map of IBSC
(2012) and the 15,622 recently sequenced barley clones from Muñoz-Amatriaín et al. (2015)
allowed identifying the fingerprinted contigs (FPC) associated to the genome zipper-based markers
and therefore to delimit the physical regions underlying the seven target loci (Table 5, Fig. 2). In
total, forty-four markers (77.2%) showed a hit to an FPC allowing the identification of twenty-five
FPCs; two for L6 (7HS) and L7 (7HL), three for L2 (1H), four for L1 (1HS), L4 (4H), L5 (6HL)
DISCUSSION
Despite the difficulties due to the large size and complexity of the barley genome, relevant
achievements in barley genomics have been accomplished in recent years. GenomeZipper,
POPSEQ and the resources established in the framework of the IBSC, including various high
density marker maps, have accelerated the exploitation of this Triticeae crop (Graner et al., 2011;
Feuillet et al., 2012). However, these resources need further validation to become beneficial for
plant breeders, who demand tools that not only allow the expeditious improvement of molecular
markers tightly linked to genes or QTL of interest but also ensure the usefulness of these markers across
different germplasm resources (Varshney et al., 2006; Kilian and Graner, 2012; Keilwagen et al.,
2014).
The construction of robust and highly resolved consensus linkage maps derived from experimental
mapping data has been a long-standing challenge in barley genetics. Integrated genetic maps
represent a more reliable resource for genetic anchoring of contig-based local or genome-wide
physical maps and allow the orientation of scaffolds in genome assemblies (Paux et al., 2008; Alsop
et al., 2011). Additionally, the accuracy and density of markers in a consensus map serve as
valuable features towards the assessment of newly developed barley genomic resources. Within the
barley research community, two integrated consensus maps have been recently published
(Muñoz-Amatriaín et al., 2011; 2014a). The consensus map constructed in the present study was intended to
be an improved version of those reported above, by incorporating additional informative
recombination events derived from the mapping population MBR1012×Scarlett. The inclusion of
this population had a relevant impact on the consensus map resolution. Thus, the newly integrated
linkage map consists of 6,405 markers, which represents an increase of 740 markers over the map
developed by Muñoz-Amatriaín et al. (2014a). If only iSelect SNPs are considered, the
MBR1012×Scarlett linkage map contributed 1,205 markers to the previous consensus map
representing an improvement of 2,438 SNPs with respect to the Morex×Barke map developed by
the map by Muñoz-Amatriaín et al. (2014a) revealed good consistency in the locus order, except
for chromosome 2H, where a higher number of ordering conflicts was observed. Seventeen out of
eighteen markers in conflict derived from the MBR1012×Scarlett population, although they affected intervals equal to or smaller than 2 cM. Considering the numbers of marker bins, the resolution of the current consensus linkage map was particularly improved for chromosomes 2H,
3H and 5H, which showed increases of 5, 18 and 86 unique positions, respectively, over the consensus map
from Muñoz-Amatriaín et al. (2014a). Dense genetic maps will also be valuable for applied barley
breeding, to perform precise introgression of improved traits in elite cultivars as well as for accurate
association mapping studies and genomic selection approaches (Heffner et al., 2010; Lorenz et al.,
2012).
The newly developed consensus map was employed to investigate in silico the accuracy and
robustness of the barley ‘zippers’ at the genome-wide scale. A similar approach was performed
previously by Poursarebani et al. (2013), but in that report only an individual map containing 1,596
transcript derived markers was used for comparative purposes. In contrast, the utilization of a
high-density consensus map holding 6,405 markers is a more appropriate framework for such
comparison and validation. Indeed, a higher percentage (68.4%) of shared markers was observed in
our work compared to Poursarebani et al. (2013), who found that only 37.8% of their genetic
markers were represented among the GenomeZipper gene panels. Additionally, the percentage of
markers and gene models which possess the same chromosomal location was also higher in the
present study (97.8% versus 95%). The average percentage of collinearity between both datasets, as
measured by the Spearman's coefficient, was similar to that reported by Poursarebani et al. (2013) (96.2% and 96%, respectively). Such results support the greater suitability of consensus maps for
validating grass-based comparative genome organization models in barley. Nevertheless,
as proposed by Poursarebani et al. (2013), the predicted chromosomal positions and virtual gene
order postulated by genome ‘zippers’ proved to be highly precise (~96% accuracy) at the
number of matches than those obtained with the GenomeZipper, the increase being 6.2% of the
total markers. POPSEQ provides a linear order of WGS contigs genetically positioned along the
seven individual barley chromosomes (Mascher et al., 2013a). The power of such methodology has
been sufficiently demonstrated in previous reports by using different genotyping platforms and
mapping populations (Mascher et al., 2013a; Ariyadasa et al., 2014; Chapman et al., 2015). Our
results corroborated the robustness of POPSEQ, with the genetic coordinates of contigs coinciding
with marker positions in the integrated map at 99.30% on average.
Although the performance of barley ‘zippers’ at the genome-wide level appeared to be highly
reliable, the development of any genomics-based breeding strategy requires the examination of the
virtual gene order at a finer scale, where the exploited syntenic relationships to B. distachyon, rice and
Sorghum might be more affected by misinterpretations (Li and Gill, 2002; Caldwell et al., 2004; Pourkheirandish et al., 2007). With these drawbacks in mind, the predicted linear gene index was
compared with experimental evidence at low and high genetic resolution levels, the so-called
microcollinearity (Keller and Feuillet, 2000). Original data were compiled from earlier studies
covering a set of seven loci conferring resistance to various fungal and viral diseases and located on
distinct barley chromosomes (Lüpken et al., 2013; Perovic et al., 2013; Yang et al., 2013; Silvar et
al., 2010, 2012, 2013). A similar procedure was performed by Poursarebani et al. (2013), but in that
report a section of the long arm of chromosome 2H was randomly selected for comparative
purposes. In contrast, the screening of several chromosomal regions spanning resistance genes
or QTLs, carried out in the present study, should provide a more reliable representation of the
power of the GenomeZipper at the level of one-to-one relationships among orthologous genes. Disease resistance
loci are particularly unstable, tend to be located in less conserved regions and are commonly
affected by structural variation (Leister et al., 1998; Meyers et al., 2003; Wicker et al., 2009). In
total, 179 out of 486 GZ genes were considered for further development of tightly linked markers.
Overall, 13.9% of the loci could not be amplified on parental lines from different mapping
support previous data by Silvar et al. (2013), who suggested apparent modifications in the
gene-space of landrace-derived lines compared to the genome sequences of modern barley cultivars.
Additionally, sequencing errors in the reads generated by 454-pyrosequencing cannot be
ruled out as a cause of the absence of PCR products. Out of 154 primer pairs tested for
polymorphism on the five examined chromosomes, 85 (55.2%) generated a monomorphic PCR
amplicon. This was mainly due to the two centromeric loci on chromosomes 1H (L2) and 4H
(L4), which contributed 85.8% of the non-polymorphic markers. Loci in centromeric and
peri-centromeric regions are on average less polymorphic than loci located on the rest of the barley
chromosomes (Dvorák et al., 1998), as they are commonly organized in haplotype blocks locked
into recombination-“inert” genomic regions (Thiel et al., 2009; Comadran et al., 2010). If these loci
are excluded, the rate of polymorphism increases to 81.3%, which is similar to that found in
other reports based on barley ESTs or unigenes (Liu et al., 2010; Silvar et al., 2012). All 64 newly
developed markers were accurately assigned to the corresponding chromosome. However, the
genetic maps obtained with those markers were not in complete accordance with the putative linear
gene order described in the GenomeZipper: 10.9% of the markers were located outside the initial target
intervals defined by the virtual gene index, and 15.8% of those within the intervals were not genetically mapped in the
gene order expected from the barley ‘zippers’. This lack of collinearity might be attributed to
insertions or inversions hypothesized in the virtually ordered gene inventory but not confirmed in
our results, as suggested by Silvar et al. (2013). This outcome supports the well-documented
existence of multiple inter- or intra-chromosomal rearrangements during the evolution of
grass genomes (Bossolini et al., 2007; Salse et al., 2008; Bolot et al., 2009; Wicker et al., 2011).
Variations in collinearity can also be explained by some gene models that are supported only by
their counterpart in one or two reference genomes (Mayer et al., 2011). However, as suggested by
Poursarebani et al. (2013), a general rule asserting that more than one model genome will increase
the accuracy of the genome zipper should not be established, since a few markers based on single
model genomes might serve to overcome limitations imposed by species-specific regional
variations (Mayer et al., 2011). Several reports have also shown that collinearity is commonly less
conserved in the most distal, telomeric regions of chromosomes (Li and Gill, 2002; Caldwell et al.,
2004). This was the case for the target loci on chromosomes 1HS, 2HL and 7HS (see Fig. 2),
but not for the locus on chromosome 6HL, which showed 100% agreement in the order of the
zipper-derived markers mapped to the interval of interest. Overall, the barley ‘zippers’ performed
well, allowing the development of 64 new markers and their mapping
with an accuracy of almost 85%. From a breeding point of view, the quality of the order prediction
permitted a more precise dissection of the regions containing interesting traits located on five
different barley chromosomes.
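The marker-development percentages quoted in this paragraph can be recomputed from the per-locus counts in Table 4. The sketch below is only an arithmetic check. It assumes that the number of primer pairs tested for polymorphism equals the targeted gene models minus those without amplification, which is consistent with the 154 primer pairs mentioned in the text. The variable and function names are illustrative.

```python
# Per-locus counts transcribed from Table 4.
targeted = {"L1": 24, "L2": 66, "L3": 26, "L4": 26, "L5": 12, "L6": 15, "L7": 10}
no_amplification = {"L1": 9, "L2": 0, "L3": 9, "L4": 2, "L5": 1, "L6": 3, "L7": 1}
monomorphic = {"L1": 7, "L2": 55, "L3": 4, "L4": 18, "L5": 0, "L6": 0, "L7": 1}

# The two centromeric loci named in the text (chromosomes 1H and 4H).
CENTROMERIC = {"L2", "L4"}


def pct(part, whole):
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


def summarize(loci):
    """Aggregate counts over a set of loci."""
    n_targeted = sum(targeted[l] for l in loci)
    n_failed = sum(no_amplification[l] for l in loci)
    n_mono = sum(monomorphic[l] for l in loci)
    n_tested = n_targeted - n_failed  # assumption: tested = amplified
    return {
        "targeted": n_targeted,
        "failed": n_failed,
        "tested": n_tested,
        "monomorphic": n_mono,
        "polymorphic": n_tested - n_mono,
    }


all_loci = sorted(targeted)
overall = summarize(all_loci)
without_centromeric = summarize([l for l in all_loci if l not in CENTROMERIC])
centromeric_mono = sum(monomorphic[l] for l in CENTROMERIC)

print(f"No amplification:        {pct(overall['failed'], overall['targeted']):.2f}%")
print(f"Monomorphic of tested:   {pct(overall['monomorphic'], overall['tested']):.2f}%")
print(f"Centromeric share:       {pct(centromeric_mono, overall['monomorphic']):.2f}%")
print(f"Polymorphic without L2/L4: "
      f"{pct(without_centromeric['polymorphic'], without_centromeric['tested']):.2f}%")
```

Under this assumption the totals agree with the figures quoted in the text to within rounding.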
These new zipper-based markers were further compared with POPSEQ (Mascher
et al., 2013a) and the barley physical map (IBSC, 2012; Muñoz-Amatriaín et al., 2015) in order to
circumscribe the regions of the barley genome conferring resistance. Based on sequence homology,
61.40% of the total markers found a hit to a Morex contig in POPSEQ. The genetic order of these
loci in their respective linkage maps coincided fully (100%) with the positions of the POPSEQ contigs.
This output verified that 55.5% of the target gene models previously cataloged with
erroneous positions according to the GenomeZipper were correctly mapped, and allowed the
partial in silico resolution of blocks of co-segregating markers, which were assigned to distinct
positions in the linearly ordered index of WGS contigs. These aspects pave the way towards the
use of POPSEQ in breeding. This tool should be more amenable than the barley
‘zippers’ for fine-mapping and cloning of agronomically important genes, provided that genetic
markers sufficiently close to the loci of interest and adequate resolution in the mapping population
are available. The outcome arising from POPSEQ could be anchored in a straightforward manner to
the barley physical map, accelerating the identification of BAC contigs and subsequent isolation of
than ordinary anchoring strategies relying on sequence comparison of flanking markers derived
from sequence tags (Mascher and Stein, 2014).
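One possible way to quantify how well the order of markers in a linkage map agrees with the positions of their matching POPSEQ contigs is to count the largest set of markers whose contig positions never decrease along the map. The sketch below uses the longest non-decreasing subsequence for this purpose. This criterion, the marker names and the positions are all hypothetical illustrations and not the procedure used in the study.

```python
from bisect import bisect_right


def collinear_count(positions):
    """Length of the longest non-decreasing subsequence of `positions`."""
    tails = []  # tails[k]: smallest possible last value of a run of length k + 1
    for p in positions:
        k = bisect_right(tails, p)
        if k == len(tails):
            tails.append(p)
        else:
            tails[k] = p
    return len(tails)


# Hypothetical markers: (name, cM in the linkage map, cM of the matching POPSEQ contig)
markers = [
    ("m1", 0.0, 10.1),
    ("m2", 0.4, 10.1),
    ("m3", 0.9, 12.4),
    ("m4", 1.3, 11.0),
    ("m5", 2.0, 13.2),
    ("m6", 2.6, 15.0),
]

# Sort by linkage-map position. Ties are broken by POPSEQ position so that
# co-segregating markers are not counted as order conflicts (an assumption).
ordered = sorted(markers, key=lambda m: (m[1], m[2]))
popseq_positions = [m[2] for m in ordered]

n_collinear = collinear_count(popseq_positions)
print(f"{n_collinear} of {len(markers)} markers in collinear order "
      f"({100.0 * n_collinear / len(markers):.1f}%)")
```

A non-decreasing rather than strictly increasing criterion is chosen so that several markers hitting contigs at the same genetic position still count as collinear.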
The new markers developed from the barley ‘zippers’ were also employed to demarcate, based on
sequence homology, the physical regions of the barley genome responsible for the traits of interest.
Between two and six contigs were identified by sequence comparison for each of the seven barley loci.
Loci derived from mapping populations with higher resolution permitted the definition of
tiling paths holding a lower number of FPCs (data not shown), as long as the target loci were not
located in centromeric regions (Lüpken et al., 2014). Identified contigs were employed for a
first comparison with the recently anchored physical map of barley and the established minimum
tiling path (MTP) containing 66,772 overlapping clones (Ariyadasa et al., 2014). Discordant contig
placements were only observed on chromosome 7HS and might be explained by the technical and
biological inaccuracy inherent to the construction of any genetic map (Wenzl et al., 2006; Wu et al.,
2008). In addition, the short arm of chromosome 7H has been described as a “hot spot” of
recombination, which might also contribute to the discrepancies in order (Drader et al., 2009).
These disagreements should be carefully explored on the way to positional
isolation of resistance genes (Liu et al., 2014). Even though the rough identification of the physical
contigs provided extended information about the genomic context and local neighborhood
underlying the traits of interest, the low genetic resolution of the majority of assayed mapping
populations did not encourage us to speculate about the number or nature of putative candidate
genes lying within the delimited genomic areas. That should be the principal task of further projects
aimed at the map-based cloning of genes, as was the case for rym11 (Yang et al., 2014).
The present work clearly demonstrated that recently established barley genomic resources can be
efficiently exploited for breeding purposes. Despite a few discrepancies, such as
zipper-based markers located outside the target intervals or erroneously positioned, our data show that
GenomeZipper and POPSEQ are very powerful tools for marker saturation, chromosome dissection
meet some limitations depending on the target chromosomal region. This could be the case for those
loci located in the proximity of centromeres, which usually show low recombination (IBSC, 2012;
Muñoz-Amatriaín et al., 2015). These strategies might also be employed with success in other
complex genomic contexts, such as wheat (http://wheat-urgi.versailles.inra.fr), or unsequenced
orphan crops of economic importance, like rye (Secale cereale) or perennial ryegrass (Lolium
perenne) (Pfeifer et al., 2013; Martis et al., 2013). As demonstrated in the present work, the combined use of various genomic tools will help plant breeders and geneticists in several ways.
First, it will permit the rapid development of markers tightly associated with the gene of interest,
which might be further exploited or optimized for molecular marker-assisted selection (MAS) or
even cataloged as functional markers (Andersen and Lübberstedt, 2003; Palloix and Ordon, 2011).
Second, it will facilitate the resolution of blocks of co-segregating markers, typically associated with
low-resolution mapping populations, in a more efficient manner (Silvar et al., 2013). Finally, the
combination of those different genomic resources should lead to a more straightforward and faster
physical delimitation of promising regions in the barley genome, which constitutes the starting point
for map-based cloning strategies (Lüpken et al., 2013; Yang et al., 2014). Some recently
published bioinformatics tools, such as Ensembl Plants (http://www.ensemblgenomes.org),
chromoWIZ (http://mips.helmholtz-muenchen.de/plant/chromowiz/indez.jsp) or BarleyMap (http://floresta.eead.csic.es/barleymap), should ease the integration of the different available genomic resources, allowing plant geneticists and breeders to manage this information in a time-saving
manner (Kersey et al., 2014; Nussbaumer et al., 2014; Cantalapiedra et al., 2015). To our
knowledge, this study is among the first efforts oriented towards the unification of various genomic
resources for breeding purposes. Altogether, the fast enrichment of barley genome sequence
information and novel techniques such as exome capture (Mascher et al., 2013b) should help to
move barley breeding to an unprecedented level of precision and productivity, as foreseen by Bevan and Waugh (2007).
Acknowledgements
This work was supported by the German Federal Ministry of Education and Research under the grant number AZ 0315702 and by the Spanish Ministry for Science under the grant number EUI2009-04075. The authors would like to thank Nils Stein and Ping Yang for fruitful discussions about the rym7 locus. CS was supported by a mobility fellowship from Universidade da Coruña.
References
Alsop, A., P. Farre, J.M. Wenzl, M. Wang, X. Zhou, et al. 2011. Development of wild barley–derived DArT markers and their integration into a barley consensus map. Mol. Breed. 27:77–92
Altschul, S.F., T.L. Madden, A.A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D.J. Lipman. 1997. Gapped BLAST and PSI–BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402
Andersen, J.R., and T. Lübberstedt. 2003. Functional markers in plants. Trends Plant Sci. 8:554–560
Ariyadasa, R., M. Mascher, T. Nussbaumer, D. Schulte, Z. Frenkel, N. Poursarebani, et al. 2014. A sequence–ready physical map of barley anchored genetically by two million single–nucleotide polymorphisms. Plant Physiol. 164:412–423
Badr, A., K. Müller, R. Schäfer–Pregl, H. El Rabey, S. Effgen, et al. 2000. On the origin and domestication history of Barley (Hordeum vulgare). Mol. Biol. Evol. 17:499–510
Baik, B.K., and S.E. Ullrich. 2008. Barley for food: Characteristics, improvement, and renewed interest. J. Cereal Sci. 48:233–242
Bevan, M., and R. Waugh. 2007. Meeting report: applying plant genomics to crop improvement. Genome Biol. 8:302
Bockelman, H.E., and J. Valkoun. 2011. Barley germplasm conservation and resources. In: Ullrich SE, editor. Barley: production, improvement, and uses. Wiley–Blackwell. pp. 144–159
Bolger, M.E., B. Weisshaar, U. Scholz, N. Stein, B. Usadel, and K.F. Mayer. 2014. Plant genome sequencing — applications for crop improvement. Curr. Opin. Biotechnol. 26:31–37
Bolot, S., M. Abrouk, U. Masood–Quraishi, N. Stein, J. Messing, et al. 2009. The ‘inner circle’ of the cereal genomes. Curr. Opin. Plant Biol. 12:119–125
Bossolini, E., T. Wicker, P.A. Knobel, and B. Keller. 2007. Comparison of orthologous loci from small grass genomes Brachypodium and rice: implications for wheat genomics and grass genome annotation. Plant J. 49:704–717
Brockman, D.A., X. Chen, and D.D. Gallaher. 2013. Consumption of a high β–glucan barley flour improves glucose control and fatty liver and increases muscle acylcarnitines in the Zucker diabetic fatty rat. Eur. J. Nutr. 52:1743–1753
Caldwell, K.S., P. Langridge, and W. Powell. 2004. Comparative sequence analysis of the region harboring the hardness locus in barley and its collinear region in rice. Plant Physiol. 136:3177–3190
Cantalapiedra, C.P., R. Boudiar, A.M. Casas, E. Igartua, and B. Contreras–Moreira. 2015. BARLEYMAP: physical and genetic mapping of nucleotide sequences and annotation of surrounding loci in barley. Mol. Breed. 35:13
Ceccarelli, S., S. Grando, M. Maatougui, M. Michael, M. Slash, et al. 2010. Plant breeding and climate change. J. Agric. Sci. 148:627–637
Chapman, J.A., M. Mascher, K. Barry, E. Georganas, A. Session, et al. 2015. A whole–genome shotgun approach for assembling and anchoring the hexaploid bread wheat genome. Genome Biol. 16:26
Close, T.J., P.R. Bhat, S. Lonardi, Y. Wu, N. Rostoks, et al. 2009. Development and implementation of high–throughput SNP genotyping in barley. BMC Genomics 10:582
Collard, B.C., and D.J. Mackill. 2008. Marker–assisted selection: an approach for precision plant breeding in the twenty–first century. Philos. Trans. R. Soc. Lond. B: Biol. Sci. 363:557–572
Comadran, J., L. Ramsay, K. MacKenzie, P. Hayes, T.J. Close, G. Muehlbauer, N. Stein, and R. Waugh. 2010. Patterns of polymorphism and linkage disequilibrium in cultivated barley. Theor. Appl. Genet. 122:523–531
Comadran, J., B. Kilian, J. Russell, L. Ramsay, N. Stein, et al. 2012. Natural variation in a homolog of Antirrhinum CENTRORADIALIS contributed to spring growth habit and environmental adaptation in cultivated barley. Nat. Genet. 44:1388–1392
Dai, F., E. Nevo, D.Z. Wu, J. Comadran, M.X. Zhou, L. Qiu, Z.H. Cheng, A. Belles, G.X. Chen, and G.P. Zhang. 2012. Tibet is one of the centers of domestication of cultivated barley. Proc. Natl. Acad. Sci. USA 109:16969–16973
Dolezel, J., J. Greilhuber, S. Lucretti, A. Meister, M.A. Lysak, L. Nardi, and R. Obermayer. 1998. Plant genome size estimation by flow cytometry: Inter–laboratory comparison. Ann. Bot. 82:17–26
Drader, T., K. Johnson, R. Brueggeman, D. Kudrna, and A. Kleinhofs. 2009. Genetic and physical mapping of a high recombination region on chromosome 7H (1) in barley. Theor. Appl. Genet. 118:811–820
FAOSTAT. 2012. Website. Available: http://faostat.fao.org. Accessed 2014 November 12
Feuillet, C., N. Stein, L. Rossini, S. Praud, K. Mayer, et al. 2012. Integrating cereal genomics to support innovation in the Triticeae. Funct. Integr. Genomics 12:573–583
Goff, S.A., D. Ricke, T.H. Lan, G. Presting, R. Wang, M. Dunn, et al. 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296:92–100
Graner, A., A. Kilian, and A. Kleinhofs. 2011. Barley genome organization, mapping, and synteny. In: Ullrich SE (ed). Barley: Production, Improvement, and Uses. Oxford: Wiley–Blackwell, 2011:63–84
He, P., J.Z. Li, X.W. Zheng, L.S. Shen, C.F. Lu, Y. Chen and L.H. Zhu. 2001. Comparison of molecular linkage maps and agronomic trait loci between DH and RIL populations derived from the same rice cross. Crop Sci. 41:1240–1246
Heffner, E.L., M.E. Sorrells, and J.L. Jannink. 2009. Genomic selection for crop improvement. Crop Sci. 49:1–12
Hunter, J.D. 2007. Matplotlib: A 2D Graphics Environment. Comput. Sci. Eng. 9:90–95
International Brachypodium Initiative. 2010. Genome sequencing and analysis of the grass Brachypodium distachyon. Nature 463:763–768
International Barley Genome Sequencing Consortium. 2012. A physical, genetic and functional sequence assembly of the barley genome. Nature 491:711–716
Keilwagen, J., B. Kilian, H. Ozkan, S. Babben, D. Perovic, et al. 2014. Separating the wheat from the chaff – a strategy to utilize plant genetic resources from ex situ genebanks. Scientific Rep. 4:5231
Keller, B., and C. Feuillet. 2000. Colinearity and gene density in grass genomes. Trends Plant Sci. 5:246–251
Kent, W.J. 2002. BLAT–the BLAST–like alignment tool. Genome Res. 12:656–64
Kersey, P.J., J.E. Allen, M. Christensen, P. Davis, L.J. Falin, et al. 2014. Ensembl genomes 2013: scaling up access to genome–wide data. Nucleic Acids Res. 42:D546–D552
Kilian, B., and A. Graner. 2012. NGS technologies for analyzing germplasm diversity in genebanks. Brief. Funct. Genomics 11:38–50
König, J., D. Kopahnke, B.J. Steffenson, N. Przulj, T. Romeis, M.S. Röder, F. Ordon, and D. Perovic. 2012. Genetic mapping of a leaf rust resistance gene in the former Yugoslavian barley landrace MBR1012. Mol. Breed. 30:1253–1264
Krzywinski, M., J. Schein, I. Birol, J. Connors, R. Gascoyne, D. Horsman, S.J. Jones, and M.A. Marra. 2009. Circos: an information aesthetic for comparative genomics. Genome Res. 19:1639–1645
Kumlehn, J., and N. Stein. 2014. Biotechnological Approaches to Barley Improvement. Springer, Germany
Leister, D., J. Kurth, D.A. Laurie, M. Yano, T. Sasaki, K. Devos, A. Graner, and P. Schulze–Lefert. 1998. Rapid reorganization of resistance gene homologues in cereal genomes. Proc. Natl. Acad. Sci. USA 95:370–375
Li, W., and B.S. Gill. 2002. The collinearity of the Sh2/A1 orthologous region in rice, sorghum and maize is interrupted and accompanied by genome expansion in the Triticeae. Genetics 160:1153–1162
Liu, H., M. Bayer, A. Druka, J.R. Russell, C.A. Hackett, et al. 2014. An evaluation of genotyping by sequencing (GBS) to map the Breviaristatum–e (ari–e) locus in cultivated barley. BMC Genomics 15:104
Lorenz, A.J., K.P. Smith, and J.L. Jannink. 2012. Potential and optimization of genomic selection for Fusarium head blight resistance in six–row barley. Crop Sci. 52:1609–21
Lüpken, T., N. Stein, D. Perovic, A. Habekuß, I. Kramer, et al. 2013. Genomics–based high–resolution mapping of the BaMMV/BaYMV resistance gene rym11 in barley (Hordeum vulgare L.). Theor. Appl. Genet. 126:1201–1212
Lüpken, T., N. Stein, D. Perovic, A. Habekuß, A. Serfling, et al. 2014. High–resolution mapping of the barley Ryd3 locus controlling tolerance to BYDV. Mol. Breed. 33:477–488
Mascher, M., and N. Stein. 2014. Genetic anchoring of whole–genome shotgun assemblies. Front. Genet. 5:208
Mascher, M., G.J. Muehlbauer, D.S. Rokhsar, J. Chapman, J. Schmutz, et al. 2013a. Anchoring and ordering NGS contig assemblies by population sequencing (POPSEQ). Plant J. 76:718–727
Mascher, M., T.A. Richmond, D.J. Gerhardt, A. Himmelbach, L. Clissold, et al. 2013b. Barley whole exome capture: a tool for genomic research in the genus Hordeum and beyond. Plant J. 76:494–505
Martis, M.M., R. Zhou, G. Haseneyer, T. Schmutzer, J. Vrána, M. Kubaláková, et al. 2013. Reticulate Evolution of the Rye Genome. Plant Cell 25:3685–3698
Mayer, K.F.X., S. Taudien, M. Martis, H. Šimková, P. Suchankova, et al. 2009. Gene content and virtual gene order of barley chromosome 1H. Plant Physiol. 151:496–505
Mayer, K.F.X., M. Martis, P.E. Hedley, H. Šimková, H. Liu, et al. 2011. Unlocking the barley genome by chromosomal and comparative genomics. Plant Cell 23:1249–1263
McCouch, S. 2004. Diversifying Selection in Plant Breeding. PLoS Biol 2:e347
Meyers, B.C., A. Kozik, A. Griego, H. Kuang, and R.W. Michelmore. 2003. Genome wide analysis of NBS–LRR–encoding genes in Arabidopsis. Plant Cell 15:809–34
Muñoz–Amatriaín, M., M.J. Moscou, P.R. Bhat, J.T. Svensson, J. Bartos, et al. 2011. An improved consensus linkage map of barley based on flow–sorted chromosomes and single nucleotide polymorphism markers. Plant Genome 4:238–249
Muñoz–Amatriaín, M., A. Cuesta–Marcos, J.B. Endelman, J. Comadran, M. Bonman, et al. 2014a. Genetic diversity and population structure in a worldwide barley collection of landraces and cultivars and its potential for genome–wide association studies. PLoS One 9:e94688
Muñoz–Amatriaín, M., A. Cuesta–Marcos, P.M. Hayes, and G.J. Muehlbauer. 2014b. Barley genetic variation: implications for crop improvement. Brief. Funct. Genomics doi: 10.1093/bfgp/elu006
Muñoz–Amatriaín, M., S. Lonardi, M. Luo, K. Madishetty, J.T. Svensson, M.J. Moscou, et al. 2015. Sequencing of 15,622 gene–bearing BACs clarifies the gene–dense regions of the barley genome. Plant J. doi: 10.1111/tpj.12959
Nussbaumer, T., K.G. Kugler, W. Schweiger, K.C. Bader, H. Gundlach, et al. 2014. ChromoWIZ: a web tool to query and visualize chromosome–anchored genes from cereal and model genomes. BMC Plant Biol. 14:348
Ordon, F., and D. Perovic. 2013. Virus resistance in barley. Translational Genomics for Crop Breeding, Volume I: Biotic Stresses, First Edition. Ed. R.K. Varshney and R. Tuberosa. John Wiley & Sons, 63–75
Paterson, A.H., J.E. Bowers, R. Bruggmann, I. Dubchak, J. Grimwood, H. Gundlach et al. 2009. The Sorghum bicolor genome and the diversification of grasses. Nature 457:551–556
Palloix, A., and F. Ordon. 2011. Advanced breeding for virus resistance in plants. In: Caranta C, Aranda MA, Tepfer M, Lopez–Moya JJ (eds) Recent advances in plant virology. Caister Academic Press, UK, pp 195–218
Paux, E., F. Legeai, N. Guilhot, A.F. Adam–Blondon, M. Alaux, et al. 2008. Physical mapping in large genomes: accelerated anchoring of BAC contigs to genetic maps through in silico analysis. Funct. Integr. Genomics 8:29–32
Perovic, D., N. Stein, H. Zhang, A. Drescher, M. Prasad, et al. 2004. An integrated approach for comparative mapping in rice and barley based on genomic resources reveals a large number of syntenic markers but no candidate gene for the Rph16 resistance locus. Funct. Integr. Genomics 4:74–83
Perovic, D., D. Kopahnke, B.J. Steffenson, J. Förster, J. König, et al. 2013. Genetic fine mapping of a novel leaf rust resistance gene and a Barley yellow dwarf virus tolerance (BYDV) introgressed from Hordeum bulbosum by the use of the 9K iSelect chip. In: Zhang G., C. Li, X. Liu (Eds.) Advance in Barley Sciences. Proceedings of 11th International Barley Genetics Symposium. Springer and Zhejiang University Press, pp. 269–284
Perovic, J., C. Silvar, J. König, N. Stein, D. Perovic, and F. Ordon. 2013. A versatile fluorescence–based multiplexing assay for CAPS genotyping on capillary electrophoresis systems. Mol. Breed. 32:61–69
Pfeifer, M., M. Martis, T. Asp, K.F. Mayer, T. Lubberstedt, S. Byrne, U. Frei, and B. Studer. 2013. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics. Plant Physiol. 161:571–582
Poland, J.A., P.J. Brown, M.E. Sorrells, and J.L. Jannink. 2012. Development of high density genetic maps for barley and wheat using a novel two–enzyme genotyping–by–sequencing approach. PLoS One 7:e32253
Pourkheirandish, M., T. Wicker, N. Stein, T. Fujimura, and T. Komatsuda. 2007. Analysis of the barley chromosome 2 region containing the six–rowed spike gene vrs1 reveals a breakdown of rice–barley micro collinearity by a transposition. Theor. Appl. Genet. 114:1357–1365
Poursarebani, N., R. Ariyadasa, R. Zhou, D. Schulte, B. Steuernagel, et al. 2013. Conserved synteny–based anchoring of the barley genome physical map. Funct. Integr. Genomics 13:339–350
Romero, I.G., A. Manica, J. Goudet, L.L. Handley, and F. Balloux. 2009. How accurate is the current picture of human genetic variation? Heredity 102:120–126
Salse, J., S. Bolot, M. Throude, V. Jouve, B. Piegu, et al. 2008. Identification and characterization of shared duplications between rice and wheat provide new insight into grass genome evolution. Plant Cell 20:11–24
Shahinnia, F., A. Druka, J. Franckowiak, M. Morgante, R. Waugh, et al. 2012. High resolution mapping of Dense spike–ar (dsp.ar) to the genetic centromere of barley chromosome 7H. Theor. Appl. Genet. 124:373–384
Silvar, C., H. Dhif, E. Igartua, D. Kopahnke, M.P. Gracia, et al. 2010. Identification of quantitative trait loci for resistance to powdery mildew in a Spanish barley landrace. Mol. Breed. 25:581–592
Silvar, C., A.M. Casas, E. Igartua, L.J. Ponce–Molina, M.P. Gracia, et al. 2011a. Resistance to powdery mildew in Spanish barley landraces is controlled by different sets of quantitative trait loci. Theor. Appl. Genet. 123:1019–1028
Silvar, C., D. Perovic, A.M. Casas, E. Igartua, and F. Ordon. 2011b. Development of a cost–effective pyrosequencing approach for SNP genotyping in barley. Plant Breed. 130:394–397
Silvar, C., D. Perovic, U. Scholz, A.M. Casas, E. Igartua, et al. 2012. Fine mapping and comparative genomics integration of two quantitative trait loci controlling resistance to powdery mildew in a Spanish barley landrace. Theor. Appl. Genet. 124:49–62
Silvar, C., D. Perovic, T. Nussbaumer, M. Spannagl, B. Usadel, et al. 2013. Towards positional isolation of three quantitative trait loci conferring resistance to powdery mildew in two Spanish barley landraces. PLoS One 8:e67336
Šimková, H., J.T. Svensson, P. Condamine, E. Hřibová, P. Suchánková, P.R. Bhat, J. Bartoš, J. Šafář, T.J. Close, and J. Doležel. 2008. Coupling amplified DNA from flow–sorted chromosomes to high–density SNP mapping in barley. BMC Genomics 9:294
Sturtevant, A.H. 1913. The Linear Arrangement of Six Sex–Linked Factors in Drosophila, as shown by their mode of Association. Journal of Experimental Zoology 14:43–59
Sullivan, P., E. Arendt, and E. Gallagher. 2013. The increasing use of barley and barley by–products in the production of healthier baked goods. Trends Food Sci. Technol. 29:124–134
Thiel, T., A. Graner, R. Waugh, I. Grosse, T.J. Close, et al. 2009. Evidence and evolutionary analysis of ancient whole–genome duplication in barley predating the divergence from rice. BMC Evol. Biol. 9:209
Van Ooijen, J.W. 2006. JoinMap version 4.0, software for the calculation of genetic linkage maps. Kyazma BV, Wageningen, the Netherlands
Varshney, R.K., D.A. Hoisington, and A.K. Tyagi. 2006. Advances in cereal genomics and applications in crop breeding. Trends Biotech. 24:490–499
Verstegen, H., O. Köneke, V. Korzun, and R. von Broock. 2014. The world importance of barley and challenges to further improvements. In Biotechnological Approaches to Barley Improvement, J. Kumlehn & N. Stein (Volume Editors), Biotechnology in Agriculture and Forestry, Springer, Germany
Wenzl, P., H.B. Li, J. Carling, M.X. Zhou, H. Raman, et al. 2006. A high–density consensus map of barley linking DArT markers to SSR, RFLP and STS loci and agricultural traits. BMC Genomics 7:206
Wicker, T., S. Taudien, A. Houben, B. Keller, A. Graner, M. Platzer, and N. Stein. 2009. A whole–genome snapshot of 454 sequences exposes the composition of the barley genome and provides evidence for parallel evolution of genome size in wheat and barley. Plant J. 59:712–722
Wicker, T., K.F. Mayer, H. Gundlach, M. Martis, B. Steuernagel, et al. 2011. Frequent gene movement and pseudogene evolution is common to the large and complex genomes of wheat, barley, and their relatives. Plant Cell 23:1706–1718
Wu, Y., P.R. Bhat, T.J. Close, and S. Lonardi. 2008. Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph. PLoS Genet. 4:e1000212
Yang, P., D. Perovic, A. Habekuss, R.N. Zhou, A. Graner, F. Ordon, and N. Stein. 2013. Gene–based high–density mapping of the gene rym7 conferring resistance to Barley mild mosaic virus (BaMMV). Mol. Breed. 32:27–37
Yang, P., T. Lüpken, A. Habekuss, G. Hensel, B. Steuernagel, et al. 2014. Protein disulfide isomerase like 5–1 is a susceptibility factor to plant viruses. Proc. Natl. Acad. Sci. USA 111:2104–2109
Yu, J., S. Hu, J. Wang, G.K.S. Wong, S. Li, B. Liu, et al. 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296:79–92
Figure captions
Figure 1. Comparison of the consensus map against the barley GenomeZipper and POPSEQ-based anchoring. The figure illustrates the comparison of the consensus map of this study against the barley GenomeZipper (Track B) on the basis of common marker sequences. Track A illustrates the agreement between both maps, with black showing perfect agreement and white showing an agreement of less than 80%. Markers were classified as correct when their genetic position was within 5 cM of the average genetic position of all markers that matched the same GenomeZipper bin, where a bin always comprises 50 non-overlapping loci. The regions spanned by the seven resistance loci or QTL are shown in Track C. Track D gives the comparison of POPSEQ anchoring against the GenomeZipper-based anchoring. Connections were drawn between the POPSEQ-based genetic positions of the WGS contigs of cultivar Morex and anchored resources from the GenomeZipper. Track E illustrates the agreement between POPSEQ and the GenomeZipper. Track F illustrates the recombination frequency within a GenomeZipper bin of 50 loci.
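The classification rule behind Track A can be sketched as follows, under stated assumptions. GenomeZipper loci are taken to be numbered from 0 along a chromosome and grouped into consecutive bins of 50, and a matching marker counts as correct if its consensus-map position lies within 5 cM of the mean position of all markers matching the same bin. The input data, the 0-based numbering and the function name are hypothetical.

```python
from collections import defaultdict

BIN_SIZE = 50        # loci per GenomeZipper bin (from the caption)
TOLERANCE_CM = 5.0   # maximum distance to the bin average (from the caption)


def bin_agreement(matches):
    """Return {bin index: fraction of markers classified as correct}.

    `matches` is a list of (zipper_locus_index, consensus_cm) pairs for one
    chromosome; locus indices are assumed to start at 0.
    """
    positions_by_bin = defaultdict(list)
    for locus_index, cm in matches:
        positions_by_bin[locus_index // BIN_SIZE].append(cm)

    agreement = {}
    for bin_index, positions in sorted(positions_by_bin.items()):
        mean_cm = sum(positions) / len(positions)
        n_correct = sum(1 for cm in positions if abs(cm - mean_cm) <= TOLERANCE_CM)
        agreement[bin_index] = n_correct / len(positions)
    return agreement


# Hypothetical matches: three markers in bin 0, five in bin 1 (one outlier).
matches = [
    (0, 1.0), (10, 2.0), (49, 3.0),
    (50, 10.0), (60, 11.0), (70, 12.0), (80, 11.0), (99, 22.0),
]

agreement = bin_agreement(matches)
for bin_index, fraction in agreement.items():
    print(f"bin {bin_index}: {100 * fraction:.0f}% agreement")
```

Because the bin average itself is shifted by outliers, a single strongly misplaced marker can lower the agreement of its whole bin. A median-based reference would be more robust, but the caption specifies the average.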
Figure 2. Comparison of collinearities among GenomeZipper–based markers, genome–zipper gene models and WGS contigs derived from POPSEQ at the seven target barley loci. The different colors in the physical map (IBSC, 2012) represent the different barley chromosomes. The different shades of grey indicate the genetic position according to the consensus map of this study.
Tables
Table 1. Statistics of the consensus map
Chrom   Map length (cM)   # markers   # bins   # conflicts
1H      146.30            655         226      4
2H      183.54            1116        357      18
3H      168.25            1036        355      2
4H      131.11            671         163      3
5H      191.15            1245        400      5
6H      137.68            811         235      0
7H      162.24            871         242      4
Total   1120.27           6405        1978     36
Table 2. Statistics of the comparison between consensus map, GenomeZipper and POPSEQ
                  GenomeZipper                                 POPSEQ
                           # hits to                # hits to           # hits to                # hits to
                           identical  Collinearity  different           identical  Collinearity  different
Chrom.†  No hit‡  No hit§  chrom.     (%)           chrom.     No hit¶  chrom.     (%)           chrom.
1H       67       111      464        97.16         13         114      471        99.36         3
2H       114      244      734        95.90         24         156      836        98.81         10
3H       112      199      711        96.16         14         165      754        99.34         5
4H       74       124      465        97.63         8          113      481        99.38         3
5H       152      276      804        96.88         13         162      927        99.57         4
6H       94       189      519        95.74         9          111      603        99.50         3
7H       76       194      587        94.19         14         120      670        99.25         5
Total    689      1337     4284       96.24         95         941      4742       99.30         33
†Chromosome in the consensus map
‡Number of markers in the consensus map that did not show any hit to GenomeZipper or POPSEQ
§Number of markers in the consensus map that did not show any hit to GenomeZipper
Table 3. Description of the seven loci employed for the microsyntenic comparisons and number of GenomeZipper loci present in the target intervals
                                                           # GZ loci                          # targeted GZ loci
Locus name  Chrom.  Population type†  Interval in GZ (cM)  # BOPA markers‡  # inferred loci§  # BOPA markers‡  # inferred loci§
L1          1HS     HR                1.51                 10               23                5                19
L2          1H      LR                0.00                 10               188               3                63
L3          2HL     LR                7.49                 20               85                4                22
L4          4H      HR                1.38                 9                30                2                24
L5          6HL     LR                0.90                 4                43                1                11
L6          7HS     LR                0.64                 3                27                1                14
L7          7HL     LR                4.30                 6                28                2                8
Total       –       –                 –                    62               424               18               161
†HR (High Resolution) or LR (Low Resolution) mapping population
‡Number of GZ loci whose position coincides with a BOPA marker
Table 4. Description of the genome zipper–based markers developed for the seven target intervals
Locus name                L1    L2    L3    L4    L5    L6    L7    Total
Chromosome                1HS   1H    2HL   4H    6HL   7HS   7HL   –
# targeted gene models    24    66    26    26    12    15    10    179
No amplification          9     0     9     2     1     3     1     25
Monomorphic markers       7     55    4     18    0     0     1     85
# developed markers       8     11†   13    6     11    12    8     69
Presence/Absence          0     0     2     0     3     2     0     7
Size polymorphism         1     2     4     0     1     2     0     10
CAPS                      7     4     7     6     7     8     8     47
†Five of these markers were not mapped previously (Yang et al., 2013) and accordingly they were not
Table 5. Statistics of the comparisons of zipper–based markers to POPSEQ and IBSC
Locus name                                   L1          L2          L3          L4          L5          L6          L7          Total
Chromosome                                   1HS         1H          2HL         4H          6HL         7HS         7HL         –
Markers outside of target region             0 (0%)      1 (17%)     0 (0%)      0 (0%)      0 (0%)      4 (33%)     2 (25%)     7 (10.9%)
Markers within the target region             8 (100%)    5 (83%)     13 (100%)   6 (100%)    11 (100%)   8 (67%)     6 (75%)     57 (89.1%)
Markers in collinearity with Genome Zipper   5 (62.5%)   4 (80%)     11 (84.6%)  6 (100%)    11 (100%)   6 (75%)     5 (83.3%)   48 (84.2%)
Hits to POPSEQ                               4 (50%)     4 (80%)     8 (61.5%)   3 (50%)     11 (100%)   8 (100%)    6 (100%)    44 (77.2%)
Markers in correct collinearity with POPSEQ  4 (100%)    4 (100%)    7 (87.5%)   3 (100%)    11 (100%)   6 (75%)     6 (100%)    41 (93.2%)
Hits to FPcontigs                            8 (100%)    3 (60%)     8 (61.5%)   4 (66.6%)   10 (90.9%)  8 (100%)    3 (50%)     44 (77.2%)
N° identified FPcontigs                      4           3           6           4           4           2           2           25
Supplementary material
Figure S1. Root mean square error (RMSE) values between the corresponding consensus map derived by LPmerge and the individual maps for different interval parameters of LPmerge. Among the different potential consensus maps for different interval values, we selected the one yielding the smallest median RMSE and highlighted the corresponding box in blue.
Figure S2. Dot plot comparison of shared BOPA markers between the consensus map and the map developed by Close et al. (2009) and employed as the framework for anchoring the barley ‘zippers’. Numbers at the top of the graphics show the Spearman's rank correlation coefficient. Numbers at the bottom right-hand corner indicate the number of markers common to both datasets.
Figure S3. Misaligned markers were plotted along the consensus map when the genetic positions were discordant between Consensus/GenomeZipper and Consensus/POPSEQ map.
Table S1. Consensus genetic map
Table S2. Number of GenomeZipper loci comprised in the seven target intervals and their orthology to any of three different reference genomes
Table S3. Markers mapping to the physical contigs of the 15,622 sequenced clones (Muñoz-Amatriaín et al., 2015) and information on the respective physical contigs of the IBSC (2012).
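
The selection rule described for Figure S1 (choose the LPmerge interval parameter whose consensus map yields the smallest median RMSE against the individual maps) can be illustrated with a minimal Python sketch. This is not the authors' code: LPmerge is an R package, and the function names, the toy marker positions and the use of positional differences in cM to compute the RMSE are all assumptions made for illustration only.

```python
import math
import statistics


def rmse(consensus, individual):
    """RMSE of positions (cM) over the markers shared by two maps.

    Both maps are dicts of marker name -> genetic position.
    Using cM differences is an assumption; LPmerge's own RMSE
    definition is not stated in this section.
    """
    shared = consensus.keys() & individual.keys()
    if not shared:
        raise ValueError("maps share no markers")
    squared = sum((consensus[m] - individual[m]) ** 2 for m in shared)
    return math.sqrt(squared / len(shared))


def pick_interval(candidates, individual_maps):
    """Return (best interval, median RMSE per interval).

    candidates: dict of interval value -> candidate consensus map.
    individual_maps: list of individual linkage maps.
    """
    medians = {
        interval: statistics.median(
            rmse(consensus, ind) for ind in individual_maps
        )
        for interval, consensus in candidates.items()
    }
    best = min(medians, key=medians.get)
    return best, medians


# Hypothetical toy data: two candidate consensus maps, two individual maps.
candidates = {
    1: {"m1": 0.0, "m2": 5.0, "m3": 10.0},
    2: {"m1": 0.0, "m2": 6.0, "m3": 12.0},
}
individual_maps = [
    {"m1": 0.0, "m2": 5.0, "m3": 11.0},
    {"m1": 0.0, "m2": 4.0, "m3": 10.0},
]

best, medians = pick_interval(candidates, individual_maps)
print(best, medians)
```

With these toy maps the candidate built with interval 1 has the lower median RMSE, so it would be the one highlighted in a plot such as Figure S1.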