Evolutionary Effects on Morphology and Agronomic Performance of Three Winter Wheat Composite Cross Populations Maintained for Six Years under Organic and Conventional Conditions

Three winter wheat (Triticum aestivum L.) composite cross populations (CCPs) that had been maintained in repeated parallel populations under organic and conventional conditions from the F5 to the F10 were compared in a two-year replicated field trial under organic conditions. The populations were compared to each other, to a mixture of the parental varieties used to establish the CCPs, and to three winter wheat varieties currently popular in organic farming. Foot and foliar diseases, straw length, ear length, yield parameters, and baking quality parameters were assessed. The overall performance of the CCPs differed clearly from each other due to differences in their parental genetics and not because of their conventional or organic history. In the absence of extreme winter stress, the CCPs with a high yielding background (YCCPs) also yielded higher than the CCPs with a high baking quality background (QCCPs). The QCCPs performed equally well in comparison to the reference varieties, which were also of high baking quality. Compared to the parental mixture, the CCPs proved to be highly resilient, recovering much better from winter kill in winter 2011/12. Nevertheless, they were outyielded by the references in that year. No such differences were seen in 2013, indicating that the CCPs are comparable with modern cultivars in yielding ability under organic conditions. We conclude that the choice of parents to establish a CCP is crucial, especially when focusing on traits that are not directly influenced by natural selection (e.g. quality traits). In the case of the QCCPs the establishment of a reliable high-quality population worked very well and quality traits were successfully maintained over time. However, in the YCCPs the lack of winter hardiness in the YCCP parents also became clearly visible under relevant winter conditions.


Introduction
The challenges of climate change, increasing demand for finite resources, and population growth are calling for a paradigm shift in resource use [1,2] combined with new, different and efficient strategies to face the challenges of climate change [3,4]. Diverse farming systems have been shown to be more resilient in the face of perturbations and to buffer extreme climatic events and adverse growing conditions to a wider extent than large monocultures do [3,5,6]. Beneficial effects of crop genetic diversity on productivity, population recovery from disturbance, and other ecological processes have been reviewed by Finckh and Wolfe [7] and Dawson and Goldringer [8], and agrobiodiversity has been placed very high in the list of potential solutions to the growing demand for food. Since the early 20th century, trends in agriculture, plant breeding and breeding legislation have tended towards an increased use of genetically uniform varieties [9-12]. As a consequence, most crop varieties have been selected to cope well in monocultural high-input growing systems [13,14]. This disregards the fact that genotypes selected for high performance under high-input conditions do not necessarily perform very well in marginal environments or in farming systems with lower inputs [15]. It is also argued that such uniform and genetically 'stable' cultivars are inappropriate for dealing with unpredictable environmental changes because their response to environmental fluctuations is not buffered by genetic diversity and they have no capacity to react to novel stress factors [5,16,17].
Responding to the continuous restriction of genetic variability in plant breeding, Simmonds [18] and Allard and Hansche [19] called for mass reservoirs of genetic variability as supplements to conventional breeding that help broaden the genetic base of crops and are well suited for dynamic conservation of genes and genotypes.
For the self-pollinating cereals, evolutionary breeding based on the composite cross approach was developed. In evolutionary breeding, heterogeneous, segregating crop populations (composite cross populations, CCPs) [20] are subjected to natural selection. It is expected that the high level of genetic diversity allows adaptation to the prevailing growing conditions because plants with good adaptation to the local growing conditions will contribute more seed to the next generation than plants with lower fitness [16,20].
While genetic variability is expected to decrease in each population over time under the combined effects of drift and selection, overall diversity is supposed to be maintained through the differentiation among populations [21]. Over time the populations adapt to the conditions under which they are grown, and their resilience to stressful and variable growing conditions is seen as a major advantage under the predicted threats of climate change [16,17]. This simple and efficient way of managing genetic resources in situ is a potent tool for the sustainable use of plant genetic resources and can be a practical solution, especially under low-input growing conditions.
In 2001, three winter wheat CCPs suitable for European growing conditions were created in the UK by the John Innes Centre (JIC, Norwich, UK) in cooperation with the Organic Research Centre (Newbury, UK) [22]. The parental varieties were successful European varieties, released between 1934 and 2000, with a focus on varieties of British origin, approximately representing the breeding progress at the beginning of the twenty-first century. Key criteria for selection were a diverse genetic base and potential for stable performance under low-input growing conditions. The parental varieties were grouped into three groups: one group containing twelve varieties with high baking quality (group Q), one group containing nine high yielding varieties (group Y), and the third group containing all 20 varieties (group YQ).
The variety 'Bezostaya', known as high yielding as well as high quality in Russia, was included in both groups Y and Q. A comprehensive analysis of the performance of the individual parental varieties was published by Jones et al. [23]. The half diallels of the Q parents and the Y parents resulted in the QCCP and the YCCP, respectively. The intercross of the Y by Q parents resulted in the YQCCP. The initial setting up and maintenance of the European composite cross populations established at the JIC in 2002 have been described in detail by Döring et al. [24].
After two years of multiplication at two organic and two conventional sites in the south and east of the UK, F4 seed of the four sites was bulked, and about 2 kg each was sent to the Department of Ecological Plant Protection, Faculty of Organic Agricultural Sciences, University of Kassel, Germany in autumn 2005, where they have been maintained since under contrasting agronomic conditions. Each F4 population was divided into two and sown into an organically managed trial site and into a conventional trial site (resulting in three CCPorg and three CCPconv).
In autumn 2006, enough seeds were available to split the populations one more time. Since then, within each system two Y, two Q, and two YQ populations have been maintained as two parallel populations. This has enabled the comparison of changes in the populations over time within and between systems. Random changes and changes in the populations that occurred due to effects of the environment (e.g. organic vs. conventional growing conditions) can be distinguished. The populations are maintained in separated plots of minimum 100 m² to ensure that at least 5000 individual plants are grown, which is the effective population size (Ne) that should be sufficient to avoid genetic drift in the populations [21,25].
Thus, since the F6, a total of twelve CCPs (six CCPorg and six CCPconv) have been maintained at the two trial sites in the absence of fungicides and insecticides with no artificial selection applied apart from the removal of the tallest plants (>130 cm) in the early generations to prevent the populations from gaining too much in plant height. Results from France show a disproportional advantage of tall plants in the populations due to competition for light and an overall increase in height over time [26,27].
In 2011/12 and 2012/13 a field trial was carried out at the University of Kassel comparing the total of twelve winter wheat CCPs in an organically managed field a) to each other and b) to three modern pure line varieties well suited for the local growing conditions. The main questions addressed in the field trial were: 1. What are the effects of organic versus conventional selection environments on population performance? 2. What are the effects of genetic background on population performance? 3. How do the populations perform compared to modern pure line wheat varieties currently popular in organic farming? To assess morphology and the agronomic performance of the CCPs, straw height, ear length, foot and foliar diseases, yield parameters and baking quality parameters were assessed. The results give an insight into the agronomic performance of CCPs that were shaped over several years in contrasting environments.

Field Site
The trial was carried out at the Research Station of the University of Kassel in Neu-Eichenberg, located at 51°22' N and 9°54' E at an altitude of 247 m above sea level. Mean annual precipitation (2000-2013) is 684 mm, and mean annual temperature (2000-2013) is 9.3 °C. The fields have been managed organically since 1984; no mineral fertilizers, fungicides, insecticides or herbicides were applied, and weeds were controlled mechanically through harrowing and/or hoeing at the tillering stage. The soil is a deep Haplic Luvisol with 76 soil points [28].

Experimental Design
In 2011, enough seed of the F10 of all 12 CCPs was saved to allow for a two-year field trial. Therefore, in 2011/12 and in 2012/13, the F11 of the six CCPorg and the six CCPconv were compared to each other, to three reference varieties ('Achat', 'Akteur', 'Capo') and to an equal mixture of the 20 parental varieties (referred to as 'mixture' from now on) in a randomized complete block design with four replications.
The trials were carried out in an organic field; the precrop in 2011 was canola, and in 2012 it was two years of grass-clover. The mean availability of mineral nitrogen (kg N/ha) measured in early spring (BBCH 20) in three layers of soil (0-30, 30-60 and 60-90 cm) was 83.7 kg/ha in total in spring 2012 and 84.0 kg/ha in total in spring 2013. At the flowering stage (BBCH 65) the soil could only be sampled down to a depth of 60 cm, due to very dry soil conditions. Mean availability of mineral nitrogen in total of both depths was 21.6 kg/ha in 2012 and 27.1 kg/ha in 2013. Soil samples were taken and analysed according to the standards of VDLUFA [29].
The sowing date was 31 October in 2011 and 10 October in 2012. Plots were 11 m × 3 m, which is double the width of a standard trial plot, allowing assessments and sampling on one side and leaving the other half for yield survey. Seed rate was 350 germinable seeds/m² and rows were spaced 30 cm apart to allow for hoeing.

Assessments
Growth stages were assessed regularly throughout the season. Straw height and ear length (cm) were measured in 50 randomly chosen stems per plot (BBCH 90) in order to evaluate morphological variation. Straw height was measured from the ground to the start of the ear; ear length was measured from the first full spikelet to the tip without awns.
Foliar diseases caused by fungal pathogens were assessed at BBCH stage 73/75. Non-green leaf area was estimated in % (1-100%). The three leaf levels of flag leaf (F), leaf below flag leaf (F-1) and leaf below F-1 (F-2) were assessed separately at six locations per plot.
To assess foot diseases (Fusarium spp., Pseudocercosporella herpotrichoides, Rhizoctonia cerealis), plant samples with roots were taken at five to six points per plot (minimum 30 stems) at BBCH 75. The lower stems were freed of soil and leaf sheaths and scored for foot rot symptoms based on the key of Bockmann [30], where 0 is healthy, 1 is symptoms on <50% of the stem perimeter, 2 is symptoms on 50-100% of the stem perimeter, and 3 is stem brittle/rotten (P. herpotrichoides only). Based on a pictorial key of symptoms [31], Fusarium root rot, P. herpotrichoides and R. cerealis were assessed separately.
Grain yield on a plot basis was measured in t/ha at 14% moisture content; additionally, the thousand kernel weight (TKW) was measured in g at 14% moisture. Ear-bearing tillers/m² were calculated from three rows of 1 m length. Plants were cut shortly before harvest in order to assess morphological traits.
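The conversion of plot measurements to the reported units can be illustrated with a short sketch. It is not the authors' procedure: the moisture correction is the common dry-matter conversion (an assumption, since the text only states that values refer to 14% moisture), the tiller area assumes the 30 cm row spacing given for the trial, and all function and parameter names are hypothetical.

```python
def yield_at_standard_moisture(fresh_yield_t_ha, moisture_pct, standard_pct=14.0):
    """Convert grain yield at the measured moisture to a standard moisture basis.

    Assumption: common dry-matter conversion; the source does not give a formula.
    """
    if not 0.0 <= moisture_pct < 100.0:
        raise ValueError("moisture_pct must be in [0, 100)")
    dry_matter = fresh_yield_t_ha * (100.0 - moisture_pct) / 100.0
    return dry_matter * 100.0 / (100.0 - standard_pct)


def tillers_per_m2(tiller_count, n_rows=3, row_length_m=1.0, row_spacing_m=0.30):
    """Scale a tiller count from the sampled rows to one square metre.

    Assumption: each sampled row represents row_length_m x row_spacing_m of ground.
    """
    sampled_area_m2 = n_rows * row_length_m * row_spacing_m
    return tiller_count / sampled_area_m2


if __name__ == "__main__":
    # Hypothetical plot: 5.0 t/ha harvested at 16% grain moisture.
    print(round(yield_at_standard_moisture(5.0, 16.0), 2))
    # Hypothetical count: 180 ear-bearing tillers in three 1 m rows.
    print(round(tillers_per_m2(180), 1))
```

With these assumptions, three 1 m rows at 30 cm spacing represent 0.9 m² of ground, so counts are divided by 0.9 to obtain tillers per m².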
Protein content (%) was calculated from the nitrogen content of the seeds (N [%] × 5.7), which was analysed in ripe seeds that were dried for 72 h at 60 °C, milled (ultracentrifugal mill, Retsch, Type ZM 2) and analysed in the elemental analyzer vario MAX CHN (Elementar Analysesysteme GmbH, Hanau, DE).
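As a minimal sketch of the nitrogen-to-protein conversion stated above, the following function applies the factor 5.7 to a measured nitrogen content. The function name and the example value are hypothetical; only the factor is taken from the text.

```python
NITROGEN_TO_PROTEIN_FACTOR = 5.7  # conversion factor stated in the text


def protein_content_pct(nitrogen_pct, factor=NITROGEN_TO_PROTEIN_FACTOR):
    """Return grain protein content (%) from grain nitrogen content (%)."""
    if nitrogen_pct < 0:
        raise ValueError("nitrogen_pct must be non-negative")
    return nitrogen_pct * factor


if __name__ == "__main__":
    # Hypothetical sample with 2.0% nitrogen in the dried, milled grain.
    print(protein_content_pct(2.0))
```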
Hagberg falling number (HFN; sec.; ICC Method no. 107), sedimentation value (Zeleny; ml; ICC Method no. 116), and wet gluten (%; ICC Method no. 106/2) were analysed in the Aberham Laboratories, Großaitingen, DE. HFN was assessed in pooled samples in the first year of the trial and per plot in the second year. Sedimentation value and wet gluten were assessed in pooled samples from the four replications in both years.
Baking volume of test loaves (ml) was assessed using an internal method credited to Aberham Laboratories: test loaves were baked from wholemeal; no ascorbic acid was added, but due to the very high HFN of some samples the addition of malt flour was necessary to prevent the bread crust from liquefying. Baking volume was assessed per plot in the second trial year only. For a detailed rating system and its translation into a color code of the respective values see Table A1 in the Appendix.

Data Processing and Statistical Analysis
Foliar disease severity per plot was calculated as the mean per leaf level. Means were weighted 4:3:3 for the flag (F), F-1 and F-2 leaves, respectively, to account for the greater contribution of the flag leaf to the total dry matter of ripe seeds compared to the lower leaves [32].
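The weighting described above can be sketched as follows. This is an illustration under the stated 4:3:3 weights, not the authors' code; the function names and the example readings are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    if not values:
        raise ValueError("at least one reading is required")
    return sum(values) / len(values)


def weighted_foliar_severity(flag, f_minus_1, f_minus_2, weights=(4, 3, 3)):
    """Plot-level severity (%) from per-location readings of three leaf levels.

    Each argument is a list of non-green leaf area estimates (%) for one
    leaf level; the weights 4:3:3 follow the text.
    """
    level_means = (mean(flag), mean(f_minus_1), mean(f_minus_2))
    weighted_sum = sum(w * m for w, m in zip(weights, level_means))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical readings at six locations in one plot.
    flag = [10, 10, 10, 10, 10, 10]
    f1 = [20, 20, 20, 20, 20, 20]
    f2 = [30, 30, 30, 30, 30, 30]
    print(weighted_foliar_severity(flag, f1, f2))
```

Dividing by the sum of the weights keeps the result on the same 0-100% scale as the individual estimates.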
A foot disease severity index (DI) was calculated based on the severity classes as:
$$DI = \frac{x_1 + 2x_2 + 3x_3}{3n} \times 100$$
where x_1 ... x_3 are the number of stems with disease scores 1 to 3, respectively, and n is the total number of stems assessed. The resulting index values fall between 0 and 100 and can be calculated for each of the three foot diseases separately or as an index of all three together.
The statistical calculations were performed using IBM SPSS Statistics (Version 22). Data were tested for normal distribution of residuals (Shapiro-Wilk test and Q-Q plots) and for homogeneity of variance (Levene test) and transformed if required. When data were normally distributed and variance was homogeneous, a univariate ANOVA with subsequent Tukey-B test was calculated where appropriate to find significant differences between group means at p < 0.05.
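The foot disease severity index can be sketched in a few lines. The code assumes the usual score-weighted form of such an index (each stem weighted by its score 0 to 3, divided by the maximum possible total of 3n and scaled to 100), which is consistent with the definitions of x_1 ... x_3 and n and with the stated range of 0 to 100. The function name and the example scores are hypothetical.

```python
def disease_index(scores, max_score=3):
    """Severity index (0-100) from per-stem scores on a 0..max_score scale.

    Assumption: score-weighted index, 100 * sum(score) / (max_score * n).
    """
    n = len(scores)
    if n == 0:
        raise ValueError("at least one stem must be assessed")
    if any(s not in range(max_score + 1) for s in scores):
        raise ValueError("scores must be integers between 0 and max_score")
    # x[k] is the number of stems with disease score k (k = 1..max_score).
    x = {k: scores.count(k) for k in range(1, max_score + 1)}
    weighted = sum(k * count for k, count in x.items())
    return 100.0 * weighted / (max_score * n)


if __name__ == "__main__":
    # Hypothetical sample of 30 stems: 20 healthy, 6 with score 1,
    # 3 with score 2 and 1 with score 3.
    sample = [0] * 20 + [1] * 6 + [2] * 3 + [3] * 1
    print(round(disease_index(sample), 1))
```

A sample of healthy stems only yields 0 and a sample in which every stem has score 3 yields 100, matching the range given in the text.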
Where normal distribution was the case but not homogeneity of variance, the Games-Howell post hoc test was used (foliar diseases in both trial years, total incidence of foot diseases in 2011/12, and ear length in both trial years). Linear contrasts were calculated to compare i) the three groups of populations (YQCCP, QCCP and YCCP), ii) populations and the reference varieties 'Achat', 'Akteur', and 'Capo', iii) populations and the mixture, and iv) CCPorg and CCPconv.
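The decision rule for the post hoc test and the idea of a linear contrast can be summarised in a small sketch. It does not reproduce the SPSS analysis: it only encodes the rule described above and computes the point estimate of a contrast from group means, without any significance test. Function names are hypothetical, and the group means used in the example are the 2013 HFN group averages reported in the results.

```python
def choose_analysis(residuals_normal, variances_homogeneous):
    """Return the analysis path described in the text for one trait."""
    if not residuals_normal:
        return "transform data and re-check assumptions"
    if variances_homogeneous:
        return "univariate ANOVA with Tukey-B post hoc test"
    return "Games-Howell post hoc test"


def contrast_estimate(group_means, coefficients, tol=1e-9):
    """Point estimate of a linear contrast: sum of coefficient * group mean.

    The coefficients of a contrast must sum to zero.
    """
    if set(group_means) != set(coefficients):
        raise ValueError("group_means and coefficients need the same groups")
    if abs(sum(coefficients.values())) > tol:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(coefficients[g] * group_means[g] for g in group_means)


if __name__ == "__main__":
    print(choose_analysis(residuals_normal=True, variances_homogeneous=False))
    # HFN group averages (sec.) for 2013 as reported in the results.
    hfn_2013 = {"QCCP": 310.0, "YQCCP": 262.0, "YCCP": 205.0, "references": 370.0}
    # Illustrative contrast: mean of the three CCP groups minus the references.
    ccp_vs_refs = {"QCCP": 1 / 3, "YQCCP": 1 / 3, "YCCP": 1 / 3, "references": -1.0}
    print(round(contrast_estimate(hfn_2013, ccp_vs_refs), 1))
```

Equal weights for the three CCP groups are an illustrative choice here; in the trial each group comprises the same number of populations, so the weighted and unweighted CCP means coincide.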

Weather Data
Average temperature during the wheat growing season 2011/12 was 9.7 °C, which is higher than the long-term average (2000-2013) of 9.3 °C and the long-term average (1977-1994) of 7.9 °C. During the growing season 2012/13 average temperature (8.5 °C) was between the two known long-term averages. Apart from two divergences and extremes in February/March 2012 and in February/March 2013, temperatures measured during the two growing seasons of the experiment from September 2011 to August 2013 roughly followed the 14-year trend from 2000 to 2013 (Figure 1).
The distribution pattern of the monthly precipitation, however, showed strong deviations from the long-term average. The average total annual precipitation from 1977 to 1994 was 619 mm, from 2000 to 2013 it was 684 mm, and in 2012 and 2013 it was 792 and 657 mm, respectively. There were very dry periods in November 2011, February and March 2012 and in spring 2013, and some extremely wet months in winter 2011, summer 2012 and May 2013 (Figure 1).
The combination of extremes in winter 2011/12 exposed the plots to a severe winter. After two unusually mild and wet winter months, temperatures suddenly dropped at the end of January 2012. Three weeks of black frost with minimum temperatures reaching down to −15 °C resulted in soil frozen to a depth of about 50 cm. Although the number of frost days (= daily minimum temperature below 0 °C) in February 2012 was not different from other years, the number of days with daily maximum temperature below 0 °C was higher in 2012 than it was in 2011 or 2013. Also, average minimum and maximum temperatures (−9.9 °C and −5.7 °C) were considerably lower in February 2012 than in the years before and after (Table 1). The lack of snow left the plants unprotected from these extremes.
In mid-February, temperatures increased again and March was warm (average monthly temperature 7.5 °C, which is 3.3 °C above the 14-year trend of 4.2 °C) and dry (precipitation was 15 mm, which is only 27% of the 14-year trend). These six relatively warm weeks of drought following the extreme cold worsened the effect of the cold and put surviving plants in the frozen soil under severe water stress. The CCP plots were noticeably damaged, but they recovered. However, most of the 20 parent varieties, grown in 2011/12 next to the trial plots in twice-replicated plots for seed multiplication, could not cope with the extreme climatic conditions, and the severe winter resulted in winterkill in 16 out of the 20 varieties. On average only 33 plants/m² were left in the plots in April 2012 and only the four varieties 'Bezostaya', 'Monopol', 'Renan' and 'Hereward' survived with an average of more than 50 plants per m² (Figure 2). For winter wheat a density of 80 plants/m² or less is seen as an indicator for plowing the whole stand [33], and all plots of the parental varieties were abandoned.

Foliar and Foot Diseases
Disease pressure in both years was low. In both years the dominant disease was Septoria tritici. In 2012, the average infestation of plants on the three top leaf levels was 14% (BBCH stage 73/75); in 2013 it was even lower (10%). In 2012, infestation rates ranged from 12% (CY I) to 17% (OY II); in 2013 disease ranged from 7% ('Achat') to 10% (CA I). There were no relevant differences among treatments in either year (data not shown).
For foot diseases, total incidence and disease severity indices (DI) were slightly higher in 2013 (2012: 13; 2013: 20). The contribution of the two high infection severity classes to DI was, however, low in both years (data not shown) and therefore, overall, the plants could be considered almost healthy. In both years Fusarium spp. was the dominating foot disease (DI 11 in 2012; DI 16 in 2013), followed by Pseudocercosporella herpotrichoides (DI 2 in 2012; DI 4 in 2013), and Rhizoctonia cerealis ranked last in both years (DI 1 in 2012; DI 0.4 in 2013). There were only small differences among populations and references. A statistically significant difference in overall DI and Fusarium infestation between the CCPorg and CCPconv is considered biologically not relevant and was disregarded (data not shown).

Morphological Traits-Straw and Ear Length
In 2012, overall straw length was considerably lower than in 2013 (77.2 cm vs. 90.5 cm, respectively). Overall, the CCPs were significantly shorter than the reference varieties in 2012 but not in 2013 and significantly taller than the mixture of the parental varieties in both years. The QCCPs were always significantly taller than the YCCPs (Table 2).
As expected, within-plot variation of straw length was in both years less for the references than for the CCPs and the mixture. As the references are pure line varieties, within-plot variation of plant height is very limited. The CCPs in contrast, originating from the intercrossing of several parental varieties of different height, show considerable variation in plant height. In 2012, the population CYQ II was tallest (85.0 cm), CY I was the shortest CCP (69.7 cm), and the mixture was even shorter (64.6 cm). CY I and CY II, although significantly taller than the mix of parents, were shorter than the other CCPs and references. All four YCCPs were shorter than the mean height of plants in the trial while all YQCCPs, QCCPs and the references were taller than the mean (Figure 3).
In 2013, 'Capo' was significantly tallest (99 cm) and the mix of parental varieties was shortest (65 cm). The two other references were also comparatively short ('Achat' and 'Akteur' with 86 and 87 cm, respectively). While 'Capo' was tall or tallest in both years, 'Achat' and 'Akteur' changed their rank in straw length. While 'Achat' and 'Capo' were considerably shorter in 2012 than in the year after, absolute height of 'Akteur' changed only very little (83 cm in 2012 vs. 87 cm in 2013) and its change of position in the ranking of varieties and CCPs is only due to the overall taller plants in 2013.
In the group of CCPs, CY I was the shortest in 2013 (88 cm) as it was in 2012, followed by the three other YCCPs. Again, all YCCPs were shorter than the mean height of plants in the trial, forming a subgroup that was statistically distinguishable from the group of the taller YQCCPs and QCCPs (Figure 3, Table 2).
Variation in ear length of the references was similar to the variation in the CCPs. In 2012, ears varied between 8.1 cm (CQ II) and 9.9 cm ('Akteur'), with a mean of 8.8 cm in the trial and no statistically significant differences (data not shown). In 2013, ear length varied between 8.7 cm ('Capo') and 10.2 cm ('Akteur'), with a mean of 9.1 cm. In this year 'Achat', with 10.1 cm and less variance than 'Akteur', had the statistically longest ears. Overall, ear length of the references was significantly greater than that of the CCPs in both years (Table 2).
The average number of ear-bearing tillers/m² was 130 in 2012, with the fewest tillers found in the mixture plots (107) followed by OQ I plots (121). Most tillers were growing in CQ I plots (140; Figure 4). In 2013, the average number of ear-bearing tillers/m² was higher (202); fewest tillers were counted in the 'Achat' plots (172) and most tillers in OY I plots (229; Figure 4). While in the first experimental year no differences between groups could be found apart from a significant difference between CCPs and the mixture, some groups varied considerably in the second year. References formed significantly fewer ears than CCPs. The YCCPs (223 ears/m²) produced significantly more ears than QCCPs and YQCCPs (202 and 197 ears/m², respectively). There were no differences between CCPorg and CCPconv (Table 3).

Total Grain Yield
In 2012, average yield in the trial was 4.2 t/ha with 'Akteur' yielding significantly highest (5.5 t/ha) and the mixture yielding lowest (2.9 t/ha). For all four YCCPs yield was less than the average. In 2013, average yield in the trial was 6.1 t/ha, which was 1.9 t/ha more than in 2012, with CY I (C = conventional) yielding highest (6.7 t/ha) and CYQ II yielding lowest (5.4 t/ha). In this year, the YCCPs yielded above average or just about average while QCCPs and YQCCPs yielded lower or just average (with the exception of OYQ II (O = organic), which also yielded above average). Differences in yield were, however, not statistically significant in 2013 (Figure 4).
In 2012, the reference varieties yielded significantly higher than the CCPs while in 2013 there was no difference. The mixture yielded significantly less than the CCPs in both years, and in 2012 the YCCP yields were significantly lower than those of the QCCPs and the YQCCPs. The six CCPorg did not differ significantly from the six CCPconv (Table 3).

TKW
The average TKW was 49.6 g in 2012 and 48.6 g in 2013 (Figure 4). In 2012, TKW of OYQ I was highest (52.0 g) and that of CY II lowest (47.9 g); in 2013 'Achat' had the highest TKW (51.2 g) and the mixture the lowest (44.2 g). In both years, TKW of the CCPconv was 0.8 g lower than that of the CCPorg. In 2012, but not in 2013, the difference was statistically significant. Also, in 2012 TKW of the YCCPs was significantly lower than that of the QCCPs and YQCCPs. TKW of references and populations did not differ (Table 3). In both years the TKW of the mixture was significantly lower than that of the CCPs.

Baking Quality
For the Hagberg falling number (HFN), values between 240 and 280 sec. are considered good and values between 180 and 239 sec. moderate, while values <180 and >280 sec. are considered poor. The other quality parameters (protein content, sedimentation value, wet gluten, baking volume) are usually assigned to three to six class values. Where the rating is done in three classes, values are grouped into the classes good, moderate and poor; based on these classes the cells in the overview table (Table 4) are color coded, with green indicating good, yellow indicating moderate and red indicating poor, in addition to listing the measured values. More detailed ratings can be done for some parameters with classes ranging from very good to unacceptable; these classes are described in Table A1 in the Appendix. HFN, which was assessed in pooled samples in 2012 and by replicate in 2013, was rather high in both years, with an average of 292 sec. in 2012 and 282 sec. in 2013. Sedimentation values were extremely good in 2012 (41 ml on average) and, at 32 ml on average, still good in 2013, although sedimentation values for several samples were lower (Table 4). Wet gluten was higher in 2012 (average: 28.5%; good) than in 2013 (average: 26.3%; satisfactory). The mean protein content [%] in the trial was medium in both years (12.1% in 2012 and 11.3% in 2013). Baking volume assessed in the second year of the trial was 383 ml on average, which is a good result for wholemeal test loaves. Volume ranged between 344 ml (OY II; satisfactory) and 428 ml (OQ I; very good; Table 4).
In general, in both years the YCCPs were clearly separate from the other populations and varieties, ranking lowest for all baking quality parameters tested. The QCCPs were in both years similar to the reference varieties, which is consistent for all parameters except protein content in 2013, where QCCPs had a significantly higher protein content than the references. The YQCCPs ranged in both years between the other groups of populations and varieties regarding all values tested, and the finding that CCPorg and CCPconv did not differ is also generally true for both years and all parameters tested (Table 5). The mixture of parents, which yielded very low in both years, showed much better results regarding baking quality parameters. Its values for protein content, HFN, baking volume, wet gluten and sedimentation value were close to the average of the trial in both years (Table 4).
When comparing groups (Table 5), the significantly higher baking volume of QCCPs was confirmed. YQCCPs ranged in the middle and YCCPs had the lowest baking volumes. Comparing the CCPs with the references, the volume of the references was significantly higher. A comparison of CCPorg and CCPconv yielded no relevant differences, and the difference between QCCPs and references was not significant.
For HFN in 2013, the comparison of CCP groups showed the statistically significant highest HFN for the group of QCCPs (average HFN of group 310 sec.) followed by YQCCPs (average HFN 262 sec.), followed by the significantly lowest group of YCCPs (average HFN 205 sec.). While an average HFN of 310 sec. is considered poor (too high), 262 sec. is good, and 205 sec. is moderate. The references had a significantly higher HFN (average HFN 370 sec.) than the CCPs, which is extremely high and thus poor.
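The HFN rating used here (good: 240-280 sec., moderate: 180-239 sec., poor: below 180 or above 280 sec.) can be written as a small classification function. This is a sketch with a hypothetical function name; how non-integer values between 239 and 240 sec. are treated is an assumption, as the text gives the classes for whole seconds only.

```python
def classify_hfn(seconds):
    """Rate a Hagberg falling number (sec.) as good, moderate or poor.

    Thresholds follow the text. Values between 239 and 240 sec. are
    assigned to 'moderate' here, which is an assumption.
    """
    if seconds < 0:
        raise ValueError("HFN cannot be negative")
    if seconds < 180 or seconds > 280:
        return "poor"  # too low or too high
    if seconds >= 240:
        return "good"
    return "moderate"


if __name__ == "__main__":
    # 2013 group averages reported in the text.
    groups = {"QCCP": 310, "YQCCP": 262, "YCCP": 205, "references": 370}
    for group, hfn in groups.items():
        print(group, hfn, classify_hfn(hfn))
```

Applied to the 2013 group averages, the function reproduces the ratings given in the text, including the rating of very high values as poor.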
For protein content, the comparison of groups showed in 2012 a significantly higher protein content of the CCPs vs. references and a higher protein content of QCCPs compared to YQCCPs, YCCPs, and references. In 2013, protein content of QCCPs was higher than that of the group of YCCPs and the group of references (Table 5).

Discussion
Overall, differences due to the parental background of the CCPs and not due to their conventional or organic history were clearly evident in the trials.Compared to the parental mixtures, the CCPs proved to be highly resilient, recovering much better from winter kill in 2012.Nevertheless, they were outyielded by the references in 2012 but not in 2013.
In contrast, baking quality of the QCCPs was not different from that of the high baking quality reference varieties.

Foliar and Foot Diseases
Disease pressure was low and thus did not play a role for the performance of the CCPs or the references during the two experimental years. Overall, there was neither an influence of the choice of parents nor of the growing system visible. Parents were chosen with the focus on yield and baking quality and not in order to represent different disease resistances; therefore it is unlikely that the CCPs initially differed very much regarding their resistances. Disease pressure in the growing environment where the populations evolved was moderate and did not differ much between the organic and conventional growing area, so a strong differentiation of populations was not expected. Higher disease pressure might have resulted in a different picture, as the results of other experiments indicate. Observations of powdery mildew (Blumeria graminis f. sp. tritici) in wheat CCPs revealed that the frequency of B. graminis resistance genes evolved differently according to the respective disease pressure [34-36], and Webster et al. [37] found that frequencies of Rhynchosporium secalis resistance genes in a composite cross of barley changed between F5 and F45 in accordance with the respective disease pressure. In years when high pressure was recorded the frequency of the resistance genes rose; in years with low pressure, it fell.
Observations of stripe rust (Puccinia striiformis) in a wheat experimental population in France documented that the resistance gene Yr17, which provided complete resistance to stripe rust until 1997 and was thus suspected to be under strong selection, was indeed selected between generations 5 and 10 [38].
Since 2011, new races of stripe rust have made a dramatic appearance throughout Europe [39] and the main foliar pathogen observed since 2014 in the trial site is stripe rust. In comparison to the susceptible varieties 'Akteur' and 'Naturastar', disease severity on the CCPs has been very low [40].

Morphology
The CCPs as well as the references could not reach their full height potential in the first year due to the extreme weather conditions. The same was reported from regional variety trials, where the average plant height of winter wheat grown without growth regulators in 2012 was reported to be only 87 cm [41].
The parents were equally short in both years as they were mostly dwarf types. In contrast, the CCPs were much taller, indicating that the dwarfing genes have decreased in frequency. They might not have been eliminated completely though, as variation for this trait is still quite large. Nevertheless, the CCPs were within the normal height range; they were shorter than the references in the first experimental year and about the same height in the second year.
Findings of Goldringer et al. [27] and Le Boulc'h et al. [26] observing an increase in plant height cannot be confirmed. This could be due to the fact that the tallest plants (>130 cm) were removed from the populations in several successive years to limit their selective advantage. We conclude that the "good practice" of removing the tallest plants in an evolutionary population may improve its agronomic value. It might, however, have obscured any effects of natural selection on plant height.
Morphological characteristics of the parental varieties were documented in 2007 [42]. In that year, height of the yield parents was 87.5 cm while the quality parents were 97.1 cm tall on average. Thus, the significantly shorter straw length of the YCCPs compared to the other CCPs is rooted in the original composition of the CCPs and should not be understood as a divergent development of the populations over time.
Measurements in the F5-F9 also showed these differences in plant height of the CCPs [43]. Ear length has not previously been measured in the parental varieties. However, as the results show only marginal differences between ear length of references and populations, an influence of the parental varieties is unlikely. An influence of the two growing systems on straw height and ear length was not found.

Yield and Yield Components
Ear-bearing tillers were at the same low level for all CCPs, the mixture and the references, without large variation, in summer 2012, which shows that the winter conditions influenced all plots in a similar way, resulting in overall low yields. Nevertheless, the resilience of the CCPs and reference varieties was remarkably higher than that of most of the parents (Figure 2). Considering the poor survival of the parents in pure stands, the performance of the mixture in 2012 was impressive, demonstrating the general positive effects of mixtures over pure lines, as has been shown on many occasions before [7,44].
Based on previous year's results [45] and because they were composed from high-yielding varieties, the YCCPs were expected to yield better than the other CCPs. However, in 2012 they yielded lowest of all CCPs. To explain this, the parental varieties used to create the CCPs have to be taken into account. Of the 20 parent varieties only the four varieties 'Bezostaya', 'Monopol', 'Renan' and 'Hereward' survived the winter reasonably well (Figure 2). As the CCPs were composed in the UK, 14 out of 20 parent varieties were of English origin and thus bred for a maritime climate. 'Bezostaya', however, is of Ukrainian origin, has high grain yield and quality and good frost resistance, and is often used in crossing where winter hardiness is a desired trait [46]. 'Monopol' comes from Germany and 'Renan' is French [47]; only 'Hereward' is an English variety.
A closer look at the pedigree of 'Hereward' reveals a German winter wheat variety, 'Disponent', as a crossing partner here as well [23], which most likely provided it with a certain degree of winter hardiness. Of these four varieties with good winter hardiness, only 'Bezostaya' was intercrossed into the YCCP, which most likely explains why the winter conditions affected the YCCPs more than the other populations. While it is possible that selection for greater winter hardiness occurred at the German site, this cannot be concretely concluded without direct comparison of early and late generations for this trait, or of populations that have undergone evolution in different climatic conditions.
The comparably good yield of 'Achat', 'Akteur', and 'Capo' in 2012 is most likely owed to their relatively good winter hardiness and to the fact that good winter hardiness was not one of the main traits in focus when establishing the CCPs. It remains to be seen if the CCPs respond better to freezing after having survived one especially cold winter. As we used the same seed in both years, the winter effects did not affect the performance in the second year. Results from experiments investigating the effect of natural selection on the winter survival of barley CCPs indicate that natural selection did increase winter survival, although not uniformly over different generations [48]. In bulk populations of winter oats an improvement in winter hardiness could only be found in populations with low initial survival levels [49,50]. Also, apparent advances made in winter survival in one year can reverse in later generations due to a lack of competitive ability of the hardy types later in the growing season [49], when non-hardy types that were not eliminated resurface and restore themselves as major components in the population [48]. This shows that complex traits such as winter hardiness, which were not a main focus when establishing CCPs, are hard to achieve through natural selection only.
In 2013, yield of the YCCPs corresponded with expectations, being 0.3 t/ha higher than that of the QCCPs and YQCCPs. These differences were, however, not statistically significant. Yield of the CCPorg and CCPconv varied minimally, with no indication that their maintenance in different growing systems has led to strong variation between the two groups of populations regarding yield performance. The higher number of ears of the YCCPs was related to the higher yielding capacity of these populations. In contrast, the high yields of the references 'Capo' and 'Akteur' were due to high TKW and high number of seeds per head, respectively. This is in contrast to what was previously published by the seed producing industry: 'Capo' is known as a density type realizing yields through many tillers and 'Akteur' as a single ear type, forming many seeds per ear with high TKW [51].
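The yield components discussed above combine multiplicatively: grain yield per unit area is the product of ear-bearing tillers per m², grains per ear, and single-grain weight (TKW divided by 1000). The following is a minimal sketch of this standard decomposition. The function name and all input values are hypothetical illustrations and are not data from this trial.

```python
def grain_yield_t_per_ha(ears_per_m2: float, grains_per_ear: float, tkw_g: float) -> float:
    """Grain yield in t/ha from the three classical yield components.

    tkw_g is the thousand kernel weight in grams, so one grain weighs tkw_g / 1000 g.
    """
    grams_per_m2 = ears_per_m2 * grains_per_ear * tkw_g / 1000.0
    # 1 g/m2 corresponds to 10 kg/ha, i.e. 0.01 t/ha
    return grams_per_m2 / 100.0


# Hypothetical stands: a "density type" and a "single ear type" can reach
# the same yield through different components.
density_type = grain_yield_t_per_ha(ears_per_m2=500, grains_per_ear=24, tkw_g=45)
single_ear_type = grain_yield_t_per_ha(ears_per_m2=400, grains_per_ear=30, tkw_g=45)
print(density_type, single_ear_type)
```

Under these assumed values both stand types arrive at the same yield, which is why the component through which a variety realizes its yield has to be measured rather than inferred from yield alone.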
A higher TKW was the only parameter that separated the CCPorg from the CCPconv in 2012. In the second year, absolute differences were at the same low level; the difference was, however, not statistically significant. Apart from this observation there was no field evidence that the differing environments of an organic and a conventional farming system could have shaped the CCPs in different ways. However, a study using hydroponics and bioassays to investigate early vigour and allelopathy in the F6 and F11 of the CCPorg and CCPconv documented system effects on the CCPs.
Characteristics for early vigour were improved after five years in the organically managed CCPs in comparison to the conventionally managed CCPs. The changes towards early vigour in the organic CCPs are thought to be due to the combined effects of selection for higher nitrogen uptake under low-input conditions, and increased competition for light and larger seeds, rather than a direct adaptation to higher weed pressure [52].

Baking Quality
As baking tests are rather costly and time-consuming, various indirect parameters such as sedimentation value, wet gluten, protein content and falling number are often used to predict the baking properties of wheat flour. It has been assumed that protein and wet gluten content strongly correlate with the baking volume determined in the RMT. This is, however, not always the case [53]. In wholemeal baking tests, protein content, sedimentation value and wet gluten content often have only a very limited influence on the baking volume [54].
In our study indirect baking quality parameters were analyzed in both years while baking tests could only be conducted in 2013. The results for protein content and HFN in 2013 were in accordance with the outcomes of the baking tests while wet gluten and sedimentation value were less suitable to predict the baking test outcome. The results show a clear differentiation of groups based on the original composition of the populations for all parameters except wet gluten.
Baking tests are usually done with the rapid mix test (RMT), which is the usual procedure when testing superfine flour. The RMT is, however, not optimized for the processing of organically produced wheat [55]. Considering this, the baking test done in 2013 to assess baking volume of the CCPs was done with wholemeal test loaves.
For a wholemeal baking test the average loaf volume of 383 ml is a good result. When baking with wholemeal flour, lower volumes are the norm: a volume of 400 ml or above is considered very good, 350 to 400 ml is good, below 350 ml is moderate and 330 ml and below is poor (pers. comm., Dr. R. Aberham).
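The rating scale quoted above can be written as a small lookup. This is only a sketch: the function name is hypothetical, and the handling of the exact boundary values (400 ml counted as very good, 350 ml as good, 330 ml as poor) is our reading of the verbal scale.

```python
def classify_wholemeal_volume(volume_ml: float) -> str:
    """Rate a wholemeal test loaf volume (ml) on the scale quoted in the text."""
    if volume_ml >= 400:
        return "very good"
    if volume_ml >= 350:
        return "good"
    if volume_ml > 330:
        return "moderate"
    return "poor"


# The trial average reported in the text
print(classify_wholemeal_volume(383))
```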
In the test, all CCPs and references except OY II ranged above 350 ml. The strong differences between varieties that can be observed with white flour are less pronounced when testing with wholemeal flour [56]. In this way the results are more likely to correlate with the results that bakers producing organic bakery products would achieve. The high volumes of the QCCPs compared with YCCPs or YQCCPs indicate that the original choice of parental varieties still has an effect, while adaptation to the farming systems seems to have had no effect on baking volume. The same was true for protein content, falling number, and sedimentation values. In contrast, for wet gluten the influence of parents is not as clearly visible as for the other baking quality parameters. Overall, the QCCPs, which were specifically created for good baking quality, are as good as (baking volume) or better than (protein content, HFN) modern elite wheat varieties.
While yield is a trait that is subject to natural selection [15,57], quality traits are not directly influenced by natural selection [15]. Without the genetic base of high-quality parents the breeding objective of high baking quality cannot be reached [58]. Including a parent with low baking quality in the setting up of a high-quality CCP can be enough to counteract the high-quality parents, as some individuals with low quality will prevent the population as a whole from sustaining high quality [15]. Results from trials with variety mixtures show other patterns, however. In a mixture of two wheat varieties a higher total aerial biomass was achieved than was produced by each variety grown in a pure stand. This increase resulted in a grain yield similar to that of the higher-yielding variety, and an improved protein content was measured [59].
The crossing design of the CCPs developed by the John Innes Centre and Elm Farm Research Centre took into account that quality traits are not subject to natural selection. As opposed to the early composite cross populations of wheat and barley [60,61], which were established with the aim of representing the major wheat or barley growing areas of the world in order to assemble genotypes appropriate for each cultural practice in the respective agro-climatic zone [15], the focus was narrowed to yield or quality as key characteristics of the CCPs. The results show that the quality traits were successfully inherited and maintained over time and that acceptable yield levels were achieved not only in the populations designed to be high-yielding, but also in the high-quality populations, which were not much different in yield from the high-yielding populations in the second experimental year. By using seed of the same generation in both years, these genetic effects could be clearly separated from the lack of winter hardiness in the YCCP parentage.
Looking at the yield and quality achieved by the mixture of parents, a contrast of low yield in both years but good quality becomes visible. The CCPs outyielded the parental variety mixture in both years. Here the populations seem to have a clear advantage over the mixture. The overall higher diversity and/or natural selection and adaptation over time may be responsible for this. For the quality aspect, natural selection played, as mentioned above, a minor role, and QCCPs and parents continued to perform similarly after a decade of selection.

Conclusions
The concept of evolutionary breeding can be one of the new, different and efficient strategies urgently required to face the challenges of climate change, population growth and use of finite resources. The overall question of whether the growing conditions on either organic or conventional fields influence the agronomic performance of the populations cannot be answered conclusively. The two years were very different, especially regarding the climatic conditions, and many differences were not consistent over both years of the trial.
The parental selection for the CCPs has a much greater influence on their performance than the growing and management conditions to which the populations are subjected. This can be observed with regard to baking quality traits, as well as morphological parameters, grain yield and yield parameters.
The choice of parents to establish a CCP is crucial, especially when focusing on traits which are not directly influenced by natural selection (for example, quality traits).In the case of the QCCPs the establishment of a reliable high-quality population worked very well and quality traits were successfully maintained over time.
The results clearly indicate that the intercrossing of several pure line varieties does not strongly disconnect their carefully selected traits and much of the originally exhibited characteristics remain (including lack of winter hardiness, for example). The traits present in the parental varieties determine the performance of the CCPs to a considerable degree, even after several years of adaptation to specific growing conditions, so the initial choice of parents suitable for the intended growing conditions should not be underestimated.
As the populations only evolve slowly or not at all in the absence of high selection pressure, which was illustrated by the reactions to foot and foliar diseases, they might be in danger of being outperformed by newly bred wheat varieties after a decade of maintenance and evolution. The frequent integration of well adapted, modern breeding lines into existing CCPs might help to overcome this constraint. Another strategy could be to apply additional human selection such as mass selection for vigour or disease resistance in the context of participatory breeding approaches.

Figure 2. A (top): Number of plants/m² in 20 winter wheat varieties (parent varieties of the CCPs, replicated twice in plots for seed multiplication) counted on the 19th of April 2012. Error bars denote the standard deviation for each variety (n = 2). B: CCPs straight after the frost, photo taken on March 1st 2012. C: CCPs (left) and parent varieties (right) six weeks later (photo taken on April 16th 2012).

Figure 3. Straw length, 1st and 2nd trial year. n = 200. Shown are the median, the box signifying upper and lower quartiles, minimum and maximum, and, where required, outliers (o = outlier between 1.5× and 3× interquartile range; * = extreme value >3× interquartile range). Horizontal line indicates the mean length in the trials. Populations/varieties with the same letter do not differ at p ≤ 0.05 according to Tukey-B test.

Figure 4. Number of ear-bearing tillers, grain yield, and TKW in both trial years (n = 4). Horizontal lines indicate average values in the trial; populations/varieties with the same letter do not differ at p ≤ 0.05.

Table 1. Number of frost days in February and average minimum and maximum temperatures.

Table 2. Straw and ear length. Within the years, means of a-priori defined groups were compared using linear contrasts.

Table 3. Ear-bearing tillers/m², grain yield [t/ha], and TKW [g] of populations and reference varieties in both trial years.

Table 4. HFN, protein content, wet gluten, sedimentation value, baking volume (data from pooled samples). Populations/varieties with the same letter do not differ at p ≤ 0.05 (Tukey-B test). Green = good, yellow = moderate, red = poor.
* Data from pooled samples. † Data from replicated samples (n = 4; protein contents 2013 not significant).