Organic Farming | 2015 | Volume 1 | Issue 1 | Pages 3‒18
DOI: 10.12924/of2015.01010003
Research Article
Strategies towards Evaluation beyond Scientific Impact.
Pathways not only for Agricultural Research
Birge Wolf 1,*, Anna-Maria Häring 2 and Jürgen Heß 1

1 University of Kassel, Faculty of Organic Agricultural Sciences, Organic Farming & Cropping Systems,
Nordbahnhofstr. 1a, 37214 Witzenhausen, Germany; E-Mail: [email protected] (JH)
2 Eberswalde University for Sustainable Development, Department Policy and Markets in the Agro-Food Sector,
Schicklerstr. 5, 16225 Eberswalde, Germany; E-Mail: [email protected]
* Corresponding Author: E-Mail: [email protected]; Tel.: +49 5542981536; Fax: +49 5542981568
Submitted: 21 July 2014 | In revised form: 20 February 2015 | Accepted: 23 February 2015 |
Published: 15 April 2015
Abstract: Various research fields, like organic agricultural research, are dedicated to solving
real-world problems and contributing to sustainable development. Therefore, systems research
and the application of interdisciplinary and transdisciplinary approaches are increasingly
endorsed. However, research performance depends not only on self-conception, but also on
framework conditions of the scientific system, which are not always of benefit to such research
fields. Recently, science and its framework conditions have been under increasing scrutiny as
regards their ability to serve societal benefit. This provides opportunities for (organic)
agricultural research to engage in the development of a research system that will serve its
needs. This article focuses on possible strategies for facilitating a balanced research evaluation
that recognises scientific quality as well as societal relevance and applicability. These strategies
are (a) to strengthen the general support for evaluation beyond scientific impact, and (b) to
provide accessible data for such evaluations. Synergies of interest are found between open
access movements and research communities focusing on global challenges and sustainability.
As both are committed to increasing the societal benefit of science, they may support
evaluation criteria such as knowledge production and dissemination tailored to societal needs,
and the use of open access. Additional synergies exist between all those who scrutinise current
research evaluation systems for their ability to serve scientific quality, which is also a
precondition for societal benefit. Here, digital communication technologies provide opportunities
to increase effectiveness, transparency, fairness and plurality in the dissemination of scientific
results, quality assurance and reputation. Furthermore, funders may support transdisciplinary
approaches and open access and improve data availability for evaluation beyond scientific
impact. If they begin to use current research information systems that include societal impact
data while reducing the requirements for narrative reports, documentation burdens on
researchers may be relieved, with the funders themselves acting as data providers for
researchers, institutions and tailored dissemination beyond academia.
Keywords: interdisciplinary; research evaluation; societal impact; transdisciplinary
1. Introduction
A crucial aim of agricultural research is to address
sustainable development. Global challenges like climate
change [1] or the degradation of ecosystem services
have fundamental negative impacts on human health
and well-being [2]. Agriculture is both driving and
being affected by those developments ([2] p. 98), [3].
Such challenges require immediate and adequate ac-
tion on the part of the whole of society, but also the
contribution of relevant knowledge through research
([3] p. 3; [4] p. 322). However, whether research is able
to make that contribution depends primarily on the
conditions and incentives within the scientific system.
In this article, the focus will be on research eval-
uation, which can be an important driver for developing
science in the direction of scientifically robust, societally
relevant and applicable knowledge production. Cur-
rently, scientific quality assurance is mainly performed
through peer review of papers and project proposals,
while scientific impact is evaluated based on publication
output in peer-reviewed journals and citation-based
performance indicators (detailed in Section 2.3).
Citations of a publication are a measure of its acknowledgement by the respective researcher's peers. Citations are counted within peer-reviewed journals that are indexed for citation counting. Furthermore, a
researcher's publication output and citation rates can
be subsumed in an index, e.g. the h-index [5].
Citations are also used as a measure of the recognition
of journals, where all citations of a journal within other
journals are counted, e.g. the Journal Impact Factor
(IF) used by Thomson Reuters [6]. Accordingly,
scientific impact is associated with high publication
output in high-impact journals and high citation rates in
other highly ranked journals. These measures assess,
at best, the impact of research on science itself. How-
ever, they neither assess societal impact nor serve as
proxies for it [7]. As a result, research which also targets audiences outside academia may not be adequately appreciated in research evaluation. The term
societal impact is used here to sum up all the practical,
social, environmental, economic and other 'real-world'
impacts research may have for its target groups and
society as a whole.
To overcome shortcomings in current research eval-
uation practices, several alternative evaluation con-
cepts which take societal impacts into account have
been developed over the past few years (see Section
3.2). However, such an evaluation of societal impact
faces some inherent challenges, including time and
attribution gaps. The term 'time gap' describes the problem that impact, if it occurs, usually materialises with some delay after the research has been completed. Secondly, the 'attribution gap' means that impacts are not
easily attributed to a particular research activity like a
project or publication. For example, the adoption of a
particular agricultural innovation may be the result of
several research activities combined with policy chang-
es and other influences. Accordingly, the state of the
art of societal impact assessment focuses on the
contribution of research in complex innovation systems,
instead of attributing the impacts linearly in terms of
cause and effect [8]. Furthermore, proxies are often
employed, instead of direct measures of impact. One
example is the concept of 'productive interactions',
defined as direct, indirect or financial interactions with
stakeholders that support the use of research results
and make an impact likely [9].
With bibliometric data it is possible to analyse inter-
disciplinary publications via references from and cita-
tions in different fields [10], as well as interactions
between basic and applied research. By contrast, the
assessment of societal impact (or corresponding prox-
ies) cannot be built on bibliometric analysis, and in
most cases there are no other sources with easy-to-
use data available either. Thus the effort involved in
data assessment for documentary analysis or inter-
views, for example, inhibits the frequent use of such
evaluation approaches.
Starting from these observations, the aim of this
paper is to discuss two possible strategies to facilitate
research evaluation that is more balanced, both with
regard to scientific quality and impact, and to societal
relevance and applicability. The first strategy is to
strengthen general support for such evaluation be-
yond scientific impact; the second is to reduce the
effort of societal impact evaluations by improving data
availability.
Section 2 below introduces the relevant movements
and focuses on shared interests as a base for broader
support of evaluation beyond scientific impact. Section
3 then provides concrete measures for such support,
including possibilities for improving data availability for
evaluation beyond scientific impact. In each section
the paper shows how agricultural research that is
oriented towards sustainability and real-world impact,
with a special focus on organic agricultural research,
could be involved in these developments in order to
create good conditions for its fields of research. We will
conclude with an overview of the actions that may be
undertaken jointly by various actors.
2. Multiple Voices Call for Changes in Know-
ledge Production and Research Evaluation
Various societal groups are demanding changes in
knowledge production and research evaluation, for
example researchers and funding agencies engaged in
sustainability, global challenges and transdisciplinary
approaches, the open access movements, and re-
searchers who scrutinise current research evaluation
systems for their ability to serve scientific quality.
2.1. Research Engaged in Sustainability, Global
Challenges and Transdisciplinary Approaches
2.1.1. Sustainable Development Requires the
Support of Interdisciplinary and Transdisciplinary
Research Approaches
Several international assessments synthesise scientific
and non-scientific knowledge via multiple-stakeholder
processes involving science, governments, NGOs,
international organisations and the private sector, for
example the Millennium Ecosystem Assessment (MA)
[2], the International Assessment of Agricultural Know-
ledge, Science and Technology for Development
(IAASTD) [3] and the World Health Summit ([11] pp.
86‒87). These assessments, and some scientific groups
that give policy advice, such as the WBGU (German
Advisory Council on Global Change) [4], point out that
there is considerable pressure on society to tackle
pressing challenges adequately, which in turn requires
knowledge to be produced, accessed and used in ways
that assist such adequate action and are conducive to
sustainable development.
However, the transfer of existing knowledge and
technologies faces several challenges. On the one
hand, the balance of power and conflicting interests
impede the use of research evidence ([2] p. 92). The
reduction in greenhouse gas emissions, for example, is
still not sufficient, although the IPCC has been trans-
ferring the state of the art regarding climate change to
politics for 20 years now [1]. On the other hand, the
need to increase access, clarity and relevance of
research evidence for politics has been discussed [12].
Furthermore, concepts for the transfer of knowledge
and technology should reflect on possible risks. Instead
of merely assuming the superiority of external know-
ledge and novel technologies, they should be tested
beforehand under actual conditions of use ([3] p. 72)
or evaluated in sustainability assessments [13].
The challenges in knowledge transfer also lead to a
demand for changes in knowledge production in order
to increase the applicability and sustainable benefits
of knowledge. The reasons for such demands are
firstly that technological development is fast and may
have deep, in some cases irreversible impacts on our
ecological, economic or social environment ([14] pp.
87‒93). Secondly, post-modern societies consist of
complex subsystems that function according to their
own inherent rules and often fail to deal with impacts
that occur in more than one of them at the same time
([14] pp. 61‒63, 87‒93). Thus, knowledge production
also needs to cut across specialised areas and societal
subsystems ([15] p. 544; [4] p. 322) and should
support transformative processes ([4] p. 322), [11].
Thirdly, true participation of stakeholders in research
processes is required to support practical applicability,
ownership of solutions and sustainable impact of
knowledge ([2] p. 98; [3] pp. 72‒73; [4] p. 322).
Accordingly, recommendations cover enhanced know-
ledge exchange among disciplines, between basic and
applied research ([4] p. 322) and between science and
politics [12], ([16] p. 9) and the involvement of stake-
holders, including the integration of traditional and local
knowledge ([2] p. 98; [3] pp. 72‒73; [4] p. 322). Such
transdisciplinary processes may also be supported by
involving 'knowledge brokers' as intermediaries to
facilitate knowledge exchange [12], ([17] p. 17). Addi-
tionally, joint agenda setting, including science, politics,
the economy and in particular civil society organisations
is recommended for research regarding sustainability
([4] p. 322) and agriculture ([17] p. 17) and is, in some
cases, already practised [18‒20]. This corresponds to the aim of civil society organisations to strengthen their influence in research policy (see, for example, [21]).
The recommendations specified in this section are
well subsumed in the terms co-design, co-production,
co-delivery and co-interpretation used by the project VisionRD4SD [22]. These recommendations show that
concepts for inter- and transdisciplinary research (e.g.
[23‒26]) and approaches of 'systems of innovation',
understanding innovation as a set of complex proc-
esses involving multiple actors beyond science (e.g.
[27]), are now well accepted in policy advice. Like-
wise, several research funders have started to support
sustainability and transdisciplinarity explicitly in re-
search programming ([14] pp. 202‒214), [28,29].
2.1.2. Current Incentive Systems Are Criticised
Despite the promising developments mentioned above, current incentive systems are considered inap-
propriate for encouraging researchers to focus their
research on sustainable development.
Reputation-building processes based on publications in
high-ranking scientific journals and third-party funding
are often governed by disciplinary perceptions and fail
to acknowledge interdisciplinary and systemic ap-
proaches ([4] p. 351). Interdisciplinary research usually
has to match the standards of different disciplines in
peer review processes, which adversely affects publi-
cation success [10], ([15] p. 547) and the evaluation of
multidisciplinary institutions [30]. Audits based on bi-
bliometric performance indicators [15] and, explicitly,
the use of journal rankings [10] have been shown to
be biased negatively against inter- and multi-disci-
plinary research.
Some authors discuss consequences such as poorer
career prospects, orientation of research away from
complex social questions, reduction in cognitive diver-
sity within a given discipline or the entire science sys-
tem [10], and an increasing relevance gap between
knowledge producers and knowledge users [15]. Simi-
larly, Schneidewind et al. highlight the diversity of the
sciences in objectives and theories as a base for soci-
etal discussion processes ([14] pp. 30‒33) and good
scientific policy advice ([14] p. 63).
Thus, researchers, institutions and funding agencies
that move towards joint knowledge production for sus-
tainable development may often feel contradicted by
the current incentives within scientific reputation sys-
tems. Accordingly, the indication is that it is necessary
to improve current evaluation practices in general and
apply evaluation criteria beyond scientific impact.
2.1.3. Opportunities for (Organic) Agricultural
Research
Broader support for changes in knowledge production
and research evaluation provides multifarious oppor-
tunities for agricultural research. As organic and sus-
tainable farming addresses and works within the com-
plexity of ecological systems, and farmers' knowledge
and practices are key to building resilient agricultural
production systems, the approaches highlighted in
Section 2.1.1 have, since their early days, been ad-
vocated in agroecology [31] and organic agricultural
research ([19] pp. 15‒16), [32,33]. Agricultural re-
searchers are often already in contact with actors
along the whole value chain of agriculture, and ap-
proaches are reflected in diverse concepts for transdisciplinarity (e.g. [34‒36]) and systems of innovation (e.g. [37]). Researchers' experiences, and their aware-
ness of the challenges posed by such approaches e.g.
([19] p. 61), [38], promote their adequate advance-
ment via mutual learning with other research com-
munities. Furthermore, the competence of (organic)
agricultural research to develop applicable solutions
with substantial value in the context of some pressing
social and ecological challenges may become more
visible.
Research evaluation that goes beyond conventional
performance indicators and involves stakeholders is
seen as necessary for agricultural research too ([3]
pp. 72‒73; [17] pp. 81‒84; [19] p. 56). Such research
evaluation may facilitate the application of transdis-
ciplinary and related research approaches without dis-
advantages for researchers' reputations. The necessity
of such incentive effects is supported by various
statements, e.g. "European agricultural research is cur-
rently not delivering the full complement of knowledge
needed by the agricultural sector and in rural com-
munities" ([19] p. 57). Similarly, the evaluation of an
organic agricultural research programme in Sweden re-
sulted in the verdict 'excellent' by scientific peers, while
the agricultural advisors indicated too little relevance to
pressing problems [39]. The DAFA position paper "As-
sessment of applied research" considers it necessary to
build a consensus about possible indicators, make a
commitment to their rigorous application and improve
documentation for practice impact [40]. Thus, (organic)
agricultural research may use its commonalities with
sustainability research in order to jointly advance inter-
disciplinary and transdisciplinary research approaches
and to advocate their adequate support in funding and
appreciation in research evaluation.
2.2. Open Access with Focus on Benefit for Society
Open access movements also aim to increase the ben-
efit of research results for science and society. More
than ten years ago, the Berlin declaration called for
open access for original research results, raw data,
metadata, source materials, digital representations of
pictorial and graphical materials and scholarly multi-
media [41]. Arguments in favour of open access are
for example a) to regard publicly funded knowledge
as public property, b) to enhance the transfer, visibility
and benefit of knowledge, which is now easily pos-
sible via digital technologies and reasonable because
of the increased scientific literacy of the public, and c)
to support participation in democratic societies [41,42].
Furthermore, the open access movements provide
concepts for increased collaboration and interaction in
the creation of research results and pluralisation and
transparency in the evaluation of publications, and
support the full use of technological developments in
data processing (see Section 3.1).
However, the inadequate exchange, use, relevance
and ownership of scientific knowledge in politics,
practice and society indicate that open access alone
does not suffice to create benefits of knowledge. Thus
co-design, co-production, co-interpretation and co-
delivery are necessary on one hand to serve societal
benefit, whilst on the other the dissemination of openly
accessible research outputs tailored to target groups
within and beyond science is also a requirement. Such
a comprehensive view of the benefits of research for
society increases the credibility of the arguments and
supports the view that the corresponding changes in
evaluation criteria can be promoted jointly by open
access movements and research that is concerned with
sustainable development. In our view, (organic) agri-
cultural research is well placed to become a proficient
actor in the process of combining the tasks of these
two groups. The (organic) agricultural research com-
munity is experienced in knowledge transfer and inter-
and trans-disciplinary approaches within the diverse
agricultural sector and is aware of 'open-access issues',
for example interrelations between agriculture and
public goods ([3] pp. 24, 30, 73).
2.3. Improve Current Scientific Impact Evaluation
Procedures
In general, evaluation procedures that support scientific
quality are required for both basic and applied research
as foundations for evidence-based decisions. However,
as detailed below, current scientific impact evaluation
procedures are shown to have potential negative con-
sequences for scientific quality. Knowledge of these
consequences and possibilities for improvement is help-
ful for strengthening scientific quality, increasing aware-
ness of the general effects of evaluation processes, and
generating some 'open space' to introduce criteria
related to societal impact.
2.3.1. Challenges of Peer Review as a Socially
Embedded Process
Several criteria are used by the scientific community
to assess scientific quality. The most common are the
novelty and originality of the approach, the rigour of
the methodology, the reliability, validity and falsifi-
ability of results and the logic of the arguments pres-
ented in their interpretation. Peer review processes
are broadly perceived as functioning self-control of the
scientific community towards scientific quality in
publications and third-party funding. Correspondingly,
reviewers trust the fairness and legitimacy of their
own review decisions [43].
Nevertheless, peer review processes also reflect
hierarchy and power within science as a social system.
Editors and peers appear as 'gatekeepers', who not
only maintain quality but also uphold existing para-
digms and decide which of the many high-quality
research papers submitted will be allowed to enter the
limited space available in the journal concerned [44,
45]. Evaluative processes are found to involve not
only expertise, but also interactions and emotions of
peers [46], cited in ([43] p. 210). Instead of erroneously
assuming that a "set of objective criteria is applied
consistently by various reviewers", it is necessary to
focus on what factors promote fair peer review
processes ([43] p. 210).
Undesired decision processes such as strategic
voting may occur on peer review panels; it has been
suggested that fairness is improved if peers rate rather
than rank proposals and give advice to funders instead
of deciding about funding [43]. Furthermore, in single-
blind reviews, knowledge of the author's person,
gender and institutional affiliation may influence peer
review [43,47‒50]. Double-blind and triple-blind re-
views, the latter including editor-blindness, partly
reduce bias [45], but advantages for native speakers,
preferences for the familiar and insufficient reliability
of reviewer recommendations do remain ([43] p.
210), [48,50]. For example, the agreement between
peers with and without experience in organic agri-
cultural research has been found to be poor with
regard to reviewers' assessment of scientific quality in
organic farming research proposals [51]. In some
cases peer review fails to identify fraud, statistical
flaws, plagiarism or repetitive publication [47,50]. Re-
cently, trials on the submission of fake papers have
revealed alarmingly high acceptance rates, in high-
ranked subscription journals [52] and open access
journals [53]. The latter study includes some pub-
lishers who were already on Beall's list of 'predatory
publishers', which identifies open access publishers of
low quality [54,55].
Accordingly, further possibilities for improving peer
review processes are being discussed. They focus on
increasing efficacy and transparency in research dis-
semination and quality assurance via the full use of
technological developments in connection with open
access (see Section 3.1).
2.3.2. Self-Reinforcing Dynamics of Bibliometric
Indicators
Bibliometric indicators (Table 1) are also results of
socially embedded processes because, firstly, publi-
cation in a certain journal reflects the decisions of re-
viewers and editors, and secondly, citation-based per-
formance indicators subsume the decisions of many
scientists as to whether to cite or not. In general, the
publication of research evidence is influenced by re-
searcher bias (the observer expectancy effect), which
results in a higher likelihood of false positive findings
and publication bias, meaning that "surprising and
novel effects are more likely to be published than
studies showing no effect" ([56] p. 3). Accordingly, "the
strength of evidence for a particular finding often de-
clines over time". This is also known as the decline ef-
fect ([56] p. 3). Moreover, non-significant results often
remain unpublished. This phenomenon, known as the
file-drawer effect, distorts the perception of evidence
and reduces research reliability and efficacy [57].
The fact that peer decisions are often influenced by
metrics also has to be taken into account: Merton
describes the cumulative processes of citation rates as
the Matthew effect, which follows the principle that
"success breeds success" and results in higher cita-
tions being overestimated and lower citations under-
estimated [58]. Such dynamics are enforced by in-
creasing scarcity of time resources and an augmented
need to filter a large amount of accessible information
[59]. Evidence of the Matthew effect, also called ac-
cumulative advantage, is frequently detected in science
[60] and considered by scientists to be the major bias
in proposal evaluation ([48] pp. 38‒39).
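As a purely illustrative toy model (all parameters hypothetical, not an empirical claim), this cumulative dynamic can be sketched in a few lines of Python: each new citation goes to a paper with probability proportional to the citations it already has, which quickly produces the highly skewed citation distributions referred to later in connection with DORA.

import random

def simulate_cumulative_advantage(n_papers=200, n_citations=2000, boost=1.0, seed=42):
    # Toy 'success breeds success' model: each new citation selects a paper
    # with probability proportional to (its current citations + boost).
    random.seed(seed)
    counts = [0] * n_papers
    for _ in range(n_citations):
        weights = [c + boost for c in counts]
        chosen = random.choices(range(n_papers), weights=weights, k=1)[0]
        counts[chosen] += 1
    return sorted(counts, reverse=True)

counts = simulate_cumulative_advantage()
share_of_top_decile = sum(counts[:20]) / sum(counts)  # citations held by the top 10% of papers
print(share_of_top_decile)  # typically far above 0.10, i.e. a strongly skewed distribution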
A further interaction occurs between metrics and
strategic behaviour: as person-related indicators of
productivity (publication output) and impact (citation-
based indicators) influence funding or career options
[61], dividing results into the 'least publishable unit'
[62], increasing the number of authors, or citing 'hot
papers' are strategies for boosting scientists' per-
formance indicators [45].
Furthermore, indices may hide information. The
popular h-index combines publication output and
citation rates in one number. It reduces the dispro-
portionate valuation of highly cited and non-cited pub-
lications, with the result that researchers with quite
different productivity and citation patterns may obtain
the same h-index. This has been criticised, and the
recommendation is to use several (complementary)
indicators to measure scientific performance, in par-
ticular separate ones for productivity and impact [63].
The relevance and use of journal-related metrics
are also subjects of intense debate. A review of
several empirical studies about the significance of the
Journal Impact Factor (IF) concluded that "the lit-
erature contains evidence for associations between
journal rank and measures of scientific impact (e.g.
citations, importance and unread articles), but also
contains at least equally strong, consistent effects of
journal rank predicting scientific unreliability (e.g.
retractions, effect size, sample size, replicability, fraud/
misconduct, and methodology)" ([56] p. 7). For ex-
ample, a correlation was detected between decline
effect and the IF: initial findings with a strong effect
are more likely to be published in journals with a high
IF, followed by replication studies with a weaker ef-
fect, which are more likely to be published in lower-
ranked journals [56].
Moreover, the IF and other journal-based metrics
are increasingly considered inappropriate for com-
paring the scientific output of individuals and insti-
tutions. This is indicated by the San Francisco Decla-
ration on Research Assessment (DORA), currently
signed by nearly 500 notable organisations and
11,000 individuals [64]. DORA substantiates this
statement with findings which show that a) citation
distributions within journals are highly skewed; b) the
properties of the IF are field-specific: it is a composite
of multiple, highly diverse article types, including
primary research papers and reviews; c) IFs can be
manipulated (or 'gamed') by editorial policy; and d)
data used to calculate the IF are neither transparent
nor openly available to the public [65]. Gaming of the
IF is, for example, possible by increasing the pro-
portion of editorials and news-and-views articles,
which are cited in other journals although they do not
count as citable items in the calculation of the IF [66].
Thus, journal-based metrics are not only found to
be unreliable indicators of research quality; the pres-
sure to publish in high-ranked journals may also com-
promise scientific quality. Furthermore the latter
"slows down the dissemination of science (...) by iter-
ations of submissions and rejections cascading down
the hierarchy of journal rank" ([56] p. 5) which also
enormously increases the burden on reviewers, au-
thors and editors [67].
In agricultural research, some scepticism about jour-
nal-related metrics is already evident: the Agricultural
Economics Associations of Germany and Austria, for
example, perform 'survey-based journal ranking', be-
cause this was perceived to be more adequate than
using the IF [68].
Notwithstanding this criticism, efforts in indicator development should be acknowledged. In article-based metrics, the weighting of co-authorship and of highly cited papers, the exclusion of self-citations, the adjustment of time frames and the inclusion of citation value (the rank of the citing journal) aim to assess scientific impact more precisely. Similarly, the further development of jour-
nal-based metrics (see Table 1) involves the exclusion
of self-citations and inclusion of citation value, the
weighting of field-specific citation patterns, the inclu-
sion of network analyses of citations or weighting the
propinquity of the citing journals to one another [69].
Nevertheless, the self-reinforcing dynamics of biblio-
metric indicators and their interactions with the cred-
ibility of science are not taken into account in these
indicator variations. For example, the weighting of
citation value may even increase accumulative advan-
tage.
To sum up, it seems appropriate to improve peer-
review processes, to reject certain indicators, and
crucially, to apply a broad set of indicators, because
scientific performance is a multi-dimensional concept
and indicators always contain the risk that scientists
will respond directly to them rather than to the value
the indicator is supposed to measure ([10] p. 7).
Explicitly, DORA recommends that funding agencies
and institutions should "consider a broad range of
impact measures including qualitative indicators of
research impact, such as influence on policy and
practice" [65]. As societal benefit requires scientific
quality as a base for evidence, but also goes beyond it,
needing a high degree of applicability and positive
application impacts, the two are in fact complementary, not opposed. Therefore, complementing scientific performance indicators with societal impact indicators can result in decisions
and incentives in the scientific system that are more
reliable and more beneficial to society.
Table 1. Indicators that are frequently used for scientific impact evaluation.

Citation count: In general, the number of citations received by a paper is counted. They can be summed up for all publications of an institution or person, or calculated relative to the average citation rate of the journal or respective field over a certain period (usually three years) [70,71]. Citation data are counted (except for the examples provided in Section 3.1) for and in journals indexed in the Journal Citation Report by Thomson Reuters or in the SCImago database by Elsevier [69]. Citations are generally assessed in papers, letters, corrections and retractions, editorials, and other items of a journal.

h-index: The h-index combines publication output and impact in one index: h = N publications with at least N citations (the time span for the calculation can be selected). There are derivatives of the h-index that include the number of years of scientific activity, exclude self-citations, or weight co-authorship and highly cited papers [5].

IF and journal-based metrics built on the Thomson Reuters database: The Journal Impact Factor (IF) is calculated by dividing the number of current-year citations to the source items published in that journal during the previous two years by the number of citable items. It can also be calculated for five years and exclude journal self-citations [6]. Example:

IF (2014) = (citations in 2014 to articles of journal A published in 2012 and 2013) / (citable items of journal A published in 2012 and 2013)

Another metric is Article Influence, in which the citation time frame is five years, journal self-citation is excluded and the citation value (impact factor of the citing journal) is weighted [69].

Eigenfactor: The Eigenfactor also uses Thomson Reuters citation data to calculate journal importance with several weightings. It includes network analysis of citations, weighting citation value and field-specific citation patterns [72].

Journal-based metrics built on Elsevier's Scopus database: All indicators are calculated within a citation time frame of three years. The Source Normalized Impact per Paper (SNIP) is calculated in a similar way to the IF. The SCImago Journal Ranks (SJR and SJR2) limit journal self-citation and weight citation value. SJR2 includes a closeness weight of the citing journals, meaning that citation in a related field is calculated as being of higher value, because citing peers are assumed to have a higher capacity to evaluate it [69].
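To make the indicator definitions in Table 1 concrete, the following minimal Python sketch computes an h-index from a list of per-paper citation counts and a simple two-year impact factor from aggregate counts. The function names and example numbers are illustrative assumptions; the actual providers apply additional counting rules, for example about which item types count as citable.

def h_index(citations):
    # h-index: the largest h such that h publications have at least h citations each.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

def impact_factor(citations_to_items_of_prev_two_years, citable_items_of_prev_two_years):
    # Two-year IF: current-year citations to items published in the previous
    # two years, divided by the number of citable items from those two years.
    return citations_to_items_of_prev_two_years / citable_items_of_prev_two_years

# Illustrative numbers only, not real journal or author data
print(h_index([23, 12, 9, 5, 4, 1, 0]))   # -> 4
print(round(impact_factor(420, 180), 2))  # -> 2.33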
3. Concrete Strategies to Support Evaluation
beyond Scientific Impact
While Section 2 introduced relevant movements and
pointed to shared interest as a base for further coop-
eration, this section will describe concrete measures
for facilitating evaluation beyond scientific impact. As
seen in the previous section, evaluation beyond sci-
entific impact may introduce criteria for various as-
pects of knowledge production (Figure 1).
3.1. Open Access and Technical Development
Provide some Solutions to Improve Current
Evaluation Practices
Although issues of peer review quality and self-reinforcing dynamics affect open access and subscription-based publication models alike, several possibilities for increasing effi-
cacy in dissemination and quality assurance via digital
communication technologies are discussed in the con-
text of open access. For peer review processes, in-
creased transparency is the core issue [73]. Open re-
view, meaning that reviews are published with the pre-
print or the final paper, is possible with different
degrees of openness and interactivity [42], though
some aspects are discussed controversially. Disclosure
of authors' identities entails the risk of increasing bias
as in single-blind reviews [74], while disclosure of
reviewers' identities is shown to preserve a high quality
of reviews [75], though suspicions do remain that this
may inhibit criticism and make it more difficult to find
reviewers [47,76]. However, the publishing of reviews,
enabling interactions between reviewers and authors
and increasing the basis of feedback and valuation via
comment, forum and rating functions for readers, is
commonly expected to increase transparency, fairness
and scientific progress [44,67,73]. Some applied
examples are the journal BMJ [42], Peerevaluation.org [77] and arXiv.org. At arXiv.org the pub-
lication of manuscripts accelerates dissemination and
reduces the file-drawer effect; in case of revisions and
publication in a journal, the updated versions are ad-
ded [44,78]. Another possibility is to guarantee
publication (except in cases of fraud), but not until
there has been a double-blind review of the man-
uscript focusing solely on scientific quality [67]. Re-
views and revised versions may be used for suggested
new publication concepts with a modified role for
editors [67] or even without journals [56], but also for
the current system, where they can serve to assist in
publication decisions made on the editorial boards of
individual journals.
Figure 1. Possible criteria for evaluation beyond scientific impact regarding various aspects of knowledge
production.
Additionally, review approaches should allow the
engagement of peers in research evaluation to be re-
warded [67] and the quality of peer review activities
to be assessed [77].
Open access to data is supported by several actors
[79]. It enables verification, re-analysis and meta-
analysis and reduces publication bias, thus safeguard-
ing scientific quality and societal benefit [80]. Ac-
cordingly, it is suggested that the full dissemination of
research and re-use of original datasets by external
researchers should be implemented as additional per-
formance metrics [80].
Diverse citation and usage data can be accessed via
the Internet for all objects with a digital object iden-
tifier (DOI) or other standard identifiers [81]. Thus, ci-
tation counting beyond Thomson Reuters or Scopus
databases is possible, e.g. via Google Scholar, CrossRef,
or within Open Access Repositories [42]. Furthermore,
responses to papers can be filtered with various Web
2.0 tools (e.g. Altmetrics.com [82]), which are often
combined with platforms to share and discuss diverse
scholarly outputs (e.g. Impactstory.org). Such data are
also tested for the evaluation of the societal use of
research [83]. Consequently, the call for open metrics
includes open access to citation data in existing citation
databases and all upcoming metrics that record cita-
tions and utilisation data [42].
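As an example of such identifier-based access to citation and usage data, the short sketch below queries the public CrossRef REST API for a DOI. The URL pattern and field names (e.g. 'is-referenced-by-count') reflect the API as we understand it and may change, so they should be checked against the current CrossRef documentation before use.

import json
import urllib.request

def crossref_record(doi):
    # Fetch publication metadata for a DOI from the CrossRef REST API.
    url = "https://api.crossref.org/works/" + doi
    with urllib.request.urlopen(url) as response:
        message = json.load(response)["message"]
    return {
        "title": message.get("title", [""])[0],
        "journal": message.get("container-title", [""])[0],
        # Number of CrossRef-registered citations pointing to this DOI
        "citations": message.get("is-referenced-by-count"),
    }

# Example: the DOI printed in the header of this article
print(crossref_record("10.12924/of2015.01010003"))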
In conclusion, there are many opportunities for
increasing transparency and interaction in review pro-
cesses, facilitating and acknowledging cooperative
behaviour and including a higher diversity of scientific
products and ways of recognising them in research
evaluation processes. This may help to improve cur-
rent evaluation systems. Until now, these approaches
have mostly been restricted to scientific outputs, but
they may likewise be used to disseminate outputs and
implement feedback functions tailored to diverse user
communities outside academia. For example, en-
hanced data assessment and communication tools are
also found to support the concept of citizen science
[84], where citizens carry out research or collect data
as volunteers [85].
3.2. Science Politics towards Changed Incentive
Systems
Science politics, funding procedures and the evaluation criteria applied are important drivers of research priorities, and therefore determine what knowledge will exist to face future societal challenges. As seen al-
ready in Section 2.1.1, research funders are increas-
ingly interested in supporting transdisciplinarity and
related research approaches and they also support
open access. For example, the most recent European
research programme, "Horizon 2020" [86,87], highlights
the need for multi-stakeholder approaches and the
support of "systems of innovation" via European Inno-
vation Partnerships [88]. It also makes open access to
scientific peer-reviewed publications obligatory and
tests open data approaches in certain core areas [89].
Adequate measures to support "Research and De-
velopment for Sustainable Development" via research
programming are provided by VisionRD4SD, a col-
laboration process between European research fund-
ers. It identifies measures for the whole programme
cycle, presents them in a prototype resource tool and
recommends a European or international platform to
support networking, dialogue and learning processes
on this subject [90]. Likewise, a guide for policy-
relevant sustainability research is directed at funding
agencies, researchers and policymakers [91].
Institutions and funders who are interested in
applying concepts of research evaluation beyond
scientific impact (see criteria in Figure 1) can build on
existing approaches. Evaluation concepts are developed
for interdisciplinary and transdisciplinary research and
for societal impact assessment used by research agen-
cies, research institutions or for policy analysis (reviews
may be found in [92‒94]). Examples of regularly ap-
plied evaluation procedures including societal outputs
are the Standard Evaluation Protocol for Universities in
the Netherlands ([95] p. 5) (see below) and the
Research Excellence Framework in the UK [96].
In the section that follows, we will suggest measures to ensure that evaluation beyond scientific impact is effective. First, steps should be taken to
ensure that societal impact criteria are applied by
reviewers, although these indicators may be felt to be
outside of reviewers' realm of disciplinary expertise
[97] or of lesser importance to them ([48] pp. 32‒35).
Interestingly, in one study ([48] pp. 32‒35), societal
impact indicators such as relevance for global societal
challenges or citizens' concerns, public outreach,
contribution to science education and usefulness for
political decision-makers were ranked higher in agri-
cultural research than in other fields, and they were
ranked higher by students than by professors. Such
results suggest that not only peers, but also knowledge
users ([15] p. 548), [97] should be involved in
evaluation. To increase the ability of scientists and
others to judge societal impacts, data on the societal
impact of research and their proxies (hereinafter
subsumed as societal impact data) could provide a
transparent and reliable basis for such judgement.
Furthermore, the experiences documented in
Section 2.3 suggest avoiding narrow indicator sets and
their use for competitive benchmarking or metrics-
based resource allocation. Instead, broad indicator
sets and fair and interactive processes which support
organisational development [30] or learning processes
[98] need to be applied. One example is the above-
mentioned Standard Evaluation Protocol in the
Netherlands, where "the research unit's own strategy
and targets are guiding principles when designing the
assessment process" ([95] p. 5).
However, when funders or institutions begin to apply
evaluation beyond scientific impact, they should focus
on increasing the acknowledgement of societal impact
within the scientific reputation system in general. This
is necessary to ensure that their incentives are effective
and do not merely increase researchers' trade-offs
between contributing to scientific and societal impact.
Adequate measures adopted by funders could be
additional funding or distinctions for particularly successful projects as "take-home values" for researchers.
Moreover, research institutions and research funders
should become active in improving data availability.
Only with reliable and easy-to-use data beyond scien-
tific impact can balanced research evaluation be con-
ducted frequently enough to provide the desired in-
centives within the scientific system.
Until now, research funding agencies have often
demanded detailed reporting on the dissemination and
exploitation of results. In German federal research,
exploitation plans are required as text documents for
proposals and reports [99]. Proposals for Horizon 2020
include plans for dissemination and exploitation ([100]
p. 17), but the need to improve digital data assessment
for evaluation purposes is also emphasised ([101] p.
47). However, texts with societal impact descriptions
cannot be analysed with ease, and the facilities they
offer in terms of filtering and cross-referencing are also
poor, so they have little value for research evaluation or
for the sharing of the information within the scientific
system. Likewise, digital systems are only valuable if they allow data to be re-used multiple times.
4. Improve Data Availability for Evaluation
beyond Scientific Impact
To improve the availability of data for societal impact
evaluation, we recommend uniting the interests of
institutions and funders in such data and giving them
more leverage by making use of the current state of
interoperability in e-infrastructures, especially research
information systems and publication metadata.
Interoperability, in general, enables the exchange,
aggregation and use of information for electronic data
processing between different systems. Its functionality
depends on system structures and exchange formats
(entities and attributes), federated identifiers (for
persons, institutions, projects, publications and other
objects) and shared (or even mapped) vocabularies
and semantics [102]. Thus, interoperability involves not only technical aspects but also cooperation to reach agreement.
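To illustrate what such structures, federated identifiers and shared vocabularies amount to in practice, a deliberately simplified record might look like the Python sketch below. All field names and identifier values here are hypothetical; they do not follow the actual CERIF or CASRAI specifications.

# Hypothetical, simplified research-information record; the field names are
# illustrative only and do not follow the CERIF or CASRAI specifications.
record = {
    "project": {"id": "funder:2815-EXAMPLE", "title": "Practice Impact II"},
    "person": {"orcid": "0000-0000-0000-0000", "affiliation": "org:example-university"},
    "output": {
        "doi": "10.12924/of2015.01010003",   # federated publication identifier
        "type": "journal-article",            # term from a shared vocabulary
    },
    "societal_interaction": {
        "type": "advisory-event-for-practitioners",  # hypothetical vocabulary term
        "audience": "agricultural advisors",
    },
}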
The interests of institutions and funders in societal
impact data may be served by the possibilities of
Current Research Information Systems (CRIS). These
are used increasingly by research institutions as a tool
to manage, provide access to and disseminate re-
search information. Standardisation of CRIS aims to
enable automated data input, e.g. via connection to
publication databases, and to ensure that data need to be entered manually only once but can be re-used
many times (e.g. for automated CVs, bibliographies,
project participation lists, institutional web page gen-
eration, etc.) [103]. Standardisation is promoted by
euroCRIS via the CERIF standard (Common European
Research Information Format) [103] and CASRAI
(Consortia Advancing Standards in Research Admin-
istration Information) via the development of data
profiles and semantics [104], and is embedded in
diverse collaborations with initiatives related to inter-
operability and open access [105].
The CERIF standard is particularly well suited to
enabling interoperability between research institutions
and funders, because research outputs can be as-
signed to projects, persons and organisational units.
In the UK, interface management between the re-
search councils and higher education institutions is
already established, and societal outputs and impacts
are part of the data assessment [106,107]. The aim is
to develop these systems further by applying the cur-
rent CERIF standard in order to increase interoper-
ability with institutional CRIS. It has been shown that
output and impact types used in the UK can be
implemented in the current CERIF standard [108].
Accordingly, research funders should engage in the
development and use of CERIF-CRIS that (a) include
data related to interactions with, and benefit for,
practice and society, and (b) partly replace written
documents in the process of application and reporting.
They should (c) act as data providers by making data
available, e.g. via interface management with re-
search institutions, file transfer for individual scientists
and re-use of data for subsequent proposals and re-
ports. Thus, funders can contribute to the provision of
comprehensive societal impact data without increasing
the documentation effort for scientists. In doing so,
they also help to corroborate and ensure the quality
of such data.
To facilitate these aims, several measures can be
applied. Regarding (a), it is necessary to develop
shared vocabularies for societal impact related to out-
puts and outcomes. Compiling societal impact data
(based on existing evaluation concepts and docu-
mentation tools) and structuring them in coherence
with CRIS standards (e.g. CERIF, CASRAI) is one task
in the project 'Practice Impact II' [109]. Furthermore,
funders, researchers and their associations that are
interested in societal impact could formulate a man-
date to CASRAI and euroCRIS to further develop
shared vocabularies for types and attributes of output,
outcome and impact towards society and stay in-
volved in this process. Such a commitment would also
facilitate the integration of societal impact data in their
CRIS by different providers, and this would create a
base for data transfer between funders and institutions
with regard to (c).
Regarding (b), it is necessary to build a closer con-
nection between those data and the documentation
requirements in proposals and reports. The above-
mentioned research project, "Practice Impact II", is
developing this with a focus on German federal re-
search in the realm of organic and sustainable agri-
culture. The project integrates the user perspectives of
scientists, research funding agencies and evaluators in
its development and testing [109,110], in order to
achieve the required usability and reduction in effort,
with regard to (c), above.
Figure 2. Possibilities for using and developing Current Research Information Systems (CRIS) for inter-
operable data transfer between funders and institutions to assess and use societal impact data without
additional effort.
Regarding (c), there are further possibilities besides
the interoperability between funders and institutions.
CRIS, with their function as repositories, are also tools
for presenting research results to the public. Research
funders could use them to support open access
dissemination tailored to specific target groups within
and beyond academia. Furthermore, closer con-
nections between societal impact data and scientific
publications might be established.
For bibliographic metadata of publications, such as
author, title and year, interoperability has already been
developed further than it has for other research out-
puts. Common vocabularies for publication types, ad-
vancement of standards and mapping between dif-