Julian Tonti-Filippini, Cameron Neylon, Kathryn Napier


Research Impact Evaluation System

COKI team member Julian Tonti-Filippini led the construction of a pilot Research Impact Evaluation System (RIES) demonstrating the feasibility of conducting an on-demand, ERA-like analysis for research-active institutions (locally and globally), using journal-level metadata from the Australian Research Council and article-level metadata from publicly available datasets.

Excellence in Research for Australia (ERA) is a periodic assessment conducted by the Australian Research Council (ARC). It covers the activity of 42 Australian higher education providers (HEPs) across 236 ANZSRC fields of research (FoR). Performance is assessed per HEP and per FoR by comparing research outputs to local and world benchmarks. The analysis is citation-focused and draws on publication metadata provided by the participating HEPs.

An ERA report is usually compiled for release every three to five years, applying citation-focused methodology to research output data self-reported by the participating institutions. However, on August 26 this year, Education Minister Jason Clare put ERA 2023 on hold because of the significant reporting burden the process imposes on the sector. There has long been interest in automating parts of this process to reduce that burden, and a 2021 consultation on the ERA process also noted interest in greater transparency about how benchmarks and performance measures are constructed.

The Curtin Open Knowledge Initiative (COKI) aggregates bibliometric and bibliographic data from publicly available sources such as Crossref, Unpaywall, OpenCitations, Microsoft Academic Graph, and OpenAlex. The resultant BigQuery database contains metadata for over 120 million research publications and forms the foundation for further analysis by the COKI team.

We developed RIES to demonstrate how the COKI database may be used to run an ERA-like analysis. The methodology is guided by published ERA methods and makes use of journal-level metadata from the ERA 2023 Journal List. The workflows can be extended to include any institution with a ROR identifier and any research-topic vocabulary that has been assigned to research articles (e.g., via machine-learning classifiers). This flexibility, combined with the on-demand capabilities of the system, will help us to model and test the approaches proposed for new national research assessments.
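To make the idea of comparing outputs against a benchmark concrete, here is a minimal sketch of one common citation-focused measure: each paper's citation count divided by the average for all papers in the same field and year, then averaged per institution and field. This is an illustration only and not the RIES implementation; the records, the placeholder institution identifiers, and the function names are all hypothetical, and the "world" benchmark here is simply the mean over the toy dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (institution id, FoR code, publication year, citations).
# The institution ids are placeholders, not real ROR identifiers.
papers = [
    ("inst-A", "4905", 2020, 10),
    ("inst-A", "4905", 2020, 2),
    ("inst-B", "4905", 2020, 4),
    ("inst-B", "4905", 2020, 0),
    ("inst-B", "5101", 2020, 6),
    ("inst-A", "5101", 2020, 3),
]


def field_year_benchmarks(records):
    """Mean citations per paper for each (field, year) across all records."""
    groups = defaultdict(list)
    for _inst, field, year, cites in records:
        groups[(field, year)].append(cites)
    return {key: mean(values) for key, values in groups.items()}


def relative_citation_impact(records):
    """Average of (paper citations / field-year benchmark) per (institution, field)."""
    benchmarks = field_year_benchmarks(records)
    ratios = defaultdict(list)
    for inst, field, year, cites in records:
        benchmark = benchmarks[(field, year)]
        if benchmark > 0:  # skip field-years in which no paper has been cited
            ratios[(inst, field)].append(cites / benchmark)
    return {key: mean(values) for key, values in ratios.items()}


rci = relative_citation_impact(papers)
for (inst, field), value in sorted(rci.items()):
    print(f"{inst} FoR {field}: {value:.2f}")
```

A value above 1 means the institution's papers in that field are cited more than the benchmark average, and a value below 1 means less. In a real analysis the benchmark would be computed over a far larger publication set than the institutions being assessed.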

Figure 1: An interactive 3D network plot assembled from computed FoR co-assignment data. Fields are colour-coded by theme: physics & chemistry (cyan), mathematics & engineering (orange), earth & biology (dark green), health & human biology (light green), art & design (magenta), finance & economics (yellow), law & philosophy (grey), and culture & society (red). Individual nodes may be selected (e.g., 4905 – Statistics).

Figure 2: For each field of research (colour-coded by theme), an interactive time-trace plot shows the number of papers published in 2020 versus the number of citations accrued to date. Time-trace lines show the migration of three selected data points between 2000 and 2020.

The code is publicly available in the project’s GitHub repository, and a subset of the COKI dataset is available via Google Cloud Storage so that interested users can explore the code and run the workflows in full on that subset.

The full COKI dataset is recompiled weekly by the Academic Observatory Workflows, running on the Academic Observatory Platform. The underlying infrastructure requires significant resourcing and we do not currently make the data resource freely available (whereas the codebases are FOSS).

To sustain the development and continuation of this project, our medium-term goal is to establish an institutional membership model. We are seeking expressions of interest from institutions that would like to work together to build a community-managed system that is responsive to our sector’s needs. Such a system could be used for planning, for testing new evaluation systems, and for tracking how Australian HEPs perform against other institutions, with a focus on Open Access publication. The report of the ERA Transition Working Group will provide some clear directions for future development and needs.

We are holding a webinar early in 2023 to drill down further into the detail of what RIES can do. If you are interested, sign up here! We are keen to discuss possible collaboration opportunities, analysis services or access models with interested individuals and institutions.