Author Archives: Natasha Simons

About Natasha Simons

Natasha Simons is a Research Data Management Specialist with the Australian National Data Service. Located at Griffith University in Brisbane, Natasha serves on the Council of Australian University Librarians Research Advisory Committee and is an ORCID Ambassador. She is an author and reviewer of papers related to library and information management and co-authored a 2013 book on digital repositories. Natasha was the Senior Project Manager for the Griffith Research Hub, which won awards from Stanford University and VALA. She is an advocate for open data, open repositories and ORCID.

Citizen Science: changing the shape of scholarly communication

The rise of citizen science is one of the trends changing scholarly communication today. Fuelled by new digital technologies and an online world, and given impetus by the open government movement, everyday Joes and Josephines across the globe are participating in large-scale science projects from the safety of their own backyards. They are observing nature, collecting samples, taking photographs and videos, measuring things, analysing and computing data, and then contributing these to a myriad of science project websites. It’s research, Jim, but not as we know it.

For decades, the general public has had a stake in research: as taxpayers whose governments fund research; as potential beneficiaries of research outcomes; and as citizens who want to see research benefit society as a whole. But direct involvement of the public in research projects has generally been limited to volunteer participants in research studies – and that has changed. Through citizen science, amateur or nonprofessional scientists are involved in conducting scientific research, and there has been an explosion in the number and variety of these projects over recent years. One of the best known is probably GalaxyZoo, but you could check out this list of the top 13 CS projects for 2013 or look at one of my favourites, Redmap. Even NASA is in on the act. According to NASA, “Citizen Scientists have helped to answer serious scientific questions, provide vital data to the astronomical community, and have discovered thousands of objects including nebulas, supernovas, and gamma ray bursts”. NASA also involves the (mostly technical section of the) general public in hackathons such as the 2015 International Space Apps Challenge, which involved over 13,000 participants in over 130 locations.

Involving the general public in research through helping collect and analyse data and generate apps using the data is a smart way of furthering research and engaging the public (who in turn are likely to be happier funding research projects). The benefits are outlined nicely in many resources such as this briefing paper from the Digital Curation Centre and this edition of the ABC Science Show. Of course there are risks in any citizen science project, such as data reliability, but there are also ways to mitigate the risks (e.g. with scientists checking and verifying the data) and perhaps a level of acceptable risk in light of overwhelming benefits.

The sum of all this is that if everyday Joes and Josephines are more involved in research, then they are more likely to be an audience for the outcomes of research – and a very important one if we are to ensure the continuing benefits of citizen science. But how well is research communicated to the general public? Consider some of the challenges:

  • Researchers are focused on conducting research and engaging with other researchers – is it really their job to communicate research outcomes to the general public?
  • Will the public understand scholarly content in its current form (e.g. an academic journal article) or is a layperson’s summary required?
  • How frustrated will the public be when they discover so much of the research they have funded, or been involved in, remains locked behind scholarly paywalls?
  • Should governments, institutions and researchers be looking to really harness the power of social media to engage the public in publicly funded research?
  • Can a model like The Conversation serve to communicate research to the general public or are there other models and mediums?

Expanding on the last point, consider the rise of The Conversation (TC). TC was initiated because academics were dissatisfied with the way the media published their stories. The idea behind TC was to build a new media model that pairs an editor with a researcher and lets them collaborate on a story. Lisa Watts, whom I heard talk at the Digital Science Showcase in Melbourne earlier this year, noted that TC has more than 2.5 million readers, with branches in the USA, UK, Australia and Africa. With readability pitched at the level of a 16-year-old and an open access model with Creative Commons licensing, perhaps this is the perfect medium for communicating research to the general public. Indeed, the audience for TC is primarily non-academic (80%), encompassing policy makers, government employees, and teachers.

Scholarly communication and citizen science was a hot topic at this year’s Force2015 Research Communications and eScholarship Conference at the University of Oxford. Let’s make it one at The Sydney Conference as well. I’ll leave you with a parting thought: can citizen science be applied to research in the arts and humanities? My guess is that it can, and most likely is, so let’s talk about that too.

Bring your ideas to #Thead5 Dive in and out of communications (multi dimensional)


How do we make data count?

Data generated through the course of research is as valuable an asset as research publications. Access to research data enables the validation and verification of published results, allows the data to be reused in different ways, helps to prevent duplication of research effort, enables expansion on prior research and therefore increases the returns from investment. Yet the quality and quantity of a researcher’s publications continue to provide the key measure of their research productivity. Sharing data, it seems, still does not count for nearly enough.

In recent years there has been a proliferation of policies strongly encouraging, and sometimes even requiring, researchers to share their data for the reasons outlined above. These include policies from governments (e.g. USA, Australia), publishers (e.g. PLOS, Nature), and research funders (e.g. NIH, ARC). These policies are certainly opening up more data, but even more research data remains locked away and therefore undiscoverable. So how do we unlock more data? One way is to figure out how to make data count, so that researchers have more incentives to undertake the extra (and in the main, unfunded) work required to share their data.

A 2013 study by Heather Piwowar and Todd Vision looked into the link between open data and citation counts. They found that the citation benefit intensified over time, with publications from 2004 and 2005 cited 30 per cent more often if their data was freely available. They also found that every 100 papers with open data prompted 150 “data reuse papers” within five years, and that original authors tended to use their data for only two years, while others reused it for up to six years. More studies like this one are needed to demonstrate, and track over time, the link between opening up data and making it count, in this case in the form of citations, which – like it or not – are still the primary measure of research impact.

Counting data citations – whether to gather citation metrics or alternative metrics (altmetrics) – is challenging in and of itself because data is cited very differently to publications. Data can be cited within an article's text rather than in the references section, which means the article must be open access in order for the citation to be discovered. Sometimes the article that referenced the data is cited rather than the data itself, even where the reference applies only to the data. Reference managers don’t tend to recognise datasets and therefore don’t record the Digital Object Identifier (DOI), which creates difficulties since DOIs make it so much easier to track citations. There are also many self-citations, where researchers cite their own data, and so it is difficult to distinguish an article that has cited another person’s data. And there are likely to be differences between how data is cited in the sciences as compared to the humanities.

Fortunately, the California Digital Library, PLOS and DataONE have partnered in an NSF-funded project called Make Data Count. The project will “design and develop metrics that track and measure data use i.e data-level metrics”. The findings promise to be highly valuable and may also shape future recommendations for the way data should be cited in order for it to be counted.

Sharing impact stories of data reuse is perhaps another way to help make data count. A number of organisations around the world that promote better data management have been collecting data reuse stories (e.g. DataONE, ANDS). Some researchers may see these stories as a negative because they show that “someone else might get the scoop on ‘my’ data”. But these stories can also inspire researchers to spend the extra effort to make their data available when they feel they are ready to. The rewards may lie not only in the metrics but in the unexpected ‘buzz’ of seeing ‘your’ data have a longer life and be reused in ways you had not even imagined. Are there other ways that we can help make data count? It’s worth thinking about because “data sharing is good for science, good for you”.

#Thead5 Dive in and out of communications (multi dimensional)