Cybersecurity researchers from Edinburgh Napier University have announced a data set to support ‘cutting-edge’ research into the early detection of ransomware.
Ransomware has become a popular and lucrative method of cyberattack. The newly created ‘NapierOne’ data set can help test and evaluate new detection methods, amid concerns that the data sets previously used in digital forensics have become outdated.
The data set will improve consistency by using standard formats, allowing earlier studies to be replicated. This means it can improve the pace and direction of research into ransomware and could help find solutions to the threats it poses.
NapierOne’s creators also believe it is generic enough to support many other fields of research that require a varied mix of common files.
In a new paper published in Forensic Science International: Digital Investigation, Edinburgh Napier PhD research student Simon Davies and senior computing academics Professor Bill Buchanan and Associate Professor Rich Macfarlane detail the creation of NapierOne as a complement to the decades-old Govdocs1 data set.
Davies commented: “It is hoped that the adoption of the NapierOne data set into the implementation, development and testing lifecycles of new ransomware detection techniques will streamline and accelerate the development of more robust and effective detection techniques, allowing independent researchers to reproduce and validate proposed detection methods quickly.”
The best-known publicly available data set, Govdocs1, was originally designed to make forensic research reproducible, but doubts have now emerged about how well it holds up under modern scrutiny, with some increasingly popular file types not being well represented.
Additionally, where useful data sets have not been available, researchers have often developed their own and have not distributed them once their work was complete.
In their paper, the researchers identified popular file formats for inclusion as they set about creating a data set containing more than 500,000 unique files distributed across 100 separate data sets and subsets.
The research describes how specific file types were selected, how examples were sourced and how researchers can gain free, unlimited access to the data.
NapierOne is being seen as a “starting point” for an ongoing project which will grow and develop as other researchers provide additional data sets that can be incorporated into it.
Professor Macfarlane said: “Ransomware has been around for many years – encrypting and deleting users’ files and demanding a ransom from the victim. It has become increasingly common, and its sophistication has increased significantly, leading to it currently being the biggest cybersecurity problem globally.
“This work aims to provide a research data set allowing scientific rigour in research towards fighting the ransomware problem. The data set has been created and successfully used in our ransomware detection research.
“Containing over half a million unique files representing real world file types, it is broad and diverse enough to be used in a range of cybersecurity and forensic research areas. We hope the data set will have the same global research impact as the Govdocs1 work.”
Professor Buchanan added: “There are few areas of cybersecurity that need more of a scientific base than in digital investigations, and thus there exists a need to make sure investigators have appropriate tools that have been verified and properly evaluated.
“This data set provides a foundation for researchers to prove their new methods, and thus further support innovation in the area.
“The UK is becoming an international leader in the field of safe technology – which involves the development of tools to support digital investigations and threat detection – and this research showcases the development of a strong scientific base.”