New Features in the CodeMeta Generator
At Software Heritage, the universal source code archive, metadata is the software’s identity card. It provides information that can be used to identify, describe and curate software. To make it easier for users to discover software projects among the millions we archive, having accurate metadata is crucial.
However, there are many different ways of describing software and capturing this metadata. To ensure uniformity and consistency in descriptive metadata, Software Heritage has adopted the CodeMeta format, as discussed in this article. The CodeMeta vocabulary is used when indexing metadata, and we recommend including a codemeta.json file in every research software repository to make its metadata machine-readable and easily discoverable.
We recognize the immense value of effectively describing software projects, and we are excited to share the latest developments in the CodeMeta vocabulary and the CodeMeta generator tool.
A recap: What is CodeMeta?
There are numerous metadata vocabularies for describing software projects. CodeMeta addresses this complexity by providing a standardized “Rosetta stone” for translating between different vocabularies. The vocabulary is an extension of Schema.org. CodeMeta allows software metadata to be represented in a consistent JSON format, known as codemeta.json.
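To make this concrete, here is a minimal sketch of what a codemeta.json file can contain, built and serialized in Python. The project name, description and repository URL are hypothetical placeholders, and only a handful of common properties are shown; it is an illustration rather than a complete or authoritative record.

```python
import json

# Illustrative record only: the project name, description and URLs are
# hypothetical placeholders, not a real project.
record = {
    "@context": "https://w3id.org/codemeta/3.0",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "description": "A hypothetical command-line tool used for illustration.",
    "codeRepository": "https://example.org/example-tool.git",
    "programmingLanguage": "Python",
    "license": "https://spdx.org/licenses/MIT",
    "version": "1.0.0",
}


def to_codemeta_json(metadata: dict) -> str:
    """Serialize a metadata dictionary as the text of a codemeta.json file."""
    return json.dumps(metadata, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    print(to_codemeta_json(record))
```

The printed text is what would be saved as codemeta.json at the root of a repository.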
From CodeMeta v2.0 to CodeMeta v3.0
The CodeMeta description format is constantly evolving to meet the needs of the research software ecosystem and of scholarly infrastructure users as closely as possible. Version 3.0 was published in 2023, adding the following new vocabulary elements:
- review, which allows you to give review information about the software, namely reviewAspect and reviewBody.
- role, which, associated with an author or a contributor, allows you to define the function (roleName) that this person has held, and for what period of time (startDate and endDate).
- hasSourceCode, which adds a link stating where the source code of a given software application is located.
- isSourceCodeOf, which adds a link stating which software application is built from a given source code. This is the reverse property of hasSourceCode.
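As a rough sketch of how the new review and role elements might appear in a record, the Python snippet below builds a v3.0-style document. The exact nesting is an assumption: a Role entry sits next to the Person it refers to and points at it through a local identifier. The helper name person_with_role, the person and the review text are all hypothetical. Please check the CodeMeta v3.0 documentation before relying on this layout.

```python
import json


def person_with_role(local_id, given_name, family_name, role_name, start, end=None):
    """Return a Person entry plus a Role entry that points at it.

    Assumption: the role is linked to the person through a local
    identifier; this layout is illustrative, not normative.
    """
    person = {
        "@id": local_id,
        "@type": "Person",
        "givenName": given_name,
        "familyName": family_name,
    }
    role = {
        "@type": "Role",
        "schema:author": local_id,
        "roleName": role_name,
        "startDate": start,
    }
    if end is not None:
        role["endDate"] = end  # omitted while the role is still held
    return [person, role]


# Hypothetical project, person and review text.
doc = {
    "@context": "https://w3id.org/codemeta/3.0",
    "@type": "SoftwareSourceCode",
    "name": "example-tool",
    "author": person_with_role(
        "_:author_1", "Ada", "Lovelace", "Maintainer", "2021-01-01"
    ),
    "review": {
        "@type": "Review",
        "reviewAspect": "Documentation",
        "reviewBody": "Installation instructions are clear and complete.",
    },
}

if __name__ == "__main__":
    print(json.dumps(doc, indent=2))
```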
Some properties also changed name for clarification:
- contIntegration became continuousIntegration
- embargoDate became embargoEndDate
Just as there are translation files between different metadata description formats, there is a translation file from format v2.0 to format v3.0.
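For the two renamed properties, the upgrade can be sketched in a few lines of Python. This is a simplified illustration, not the official translation file: the function name upgrade_to_v3 is hypothetical, it only renames top-level keys and switches the @context, and it leaves everything else untouched.

```python
# Property names that changed between CodeMeta v2.0 and v3.0.
RENAMED = {
    "contIntegration": "continuousIntegration",
    "embargoDate": "embargoEndDate",
}

V3_CONTEXT = "https://w3id.org/codemeta/3.0"


def upgrade_to_v3(doc: dict) -> dict:
    """Return a copy of a v2.0 record with renamed keys and a v3.0 context.

    Simplification: only top-level keys are renamed; nested objects are
    copied as they are.
    """
    upgraded = {RENAMED.get(key, key): value for key, value in doc.items()}
    upgraded["@context"] = V3_CONTEXT
    return upgraded


if __name__ == "__main__":
    # Hypothetical v2.0 record used for illustration.
    old = {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": "example-tool",
        "contIntegration": "https://example.org/ci",
        "embargoDate": "2025-01-01",
    }
    print(upgrade_to_v3(old))
```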
You can find an example of CodeMeta v3.0 for the codemeta project.
Features of the CodeMeta generator
Software Heritage maintains a tool that helps users create the corresponding codemeta.json file: the CodeMeta generator. It consists of a simple form that users can fill in to generate a valid file. This file can then be added at the root of the software code repository.
The generator now supports creating codemeta.json files in both v2.0 and v3.0 formats, and the form has been redesigned to include new functionality, such as the Review box and role management.
The form has been reorganized to include the new Review box.
The new Review box
You can now add roles to authors and contributors.
The new Role functionality
Values for the License(s) field are suggested and autocompleted from the SPDX License List.
The SPDX licence list suggestions
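The idea behind this kind of suggestion can be sketched as a case-insensitive prefix match over SPDX identifiers. The snippet below is only an illustration: it uses a small hard-coded sample of identifiers rather than the full SPDX License List, and the function name suggest_licenses is hypothetical. It does not describe how the generator itself is implemented.

```python
# A small sample of SPDX license identifiers; the real list is much longer.
SPDX_IDS = [
    "AGPL-3.0-only",
    "Apache-2.0",
    "BSD-3-Clause",
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "LGPL-3.0-only",
    "MIT",
    "MPL-2.0",
]


def suggest_licenses(prefix, identifiers=SPDX_IDS, limit=5):
    """Return up to `limit` identifiers starting with `prefix`, ignoring case."""
    needle = prefix.strip().lower()
    if not needle:
        return []  # nothing typed yet, so nothing to suggest
    matches = [ident for ident in identifiers if ident.lower().startswith(needle)]
    return matches[:limit]


if __name__ == "__main__":
    print(suggest_licenses("gpl"))
```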
That is not all: you can import an existing codemeta.json file (in v2.0 or v3.0). Paste it into the codemeta.json text area, and the form will be updated with the values found there.
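If you want to check which version an existing file declares before importing it, one possible approach is to look at its @context value. The sketch below assumes the file uses one of the two published context URLs; the function name detect_version is hypothetical, and this is not necessarily how the generator decides.

```python
import json

# Context URLs published for the two CodeMeta versions discussed here.
KNOWN_CONTEXTS = {
    "https://doi.org/10.5063/schema/codemeta-2.0": "2.0",
    "https://w3id.org/codemeta/3.0": "3.0",
}


def detect_version(text):
    """Return "2.0" or "3.0" for codemeta.json text, or None if unrecognized."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        return None
    context = doc.get("@context")
    # A JSON-LD context may be a single value or a list of values.
    candidates = context if isinstance(context, list) else [context]
    for candidate in candidates:
        if isinstance(candidate, str) and candidate in KNOWN_CONTEXTS:
            return KNOWN_CONTEXTS[candidate]
    return None


if __name__ == "__main__":
    sample = '{"@context": "https://w3id.org/codemeta/3.0", "name": "example-tool"}'
    print(detect_version(sample))
```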
Finally, a little cherry on top, you can directly download the file from the tool!
CodeMeta generator actions
Metadata for citation
With the latest advancements in software metadata, you will soon have the ability to cite source code directly from the Software Heritage archive in BibTeX format, making it easier to reference software in academic work.
Get Involved with CodeMeta!
The CodeMeta project is an evolving community-driven initiative. We invite developers, researchers, and enthusiasts to join discussions, suggest changes, or contribute Pull Requests directly to the CodeMeta generator on GitHub. By collaborating, we can continue to improve the tool and advance software metadata practices worldwide.
Stay updated on the latest developments by following our contributions and announcements!