UpdateSWH: check and update archival of a repository
This handy browser extension allows you to seamlessly check if a repository that you are browsing is archived and up to date in Software Heritage.
Getting the extension
You can find and install the UpdateSWH extension by clicking on the image of the browser you are interested in among the following.
Understanding the color code
The color-coded button on the right of the browser bar carries information on the archival status and lets you trigger the appropriate action, as follows:
- : the archived copy is up to date, and clicking on the button opens a tab on the corresponding page of the archive (very practical if you want to get a SWHID permalink)
- : the archived copy is not up to date, and clicking on the button triggers a Save Code Now request to update the archive; right-clicking brings you to the page of the last successful archival
- : the archived copy is not up to date, and the last attempt to archive it failed; clicking on the button triggers a Save Code Now request to update the archive, but there may be technical issues that prevent this repository from being archived at the moment; right-clicking brings you to the page of the last archival attempt, which may have failed
- : this repository is not found in the archive (at least not from this origin), and clicking on the button triggers a Save Code Now request to archive it for the first time
- : you have used up your quota of API calls for the moment, either on the SWH API or on the forge API: wait a bit before trying again, or get a token to raise the rate limit (more info below)
- : the API requests to Software Heritage did not succeed. This may happen for several reasons:
  - in most cases, the repository you are visiting is private, and hence cannot be archived
  - nobody is perfect: you may have tripped on a bug in the extension; in that case, enable debug mode in the options panel and report the issue on the extension development repository
The tooltips summarize all of this, so you do not need to memorize the list above; they also provide additional information (e.g. the dates of the last modification on the forge and in the archive).
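The list above amounts to a small lookup from archival status to button behavior. The following is a minimal sketch of that lookup, not the extension's actual code: the status names, the `ButtonBehavior` shape, and the function `buttonBehavior` are all hypothetical, and "none" is used wherever the description above does not mention an action.

```typescript
// Hypothetical status names; the extension's real identifiers may differ.
type ArchiveStatus =
  | "up-to-date"
  | "outdated"
  | "last-save-failed"
  | "not-archived"
  | "rate-limited"
  | "api-error";

// Hypothetical description of what each mouse click does.
interface ButtonBehavior {
  leftClick: "open-archive-page" | "request-save" | "none";
  rightClick: "open-last-visit" | "none";
}

// Maps each status from the list above to the actions it describes.
// "none" means the text above does not describe an action for that click.
function buttonBehavior(status: ArchiveStatus): ButtonBehavior {
  switch (status) {
    case "up-to-date":
      return { leftClick: "open-archive-page", rightClick: "none" };
    case "outdated":
      return { leftClick: "request-save", rightClick: "open-last-visit" };
    case "last-save-failed":
      return { leftClick: "request-save", rightClick: "open-last-visit" };
    case "not-archived":
      return { leftClick: "request-save", rightClick: "none" };
    case "rate-limited":
      return { leftClick: "none", rightClick: "none" };
    case "api-error":
      return { leftClick: "none", rightClick: "none" };
  }
}
```

For example, `buttonBehavior("not-archived").leftClick` evaluates to `"request-save"`, matching the first-time archival case.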
Supported code hosting platforms
The extension supports Bitbucket, GitHub, GitLab.com, and GitLab instances whose domain name is of the form gitlab.*.* out of the box. It also supports Gitea instances whose domain name is of the form gitea.*.*, as well as codeberg.org.
You can add other GitLab instances in the options panel (see below).
Support for other code hosting platforms and technologies may be added in the future.
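To make the domain rules above concrete, here is a minimal sketch of how a hostname could be classified. It is an illustration under stated assumptions, not the extension's implementation: the function `detectForge` and the `Forge` type are hypothetical, each `*` in gitlab.*.* and gitea.*.* is assumed to stand for exactly one dot-free label, and codeberg.org is assumed to be handled like a Gitea instance.

```typescript
// Hypothetical forge identifiers.
type Forge = "bitbucket" | "github" | "gitlab" | "gitea";

// Classifies a hostname according to the rules described above.
// extraGitlab and extraGitea model the instances added in the options panel.
function detectForge(
  hostname: string,
  extraGitlab: string[] = [],
  extraGitea: string[] = []
): Forge | null {
  const host = hostname.toLowerCase();
  if (host === "bitbucket.org") return "bitbucket";
  if (host === "github.com") return "github";
  // Assumption: "gitlab.*.*" means "gitlab." followed by exactly two labels.
  if (
    host === "gitlab.com" ||
    /^gitlab\.[^.]+\.[^.]+$/.test(host) ||
    extraGitlab.indexOf(host) !== -1
  ) {
    return "gitlab";
  }
  // Assumption: codeberg.org is treated like a Gitea instance.
  if (
    host === "codeberg.org" ||
    /^gitea\.[^.]+\.[^.]+$/.test(host) ||
    extraGitea.indexOf(host) !== -1
  ) {
    return "gitea";
  }
  return null; // not a recognized code hosting platform
}
```

Under these assumptions, `detectForge("gitlab.inria.fr")` yields `"gitlab"`, while `detectForge("src.koda.cnrs.fr")` yields `null` until that domain is passed in `extraGitlab`.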
Time delays
When a change takes place in a repository (e.g. a new commit is pushed) or in the Software Heritage archive (e.g. a repository, or a new version of it, has been ingested in the archive), it may take some time for the event to show up in the API. On the Software Heritage side, it is usually a matter of a few minutes, but on some code hosting platforms we have seen delays on the order of hours.
To determine the right color to show, the extension calls the code hosting API to get the date of the last update of a repository, and the Software Heritage API to get the date of the latest archival. The color shown therefore reflects exactly what these APIs expose, which may lag behind the actual status for a little while.
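The comparison described above can be sketched as follows. This is an illustration only: the function `freshness`, its parameter names, and the rule that an archival at or after the last forge update counts as up to date are assumptions, and the timestamps are taken to be ISO 8601 strings as returned by typical REST APIs.

```typescript
// Hypothetical result names for the date comparison.
type Freshness = "up-to-date" | "outdated" | "not-archived";

// Compares the forge's last-update date with the archive's last-archival date.
// lastArchival is null when the archive has no record of the repository.
function freshness(forgeLastUpdate: string, lastArchival: string | null): Freshness {
  if (lastArchival === null) return "not-archived";
  const forgeTime = Date.parse(forgeLastUpdate);
  const archiveTime = Date.parse(lastArchival);
  if (isNaN(forgeTime) || isNaN(archiveTime)) {
    throw new Error("unparseable timestamp");
  }
  // Assumption: an archival at or after the last forge update is up to date.
  return archiveTime >= forgeTime ? "up-to-date" : "outdated";
}
```

Because both dates come from the APIs, a commit pushed a minute ago may not yet appear in `forgeLastUpdate`, in which case this function would still report `"up-to-date"` until the forge API catches up.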
Choosing your options
The options panel can be opened by clicking on the icon in the browser bar (if it is not in the bar, look for it in the list of active extensions).
This allows you to choose whether to enable the following options:
- Show save request: if selected, this tells the extension to open up a new tab to inspect the save request whenever you issue one by clicking on a yellow or grey button (you can access this very same page anyway by clicking on the button when it turns light green).
- SWH Debug mode: if selected, detailed information on what the extension does is logged in the console of the browser (usually accessible via F12). This is useful for debugging; do not turn it on unless you need it.
- GitHub API access token: this text area allows you to paste in a GitHub API access token, useful if you use up the standard GitHub API request quota (which is quite low: only 60 requests per hour for unauthenticated calls). If your button turns orange after a while, this is your case: please follow the GitHub instructions to create a new token (do not grant it any rights, the extension does not need them!) and paste it here.
- SWH API access token: this text area allows you to paste in your SWH API access token, useful if you use up the standard API request quota. If your button turns orange after a while, this is your case: please log in or register on the Software Heritage archive first, then create an access token for your extension and paste it here.
- Additional GitLab instances: add here the domain names of other GitLab instances that you want the extension to recognize. One domain name per line, with no prefix or suffix (e.g.: write “src.koda.cnrs.fr”, not “https://src.koda.cnrs.fr/”, see the picture).
- Additional Gitea instances: add here the domain names of other Gitea instances that you want the extension to recognize. One domain name per line, with no prefix or suffix.
Be careful: if you paste garbage in the text fields, the extension will not work properly! If in doubt, select all the content of the text areas, erase it to get back to normal operations, and then try again.
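To illustrate the "one domain name per line, with no prefix or suffix" rule, here is a minimal sketch of how the content of such a text area could be checked. It is hypothetical: the function `parseDomainList` is not part of the extension (which, as the warning above suggests, may not validate its input at all), and the pattern used for a well-formed domain name is a simplifying assumption.

```typescript
// Hypothetical result type: accepted domain names and rejected lines.
interface ParsedDomains {
  domains: string[];
  rejected: string[];
}

// Splits the text area content into lines and keeps only bare domain names.
function parseDomainList(text: string): ParsedDomains {
  const domains: string[] = [];
  const rejected: string[] = [];
  text.split(/\r?\n/).forEach((raw) => {
    const line = raw.trim();
    if (line === "") return; // ignore blank lines
    // Assumption: a domain is two or more dot-separated labels made of
    // letters, digits and hyphens; "https://" or a trailing "/" fails this.
    if (/^[a-z0-9-]+(\.[a-z0-9-]+)+$/i.test(line)) {
      domains.push(line.toLowerCase());
    } else {
      rejected.push(line);
    }
  });
  return { domains, rejected };
}
```

With this sketch, the line “src.koda.cnrs.fr” is accepted, while “https://src.koda.cnrs.fr/” ends up in `rejected` because of its prefix and trailing slash.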
Why the install warning “This extension requires access to your data on all websites”?
When you install the extension for the first time, you see a warning along these lines: “This extension requires access to your data on all websites”. Let’s explain the meaning of this message and why you should not be scared by it in this particular case.
The updateswh extension needs access to the source code of the website of a software forge for two reasons:
- adding the colored button to the HTML, as a ‘<div>’ element added to the DOM
- monitoring changes to the URL of the webpage when the website manipulates the DOM behind the scenes (e.g. GitHub)
Unfortunately, there is no way to specify that updateswh does only this, so we need to use a match directive with ‘<all_urls>’ in it for the scripts that the extension runs, and this triggers the warning message.
In order to get a less scary message, we could restrict the access only to known code hosting platforms, but this would mean preventing you from using the extension on the platforms of your choice (there are thousands of GitLab instances out there!).
So, we decided to go for the scary message in order to give you more functionality, but you can check in the source code that nothing fishy is going on (we do not use JavaScript minifiers or code obfuscators, and you can read the comments).
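The URL monitoring mentioned above is needed because some sites change the page address without a full reload. A minimal sketch of one way to detect such changes follows; it is not taken from the extension's source, and the names `makeUrlWatcher` and `refreshButton` are hypothetical.

```typescript
// Returns a check() function that reports whether the URL has changed
// since the previous call, and invokes onChange when it has.
function makeUrlWatcher(
  getUrl: () => string,
  onChange: (newUrl: string, oldUrl: string) => void
): () => boolean {
  let lastUrl = getUrl();
  return function check(): boolean {
    const current = getUrl();
    if (current === lastUrl) return false; // nothing changed
    const previous = lastUrl;
    lastUrl = current;
    onChange(current, previous);
    return true;
  };
}

// In a content script, check() could be wired to DOM mutations, e.g.:
//   const check = makeUrlWatcher(() => location.href, refreshButton);
//   new MutationObserver(() => check())
//     .observe(document.body, { childList: true, subtree: true });
// where refreshButton is a hypothetical function that recomputes the
// colored button for the new page.
```

Keeping the comparison in a pure function makes it possible to exercise the logic outside a browser; only the commented wiring depends on the DOM.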
Contributing
This extension is Open Source, distributed under the terms of the MIT license. The source code is available in the development repository. Contributions are welcome.