GitHub took a snapshot of all active public repositories on its platform. About 21TB of data is stored on film reels, which are expected to last for 1,000 years. (Image Credit: GitHub)


Every active public repository on GitHub as of February 2, 2020, has been stored away in an Arctic vault, where it is expected to last 1,000 years. The code-hosting platform first revealed the plan last year as part of the GitHub Archive Program. The pandemic set the deposit back, but GitHub has announced that the code was placed in a decommissioned coal mine on July 8.


The code-hosting platform took a snapshot of all active public repositories on February 2. GitHub states that 21TB of repository data, covering active repositories and significant dormant ones, has been written onto 186 reels of piqlFilm, a digital photosensitive archival film. These reels, expected to last 1,000 years, were transported to Svalbard, Norway, and deposited in the Arctic Code Vault on July 8. The reels sit in a chamber beneath hundreds of meters of permafrost. In the future, GitHub plans to laser-engrave all active repositories onto quartz glass platters, which are expected to last 10,000 years.
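
For a rough sense of scale, the two figures above (21TB across 186 reels) can be turned into an average per reel. This is only back-of-the-envelope arithmetic: it assumes decimal units (1 TB = 1,000 GB) and an even spread across reels, neither of which GitHub has confirmed, and the function name is made up for illustration.

```python
# Figures quoted in the article.
TOTAL_TB = 21
REELS = 186


def average_gb_per_reel(total_tb=TOTAL_TB, reels=REELS):
    """Average data per reel in GB, assuming 1 TB = 1,000 GB (an assumption)."""
    return total_tb * 1000 / reels


if __name__ == "__main__":
    print(f"{average_gb_per_reel():.1f} GB per reel on average")
```

That works out to a little under 113 GB per reel, though the real reels almost certainly vary.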


The snapshot includes the HEAD of the default branch of every repository, minus binaries larger than 100 KB. Every repository is packaged as a single TAR file, and most of the data is stored as QR codes to keep it space-efficient. GitHub also engraved the reels with a guide that explains the principles of software. Ahead of the context for each project, GitHub added a disclaimer that lists the specifications each project needs in order to work.
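
To make the packaging step concrete, here is a minimal Python sketch of the general idea: bundle a checked-out repository into a TAR file while leaving out large binaries. This is not GitHub's actual tooling. The function names, the NUL-byte test for "binary", and the reading of "100 KB" as 100 × 1024 bytes are all assumptions made for illustration.

```python
import os
import tarfile

# Assumed reading of the article's "100 KB" limit.
SIZE_LIMIT = 100 * 1024


def looks_binary(path, probe=8192):
    """Heuristic (an assumption): a file is binary if its first bytes contain NUL."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe)


def snapshot_to_tar(src_dir, tar_path, size_limit=SIZE_LIMIT):
    """Pack src_dir into an uncompressed TAR, skipping .git, symlinks,
    and binary files larger than size_limit. Returns the skipped paths."""
    skipped = []
    with tarfile.open(tar_path, "w") as tar:
        for root, dirs, files in os.walk(src_dir):
            # Only the working tree is archived, not the repository history.
            dirs[:] = sorted(d for d in dirs if d != ".git")
            for name in sorted(files):
                full = os.path.join(root, name)
                if os.path.islink(full):
                    continue
                rel = os.path.relpath(full, src_dir)
                if os.path.getsize(full) > size_limit and looks_binary(full):
                    skipped.append(rel)
                    continue
                tar.add(full, arcname=rel)
    return skipped
```

Calling `snapshot_to_tar("my-repo", "my-repo.tar")` on a local checkout would write the archive and return the list of files that were left out. Large text files are kept, since the article only mentions binaries being excluded.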


GitHub is also rolling out a special badge in the highlights section of developers' profiles. It is aimed mainly at those whose projects were included in the Arctic vault. Hovering over the badge displays their contributions.


So, why exactly is GitHub doing this? Even as software development progresses rapidly, GitHub believes that technologies lost along the way could still have benefited the world. For example, abandoned tech such as Roman concrete and the anti-malarial DFDT later found unexpected new uses. The archive program also doesn't exist solely for distant generations; it is meant as a safeguard against futures that can't currently be foreseen.


Although the code is stored there, how would anyone digitize it back from the film to use it again? It's like storing floppy disks but not the drives. Even if you have the drives, will connectors exist in 500 years to go from IDE to whatever is standard by then? I suppose people can always transcribe all that code by hand.



Have a story tip? Message me at: cabe(at)element14(dot)com