Create Versioned Backups with Git


This article previously appeared in FOSSlife.

The Eclipse development environment includes JGit, a Git implementation written in Java. Unlike the conventional Git client, JGit also handles the Amazon Simple Storage Service (S3) protocol. If you want to store your repository somewhere other than, or in addition to, a regular Git server, you can use JGit to push directly to an S3-compatible object store.

Distributing Local Git

For version control, Git normally creates the repository in the .git subdirectory of the folder being versioned. That layout does not suit the backup scenario here, because the history would sit on the same disk as the data it is meant to protect. The key to success is the --separate-git-dir parameter when creating the repository, which tells Git to place the object store for versioning on a different path instead of inside the working directory. In this case, .git is not a directory but a small text file that points to the external directory. Because it is a plain file rather than a symbolic link, it works independently of the operating system and filesystem.
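To make the pointer file concrete, here is a minimal Python sketch that finds the directory holding the object store for a working tree. It relies on the pointer file consisting of a single line of the form `gitdir: <path>`; the function name `resolve_git_dir` is a hypothetical helper for illustration, not part of Git.

```python
from pathlib import Path


def resolve_git_dir(worktree: Path) -> Path:
    """Return the directory that holds the Git data for a working tree."""
    dot_git = worktree / ".git"
    if dot_git.is_dir():
        # Conventional layout: .git is the repository directory itself.
        return dot_git
    # Separate-git-dir layout: .git is a text file "gitdir: <path>".
    text = dot_git.read_text(encoding="utf-8").strip()
    prefix = "gitdir:"
    if not text.startswith(prefix):
        raise ValueError(f"{dot_git} is not a gitdir pointer file")
    target = Path(text[len(prefix):].strip())
    # A relative pointer is resolved against the working tree.
    return target if target.is_absolute() else worktree / target
```

Opening the .git file in a text editor shows the same information; the sketch only automates that lookup, for example to check where the history lives before unplugging a drive.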

In this example, you create a Git repository in the user's regular Documents directory and use a USB drive with drive letter H: for the object store:

mkdir h:\git_backup
cd %HOMEPATH%\Documents
git init --separate-git-dir=h:\git_backup
git add -A

If you have not used Git on the system before, the tool will ask you to set global values such as your username and email address; then, it creates the object directory on the USB drive. Depending on the volume of data in the document directory, this step can take some time. In this setup, Git writes its data at a rate of about 1.2 GB per minute. The speed also depends on whether Git needs to write small or large files. During the write process you will see a number of warnings, such as:

warning: LF will be replaced by CRLF in <filename>

This message refers to the old dilemma that *nix systems mark the end of a line in a text file (LF, line feed) differently from Windows (CRLF, carriage return plus line feed). Because Git works independently of the operating system, it saves text files in the object store in *nix format but leaves the original file unchanged. This automatic conversion should not bother you unless you restore a text file stored in Git without the Git tool. For example, if you download a text file from your repository over HTTP from a Git server with a web UI, no conversion back to CRLF takes place.
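If you do end up with such a file downloaded outside of Git, the line endings can be converted by hand. The following Python sketch illustrates the two directions; it is a simplification, because Git's real conversion depends on its configuration and on whether it considers a file to be text. The function names are hypothetical.

```python
def to_repository_form(data: bytes) -> bytes:
    """Normalize CRLF line endings to LF, the form stored in the repository."""
    return data.replace(b"\r\n", b"\n")


def to_windows_form(data: bytes) -> bytes:
    """Convert text to CRLF line endings, as expected by Windows tools."""
    # Normalize first so that existing CRLF pairs are not doubled.
    return to_repository_form(data).replace(b"\n", b"\r\n")
```

Both functions work on bytes rather than strings so that the file's encoding is left untouched. They should only be applied to text files, never to binary data.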

Now that Git has copied all the data to the local repository, you can create a snapshot of the current state with a commit:

git commit -m "Repository created"

Now, if you make changes to files, Git tracks them and saves them for the next commit. Each commit is given a unique ID and, for clarity, a message that you pass in with the -m parameter, in this case Repository created. Later, you can see in the Git history exactly which files changed in which commit. However, if you add new files to the directory, Git does not automatically include them in the repository; you need to run git add -A again before committing. To automate the process, create a suitable batch file:

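rem Build a commit name of the form YYYYMMDD-HHMM from %date% and %time%.
rem The offsets assume a DD.MM.YYYY date and an HH:MM:SS,cc time format.
set commitname=%date:~-4%%date:~-7,2%%date:~-10,2%-%time:~-11,2%%time:~-8,2%
set gitdir=%HOMEPATH%\Documents
cd %gitdir%
rem Stage new, changed, and deleted files, then take the snapshot.
git add -A
git commit -m "%commitname%"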
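The %date% and %time% substrings in the batch file depend on the regional settings of the Windows installation. As a hedged alternative, the same steps can be sketched in Python, where the timestamp format does not depend on the locale. The function names are hypothetical, and the sketch assumes that git is on the PATH and that the working tree has already been initialized as described above.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def commit_name(now: datetime) -> str:
    """Build a sortable commit name of the form YYYYMMDD-HHMM."""
    return now.strftime("%Y%m%d-%H%M")


def has_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` reported at least one entry."""
    return any(line.strip() for line in porcelain_output.splitlines())


def backup(worktree: Path) -> bool:
    """Stage everything and commit; return False if nothing changed."""
    git = ["git", "-C", str(worktree)]
    subprocess.run(git + ["add", "-A"], check=True)
    status = subprocess.run(
        git + ["status", "--porcelain"],
        check=True, capture_output=True, text=True,
    )
    if not has_changes(status.stdout):
        # git commit exits with an error when there is nothing to commit.
        return False
    subprocess.run(
        git + ["commit", "-m", commit_name(datetime.now())], check=True
    )
    return True
```

Calling `backup(Path.home() / "Documents")` would then correspond to one run of the batch file. Checking the status first avoids a failed commit when nothing has changed since the last snapshot.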

Now you can create a shortcut on the desktop and trigger the backup at the push of a button. Alternatively, create an automatic task that performs the backup regularly (e.g., every hour), but keep in mind that Git, like any other program, cannot include open files in the snapshot. Of course, the script can be refined, for example, to check whether the USB target disk is connected to the system before triggering the backup, or to start an upload to the server once a day.
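The check for the USB target disk could look roughly like the following Python sketch. It treats the object store as available if the directory contains a HEAD file and an objects subdirectory, which are both part of every Git directory. The function name is hypothetical, and the path h:\git_backup in the usage note is the one from the example above.

```python
from pathlib import Path


def object_store_available(git_dir: Path) -> bool:
    """True if git_dir is reachable and looks like a Git directory."""
    return (git_dir / "HEAD").is_file() and (git_dir / "objects").is_dir()
```

A backup script would call `object_store_available(Path(r"h:\git_backup"))` first and skip the commit, perhaps with a message to the user, if the result is False.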

Now, to restore a single file to a previous state after an accidental change, use git restore as in Listing 1.

Listing 1: git restore

type mysql_rev1.json
Unfortunately overwritten
git log mysql_rev1.json
commit 780...
Author: Andreas ....
      2200404-1300
...
git restore --source 780... mysql_rev1.json
type mysql_rev1.json
{
    "__inputs": [
      {
         "name": "MySQL",
         "label": "MySQL",
         "description": "MySQL Data Source",
...
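Because the batch file uses a timestamp as the commit message, a restore can also be driven by that timestamp instead of by the commit ID. The Python sketch below assumes the log was produced with `git log --format=%H%x09%s -- <file>`, which prints one line per commit with the full hash, a tab, and the subject. The function names are hypothetical helpers.

```python
from typing import List, Optional


def find_commit(log_output: str, subject: str) -> Optional[str]:
    """Return the hash of the first commit whose subject equals `subject`.

    Expects one "<hash><TAB><subject>" line per commit.
    """
    for line in log_output.splitlines():
        commit_hash, sep, commit_subject = line.partition("\t")
        if sep and commit_subject == subject:
            return commit_hash
    return None


def restore_command(commit_hash: str, path: str) -> List[str]:
    """Argument list that restores one file from the given commit."""
    return ["git", "restore", "--source", commit_hash, "--", path]
```

The returned list can be passed to `subprocess.run` from inside the working tree. The `--` separator keeps Git from mistaking an unusual file name for an option.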

Git Server Selection

One of the most popular self-hosted Git servers is GitLab, which of course also runs the service at gitlab.com. However, the massive GitLab, written in Ruby, makes considerable demands on the server hardware. In return, GitLab delivers a wide range of functions, such as a wiki, a bug tracker, and an integrated container image repository. If you only use a Git server as a backup target, you don’t need all of these features and are better off with a simple, but agile, Git server like Gitea, which supports several database back ends and runs in containers with low resource requirements (Figure 3). For administrators with a predominantly Windows background, the free Bonobo Git Server, written in .NET, integrates directly with Internet Information Services (IIS).
