AndreJSON - 3 months ago
Git Question

Cycle through several branches with submodules for read-only purposes

I have an application that needs to take a snapshot of the file structure in a git repo every 15 minutes. The repo is huge, so I don't want to have multiple copies of it. The repo contains several branches that are of interest to me, and each of them contains quite a few submodules.

What I need is a series of git commands that will let me switch to a branch and get the file structure correct so that I can read it, and then switch to another branch and do the same thing.

So far everything I've tried seems to break at some point (sometimes it runs well for days before it breaks). I just can't get the commands right so that they handle any type of remote update to the branches or submodules.

I've tried a lot of command sequences, many of which are probably completely wrong. I'm not very used to dealing with huge and complicated repos.

The one I currently use (gathered from various Stack Overflow threads):

git checkout [branchname]
git fetch
git reset --hard #Feels weird to do but helped at some point
git pull
git submodule update --init --recursive
<take snapshot>
git clean -d -f -f
<restart chain with new branchname>


The error I typically get is that I can't check out a branch because tracked or untracked files have changes.

EDIT:
Clarifying some things:

I make no changes to the repo and so I just want to force a checkout, saving nothing.

There are .gitignore files, but since I'm only pulling and reading, these shouldn't matter, right?

Regarding the submodules: as I said, I'm not too savvy with them, so I can't fully answer the questions you had, but the behaviour I want is the same as what 'git submodule update --init --recursive' would give me when executed in the desired branch of the repo.

Finally, the repo contains hundreds of branches, of which I only want to check out about five to ten.

Answer

This worked for me on a simple example.

Provide the required branches as one string with space-separated branch names.

#!/bin/bash

#set -x  # enable this to see the executed commands, e.g. for debugging

# remember the current branch so it can be restored at the end
ORIG_BRANCH=$(git rev-parse --abbrev-ref HEAD)
BRANCHES=$1  # one string of space-separated branch names

for B in $BRANCHES; do  # unquoted on purpose, so the string is split into names
    git reset --hard  # discard changes to tracked files; nothing is saved
    git clean -dfx # -x also removes files matched by the .gitignore rules
    git checkout "$B"
    git pull

    git submodule update --init --recursive
    git submodule foreach --recursive git reset --hard
    git submodule foreach --recursive git clean -dfx

    echo "Taking snapshot of branch $B"
    #<take snapshot>
done

# restore the original state
git checkout "$ORIG_BRANCH"
git pull
git submodule update --init --recursive
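If the script above still trips over local state, a stricter variant might help. The following is only a sketch under a few assumptions of mine: the remote is called "origin", a detached HEAD is acceptable because the repo is only read, and the names run, refresh_branch and DRY_RUN are made up for illustration. It fetches just the one branch, force-checks out the fetched commit instead of pulling (so there is never a merge), and uses a doubled -f for git clean so that leftover submodule directories from the previous branch are removed as well.

```shell
#!/bin/bash
# Sketch only. Assumes the remote is named "origin"; run, refresh_branch
# and DRY_RUN are illustrative names, not part of git.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bring the working tree to the remote state of one branch, discarding
# every local change. Stops at the first command that fails.
refresh_branch() {
    local branch=$1
    # fetch only this branch; the fetched commit is available as FETCH_HEAD
    run git fetch origin "$branch" &&
    # force the checkout and detach, so no local branch has to be updated
    run git checkout -f --detach FETCH_HEAD &&
    # pick up submodule URL changes from .gitmodules before updating
    run git submodule sync --recursive &&
    run git submodule update --init --recursive --force &&
    # second -f lets clean delete untracked nested git repositories
    run git clean -ffdx &&
    run git submodule foreach --recursive git clean -ffdx
}
```

Inside the loop this would replace the reset/clean/checkout/pull lines, e.g. refresh_branch "$B" || continue, followed by the snapshot step. Running it once with DRY_RUN=1 prints the commands without touching the repo, which is a cheap way to check the sequence first.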

Give it a try and tell me how it goes for your specific case.