Deanna Thomas
15.11.2022
Cleaning up a dirty Git history
Sometimes we commit things and don’t notice the consequences. If we’re lucky, we recognize what we’ve done and have the chance to undo it right away. Other times, we can go years without noticing our mistakes and it becomes more and more difficult to clean up the skeletons in our closet. In case I wasn’t clear, I’m talking about Git.
Why would you want to rewrite history in the first place?
Prepare for trouble
It’s [unwanted data] clobberin’ time
- filter-branch lets you rewrite Git revision history by rewriting the branches mentioned in the <rev-list options>, applying custom filters on each revision. Those filters can modify each tree (e.g. removing a file or running a perl rewrite on all files) or information about each commit. Otherwise, all information (including original commit times or merge information) will be preserved.
- This gets us every file within the directory, one at a time, in the entry variable
- Force the rewrite to run. From the docs, “git filter-branch refuses to start with an existing temporary directory or when there are already refs starting with refs/original/, unless forced… The original refs, if different from the rewritten ones, will be stored in the namespace refs/original/.”
- This is the filter to rewrite the index file of the Git repository. There are other options, like --tree-filter, but that one actually checks out each revision, which takes substantially more time to run
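To make the pieces above concrete, here is a minimal sketch of how they could fit together. It is not necessarily the exact command this post was built around: the throwaway repository, the tests/data directory and the file names are all made up for illustration, and only the loop variable entry, the force flag and --index-filter come from the notes above.

```shell
#!/bin/sh
set -e

# Build a throwaway repository so the rewrite has something to act on.
# (The tests/data directory and every file name here are hypothetical.)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p tests/data
echo "large fixture" > tests/data/a.bin
echo "another fixture" > tests/data/b.bin
echo "keep me" > README
git add .
git commit -q -m "add project files"

# Newer Git versions print a deprecation notice and pause before running
# filter-branch unless this variable is set.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Each tracked file under the directory lands in the entry variable in turn.
for entry in $(git ls-files tests/data); do
  # -f: run even though refs/original/ exists from the previous iteration
  # --index-filter: rewrite the index only, without checking out each revision
  git filter-branch -f --index-filter \
    "git rm --cached --ignore-unmatch -q '$entry'" HEAD
done

# List every path still reachable from HEAD. The pre-rewrite commits are
# kept as a backup under refs/original/, so we look at HEAD only.
remaining=$(git log HEAD --pretty=format: --name-only | sed '/^$/d' | sort -u)
echo "$remaining"
```

Note that the loop rewrites the whole history once per file, which is part of why this approach gets slow on a large repository.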
Movin’ on up
- This basically says to “keep all files which do not match the following path”. Instead of running it one file at a time like in the above command, we can filter an entire path at once.
- This is the path we want to remove. If there are multiple paths, each one can be added to the command (e.g. --path tests/path1 --path tests/path2 etc.)
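As a rough sketch, the two options above could be combined like this with git filter-repo, which is a separate tool that has to be installed alongside Git. The demo repository and its contents are invented for illustration. The --force flag is only there because filter-repo refuses to rewrite anything that does not look like a fresh clone, so on a real project you would more likely run it in a fresh clone and drop the flag.

```shell
#!/bin/sh
set -e

# Build a throwaway repository (all names below are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p tests/path1 tests/path2
echo "fixture one" > tests/path1/a.bin
echo "fixture two" > tests/path2/b.bin
echo "keep me" > README
git add .
git commit -q -m "add project files"

# git filter-repo is not bundled with Git, so check for it first.
if git filter-repo --version >/dev/null 2>&1; then
  # --invert-paths: keep everything that does NOT match the given paths
  # --path: a path to match; repeat the option once per path
  # --force: needed here only because this demo repo is not a fresh clone
  git filter-repo --force --invert-paths \
    --path tests/path1 --path tests/path2
else
  echo "git filter-repo is not installed; skipping the rewrite" >&2
fi

# List every path still reachable from HEAD after the rewrite.
remaining=$(git log HEAD --pretty=format: --name-only | sed '/^$/d' | sort -u)
echo "$remaining"
```

Unlike the per-file loop, this makes a single pass over the history no matter how many paths are listed.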