How to Remove Big Files and Folders from Git, or How to Fix GitHub’s Maximum File Size Problem

I recently wanted to move my Git repositories from a self-hosted server to GitHub. One of those repositories contained a fair number of external libraries (to be precise, lots of CocoaPods pods). When I tried to push the repository to GitHub, I got warnings like this:

~/D/north-kites-ios (develop↑1|✔) $ git push github develop

remote: warning: File Pods/GoogleMaps/Subspecs/Maps/Frameworks/GoogleMapsCore.framework/Versions/A/GoogleMapsCore is 74.85 MB; this is larger than GitHub's recommended maximum file size of 50.00 MB

error: failed to push some refs to ''

I searched the Internet for a way to remove this big file from the repository, because it does not really need to be there (I can add the library manually after cloning the repository). What I found were guides on how to remove a single big file from a repository. My situation differed in two ways. First, the problem was caused by whole folders rather than by a single file (BTW, GitHub emitted warnings for more folders than the one shown above). Second, I did not know in which revisions the files and folders had been added. A quick search through the git log gave me the right hash for the folder shown above, but I found it too tedious to search the log for every folder rejected by GitHub.
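That search through the log can probably be scripted. Here is a minimal sketch, assuming it is run inside the repository; the function names first_commit_adding and largest_blobs are made up for this example and are not part of git. The first helper prints the oldest commit reachable from HEAD that added something below a path, the second lists the biggest files found anywhere in the history.

```shell
# Hypothetical helpers; run them inside the repository.

# first_commit_adding PATH
# Print the abbreviated hash of the oldest commit reachable from HEAD
# that added a file at or below PATH.
first_commit_adding() {
  git log --diff-filter=A --format=%h --reverse -- "$1" | head -n 1
}

# largest_blobs [COUNT]
# List the COUNT (default 10) largest files in the whole history,
# one per line, as "<size in bytes> <path>".
largest_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort -rn |
    head -n "${1:-10}"
}
```

For example, first_commit_adding Pods/GoogleMaps should print the hash that I looked up by hand, and largest_blobs 20 gives a rough list of candidates before GitHub complains about them.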

So here is the solution to my problem. I started from the following git command, which I found online:

git filter-branch --index-filter 'git rm --cached --ignore-unmatch dumpfile.sql' merge-point..HEAD

This command walks through the git history and removes the given file from each commit in which it occurs (thus rewriting the git history). Only the commits in the range merge-point..HEAD are rewritten, that is, the commits after merge-point; the commit merge-point itself is left untouched. For each folder rejected by GitHub, I replaced merge-point with the hash of the earliest commit in the branch, and the file name with the path of the folder that I wanted to remove completely from the git history. To remove the GoogleMaps pod from my branch, I modified the command like this:

git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch Pods/GoogleMaps' 4122bf43..HEAD

4122bf43 is the hash of the earliest commit that I found in the current branch. The -r parameter to git rm deletes the GoogleMaps folder recursively. The -f option makes git filter-branch run even if backup refs from a previous run still exist under refs/original/, which is the case as soon as the command is run a second time.
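Because a range such as 4122bf43..HEAD leaves the commit 4122bf43 itself untouched, it can be worth checking afterwards whether the folder is really gone from the history. A small sketch of such a check follows; path_in_history is a name I made up, not a git command.

```shell
# path_in_history PATH [REV]
# Hypothetical helper: succeeds (exit status 0) if at least one commit
# reachable from REV (default: HEAD) still touches PATH.
path_in_history() {
  [ -n "$(git rev-list -n 1 "${2:-HEAD}" -- "$1")" ]
}

# Example:
# path_in_history Pods/GoogleMaps && echo "Pods/GoogleMaps is still in the history"
```

If the check still succeeds after filtering, the folder was most likely already present in the starting commit of the range.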

I adapted and executed the command above for each folder rejected by GitHub. Moreover, I had to repeat the rewriting of the Git history for each branch that I wanted to push to GitHub (fortunately, I had cleaned up the old branches in advance and was only left with develop and master).
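The repetition per folder and per branch could be wrapped in a small function. The following is only a sketch and not exactly what I ran: instead of a commit range it passes the branch names to git filter-branch, so the complete history of those branches is rewritten, including the very first commit. The name purge_folder and the folder Pods/OtherPod are made up for the example.

```shell
# purge_folder FOLDER BRANCH...
# Hypothetical helper: remove FOLDER from every commit of the given branches.
purge_folder() {
  purge_target=$1
  shift
  # The folder is handed to the filter through the environment, so it
  # needs no extra quoting inside the filter command.
  # FILTER_BRANCH_SQUELCH_WARNING=1 skips the warning that newer git
  # versions print before filter-branch starts.
  PURGE_TARGET=$purge_target FILTER_BRANCH_SQUELCH_WARNING=1 \
    git filter-branch -f --index-filter \
      'git rm -r -q --cached --ignore-unmatch -- "$PURGE_TARGET"' -- "$@"
}

# Example (Pods/OtherPod stands for any other rejected folder):
# for folder in Pods/GoogleMaps Pods/OtherPod; do
#   purge_folder "$folder" develop master
# done
```

The working tree has to be clean before this runs, otherwise git filter-branch refuses to start.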

BEWARE: This changes the Git history heavily. The code in lots of older commits will most probably not work properly anymore, because the libraries have to be added again. But re-adding those libraries should not cause much of a problem.
