When creating a diff patch with Git Shell in Windows (when using GitHub for Windows), the character encoding of the patch will be UCS-2 Little Endian according to Notepad++ (see the screenshots below).
How can I change this behavior, and force git to create patches with ANSI or UTF-8 without BOM character encoding?
This causes a problem because UCS-2 Little Endian encoded patches cannot be applied; I have to manually convert them to ANSI. If I don't, I get a "fatal: unrecognized input" error.
I'm not a Windows user, so take my answer with a grain of salt. According to the Windows PowerShell Cookbook, PowerShell preprocesses the output of git diff, splitting it into lines. The documentation of the Out-File cmdlet suggests that > is the same as | Out-File without parameters. We also find this comment in the PowerShell documentation:
The results of using the Out-File cmdlet may not be what you expect if you are used to traditional output redirection. To understand its behavior, you must be aware of the context in which the Out-File cmdlet operates.
By default, the Out-File cmdlet creates a Unicode file. This is the best default in the long run, but it means that tools that expect ASCII files will not work correctly with the default output format. You can change the default output format to ASCII by using the Encoding parameter:
Out-file formats file contents to look like console output. This causes the output to be truncated just as it is in a console window in most circumstances. [...]
To get output that does not force line wraps to match the screen width, you can use the Width parameter to specify line width.
So, apparently it is not Git that chooses the character encoding, but Out-File. This suggests a) that PowerShell redirection really should only be used for text and b) that
| Out-File -encoding ASCII -Width 2147483647 my.patch
will avoid the encoding problems. However, this still does not solve the problem of Windows vs. Unix line endings. There are cmdlets (see the PowerShell Community Extensions) that convert line endings.
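If a patch has already been written in the wrong encoding, it can also be repaired after the fact. Here is a rough Python sketch of that conversion, not something taken from the PowerShell or Git documentation. The function names recode_patch and recode_file are made up for illustration. It assumes the file starts with a UTF-16 or UTF-8 byte order mark, and that the repository uses LF line endings, so every CRLF is an artifact of the redirection. If the patched files really contain CRLF, drop the line-ending replacement.

```python
import codecs
from pathlib import Path


def recode_patch(raw: bytes) -> bytes:
    """Return the patch as UTF-8 without BOM and with LF line endings.

    Hypothetical helper: assumes any CRLF was introduced by the shell,
    not by the files being patched.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and drops it.
        text = raw.decode("utf-16")
    elif raw.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips a leading UTF-8 BOM.
        text = raw.decode("utf-8-sig")
    else:
        # No BOM: leave the bytes alone apart from the line endings.
        return raw.replace(b"\r\n", b"\n")
    return text.replace("\r\n", "\n").encode("utf-8")


def recode_file(src: str, dst: str) -> None:
    """Read src as raw bytes, recode it, and write the result to dst."""
    Path(dst).write_bytes(recode_patch(Path(src).read_bytes()))
```

Something like recode_file("my.patch", "my-fixed.patch") would then produce a file to try with git apply. Working on bytes rather than text-mode files keeps Python from adding its own newline translation on Windows.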
However, all this recoding does not increase my confidence in a patch (which has no encoding itself, but is just a string of bytes). The aforementioned Cookbook contains a script, Invoke-BinaryProcess, which can be used to redirect the output of a command unmodified.
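The same idea of capturing a command's output byte for byte can be sketched outside PowerShell. The following Python snippet is only an illustration of the principle and is not the Cookbook's Invoke-BinaryProcess script. The name run_to_file is hypothetical, and it assumes the command (for example git) is on the PATH.

```python
import subprocess
from pathlib import Path


def run_to_file(argv, out_path):
    """Run argv and write its stdout to out_path without any recoding.

    Hypothetical helper for illustration only.
    """
    with open(out_path, "wb") as out:
        # The child process writes straight into the binary file handle,
        # so no shell or text layer gets a chance to re-encode the bytes.
        subprocess.run(argv, stdout=out, check=True)
    return Path(out_path)


# Example call (assumes it is run inside a Git working tree):
# run_to_file(["git", "diff"], "my.patch")
```

Because the command is passed as a list and no shell is involved, there is no redirection operator that could alter the encoding or the line endings.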
To sidestep this whole issue, an alternative would be to use git format-patch instead of git diff. format-patch writes directly to a file (and not to stdout), so its output is not recoded. However, it can only create patches from commits, not arbitrary diffs.
format-patch takes a commit range (e.g. master^10..master^5) or a single commit (e.g. X, meaning X..HEAD) and creates patch files of the form NNNN-SUBJECT.patch, where NNNN is an increasing 4-digit number and SUBJECT is the (mangled) subject of the patch. An output directory can be specified with