Simant Simant - 2 months ago 16
C# Question

!File.Exists is not working as expected for filenames containing UTF-8 characters

My C# console application works correctly for filenames that contain only plain ASCII characters, but when a filename contains a special (non-ASCII) character, such as an accented letter, my condition if (!File.Exists(destFilePath)) does not behave as expected.

I need to delete the files that are present only in the destination and not in the source. The problem appears when there are special characters in the filename, for example:

file

C:\A00000001\20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf

destFilePath

D:\A00000001\20162350775-Étienne Geoffroy Saint-Hilaire, 1772-1844 a visionary naturalist. Hervé Le Guyader.pdf

The file in the above case should not be deleted, because the same filename exists in both the source and the destination, but it was deleted anyway. Filenames without special characters cause no issue. My code sample is below:

public void SynchronizeSourceAndDestination(string dir)
{
    foreach (string file in Directory.GetFiles(dir))
    {
        string destFilePath = file.Replace(BackupDirectory, LookupDirectory);

        if (!File.Exists(destFilePath))
        {
            // Delete file from Backup
            File.Delete(file);
        }
    }

    foreach (string directory in Directory.GetDirectories(dir))
    {
        string destinationDirectory = directory.Replace(BackupDirectory, LookupDirectory);

        if (!Directory.Exists(destinationDirectory))
        {
            Directory.Delete(directory, true);
            continue;
        }
        SynchronizeSourceAndDestination(directory);
    }
}
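
If the mismatch comes from the two names being stored in different Unicode normalization forms (an assumption; nothing in the question confirms it), a lookup that compares normalized names may tolerate the difference. The sketch below uses a hypothetical helper, ExistsNormalized, which is not part of the original code:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class NormalizedLookup
{
    // Hypothetical helper: returns true when the directory of 'path' contains
    // a file whose name equals the requested name after both names are
    // normalized to Unicode Normalization Form C.
    public static bool ExistsNormalized(string path)
    {
        // Fast path: the exact name exists.
        if (File.Exists(path))
        {
            return true;
        }

        string directory = Path.GetDirectoryName(path);
        if (string.IsNullOrEmpty(directory) || !Directory.Exists(directory))
        {
            return false;
        }

        // Note: Normalize throws ArgumentException for strings that contain
        // invalid Unicode (for example, unpaired surrogates).
        string wanted = Path.GetFileName(path).Normalize(NormalizationForm.FormC);

        return Directory.EnumerateFiles(directory)
            .Select(candidate => Path.GetFileName(candidate).Normalize(NormalizationForm.FormC))
            .Any(name => string.Equals(name, wanted, StringComparison.OrdinalIgnoreCase));
    }
}
```

With such a helper, the check in the loop would read if (!NormalizedLookup.ExistsNormalized(destFilePath)). Because the result decides whether a file is deleted, it is probably worth logging the candidates first and deleting only after the list looks right.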


Note: The ASP.NET web application has the setting globalization culture="en-US" uiCulture="en-US" requestEncoding="UTF-8" responseEncoding="UTF-8" fileEncoding="UTF-8" in its web.config file. The code above is a C# console application that processes the files saved by the web application. The filenames cause no issue on my local machine; the code fails only when it runs on the server.

Answer Source

To make my solution work, I replaced the extended ASCII characters by typing them with their Alt codes: É (Alt + 144) and é (Alt + 130). I think the problem occurred because the person who created the file had copied and pasted the characters directly.
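
The answer does not say what actually differed between the typed and the pasted characters. One common cause, offered here only as an assumption, is that É can be stored either as a single code point (U+00C9) or as E followed by a combining acute accent (U+0301). The two look identical on screen but are different strings under ordinal comparison. A minimal sketch for checking this (the class and method names are hypothetical):

```csharp
using System;
using System.Linq;
using System.Text;

public static class CodePointDump
{
    // Lists the UTF-16 code units of a string, e.g. "U+00C9 U+0074".
    public static string Describe(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        string precomposed = "\u00C9t";  // 'É' as one code point, then 't'
        string decomposed = "E\u0301t";  // 'E' + combining acute accent, then 't'

        Console.WriteLine(Describe(precomposed)); // U+00C9 U+0074
        Console.WriteLine(Describe(decomposed));  // U+0045 U+0301 U+0074

        // Ordinal comparison treats the two spellings as different strings.
        Console.WriteLine(string.Equals(precomposed, decomposed, StringComparison.Ordinal)); // False

        // After normalizing both to Form C, they compare as equal.
        Console.WriteLine(string.Equals(
            precomposed.Normalize(NormalizationForm.FormC),
            decomposed.Normalize(NormalizationForm.FormC),
            StringComparison.Ordinal)); // True
    }
}
```

Running Describe on the source and destination filenames on the server would show whether they really contain the same code units.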