Unreadable non-ANSI characters in Notepad
The problem:
Many people live in countries whose languages include non-ANSI characters, yet want a fully English Windows environment.
If such a user sets the System locale (Language for non-Unicode programs) to the country they live in, many apps check this setting and, without offering any choice, install a localized interface, i.e. a GUI based on the System locale, which might not be desirable.
The apparent fix is to change the System locale to English (US). That solves the app interface issue, but because we’re talking about Microsoft Windows there is (as always…) an exception. In this case it is Notepad…
Notepad saves text files as ANSI by default, i.e. in the 8-bit code page of the current System locale (ASCII plus one extended character set; Windows-1252 for English (US)). If the text contains characters outside that code page, Notepad gives a warning… and if you accidentally dismiss it and save the file as ANSI anyway, all those characters become unreadable.
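To make the loss concrete, here is a minimal sketch that imitates "save as ANSI" under two different System locales. It is only an illustration, not what Notepad runs internally; the sample word and the variable names are my own assumptions.

```powershell
# Sketch only: simulate saving Greek text as ANSI under two different System locales.
# The sample word is built from code points, so the snippet does not depend on
# how the script file itself is encoded.
$text = -join ([char[]](0x039A, 0x03B1, 0x03BB, 0x03B7, 0x03BC, 0x03AD, 0x03C1, 0x03B1))  # "Kalimera" in Greek letters

$cp1252 = [System.Text.Encoding]::GetEncoding(1252)  # ANSI code page for English (US)
$cp1253 = [System.Text.Encoding]::GetEncoding(1253)  # ANSI code page for Greek

# Encode (what "save as ANSI" does), then decode again with the same code page.
$savedAsEnglishAnsi = $cp1252.GetString($cp1252.GetBytes($text))
$savedAsGreekAnsi   = $cp1253.GetString($cp1253.GetBytes($text))

"Round trip via 1252 intact: $($savedAsEnglishAnsi -ceq $text)"
"Round trip via 1253 intact: $($savedAsGreekAnsi -ceq $text)"
```

The Windows-1252 round trip cannot preserve the word, because that code page has no Greek letters, and the damage is permanent once the file is written. The Windows-1253 round trip keeps it intact.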
Being such a user, I have an English (US) installation and to avoid the localized app interface, I have set the System locale to English (United States).
For some reason I haven’t found yet, before version 1803 I could save text documents with Greek (non-ANSI) characters and, since I wasn’t getting the encoding warning (at least not that often) when saving, a lot of files with Greek characters were saved as ANSI without any problem.
Encoding handling has become stricter in 1803. My guess is that the “Beta: Use Unicode UTF-8 for worldwide language support” setting, which was added to the System locale dialog, has something to do with it. Either way, as stated, this is still in Beta, so it doesn’t work as it is supposed to, yet!
So how can you read all these ANSI-encoded text files whose non-ANSI characters are now unreadable?
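The key point is that an ANSI file stores only bytes, and the System locale decides how those bytes are shown. The sketch below assumes the files were originally written with the Greek code page (Windows-1253); the sample word and variable names are hypothetical.

```powershell
# Sketch only: the same bytes, interpreted under two different ANSI code pages.
$text  = -join ([char[]](0x039A, 0x03B1, 0x03BB, 0x03B7, 0x03BC, 0x03AD, 0x03C1, 0x03B1))  # Greek sample word
$bytes = [System.Text.Encoding]::GetEncoding(1253).GetBytes($text)   # what a Greek ANSI file contains

# How the file looks when the System locale is Greek vs. English (US).
$readWithGreekLocale   = [System.Text.Encoding]::GetEncoding(1253).GetString($bytes)
$readWithEnglishLocale = [System.Text.Encoding]::GetEncoding(1252).GetString($bytes)

"Greek locale shows the original text:   $($readWithGreekLocale -ceq $text)"
"English locale shows the original text: $($readWithEnglishLocale -ceq $text)"
```

Under the English code page the bytes come out as accented Latin letters, but nothing is lost: the bytes are unchanged, so decoding them with the right code page brings the text back.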
The solution:
Step 1
Go to: Settings > Time & Language > Region & Language > Related Settings > Administrative Language Settings (opens Control Panel) > System locale (Language for non-Unicode programs)
Alternatively, for short, type in Windows search/Cortana:
control.exe /NAME Microsoft.RegionalAndLanguageOptions /PAGE /p:"Administrative"
and change the “System locale (Language for non-Unicode programs)” to the locale of the country you live in (Greece in my case).
The system will need to reboot. Click Restart.
Step 2
Download the UnicodeConverter.zip, save and extract it on your Desktop. The zip file contains three scripts:
CheckFileEncoding.ps1
ConvertFilesToUnicode.ps1
ConvertFilesToUnicode_NoBOM.ps1 (for advanced users)
Step 3
Open an elevated PowerShell and type the command:
Code:
Set-ExecutionPolicy Bypass -Scope Process -Force
Then type the following command (provided that you have saved the script on your Desktop):
Code:
& "$env:USERPROFILE\Desktop\CheckFileEncoding.ps1"
The script will list all the ANSI text files in all your user folders, reported as System.Text.ASCIIEncoding.
You can open some that contain non-ANSI characters and verify that they are readable. (They should be, since the System locale now matches the language of the files.)
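For those curious how such a check can work: Unicode files saved by Notepad start with a BOM, so looking at the first bytes is enough to tell them apart from ANSI files. The following is a minimal sketch of that idea, not the actual code of CheckFileEncoding.ps1; the function name Get-BomEncodingName is hypothetical.

```powershell
# Sketch only: classify a file by its byte order mark (BOM).
# Get-BomEncodingName is a hypothetical helper, not part of the downloaded scripts.
function Get-BomEncodingName {
    param([byte[]]$Bytes)

    if ($Bytes.Length -ge 3 -and $Bytes[0] -eq 0xEF -and $Bytes[1] -eq 0xBB -and $Bytes[2] -eq 0xBF) {
        return 'UTF-8'
    }
    # UTF-32 LE must be tested before UTF-16 LE because both start with FF FE.
    if ($Bytes.Length -ge 4 -and $Bytes[0] -eq 0xFF -and $Bytes[1] -eq 0xFE -and $Bytes[2] -eq 0 -and $Bytes[3] -eq 0) {
        return 'UTF-32 LE'
    }
    if ($Bytes.Length -ge 2 -and $Bytes[0] -eq 0xFF -and $Bytes[1] -eq 0xFE) {
        return 'UTF-16 LE'
    }
    if ($Bytes.Length -ge 2 -and $Bytes[0] -eq 0xFE -and $Bytes[1] -eq 0xFF) {
        return 'UTF-16 BE'
    }
    # No BOM: ANSI/ASCII, or a Unicode file that was saved without a BOM.
    return 'No BOM'
}

# Example with in-memory bytes: a UTF-16 LE BOM followed by the letter "A".
Get-BomEncodingName -Bytes ([byte[]](0xFF, 0xFE, 0x41, 0x00))
```

To check a real file, you could pass `[System.IO.File]::ReadAllBytes($path)` as the `-Bytes` argument. Keep in mind that "No BOM" does not prove a file is ANSI; it only means no marker was found.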
Step 4
Now you can run the command:
Code:
& "$env:USERPROFILE\Desktop\ConvertFilesToUnicode.ps1"
The script will:
1. Create a backup folder at C:\Backup\ASCIItxtBackup and save a backup copy of all ANSI files found in your user folders.
2. Convert all ANSI files found in your user folders to Unicode.
After that, you can repeat Step 3 to verify that there are no ANSI files left in your user folders.
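The two actions above can be sketched for a single file as follows. This is an assumption-laden illustration, not the contents of ConvertFilesToUnicode.ps1: the function name Convert-AnsiFileToUnicode and its parameters are hypothetical, and it decodes with an explicit code page (1253, Greek, by default) instead of relying on the System locale.

```powershell
# Sketch only: back up one ANSI text file, then rewrite it as UTF-16 LE with a BOM.
# Convert-AnsiFileToUnicode and its parameters are hypothetical names.
function Convert-AnsiFileToUnicode {
    param(
        [Parameter(Mandatory)] [string]$Path,
        [Parameter(Mandatory)] [string]$BackupDirectory,
        [int]$CodePage = 1253   # assumption: the file was written with the Greek ANSI code page
    )

    # .NET file methods ignore the PowerShell location, so resolve to a full path first.
    $fullPath = (Get-Item -LiteralPath $Path).FullName

    # 1. Keep an untouched copy of the original file.
    New-Item -ItemType Directory -Path $BackupDirectory -Force | Out-Null
    Copy-Item -LiteralPath $fullPath -Destination $BackupDirectory -Force

    # 2. Decode the bytes with the ANSI code page and write them back as Unicode.
    $ansi = [System.Text.Encoding]::GetEncoding($CodePage)
    $text = [System.IO.File]::ReadAllText($fullPath, $ansi)
    [System.IO.File]::WriteAllText($fullPath, $text, [System.Text.Encoding]::Unicode)
}
```

A call could look like `Convert-AnsiFileToUnicode -Path .\notes.txt -BackupDirectory C:\Backup\ASCIItxtBackup`. Note that this sketch copies every backup into one flat folder, so two files with the same name from different folders would overwrite each other there, and it should only be run on files that really are ANSI.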
Step 5
Go to: Settings > Time & Language > Region & Language > Related Settings > Administrative Language Settings (opens Control Panel) > System locale (Language for non-Unicode programs)
Alternatively, for short, type in Windows search/Cortana:
control.exe /NAME Microsoft.RegionalAndLanguageOptions /PAGE /p:"Administrative"
and change the “System locale (Language for non-Unicode programs)” to the English locale of your preference.
The system will need to reboot. Click Restart.
That’s it. Once your computer has restarted, all the text files are saved in Unicode, so they can be read with any System locale.
Important Note:
If you want to change either the backup location or the folders that are searched for ANSI text files (e.g. to search all of C:\), open the script “ConvertFilesToUnicode.ps1”, go to the section where the locations are defined (shown in the red box in the image below) and change them according to your needs (e.g. $SourceDirectory = 'C:\Personal\My Files'). Don’t forget to enclose the folder path in quotes (e.g. 'C:\Backup\My ASCII files').
For Advanced Users
Microsoft Notepad saves all Unicode files with a BOM (Byte Order Mark). If you don’t want a BOM in your Unicode text files, use “ConvertFilesToUnicode_NoBOM.ps1”. It does exactly what “ConvertFilesToUnicode.ps1” does, except that it saves the text files in the chosen Unicode encoding without the BOM.
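To see what the BOM amounts to in practice, here is a minimal sketch that writes the same two Greek letters with and without it. It uses .NET encoding objects directly and is not taken from ConvertFilesToUnicode_NoBOM.ps1; the demo file name is hypothetical.

```powershell
# Sketch only: write the same text as UTF-16 LE with and without a BOM and compare the bytes.
$text = -join ([char[]](0x039A, 0x03B1))                                   # two Greek letters
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'nobom-demo.txt'       # hypothetical demo file

# UnicodeEncoding(bigEndian, byteOrderMark): little-endian, no BOM.
$utf16NoBom = New-Object System.Text.UnicodeEncoding -ArgumentList $false, $false
[System.IO.File]::WriteAllText($path, $text, $utf16NoBom)
$withoutBom = [System.IO.File]::ReadAllBytes($path)

# The stock Unicode encoding emits the FF FE marker at the start of the file.
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::Unicode)
$withBom = [System.IO.File]::ReadAllBytes($path)

"Without BOM: $($withoutBom.Length) bytes"
"With BOM:    $($withBom.Length) bytes"

Remove-Item -LiteralPath $path
```

The only difference between the two files is the two marker bytes at the start. Without them, a program has to guess the encoding, which is why BOM-less files are best left to tools you know can handle them.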
Additionally, to use a different encoding, go to the convert section of the script and change Unicode in the “set-content $_.FullName -Encoding Unicode” part to any of the other available values:
‘ASCII’: Uses the encoding for the ASCII (7-bit) character set.
‘BigEndianUnicode’: Encodes in UTF-16 format using the big-endian byte order.
‘BigEndianUTF32’: Encodes in UTF-32 format using the big-endian byte order.
‘Default’: In Windows PowerShell 5.1, encodes using the system's active ANSI code page.
‘Byte’: Encodes a set of characters into a sequence of bytes.
‘String’: In Windows PowerShell 5.1, same as ‘Unicode’.
‘Unicode’: Encodes in UTF-16 format using the little-endian byte order.
‘UTF7’: Encodes in UTF-7 format.
‘UTF8’: Encodes in UTF-8 format.
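As a quick way to compare some of these values, the sketch below prints the bytes that the matching .NET encodings produce for a single Greek letter. The mapping from the -Encoding names to the .NET objects is my own pairing for illustration, and the variable names are hypothetical.

```powershell
# Sketch only: bytes produced for the Greek letter alpha (U+03B1) by several encodings.
$alpha = [string][char]0x03B1

# .NET encodings that correspond to four of the -Encoding values listed above.
$encodings = [ordered]@{
    'ASCII'            = [System.Text.Encoding]::ASCII
    'Unicode'          = [System.Text.Encoding]::Unicode
    'BigEndianUnicode' = [System.Text.Encoding]::BigEndianUnicode
    'UTF8'             = [System.Text.Encoding]::UTF8
}

$hex = @{}
foreach ($name in $encodings.Keys) {
    # GetBytes returns the encoded character only; it never includes a BOM.
    $bytes = $encodings[$name].GetBytes($alpha)
    $hex[$name] = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
    '{0,-17} {1}' -f $name, $hex[$name]
}
```

ASCII cannot represent the letter at all and substitutes a question mark, which is why it is not a sensible target for files with Greek text.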
Credits:
The function Get-FileEncoding, 03-Feb-2015, by VertigoRay, adjusted to use .NET's System.Text.Encoding class (see “Encoding Class (System.Text)”).
Last edited by ddelo; 21 May 2018 at 08:34. Reason: Added credits