String Encodings


#1

Hello again,

I saw in the documentation that there is a lot encoding types available:

[quote]enum {
NSASCIIStringEncoding = 1,
NSNEXTSTEPStringEncoding = 2,
NSJapaneseEUCStringEncoding = 3,
NSUTF8StringEncoding = 4,
NSISOLatin1StringEncoding = 5,
NSSymbolStringEncoding = 6,
NSNonLossyASCIIStringEncoding = 7,
NSShiftJISStringEncoding = 8,
NSISOLatin2StringEncoding = 9,
NSUnicodeStringEncoding = 10,
NSWindowsCP1251StringEncoding = 11,
NSWindowsCP1252StringEncoding = 12,
NSWindowsCP1253StringEncoding = 13,
NSWindowsCP1254StringEncoding = 14,
NSWindowsCP1250StringEncoding = 15,
NSISO2022JPStringEncoding = 21,
NSMacOSRomanStringEncoding = 30,
NSUTF16StringEncoding = NSUnicodeStringEncoding,
NSUTF16BigEndianStringEncoding = 0x90000100,
NSUTF16LittleEndianStringEncoding = 0x94000100,
NSUTF32StringEncoding = 0x8c000100,
NSUTF32BigEndianStringEncoding = 0x98000100,
NSUTF32LittleEndianStringEncoding = 0x9c000100,
NSProprietaryStringEncoding = 65536
};[/quote]
Is there a way to know what type of encoding is used in a file before requesting to read it ?

Thanks by advance


#2

In a word? No.

You are expected to know the encoding of the text file you are attempting to read - plain text files don’t include metadata indicating their encoding.

Guessing encoding is difficult and somewhat inaccurate - web browsers actually have heuristics that count the relative occurrences of certain letters and use that to guess the language/encoding.