Why does TCHAR('\xfeff') generate error C2022: '65279' : too big for character when UNICODE is defined?

  • Question

  • isn't TCHAR supposed to accommodate values of that range when UNICODE is defined?

    • Edited by ColdBackup Thursday, July 12, 2012 11:23 PM
    Thursday, July 12, 2012 11:23 PM

All replies

  • On 7/12/2012 7:23 PM, ColdBackup wrote:

    isn't TCHAR supposed to accommodate values of that range when UNICODE is defined?

    '\xfeff' is a literal of type char. The fact that you later cast it to TCHAR is irrelevant. You want L'\xfeff', which is a literal of type wchar_t.


    Igor Tandetnik

    • Marked as answer by ColdBackup Thursday, July 12, 2012 11:39 PM
    • Unmarked as answer by ColdBackup Thursday, July 12, 2012 11:52 PM
    • Marked as answer by ColdBackup Friday, July 13, 2012 12:44 AM
    Thursday, July 12, 2012 11:29 PM
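
A minimal sketch of the point made in the reply above, written in portable C++11 rather than against the Windows headers: the prefix of a character literal fixes its type before any cast is applied. The names `kBom` and `bom_code_unit` are made up for illustration and do not come from the thread.

```cpp
#include <type_traits>

// An unprefixed character literal has type char; the L prefix makes it wchar_t.
static_assert(std::is_same<decltype('a'), char>::value,
              "plain character literal is char");
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,
              "L-prefixed character literal is wchar_t");

// A cast is applied only after the literal has been formed, so the line below
// would still be rejected by MSVC with C2022 (0xFEFF does not fit in a char):
//     wchar_t bad = static_cast<wchar_t>('\xfeff');

// The wide literal carries the value directly.
constexpr wchar_t kBom = L'\xfeff';

// Hypothetical helper: the numeric value stored in kBom.
constexpr unsigned long bom_code_unit() {
    return static_cast<unsigned long>(kBom);
}
```

Under these assumptions `bom_code_unit()` evaluates to 0xFEFF whether wchar_t is 16 or 32 bits wide.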
  • wouldn't use of L''/L"" defeat the purpose of TCHAR and TEXT?

    I didn't know they were this useless when it comes to entering Unicode characters

    looks like all they do is convert ASCII chars to 2 bytes in size

    too bad if UNICODE isn't specified, the code won't compile with L''/L""

    I'm probably going to use a 16-bit integer instead

    • Edited by ColdBackup Thursday, July 12, 2012 11:38 PM
    Thursday, July 12, 2012 11:35 PM
  • this is strange

    when I write the following

    TCHAR test2=TCHAR('\feff');

    it compiles fine

    but when I use TCHAR('\xfeff') in a function call it doesn't compile


    • Edited by ColdBackup Thursday, July 12, 2012 11:45 PM
    Thursday, July 12, 2012 11:45 PM
  • after some more testing I figured out the function's parameter type was the culprit

    LPVOID k=reinterpret_cast<LPVOID>(TCHAR('\xfeff'));

    this generates the same error

    any explanations would be appreciated

    Thursday, July 12, 2012 11:51 PM
  • On 7/12/2012 7:35 PM, ColdBackup wrote:

    wouldn't use of L''/L"" defeat the purpose of TCHAR and TEXT?

    This depends on what you are trying to achieve here, which is not at all clear to me. TCHAR and TEXT are used when you want to produce both ANSI and Unicode builds from the same source. But what do you plan to do with the U+FEFF character in an ANSI build? How do you plan to represent it? And if you only plan to build Unicode, why use TEXT or TCHAR?

    looks like all they do is convert ASCII chars to 2 bytes in size

    I'm not sure what you are talking about. TEXT('x') is a macro that expands to 'x' in an ANSI build and to L'x' in a Unicode build. TCHAR is a typedef, either for char or for wchar_t. Neither of them converts anything to anything else.

    too bad if UNICODE isn't specified, the code won't compile with L''/L""

    This line would compile just fine in both ANSI and Unicode builds:

    wchar_t c = L'\xFEFF';


    Igor Tandetnik

    Thursday, July 12, 2012 11:52 PM
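
The difference between a macro that rewrites the literal and a typedef that only names a type can be sketched without the Windows headers. Everything prefixed `DEMO_` or `demo_` below is a hypothetical stand-in for UNICODE, TEXT and TCHAR; the real definitions live in the SDK headers and may differ in detail.

```cpp
#include <type_traits>

// Stand-in for the UNICODE switch; set to 0 to model an ANSI build.
#define DEMO_UNICODE 1

#if DEMO_UNICODE
typedef wchar_t demo_tchar;          // a type alias, playing the role of TCHAR
#define DEMO_TEXT_IMPL(q) L##q       // pastes the L prefix onto the literal
#else
typedef char demo_tchar;
#define DEMO_TEXT_IMPL(q) q          // leaves the literal untouched
#endif
#define DEMO_TEXT(q) DEMO_TEXT_IMPL(q)

// The macro changes the literal itself, so the literal's type follows the build.
static_assert(std::is_same<decltype(DEMO_TEXT('x')), demo_tchar>::value,
              "DEMO_TEXT('x') has the build's character type");

// A typedef cannot do that: inside demo_tchar('x') the literal is still a char.
static_assert(std::is_same<decltype('x'), char>::value,
              "the unprefixed literal stays char");

// With DEMO_UNICODE set to 1 this is L'\xfeff'.
constexpr demo_tchar kDemoBom = DEMO_TEXT('\xfeff');

// Hypothetical helper: the numeric value stored in kDemoBom.
constexpr unsigned long demo_bom_value() {
    return static_cast<unsigned long>(kDemoBom);
}
```

With `DEMO_UNICODE` set to 0 the last definition becomes the narrow literal '\xfeff', which is the case the thread reports as C2022.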
  • And if you only plan to build Unicode, why use TEXT or TCHAR?

    #if defined(UNICODE)
    WriteFileEx(file,reinterpret_cast<LPVOID>(TCHAR('\xfeff')),sizeof(TCHAR),&information,NULL);
    #endif
    I don't plan to build Unicode only
    '\xfeff' will be present only if UNICODE is defined
    • Edited by ColdBackup Thursday, July 12, 2012 11:57 PM
    Thursday, July 12, 2012 11:55 PM
  • On 7/12/2012 7:45 PM, ColdBackup wrote:

    this is strange
    when I write the following

    TCHAR test2=TCHAR('\feff');

    it compiles fine

    That's a multicharacter constant - a Microsoft-specific construct. Its type is int. Without the x, the compiler reads '\feff' as the escape \f followed by the characters e, f and f.

    http://msdn.microsoft.com/en-us/library/6aw8xdf2.aspx


    Igor Tandetnik

    Thursday, July 12, 2012 11:56 PM
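
A small sketch of why the version without the x compiles. It assumes a compiler that accepts multicharacter literals (many do, often with a warning), and it checks only the type, because the value of such a literal is implementation-defined. The helper name `multichar_is_int_sized` is made up for illustration.

```cpp
#include <type_traits>

// '\f' on its own is an ordinary single-character literal of type char.
static_assert(std::is_same<decltype('\f'), char>::value,
              "single escape is a char");

// '\feff' is \f followed by 'e', 'f', 'f': four characters in one pair of
// quotes, which makes it a multicharacter literal of type int.
static_assert(std::is_same<decltype('\feff'), int>::value,
              "multicharacter literal is an int");

// Hypothetical helper: confirms the literal occupies an int, not a char.
constexpr bool multichar_is_int_sized() {
    return sizeof('\feff') == sizeof(int);
}
```

Because the literal is an int, converting it to TCHAR raises no range error, but the result has nothing to do with U+FEFF.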
  • On 7/12/2012 7:51 PM, ColdBackup wrote:

    after some more testing I figured out the function's parameter type was the culprit

    LPVOID k=reinterpret_cast<LPVOID>(TCHAR('\xfeff'));

    this generates the same error

    Just simply writing

    '\xfeff';

    would generate the same error. The rest is irrelevant.


    Igor Tandetnik

    Thursday, July 12, 2012 11:58 PM
  • That's a multicharacter constant - a Microsoft-specific construct. Its type is int.


    you're right, I missed the "x"
    but why does this work then
    TCHAR test2=TCHAR('关');

    Friday, July 13, 2012 12:01 AM
  • On 7/12/2012 7:55 PM, ColdBackup wrote:

    And if you only plan to build Unicode, why use TEXT or TCHAR?

    #if defined(UNICODE)
    WriteFileEx(file,reinterpret_cast<LPVOID>(TCHAR('\xfeff')),sizeof(TCHAR),&information,NULL);

    That's going to crash (if you manage to get it to compile). You are telling WriteFileEx to write two bytes located at address 0x0000FEFF. But that's not a valid address, of course.

    You want something like this:

    wchar_t c = L'\xFEFF';
    WriteFileEx(file, &c, sizeof(c), &information, NULL);


    Igor Tandetnik

    Friday, July 13, 2012 12:03 AM
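
The pattern in the reply above (store the character in an object, then pass the object's address and size) can be tried without the Windows API. In this sketch `demo_write` is a hypothetical stand-in for a WriteFile-style call that copies bytes into a `std::string`; `write_bom` and `read_back` are likewise made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a WriteFile-style call: appends `size` bytes
// starting at address `data` to `sink`.
void demo_write(std::string& sink, const void* data, std::size_t size) {
    assert(data != nullptr);
    sink.append(static_cast<const char*>(data), size);
}

// Writes the byte order mark by passing the address of a wchar_t object,
// never the character value reinterpreted as a pointer.
std::string write_bom() {
    std::string sink;
    const wchar_t bom = L'\xFEFF';       // the value lives in memory here
    demo_write(sink, &bom, sizeof bom);  // address and size of that object
    return sink;
}

// Reads one wchar_t back out of the written bytes.
wchar_t read_back(const std::string& bytes) {
    wchar_t unit = 0;
    assert(bytes.size() >= sizeof unit);
    std::memcpy(&unit, bytes.data(), sizeof unit);
    return unit;
}
```

The number of bytes written equals `sizeof(wchar_t)`, which is 2 on Windows and commonly 4 elsewhere, so a file format that requires a 2-byte mark would need a 16-bit type on those other platforms.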
  • thanks

    but what happens to TEXT("\xfeff")?

    is character converted to multibyte?

    and how can successful compilation of TCHAR('关') be explained?

    Friday, July 13, 2012 12:10 AM
  • On 7/12/2012 8:01 PM, ColdBackup wrote:

    That's a multicharacter constant - a Microsoft-specific construct. Its type is int.

    you're right, I missed the "x" but why does this work then

    TCHAR test2=TCHAR('关');

    Define "works". I get a warning:

    warning C4566: character represented by universal-character-name '\u5173' cannot be represented in the current code page (1251)

    Moreover, test2 ends up holding '?'. Why the compiler produces this warning and not C2022 error in this case, I don't know.


    Igor Tandetnik

    Friday, July 13, 2012 12:11 AM
  • On 7/12/2012 8:10 PM, ColdBackup wrote:

    but what happens to TEXT("\xfeff")?

    It expands to "\xfeff" or L"\xfeff", depending on the build. The former produces the same C2022 error; the latter is fine.

    is character converted to multibyte?

    No.

    and how can successful compilation of TCHAR('关') be explained?

    It's not really successful.


    Igor Tandetnik

    Friday, July 13, 2012 12:16 AM
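
For the string case, a short sketch of what the wide literal contains, plus one pitfall worth knowing: a hex escape consumes every hex digit that follows it. The array and function names are made up for illustration.

```cpp
#include <cstddef>

// One hex escape yields one wide code unit, followed by the terminating null.
constexpr wchar_t kBomOnly[] = L"\xfeff";

// L"\xfeffab" would be read as the single escape \xfeffab, not as U+FEFF
// followed by "ab". Splitting the literal ends the escape where intended;
// adjacent string literals are concatenated by the compiler.
constexpr wchar_t kBomThenText[] = L"\xfeff" L"ab";

// Hypothetical helpers: lengths in code units, excluding the terminator.
constexpr std::size_t bom_only_length() {
    return sizeof(kBomOnly) / sizeof(kBomOnly[0]) - 1;
}
constexpr std::size_t bom_then_text_length() {
    return sizeof(kBomThenText) / sizeof(kBomThenText[0]) - 1;
}
```

No conversion to multibyte takes place here: the wide string simply stores 0xFEFF as its first element.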
  • It expands to "\xfeff" or L"\xfeff", depending on the build. The former produces the same C2022 error; the latter is fine.

    but why doesn't TCHAR expand to L'\xfeff'? isn't it TEXT's counterpart but only for characters? and UNICODE is defined in my code, which should make it work

    It's not really successful.

    TCHAR test=TCHAR('关'); // compiles fine
    TCHAR test2='关'; // compiles fine
    maybe it's because the source file is saved as Unicode?

    Friday, July 13, 2012 12:23 AM
  • On 7/12/2012 8:23 PM, ColdBackup wrote:

    It expands to "\xfeff" or L"\xfeff", depending on the build. The former produces the same C2022 error; the latter is fine.

    but why doesn't TCHAR expand to L'\xfeff'?

    Because TCHAR is not a macro.

    isn't it TEXT's counterpart but only for characters?

    It is not. TEXT is used both for character literals and for string literals.

    TCHAR test=TCHAR('关'); // compiles fine
    TCHAR test2='关'; // compiles fine

    I'm getting warning C4566 for those - don't you? Run the program, examine the values of test and test2 in the debugger - you will find that they are not equal to 0x5173.


    Igor Tandetnik

    • Marked as answer by ColdBackup Friday, July 13, 2012 12:42 AM
    Friday, July 13, 2012 12:28 AM
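
If the goal is a variable that really holds 0x5173, one option is a wide literal written with a universal-character-name. This is a sketch, not something proposed in the thread; the names `kGuan` and `guan_value` are made up for illustration.

```cpp
#include <type_traits>

// \u5173 names the code point directly, so neither the encoding of the source
// file nor the compiler's code page is involved in producing the value.
constexpr wchar_t kGuan = L'\u5173';

static_assert(std::is_same<decltype(L'\u5173'), wchar_t>::value,
              "wide literal with a universal-character-name is wchar_t");

// Hypothetical helper: the numeric value stored in kGuan.
constexpr unsigned long guan_value() {
    return static_cast<unsigned long>(kGuan);
}
```

The narrow forms in the quoted code go through the execution character set instead, which is where the C4566 warning and the substituted '?' come from.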
  • Define "works". I get a warning:

    warning C4566: character represented by universal-character-name '\u5173' cannot be represented in the current code page (1251)


    my fault, I had compiled with warning level 0; after I switched to level 4, I got the same warning

    Moreover, test2 ends up holding '?'.

    this is exactly what I wanted to ask: what happens to the character the compiler warned about

    thanks for answering

    • Edited by ColdBackup Friday, July 13, 2012 12:30 AM
    Friday, July 13, 2012 12:28 AM

  • Because TCHAR is not a macro.


    It is not. TEXT is used both for character literals and for string literals.


    this is what my problem was all along!
    thank you very much for helping, and sorry for asking many different questions in the same topic
    Friday, July 13, 2012 12:41 AM