Clang 3.7 with Microsoft CodeGen preprocessor expands stringizing operator incorrectly

    Question

  • Hi,

    I am happy with the support MS is giving to the Clang 3.7 compiler; however, there is a bug in it. It can be easily reproduced with this macro:

    #define PRINT_STRING(x) fwprintf(stderr, L"String contains %ls\n", L#x)

    The # operator surrounds the x macro parameter with double quotation marks as it should, but it then leaves a space between the preceding L and the string, like so:

    fwprintf((__acrt_iob_func(2)), L"String contains %ls\n", L "juan");

    I obtained this result by compiling with -E -dD compiler options.

    Please confirm and let me know if there is already a fix...

    Regards,

    Juan Dent


    Juan Dent

    Monday, May 30, 2016 7:29 PM

Answers

  • Looks to me like a known issue:
    Bug 10706 - Stringizing Operator (#) inserts space after macro expansion
    https://llvm.org/bugs/show_bug.cgi?id=10706
     
    The patch there seems to indicate special treatment for MSVC, probably via
    -fms-compatibility      Enable Microsoft compatibility mode
    No warranty
    With kind regards

    Friday, June 03, 2016 5:00 PM

All replies

  • Hi Juan Dent,

    Could you share a simple sample and detailed steps so that we can reproduce this issue on our side?

    >>I obtained this result by compiling with -E -dD compiler options.

    How did you compile it using -E -dD? I have a VS2015 environment and can install Clang 3.7 with Microsoft CodeGen on my side, but I am not sure how to reproduce this issue. If possible, please share a simple sample via OneDrive along with detailed repro steps, and I will check it on my side.

    In addition, since this issue is related to Clang 3.7 with Microsoft CodeGen, see this blog post:

    https://blogs.msdn.microsoft.com/vcblog/2016/03/31/clang-with-microsoft-codegen-march-2016-released/

    You may get a better result by discussing this issue directly with the Clang team experts.

    Bug Reporting

    When submitting bug reports that are specific to Clang/C2 (i.e. they are not reproducible in Clang/LLVM), make sure to:

    • Submit your issues at https://connect.microsoft.com/VisualStudio
    • Prefix your bug report title with [Clang/C2]
    • When applicable, make sure to include preprocessed source(s) and associated run script(s) that are reported when ICE happens. Search for the string “PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:” in your build log.
    • You can also email us questions at clangc2 at microsoft.com.

    Best Regards,

    Jack



    Wednesday, June 01, 2016 7:40 AM
    Moderator
  • Hi,

    I downloaded the latest Clang 3.7 with Microsoft CodeGen (I believe it is May 2016 version).

    To reproduce, just create a console application that uses the Unicode Character Set, set the platform toolset to "Clang 3.7 with Microsoft CodeGen (v140_clang_3_7)" and add this code to the file containing main():

    #include "stdafx.h"
    #define PRINT_STRING(x) fwprintf(stderr, L"String contains %S\n", L#x)

    int main() { PRINT_STRING(juan); return 0; }


    When you compile it, you will get the following error:

    >BugInClang3.7.cpp(11,2): error : use of undeclared identifier 'L'
    2>          PRINT_STRING(juan);
    2>          ^
    2>  BugInClang3.7.cpp(6,67) :  note: expanded from macro 'PRINT_STRING'
    2>  #define PRINT_STRING(x) fwprintf(stderr, L"String contains %S\n", L#x)
    2>                                                                    ^
    2>  1 error generated.

    This is caused by incorrect expansion of the macro PRINT_STRING(x) which inserts a space char between the L and the #x, like so:

    #define PRINT_STRING(x) fwprintf(stderr, L"String contains %S\n", L#x)
    
    int main()
    {
     fwprintf((__acrt_iob_func(2)), L"String contains %S\n", L "juan");
    
      return 0;
    }
    

    To obtain this output, open the project's Configuration Properties, go to the C/C++ section, select Command Line, and add the following to the Additional Options text box:

    -E -dD

    Then compile the cpp file containing the code above and open the produced obj file. Because of these options, it actually contains the preprocessed code, and at the bottom of the file you can clearly see how the compiler expands the macro and inserts the space character between the L and the stringized x, like so:

    L "juan"

    If you remove the space, it compiles correctly.

    Thus this is a very simple bug; it is incredible that it exists in the clang compiler.

    Regards,

    Juan Dent


    Juan Dent

    Wednesday, June 01, 2016 5:12 PM
  • Hi Juan Dent,

    >>When you compile it you will get the following error:

    >BugInClang3.7.cpp(11,2): error : use of undeclared identifier 'L'
    2>          PRINT_STRING(juan);

    I could reproduce this issue on my side using the first sample.

    >>L "juan"

    >>If you remove the space, it compiles correctly.

    But I couldn't get the second sample to compile on my side.

    These are my steps:

    (1) Change the project's platform toolset property to "Clang 3.7 with Microsoft CodeGen (v140_clang_3_7)".

    (2) Add the command line options and remove the space, as in A and B:

    (3) Recompile the project. I still get the same error as with the first sample. Are my steps different from yours?

    >>Then just compile the cpp file containing the code above and open the produced obj file, which due to these options actually contains the preprocessed code where you can clearly.

    It seems that there is no obj file on my side.

    Best Regards,

    Jack



    Thursday, June 02, 2016 10:45 AM
    Moderator
  • I think you misinterpreted - I gave too much information.

    Let's simplify things: the bug is in the preprocessor because when encountering a macro like so:

    #define PRINT_STRING(x) fwprintf(stderr, L"String contains %S\n", L#x)

    it expands the part L#x incorrectly like so:

    fwprintf((__acrt_iob_func(2)), L"String contains %S\n", L "juan");
    

    which has a space in between the L and the string "juan" and it shouldn't!!

    That's it. It's that simple and it is a bug in the preprocessor.

    Please let me know if this now clarifies things. Also, forget about the -E -dD compile flags...

    Regards,

    Juan


    Juan Dent

    Thursday, June 02, 2016 7:12 PM
  • Hi Juan,

    >>A: PRINT_STRING(juan);

    >>B: fwprintf((__acrt_iob_func(2)), L"String contains %S\n", L "juan");

    Sorry for my misunderstanding.

    I am not a real VC++ development expert; I simply removed "A" and added "B" in my code editor. My understanding is that you get code line "B" from "A" automatically, am I right?

    Could you tell me how to generate code line B automatically from A? Sorry if this is an easy question :)

    I really want to reproduce this issue on my side so that I can help you report this feedback.

    Sincerely,

    Jack



    Friday, June 03, 2016 9:56 AM
    Moderator
  • Hi,

    The line marked A by you is the one that generates the compiler error. Line B is what the preprocessor creates, so you can forget about it. The bug is produced in line A, and the problem is that it generates code that has an INCORRECT space character separating the L from the string. In the example above, it generates

              L "juan"

    instead of

              L"juan"

    and this is incorrect C++ syntax!!

    Thus, there is a bug in the compiler!

    Regards,

    Juan


    Juan Dent

    Friday, June 03, 2016 2:10 PM