.ToString()

  • General discussion

  • I'm trying to find some information on the grammar rules of formatting, but when I look at the source code, all I get is

    [MethodImpl(MethodImplOptions.InternalCall)] public static extern string FormatInt32(int value, string format, NumberFormatInfo info);

    Consulting the MSDN page on Standard Numeric Format Strings gives some information, but it does not define what it classifies as an alphabetic character. In English this would be

    alphaChar ::= 'A'-'Z' | 'a'-'z'
    

    Question: Is there any documentation that specifies them?

    • Moved by Mike Danes Friday, September 5, 2014 7:20 AM BCL related
    • Changed type Fred Bao Monday, September 15, 2014 9:27 AM The discussion type is more suitable for this case since this issue is not much related with class libraries
    Thursday, September 4, 2014 5:52 PM

All replies

  • You are dealing with an internal .NET library class. The webpage you posted says the following:

    "A is a single alphabetic character called the format specifier. Any numeric format string that contains more than one alphabetic character, including white space, is interpreted as a custom numeric format string"

    The internal class contains an enumeration of acceptable strings.  Those values are shown in the first column of the table on the webpage.  Some of them, like "D8", will accept additional numbers/characters.

    The FormatInt32 method you posted above is not source code but only a declaration of the method; the implementation is internal to the runtime. The format string can be used with either the ToString() method or the Parse() method.  The definition of the characters is usually the same for each class (int, float, DateTime, ...).


    jdweng


    Thursday, September 4, 2014 7:28 PM
  • All I'm asking for is the definition of "alphabetic character"?

    As it currently stands its definition isn't precise.

    For example if I said digit and provided a BNF

     digit::= '0' - '9'
    digits::= digit+

    Then the precise definition of digit \ digits is known.

    Unicode is a big alphabet of possible characters. What I'm after is a BNF definition of what is allowed as an alphabetic character.

    Thursday, September 4, 2014 10:06 PM
  • Each data type has its own set of acceptable characters.  It can be any string character.  Numbers use d and f, while DateTime uses h, H, y, m, M, s, ...

    jdweng

    Friday, September 5, 2014 7:18 AM
  • There's no other documentation I know apart from the one that you already found. After all, MSDN is the official documentation for BCL.

    If it helps, the source of that function can be found in the old sscli/rotor code, where "alphabetic character" means 'A'-'Z'|'a'-'z'. The same code can be seen in .NET Native's mscorlib with an IL decompiler.

    In general, common sense would dictate that the set of characters is restricted to a-z. Having a programmer from Japan to type a Greek character for a format specifier would be kind of lame.

    Friday, September 5, 2014 7:20 AM
  • The standard format strings are easy because they are specified, so I didn't need help with those.
    But anyhow, thank you for your time.
    Friday, September 5, 2014 8:05 AM
  • I'm not sure I understand what you are saying. The "alphabetic character" issue is specific to standard numeric format strings. Other types do not use this terminology and tend to reject any character that is not a valid format specifier, alphabetic or not.

    Friday, September 5, 2014 8:22 AM
  • Thanks Mike for those research leads. I'll have a look and see what I can find.

    'A'-'Z'|'a'-'z' would make sense. 

    Friday, September 5, 2014 8:24 AM
  • I'm currently updating a Roslyn Code Diagnostic (String Format Diagnostics) to cover more use cases. .ToString is one of those cases.

    There is a condition (below) that determines whether to treat the format string as a standard format or a custom format

    • If the format string contains 2 or more alphabetic characters (including spaces), then treat the format string as a custom format string.
    Friday, September 5, 2014 9:52 AM
  • But that "alphabetic character" stuff is only found in numeric format strings. Other format strings do not mention "alphabetic", just "character". For example, you have this for DateTime format strings:

    "Any date and time format string that contains more than one character, including white space, is interpreted as a custom date and time format string;"

    Some examples may clarify the difference:

    • 42.ToString("d") works and produces 42
    • 42.ToString("z") throws FormatException - z is an alphabetic character but it's not a valid standard format specifier
    • 42.ToString("Ö") works and produces Ö - Ö is treated as an "Other" category character in a custom numeric format string
    • DateTime.Now.ToString("d") works and produces something like 9/5/2014
    • DateTime.Now.ToString("z") throws FormatException - z is not a valid format specifier
    • DateTime.Now.ToString("Ö") also throws FormatException - contrast this with the integer case where it works
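
    The six cases above can be tried with a small console program. This is only a sketch: the class name FormatExamples and the helper Try are made up for illustration, and the results may depend on the runtime version and the current culture.

```csharp
using System;

static class FormatExamples
{
    // Hypothetical helper: runs a formatting call and returns either its
    // result or the text "FormatException" if the call throws one.
    public static string Try(Func<string> format)
    {
        try { return format(); }
        catch (FormatException) { return "FormatException"; }
    }

    static void Main()
    {
        DateTime now = DateTime.Now;

        // Integer cases from the list above.
        Console.WriteLine(Try(() => 42.ToString("d")));
        Console.WriteLine(Try(() => 42.ToString("z")));
        Console.WriteLine(Try(() => 42.ToString("\u00D6")));   // "Ö"

        // DateTime cases from the list above.
        Console.WriteLine(Try(() => now.ToString("d")));
        Console.WriteLine(Try(() => now.ToString("z")));
        Console.WriteLine(Try(() => now.ToString("\u00D6")));  // "Ö"
    }
}
```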
    Friday, September 5, 2014 11:01 AM
  • I know about those, and they have already been implemented. I just need to do the custom format string parts.

    Either way, I need to know what is classed as an alphabetic character, as the numeric case is the only one that has this condition. At the moment that definition isn't precise.

    So I can only (currently) say 'A'-'Z' | 'a'-'z' are valid alphabetic characters.

    For example, the C# grammar explicitly tells you what characters are allowed in identifiers. (See)


    Friday, September 5, 2014 11:48 AM
  • Is this the definition?

    Function IsAlphabeticOrWhitespace( c As Char ) As Boolean
      Return Char.IsLetter( c ) OrElse Char.IsWhiteSpace( c )
    End Function

    Friday, September 5, 2014 12:13 PM
  • The definition of what exactly?

    I told you previously that the "alphabetic character" that is used in the documentation of standard numeric format string really means a-z|A-Z. Obviously, your definition doesn't match that because Char.IsLetter returns true for all Unicode characters that are considered letters, not only for a-z.
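
    To make the difference concrete, here is a minimal sketch that compares the two definitions on a few characters. IsAsciiLetter is a hypothetical helper that encodes the a-z|A-Z reading; it is an assumption drawn from this thread, not something the documentation states.

```csharp
using System;

static class LetterChecks
{
    // Assumption from this thread: "alphabetic character" means ASCII A-Z or a-z only.
    public static bool IsAsciiLetter(char c)
    {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    static void Main()
    {
        // 'd', 'Z', 'Ö' (U+00D6), 'β' (U+03B2), '‰' (U+2030) and a space.
        char[] samples = { 'd', 'Z', '\u00D6', '\u03B2', '\u2030', ' ' };

        foreach (char c in samples)
        {
            // Char.IsLetter uses the Unicode letter categories, so it is broader.
            Console.WriteLine("U+{0:X4}: IsAsciiLetter={1}, Char.IsLetter={2}",
                (int)c, IsAsciiLetter(c), Char.IsLetter(c));
        }
    }
}
```

    The two checks agree on 'd' and 'Z' but disagree on Ö and β, which Char.IsLetter counts as letters.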

    Friday, September 5, 2014 12:30 PM
  • Custom format strings allow Unicode characters.

    E.g. the per mille character (‰ or \u2030)

    Some Unicode characters are classed as letters (or are these not alphabetic?)

    Hence my definition. 

    Friday, September 5, 2014 12:47 PM
  • But if you're talking specifically about custom formats why are you dragging in this letter/alphabetic stuff?

    Any character can be included in a custom format string, not only letters. For example, "0!0-0β" is a valid numeric custom format string which, applied to 42, produces the output "0!4-2β". Obviously, ! is neither a letter nor whitespace and it doesn't fit your definition.

    Friday, September 5, 2014 1:08 PM
  • OK, I'll try again to state the issues.

    There are three descriptions that bear on what a custom numeric format string is.

    1. Any numeric format string that contains more than one alphabetic character, including white space, is interpreted as a custom numeric format string.
    Question: What is an alphabetic character?
    2. A standard format string has the form Axx.
    3. A custom format string is any string that isn't a standard format string.


    But this is further complicated by the exception that can be thrown. (From Standard Numeric Format Strings)
    "Any other single character  ||  Unknown specifier ||   Result: Throws a FormatException at run time."
    Which sort of contradicts the second description.
    Note also that it doesn't say "any other alphabetic character" but "any other character".

    (Which contradicts the definition in description 1, where A is an alphabetic character.)

    F$     Satisfies Description 
    ---------------------------------------------------------------------------------------------------------
    C      (1: No       , 2: Yes        3: No )
    C0     (1: No       , 2: Yes        3: No )
    C00    (1: No       , 2: Yes        3: No )
    C000   (1: No       , 2: No         3: Yes) custom numeric string via description 3
    AA     (1: Yes      , 2: Exception  3: Yes) 
    ^      (1: ??       , 2: Exception, 3: Yes) Should be a numeric custom string because is matches description 3
    ^^     (1: ??       , 2: Exception, 3: Yes)





    Friday, September 5, 2014 3:02 PM
  • "1. Any numeric format string that contains more than one alphabetic character, including white space, is interpreted as a custom numeric format string.
    Question: What is an alphabetic character?"

    That particular sentence should be ignored, it's redundant. It does tell you that something like "FF" is a custom format string. But we already know that because:

    "A standard numeric format string takes the form Axx, where:"

    There's no way that FF can be a standard numeric format string, it doesn't have the form Axx.

    ""Any other single character  ||  Unknown specifier ||   Result: Throws a FormatException at run time."
    Which sort contradicts the second description."

    Nope, there is no contradiction here. The text appears in the format specifier column. We know from previous text that a standard format specifier is an alphabetic character. And as I mentioned before this "alphabetic character" thing means a-z.

    "C000   (1: No       , 2: No         3: Yes) custom numeric string via description 3"

    That's not a custom format string. That xx from Axx is misleading, it is trying to say that the maximum value is 99 but it ends up giving the impression that xx are 2 digits. That's not the case, C000 is the same thing as C0. C099 is the same thing as C99. C100 is a custom format string because 100 > 99.

    "^      (1: ??       , 2: Exception, 3: Yes) Should be a numeric custom string because is matches description 3"

    Not should be, it is a custom format string. There is no way that ^ is a standard format string because ^ is not an alphabetic character.
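
    The rule described in this reply can be written down as a small check. This is a sketch under assumptions, not the framework's actual code: HasStandardNumericShape is a made-up name, "alphabetic" is taken to mean a-z|A-Z, and the 0-99 precision limit is the one stated above, which may differ on other runtime versions.

```csharp
using System;

static class NumericFormatClassifier
{
    // Hypothetical helper: true if the string has the shape of a standard numeric
    // format string (one ASCII letter, then an optional precision from 0 to 99).
    // A string with this shape can still throw at run time if the letter is not a
    // known specifier, e.g. "z".
    public static bool HasStandardNumericShape(string format)
    {
        // A null or empty format falls back to the general ("G") specifier.
        if (string.IsNullOrEmpty(format)) return true;

        char first = format[0];
        bool isAsciiLetter = (first >= 'A' && first <= 'Z') || (first >= 'a' && first <= 'z');
        if (!isAsciiLetter) return false;

        int precision = 0;
        for (int i = 1; i < format.Length; i++)
        {
            char c = format[i];
            if (c < '0' || c > '9') return false;     // e.g. "FF" or "C 2"
            precision = precision * 10 + (c - '0');
            if (precision > 99) return false;         // e.g. "C100"
        }
        return true;                                  // e.g. "C", "C0", "C000", "C099"
    }

    static void Main()
    {
        foreach (string f in new[] { "C", "C0", "C000", "C099", "C100", "FF", "^", "z" })
        {
            Console.WriteLine("{0,-5} standard shape: {1}", f, HasStandardNumericShape(f));
        }
    }
}
```

    Anything the check rejects would, under these assumptions, be handled as a custom numeric format string.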

    Friday, September 5, 2014 3:38 PM
  • Could you provide documentary evidence that alphabetic character means A-Z | a-z ?

    Search for Alphabetic Character 

    What is meant by "alphabetic character" differs depending on context: in a regex it's A-Z | a-z, while elsewhere it is Char.IsLetter.

    Friday, September 5, 2014 3:59 PM
  • "Could you provide documentary evidence that alphabetic character means A-Z | a-z ?"

    I told you in my first post in the thread that there is no such documentation.

    Friday, September 5, 2014 4:21 PM