Show the invalid character

  • Question

  • Dear,
    It seems there is an invalid character like

    inside this file. Which tool can be used to show the invalid character?


    Many Thanks & Best Regards, Hua Min

    Friday, October 5, 2018 8:12 AM

Answers

  • Yes, I can also see the invalid character in a hex editor. How can I correct the attached file (above) to remove it?


    What invalid character? Which one is that?

    Do you mean you want to manually modify characters in the file? If so,
    maybe this is what you need:

    HxD - Freeware Hex Editor and Disk Editor
    https://mh-nexus.de/en/hxd/

    Or you can write a utility program yourself to copy the file while
    changing or deleting the characters as you please.
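
    A minimal C# sketch of such a copy utility, for illustration only. The thread never establishes which character is invalid, so the `IsAllowed` rule (printable ASCII plus tab, CR and LF) is an assumption, and `CharFilter`, `Clean` and `CopyCleaned` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

static class CharFilter
{
    // Assumed rule: keep printable ASCII plus tab, CR and LF.
    // Adjust this predicate to whatever "invalid" means for your data.
    public static bool IsAllowed(char c) =>
        c == '\t' || c == '\r' || c == '\n' || (c >= ' ' && c <= '~');

    // Returns the input with every disallowed character removed.
    public static string Clean(string text) =>
        new string(text.Where(IsAllowed).ToArray());

    // Copies inputPath to outputPath, dropping disallowed characters.
    // File.ReadAllText picks the encoding from a BOM if one is present
    // and falls back to UTF-8 otherwise.
    public static void CopyCleaned(string inputPath, string outputPath)
    {
        string text = File.ReadAllText(inputPath);
        File.WriteAllText(outputPath, Clean(text), Encoding.ASCII);
    }
}
```

    For example, `CharFilter.CopyCleaned("input.txt", "cleaned.txt")` (hypothetical paths) would write an ASCII copy with everything outside the allowed set dropped.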

    - Wayne

    • Marked as answer by Jackson_1990 Friday, October 5, 2018 3:52 PM
    Friday, October 5, 2018 10:39 AM
  • Hmm...

    It's normal to have a BOM in a Unicode document.

    If you want to convert Unicode to plain ASCII, just open the Unicode file in an editor and save it as ASCII.

    As far as I know, this can be done in Notepad. From time to time I also run into improper Unicode/ASCII encoding in SSMS.

    If you'd like to do it yourself, it's just one line:

    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, Encoding.Unicode.GetBytes(text));

    This is supposed to convert Unicode to ASCII. (In .NET, Encoding.Unicode means UTF-16 little-endian.)
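
    A slightly fuller sketch of that conversion, for illustration only. `AsciiConversion`, `ToAsciiLossy` and `TryToAsciiStrict` are hypothetical names, not from the thread. With .NET's default `Encoding.ASCII`, a character that has no ASCII equivalent comes out as `?`; an encoder built with `EncoderExceptionFallback` throws instead, which can double as a check.

```csharp
using System;
using System.Text;

static class AsciiConversion
{
    // Lossy: characters outside ASCII are replaced with '?' by the default fallback.
    public static byte[] ToAsciiLossy(string text) =>
        Encoding.Convert(Encoding.Unicode, Encoding.ASCII, Encoding.Unicode.GetBytes(text));

    // Strict: reports failure instead of silently replacing characters.
    public static bool TryToAsciiStrict(string text, out byte[] ascii)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii", new EncoderExceptionFallback(), new DecoderExceptionFallback());
        try
        {
            ascii = strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            // At least one character has no ASCII representation.
            ascii = Array.Empty<byte>();
            return false;
        }
    }
}
```

    The lossy variant never fails, so it is worth comparing its output with the input, or using the strict variant, when silent `?` substitutions would be a problem.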

    Some symbols can't be represented in ASCII: Cyrillic, for example, or German umlauts, or Latvian letters with a "garumzime". If you are looking for a way to detect such symbols: they have a code above 7F, or a non-00 lead byte, or there will be an encoding switch.
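
    The "code above 7F" check can be sketched as follows. This is an illustration only: `NonAsciiFinder` and `Find` are hypothetical names, and the scan works on an already decoded string rather than on raw bytes.

```csharp
using System;
using System.Collections.Generic;

static class NonAsciiFinder
{
    // Yields the position and value of every UTF-16 code unit above 0x7F.
    public static IEnumerable<(int Index, char Value)> Find(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] > 0x7F)
                yield return (i, text[i]);
        }
    }
}
```

    To scan a file, pass the result of `File.ReadAllText(path)` to `Find`. The indexes count UTF-16 code units, so a character outside the Basic Multilingual Plane is reported as two entries.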


    Sincerely, Highly skilled coding monkey.

    • Marked as answer by Jackson_1990 Friday, October 5, 2018 3:52 PM
    Friday, October 5, 2018 10:57 AM

All replies

  • Dear,

    can we get some more information? What encoding are you using? What kind of file is it? What are you expecting? ...

    Without enough information I have to guess, and there are some people here who hate guessing. So please provide more details.

    Greetings, Chris


    • Edited by DerChris88 Friday, October 5, 2018 11:43 AM
    Friday, October 5, 2018 8:18 AM
  • If you look at the file given above, are you able to see any invalid character? Which tool can be used to identify any invalid character inside the file?

    Many Thanks & Best Regards, Hua Min

    Friday, October 5, 2018 8:45 AM
  • What do you mean by invalid characters? Which encoding do you use? What do you expect instead of the "invalid" characters? What is an invalid character for you?

    We need more information. I opened your file and every character was shown correctly:

    H+LKCMBTSA+ARRI+181004+2359+Z1.0+++
    L+ZCSU+8738794+181004+2358+GTIN+EXP+F++++LKCMBTSA++LKCMBTSA+++++ZIM+ZIM++^^T^^^^N^^^^^^^^^^^^^^^^1^^^^
    L+TGBU+5770592+181004+2358+GTIN+EXP+F++++LKCMBTSA++LKCMBTSA+++++ZIM+ZIM++^^T^^^^N^^^^^^^^^^^^^^^^2^^^^

    Again, what do you expect?

    Greetings, Chris

    Friday, October 5, 2018 9:01 AM
  • Nope. The characters are all valid Unicode.

    So use an editor that can display Unicode. The problem may be that the file has no BOM, so you need to switch the encoding manually.
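
    In code, "switching the encoding manually" amounts to naming the encoding when decoding instead of relying on a BOM. A minimal C# sketch, for illustration only; `ManualDecoding` and `Decode` are hypothetical names:

```csharp
using System;
using System.Text;

static class ManualDecoding
{
    // Decodes raw bytes with an encoding chosen by the caller
    // instead of one detected from a BOM.
    public static string Decode(byte[] bytes, Encoding encoding) =>
        encoding.GetString(bytes);
}
```

    The bytes 48 00 2B 00 decode to "H+" with `Encoding.Unicode` (UTF-16 little-endian). Decoded as UTF-8, the same bytes give "H" and "+" each followed by a NUL character, which is the kind of output that can look like invalid characters in an editor.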

    Friday, October 5, 2018 9:05 AM
  • Stefan,
    The Mercator application detects invalid characters like

    how can we achieve the same with other tools?

    Many Thanks & Best Regards, Hua Min

    Friday, October 5, 2018 9:07 AM
  • Hello,

    Sorry, I can't reach the file; it's blocked by the firewall.

    But from the text posted by DerChris88, all characters look fine, encoded as normal ASCII... :)

    Please provide more info about the source and what you expect.


    Sincerely, Highly skilled coding monkey.

    Friday, October 5, 2018 9:07 AM
  • The file was shared there. Or I can send it to you by email.

    Many Thanks & Best Regards, Hua Min

    Friday, October 5, 2018 9:37 AM
  •  Which tool can be used to show the invalid character?

    By "tool" I assume you mean a utility program and not a programming 
    language function, library, etc.?

    In the case of on-screen display of special characters, what is shown
    often depends on the font being used, the natural language (English, 
    German, French, etc.) of the software (OS, utility, etc.) ...

    Using a file viewer capable of doing a hex display is often needed.
    An example is the integrated viewer that comes with Total Commander.
    It auto-detects that file as Unicode, and shows a character display
    like that posted by DerChris88:


    If we switch to a hex display, you can see the FF FE (BOM) in the first 
    two bytes of the file, as well as the hex values of any characters
    that don't have recognizable symbols in the on-screen font being used:
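
    The same BOM check can be sketched in C#, for illustration only. `BomSniffer`, `DetectBom` and `HexPrefix` are hypothetical names, and only a few common BOMs are recognized.

```csharp
using System;
using System.Linq;

static class BomSniffer
{
    // Names the encoding implied by the leading bytes, or returns null
    // if none of the BOMs checked here is present.
    public static string DetectBom(byte[] b)
    {
        if (b.Length >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return "UTF-8";
        // UTF-32 LE must be tested before UTF-16 LE because both start with FF FE.
        if (b.Length >= 4 && b[0] == 0xFF && b[1] == 0xFE && b[2] == 0 && b[3] == 0) return "UTF-32 LE";
        if (b.Length >= 2 && b[0] == 0xFF && b[1] == 0xFE) return "UTF-16 LE";
        if (b.Length >= 2 && b[0] == 0xFE && b[1] == 0xFF) return "UTF-16 BE";
        return null;
    }

    // Formats the first `count` bytes as space-separated hex pairs, e.g. "FF FE 48 00".
    public static string HexPrefix(byte[] b, int count) =>
        string.Join(" ", b.Take(count).Select(x => x.ToString("X2")));
}
```

    Feeding it `File.ReadAllBytes(path)` for a file that starts with FF FE followed by "H" in UTF-16 would report "UTF-16 LE", and `HexPrefix(bytes, 4)` would give "FF FE 48 00".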

    - Wayne

    Friday, October 5, 2018 9:56 AM
  • Yes, I can also see the invalid character in a hex editor. How can I correct the attached file (above) to remove it?

    Many Thanks & Best Regards, Hua Min

    Friday, October 5, 2018 10:30 AM