Byte array to string

Question
-
Hello,
I receive a byte array and have to convert it into a string.
I only need one character in my string.
How can I reach my goal?
    resToClient = (char)2 + "PING" + (char)4 + "WORK" + (char)4 + "ACK" + (char)3;

    // Current output:
    Encoding.UTF8.GetString(buffer, 0, sizeReceive)   // "\u0002PING\u0004WORK\u0003"
    Encoding.ASCII.GetString(new byte[]{ 2 });        // "\u0002" (string)

    // Should be:
    // For ASCII 2 (STX), 3 (ETX), 4 (EOT) I need only one character inside my string, not more.
Thanks in advance for your help.
Greetings Markus.
My attempts were:
    var buffer = new byte[10000];
    sizeReceive = tcpClient.Client.Receive(buffer, 0, buffer.Length, SocketFlags.None);
    receiveBuffer += Encoding.UTF8.GetString(buffer, 0, sizeReceive).Replace('\u0002', '2').Replace('\u0003', '3').Replace('\u0004', '4');

    byte[] chars = new byte[sizeReceive];
    System.Buffer.BlockCopy(buffer, 0, chars, 0, sizeReceive);
    //receiveBuffer += new string(chars);
    foreach (var item in chars)
    {
        if (item < 30)
        {
            int t = (char)item;
            receiveBuffer += (char)t;
        }
        else
            receiveBuffer += Convert.ToChar(item);
    }
Monday, April 20, 2020 4:05 PM
Answers
-
Hi markus,
For the example string, we can split it like this:
var result = str.Split(new char[] { (char)4 });
Or
var result2 = str.Split(new string[] { "\u0004" },StringSplitOptions.None);
Even if it is a byte array, we can see where the character EOT(4) is.
We can split it into 3 new byte arrays based on the index, and then do something.
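As a minimal sketch of the string-based approach, assuming the frame layout from the question (STX at the start, ETX at the end, EOT between the fields): strip the two framing characters first, then split on EOT. The names `FrameText` and `SplitFields` are made up for this illustration.

```csharp
using System;

public static class FrameText
{
    private const char Stx = (char)2; // start of text
    private const char Etx = (char)3; // end of text
    private const char Eot = (char)4; // field separator in this protocol

    // Removes any leading STX and any trailing ETX, then splits the rest on EOT.
    public static string[] SplitFields(string frame)
    {
        string payload = frame.TrimStart(Stx).TrimEnd(Etx);
        return payload.Split(Eot);
    }
}
```

Calling `FrameText.SplitFields("\u0002PING\u0004WORK\u0004ACK\u0003")` returns the three strings "PING", "WORK" and "ACK".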
I hope I did not misunderstand you again.
Best Regards,
Timon
MSDN Community Support
Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.
- Edited by Timon Yang, Microsoft contingent staff Wednesday, April 22, 2020 5:47 AM
- Marked as answer by Markus Freitag Saturday, April 25, 2020 10:51 AM
Wednesday, April 22, 2020 5:47 AM -
Hi Markus,
In the past, I've done something like this for processing data coming in from a TCP Listener (it sounds like that is what you're doing, or something similar):
    // Note: RemoveDataPrecedingStart, IsNotNullOrEmpty and LogOutput are helpers not shown in this post.
    private string ValidStart = new string(new char[] { (char)2 });
    private string ValidEnd = new string(new char[] { (char)3 });

    public List<string> ProcessDataReceived(ref string buffer, string dataReceived)
    {
        List<string> DataList = new List<string>();
        try
        {
            //add received data to the buffer
            buffer += dataReceived;
            int whileLoopWatchdog = 0;
            while (buffer.Length > 0)
            {
                this.RemoveDataPrecedingStart(ref buffer);

                //verify that a complete message is present, if so process just that
                string Complete = this.ProcessCompleteMessage(ref buffer);
                if (Complete.IsNotNullOrEmpty())
                {
                    //string Xml;
                    //if (this.IncludeDelimiters)
                    //    Xml = Complete;
                    //else
                    //    Xml = this.ParseOutDelimiters(Complete);
                    //
                    //DataList.Add(Xml);
                    DataList.Add(Complete);

                    //remove the data that was pulled out for processing
                    //this will replace multiple identical occurrences but that
                    //is fine
                    buffer = buffer.Replace(Complete, "");
                }

                //HACK: prevent endless while loop
                //(although 20+ complete sets of data at once probably wouldn't
                //happen, and if it did the remainder would be caught on
                //next data receive)
                whileLoopWatchdog++;
                if (whileLoopWatchdog > 20)
                {
                    whileLoopWatchdog = 0;
                    break;
                }
            }
        }
        catch (Exception ex)
        {
            LogOutput.WriteLine(ex, "DataParser Exception");
        }
        return DataList;
    }

    protected virtual string ProcessCompleteMessage(ref string buffer)
    {
        string Complete = "";
        if ((buffer.StartsWith(ValidStart)) && (buffer.Contains(ValidEnd)) && (buffer.IndexOf(ValidStart) < buffer.IndexOf(ValidEnd)))
        {
            Complete = buffer.Substring(buffer.IndexOf(ValidStart), buffer.IndexOf(ValidEnd) + ValidEnd.Length - buffer.IndexOf(ValidStart));
        }
        return Complete;
    }
Hope that helps! =0)
~~Bonnie DeWitt [C# MVP]
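For comparison, here is a compact variation on the same buffering idea. It is only a sketch and not the code from the post: it slices by index instead of calling `Replace`, uses the `char` overloads of `IndexOf` (which compare ordinally), and assumes that anything received before an STX can be thrown away. `FrameExtractor` and `Extract` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FrameExtractor
{
    private const char Stx = (char)2;
    private const char Etx = (char)3;

    // Pulls every complete STX...ETX message out of buffer and leaves any
    // incomplete tail in buffer for the next receive.
    public static List<string> Extract(ref string buffer)
    {
        var messages = new List<string>();
        while (true)
        {
            int start = buffer.IndexOf(Stx);
            if (start < 0)
            {
                buffer = ""; // assumption: data without an STX is discarded
                break;
            }

            int end = buffer.IndexOf(Etx, start + 1);
            if (end < 0)
            {
                buffer = buffer.Substring(start); // keep the partial message
                break;
            }

            // the message includes both delimiters
            messages.Add(buffer.Substring(start, end - start + 1));
            buffer = buffer.Substring(end + 1);
        }
        return messages;
    }
}
```

Append each decoded chunk to a `string` buffer and call `Extract(ref buffer)` after every receive; a message that arrives in two pieces is returned once its ETX shows up.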
- Marked as answer by Markus Freitag Saturday, April 25, 2020 10:51 AM
Wednesday, April 22, 2020 3:04 PM -
>But how would you split the byte array? It's an illusion, like Tim said.
What's an illusion?
The character shown as \u0004 (or \u0002 or \u0003) is a single byte, as Tim
said. Since it has no visual symbol to represent it in a display of characters
such as a string, it is shown as its escaped Unicode value. You can view it as
a hex value as well: 0x4 or 0x0004 etc. Or as a decimal value, as Timon said
and illustrated.
As to splitting the byte array, you haven't given enough details. What exactly,
if anything, is supposed to be done with the STX and ETX characters? Are they
to be stripped off? Or left attached to the substrings?
Or is the parsing to be done only on the text bounded by the STX and ETX
characters? Given the string of bytes in your example, what *exactly* should
the substrings (of bytes) contain?
Are the substrings to be in byte arrays? An array of byte arrays? An array of
strings?
Finding the EOT in a byte array can be done like so:
    byte[] bar = { (byte)'\u0002', (byte)'P', (byte)'I', (byte)'N', (byte)'G',
                   (byte)'\u0004', (byte)'W', (byte)'O', (byte)'R', (byte)'K',
                   (byte)'\u0003' };
    int idx4 = Array.IndexOf(bar, (byte)0x4); // here idx4 == 5
With some length calculations for array sizing, you can selectively copy
parts of the source byte array into other byte arrays. e.g. -
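    // size the two destination arrays (declarations added; not in the original post)
    byte[] bar1 = new byte[idx4];
    byte[] bar2 = new byte[bar.Length - idx4 - 1];
    // copy from start of source to EOT - 1
    Array.Copy(bar, 0, bar1, 0, idx4);
    // copy from EOT + 1 to end of source
    Array.Copy(bar, idx4 + 1, bar2, 0, bar.Length - idx4 - 1);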
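The same `Array.IndexOf` / `Array.Copy` idea extends to any number of EOT separators. The following is a sketch only, and it assumes answers to the open questions above: STX and ETX are stripped, and the result is a list of byte arrays, one per field. `FrameBytes` and `SplitFields` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class FrameBytes
{
    private const byte Stx = 2;
    private const byte Etx = 3;
    private const byte Eot = 4;

    // Returns the fields between STX and ETX, split on EOT, as byte arrays.
    // Returns an empty list if there is no STX followed by an ETX.
    public static List<byte[]> SplitFields(byte[] frame)
    {
        var fields = new List<byte[]>();
        int start = Array.IndexOf(frame, Stx);
        int end = start < 0 ? -1 : Array.IndexOf(frame, Etx, start + 1);
        if (start < 0 || end < 0) return fields;

        int pos = start + 1; // first payload byte
        while (pos <= end)
        {
            // next EOT inside the payload, or the ETX if none is left
            int sep = Array.IndexOf(frame, Eot, pos, end - pos);
            if (sep < 0) sep = end;

            var field = new byte[sep - pos];
            Array.Copy(frame, pos, field, 0, field.Length);
            fields.Add(field);
            pos = sep + 1; // continue after the separator
        }
        return fields;
    }
}
```

For the 11-byte example array this yields two fields holding the bytes of "PING" and "WORK". Each field can be decoded on its own with `Encoding.ASCII.GetString(field)` when text is needed.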
E&OE
- Wayne
- Marked as answer by Markus Freitag Saturday, April 25, 2020 10:51 AM
Wednesday, April 22, 2020 11:44 PM -
are using for STX, ETX, EOT. For example:
    byte[] ba_ex = { (byte)'\u0002', (byte)'\u0050', (byte)'\u0049', (byte)'\u004e',
                     (byte)'\u0047', (byte)'\u0004', (byte)'\u0057', (byte)'\u004f',
                     (byte)'\u0052', (byte)'\u004b', (byte)'\u0003' };
    string strx = Encoding.UTF8.GetString(ba_ex);
Guess what strx contains?
See:
char (C# reference)
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/char
Excerpt:
"Literals
You can specify a char value with:
- a character literal.
- a Unicode escape sequence, which is \u followed by the four-symbol
hexadecimal representation of a character code.
- a hexadecimal escape sequence, which is \x followed by the hexadecimal
representation of a character code."
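A small sketch of what the excerpt means in practice: the different literal forms are just different spellings of the same `char` value. The class and method names are made up for this illustration.

```csharp
using System;
using System.Text;

public static class CharLiteralDemo
{
    // Four spellings of the letter P; all produce the same char value.
    public static bool AllEqual()
    {
        char a = 'P';       // character literal
        char b = '\u0050';  // Unicode escape sequence
        char c = '\x50';    // hexadecimal escape sequence
        char d = (char)80;  // cast from the decimal code
        return a == b && b == c && c == d;
    }

    // Decodes bytes given as hex codes; the result is ordinary text
    // framed by the control characters.
    public static string Decode()
    {
        byte[] bytes = { 0x02, 0x50, 0x49, 0x4e, 0x47, 0x04, 0x57, 0x4f, 0x52, 0x4b, 0x03 };
        return Encoding.UTF8.GetString(bytes);
    }
}
```

`AllEqual()` returns true, and `Decode()` returns the same string as the literal "\u0002PING\u0004WORK\u0003".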
If you open the Character Map utility and hover over any symbol in the font
selected you will see the Unicode value for that symbol/character.
- Wayne
- Marked as answer by Markus Freitag Saturday, April 25, 2020 10:51 AM
Thursday, April 23, 2020 6:11 PM
All replies
-
The string "\u0002" contains exactly one character, with the Unicode value U+0002.
I suspect you are being fooled by the debugger, which is trying to show you special characters that cannot themselves be printed. It's exactly like the string "\r\n", which contains exactly two characters: return and linefeed.
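This is easy to check in code rather than in the debugger. A minimal sketch (the class and method names are made up for this illustration):

```csharp
using System;
using System.Text;

public static class ControlCharDemo
{
    // Decodes a single byte with value 2 and returns the string length.
    public static int DecodedLength()
    {
        string s = Encoding.ASCII.GetString(new byte[] { 2 });
        return s.Length; // the "\u0002" in the debugger is only a display form
    }

    // Compares the escape sequence with the cast used in the question.
    public static bool SameChar()
    {
        return "\u0002"[0] == (char)2;
    }
}
```

`DecodedLength()` returns 1 and `SameChar()` returns true, so a string built with `(char)2` and one decoded from the byte 2 hold the same single character.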
If you are dealing with ASCII characters and special characters, then you are likely to confuse yourself by converting back and forth to Unicode. Why not just leave it as a byte array?
Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.
Tuesday, April 21, 2020 5:35 AM -
Hi Markus,
Thank you for posting here.
Characters such as STX, ETX and EOT are called control characters; they are also known as non-printable characters.
They are not displayed as "STX" in the string. If you want to see that text, I am afraid you have to type those letters manually.
Best Regards,
Timon
Tuesday, April 21, 2020 7:51 AM -
>Why not just leave it as a byte array?
I receive a string and have to analyze the elements.
If I have a string, I can use split, that's why I converted.
resToClient = (char)2 + "PING" + (char)4 + "WORK" + (char)4 + "ACK" + (char)3;
If I create it like this, I have exactly 1 character in the string.
With "\u0002" I have 5, so I can't split with (char)4.
2 P I N G 4
0 1 2 3 4 5
@Tim, can you give a sample of how you would split it as a byte array?
- Start is STX
- End is ETX
- EOT is the separator
Thanks in advance.
Greetings Markus
Tuesday, April 21, 2020 10:33 AM -
Hello Timon,
OK, that's the same.
var result = str.Split(new char[] { (char)4 });
var result2 = str.Split(new string[] { "\u0004" },StringSplitOptions.None);
That is clear.
>We can split it into 3 new byte arrays based on the index, and then do something.
>I hope I did not misunderstand you again.
I think you understand what I need.
But how would you split the byte array? It's an illusion, like Tim said.
Greetings Markus
- Edited by Markus Freitag Thursday, April 23, 2020 5:20 PM
Wednesday, April 22, 2020 3:50 PM -
Dear All,
First, thank you all for the responses.
>I'm confused by your confusion. 8=O
What do you mean by that? Oh, my God ;-)
>Are the substrings to be in byte arrays?
>An array of byte arrays?
>An array of strings?
It's clear to me now.
The array can be small or large.
<STX>Group<EOT>Element1<EOT>Element2<EOT>Element2 .... <EOT>ElementN<ETX>
Yes, I can parse the array with IndexOf.
--> Without converting to a string.
Do you have another good tip or example?
If yes, very nice. I think I should be able to solve this; I want to have a good structure.
Greetings Markus
Thursday, April 23, 2020 5:22 PM -