Why can't TcpClient read data packet by packet, as UdpClient can?

  • Question

  • hi,

    I am reading a networking book about TCP/IP. I want to get a more solid understanding of TCP and UDP through socket programming in C#. I found that by calling UdpClient.Receive() I can receive data packet by packet, but TcpClient doesn't provide a counterpart method for that. To receive TCP data, you first get a NetworkStream instance by calling TcpClient.GetStream() and then call NetworkStream.Read(buffer, 0, buffer.Length). What you get in the buffer may be one or more packets, or one and a half packets; in any case it is not packet-by-packet reading, it is byte-by-byte reading. My confusion is: why can't I read data from a TCP socket packet by packet?

    On the server side, the TCP layer receives the packets sent by the client and puts them into the receive buffer. I think the TCP layer could organize this data by packets, rather than just by bytes, so that when my C# code wants a packet from the receive buffer, it gets the first packet in the buffer. But my experiment shows that it doesn't work this way. Can you please give me an explanation? Thanks.

    The following picture shows the result of my experiment. Each time I input the data to send and press Enter, the client sends one TCP packet to the server. I verified this using Wireshark. But how many bytes the C# code gets from a call to NetworkStream.Read() is determined by how many bytes are in the receive buffer at that moment (up to the size of the buffer passed in).


    • Edited by EasyApplication Saturday, April 12, 2014 3:30 AM explain in more details about my experiment code
    Saturday, April 12, 2014 3:16 AM

Answers

  • The TCP client doesn't do it for two reasons:

    1) Fragmentation can occur anywhere in a multi-hop network, so the client doesn't really know where the end of the original datagram occurred.  In theory datagrams can be split and recombined a million different ways as long as each datagram doesn't exceed ~1500 bytes.  The TCP specification is a little vague and different vendors' implementations use different max sizes.  A vendor who sends a 1508-byte datagram that is then forwarded through a vendor who uses a limit of 1500 may see the 1508 bytes split into a datagram of 1500 and another of 8.

    2) Servers/routers/drivers don't want to increase latency, so they don't wait until the end.  Windows is a multitasking operating system and each task runs until the operating system does a swap.  The swap can occur at any time, so the most logical method is to do everything you can before getting swapped, not to wait until all the data is received or the buffer is full.

    With TCP receive, the data is held for a slight amount of time until the ack message is sent and the packets are reassembled in order.  Packets don't necessarily get received in the same order that they are sent, so the receiver must buffer the data until all the earlier packets are received.

    UDP isn't reliable and doesn't have to manage the resend/ack table.  UDP has a few different modes (including broadcast and multicast), so the code is written to be common for all the sub-modes.  In these modes the Ethernet driver is just passing the data through.
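
    The difference in behaviour can be observed on the loopback interface. The following is only a minimal sketch, not a definitive test: the class BoundaryDemo and its two methods are hypothetical names made up for illustration, and the payloads "abc" and "defgh" are arbitrary. Each UdpClient.Receive() call returns one datagram, while the sizes returned by NetworkStream.Read() depend on what happens to be in the receive buffer, so the TCP result may be two reads, one combined read, or some other split.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;

// Hypothetical demo class; not part of the .NET API.
static class BoundaryDemo
{
    // Sends two UDP datagrams over loopback and returns the size of each Receive() result.
    public static List<int> UdpReceiveSizes()
    {
        using (var receiver = new UdpClient(new IPEndPoint(IPAddress.Loopback, 0)))
        using (var sender = new UdpClient(AddressFamily.InterNetwork))
        {
            var target = (IPEndPoint)receiver.Client.LocalEndPoint;
            byte[] first = Encoding.ASCII.GetBytes("abc");
            byte[] second = Encoding.ASCII.GetBytes("defgh");
            sender.Send(first, first.Length, target);
            sender.Send(second, second.Length, target);

            receiver.Client.ReceiveTimeout = 2000; // avoid blocking forever if a datagram is lost
            var remote = new IPEndPoint(IPAddress.Any, 0);
            var sizes = new List<int>();
            sizes.Add(receiver.Receive(ref remote).Length); // one datagram per call
            sizes.Add(receiver.Receive(ref remote).Length);
            return sizes;
        }
    }

    // Sends the same two payloads over TCP and returns the size of each Read() result.
    public static List<int> TcpReadSizes()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var client = new TcpClient(AddressFamily.InterNetwork))
            {
                client.Connect(IPAddress.Loopback, port);
                using (var server = listener.AcceptTcpClient())
                {
                    NetworkStream output = client.GetStream();
                    output.Write(Encoding.ASCII.GetBytes("abc"), 0, 3);
                    output.Write(Encoding.ASCII.GetBytes("defgh"), 0, 5);

                    NetworkStream input = server.GetStream();
                    var buffer = new byte[64];
                    var sizes = new List<int>();
                    int total = 0;
                    while (total < 8) // 8 bytes were written in total
                    {
                        int n = input.Read(buffer, 0, buffer.Length);
                        if (n == 0) break; // connection closed
                        sizes.Add(n);
                        total += n;
                    }
                    return sizes;
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }
}
```

    The only thing the TCP side guarantees here is that the 8 bytes arrive complete and in order; how they are grouped across Read() calls is not something the application can rely on.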


    jdweng

    Saturday, April 12, 2014 5:15 PM

All replies

  • There shouldn't be any difference between the UDP and TCP received messages if you are sending the same amount of data.  Both are carried in IP packets.  There may be a slight difference in timing and your code must take that into consideration.

    There are basically two different types of communication applications:

    1) Chat - A person is interpreting the received messages and you don't need to synchronize the two ends of the connection.  Either end of the connection can send messages asynchronously.  The receiver just appends all the messages together and a person will interpret the results.

    2) Command and Control

    You must implement a master (client) / slave (server) relationship.  To be able to process commands, the receiving end must know where the end of the message is located and must wait for the end before proceeding.  So in your application, which is ASCII text, you need to add a '\n' to the end of each message and then have the receiver append all received data together until a '\n' is found.  This applies to both UDP and TCP.
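
    The sending half of this convention could look like the sketch below. It assumes ASCII text and '\n' as the terminator; MessageFraming and Frame are hypothetical names, not part of the .NET API. Note that Console.ReadLine() strips the line ending from what the user typed, so the terminator has to be added back before the bytes are written.

```csharp
using System;
using System.Text;

// Hypothetical helper; not part of the .NET API.
static class MessageFraming
{
    // Appends the '\n' terminator so the receiver can find the end of the message.
    public static byte[] Frame(string message)
    {
        if (message.Contains("\n"))
            throw new ArgumentException("message must not contain the terminator", nameof(message));
        return Encoding.ASCII.GetBytes(message + "\n");
    }
}
```

    A sender would then write the framed bytes, for example: byte[] framed = MessageFraming.Frame(payload); networkStream.Write(framed, 0, framed.Length);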


    jdweng

    Saturday, April 12, 2014 3:36 AM
  • Thanks for your quick reply and for sharing the two types of applications. It inspires me to think more about TCP, but it doesn't clear up my confusion.

    Here is the client side code for my experiment:

                TcpClient tcpClient = new TcpClient();

                tcpClient.Connect(IPAddress.Parse("192.168.1.103"), 10000);
                NetworkStream networkStream = tcpClient.GetStream();

                while (true)
                {
                    Console.Write("input data to send to server:");
                    string payload = Console.ReadLine();

                    byte[] payLoadBytes = Encoding.ASCII.GetBytes(payload);
                    networkStream.Write(payLoadBytes, 0, payLoadBytes.Length);
                }

    and the server side code:

                const int portNumber = 10000;
                TcpListener tcpListener = new TcpListener(IPAddress.Parse("192.168.1.103"), portNumber);
               
                tcpListener.Start();          

                TcpClient tcpClient = tcpListener.AcceptTcpClient();
                Console.WriteLine("Connection accepted.");

                NetworkStream networkStream = tcpClient.GetStream();

                while (true)
                {
                    byte[] buffer = new byte[5];

                    Console.ReadLine();
                    networkStream.Read(buffer, 0, buffer.Length);
                    Console.WriteLine("received data:" +  Encoding.ASCII.GetString(buffer));
                }

    According to your description, it seems that my application should be one of the Chat applications, is that right?
    My understanding is that it is the responsibility of the TCP and UDP protocol implementations to take care of the boundary of each packet, rather than the C# developer's responsibility. This should be a built-in feature of TCP.
    If a developer works directly on top of sockets, what the application developer needs to care about is partitioning a large application payload into small transport-layer payloads, and assembling multiple TCP packets to get the payload from the application layer's perspective. I drew the following picture to express my understanding.

    If a developer works on top of a higher level such as WCF or ASP.NET web services, the developer doesn't even need to care about how to assemble multiple TCP packets into the application-layer payload; that heavy lifting is done by Microsoft's WCF/web service layer.

    Please correct me if my understanding is wrong.

    I googled this topic. One article says that there are Dgram and Stream SocketTypes. Then I inspected the SocketType of my application in the debugger: tcpClient.Client.SocketType is SocketType.Stream. I also inspected the SocketType of my UDP test application: udpClient.Client.SocketType is SocketType.Dgram. But I cannot understand why a TCP socket cannot use a Dgram-like SocketType rather than the Stream SocketType.
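
    The SocketType observation can be reproduced without a debugger. This is only a small sketch; SocketTypeDemo and its methods are hypothetical names, and the sockets are created for IPv4 without connecting or binding to anything.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical demo class; not part of the .NET API.
static class SocketTypeDemo
{
    // SocketType of the socket underlying a TcpClient.
    public static SocketType TcpSocketType()
    {
        using (var tcp = new TcpClient(AddressFamily.InterNetwork))
            return tcp.Client.SocketType;
    }

    // SocketType of the socket underlying a UdpClient.
    public static SocketType UdpSocketType()
    {
        using (var udp = new UdpClient(AddressFamily.InterNetwork))
            return udp.Client.SocketType;
    }
}
```

    Printing the two values with Console.WriteLine shows Stream for the TCP client and Dgram for the UDP client, matching what the debugger reports.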


    Saturday, April 12, 2014 11:19 AM
  • TCP data is broken into datagrams with a max size of approximately 1500 bytes by the Ethernet driver.  Each datagram can be routed differently and they can be received in random order.  The Ethernet driver on the rx side of the connection reassembles the data in the correct order.  Routers can also break the 1500-byte datagrams any way they choose.

    In VS both ends of a connection use a stream to connect your VS application to the Ethernet driver.  Streams are processed by a service on the PC which periodically checks if there is data.  The Ethernet driver uses async to read/write the stream, so data will be processed in random-size chunks.

    The solution is to use code like below.  WAIT for the '\n', like I said previously, before processing the data.

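     string rxString = "";
     while (true)
     {
          byte[] buffer = new byte[5];

          Console.ReadLine(); // pause until Enter is pressed, as in the experiment
          // Read returns how many bytes were actually placed in the buffer
          int bytesRead = networkStream.Read(buffer, 0, buffer.Length);
          if (bytesRead == 0) break; // the remote side closed the connection
          rxString += Encoding.ASCII.GetString(buffer, 0, bytesRead);
          if (rxString.Contains("\n"))
          {
             // print everything accumulated so far, not just the last chunk
             Console.WriteLine("received data:" + rxString);
             rxString = "";
          }
     }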
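
    A single read can also contain the end of one message and the start of the next, or several complete messages at once. The sketch below extends the same wait-for-'\n' idea to handle that; LineAssembler is a hypothetical helper written for illustration (not part of the .NET API), and it assumes ASCII text with '\n' as the terminator.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: collects raw chunks and yields complete '\n'-terminated messages.
sealed class LineAssembler
{
    // Bytes received so far that do not yet end in '\n'.
    private readonly StringBuilder pending = new StringBuilder();

    // Feed the bytes returned by one Read call; returns zero or more complete messages.
    public List<string> Append(byte[] buffer, int count)
    {
        pending.Append(Encoding.ASCII.GetString(buffer, 0, count));
        string text = pending.ToString();
        var messages = new List<string>();
        int start = 0;
        int end;
        while ((end = text.IndexOf('\n', start)) >= 0)
        {
            messages.Add(text.Substring(start, end - start)); // message without the terminator
            start = end + 1;
        }
        // Keep the incomplete tail for the next call.
        pending.Clear();
        pending.Append(text, start, text.Length - start);
        return messages;
    }
}
```

    In the receive loop this would be used roughly as: int n = networkStream.Read(buffer, 0, buffer.Length); foreach (string message in assembler.Append(buffer, n)) Console.WriteLine("received data:" + message);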
    


    jdweng

    Saturday, April 12, 2014 1:22 PM
  • I know your solution works. But I think this should be done by TcpClient internally.

    Besides, unlike TcpClient, UdpClient provides packet-by-packet receiving. When the following udpServer.Receive() is executed, it returns exactly one UDP payload, no matter how many payloads are in the receive buffer at that moment. How can this be explained for UdpClient? Thank you.

                UdpClient udpServer = new UdpClient(11001);
               
                while (true)
                {
                    var remoteEP = new IPEndPoint(IPAddress.Any, 11001);
                    Console.ReadLine();
                    var payloadReceivedBinary = udpServer.Receive(ref remoteEP); // blocks until one datagram arrives on port 11001
                    UTF8Encoding encoding = new UTF8Encoding();
                    string payloadReceivedString = encoding.GetString(payloadReceivedBinary);

                    Console.WriteLine("receive data from " + remoteEP.ToString() + ":" +  payloadReceivedString);

               }

    Saturday, April 12, 2014 2:07 PM