Transferring data...
-
wrote on 4 Oct 2021, 11:44 last edited by AxelVienna 10 Apr 2021, 11:45
@SPlatten
Putting aside the TCP/UDP discussion for a minute, the simplicity-driven speed of UDP is basically limited by your physical environment. The copy_calc speed is based on the assumption that the given bandwidth is fully available between your traffic endpoints. USB traffic (without USB hubs!) is the only use case where this assumption always holds true. You are sending UDP broadcasts or multicasts across your network. The first bottleneck is your server's Ethernet adapter: your maximum outbound speed is the server's adapter speed divided by the number of clients. The next bottleneck is your router/hub/switch, which comes with a number of different potential factors that will slow down your traffic:
(1) Physical limitations
Depending on how new and fast your router/hub/switch is.
(2) Configuration
Protocols, ports and endpoints may or may not be prioritized.
Every packet must be inspected to determine its priority.
So even if your UDP traffic enjoys top priority, the inspection itself slows it down.
(3) Logical factors
How connections are handled between two endpoints largely depends on internal algorithms in your router/hub/switch hardware. Some switches have UDP optimizers which basically accept a packet and send it to all multicast/broadcast addresses at the same time. Others buffer and serialize.
The speed of optimized UDP fully unfolds only if it is truly unidirectional. Since your environment works with bi-directional features (basically a limited TCP re-implementation), this may slow down your hardware performance additionally. Since UDP optimizers take the risk of packet loss for the sake of speed, your hardware configuration may be a trigger for your bi-directional features, so you end up in a vicious circle: UDP optimization => packet loss => bi-directional features => more packet loss => more bi-directional features, and so on.
Having said that: while your case is certainly interesting, I am not surprised that you don't get the speed you want. Debugging your bi-directional features (e.g. requests to resend missing packets) will give you an idea of where the troublemaker is located. Your hardware is most likely punishing you for using TCP-like features on top of UDP. Moreover, you lose TCP-based optimization features which your hardware may even provide.
This is probably not the solution you expected; however, I hope it sheds some light on the matter.
-
wrote on 4 Oct 2021, 11:47 last edited by
@AxelVienna , thank you, I'm developing on a supplied laptop where presently the client and server are on the same system; this isn't how it will be when rolled out.
-
wrote on 4 Oct 2021, 11:49 last edited by
@SPlatten said in Transferring data...:
this isn't how it will be when rolled out.
I was going to ask about this earlier. One trouble I foresee is how you will know how your approach fares in another environment, particularly with regard to required retries. With TCP you may not know the speed, but you do know it will be reliable. With your UDP approach I don't know how you can anticipate its performance in a distributed environment.
-
@JonB , I'm going to look into using QTcpSocket now; is there any good quick-start example I can look at, or is it simply a matter of writing the data to the socket?
wrote on 4 Oct 2021, 11:53 last edited by
@SPlatten From the documentation I posted above:
https://doc.qt.io/qt-5/qtnetwork-fortuneclient-example.html
https://doc.qt.io/qt-5/qtnetwork-blockingfortuneclient-example.html
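For orientation, here is a minimal sketch of the kind of blocking client those examples flesh out; the host address, port and payload below are placeholders, not taken from the examples:

```cpp
#include <QCoreApplication>
#include <QDebug>
#include <QTcpSocket>

int main(int argc, char *argv[])
{
    QCoreApplication app(argc, argv);

    QTcpSocket socket;
    // Hypothetical server address and port; adjust for your setup.
    socket.connectToHost("192.168.1.10", 5000);
    if (!socket.waitForConnected(3000))
        return 1;

    // write() hands the bytes to the OS; TCP segments, orders and retransmits for you.
    socket.write("hello server\n");
    socket.waitForBytesWritten(3000);

    // Blocking read of whatever the server sends back, as in the blocking fortune client.
    if (socket.waitForReadyRead(3000))
        qDebug() << socket.readAll();

    socket.disconnectFromHost();
    return 0;
}
```

The non-blocking fortune client does the same job with the connected() and readyRead() signals instead of the waitFor* calls.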
-
wrote on 4 Oct 2021, 12:02 last edited by
@SPlatten said in Transferring data...:
@AxelVienna , thank you, I'm developing on a supplied laptop where presently the client and server are on the same system; this isn't how it will be when rolled out.
With client and server hosted on the same machine, the vicious circle I described is still likely to happen: blocking code or performance issues on the client side will cause packet loss.
-
I have been working on a C++ class to transfer a large amount of data to clients. Presently the server and client are on the same system, which is on a 100MB/s network. The file is 1.17GB; what is the best way to transfer this data, as presently it takes around 20 minutes to transfer?
I've seen various online calculators which give 1 minute 45 seconds as the transfer time.
https://techinternets.com/copy_calc
I can't see how this can be.
@SPlatten said in Transferring data...:
presently it takes around 20 minutes to transfer.
Your implementation must be rather dubious, I'd say. A 10/100 network (the typical cat5(e) UTP without much noise on the channel) will easily give you ~10 MB/s transfer speed (in reality, not theoretical) over plain TCP, which should sum up to just about under 2 minutes.
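To spell out the arithmetic behind that estimate (rough numbers, not taken from the linked calculator): 100 Mbit/s is 12.5 MB/s raw, and after Ethernet/IP/TCP framing overhead something like 10-11 MB/s of payload is realistic, so a 1.17 GB file is roughly 1200 MB / 10 MB/s ≈ 120 s, i.e. about 2 minutes.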
-
wrote on 4 Oct 2021, 18:41 last edited by
@kshegunov said in Transferring data...:
under 2 minutes
Yes, it looks like the expected value... See this calculator for instance.
1.7 GB over 100 Mbps with 10% overhead -> 2m 33sec
Instead of developing your own file transfer protocol over UDP, have you considered TFTP, for example?
It's quite widely used for initial remote file transfer/configuration of network devices (cable modems, IP phones, etc.).
I guess you can even find already-implemented TFTP servers for free.
-
@jsulm , all I can go on is the data that's in front of me. TCP packets can send 1.5K, UDP packets can send 64K; what isn't clear?
wrote on 4 Oct 2021, 21:32 last edited by
@SPlatten said in Transferring data...:
@jsulm , all I can go on is the data that's in front of me. TCP packets can send 1.5K, UDP packets can send 64K; what isn't clear?
What isn't clear is why you keep making this claim.
TCP uses a window for flow control rather than a packet size, because a TCP stream represents a sequence of bytes. An implementation may send that sequence via one or more IP packets. The window field in the TCP header is 16 bits, allowing the sender to advertise 64 kilobytes of available space. Window scaling effectively extends it to 32 bits.
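In Qt terms this means the application never has to think about a per-packet limit. As a sketch (the helper name and the 64K read size are arbitrary choices for illustration, not anything mandated by the API):

```cpp
#include <QFile>
#include <QTcpSocket>

// Illustrative helper: stream an entire file through an already-connected QTcpSocket.
// The TCP/IP stack splits the byte stream into packets sized to the path MTU;
// the 1.5K figure is an Ethernet frame limit, not an application-level limit.
bool sendFile(QTcpSocket &socket, const QString &path)
{
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
        return false;

    while (!file.atEnd()) {
        const QByteArray chunk = file.read(64 * 1024); // any convenient size
        socket.write(chunk);                           // queued as part of the byte stream
        socket.waitForBytesWritten(-1);                // let the OS drain its buffer
    }
    return true;
}
```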
-
wrote on 4 Oct 2021, 22:24 last edited by@jeremy_k The absolute limitation on TCP packet size is 64K (65535 bytes), but in practicality this is far larger than the size of any packet you will see, because the lower layers (e.g. ethernet) have lower packet sizes. The MTU (Maximum Transmission Unit) for Ethernet, for instance, is 1500 bytes.
-
wrote on 4 Oct 2021, 22:36 last edited by jeremy_k 10 Apr 2021, 22:36
@JoeCFD said in Transferring data...:
@jeremy_k The absolute limitation on TCP packet size is 64K (65535 bytes),
You seem to have missed the window scaling link.
but in practicality this is far larger than the size of any packet you will see, because the lower layers (e.g. ethernet) have lower packet sizes. The MTU (Maximum Transmission Unit) for Ethernet, for instance, is 1500 bytes.
This is at least the second time this conversation has occurred. https://forum.qt.io/topic/130769/qudpsocket-speeding-up/9
The same limitation will apply to UDP packets. I.e., if @SPlatten says that a practical UDP packet can be 64k octets over a given interface, a single TCP packet could do the same.
-
@artwaw , thank you, I am using UDP as the protocol because of the larger packets it is capable of sending; however, there is a two-way transaction for every packet. In the system I've developed, the server sends a message to the clients notifying them that a transfer is ready; this consists of a JSON packet:
{"DSID" : 1, /* Data Set ID */ "RDF" : "/folder/name.rdf", /* Path and name of file to transfer */ "Filesize" : 1234, /* File size in bytes */ "Totalblocks" : 1234} /* Number of blocks, where a block consists of N bytes */
Each client will then start to issue requests for each block, where a block will contain N bytes of the file as stored in a database. Client request:
{"DSID" : 1, /* Data Set ID */ "BlockNo" : 0} /* Block number to request, 0 to Totalblocks-1 */
The server will respond to each request with:
{"DSID" : 1, /* Data Set ID */ "BlockNo" : 0, /* Block number, 0 to Totalblocks-1 */ "Checksum" : 0x1234, /* Checksum of hex bytes for validation */ "Chunk" : "hex bytes"} /* String containing hex nibbles */
The client requests each block until all blocks have been received. The client will also verify that the received data is correct by recalculating the checksum and comparing it with the received checksum.
This process isn't quick and typically takes around 20 minutes to transmit a GB file.
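For illustration, roughly how a client might build and send one of those block-request messages with Qt's JSON and UDP classes; the field names come from the description above, while the function name, server address and port are made-up placeholders (this is a sketch, not the actual implementation):

```cpp
#include <QHostAddress>
#include <QJsonDocument>
#include <QJsonObject>
#include <QUdpSocket>

// Sketch of the client-side block request described above.
void requestBlock(QUdpSocket &socket, int dsid, int blockNo)
{
    QJsonObject request;
    request["DSID"] = dsid;       // Data Set ID
    request["BlockNo"] = blockNo; // Block number to request, 0 to Totalblocks-1

    const QByteArray datagram = QJsonDocument(request).toJson(QJsonDocument::Compact);
    // Placeholder server address and port.
    socket.writeDatagram(datagram, QHostAddress("192.168.1.10"), 45454);
}
```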
wrote on 4 Oct 2021, 23:25 last edited by
Coming back to the original issue for a moment, ignoring the argument regarding TCP/UDP, the most obvious issue I see with this scheme is that the transfer encoding has doubled the number of bytes sent:
The server will respond to each request with:
{"DSID" : 1, /* Data Set ID */ "BlockNo" : 0, /* Block number to request, 0 to Totalblocks-1 */ "Checksum" : 0x1234, /* Checksum of hex bytes for validation */ "Chunk" : "hex bytes"} /* String containing hex nibbles */
1000 bytes represented as a hex string needs 2000 bytes (plus the other overhead you see above). The useful throughput has been halved by this decision alone. Base64 encoding into the string would be better, with 4 bytes sent for each 3 bytes in.
The original post confuses megabytes per second (MBps) with megabits per second (Mbps), but provides a time estimate consistent with the megabits interpretation. At 100 megabits/second, a 1.17GB file encoded in hex will send ~2.34GB, taking at least around 3.5 minutes according to the OP's calculator.
Using a half-duplex protocol on top of UDP further reduces throughput.
TCP packets are limited to 1.5K; I can only assume it will take significantly longer.
I think you are confusing the maximum transmission unit (MTU) at the physical layer with the protocol layer (i.e. TCP, UDP etc). If you have, for example, an Ethernet connection with a 1500 byte MTU then any chunk of data sent over that interface will be broken into packets smaller than this regardless of their origin (UDP, TCP, ICMP or any other exotica). Your 64k maximum UDP datagram will be broken up in <=1500-byte physical packets just the same as a TCP stream of 64k will be. This fragmentation and reassembly is transparent to you (just as the sequencing, acknowledgement, retransmission and pipelining done for you by TCP is).
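To put a number on the encoding overhead, a quick sketch with QByteArray (the 1000-byte buffer is just an arbitrary stand-in for a block of file data):

```cpp
#include <QByteArray>
#include <QDebug>

int main()
{
    const QByteArray raw(1000, 'x');        // stand-in for 1000 bytes of file data

    const QByteArray hex = raw.toHex();     // 2 characters per byte   -> 2000 bytes
    const QByteArray b64 = raw.toBase64();  // 4 characters per 3 bytes -> 1336 bytes

    qDebug() << raw.size() << hex.size() << b64.size(); // 1000 2000 1336
    return 0;
}
```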
-
wrote on 5 Oct 2021, 04:56 last edited by
@ChrisW67 , thank you Chris, you hit the nail on the head. I was thinking that the 1.5K in TCP was the number of bytes it was capable of sending per second, which is why I couldn't see how it was then capable of transmitting such large amounts of data in a second.
-
wrote on 5 Oct 2021, 09:10 last edited by
You can also try changing the NIC adapter MTU to use Jumbo Frames (9K), but I think that this will only work if your devices are connected through a switch that supports Jumbo Frames.
-
@ollarch said in Transferring data...:
You can also try changing the NIC adapter MTU to use Jumbo Frames (9K) but think that this will only work if your devices are connected through a switch that supports Jumbo Frames.
Yes, however increasing the frame size isn't necessarily going to give you more throughput. The MTU is chosen to be relatively small for a reason: damaging a frame (e.g. TP noise leading to a failing CRC) means you need to resend it. Having larger packets means a higher probability of a faulty bit, and resending a larger packet also means more time (and bytes) wasted. Yes, there's overhead with the smaller packets, but it's also more versatile and somewhat economical considering you're not transmitting over an ideal channel.
-
wrote on 5 Oct 2021, 10:04 last edited by
Yes, of course it depends on physical conditions, which influence how many resends there are under normal conditions.
Under ideal conditions it will be faster, but if there are resends it could be worse.