p2psp team mailing list archive

Re: NAT Traversal Set of rules implementation

 

On Sun, Aug 2, 2015 at 11:00 PM Max Mertens <max.mail@xxxxxxxxxx> wrote:

> Hi Vicente,
>

Hello!


>
> sorry for the long delay; I thought I would gather more results and write
> them together.
>

It's OK.


>
> This week I did a lot of further tests, and finally determined the
> behaviour with SYMPP<->PRCN and SYMPP<->SYMPP connections (by tracking
> every packet with iptables logging):
> The SYMPP NAT (as specified by the iptables rules) sends an ICMP
> Destination Unreachable packet if there is no NAT entry assigned to this
> port. Apparently the PRCN and SYMPP NATs immediately close the NAT entry if
> such an ICMP packet is received, and somehow remember the address;port
> tuple as "not responding", so that further connection attempts will fail.
> This has the following consequences:
> In the SYMPP<->SYMPP scenario, the connection only works if the two peers
> send the first packet to each other "simultaneously" (*).
> In the SYMPP<->PRCN scenario, the connection works if the two peers send
> the first packet to each other "simultaneously" (*), or if the peer behind
> the SYMPP router sends its first packet before the other peer.
>
>
OK. Good find.
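
For reference, the ordering rules above can be written down as a small Python sketch. The function name and its arguments are made up for illustration and are not part of the NTS code; it only covers the two scenarios that were measured.

```python
def connection_expected(nat_a, nat_b, first_sender=None):
    """Return True if the observed rules predict a working connection.

    nat_a, nat_b: NAT types of peer A and peer B ("sympp" or "prcn").
    first_sender: "a" or "b" if that peer clearly sends its first packet
                  earlier, or None if both first packets count as
                  "simultaneous" in the sense of the (*) footnote.
    """
    nats = {"a": nat_a.lower(), "b": nat_b.lower()}
    if sorted(nats.values()) not in (["prcn", "sympp"], ["sympp", "sympp"]):
        raise ValueError("only SYMPP<->SYMPP and SYMPP<->PRCN are covered")
    if first_sender is None:
        # Simultaneous first packets work in both scenarios.
        return True
    if first_sender not in nats:
        raise ValueError("first_sender must be 'a', 'b' or None")
    if nats["a"] == nats["b"]:
        # SYMPP<->SYMPP: only the simultaneous case works.
        return False
    # SYMPP<->PRCN: works if the peer behind the SYMPP router sends first.
    return nats[first_sender] == "sympp"


print(connection_expected("sympp", "prcn", first_sender="a"))
```
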


> (*) Supposing that peer 1 sends its first packet to peer 2,
> "simultaneously" here means that the NAT entry at peer 2 has to be created
> before the packet from peer 1 arrives, or the packet from peer 2 has to
> arrive at peer 1 before the ICMP packet, in order to mark the NAT entry at
> peer 1 as assured.
> The time difference between the first packets from each peer is determined
> by the packet jitter (delay variation) of the splitter plus the difference
> in delay between the two peers. This time difference has to be less than
> the sum of both peers' delay, in order to successfully connect. Apparently
> this applies in nearly all cases, just not in the simulation where the
> jitter (~5ms) is higher than the delay¹ (~3ms).
> So I ran the tests again with additional delay between the peers (via the
> tc command and the netem module [1]). With a delay at the peers of 4ms and
> above, the connection could be reliably established.
>
> To test real networking scenarios, I combined different delay and jitter
> values with a specific rate of packet loss, and modified the test script to
> run a specific number of test runs and output the percentage of successful
> runs. Please note the results attached below.
>
>
>
The results make sense.
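
As a rough illustration of the timing condition and of the netem setup mentioned above, here is a minimal Python sketch. The function names, the interface name "eth0" and the numbers in the example calls are assumptions for illustration; the exact tc commands used in the test script are not shown in this mail.

```python
def first_packets_close_enough(send_time_diff_ms, delay_peer1_ms, delay_peer2_ms):
    """Timing condition from the (*) footnote: the difference between the
    first packets of both peers has to be less than the sum of both
    peers' delays."""
    return abs(send_time_diff_ms) < delay_peer1_ms + delay_peer2_ms


def netem_command(dev, delay_ms, jitter_ms=0, loss_percent=0):
    """Build (but do not run) a tc/netem command line that adds delay,
    optional jitter and optional packet loss on a network interface."""
    cmd = ["tc", "qdisc", "add", "dev", dev, "root", "netem",
           "delay", f"{delay_ms}ms"]
    if jitter_ms:
        cmd.append(f"{jitter_ms}ms")
    if loss_percent:
        cmd += ["loss", f"{loss_percent}%"]
    return cmd


# Illustrative values only: a 5 ms send-time difference with 4 ms delay per peer.
print(first_packets_close_enough(5, 4, 4))
# A peer configuration with 35 ms delay, 5 ms jitter and 1% loss ("eth0" is assumed).
print(" ".join(netem_command("eth0", 35, 5, 1)))
```

Running the resulting command needs root privileges, for example via subprocess.run(cmd, check=True).
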


>
> On 27.07.2015 09:24, Vicente Gonzalez wrote:
>
>> For testing, we theoretically could just take the results from the PRCN
>> NAT, as the only difference between PRCN and SYMPP is the local source port
>> at the peer, which does not make a difference at all. Though it would be
>> great to know if this NAT behaviour actually exists in real NAT
>> implementations, and if this has to be addressed by sending hello packets
>> more often or in a burst-like style (e.g. send 100 packets with 20ms
>> between each, and then pause for 1 second).
>>
> To find out more about this issue, the only thing we can do is test with
> different (real) NATs, and this is something that I would do at testing
> time (when the development stage is finished). Therefore, for now,
> implement only the code necessary to solve the problem in iptables.
>
> I tried the sending of packets in bursts, and sending packets in different
> intervals so that after some time two packets from both peers are sent
> simultaneously (with <1ms accuracy), but as noted above, apparently for
> this NAT type only the first packet counts.
> Testing on real NATs will be quite interesting. :)
>

Sure!


>
>
>> Another thing I noticed is that the sequentially allocating NAT type
>> cannot be simulated by iptables rules. I thought that the rules currently
>> stated in the nts_doc branch would work like this, but apparently the
>> source port is only increased if a completely new socket is used for the
>> connection to another peer. So now we have a few options to solve this:
>> 1. find another possibility to simulate this NAT behaviour (e.g. a
>> specific router distribution running on the virtual machine); or I could
>> try to alter the iptables NAT code and build a kernel module reflecting the
>> wanted behaviour (iptables code seems not too complicated, though I do not
>> know if this is possible at all)
>> 2. change the peer code (just for testing) to somehow force the
>> allocation of the next port number (e.g. bind to another port or something
>> similar)
>> 3. test this situation on a real NAT with this behaviour: do you have
>> such a NAT or do you know where this could exist?
>> 4. do not test this NAT type, as the NAT type detection and source port
>> prediction are fairly trivial and easy to verify by reading the code
>> What do you think is the best step to take?
>>
>
> I imagine that, although this kind of NAT exists, we cannot expect to find
> a "pure" sequentially allocating NAT behaviour in real contexts, simply
> because concurrent use of the NAT by many other users will probably keep
> it from allocating ports in such a regular way.
>
> I would select option 4 (at least for now; maybe in the testing stage we
> can do something different if we find a real NAT with this behaviour).
>
> Ok. I added a simple port prediction algorithm to the NTS code, you can
> have a look at the changes here [2]. Currently it determines if the
> difference between the source ports towards the splitter and the monitor is
> <10, and then takes this difference as the step for the port prediction.
> Other possibilities would be assuming a constant step of 1, or to let the
> peers send packets to all port,port+1,...,port+port_difference
> possibilities. Another approach would be to start another listening socket
> at the monitor and gather another source port difference, to detect the
> port allocation type more accurately.
> Which approach would you suggest?
>

For me, none of the approaches has a clear advantage. Try the simplest one.
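
To make the discussion concrete, the heuristic described above (take the source port difference as the step if it is below 10) might look roughly like the following Python sketch. This is not the code from commit [2]; the function names, the use of the absolute difference, returning None for large differences and the number of predicted ports are all assumptions.

```python
MAX_PORT = 65535


def port_step(port_seen_by_splitter, port_seen_by_monitor, max_step=10):
    """Return the port step if the two observed source ports differ by
    less than max_step, else None (assumed here to mean "no prediction")."""
    diff = abs(port_seen_by_monitor - port_seen_by_splitter)
    return diff if diff < max_step else None


def predicted_ports(last_port, step, attempts=3):
    """Return candidate source ports for the next connections."""
    if step is None:
        return []
    if step == 0:
        # Same port towards both destinations: keep using it.
        return [last_port]
    candidates = (last_port + step * i for i in range(1, attempts + 1))
    return [p for p in candidates if p <= MAX_PORT]


step = port_step(40000, 40002)
print(step, predicted_ports(40002, step))
```

Assuming a constant step of 1 would simply replace port_step() by the constant 1.
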


> What do you think about sending ~10 packets per second to each peer in the
> connection attempt phase, is this too much?
>

No. The packets are small.


>
> Some NATs might be destination port-insensitive when allocating source
> ports, which could lead to a wrong NAT type detection (port preservation
> instead of sequential allocation) if splitter and monitor are on the same
> host. The only option here would be using either public STUN servers or
> having trusted peers with predictable NATs, to determine the source port
> difference for different destination addresses.
>

Well, there may be more than one monitor peer. Does this answer your
question?
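
With several monitors on different hosts, the source ports they observe could be compared along these lines. This is only a sketch under assumptions: the function name and the returned labels are invented here, and the monitors are assumed to be contacted in a known order.

```python
def classify_allocation(observed_ports):
    """Classify the port allocation from the source ports seen by
    several monitors, listed in the order they were contacted."""
    if len(observed_ports) < 2:
        raise ValueError("need at least two observations")
    steps = {b - a for a, b in zip(observed_ports, observed_ports[1:])}
    if len(steps) == 1:
        (step,) = steps
        if step == 0:
            return "same port"   # the same source port for every destination
        if step > 0:
            return "sequential"  # constant positive step between connections
    return "unknown"             # irregular steps, no prediction possible


print(classify_allocation([5000, 5001, 5002]))
```
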


>
>
> A NAT type combination (marked as "(yes)" in the tables in previous
> emails) that does not work yet is a SYMSP router at an existing peer and a
> port-restrictive NAT (any type except FCN and RCN) at the arriving peer: To
> handle this situation, the existing peer has to send UDP packets to a new
> port at the monitor or at the splitter, to get the currently allocated
> source port of the existing peer and predict its next port. What do you
> think about this?
>

The use of ports should be minimized, but I suppose that in this case this
use is fully justified. Please, try it.


>
>
> I updated the task list for the last two weeks and the todo list [3].
> After completing the NAT detection and traversal implementation, I will
> finish the documentation of the implemented techniques and test the
> software on as many different NAT types as possible.
>

Good job :-)

Regards,
Vicente.


>
> Thanks,
> Max
>
>
>
> [1] http://www.linuxfoundation.org/collaborate/workgroups/networking/netem
> [2]
> https://github.com/jellysheep/p2psp/commit/57c0294e762fefc4a190b2aa2b465f0ec6b870c2
> [3]
> https://github.com/jellysheep/p2psp/wiki/GSoC-2015:-NAT-traversal-using-UDP-hole-punching---Timeline
>
> ¹ To solve those rare cases, I tried to detect such NAT behaviour by
> connecting and sending packets to the NAT and detecting if the NAT replies
> with an ICMP packet, but the software would have to use raw sockets (and
> therefore need to have higher privileges). So the only option would be for
> the SYMPP<->PRCN case that peer 1 sends its first packet before peer 2 to a
> temporary port number, then if that worked it sends its first packet before
> peer 2 to the "real" port number used for P2PSP packets, or if it did not
> work then peer 2 sends its first packet before peer 1.
>
>
>
> Test results in percent, 20 test runs each, for Splitter_NTS, Monitor_NTS,
> Peer_NTS (branch nts, commit 8b28afa):
>
> Configuration resembling my network at home (35ms delay ±5ms jitter at the
> peers, 3ms ±2ms jitter at the splitter, 1% packet loss at each host):
>
> Peer1\2 | rcn   | prcn  | sympp | symrp
> ========================================
> rcn     | 95    | 100   | 100   | 95
> prcn    | 100   | 100   | 90    | 0
> sympp   | 100   | 100   | 85    | 0
> symrp   | 100   | 0     | 0     | 0
>
> Hard networking conditions (15ms delay ±5ms jitter at the peers, 40ms
> ±20ms jitter at the splitter, 10% packet loss at each host):
>
> Peer1\2 | rcn   | prcn  | sympp | symrp
> ========================================
> rcn     | 95    | 65    | 75    | 90
> prcn    | 70    | 85    | 40    | 0
> sympp   | 75    | 55    | 30    | 0
> symrp   | 90    | 0     | 0     | 0
>
> Extremely hard networking conditions, just out of interest; this actually
> was a bit like fuzz testing, and I detected a bug not appearing at normal
> networking conditions, where I assumed the peer list to not be empty (10ms
> delay ±5ms jitter at the peers, 60ms ±30ms jitter at the splitter, 30%
> packet loss at each host):
>
> Peer1\2 | rcn   | prcn  | sympp | symrp
> ========================================
> rcn     | 5     | 5     | 5     | 0
> prcn    | 0     | 0     | 0     | 0
> sympp   | 0     | 0     | 0     | 0
> symrp   | 0     | 0     | 0     | 0
>
>
> --
-- 
Vicente González Ruiz
Depto de Informática
Escuela Técnica Superior de Ingeniería
Universidad de Almería

Carretera Sacramento S/N
04120, La Cañada de San Urbano
Almería, España

e-mail: vruiz@xxxxxx
http://www.ual.es/~vruiz
tel: +34 950 015711
fax: +34 950 015486
