linux-stable-rt/net/dccp
Gerrit Renker 86739fb96e dccp: Do not let initial option overhead shrink the MPS
This fixes a problem caused by the overlap of the connection-setup and
established-state phases of DCCP connections.

During connection setup, the client retransmits Confirm Feature-Negotiation
options until a response from the server signals that it can move from the
half-established PARTOPEN into the OPEN state, whereupon the connection is
fully established on both ends (RFC 4340, 8.1.5).

However, since the client may already send data while it is in the PARTOPEN
state, this has consequences for the Maximum Packet Size (MPS): the initial
option overhead is much higher than in the subsequent established phase, as
it potentially involves many variable-length list-type options
(server-priority options, RFC 4340, 6.4).

Applying the standard MPS is insufficient here: especially with larger
payloads this can lead to annoying, counter-intuitive EMSGSIZE errors.

On the other hand, permanently reducing the MPS available for the established
phase by the added initial overhead wastes packet space.

The solution chosen therefore is a two-phase strategy:

   If the payload length of the DataAck in PARTOPEN is too large, an Ack is sent
   to carry the options, and the feature-negotiation list is then flushed.

   This means that the server gets two Acks for one Response. If both Acks get
   lost, it is probably better to restart the connection anyway; devising yet
   another special case does not seem worth the extra complexity.
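
The strategy above can be sketched as a small userspace model. This is only an
illustration, not the patch itself: struct conn_model, prepare_dataack() and
the threshold "MPS minus the estimated overhead" are assumptions made for the
example, and the model merely counts the separate Ack instead of sending one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Over-estimate quoted in the commit message: 32 aligned units of 4 bytes. */
#define FEATNEG_OVERHEAD (32 * sizeof(uint32_t))

/* Hypothetical, simplified connection state; not the kernel's socket struct. */
struct conn_model {
    bool     partopen;        /* true while in the half-established state */
    size_t   mps;             /* Maximum Packet Size for the established phase */
    size_t   pending_featneg; /* queued feature-negotiation options */
    unsigned acks_sent;       /* separate Acks sent to carry the options */
};

/*
 * Decide how to send a DataAck with payload_len bytes of data.
 * Returns true if a separate Ack had to carry the options first.
 */
static bool prepare_dataack(struct conn_model *c, size_t payload_len)
{
    assert(c != NULL);

    if (!c->partopen)
        return false;   /* established phase: no initial overhead to consider */

    /* Assumed threshold: the payload must leave room for the initial options. */
    if (c->mps >= FEATNEG_OVERHEAD && payload_len <= c->mps - FEATNEG_OVERHEAD)
        return false;   /* options and payload fit into a single DataAck */

    c->acks_sent++;          /* phase 1: an Ack carries the options */
    c->pending_featneg = 0;  /* phase 2: flush the list, DataAck goes out bare */
    return true;
}
```

With an MPS of 1400 bytes, a 1000-byte payload still fits next to the options,
whereas a 1300-byte payload triggers the separate Ack and the flush.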

The result is a higher utilisation of the available packet space for the data
transmission phase (established state) of a connection.

The patch (over-)estimates the initial overhead to be 32*4 bytes -- commonly
seen values were around 90 bytes for initial feature-negotiation options.

It uses sizeof(u32) to mean "aligned units of 4 bytes".
For consistency, another use of 4-byte alignment is adapted.
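
As a worked illustration of the numbers above: 32 units of sizeof(u32) come to
128 bytes, which leaves headroom over the roughly 90 bytes commonly observed.
The sketch below uses uint32_t in place of the kernel's u32; the macro name
FEATNEG_OVERHEAD and the helper round_up_u32() are hypothetical names chosen
for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32 aligned units of 4 bytes, i.e. 128 bytes (an over-estimate). */
#define FEATNEG_OVERHEAD (32 * sizeof(uint32_t))

/* Round a length up to the next multiple of 4 bytes. */
static size_t round_up_u32(size_t len)
{
    /* Guard against wrap-around when adding the rounding term. */
    assert(len <= SIZE_MAX - (sizeof(uint32_t) - 1));
    return (len + sizeof(uint32_t) - 1) / sizeof(uint32_t) * sizeof(uint32_t);
}
```

For example, an option block of 90 bytes occupies 92 bytes once padded to
4-byte alignment, which is still below the 128-byte estimate.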

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-02 03:07:23 -08:00
ccids dccp ccid-3: Fix RFC reference 2009-01-11 00:17:22 -08:00
Kconfig dccp: Lockless integration of CCID congestion-control plugins 2009-01-04 21:42:53 -08:00
Makefile dccp: Integrate the TFRC library with DCCP 2009-01-04 21:45:33 -08:00
ackvec.c dccp: Set per-connection CCIDs via socket options 2008-11-23 16:02:31 -08:00
ackvec.h dccp: Minimise header option overhead in setting the MPS 2009-03-02 03:07:23 -08:00
ccid.c dccp: Integrate the TFRC library with DCCP 2009-01-04 21:45:33 -08:00
ccid.h dccp: Clean up ccid.c after integration of CCID plugins 2009-01-04 21:43:23 -08:00
dccp.h dccp: Do not let initial option overhead shrink the MPS 2009-03-02 03:07:23 -08:00
diag.c dccp_diag: LISTEN sockets don't have CCIDs 2008-12-17 16:08:01 -08:00
feat.c dccp: Debugging functions for feature negotiation 2009-01-21 14:34:05 -08:00
feat.h dccp: Debugging functions for feature negotiation 2009-01-21 14:34:05 -08:00
input.c dccp: Integrate the TFRC library with DCCP 2009-01-04 21:45:33 -08:00
ipv4.c net: Convert TCP & DCCP hash tables to use RCU / hlist_nulls 2008-11-16 19:40:17 -08:00
ipv6.c netns xfrm: lookup in netns 2008-11-25 17:35:18 -08:00
ipv6.h
minisocks.c dccp: Implement both feature-local and feature-remote Sequence Window feature 2009-01-21 14:34:04 -08:00
options.c dccp: Debugging functions for feature negotiation 2009-01-21 14:34:05 -08:00
output.c dccp: Do not let initial option overhead shrink the MPS 2009-03-02 03:07:23 -08:00
probe.c dccp: API to query the current TX/RX CCID 2008-11-23 16:04:59 -08:00
proto.c dccp: Implement both feature-local and feature-remote Sequence Window feature 2009-01-21 14:34:04 -08:00
sysctl.c dccp: Initialisation and type-checking of feature sysctls 2009-01-21 14:34:05 -08:00
timer.c dccp: Limit feature negotiation to connection setup phase 2008-11-12 00:42:58 -08:00