Introduces a new flag FIB_RULE_INVERT causing rules to apply
if the specified selector doesn't match.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the attribute policy for the non-specific attributes into
net/fib_rules.h and include it in the respective protocols.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move mark selector currently implemented per protocol into
the protocol independant part.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that all protocols have been made aware of the mark
field it can be moved out of the union thus simplyfing
its usage.
The config options in the IPv4/IPv6/DECnet subsystems
to enable respectively disable mark based routing only
obfuscate the code with ifdefs, the cost for the
additional comparison in the flow key is insignificant,
and most distributions have all these options enabled
by default anyway. Therefore it makes sense to remove
the config options and enable mark based routing by
default.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
nfmark is being used in various subsystems and has become
the defacto mark field for all kinds of packets. Therefore
it makes sense to rename it to `mark' and remove the
dependency on CONFIG_NETFILTER.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dn_neigh.c was converted from kmalloc to kzalloc in commit
0da974f4f3 it was missed that
dn_neigh_seq_open was actually clearing the allocation twice was
missed.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the selection of an SA for an outgoing packet to be at the same
context as the originating socket/flow. This eliminates the SELinux
policy's ability to use/sendto SAs with contexts other than the socket's.
With this patch applied, the SELinux policy will require one or more of the
following for a socket to be able to communicate with/without SAs:
1. To enable a socket to communicate without using labeled-IPSec SAs:
allow socket_t unlabeled_t:association { sendto recvfrom }
2. To enable a socket to communicate with labeled-IPSec SAs:
allow socket_t self:association { sendto };
allow socket_t peer_sa_t:association { recvfrom };
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Fix SO_PEERSEC for tcp sockets to return the security context of
the peer (as represented by the SA from the peer) as opposed to the
SA used by the local/source socket.
Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Signed-off-by: James Morris <jmorris@namei.org>
Weirdness: the third argument of socket() is net-endian
here. Oh, well - it's documented in packet(7).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
MAX_HEADER does not include the ipv6 header length in it,
so we need to add it in explicitly.
With help from YOSHIFUJI Hideaki.
Signed-off-by: David S. Miller <davem@davemloft.net>
This bit of old backwards compatibility cruft can be removed in 2.6.20.
If there is still an device that calls register_netdev()
with a zero or blank name, it will get -EINVAL from register_netdevice().
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
When scanning in debug mode, softmac is very chatty in that it puts
3 lines in the logs for each time it scans. This patch has only one
line containing all the information previously reported.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
bcm43xx and ipw2100 currently duplicate the same simplistic get_stats
handler. Additionally, zd1211rw requires the same handler to fix a
bug where all stats are reported as 0.
This patch adds a generic implementation to the ieee80211 layer,
which drivers are free to override.
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
And use kmemdup and kzalloc where applicable
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
SoftMAC contains a number of debug-type messages that continue to print
even when debugging is turned off. This patch substitutes dprintkl for
printkl for those lines.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In the SoftMAC version of the IEEE 802.11 stack, not all duplicate messages are
detected. For the most part, there is no difficulty; however for TKIP and CCMP
encryption, the duplicates result in a "replay detected" log message where the
received and previous values of the TSC are identical. This change adds a new
variable to the ieee80211_device structure that holds the 'seq_ctl' value for
the previous frame. When a new frame repeats the value, the frame is dropped and
the appropriate counter is incremented.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Hi
this patch allow to set the mtu between 1500 and 2304 (max octets in an
MSDU) for devices using ieee80211 linux stack.
Signed-off-by: Matthieu CASTET <castet.matthieu@free.fr>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
This patch adds a host_strip_iv_icv flag to ieee80211 which indicates that
ieee80211_rx should strip the IV/ICV/other security features from the payload.
This saves on some memmove() calls in the driver and seems like something that
belongs in the stack as it can be used by bcm43xx, ipw2200, and zd1211rw
I will submit the ipw2200 patch separately as it needs testing.
This patch also adds some sensible variable reuse (idx vs keyidx) in
ieee80211_rx
Signed-off-by: Daniel Drake <dsd@gentoo.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6:
[PATCH] Fix an offset error when reading the CS89x0 ADD_PORT register
[PATCH] spidernet: poor network performance
[PATCH] Spidernet: remove ETH_ZLEN check in earlier patch
[PATCH] bonding: fix an oops when slave device does not provide get_stats
[PATCH] drivers/net: SAA9730: Fix build error
Revert "[PATCH] zd1211rw: Removed unneeded packed attributes"
[PATCH] zd1211rw: Fix of a locking bug
[PATCH] softmac: remove netif_tx_disable when scanning
[PATCH] ieee80211: Fix kernel panic when QoS is enabled
Fix various .c/.h typos in comments (no code changes).
Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
On devices with hard_header_len > LL_MAX_HEADER ip_route_me_harder()
reallocates the skb, leading to memory corruption when using the stale
tcph pointer to update the checksum.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
All users of __{ip,nf}_conntrack_expect_find() don't expect that
it increments the reference count of expectation.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When NFA_NEST exceeds the skb size the protocol reference is leaked.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The found helper cannot be assigned to conntrack after unlocking
nf_conntrack_lock. This tries to find helper to assign again.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the bug which doesn't assign helper to newly created
conntrack via nf_conntrack_netlink.
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the scan section of ieee80211softmac, network transmits are
disabled. When SoftMAC re-enables transmits, it may override the
wishes of a driver that may have very good reasons for disabling
transmits. At least one failure in bcm43xx can be traced to this
problem. In addition, several unexplained problems may arise from
the unexpected enabling of transmits. Note that making this change
introduces a new bug that would allow transmits for the current session
to be transmitted on the wrong channel; however, the new bug is much
less severe than the one being fixed, as the new one only leads to
a few retransmits, whereas the old one can bring the interface down.
A fix that will not introduce new bugs is being investigated; however,
the current, more serious one should be fixed now.
Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
When application uses XFRM_MSG_GETSA to get state entry through
netlink socket and kernel has no matching one, the application expects
reply message with error status by kernel.
Kernel doesn't send the message back in the case of Mobile IPv6 route
optimization protocols (i.e. routing header or destination options
header). This is caused by incorrect return code "0" from
net/xfrm/xfrm_user.c(xfrm_user_state_lookup) and it makes kernel skip
to acknowledge at net/netlink/af_netlink.c(netlink_rcv_skb).
This patch fix to reply ESRCH to application.
Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: TAKAMIYA Noriaki <takamiya@po.ntts.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
The return value of kfifo_alloc() should be checked by IS_ERR().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make udp_encap_rcv use pskb_may_pull
IPsec with NAT-T breaks on some notebooks using the latest e1000 chipset,
when header split is enabled. When receiving sufficiently large packets, the
driver puts everything up to and including the UDP header into the header
portion of the skb, and the rest goes into the paged part. udp_encap_rcv
forgets to use pskb_may_pull, and fails to decapsulate it. Instead, it
passes it up it to the IKE daemon.
Signed-off-by: Olaf Kirch <okir@suse.de>
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
H.323 connection tracking code calls ip_ct_refresh_acct() when
processing RCFs and URQs but passes NULL as the skb.
When CONFIG_IP_NF_CT_ACCT is enabled, the connection tracking core tries
to derefence the skb, which results in an obvious panic.
A similar fix was applied on the SIP connection tracking code some time
ago.
Signed-off-by: Faidon Liambotis <paravoid@debian.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP and RAW do not have this issue. Closes Bug #7432.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "u16 *" derefs of skb->data need to be wrapped inside of
a get_unaligned().
Thanks to Gustavo Zacarias for the bug report.
Signed-off-by: David S. Miller <davem@davemloft.net>
I actually dont have a test case for these; i just found them by
inspection. Refer to patch "[XFRM]: Sub-policies broke policy events"
for more info
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
XFRM policy events are broken when sub-policy feature is turned on.
A simple test to verify this:
run ip xfrm mon on one window and add then delete a policy on another
window ..
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Any L2CAP connection in disconnecting state shall not response
to any further config requests from the remote side. So in case
such a request is received, ignore it.
Signed-off-by: Ville Tervo <ville.tervo@nokia.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
When sending a positive config response it shall include the actual
MTU to be used on this channel. This differs from the Bluetooth 1.1
specification where it was enough to acknowledge the config request.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
If the RFCOMM session is no longer attached to the TTY device, then it
makes no sense to go through with changing the termios settings.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
After an inquiry completed or got canceled the Bluetooth core should
check for any pending connect attempts.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
To receive uvents for the low-level ACL and SCO links, they must be
assigned to a subsystem. It is enough to attach them to the already
established Bluetooth bus.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
RFC4191 explicitly states that the procedures are applicable to
hosts only. We should not have changed behavior of routers.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Only routers in "FAILED" state should be considered unreachable.
Otherwise, we do not try to use speicific routes unless all least specific
routers are considered unreachable.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Fix up tcp_mem initial settings to take into account the size of the
hash entries (different on SMP and non-SMP systems).
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
nexthdr is NEXTHDR_FRAGMENT, the nexthdr value from the fragment header
is hp->nexthdr.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Based on patch by James D. Nurmi:
I've got some code very dependant on nfnetlink_queue, and turned up a
large number of warns coming from skb_trim. While it's quite possibly
my code, having not seen it on older kernels made me a bit suspect.
Anyhow, based on some googling I turned up this thread:
http://lkml.org/lkml/2006/8/13/56
And believe the issue to be related, so attached is a small patch to
the kernel -- not sure if this is completely correct, but for anyone
else hitting the WARN_ON(1) in skbuff.h, it might be helpful..
Signed-off-by: James D. Nurmi <jdnurmi@gmail.com>
Ported to ip6_queue and nfnetlink_queue and added return value
checks.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
NFULA_SEQ_GLOBAL should be in network byteorder.
Spotted by Al Viro.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 802.11 header length is affected by the wireless mode (WDS or not) and
type (QoS or not). We should use the variable hdr_len instead of the
hard coded IEEE80211_3ADDR_LEN, otherwise we may touch invalid memory.
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
- make sure port in FTP data is in network order (in fact it was looking
buggy for big endian boxes before Viro's changes)
- htonl -> htons for port
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here are some fixes to endianess problems spotted by Al Viro.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since pskb_copy tacks on the non-linear bits from the original
skb, it needs to count them in the truesize field of the new skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we can hit paths that (legally) do multiple deletes on the
same node and OOPS with the HLIST poison values there instead of
NULL.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes consideration of high memory when determining TCP
hash table sizes. Taking into account high memory results in tcp_mem
values that are too large.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
auth_domain_put() forgot to unlock acquired spinlock.
Cc: Olaf Kirch <okir@monad.swb.de>
Cc: Andy Adamson <andros@citi.umich.edu>
Cc: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Neil Brown <neilb@suse.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
> the build with the attached .config failed, make ends with:
> ...
> : undefined reference to `cipso_v4_sock_getattr'
> net/built-in.o: In function `netlbl_socket_getattr':
...
It looks like I was stupid and made NetLabel depend on CONFIG_NET and not
CONFIG_INET, the patch below should fix this by making NetLabel depend on
CONFIG_INET and CONFIG_SECURITY. Please review and apply for 2.6.19.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It would be nice to keep things working even with this built as a
module, it took me some time to realize my IPv6 tunnel was broken
because of the missing sit module. This module alias fixes things
until distributions have added an appropriate alias to modprobe.conf.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
If inet6_init() fails later than ndisc_init() call, or IPv6 module is
unloaded, ndisc_netdev_notifier call remains in the list and will follows in
oops later.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We have seen a couple of __alloc_pages() failures due to
fragmentation, there is plenty of free memory but no large order pages
available. I think the problem is in sock_alloc_send_pskb(), the
gfp_mask includes __GFP_REPEAT but its never used/passed to the page
allocator. Shouldnt the gfp_mask be passed to alloc_skb() ?
Signed-off-by: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
open-coded variant there works only for little-endian
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
htons() is not needed (and no, it's not misspelled ntohs() -
userland expects net-endian here).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calculation of IPX checksum got buggered about 2.4.0. The old variant
mangled the packet; that got fixed, but calculation itself got buggered.
Restored the correct logics, fixed a subtle breakage we used to have even
back then: if the sum is 0 mod 0xffff, we want to return 0, not 0xffff.
The latter has special meaning for IPX (cheksum disabled). Observation
(and obvious fix) nicked from history of FreeBSD ipx_cksum.c...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/bridge/netfilter/ebtables.c: In function 'ebt_dev_check':
net/bridge/netfilter/ebtables.c:89: warning: initialization discards qualifiers from pointer target type
So make the char* a const char * and the warning is gone.
Signed-off-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: David S. Miller <davem@davemloft.net>
In theory these are opaque 32bit values. However, we end up
allocating them sequentially in host-endian and stick unchanged
on the wire.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The setting of the default congestion control was buried in
the sysctl code so it would not be done properly if SYSCTL was
not enabled.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The correct order is: NULL check before dereference
Spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "ieee80211: Workaround malformed 802.11 frames from AP" patch (see
http://kernel.org/git/?p=linux/kernel/git/linville/wireless-2.6.git;a=commit;h=f09fc44d8c25f22c4d985bb93857338ed02feac6 )
fixes the problem with some buggy APs but also converts debug message into
an error one. This floods the log with errors when you are near such AP (you
get a message for every beacon). This patch reverts the error message back
to the debug one.
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
I don't want my code to downgraded to GPLv3 because of
cut-n-pasted the comments. These files which I hold copyright
on were started before it was clear what GPLv3 was going to be.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There's a bug in the seqfile show operation for flowlabel objects, where
each hash chain is traversed cumulatively for each element. The following
function is called for each element of each chain:
static void ip6fl_fl_seq_show(struct seq_file *seq, struct ip6_flowlabel *fl)
{
while(fl) {
seq_printf...
fl = fl->next;
}
}
Thus, objects can appear mutliple times when reading
/proc/net/ip6_flowlabel, as the above is called for each element in the
chain.
The solution is to remove the while() loop from the above, and traverse
each chain exactly once, per the patch below. This also removes the
ip6fl_fl_seq_show() function, which does nothing else.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when an application requests a lease for a flowlabel via the
IPV6_FLOWLABEL_MGR socket option, no error is returned if an invalid type
of destination address is supplied as part of the request, leading to a
silent failure. This patch ensures that EINVAL is returned to the
application in this case.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Every time SCTP creates a temporary association, the stack hashes it,
puts it on a list of endpoint associations and increments the backlog.
However, the lifetime of a temporary association is the processing time
of a current packet and it's destroyed after that. In fact, we don't
really want anyone else finding this association. There is no reason to
do this extra work.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make SCTP 1-1 style and peeled-off associations behave like TCP when
setting IP id. In both cases, we set the inet_sk(sk)->daddr and initialize
inet_sk(sk)->id to a random value.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes two changes to protect applications from either removing or
tampering with the CIPSOv4 IP option on a socket. The first is the requirement
that applications have the CAP_NET_RAW capability to set an IPOPT_CIPSO option
on a socket; this prevents untrusted applications from setting their own
CIPSOv4 security attributes on the packets they send. The second change is to
SELinux and it prevents applications from setting any IPv4 options when there
is an IPOPT_CIPSO option already present on the socket; this prevents
applications from removing CIPSOv4 security attributes from the packets they
send.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes bug in iptables modules refcounting on compat error way.
As we are getting modules in check_compat_entry_size_and_hooks(), in case of
later error, we should put them all in translate_compat_table(), not in the
compat_copy_entry_from_user() or compat_copy_match_from_user(), as it is now.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-by: Vasily Averin <vvs@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add missing unlock in get_next_corpse() in nf_conntrack. It was missed
during the removal of listhelp.h . Also remove an unneeded use of
nf_ct_tuplehash_to_ctrack() in the same function.
Should be applied before 2.6.19 is released.
Signed-off-by: Martin Josefsson <gandalf@wlug.westbo.se>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds forgotten compat_flush_offset() call to error way of
translate_compat_table(). May lead to table corruption on the next
compat_do_replace().
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Dmitry Mishin <dim@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a number of issues in parsing user-provided table in
translate_table(). Malicious user with CAP_NET_ADMIN may crash system by
passing special-crafted table to the *_tables.
The first issue is that mark_source_chains() function is called before entry
content checks. In case of standard target, mark_source_chains() function
uses t->verdict field in order to determine new position. But the check, that
this field leads no further, than the table end, is in check_entry(), which
is called later, than mark_source_chains().
The second issue, that there is no check that target_offset points inside
entry. If so, *_ITERATE_MATCH macro will follow further, than the entry
ends. As a result, we'll have oops or memory disclosure.
And the third issue, that there is no check that the target is completely
inside entry. Results are the same, as in previous issue.
Signed-off-by: Dmitry Mishin <dim@openvz.org>
Acked-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a bug in the seqfile handling for /proc/net/ip6_flowlabel, where,
after finding a flowlabel, the code will loop forever not finding any
further flowlabels, first traversing the rest of the hash bucket then just
looping.
This patch fixes the problem by breaking after the hash bucket has been
traversed.
Note that this bug can cause lockups and oopses, and is trivially invoked
by an unpriveleged user.
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
I was looking at a RHEL5 bug report involving Xen and SCTP
(https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=212550).
It turns out that SCTP wasn't written to handle skb fragments at
all. The absence of any calls to skb_may_pull is testament to
that.
It just so happens that Xen creates fragmented packets more often
than other scenarios (header & data split when going from domU to
dom0). That's what caused this bug to show up.
Until someone has the time sits down and audits the entire net/sctp
directory, here is a conservative and safe solution that simply
linearises all packets on input.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix printk format warnings:
build2.out:net/dccp/ccids/ccid2.c:355: warning: long long unsigned int format, u64 arg (arg 3)
build2.out:net/dccp/ccids/ccid2.c:360: warning: long long unsigned int format, u64 arg (arg 3)
build2.out:net/dccp/ccids/ccid2.c:482: warning: long long unsigned int format, u64 arg (arg 5)
build2.out:net/dccp/ccids/ccid2.c:639: warning: long long unsigned int format, u64 arg (arg 3)
build2.out:net/dccp/ccids/ccid2.c:639: warning: long long unsigned int format, u64 arg (arg 4)
build2.out:net/dccp/ccids/ccid2.c:674: warning: long long unsigned int format, u64 arg (arg 3)
build2.out:net/dccp/ccids/ccid2.c:720: warning: long long unsigned int format, u64 arg (arg 3)
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_segment fails to segment linear packets correctly because it
tries to write all linear parts of the original skb into each
segment. This will always panic as each segment only contains
enough space for one MSS.
This was not detected earlier because linear packets should be
rare for GSO. In fact it still remains to be seen what exactly
created the linear packets that triggered this bug. Basically
the only time this should happen is if someone enables GSO
emulation on an interface that does not support SG.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use memcpy() to move xfrm_address_t objects in and out
of netlink messages. The vast majority of xfrm_user was
doing this properly, except for copy_from_user_state()
and copy_to_user_state().
Signed-off-by: David S. Miller <davem@davemloft.net>
atrtr_find() can return NULL, so do not blindly dereference
rt->dev before we check for rt being NULL.
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent patch fixed a problem which would occur when the refcount on an
auth_domain reached zero. This problem has not been reported in practice
despite existing in two major kernel releases because the refcount can
never reach zero.
This patch fixes the problems that stop the refcount reaching zero.
1/ We were adding to the refcount when inserting in the hash table,
but only removing from the hashtable when the refcount reached zero.
Obviously it never would. So don't count the implied reference of
being in the hash table.
2/ There are two paths on which a socket can be destroyed. One called
svcauth_unix_info_release(). The other didn't. So when the other was
taken, we can lose a reference to an ip_map which in-turn holds a
reference to an auth_domain
So unify the exit paths into svc_sock_put. This highlights the fact
that svc_delete_socket has slightly odd semantics - it does not drop
a reference but probably should. Fixing this need a bit more
thought and testing.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When using H-TCP with a single flow on a 500Mbit connection (or less
actually), alpha can exceed 65000, so alpha needs to be a u32.
Signed-off-by: Gavin McCullagh <gavin.mccullagh@nuim.ie>
Signed-off-by: Doug Leith <doug.leith@nuim.ie>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doug Leith observed a discrepancy between the version of CUBIC described
in the papers and the version in 2.6.18. A math error related to scaling
causes Cubic to grow too slowly.
Patch is from "Sangtae Ha" <sha2@ncsu.edu>. I validated that
it does fix the problems.
See the following to show behavior over 500ms 100 Mbit link.
Sender (2.6.19-rc3) --- Bridge (2.6.18-rt7) ------- Receiver (2.6.19-rc3)
1G [netem] 100M
http://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-orig.pnghttp://developer.osdl.org/shemminger/tcp/2.6.19-rc3/cubic-fix.png
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CRYPTO_MANAGER is selected automatically by CONFIG_ECB and CONFIG_CBC.
config CRYPTO_ECB
tristate "ECB support"
select CRYPTO_BLKCIPHER
select CRYPTO_MANAGER
I've added CONFIG_ECB to the ones you mentioned and CONFIG_CBC to
gssapi.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Updates the references to spec documents throughout the code, taking into
account that
* the DCCP, CCID 2, and CCID 3 drafts all became RFCs in March this year
* RFC 1063 was obsoleted by RFC 1191
* draft-ietf-tcpimpl-pmtud-0x.txt was published as an Informational
RFC, RFC 2923 on 2000-09-22.
All references verified.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible
to a fragmentation attack causing false negatives on extension header matches.
When extension headers occur in the non-first fragment after the fragment
header (possibly with an incorrect nexthdr value in the fragment header)
a rule looking for this extension header will never match.
Drop fragments that are at offset 0 and don't contain the final protocol
header regardless of the ruleset, since this should not happen normally.
Since all extension headers are before the protocol header this makes sure
an extension header is either not present or in the first fragment, where
we can properly parse it.
With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
As reported by Mark Dowd <Mark_Dowd@McAfee.com>, ip6_tables is susceptible
to a fragmentation attack causing false negatives on protocol matches.
When the protocol header doesn't follow the fragment header immediately,
the fragment header contains the protocol number of the next extension
header. When the extension header and the protocol header are sent in
a second fragment a rule like "ip6tables .. -p udp -j DROP" will never
match.
Drop fragments that are at offset 0 and don't contain the final protocol
header regardless of the ruleset, since this should not happen normally.
With help from Yasuyuki KOZAKAI <yasuyuki.kozakai@toshiba.co.jp>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm_state_num needs to be increased for XFRM_STATE_ACQ states created
by xfrm_state_find() to prevent the counter from going negative when
the state is destroyed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
memcpy 4 bytes to address of auto unsigned long variable followed
by comparison with u32 is a bloody bad idea.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The networking emulator can queue SKBs for a very long
time, so if you're using netem on the sender side for
large bandwidth/delay product testing, the SKB socket
send queue sizes become artificially larger.
Correct this by calling skb_orphan() in netem_enqueue().
Signed-off-by: David S. Miller <davem@davemloft.net>
Based upon a patch from Jesper Juhl. Try to match the
TCP IPv6 code this was copied from as much as possible,
so that it's easy to see where to add the ipv6 pktoptions
support code.
Signed-off-by: David S. Miller <davem@davemloft.net>
I think I got the cause for the Oops observed in
http://www.mail-archive.com/dccp@vger.kernel.org/msg00578.html
The problem is always with applications listening on PF_INET6 sockets. Apart
from the mentioned oops, I observed another one one, triggered at irregular
intervals via timer interrupt:
run_timer_softirq -> dccp_keepalive_timer
-> inet_csk_reqsk_queue_prune
-> reqsk_free
-> dccp_v6_reqsk_destructor
The latter function is the problem and is also the last function to be called
in said kernel panic.
In any case, there is a real problem with allocating the right request_sock
which is what this patch tackles.
It fixes the following problem:
- application listens on PF_INET6
- DCCPv4 packet comes in, is handed over to dccp_v4_do_rcv, from there
to dccp_v4_conn_request
Now: socket is PF_INET6, packet is IPv4. The following code then furnishes the
connection with IPv6 - request_sock operations:
req = reqsk_alloc(sk->sk_prot->rsk_prot);
The first problem is that all further incoming packets will get a Reset since
the connection can not be looked up.
The second problem is worse:
--> reqsk_alloc is called instead of inet6_reqsk_alloc
--> consequently inet6_rsk_offset is never set (dangling pointer)
--> the request_sock_ops are nevertheless still dccp6_request_ops
--> destructor is called via reqsk_free
--> dccp_v6_reqsk_destructor tries to free random memory location (inet6_rsk_offset not set)
--> panic
I have tested this for a while, DCCP sockets are now handled correctly in all
three scenarios (v4/v6 only/v4-mapped).
Commiter note: I've added the dccp_request_sock_ops forward declaration to keep
the tree building and to reduce the size of the patch for 2.6.19,
later I'll move the functions to the top of the affected source
code to match what we have in the TCP counterpart, where this
problem hasn't existed in the first place, dumb me not to have
done the same thing on DCCP land 8)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (36 commits)
[Bluetooth] Fix HID disconnect NULL pointer dereference
[Bluetooth] Add missing entry for Nokia DTL-4 PCMCIA card
[Bluetooth] Add support for newer ANYCOM USB dongles
[NET]: Can use __get_cpu_var() instead of per_cpu() in loopback driver.
[IPV4] inet_peer: Group together avl_left, avl_right, v4daddr to speedup lookups on some CPUS
[TCP]: One NET_INC_STATS() could be NET_INC_STATS_BH in tcp_v4_err()
[NETFILTER]: Missing check for CAP_NET_ADMIN in iptables compat layer
[NETPOLL]: initialize skb for UDP
[IPV6]: Fix route.c warnings when multiple tables are disabled.
[TG3]: Bump driver version and release date.
[TG3]: Add lower bound checks for tx ring size.
[TG3]: Fix set ring params tx ring size implementation
[NET]: reduce per cpu ram used for loopback stats
[IPv6] route: Fix prohibit and blackhole routing decision
[DECNET]: Fix input routing bug
[TCP]: Bound TSO defer time
[IPv4] fib: Remove unused fib_config members
[IPV6]: Always copy rt->u.dst.error when copying a rt6_info.
[IPV6]: Make IPV6_SUBTREES depend on IPV6_MULTIPLE_TABLES.
[IPV6]: Clean up BACKTRACK().
...
This patch is suitable for just about any 2.6 kernel. It should go in
2.6.19 and 2.6.18.2 and possible even the .17 and .16 stable series.
This is a long standing bug that seems to have only recently become
apparent, presumably due to increasing use of NFS over TCP - many
distros seem to be making it the default.
The SK_CONN bit gets set when a listening socket may be ready
for an accept, just as SK_DATA is set when data may be available.
It is entirely possible for svc_tcp_accept to be called with neither
of these set. It doesn't happen often but there is a small race in
svc_sock_enqueue as SK_CONN and SK_DATA are tested outside the
spin_lock. They could be cleared immediately after the test and
before the lock is gained.
This normally shouldn't be a problem. The sockets are non-blocking so
trying to read() or accept() when ther is nothing to do is not a problem.
However: svc_tcp_recvfrom makes the decision "Should I accept() or
should I read()" based on whether SK_CONN is set or not. This usually
works but is not safe. The decision should be based on whether it is
a TCP_LISTEN socket or a TCP_CONNECTED socket.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: <stable@kernel.org>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Yes, this actually passed tests the way it was.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When submitting a request to a fast portmapper (such as the local rpcbind
daemon), the request can complete before the parent task is even queued up on
xprt->binding. Fix this by queuing before submitting the rpcbind request.
Test plan:
Connectathon locking test with UDP.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The latest HID disconnect sequence change introduced a NULL pointer
dereference. For the quirk to handle buggy remote HID implementations,
it is enough to wait for a potential control channel disconnect from
the remote side and it is also enough to wait only 500 msecs.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
I believe this NET_INC_STATS() call can be replaced by
NET_INC_STATS_BH(), a little bit cheaper.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 32bit compatibility layer has no CAP_NET_ADMIN check in
compat_do_ipt_get_ctl, which for example allows to list the current
iptables rules even without having that capability (the non-compat
version requires it). Other capabilities might be required to exploit
the bug (eg. CAP_NET_RAW to get the nfnetlink socket?), so a plain user
can't exploit it, but a setup actually using the posix capability system
might very well hit such a constellation of granted capabilities.
Signed-off-by: Björn Steinbrink <B.Steinbrink@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Need to fully initialize skb to keep lower layers and queueing happy.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
WE-21 changed the ABI for the SIOC[SG]IW{ESSID,NICKN} ioctls by dropping
NULL termination. This patch adds compatibility code so that WE-21 can
work properly with WE-20 (and older) tools.
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Lookups resolving to ip6_blk_hole_entry must result in silently
discarding the packets whereas an ip6_pkt_prohibit_entry is
supposed to cause an ICMPV6_ADM_PROHIBITED message to be sent.
Thanks to Kim Nordlund <kim.nordlund@nokia.com> for noticing
this bug.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes a silly bug that has been in the input routing code
for some time. It results in trying to send to a node directly when
the origin of the packet is via the default router.
Its been tested by Alan Kemmerer <alan.kemmerer@mittalsteel.com> who
reported the bug and its a fairly obvious fix for a typo.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Patrick Caulfield <patrick@tykepenguin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch limits the amount of time you will defer sending a TSO segment
to less than two clock ticks, or the time between two acks, whichever is
longer.
On slow links, deferring causes significant bursts. See attached plots,
which show RTT through a 1 Mbps link with a 100 ms RTT and ~100 ms queue
for (a) non-TSO, (b) currnet TSO, and (c) patched TSO. This burstiness
causes significant jitter, tends to overflow queues early (bad for short
queues), and makes delay-based congestion control more difficult.
Deferring by a couple clock ticks I believe will have a relatively small
impact on performance.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
As IPV6_SUBTREES can't work without IPV6_MULTIPLE_TABLES have IPV6_SUBTREES
depend on it.
Signed-off-by: Ville Nuorvala <vnuorval@tcs.hut.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>