linux-stable-rt

Commit Graph

Author	SHA1	Message	Date
Linus Torvalds	29b309e52d	Undo duplicate "m68k: drivers/input/serio/hp_sdc.c needs <linux/semaphore.h>" Both commits `0f17e4c796` ("Add missing semaphore.h includes") and `4933d07531` ("m68k: drivers/input/serio/hp_sdc.c needs <linux/semaphore.h>") added a We only really need one ;) Reported-by: Huang Weiyi <weiyi.huang@gmail.com> Requested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-25 09:19:36 -07:00
Herbert Xu	7d7e5a60c6	ipsec: ipcomp - Decompress into frags if necessary When decompressing extremely large packets allocating them through kmalloc is prone to failure. Therefore it's better to use page frags instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 02:55:33 -07:00
Herbert Xu	6fccab671f	ipsec: ipcomp - Merge IPComp implementations This patch merges the IPv4/IPv6 IPComp implementations since most of the code is identical. As a result future enhancements will no longer need to be duplicated. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 02:54:40 -07:00
Artem Bityutskiy	d37e6bf68f	UBI: always start the background thread This fix only affects UBI debugging. If the the background thread is disabled for debugging purposes, start it anyway, because otherwise we see tonns of kernel debugging complaints like this: INFO: task ubi_bgt0d:26857 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. ubi_bgt0d D dd37bf94 0 26857 2 dd37bfcc 00000086 f8e17cea dd37bf94 00000046 00000000 00000000 f5c62430 f5c62430 f5c62590 c2a09c80 f6cbd498 dd8e9cbc 00000296 dd37bfb0 00000296 dd8e9cb8 dd8e9cbc dd37bfcc c0119774 00000000 00000000 c0132e89 f6961560 Call Trace: [<f8e17cea>] ? ubi_thread+0x0/0x127 [ubi] [<c0119774>] ? complete+0x43/0x4b [<c0132e89>] ? kthread+0x0/0x5b [<f8e17cea>] ? ubi_thread+0x0/0x127 [ubi] [<c0132eae>] kthread+0x25/0x5b [<c0132e89>] ? kthread+0x0/0x5b [<c0104953>] kernel_thread_helper+0x7/0x14 ======================= So start it, and go sleep inside it, instead of creating it and never start. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>	2008-07-25 11:35:15 +03:00
David S. Miller	cffe1c5d7a	pkt_sched: Fix locking in shutdown_scheduler_queue() Qdisc locks need to be held with BH disabled. Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-25 01:25:04 -07:00
Tony Breeds	973b7d83eb	powerpc: Wireup new syscalls signalfd4, eventfd2, epoll_create1, dup3, pipe2 and inotify_init1 Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 16:40:55 +10:00
Benjamin Herrenschmidt	1e3519f8e1	Move update_mmu_cache() declaration from tlbflush.h to pgtable.h where it belongs. This fixes some build problems on some configs Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 16:21:11 +10:00
Nathan Fontenot	16c14b4621	powerpc/pseries: Remove kmalloc call in handling writes to lparcfg There are only 4 valid name=value pairs for writes to /proc/ppc64/lparcfg. Current code allocates a buffer to copy this information in from the user. Since the longest name=value pair will easily fit into a buffer of 64 characters, simply put the buffer on the stack instead of allocating the buffer. Signed-off-by: Nathan Fotenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:45 +10:00
Nathan Fontenot	8391e42a5c	powerpc/pseries: Update arch vector to indicate support for CMO Update the architecture vector to indicate that Cooperative Memory Overcommitment is supported if CONFIG_PPC_SMLPAR is set. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:45 +10:00
Brian King	39c1ffecc6	ibmvfc: Add support for collaborative memory overcommit Adds support to the ibmvfc driver for collaborative memory overcommit. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:45 +10:00
Robert Jennings	7912a0ac59	ibmvscsi: driver enablement for CMO Enable the driver to function in a Cooperative Memory Overcommitment (CMO) environment. The following changes are made to enable the driver for CMO: * DMA mapping errors will not result in error messages if entitlement has been exceeded and resources were not available. * The driver has a get_desired_dma function defined to function in a CMO environment. It will indicate how much IO memory it would like to function. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:44 +10:00
Robert Jennings	1096d63d8e	ibmveth: enable driver for CMO Enable ibmveth for Cooperative Memory Overcommitment (CMO). For this driver it means calculating a desired amount of IO memory based on the current MTU and updating this value with the bus when MTU changes occur. Because DMA mappings can fail, we have added a bounce buffer for temporary cases where the driver can not map IO memory for the buffer pool. The following changes are made to enable the driver for CMO: * DMA mapping errors will not result in error messages if entitlement has been exceeded and resources were not available. * DMA mapping errors are handled gracefully, ibmveth_replenish_buffer_pool() is corrected to check the return from dma_map_single and fail gracefully. * The driver will have a get_desired_dma function defined to function in a CMO environment. * When the MTU is changed, the driver will update the device IO entitlement Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Santiago Leon <santil@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:44 +10:00
Santiago Leon	ea866e6526	ibmveth: Automatically enable larger rx buffer pools for larger mtu Activates larger rx buffer pools when the MTU is changed to a larger value. This patch de-activates the large rx buffer pools when the MTU changes to a smaller value. Signed-off-by: Santiago Leon <santil@us.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:44 +10:00
Nathan Fontenot	22e1a4dd3f	powerpc/pseries: Verify CMO memory entitlement updates with virtual I/O Verify memory entitlement updates can be handled by vio. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:43 +10:00
Robert Jennings	a90ab95a95	powerpc/pseries: vio bus support for CMO This is a large patch but the normal code path is not affected. For non-pSeries platforms the code is ifdef'ed out and for non-CMO enabled pSeries systems this does not affect the normal code path. Devices that do not perform DMA operations do not need modification with this patch. The function get_desired_dma was renamed from get_io_entitlement for clarity. Overview Cooperative Memory Overcommitment (CMO) allows for a set of OS partitions to be run with less RAM than the aggregate needs of the group of partitions. The firmware will balance memory between the partitions and page in/out memory as needed. Based on the number and type of IO adpaters preset each partition is allocated an amount of memory for DMA operations and this allocation will be guaranteed to the partition; this is referred to as the partition's 'entitlement'. Partitions running in a CMO environment can only have virtual IO devices present. The VIO bus layer will manage the IO entitlement for the system. Accounting, at a system and per-device level, is tracked in the VIO bus code and exposed via sysfs. A set of dma_ops functions are added to the bus to allow for this accounting. Bus initialization At initialization, the bus will calculate the minimum needs of the system based on providing each device present with a standard minimum entitlement along with a spare allocation for the bus to handle hotplug events. If the minimum needs can not be met the system boot will be halted. Device changes The significant changes for devices while running under CMO are that the devices must specify how much dedicated IO entitlement they desire and must also handle DMA mapping errors that can occur due to constrained IO memory. The virtual IO drivers are modified to silence errors when DMA mappings fail for CMO and handle these failures gracefully. Each devices will be guaranteed a minimum entitlement that can always be mapped. Devices will specify how much entitlement they desire and the VIO bus will attempt to provide for this. Devices can change their desired entitlement level at any point in time to address particular needs (via vio_cmo_set_dev_desired()), not just at device probe time. VIO bus changes The system will have a particular entitlement level available from which it can provide memory to the devices. The bus defines two pools of memory within this entitlement, the reserved and excess pools. Each device is provided with it's own entitlement no less than a system defined minimum entitlement and no greater than what the device has specified as it's desired entitlement. The entitlement provided to devices comes from the reserve pool. The reserve pool can also contain a spare allocation as large as the system defined minimum entitlement which is used for device hotplug events. Any entitlement not needed to fulfill the needs of a reserve pool is placed in the excess pool. Each device is guaranteed that it can map up to it's entitled level; additional mapping are possible as long as there is unmapped memory in the excess pool. Bus probe As the system starts, each device is given an entitlement equal only to the system defined minimum entitlement. The reserve pool is equal to the sum of these entitlements, plus a spare allocation. The VIO bus also tracks the aggregate desired entitlement of all the devices. If the system desired entitlement is greater than the size of the reserve pool, when devices unmap IO memory it will be reserved and a balance operation will be scheduled for some time in the future. Entitlement balancing The balance function tries to fairly distribute entitlement between the devices in the system with the goal of providing each device with it's desired amount of entitlement. Devices using more than what would be ideal will have their entitled set-point adjusted; this will effectively set a goal for lower IO memory usage as future mappings can fail and deallocations will trigger a balance operation to distribute the newly unmapped memory. A fair distribution of entitlement can take several balance operations to achieve. Entitlement changes and device DLPAR events will alter the state of CMO and will trigger balance operations. Hotplug events The VIO bus allows for changes in system entitlement at run-time via 'vio_cmo_entitlement_update()'. When devices are added the hotplug device event will be preceded by a system entitlement increase and this is reversed when devices are removed. The following changes are made that the VIO bus layer for CMO: * add IO memory accounting per device structure. * add IO memory entitlement query function to driver structure. * during vio bus probe, if CMO is enabled, check that driver has memory entitlement query function defined. Fail if function not defined. * fail to register driver if io entitlement function not defined. * create set of dma_ops at vio level for CMO that will track allocations and return DMA failures once entitlement is reached. Entitlement will limited by overall system entitlement. Devices will have a reserved quantity of memory that is guaranteed, the rest can be used as available. * expose entitlement, current allocation, desired allocation, and the allocation error counter for devices to the user through sysfs * provide mechanism for changing a device's desired entitlement at run time for devices as an exported function and sysfs tunable * track any DMA failures for entitled IO memory for each vio device. * check entitlement against available system entitlement on device add * track entitlement metrics (high water mark, current usage) * provide function to reset high water mark * provide minimum and desired entitlement numbers at a bus level * provide drivers with a minimum guaranteed entitlement * balance available entitlement between devices to satisfy their needs * handle system entitlement changes and device hotplug Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:43 +10:00
Robert Jennings	6490c4903d	powerpc/pseries: iommu enablement for CMO To support Cooperative Memory Overcommitment (CMO), we need to check for failure from some of the tce hcalls. These changes for the pseries platform affect the powerpc architecture; patches for the other affected platforms are included in this patch. pSeries platform IOMMU code changes: * platform TCE functions must handle H_NOT_ENOUGH_RESOURCES errors and return an error. Architecture IOMMU code changes: * Calls to ppc_md.tce_build need to check return values and return DMA_MAPPING_ERROR for transient errors. Architecture changes: * struct machdep_calls for tce_build_pSeriesLP functions need to change to indicate failure. all other platforms will need updates to iommu functions to match the new calling semantics; they will return 0 on success. The other platforms default configs have been built, but no further testing was performed. Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:43 +10:00
Brian King	ffa5abbd0c	powerpc/pseries: Add CMO paging statistics With the addition of Cooperative Memory Overcommitment (CMO) support for IBM Power Systems, two fields have been added to the VPA to report paging statistics. Add support in lparcfg to report them to userspace. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Brian King	84af458bb2	powerpc/pseries: Add collaborative memory manager Adds a collaborative memory manager, which acts as a simple balloon driver for System p machines that support cooperative memory overcommitment (CMO). Adds a platform configuration option for CMO called PPC_SMLPAR. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Brian King	86630a3232	powerpc/pseries: Utilities to set firmware page state Newer versions of firmware support page states, which are used by the collaborative memory manager (future patch) to "loan" pages to the hypervisor for use by other partitions. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Robert Jennings	e46de429cb	powerpc/pseries: Enable CMO feature during platform setup For Cooperative Memory Overcommitment (CMO), set the FW_FEATURE_CMO flag in powerpc_firmware_features from the rtas ibm,get-system-parameters table prior to calling iommu_init_early_pSeries. With this, any CMO specific functionality can be controlled by checking: firmware_has_feature(FW_FEATURE_CMO) Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:42 +10:00
Robert Jennings	398778f78b	powerpc/pseries: Split retrieval of processor entitlement data into a helper routine Split the retrieval of processor entitlement data returned in the H_GET_PPP hcall into its own helper routine. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:41 +10:00
Nathan Fontenot	dfc3403f0e	powerpc/pseries: Add memory entitlement capabilities to /proc/ppc64/lparcfg Update /proc/ppc64/lparcfg to display Cooperative Memory Overcommitment statistics as reported by the H_GET_MPP hcall. This also updates the lparcfg interface to allow setting memory entitlement and weight. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:41 +10:00
Nathan Fotenot	11529396ea	powerpc/pseries: Split processor entitlement retrieval and gathering to helper routines Split the retrieval and setting of processor entitlement and weight into helper routines. This also removes the printing of the raw values returned from h_get_ppp, the values are already parsed and printed. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:41 +10:00
Nathan Fontenot	545500b307	powerpc/pseries: Remove extraneous error reporting for hcall failures in lparcfg Remove the extraneous error reporting used when a hcall made from lparcfg fails. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:40 +10:00
Segher Boessenkool	80c60bf9b9	powerpc: Fix compile error with binutils 2.15 My previous patch to fix compilation with binutils-2.17 causes a "file truncated" build error from ld with binutils 2.15 (and possibly older), and a warning with 2.16 and 2.17. This fixes it. Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Acked-by: Chuck Meade <chuckmeade@mindspring.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:40 +10:00
Mark Nelson	7886250e9d	powerpc/cell: Fixed IOMMU mapping uses weak ordering for a pcie endpoint At the moment the fixed mapping is by default strongly ordered (the iommu_fixed=weak boot option must be used to make the fixed mapping weakly ordered). If we're on a setup where the southbridge is being used in endpoint mode (triblade and CAB boards) the default should be a weakly ordered fixed mapping. This adds a check so that if a node of type pcie-endpoint can be found in the device tree the fixed mapping is set to be weak by default (but can be overridden using iommu_fixed=strong). Signed-off-by: Mark Nelson <markn@au1.ibm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:40 +10:00
Luis Machado	d6a61bfc06	powerpc: BookE hardware watchpoint support This patch implements support for HW based watchpoint via the DBSR_DAC (Data Address Compare) facility of the BookE processors. It does so by interfacing with the existing DABR breakpoint code and adding the necessary bits and pieces for the new bits to be properly set or cleared Signed-off-by: Luis Machado <luisgpm@br.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Stephen Rothwell	00bf6e9061	powerpc: Fallout from sysdev API changes A struct sysdev_attribute * parameter was added to the show routine by commit `4a0b2b4dbe` "sysdev: Pass the attribute to the low level sysdev show/store function". This eliminates a warning: arch/powerpc/kernel/sysfs.c:538: warning: initialization from incompatible pointer type Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Nathan Lynch	9115d13453	powerpc: Enable AT_BASE_PLATFORM aux vector Stash the first platform string matched by identify_cpu() in powerpc_base_platform, and supply that to the ELF loader for the value of AT_BASE_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Nathan Lynch	483fad1c3f	ELF loader support for auxvec base platform string Some IBM POWER-based platforms have the ability to run in a mode which mostly appears to the OS as a different processor from the actual hardware. For example, a Power6 system may appear to be a Power5+, which makes the AT_PLATFORM value "power5+". This means that programs are restricted to the ISA supported by Power5+; Power6-specific instructions are treated as illegal. However, some applications (virtual machines, optimized libraries) can benefit from knowledge of the underlying CPU model. A new aux vector entry, AT_BASE_PLATFORM, will denote the actual hardware. For example, on a Power6 system in Power5+ compatibility mode, AT_PLATFORM will be "power5+" and AT_BASE_PLATFORM will be "power6". The idea is that AT_PLATFORM indicates the instruction set supported, while AT_BASE_PLATFORM indicates the underlying microarchitecture. If the architecture has defined ELF_BASE_PLATFORM, copy that value to the user stack in the same manner as ELF_PLATFORM. Signed-off-by: Nathan Lynch <ntl@pobox.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2008-07-25 15:44:39 +10:00
Benjamin Herrenschmidt	e9f76354ce	Merge commit 'jk/jk-merge'	2008-07-25 15:35:10 +10:00
Benjamin Herrenschmidt	c174aff956	Merge commit 'gcl/gcl-next'	2008-07-25 15:35:03 +10:00
Linus Torvalds	832fe9c222	Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: virtio: Add transport feature handling stub for virtio_ring. virtio: Rename set_features to finalize_features virtio: Formally reserve bits 28-31 to be 'transport' features. s390: use virtio_console for KVM on s390 virtio: console as a config option virtio_console: use virtqueue notification for hvc_console hvc_console: rework setup to replace irq functions with callbacks virtio_blk: check for hardsector size from host virtio: Use bus_type probe and remove methods virtio: don't always force a notification when ring is full virtio: clarify that ABI is usable by any implementations virtio: Recycle unused recv buffer pages for large skbs in net driver virtio net: Allow receiving SG packets virtio net: Add ethtool ops for SG/GSO virtio: fix virtio_net xmit of freed skb bug	2008-07-24 19:11:49 -07:00
Rusty Russell	ed9559d38a	Label kthread_create() with printf attribute tag. Obvious misc patch been in my queue (& linux-next) for over a cycle. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 19:11:15 -07:00
Rusty Russell	e34f872567	virtio: Add transport feature handling stub for virtio_ring. To prepare for virtio_ring transport feature bits, hook in a call in all the users to manipulate them. This currently just clears all the bits, since it doesn't understand any features. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:14 +10:00
Rusty Russell	c624896e48	virtio: Rename set_features to finalize_features Rather than explicitly handing the features to the lower-level, we just hand the virtio_device and have it set the features. This make it clear that it has the chance to manipulate the features of the device at this point (and that all feature negotiation is already done). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:12 +10:00
Rusty Russell	dd7c7bc462	virtio: Formally reserve bits 28-31 to be 'transport' features. We assign feature bits as required, but it makes sense to reserve some for the particular transport, rather than the particular device. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:07 +10:00
Christian Borntraeger	faeba830b0	s390: use virtio_console for KVM on s390 This patch enables virtio_console as the default console on kvm for s390. We currently use the same notify hack as lguest for early console output. I will try to address this for lguest and s390 later. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:07 +10:00
Christian Borntraeger	7721c494a2	virtio: console as a config option I also added a small Kconfig change that allows the user to specify the virtio console in menuconfig. (Fixes to export symbols from Stephen Rothwell <sfr@canb.auug.org.au>) (Fixes for CONFIG_VIRTIO_CONSOLE=y vs CONFIG_VIRTIO=m from Christian himself) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Stephen Rothwell <sfr@canb.auug.org.au>	2008-07-25 12:06:07 +10:00
Christian Borntraeger	91fcad19d0	virtio_console: use virtqueue notification for hvc_console This patch exploits the new notifier callbacks of the hvc_console. We can use the virtio callbacks instead of the polling code. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:06 +10:00
Christian Borntraeger	611e097d77	hvc_console: rework setup to replace irq functions with callbacks This patch tries to change hvc_console to not use request_irq/free_irq if the backend does not use irqs. This allows virtio_console to use hvc_console without having a linker reference to request_irq/free_irq. In addition, together with patch 2/3 it improves the performance for virtio console input. (an earlier version of this patch was tested by Yajin on lguest) The irq specific code is moved to hvc_irq.c and selected by the drivers that use irqs (System p, System i, XEN). I replaced "int irq" with the opaque "int data". The request_irq and free_irq calls are replaced with notifier_add and notifier_del. I have also changed the code a bit to call the notifier_add and notifier_del inside the spinlock area as the callbacks are found via hp->ops. Changes since last version: o remove ifdef o reintroduce "irq_requested" as "notified" o cleanups, sparse.. I did not move the timer based polling into a separate polling scheme. I played with several variants, but it seems we need to sleep/schedule in a thread even for irq based consoles, as there are throttleing and buffer size constraints. I also kept hvc_struct defined in hvc_console.h so that hvc_irq.c can access the irq_requested element. Feedback is appreciated. virtio_console is currently the only available console for kvm on s390. I plan to push this change as soon as all affected parties agree on it. I would love to get test results from System p, Xen etc. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:06 +10:00
Christian Borntraeger	066f4d82a6	virtio_blk: check for hardsector size from host Currently virtio_blk assumes a 512 byte hard sector size. This can cause trouble / performance issues if the backing has a different block size (like a file on an ext3 file system formatted with 4k block size or a dasd). Lets add a feature flag that tells the guest to use a different hard sector size than 512 byte. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:05 +10:00
Mark McLoughlin	e962fa660d	virtio: Use bus_type probe and remove methods Hook up to the probe() and remove() methods in bus_type rather than device_driver. The latter has been preferred since 2.6.16. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:05 +10:00
Rusty Russell	44653eae14	virtio: don't always force a notification when ring is full We force notification when the ring is full, even if the host has indicated it doesn't want to know. This seemed like a good idea at the time: if we fill the transmit ring, we should tell the host immediately. Unfortunately this logic also applies to the receiving ring, which is refilled constantly. We should introduce real notification thesholds to replace this logic. Meanwhile, removing the logic altogether breaks the heuristics which KVM uses, so we use a hack: only notify if there are outgoing parts of the new buffer. Here are the number of exits with lguest's crappy network implementation: Before: network xmit 7859051 recv 236420 After: network xmit 7858610 recv 118136 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:04 +10:00
Rusty Russell	674bfc23c5	virtio: clarify that ABI is usable by any implementations We want others to implement and use virtio, so it makes sense to BSD license the non-__KERNEL__ parts of the headers to make this crystal clear. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Ryan Harper <ryanh@us.ibm.com> Acked-by: Eric Van Hensbergen <ericvh@gmail.com> Acked-by: Anthony Liguori <aliguori@us.ibm.com>	2008-07-25 12:06:04 +10:00
Rusty Russell	fb6813f480	virtio: Recycle unused recv buffer pages for large skbs in net driver If we hack the virtio_net driver to always allocate full-sized (64k+) skbuffs, the driver slows down (lguest numbers): Time to receive 1GB (small buffers): 10.85 seconds Time to receive 1GB (64k+ buffers): 24.75 seconds Of course, large buffers use up more space in the ring, so we increase that from 128 to 2048: Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds If we recycle pages rather than using alloc_page/free_page: Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds This demonstrates that with efficient allocation, we don't need to have a separate "small buffer" queue. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:02 +10:00
Herbert Xu	97402b96f8	virtio net: Allow receiving SG packets Finally this patch lets virtio_net receive GSO packets in addition to sending them. This can definitely be optimised for the non-GSO case. For comparison the Xen approach stores one page in each skb and uses subsequent skb's pages to construct an SG skb instead of preallocating the maximum amount of pages per skb. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (added feature bits)	2008-07-25 12:06:01 +10:00
Herbert Xu	a9ea3fc6f2	virtio net: Add ethtool ops for SG/GSO This patch adds some basic ethtool operations to virtio_net so I could test SG without GSO (which was really useful because TSO turned out to be buggy :) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (remove MTU setting)	2008-07-25 12:06:01 +10:00
Mark McLoughlin	9953ca6cb7	virtio: fix virtio_net xmit of freed skb bug On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote: > If we fail to transmit a packet, we assume the queue is full and put > the skb into last_xmit_skb. However, if more space frees up before we > xmit it, we loop, and the result can be transmitting the same skb twice. > > Fix is simple: set skb to NULL if we've used it in some way, and check > before sending. ... > diff -r 564237b31993 drivers/net/virtio_net.c > --- a/drivers/net/virtio_net.c Mon May 19 12:22:00 2008 +1000 > +++ b/drivers/net/virtio_net.c Mon May 19 12:24:58 2008 +1000 > @@ -287,21 +287,25 @@ again: > free_old_xmit_skbs(vi); > > /* If we has a buffer left over from last time, send it now. / > - if (vi->last_xmit_skb) { > + if (unlikely(vi->last_xmit_skb)) { > if (xmit_skb(vi, vi->last_xmit_skb) != 0) { > / Drop this skb: we only queue one. */ > vi->dev->stats.tx_dropped++; > kfree_skb(skb); > + skb = NULL; > goto stop_queue; > } > vi->last_xmit_skb = NULL; With this, may drop an skb and then later in the function discover that we could have sent it after all. Poor wee skb :) How about the incremental patch below? Cheers, Mark. Subject: [PATCH] virtio_net: Delay dropping tx skbs Currently we drop the skb in start_xmit() if we have a queued buffer and fail to transmit it. However, if we delay dropping it until we've stopped the queue and enabled the tx notification callback, then there is a chance space might become available for it. Signed-off-by: Mark McLoughlin <markmc@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2008-07-25 12:06:00 +10:00
Adrian Bunk	fb2e405fc1	fix fs/nfs/nfsroot.c compilation This fixes the following compile error caused by commit `f9247273cb` ("UFS: add const to parser token table"): CC fs/nfs/nfsroot.o /home/bunk/linux/kernel-2.6/git/linux-2.6/fs/nfs/nfsroot.c:130: error: tokens causes a section type conflict make[3]: *** [fs/nfs/nfsroot.o] Error 1 Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-07-24 17:32:41 -07:00

... 5 6 7 8 9 ...

106139 Commits All Branches Search

106139 Commits

All Branches