Commit Graph

5173 Commits

Author SHA1 Message Date
Paolo 'Blaisorblade' Giarrusso 20d0021394 [PATCH] uml: allow building as 32-bit binary on 64bit host
This patch makes the command:

make ARCH=um SUBARCH=i386

work on x86_64 hosts (with support for building 32-bit binaries).  This is
especially needed since 64-bit UMLs don't support 32-bit emulation for guest
binaries, currently.  This has been tested in all possible cases and works.

The only exception is that I built but did not test a 64-bit binary, because I
had no 64-bit filesystem available.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Paolo 'Blaisorblade' Giarrusso ecc354a90a [PATCH] uml: reintroduce pcap support
The pcap support was not working because of some linking problems (expressing
the construct in Kbuild was a bit difficult) and because there was no user
request.  Now that this has come back, here's the support.

This has been tested and works on both 32 and 64-bit hosts, even when
"cross-"building 32-bit binaries.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Paolo 'Blaisorblade' Giarrusso 8e0a218124 [PATCH] uml: fix hppfs error path
Fix the error message to refer to the error code, i.e.  err, not count, plus
add some cosmetic fixes.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:25 -07:00
Paolo 'Blaisorblade' Giarrusso 1c30385ae4 [PATCH] uml: gcc 2.95 fix and Makefile cleanup
1) Clean up some ugly, deeply nested code in the Makefile (now only the
   arithmetic expression is passed through the host bash).

2) Fix a problem with GCC 2.95: according to a report from Raphael Bossek,
   .remap_data : { arch/um/sys-SUBARCH/unmap_fin.o (.data .bss) } is expanded
   into: .remap_data : { arch/um/sys-i386 /unmap_fin.o (.data .bss) }

(because I didn't use ## to join the two tokens), thus stopping linking.  Pass
the whole path from the Makefile as a simple and nice fix.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Raphael Bossek <raphael.bossek@gmx.de>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:24 -07:00
Paolo 'Blaisorblade' Giarrusso 2e5e55923e [PATCH] uml: consolidate modify_ldt
*) Reorganize the two cases of sys_modify_ldt to share all the reasonably
   common code.

*) Avoid memory allocation when unneeded (i.e.  when we are writing and the
   passed buffer size is known), thus not returning ENOMEM (which isn't
   allowed for this syscall, even if there is no strict "specification").

*) Add copy_{from,to}_user to modify_ldt for TT mode.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:24 -07:00
Paolo 'Blaisorblade' Giarrusso 1feb8d2d73 [PATCH] uml: workaround host bug in "TT mode vs. NPTL link fix"
A big bug has been diagnosed on hosts running the SKAS patch and built with
CONFIG_REGPARM, due to some missing prevent_tail_call().

On these hosts, this workaround is needed to avoid triggering that bug,
because "to" is kept by GCC only in EBX, which is corrupted at the return of
mmap2().

Since int 0x80 must be used for the call in order to trigger this bug, it
rarely manifests itself.  I'd prefer to get this merged to work around that
host bug, since it should cause no functional change.  Still, you might prefer
to drop it; I'll leave this to you.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:24 -07:00
Paolo 'Blaisorblade' Giarrusso bcb01b8a67 [PATCH] uml: fix lvalue for gcc4
Russell King <rmk+lkml@arm.linux.org.uk>

This construct is refused by GCC 4, so here's the (corrected) fix.  Thanks to
Russell for noticing a stupid mistake I made when first sending this.

As he noted, the code is largely suboptimal; however, it currently works and
will be fixed shortly.  Just read the access_ok check on fp, which is NULL, or
the pointer arithmetic below, which should be done with a cast to void*:

 	frame = (struct rt_sigframe __user *)
 		round_down(stack_top - sizeof(struct rt_sigframe), 16) - 8;

The code clearly shows that it was taken from
arch/x86_64/kernel/signal.c:setup_rt_frame(), maybe in a bit of a hurry.
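
The pointer-arithmetic remark can be illustrated with a small user-space
sketch.  It is not the UML code: struct demo_frame, demo_stack and the demo_*
helpers are hypothetical names invented here, and the sketch assumes a flat
address space where pointer/uintptr_t conversions are linear.  It only shows
that subtracting 8 after casting to a struct pointer moves by 8 whole
structures, while subtracting on a byte pointer moves by 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rt_sigframe; only its size matters here. */
struct demo_frame {
	unsigned char bytes[64];
};

/* 16-byte aligned scratch area playing the role of the signal stack. */
static _Alignas(16) unsigned char demo_stack[2048];

/* Round value down to a multiple of align (align must be a power of two). */
static uintptr_t demo_round_down(uintptr_t value, uintptr_t align)
{
	return value & ~(align - 1);
}

/* Address that the "- 8" adjustment starts from. */
static uintptr_t demo_base(void)
{
	uintptr_t top = (uintptr_t)(demo_stack + sizeof(demo_stack));
	uintptr_t base = demo_round_down(top - sizeof(struct demo_frame), 16);

	assert(base % 16 == 0);
	return base;
}

/* Cast first, then subtract: the 8 counts whole frames. */
size_t demo_step_struct_pointer(void)
{
	uintptr_t base = demo_base();
	struct demo_frame *frame = (struct demo_frame *)base - 8;

	return (size_t)(base - (uintptr_t)frame);
}

/* Subtract on a byte pointer, then cast: the 8 counts bytes. */
size_t demo_step_byte_pointer(void)
{
	uintptr_t base = demo_base();
	struct demo_frame *frame =
		(struct demo_frame *)((unsigned char *)base - 8);

	return (size_t)(base - (uintptr_t)frame);
}
```

With a 64-byte structure the first helper reports a step of 8 * 64 bytes and
the second a step of 8 bytes, which is presumably the gap the message alludes
to when it asks for a cast to void*.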

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:24 -07:00
Michael Krufky 3952db66ef [PATCH] dvb: LGDT3302 QAM lock bug fix
Fix QAM lock bug.  Previously, it was necessary to first scan in VSB before
attempting to get a QAM lock.

Signed-off-by: Mac Michaels <wmichaels1@earthlink.net>
Signed-off-by: Michael Krufky <mkrufky@m1k.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-14 09:00:24 -07:00
Chen, Kenneth W 7fce2cf62e [SCSI] Redundant this_count check in sd_init_command()
I was going over the SCSI I/O submit path: when sd_init_command()
constructs the SCSI command, this_count is already checked in the
previous else-if clause.  Why does it need to be checked again in
the last else block?

Patch to delete the spurious check.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:25:17 -04:00
Chen, Kenneth W 0f34e3f533 [SCSI] Redundant memset in scsi_alloc_sgtable
scsi_init_io calls scsi_alloc_sgtable and then calls blk_rq_map_sg
to initialize the scatterlist structure.  blk_rq_map_sg() already
memsets the structure for every new segment.  That makes the memset
in scsi_alloc_sgtable unnecessary.

Patch to delete the extra memset in scsi_alloc_sgtable.  Tested on
an x86_64 machine.  Looks stable to me.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:24:12 -04:00
James.Smart@Emulex.Com 2f4701d827 [SCSI] add int_to_scsilun() function
One of the issues we had was converting the midlayer's LUN value back
into the 8-byte LUN value that we wanted to send to the device.
Historically, there's been some combination of byte swapping,
setting high/low, etc.  There's also been no common thread between
how our driver did it and how others did.  I also got very confused as
to why byteswap routines were being used.

Anyway, this patch adds an LLDD-callable function that converts the
midlayer's LUN value, stored in an int, back to the 8-byte quantity
(note: this is not the real 8-byte quantity, just the same amount
that scsilun_to_int() was able to convert and store originally).
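
As a rough user-space sketch of the conversion described above (not the
patch itself): the demo_* names and struct demo_scsi_lun are hypothetical,
and the byte layout (16-bit big-endian levels, low 16 bits of the int first)
is an assumption about how the two directions pair up.  Only the first
sizeof(unsigned int) bytes carry data, which matches the note that the result
is not the full 8-byte quantity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte LUN container used only for this sketch. */
struct demo_scsi_lun {
	uint8_t bytes[8];
};

/* Spread an int LUN over the 8-byte form, two bytes per 16-bit level. */
void demo_int_to_lun(unsigned int lun, struct demo_scsi_lun *out)
{
	assert(out != NULL);
	memset(out->bytes, 0, sizeof(out->bytes));
	for (size_t i = 0; i < sizeof(lun); i += 2) {
		out->bytes[i] = (lun >> 8) & 0xFF;	/* high byte of level */
		out->bytes[i + 1] = lun & 0xFF;		/* low byte of level */
		lun >>= 16;
	}
}

/* Inverse direction: fold the leading bytes back into an int. */
unsigned int demo_lun_to_int(const struct demo_scsi_lun *in)
{
	unsigned int lun = 0;

	assert(in != NULL);
	for (size_t i = 0; i < sizeof(lun); i += 2)
		lun |= ((unsigned int)in->bytes[i] << 8 | in->bytes[i + 1])
			<< (i * 8);
	return lun;
}
```

Keeping both directions next to each other makes the round trip easy to
check, and no byte-swap helpers are needed because each byte is placed
explicitly.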

This also solves the dilemma of the thread:
http://marc.theaimsgroup.com/?l=linux-kernel&m=112116767118981&w=2

A patch for the lpfc driver to use this function will be along
in a few days (batched with other patches).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:21:27 -04:00
Andrew Vasquez 77d7414361 [SCSI] qla2xxx: Cleanup FC remote port registration.
Cleanup FC remote port registration.

Due to the inherent behaviour (an immediate scan) of adding
a 'target'-role-capable rport via fc_remote_port_add(),
split the registration into two steps -- addition as
unknown-type role, then use fc_remote_port_rolchg() with
appropriate role (based on PLOGI/PRLI bits).  This allows
for cleaner rport->dd_data management, as can be seen
in the simplified qla2xxx_slave_alloc() function.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:15:55 -04:00
Andrew Vasquez 88c2666351 [SCSI] qla2xxx: Consolidate ISP24xx chip reset logic.
Consolidate ISP24xx chip reset logic.

Consolidate near-duplicate RISC reset logic from
qla24xx_reset_chip() and qla24xx_chip_diag().  Also, after
initiating a soft-reset, ensure the firmware has completed
all NVRAM accesses before continuing.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:15:31 -04:00
Andrew Vasquez f0883ac6a7 [SCSI] qla2xxx: Add firmware version number to qla24xx_fw_version_str().
Add firmware version number to qla24xx_fw_version_str().

Original code was accidentally trimmed during the port.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:15:10 -04:00
Mark Haverkamp 84e29308ed [SCSI] aacraid: Fix sgmap error
The wrong sgmap structure is being assigned in aac_send_raw_srb.

Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:14:45 -04:00
Andrew Vasquez 97cbe08ff8 [SCSI] qla2xxx: Update version number to 8.01.00b5-k.
Update version number to 8.01.00b5-k.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:13:36 -04:00
Andrew Vasquez cc4731f5b4 [SCSI] qla2xxx: Correct maximum supported lun and target-id definitions.
Correct maximum supported lun and target-id definitions.

The driver uses command-IOCBs which support a maximum lun
value of 0xffff -- correct #define to reflect the change.
Also, remove superfluous MAX_TARGET definition.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:13:11 -04:00
Andrew Vasquez ae91193cd5 [SCSI] qla2xxx: Update copyright banner.
Update copyright banner.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:12:12 -04:00
Andrew Vasquez f04a311fdc [SCSI] qla2xxx: Firmware updates.
Firmware updates.

Resync with latest 21xx firmware      -- 1.19.25.
Resync with latest 22xx firmware      -- 2.02.08.
Resync with latest 23xx/63xx firmware -- 3.03.15.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:09:05 -04:00
Andrew Vasquez fa2a1ce53d [SCSI] qla2xxx: Code scrubbing.
Code scrubbing.

 - Remove trailing whitespace from driver files.
 - Remove unused #defines and inlines.
 - Standardize on C comments (// -> /* */)

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:03:15 -04:00
Andrew Vasquez ba5140b48e [SCSI] qla2xxx: NVRAM id-list updates.
NVRAM id-list updates.

Resync with latest NVRAM subsystem ID list.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:02:48 -04:00
Andrew Vasquez fca2970371 [SCSI] qla2xxx: Add OS initialization codes for ISP24xx recognition.
Add OS initialization codes for ISP24xx recognition.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:02:23 -04:00
Andrew Vasquez 0107109ed6 [SCSI] qla2xxx: Add ISP24xx initialization routines.
Add ISP24xx initialization routines.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:02:06 -04:00
Andrew Vasquez 9a853f7180 [SCSI] qla2xxx: Add ISP24xx ISR routines.
Add ISP24xx ISR routines.

Add appropriate glue-code for ISP24xx support -- this
included generalizing some of the core handling
routines (qla2x00_async_event() [pull-up retrieval of
mailbox values] and qla2x00_status_entry()).  Fixup
2100/2300 ISRs to handle the new conventions.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:00:27 -04:00
Andrew Vasquez 2b6c0cee90 [SCSI] qla2xxx: Add ISP24xx IOCB manipulation routines.
Add ISP24xx IOCB manipulation routines.

Add appropriate glue-code for ISP24xx support while
manipulating IOCB packets.  Add an ISP24xx specific
'start_scsi' routine due to command-type-7 layout
changes.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 11:00:07 -04:00
Andrew Vasquez 459c537807 [SCSI] qla2xxx: Add ISP24xx flash-manipulation routines.
Add ISP24xx flash-manipulation routines.

Add read/write flash manipulation routines for the ISP24xx.
Update sysfs NVRAM objects to use generalized accessor
functions.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:56:54 -04:00
Andrew Vasquez 1c7c63574f [SCSI] qla2xxx: Add MBX command routines for ISP24xx support.
Add MBX command routines for ISP24xx support.

Generalize several routines [qla2x00_load_ram_ext(),
qla2x00_execute_fw(), qla2x00_verify_checksum()] to handle
larger addressing space.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:56:39 -04:00
Andrew Vasquez 8c958a99d6 [SCSI] qla2xxx: Generalize SNS generic-services routines.
Generalize SNS generic-services routines.

Consolidate completion-status checking while adding support
for the ISP24xx.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:56:14 -04:00
Andrew Vasquez 6d9b61ed94 [SCSI] qla2xxx: Add ISP24xx diagnostic routines.
Add ISP24xx diagnostic routines.

Add function and structure definitions for the ISP24xx
diagnostic firmware dump routines.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:55:08 -04:00
Andrew Vasquez 3d71644cf9 [SCSI] qla2xxx: Add ISP24xx definitions.
Add ISP24xx definitions.

Add requisite structure definitions and #define's for ISP24xx
support.  Also drop volatile modifiers from device_reg_* register
layouts as the members are never really accessed, only their
offsets within the layout are used during reads and writes.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:54:46 -04:00
Andrew Vasquez ac96202ba0 [SCSI] qla2xxx: Add pci ids for new ISP types.
Add pci ids for new ISP types.

Move old definitions in local qla_def.h file to pci_ids.h as
well.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:54:20 -04:00
Andrew Vasquez abbd8870b9 [SCSI] qla2xxx: Factor-out ISP specific functions to method-based call tables.
Factor-out ISP specific functions to method-based call tables.

In anticipation of ISP24xx/ISP25xx support, factor-out ISP
specific functions into a method-based call table.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-07-14 10:47:30 -04:00
Linus Torvalds 514fd7fd01 Merge master.kernel.org:/pub/scm/linux/kernel/git/aegl/linux-2.6 2005-07-13 15:48:33 -07:00
Anton Altaparmakov c514720716 Automatic merge with /usr/src/ntfs-2.6.git. 2005-07-13 23:09:23 +01:00
Miles Bader 1e279dd855 [PATCH] v850: Align ___start___param to match parameter alignment
Signed-off-by: Miles Bader <miles@gnu.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 12:25:48 -07:00
Linus Torvalds 3720bd8b1e Merge master.kernel.org:/pub/scm/linux/kernel/git/tglx/mtd-2.6 2005-07-13 12:19:30 -07:00
Tony Luck 99ad25a313 Auto merge with /home/aegl/GIT/linus 2005-07-13 12:15:43 -07:00
David Mosberger-Tang f62c4a96f7 [IA64] Make PCDP work again.
Mark's patch added "__attribute__((packed))" for pcdp_uart, without
accounting for the fact that the structure definition _relied_ on
implicit padding by 6 bytes.  Fix is to make the padding explicit.
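
The effect can be reproduced with a small sketch.  The structures below are
hypothetical (they are not pcdp_uart and use 3 bytes of padding rather than
6), and the packed attribute is the GCC/Clang spelling; the point is only
that packing removes the compiler's hidden padding unless it is written out
as a real member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler inserts 3 hidden bytes before "value". */
struct demo_natural {
	uint8_t  type;
	uint32_t value;
};

/* Packed without compensation: "value" slides down to offset 1. */
struct demo_packed {
	uint8_t  type;
	uint32_t value;
} __attribute__((packed));

/* Packed with the padding spelled out: "value" stays at offset 4. */
struct demo_packed_explicit {
	uint8_t  type;
	uint8_t  pad[3];
	uint32_t value;
} __attribute__((packed));

/* Compile-time check that explicit padding restores the old offsets. */
static_assert(offsetof(struct demo_packed_explicit, value) ==
	      offsetof(struct demo_natural, value),
	      "explicit padding must reproduce the natural layout");
```

A compile-time check like the one at the end is one way to catch this kind of
layout change when a structure mirrors a firmware or on-disk format.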

Signed-off-by: David Mosberger-Tang <David.Mosberger@acm.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-13 11:56:32 -07:00
Dean Nelson 59a0a8aa6a [IA64] fix call of smp_processor_id() by XPC while
XPC calls smp_processor_id() twice from xpc_setup_infrastructure() with
preemption enabled, which gets flagged if 'DEBUG_PREEMPT=y'. This patch
replaces the two calls to smp_processor_id() by a single call to
raw_smp_processor_id() since any CPU within the partition will do.

Signed-off-by: Dean Nelson <dcn@sgi.com>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2005-07-13 11:52:45 -07:00
Geert Uytterhoeven a61caa8523 [PATCH] Amiga joystick: Fix typo introduced by the open/close race fixes
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:44:27 -07:00
Olof Johansson f264cc2824 [PATCH] ppc64: add 970MP PVR
Add PVR value and tests for 970MP.  Also switch to a simpler (but slightly
longer) check at init time for simplicity.

Signed-off-by: Olof Johansson <olof@austin.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:25 -07:00
David Gibson 96e2844999 [PATCH] ppc64: kill bitfields in ppc64 hash code
This patch removes the use of bitfield types from the ppc64 hash table
manipulation code.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:25 -07:00
Olaf Hering f13487c66c [PATCH] ppc32: make -j12 all fails in uImage target
make -j zImage may call if_changed twice at the same time; the result is a
corrupted vmlinux.gz.

Write to a temporary file for the time being until someone with make skills
fixes the serialization properly.

Signed-off-by: Olaf Hering <olh@suse.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Tom Rini <trini@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:25 -07:00
Steve Dickson 7ee91ec14b [PATCH] NFS: procfs/sysctl interfaces for lockd do not work on x86_64
Allow the setting of NLM timeouts and grace periods through the proc and
sysctl interfaces on x86_64 architectures.

Signed-off-by: Steve Dickson <steved@redhat.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Martin Schwidefsky 068e1b94bb [PATCH] s390: fadvise hint values.
Add special case for the POSIX_FADV_DONTNEED and POSIX_FADV_NOREUSE hint
values for s390-64.  The user space values in the s390-64 glibc headers for
these two defines have always been 6 and 7 instead of 4 and 5.  All 64 bit
applications therefore use the "wrong" values.  To get these applications
working without recompiling, the kernel needs to accept the "wrong" values.
Since the values for s390-31 are 4 and 5, the compat wrappers for fadvise64
and fadvise64_64 need to rewrite the values for 31-bit system calls.
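
The rewrite for 31-bit callers can be sketched as a plain mapping function.
The constant and function names are hypothetical; only the numbers (4 and 5
on s390-31, 6 and 7 in the s390-64 glibc headers) come from the description
above, and other hint values are assumed to pass through unchanged.

```c
/* Hint values as seen by 31-bit and 64-bit s390 user space. */
enum {
	DEMO_FADV_DONTNEED_31 = 4,
	DEMO_FADV_NOREUSE_31  = 5,
	DEMO_FADV_DONTNEED_64 = 6,
	DEMO_FADV_NOREUSE_64  = 7
};

/*
 * Translate the advice argument of a 31-bit fadvise call into the value
 * the 64-bit kernel accepts.  Values other than the two special cases
 * are returned unchanged.
 */
int demo_compat_fadvise_advice(int advice)
{
	switch (advice) {
	case DEMO_FADV_DONTNEED_31:
		return DEMO_FADV_DONTNEED_64;
	case DEMO_FADV_NOREUSE_31:
		return DEMO_FADV_NOREUSE_64;
	default:
		return advice;
	}
}
```

Doing the translation only in the compat path leaves existing 64-bit binaries
untouched, which matches the stated goal of not requiring a recompile.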

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Guillaume Autran ddca3b80ce [PATCH] ppc32: fix destroy_context() race condition
Fix for a race condition when a task gets preempted by another task while
executing the destroy_context(...) in a FEW_CONTEXTS environment.
mm->context == NO_CONTEXT but the context_map may indicate all contexts are
in use.

The solution to this problem is to disable kernel preemption while
destroying a MMU context.

Signed-off-by: Guillaume Autran <gautran@mrv.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Anton Altaparmakov 88bd5121d6 [PATCH] Fix soft lockup due to NTFS: VFS part and explanation
Something has changed in the core kernel such that we now get concurrent
inode write outs, e.g. one via pdflush and one via sys_sync or whatever.
This causes a nasty deadlock in ntfs.  The only clean solution
unfortunately requires a minor vfs api extension.

First the deadlock analysis:

Prerequisite knowledge: NTFS has a file $MFT (inode 0) loaded at mount
time.  The NTFS driver uses the page cache for storing the file contents as
usual.  More interestingly, this file contains the table of on-disk inodes
as a sequence of MFT_RECORDs.  Thus the NTFS driver accesses the on-disk
inodes by accessing the MFT_RECORDs in the page cache pages of the loaded
inode $MFT.

The situation: VFS inode X on a mounted ntfs volume is dirty.  For the same
inode X, the ntfs_inode is dirty and thus so is the corresponding on-disk
inode, which, as explained above, lives in a dirty PAGE_CACHE_PAGE belonging
to the table of inodes ($MFT, inode 0).

What happens:

Process 1: sys_sync()/umount()/whatever...  calls __sync_single_inode() for
$MFT -> do_writepages() -> write_page for the dirty page containing the
on-disk inode X, the page is now locked -> ntfs_write_mst_block() which
clears PageUptodate() on the page to prevent anyone else getting hold of it
whilst it does the write out (this is necessary as the on-disk inode needs
"fixups" applied before the write to disk which are removed again after the
write and PageUptodate is then set again).  It then analyses the page
looking for dirty on-disk inodes and when it finds one it calls
ntfs_may_write_mft_record() to see if it is safe to write this on-disk
inode.  This then calls ilookup5() to check if the corresponding VFS inode
is in icache().  This in turn calls ifind() which waits on the inode lock
via wait_on_inode whilst holding the global inode_lock.

Process 2: pdflush results in a call to __sync_single_inode for the same
VFS inode X on the ntfs volume.  This locks the inode (I_LOCK) then calls
write-inode -> ntfs_write_inode -> map_mft_record() -> read_cache_page() of
the page (in page cache of table of inodes $MFT, inode 0) containing the
on-disk inode.  This page has PageUptodate() clear because of Process 1
(see above) so read_cache_page() blocks when it tries to take the page lock
for the page so it can call ntfs_readpage().

Thus Process 1 is holding the page lock on the page containing the on-disk
inode X and it is waiting on the inode X to be unlocked in ifind() so it
can write the page out and then unlock the page.

And Process 2 is holding the inode lock on inode X and is waiting for the
page to be unlocked so it can call ntfs_readpage() or discover that
Process 1 set PageUptodate() again and use the page.

Thus we have a deadlock due to ifind() waiting on the inode lock.

The only sensible solution: NTFS does not care whether the VFS inode is
locked or not when it calls ilookup5() (it doesn't use the VFS inode at
all, it just uses it to find the corresponding ntfs_inode which is of
course attached to the VFS inode (both are one single struct); and it uses
the ntfs_inode which is subject to its own locking so I_LOCK is irrelevant)
hence we want a modified ilookup5_nowait() which is the same as ilookup5()
but it does not wait on the inode lock.

Without such functionality I would have to keep my own ntfs_inode cache in
the NTFS driver just so I can find ntfs_inodes independent of their VFS
inodes which would be slow, memory and cpu cycle wasting, and incredibly
stupid given the icache already exists in the VFS.

Below is a patch that does the ilookup5_nowait() implementation in
fs/inode.c and exports it.

ilookup5_nowait.diff:

Introduce ilookup5_nowait() which is basically the same as ilookup5() but
it does not wait on the inode's lock (i.e. it omits the wait_on_inode()
done in ifind()).

This is needed to avoid a nasty deadlock in NTFS.

Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:25:24 -07:00
Robert Love 9a556e8908 [PATCH] inotify: misc cleanup
Really simple, basic cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Robert Love 5995f16b4a [PATCH] inotify: event ordering
This rearranges the event ordering for "open" to be consistent with the
ordering of the other events.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00
Robert Love 0399cb08c5 [PATCH] inotify: move sysctl
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from
"/proc/sys/fs".  Also some related cleanup.

Signed-off-by: Robert Love <rml@novell.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-13 11:09:31 -07:00