From 2a0f7f5769992bae5b3f97157fd80b2b943be485 Mon Sep 17 00:00:00 2001 From: Li Zefan Date: Mon, 10 Oct 2011 15:43:34 -0400 Subject: [PATCH 1/2] Btrfs: fix recursive auto-defrag Follow those steps: # mount -o autodefrag /dev/sda7 /mnt # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1 # sync # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc and then it'll go into a loop: writeback -> defrag -> writeback ... It's because writeback writes [8K, 200K] and then writes [0, 8K]. I tried to make writeback know if the pages are dirtied by defrag, but the patch was a bit intrusive. Here I simply set writeback_index when we defrag a file. Signed-off-by: Li Zefan Signed-off-by: Chris Mason --- fs/btrfs/ioctl.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 6f89bcc4e555..df40b7c5f06b 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1055,6 +1055,13 @@ int btrfs_defrag_file(struct inode *inode, struct file *file, if (!max_to_defrag) max_to_defrag = last_index - 1; + /* + * make writeback starts from i, so the defrag range can be + * written sequentially. + */ + if (i < inode->i_mapping->writeback_index) + inode->i_mapping->writeback_index = i; + while (i <= last_index && defrag_count < max_to_defrag) { /* * make sure we stop running if someone unmounts From f7f43cc84152e53b5687cd0eb8823310ba065524 Mon Sep 17 00:00:00 2001 From: Chris Mason Date: Tue, 11 Oct 2011 11:41:40 -0400 Subject: [PATCH 2/2] Btrfs: make sure not to defrag extents past i_size The btrfs file defrag code will loop through the extents and force COW on them. But there is a concurrent truncate in the middle of the defrag, it might end up defragging the same range over and over again. The problem is that writepage won't go through and do anything on pages past i_size, so the cow won't happen, so the file will appear to still be fragmented. defrag will end up hitting the same extents again and again. In the worst case, the truncate can actually live lock with the defrag because the defrag keeps creating new ordered extents which the truncate code keeps waiting on. The fix here is to make defrag check for i_size inside the main loop, instead of just once before the looping starts. Signed-off-by: Chris Mason --- fs/btrfs/ioctl.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index df40b7c5f06b..efc4e4a85a69 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1062,7 +1062,9 @@ int btrfs_defrag_file(struct inode *inode, struct file *file, if (i < inode->i_mapping->writeback_index) inode->i_mapping->writeback_index = i; - while (i <= last_index && defrag_count < max_to_defrag) { + while (i <= last_index && defrag_count < max_to_defrag && + (i < (i_size_read(inode) + PAGE_CACHE_SIZE - 1) >> + PAGE_CACHE_SHIFT)) { /* * make sure we stop running if someone unmounts * the FS