[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 2.6.2-rc2 nfsd+xfs spins in i_size_read()


Christoph Hellwig wrote:
> On Sat, Jan 31, 2004 at 09:59:06AM -0600, Steve Lord wrote:
> 
>>The vn_revalidate call is called out of linvfs_setattr,
>>which is called with the i_sem held, it is also potentially called out
>>of linvfs_getattr, although since the i_size is always maintained
>>as it is changed, this call should not actually be updating the size.
>>Possibly changing the code in vn_revalidate to do this:
>>
>>	if (i_size_read(inode) < va.va_size))
>>		i_size_write(inode, va.va_size);
>>
>>Would be a good starting point, I suspect those calls from the nfs
>>revalidate call are not really going to change the inode size. My
>>guess is this will make your problem go away.
>>
>>Probably some larger code restructure is needed so that revalidate
>>knows if the i_sem is held or not at this point.
> 
> 
> I think the right fix is to update the Linux i_size always shortly after
> di_size is updated.  There's a lot of updates in the directory code while
> should be handled by an i_size_write in the matching linvfs routines, and
> there's a few more but we should be able to handle those without
> vn_revalidate aswell.
> 

Changing the validate_fields call to use

	if (i_size_read(inode) != va.va_size)
		i_size_write(inode, va.va_size);

should take care of directories, certainly better than the
direct assignment to i_size which is in there now....
This is called from the directory ops which modify the
contents of a directory and should already be under the
i_sem. The vn_revalidate code should use a != test
as well of course...


> 
>>The O_DIRECT write case is the hard one. In XFS's internal view of
>>the world, the inode size is maintained via the XFS_ILOCK, but we
>>only hold that across metadata manipulation within the fs code,
>>not across I/O such as a call to generic_file_aio_write_nolock.
>>Right now the only way I see of dealing with that is to make
>>writes which we know will extent the file hold the i_sem for
>>the duration in the O_DIRECT case.
> 
> 
> That's the more difficult case.  Any reson why you'd hold i_sem
> for the whole O_DIRECT I/O instead of just for updating i_sem?
> 

We worked hard not to hold it, but that i_size_write in
the middle of the async_io code is tough to work around.

> Note that the EOF zeroing code also calls i_size_write.
> 

But that is called out of an extending setattr or a buffered write which
hold the semaphore. Hmm, actually O_DIRECT write can be in there as
well, but that is the same problem as the generic I/O path call.

Steve

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo _at_ vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


この情報があなたの探していたものかどうか選択してください。
yes/まさにこれだ!   no/違うなぁ   part/一部見つかった   try/これで試してみる

あなたが探していた情報はどのようなことか、ご自由に記入下さい。特に「まさにこれだ!」と言う場合は記入をお願いします。
例:「複数のマシンからCATV経由でipmasqueradeを利用してWebを参照したい場合の設定について」
References: