[BUG] NFSv4 clients write to the same NFS file on an NFSv4 server with write delegations

2014. 4. 24. 16:44

RHEL6.4 - 6.5: data corruption when multiple NFSv4 clients write to the same NFS file on an NFSv4 server with write delegations

Updated March 13, 2014 at 23:01

Issue

Data corruption when multiple NFS clients write to the same NFS file on an NFS server with write delegations, introduced in 2.6.32-358.23.1.el6
Data corruption when appending to an NFS file

Environment

Red Hat Enterprise Linux 6 (NFS client)
- kernel 2.6.32-358.23.1.el6 or later (RHEL 6.4.x)
- kernel 2.6.32-408.el6 or later (RHEL 6.5)
NFS Server with write delegations
- Seen on Solaris or NetApp NFS server
NFSv4
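
As a rough aid for checking a client against the kernel ranges listed above, the following Python sketch parses a kernel release string and tests it against the two listed thresholds. The function names are hypothetical, and the reading of the ranges is an assumption: a 358-series build at or above 358.23.1, or any el6 build at or above 408. It knows nothing about which later kernels carry the fix, so a match only means "within the range this article lists".

```python
import re


def parse_release(release):
    """Split e.g. '2.6.32-358.23.1.el6.x86_64' into ((2, 6, 32), (358, 23, 1)).

    Returns None for anything that is not an el6 kernel release string.
    """
    m = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el6", release)
    if m is None:
        return None
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return version, build


def in_listed_range(release):
    """True if the release falls in the ranges listed in this article.

    Assumed interpretation (not an official Red Hat check):
      - RHEL 6.4.x: 358-series builds >= 358.23.1
      - RHEL 6.5:   builds >= 408
    """
    parsed = parse_release(release)
    if parsed is None or parsed[0] != (2, 6, 32):
        return False
    build = parsed[1]
    in_64_stream = build[0] == 358 and build >= (358, 23, 1)
    in_65_or_later = build >= (408,)
    return in_64_stream or in_65_or_later


if __name__ == "__main__":
    import platform

    rel = platform.release()
    print(rel, "-> in listed range:", in_listed_range(rel))
```

Run on the NFS client, this prints the running kernel release and whether it falls inside the listed ranges; the NFS version and server-side delegation settings still have to be checked separately.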

Resolution

A fix is currently in progress, tracked by private Red Hat bugs 1054493 (RHEL6.6 or later) and 1066942 (RHEL6.5 maintenance kernel). Contact your Red Hat Support representative for more information.
Upstream commit 263b4509ec4d47e0da3e753f85a39ea12d1eff24 (nfs: always make sure page is up-to-date before extending a write to cover the entire page) addresses this problem.

Workaround

Downgrade to a kernel earlier than 2.6.32-358.23.1.el6 (RHEL 6.4.x) or earlier than 2.6.32-408.el6 (RHEL 6.5).
Use NFSv3.
Disable NFSv4 write delegations on the NFS server.
- For a NetApp server, contact NetApp support for official recommendations. Unofficially, you should be able to use the option 'nfs.v4.write_delegation' to determine whether write delegations are enabled.

Root Cause

This is a regression caused by commit c7559663e42f4294ffe31fe159da6b6a66b35d61:

    [fs] nfs: Allow nfs_updatepage to extend a write under additional circumstances

When determining whether to extend a write to cover an entire page in memory, the writer needs to determine whether the page is up-to-date. Commit c7559663e42f4294ffe31fe159da6b6a66b35d61 added logic to skip this check when the writer was holding a write delegation. Not reading the contents of the entire page first could cause data corruption when the page was written out to disk.
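
The effect of skipping the up-to-date check can be illustrated with a small user-space model. This is only a sketch of the idea, not the kernel code: the class and function names are hypothetical, the "page" is a plain byte buffer, and the stale bytes are assumed to be whatever happened to be in the client's page (here the junk seen in the reproducer further down).

```python
class CachedPage:
    """Client-side copy of one file page (hypothetical model).

    'uptodate' says whether the buffer really mirrors the server's data.
    """

    def __init__(self, data, uptodate):
        self.data = bytearray(data)
        self.uptodate = uptodate


def build_write(page, offset, payload, has_write_delegation, check_uptodate_first):
    """Return (offset, data) for the WRITE the client would send.

    check_uptodate_first=False models the regression: holding a write
    delegation is enough to extend the write over the whole page.
    check_uptodate_first=True models the fix: extend only if the page
    is known to be up to date.
    """
    end = offset + len(payload)
    page.data[offset:end] = payload  # copy the new bytes into the cached page

    if check_uptodate_first:
        can_extend = page.uptodate
    else:
        can_extend = has_write_delegation or page.uptodate

    if can_extend:
        # Extended write: everything from the start of the page to the end
        # of the new data, including bytes the client never read.
        return 0, bytes(page.data[:end])
    return offset, bytes(payload)


def apply_write(server_file, offset, data):
    """Apply a WRITE to the server's copy of the file."""
    server_file[offset:offset + len(data)] = data


def run(check_uptodate_first):
    server_file = bytearray(b"123456789\n")             # written by client 1
    stale = CachedPage(b"tch03.atla", uptodate=False)   # client 2 never read the page
    off, data = build_write(stale, 10, b"abcdefghi\n", True, check_uptodate_first)
    apply_write(server_file, off, data)
    return off, len(data), bytes(server_file)


if __name__ == "__main__":
    print("regressed:", run(check_uptodate_first=False))
    print("fixed:    ", run(check_uptodate_first=True))
```

In the regressed mode the model sends one write at offset 0 for 20 bytes and the first client's data is replaced by the stale buffer; with the check in place it sends only the 10 appended bytes at offset 10.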

Diagnostic Steps

Reproducer

On the NFS server, start a packet capture:

snoop -o capturefile nfsclient2

From nfsclient1:

echo 123456789 > /nfs/newfile

From nfsclient2:

echo abcdefghi >> /nfs/newfile

The resulting file is corrupted and looks like this:

$ hexdump -C newfile
00000000  74 63 68 30 33 2e 61 74  6c 61 61 62 63 64 65 66  |tch03.atlaabcdef|
00000010  67 68 69 0a                                       |ghi.|

snoop -i capturefile -V -x 0 shows the following in the write packet (use -v for more verbosity):

NFS C 4 () PUTFH FH=BB66 WRITE ST=1DBE:0 at 0 for 20

         240: 0002 0000 0014 7463 6830 332e 6174 6c61 ......tch03.atla
         256: 6162 6364 6566 6768 690a                   abcdefghi.

The second NFS client writes 10 bytes of garbage over the 123456789[newline] and then writes abcdefghi[newline]. The WRITE in the capture covers offset 0 for 20 bytes rather than only the 10 appended bytes at offset 10.
So the file has the expected size (20 bytes), but the original content (10 bytes) is overwritten with 10 bytes of junk, followed by the 10 bytes of abcdefghi[newline].
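
To check the outcome of the reproducer without reading a hexdump by eye, a small Python sketch like the one below can classify the file contents. The function name and the labels it returns are hypothetical; the expected bytes follow directly from the two echo commands above.

```python
import sys

# Bytes the file should contain after both echo commands succeed.
EXPECTED = b"123456789\n" + b"abcdefghi\n"


def classify(data):
    """Classify the reproducer's result file.

    'intact'                  - both writes are present and correct
    'first-write-overwritten' - right size and appended data, but the
                                first 10 bytes are not what client 1 wrote
    'other'                   - anything else (wrong size, missing append, ...)
    """
    if data == EXPECTED:
        return "intact"
    if len(data) == len(EXPECTED) and data[10:] == EXPECTED[10:]:
        return "first-write-overwritten"
    return "other"


if __name__ == "__main__":
    # Path defaults to the file used in the reproducer above.
    path = sys.argv[1] if len(sys.argv) > 1 else "/nfs/newfile"
    with open(path, "rb") as f:
        print(path, "->", classify(f.read()))
```

A result of "first-write-overwritten" corresponds to the pattern shown in the hexdump above: the appended data is intact while the first client's 10 bytes have been replaced.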
