[Virtio-fs] A performance problem for buffer write

wangyan wangyan122 at huawei.com
Wed Aug 21 02:46:54 UTC 2019



On 2019/8/20 17:13, Stefan Hajnoczi wrote:
> On Thu, Aug 08, 2019 at 07:25:25PM +0800, wangyan wrote:
>> Hi all,
>> 	I hit a performance problem when testing buffered writes.
>> 	
>> 	Test environment:
>> 		Guest Kernel:
>> 			https://github.com/rhvgoyal/linux/tree/virtio-fs-dev-5.1
>> 		Host Qemu:
>> 			https://gitlab.com/virtio-fs/qemu/tree/virtio-fs-dev
>>
>> 	I formatted a ramdisk with an ext4 filesystem and mounted it as the
>> shared folder.
>> 	In the qemu build directory, run:
>> 		./virtiofsd -o vhost_user_socket=/tmp/vhostqemu -o source=/mnt/9pshare/ -o
>> cache=always -o writeback
>> 	and
>> 		./x86_64-softmmu/qemu-system-x86_64 -M pc -cpu host --enable-kvm -smp 2 -m
>> 8G,maxmem=8G -object
>> memory-backend-file,id=mem,size=8G,mem-path=/dev/hugepages,share=on -numa
>> node,memdev=mem -drive
>> if=none,id=root,format=qcow2,file=/mnt/sdb/mirrors/os.qcow2 -device
>> virtio-scsi -device scsi-disk,drive=root,bootindex=1 -object iothread,id=io
>> -device virtio-scsi-pci,iothread=io -net nic,model=virtio -net
>> tap,ifname=tap0,script=/etc/qemu-ifup,downscript=no -vnc 0.0.0.0:0 -chardev
>> socket,id=char0,path=/tmp/vhostqemu -device
>> vhost-user-fs-pci,queue-size=1024,chardev=char0,tag=myfs,cache-size=2G
>>
>> 	Test Result:
>> 		1. The bandwidth of buffered writes is only 691894KB/s, which is very
>> low. The test command:
>> 			fio -filename=/mnt/virtiofs/test -rw=write -bs=1M -size=1G -iodepth=1
>> -ioengine=psync -numjobs=1 -group_reporting -name=1M -time_based -runtime=30
>> 			1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
>> 			fio-2.13
>> 			Starting 1 process
>> 			4K: Laying out IO file(s) (1 file(s) / 1024MB)
>> 			Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/695.2MB/0KB /s] [0/695/0 iops]
>> [eta 00m:00s]
>> 			4K: (groupid=0, jobs=1): err= 0: pid=5580: Thu Aug  8 18:29:20 2019
>> 			  write: io=20271MB, bw=691894KB/s, iops=675, runt= 30001msec
>> 				clat (usec): min=359, max=12348, avg=1423.88, stdev=1587.96
>> 				 lat (usec): min=407, max=12399, avg=1475.98, stdev=1588.17
>>
>>
>> 		2. The average latency is 6.64 usec, which is very high. The test command:
>> 			fio -filename=/mnt/virtiofs/test -rw=write -bs=4K -size=1G -iodepth=1
>> -ioengine=psync -numjobs=1 -group_reporting -name=4K -time_based -runtime=30
>> 			4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
>> 			fio-2.13
>> 			Starting 1 process
>> 			Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/478.9MB/0KB /s] [0/123K/0 iops]
>> [eta 00m:00s]
>> 			4K: (groupid=0, jobs=1): err= 0: pid=5685: Thu Aug  8 18:35:51 2019
>> 			  write: io=13302MB, bw=454013KB/s, iops=113503, runt= 30001msec
>> 				clat (usec): min=2, max=7196, avg= 6.03, stdev=75.27
>> 				lat (usec): min=3, max=7196, avg= 6.64, stdev=75.27
>>
>> 	I found that the condition 'if (!TestSetPageDirty(page))' in
>> '__set_page_dirty_nobuffers' is always true, so much time is wasted on dirty
>> accounting and marking the inode dirty: no page is ever still dirty when it
>> is written the second time.
>> 	The buffered write call stack:
>> 		fuse_file_write_iter
>> 			->fuse_cache_write_iter
>> 				->generic_file_write_iter
>> 					->__generic_file_write_iter
>> 						->generic_perform_write
>> 							->fuse_write_end
>> 								->set_page_dirty
>> 									->__set_page_dirty_nobuffers
>> 	
>> 	The reason 'if (!TestSetPageDirty(page))' is always true may be that the
>> pdflush (writeback) process calls fuse_writepages, which clears each page's
>> dirty flag in write_cache_pages and then calls fuse_writepages_send to flush
>> all the pages to the host's disk. So when a page is written the second time,
>> it is never dirty.
>> 	
>> 	I want to know whether there is any problem in this write-back procedure.
>
> Hi,
> It's likely that few people have tried the FUSE writeback feature with
> virtio-fs so far.
>
> Have you found out any more information since sending this email?
>
> Stefan
>
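
For reference, a simplified sketch of the path discussed above (my paraphrase
of __set_page_dirty_nobuffers() in the 5.1-era kernel, not the exact source):
when the page is not already dirty, every buffered write falls into the slow
path that does dirty accounting and marks the inode dirty.

	int __set_page_dirty_nobuffers(struct page *page)
	{
		if (!TestSetPageDirty(page)) {
			/*
			 * Page was clean: in my test this slow path runs on
			 * every write, because writeback already cleaned the
			 * page before the next write arrives.
			 */
			struct address_space *mapping = page_mapping(page);

			account_page_dirtied(page, mapping);
			/* ... tag the page dirty in the mapping ... */
			__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
			return 1;
		}
		/* Page was already dirty: cheap fast path, never taken here. */
		return 0;
	}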

I added a statement "SetPageDirty(page);" to fuse_writepages_fill(), so the
page is still dirty when it is written the second time.
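
The change is only an experiment, roughly like this (the exact insertion point
inside fuse_writepages_fill() is my guess, not a proposed patch):

	static int fuse_writepages_fill(struct page *page,
					struct writeback_control *wbc, void *_data)
	{
		/* ... existing code that adds the page to the write request ... */

		/*
		 * write_cache_pages() has already cleared the dirty flag via
		 * clear_page_dirty_for_io().  Re-set it here so the next
		 * buffered write to this page finds it dirty and takes the
		 * cheap TestSetPageDirty() path in __set_page_dirty_nobuffers().
		 */
		SetPageDirty(page);

		/* ... existing code that queues the request for sending ... */
		return 0;
	}
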
The pdflush stack for fuse:
     pdflush
       ->...
         ->do_writepages
           ->fuse_writepages
             ->write_cache_pages           // clears every page's dirty flag
               ->clear_page_dirty_for_io   // clears the page's dirty flag
               ->fuse_writepages_fill
             ->fuse_writepages_send        // writes all pages to the host

The test results with this change are:
1. Latency
     Test command:
         fio -filename=/mnt/virtiofs/test -rw=write -bs=4K -size=1G -iodepth=1 \
             -ioengine=psync -numjobs=1 -group_reporting -name=4K -time_based -runtime=30
	
	virtiofs: the average latency is 3.47 usec, lower than the 6.64 usec
	measured when the pages are never found dirty:
		4K: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/710.1MB/0KB /s] [0/182K/0 iops] [eta 00m:00s]
		4K: (groupid=0, jobs=1): err= 0: pid=5480: Wed Aug 21 10:23:07 2019
		  write: io=21189MB, bw=723216KB/s, iops=180804, runt= 30001msec
			clat (usec): min=2, max=832, avg= 2.86, stdev= 1.26
			 lat (usec): min=3, max=833, avg= 3.47, stdev= 1.39

2. Bandwidth
	Test command:
		fio -filename=/mnt/virtiofs/test -rw=write -bs=1M -size=1G -iodepth=1 \
		    -ioengine=psync -numjobs=1 -group_reporting -name=1M -time_based -runtime=30
	
	virtiofs: the bandwidth is 2050.7MB/s, much higher than the 691894KB/s
	(~676MB/s) measured when the pages are never found dirty:
		1M: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=psync, iodepth=1
		fio-2.13
		Starting 1 process
		Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/2058MB/0KB /s] [0/2058/0 iops] [eta 00m:00s]
		1M: (groupid=0, jobs=1): err= 0: pid=5755: Wed Aug 21 10:33:00 2019
		  write: io=61522MB, bw=2050.7MB/s, iops=2050, runt= 30001msec
			clat (usec): min=401, max=725, avg=431.75, stdev=11.44
			 lat (usec): min=446, max=822, avg=482.20, stdev=12.18


So, I also think this is mostly a FUSE file-system write-back question. So far,
I have not found the real reason.



