[Cluster-devel] [PATCH v3 1/7] iov_iter: Introduce fault_in_iov_iter helper

Jan Kara jack at suse.cz
Mon Jul 26 16:33:26 UTC 2021


On Fri 23-07-21 22:58:34, Andreas Gruenbacher wrote:
> Introduce a new fault_in_iov_iter helper for manually faulting in an iterator.
> Unlike fault_in_pages_writeable(), this function is non-destructive.
> 
> We'll use fault_in_iov_iter in gfs2 once we've determined that the user buffer
> described by the iterator passed to .read_iter or .write_iter isn't resident in memory.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba at redhat.com>
...
> +unsigned long fault_in_user_pages(unsigned long start, unsigned long len,
> +				  bool write)
> +{
> +	struct mm_struct *mm = current->mm;
> +	struct vm_area_struct *vma = NULL;
> +	unsigned long end, nstart, nend;
> +	int locked = 0;
> +	int gup_flags;
> +
> +	/*
> +	 * FIXME: Make sure this function doesn't succeed for pages that cannot
> +	 * be accessed; otherwise we could end up in a loop trying to fault in
> +	 * and then access the pages.  (It's okay if a page gets evicted and we
> +	 * need more than one retry.)
> +	 */
> +
> +	/*
> +	 * FIXME: Are these the right FOLL_* flags?
> +	 */

What about the FIXMEs here? I guess we should answer these questions and
remove the comments before merging. Regarding the first FIXME, I tend to
agree that if we cannot fault in the first page, we should return an error
rather than returning 0 as you do now. OTOH the caller can check for 0 and
understand that something is wrong as well, but an error would probably be
a bit clearer.
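
FWIW, this is the kind of caller I have in mind (an untested sketch, not the
actual gfs2 code; my_fs_buffered_write() and the exact fault_in_iov_iter()
signature are made up for illustration):

	static ssize_t my_fs_write_iter(struct kiocb *iocb, struct iov_iter *from)
	{
		ssize_t ret;

		while (1) {
			ret = my_fs_buffered_write(iocb, from);	/* hypothetical */
			if (ret != -EFAULT)
				break;
			/*
			 * fault_in_iov_iter() is assumed to return the number
			 * of bytes it faulted in, like fault_in_user_pages()
			 * in this patch.  Zero means no progress at all, so
			 * give up instead of spinning on pages we can never
			 * access.
			 */
			if (!fault_in_iov_iter(from))
				return -EFAULT;
			/*
			 * A real caller also has to deal with partial progress
			 * and the iterator position before retrying.
			 */
		}
		return ret;
	}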

> +
> +	gup_flags = FOLL_TOUCH | FOLL_POPULATE;

I don't think FOLL_POPULATE makes sense here. It only has an effect together
with FOLL_MLOCK, where it determines whether mlock(2) should fault in missing
pages or not.
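
So I'd simply drop it. In case it helps, this is the flag setup I have in
mind (just an untested sketch):

	/*
	 * Per the above, FOLL_POPULATE only matters in combination with
	 * FOLL_MLOCK, so FOLL_TOUCH (plus FOLL_WRITE for the write case)
	 * should be all that is needed here.
	 */
	gup_flags = FOLL_TOUCH;
	if (write)
		gup_flags |= FOLL_WRITE;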

								Honza

> +	if (write)
> +		gup_flags |= FOLL_WRITE;
> +
> +	end = PAGE_ALIGN(start + len);
> +	for (nstart = start & PAGE_MASK; nstart < end; nstart = nend) {
> +		unsigned long nr_pages;
> +		long ret;
> +
> +		if (!locked) {
> +			locked = 1;
> +			mmap_read_lock(mm);
> +			vma = find_vma(mm, nstart);
> +		} else if (nstart >= vma->vm_end)
> +			vma = vma->vm_next;
> +		if (!vma || vma->vm_start >= end)
> +			break;
> +		nend = min(end, vma->vm_end);
> +		if (vma->vm_flags & (VM_IO | VM_PFNMAP))
> +			continue;
> +		if (nstart < vma->vm_start)
> +			nstart = vma->vm_start;
> +		nr_pages = (nend - nstart) / PAGE_SIZE;
> +		ret = __get_user_pages_locked(mm, nstart, nr_pages,
> +					      NULL, NULL, &locked, gup_flags);
> +		if (ret <= 0)
> +			break;
> +		nend = nstart + ret * PAGE_SIZE;
> +	}
> +	if (locked)
> +		mmap_read_unlock(mm);
> +	if (nstart > start)
> +		return min(nstart - start, len);
> +	return 0;
> +}
> +
>  /**
>   * get_dump_page() - pin user page in memory while writing it to core dump
>   * @addr: user address
> -- 
> 2.26.3
> 
-- 
Jan Kara <jack at suse.com>
SUSE Labs, CR



