kni: fix build with Linux 6.3

Message ID 20230228172909.2054386-1-ferruh.yigit@amd.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/Intel-compilation             success  Compilation OK
ci/github-robot: build           success  github build: passed
ci/intel-Testing                 success  Testing PASS
ci/intel-Functional              success  Functional PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/loongarch-compilation         success  Compilation OK
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/iol-testing                   success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS

Commit Message

Ferruh Yigit Feb. 28, 2023, 5:29 p.m. UTC
KNI calls the `get_user_pages_remote()` API with the `FOLL_TOUCH`
flag, but `FOLL_TOUCH` is no longer in the public headers as of v6.3,
causing a build error.

The `FOLL_*` defines in the Linux kernel were first moved to another
header [1]; later, some of them were moved to a header internal to the
memory subsystem [2] for v6.3.

As a quick fix for the build error, define the flag in the KNI
compatibility header when it is not defined in the Linux headers.

The risk in this approach is that the Linux kernel may change the flag's
value, which would then diverge from the value defined in KNI.

[1]
Commit b5054174ac7c ("mm: move FOLL_* defs to mm_types.h")

[2]
Commit 2c2241081f7d ("mm/gup: move private gup FOLL_ flags to internal.h")

Signed-off-by: Ferruh Yigit <ferruh.yigit@amd.com>
---
 kernel/linux/kni/compat.h | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

David Marchand Feb. 28, 2023, 8:45 p.m. UTC | #1
On Tue, Feb 28, 2023 at 6:29 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
>
> KNI calls the `get_user_pages_remote()` API with the `FOLL_TOUCH`
> flag, but `FOLL_TOUCH` is no longer in the public headers as of v6.3,
> causing a build error.

Something looks strange with what kni was doing.

Looking at get_user_pages_remote implementation, I see it internally
passes FOLL_TOUCH in addition to passed gup_flags.
And looking at FOLL_TOUCH definition, it seems natural (to me) that
this flag would be handled internally.

Maybe it changed over time... but then the question is when did
passing FOLL_TOUCH become unneeded?


Thanks.
  
David Marchand March 20, 2023, 12:10 p.m. UTC | #2
On Tue, Feb 28, 2023 at 9:45 PM David Marchand
<david.marchand@redhat.com> wrote:
> Maybe it changed over time... but then the question is when did
> passing FOLL_TOUCH become unneeded?

Here is some more info.

get_user_pages_remote() was added in kernel commit 1e9877902dc7
("mm/gup: Introduce get_user_pages_remote()").
At that time, it was already passing the FOLL_TOUCH flag internally.

+long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
+               unsigned long start, unsigned long nr_pages,
+               int write, int force, struct page **pages,
+               struct vm_area_struct **vmas)
 {
        return __get_user_pages_locked(tsk, mm, start, nr_pages, write, force,
-                                      pages, vmas, NULL, false, FOLL_TOUCH);
+                                      pages, vmas, NULL, false,
+                                      FOLL_TOUCH | FOLL_REMOTE);
+}
+EXPORT_SYMBOL(get_user_pages_remote);

get_user_pages_remote() later gained the ability to pass gup flags in
kernel commit 9beae1ea8930 ("mm: replace get_user_pages_remote()
write/force parameters with gup_flags").
But FOLL_TOUCH was still added internally.

 long get_user_pages_remote(struct task_struct *tsk, struct mm_struct *mm,
                unsigned long start, unsigned long nr_pages,
-               int write, int force, struct page **pages,
+               unsigned int gup_flags, struct page **pages,
                struct vm_area_struct **vmas)
 {
-       unsigned int flags = FOLL_TOUCH | FOLL_REMOTE;
-
-       if (write)
-               flags |= FOLL_WRITE;
-       if (force)
-               flags |= FOLL_FORCE;
-
        return __get_user_pages_locked(tsk, mm, start, nr_pages, pages, vmas,
-                                      NULL, false, flags);
+                                      NULL, false,
+                                      gup_flags | FOLL_TOUCH | FOLL_REMOTE);
 }


There were other changes in this area of the kernel code, but I did
not notice a change in relation with FOLL_TOUCH.

So I think the dpdk commit e73831dc6c26 ("kni: support userspace VA")
needlessly introduced the use of this flag and we can remove it.
Adding author and reviewers of this change.
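
The redundancy described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: `demo_gup_remote()`,
`demo_touch_is_redundant()` and the `DEMO_FOLL_*` names are hypothetical
stand-ins, and the bit values are illustrative rather than taken from any
kernel header.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel flags; values are illustrative only. */
#define DEMO_FOLL_WRITE  (1u << 0)
#define DEMO_FOLL_TOUCH  (1u << 1)
#define DEMO_FOLL_REMOTE (1u << 2)

/*
 * Models the flag handling shown in the second kernel diff above: the
 * callee always ORs in TOUCH and REMOTE, whatever the caller passed.
 */
static unsigned int demo_gup_remote(unsigned int gup_flags)
{
	return gup_flags | DEMO_FOLL_TOUCH | DEMO_FOLL_REMOTE;
}

/* Returns 1 when passing TOUCH explicitly makes no difference. */
static int demo_touch_is_redundant(unsigned int gup_flags)
{
	return demo_gup_remote(gup_flags | DEMO_FOLL_TOUCH) ==
	       demo_gup_remote(gup_flags);
}
```

Because bitwise OR is idempotent, the effective flags in this model are the
same whether or not the caller sets the touch bit itself.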
  
David Marchand March 20, 2023, 1:01 p.m. UTC | #3
On Mon, Mar 20, 2023 at 1:10 PM David Marchand
<david.marchand@redhat.com> wrote:
> So I think the dpdk commit e73831dc6c26 ("kni: support userspace VA")
> needlessly introduced the use of this flag and we can remove it.
> Adding author and reviewers of this change.

Alternatively, we could go with passing 0 in flags when FOLL_TOUCH is
not exported.
Something like:

diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
index 7aa6cd9fca..3164a48971 100644
--- a/kernel/linux/kni/compat.h
+++ b/kernel/linux/kni/compat.h
@@ -129,6 +129,14 @@
  */
 #if KERNEL_VERSION(4, 10, 0) <= LINUX_VERSION_CODE
 #define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
+
+/*
+ * get_user_pages_remote() should not require the flag FOLL_TOUCH to be passed.
+ * Simply pass it as 0 when this flag is internal and not exported anymore.
+ */
+#ifndef FOLL_TOUCH
+#define FOLL_TOUCH 0
+#endif
 #endif

 #if KERNEL_VERSION(5, 6, 0) <= LINUX_VERSION_CODE || \
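
To see why a fallback of 0 is harmless, here is a minimal user-space sketch.
It assumes `FOLL_TOUCH` is not defined by any included header (as on a
kernel that no longer exports it); `demo_caller_flags()` and
`DEMO_FOLL_WRITE` are hypothetical names with an illustrative value.

```c
#include <assert.h>

/*
 * Stand-in for a kernel where the flag is no longer exported: in this
 * user-space sketch FOLL_TOUCH is never defined beforehand, so the
 * fallback below takes effect.
 */
#ifndef FOLL_TOUCH
#define FOLL_TOUCH 0
#endif

#define DEMO_FOLL_WRITE (1u << 0) /* hypothetical value */

/* Builds flags the way a caller with an unconditional "| FOLL_TOUCH" would. */
static unsigned int demo_caller_flags(int writable)
{
	unsigned int flags = writable ? DEMO_FOLL_WRITE : 0;

	return flags | FOLL_TOUCH;
}
```

With the fallback in place, `flags | FOLL_TOUCH` leaves the other bits
untouched, so existing call sites compile unchanged and pass nothing extra.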
  
Vamsi Krishna Attunuru March 24, 2023, 3:04 a.m. UTC | #4
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Monday, March 20, 2023 6:31 PM
> To: Ferruh Yigit <ferruh.yigit@amd.com>; Vamsi Krishna Attunuru
> <vattunuru@marvell.com>; Kiran Kumar Kokkilagadda
> <kirankumark@marvell.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>
> Cc: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>
> Subject: [EXT] Re: [PATCH] kni: fix build with Linux 6.3
> 
> Alternatively, we could go with passing 0 in flags when FOLL_TOUCH is not
> exported.
> Something like:

Yes, this flag is useless. I vaguely remember adding it in v1 of that patch series, along with multiple kernel version checks,
but judging by the internals, none of them would need it.

We could pass 0 in flags as suggested.

  
David Marchand April 13, 2023, 7:22 a.m. UTC | #5
On Fri, Mar 24, 2023 at 4:04 AM Vamsi Krishna Attunuru
<vattunuru@marvell.com> wrote:
> > > So I think the dpdk commit e73831dc6c26 ("kni: support userspace VA")
> > > needlessly introduced the use of this flag and we can remove it.
> > > Adding author and reviewers of this change.
> >
> > Alternatively, we could go with passing 0 in flags when FOLL_TOUCH is not
> > exported.
> > Something like:
>
> Yes, this flag is useless. I vaguely remember adding it in v1 of that patch series, along with multiple kernel version checks,
> but judging by the internals, none of them would need it.
>
> We could pass 0 in flags as suggested.

Ok, thanks for the confirmation Vamsi.
Ferruh, can you submit a v2?
  
Ferruh Yigit April 14, 2023, 3:29 p.m. UTC | #6
On 3/20/2023 1:01 PM, David Marchand wrote:
> 
> Alternatively, we could go with passing 0 in flags when FOLL_TOUCH is
> not exported.
> Something like:
> 
> diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
> index 7aa6cd9fca..3164a48971 100644
> --- a/kernel/linux/kni/compat.h
> +++ b/kernel/linux/kni/compat.h
> @@ -129,6 +129,14 @@
>   */
>  #if KERNEL_VERSION(4, 10, 0) <= LINUX_VERSION_CODE
>  #define HAVE_IOVA_TO_KVA_MAPPING_SUPPORT
> +
> +/*
> + * get_user_pages_remote() should not require the flag FOLL_TOUCH to be passed.
> + * Simply pass it as 0 when this flag is internal and not exported anymore.
> + */
> +#ifndef FOLL_TOUCH
> +#define FOLL_TOUCH 0
> +#endif
>  #endif
> 
>  #if KERNEL_VERSION(5, 6, 0) <= LINUX_VERSION_CODE || \
> 
> 

In KNI, the 'get_user_pages_remote()' API is only called on kernel
versions 4.10 and later, and from that point on 'FOLL_TOUCH' is set
internally anyway, as you highlighted.

So I think we can stop setting 'FOLL_TOUCH' in all cases instead of
defining it locally in KNI. I have sent a v2 accordingly.
  

Patch

diff --git a/kernel/linux/kni/compat.h b/kernel/linux/kni/compat.h
index 7aa6cd9fca75..42305799ebbd 100644
--- a/kernel/linux/kni/compat.h
+++ b/kernel/linux/kni/compat.h
@@ -151,3 +151,8 @@ 
 	 RHEL_RELEASE_VERSION(9, 1) <= RHEL_RELEASE_CODE))
 #define HAVE_NETIF_RX_NI
 #endif
+
+/* defined in 'mm/internal.h' since v6.3 */
+#ifndef FOLL_TOUCH
+#define FOLL_TOUCH (1 << 16)
+#endif
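
The divergence risk mentioned in the commit message can be sketched in user
space. This is an illustration only: `demo_kernel_sees_touch()` is a
hypothetical helper, and the "changed" bit position used in the usage note
is invented for the example, not a value any kernel is known to use.

```c
#include <assert.h>

/* Fallback value as in the patch above. */
#ifndef FOLL_TOUCH
#define FOLL_TOUCH (1 << 16)
#endif

/*
 * Hypothetical model of how a kernel would test for the flag: it checks
 * whichever bit its own (internal) header assigns to "touch".
 */
static int demo_kernel_sees_touch(unsigned int flags,
				  unsigned int kernel_touch_bit)
{
	return (flags & kernel_touch_bit) != 0;
}
```

As long as the kernel keeps bit 16, `demo_kernel_sees_touch(FOLL_TOUCH, 1u << 16)`
returns 1. If the internal value moved to another bit, for example
`1u << 17`, the same call would return 0 and the bit KNI sets would be
read as some other flag.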