[Libguestfs] libnbd golang failure on RISC-V
Richard W.M. Jones
rjones at redhat.com
Thu Jun 9 15:48:34 UTC 2022
On Thu, Jun 09, 2022 at 04:24:12PM +0100, Richard W.M. Jones wrote:
> On Thu, Jun 09, 2022 at 03:20:02PM +0100, Daniel P. Berrangé wrote:
> > > + go test -count=1 -v
> > > === RUN Test010Load
> > > --- PASS: Test010Load (0.00s)
> > > === RUN TestAioBuffer
> > > --- PASS: TestAioBuffer (0.00s)
> > > === RUN TestAioBufferFree
> > > --- PASS: TestAioBufferFree (0.00s)
> > > === RUN TestAioBufferBytesAfterFree
> > > SIGABRT: abort
> > > PC=0x3fdf6f9bac m=0 sigcode=18446744073709551610
> >
> > So suggesting TestAioBufferBytesAfterFree is as fault, but quite
> > odd as that test case is trivial and whle it allocates some
> > native memory it doesn't seem to write to it. Unless the problem
> > happened in an earlier test case and we have delayed detection ?
> >
> > I guess I'd try throwing darts at the wall by chopping out bits
> > of test code to see what makes it disappear.
> >
> > Perhaps also try swapping MakeAioBuffer with MakeAioBufferZero
> > in case pre-existing data into the C.malloc()d block is confusing
> > Go ?
>
> Interestingly if I remove libnbd_020_aio_buffer_test.go completely,
> and disable GODEBUG, then the tests pass. (Reproducer commands at end
> of email). So I guess at least one of the problems is confined to
> this test and/or functions it calls in the main library.
> Unfortunately this test is huge.
>
> At your suggestion, replacing every MakeAioBuffer with
> MakeAioBufferZero in that test, but it didn't help. Also tried
> replacing malloc -> calloc in the aio_buffer.go implementation which
> didn't help.
>
> I'll try some more random things ...
Adding a few printf's shows something interesting:
=== RUN TestAioBufferBytesAfterFree
calling Free on 0x3fbc1882b0
calling C.GoBytes on 0x3fbc1882b0
SIGABRT: abort
PC=0x3fe6aaebac m=0 sigcode=18446744073709551610
goroutine 21 [running]:
gsignal
:0
abort
:0
runtime.throwException
../../../libgo/runtime/go-unwind.c:128
runtime.unwindStack
../../../libgo/go/runtime/panic.go:535
panic
../../../libgo/go/runtime/panic.go:750
runtime.panicmem
../../../libgo/go/runtime/panic.go:210
runtime.sigpanic
../../../libgo/go/runtime/signal_unix.go:634
_wordcopy_fwd_aligned
:0
__GI_memmove
:0
runtime.stringtoslicebyte
../../../libgo/go/runtime/string.go:155
__go_string_to_byte_array
../../../libgo/go/runtime/string.go:509
_cgo_23192bdcbd72_Cfunc_GoBytes
./cgo-c-prolog-gccgo:46
This is a simple use after free because the Free function in
aio_buffer.go frees the array and then the Bytes function attempts to
copy b.Size bytes from the NULL pointer.
I didn't write this test so I'm not quite sure what it's trying to
achieve. It seems to be deliberately trying to cause a panic, but
causes a segfault instead? (And why only on RISC-V?)
func TestAioBufferBytesAfterFree(t *testing.T) {
buf := MakeAioBuffer(uint(32))
buf.Free()
defer func() {
if r := recover(); r == nil {
t.Fatal("Did not recover from panic calling Bytes() after Free()")
}
}()
buf.Bytes()
}
Since this only happens on RISC-V I guess it might be something to do
with the golang implementation on this architecture being unable to
turn segfaults into panics.
Removing all three *AfterFree tests fixes the tests.
It seems a bit of an odd function however. Wouldn't it be better to
changes the Bytes function so that it tests if the pointer is NULL and
panics?
NB: this _does not_ address the other problem where GODEBUG=cgocheck=2
complains about "fatal error: Go pointer stored into non-Go memory".
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org
More information about the Libguestfs
mailing list