long running process crashed here #693

Closed
kolinfluence opened this issue Mar 29, 2023 · 15 comments

Labels
WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided.

Comments

@kolinfluence

kolinfluence commented Mar 29, 2023

Describe the bug

A long-running process crashed.

To Reproduce

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xb80ca2]

goroutine 359534 [running]:
github.com/cloudwego/hertz/pkg/network/standard.(*Conn).Len(...)
        /root/go/pkg/mod/github.com/cloudwego/hertz@v0.6.0/pkg/network/standard/connection.go:438
github.com/cloudwego/hertz/pkg/network/standard.(*Conn).Read(0x44a693?, {0xc000edce78?, 0x1?, 0x0?})
        /root/go/pkg/mod/github.com/cloudwego/hertz@v0.6.0/pkg/network/standard/connection.go:82 +0x22
io.ReadAtLeast({0x7fcbc1083e90, 0xc000e95780}, {0xc000edce78, 0x9, 0x9}, 0x9)
        /usr/local/go/src/io/io.go:332 +0x9a
io.ReadFull(...)
        /usr/local/go/src/io/io.go:351
github.com/hertz-contrib/http2.readFrameHeader({0xc000edce78?, 0x9?, 0xc00174a880?}, {0x7fcbc1083e90?, 0xc000e95780?})
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/frame.go:253 +0x6e
github.com/hertz-contrib/http2.(*Framer).ReadFrame(0xc000edce40)
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/frame.go:503 +0x95
github.com/hertz-contrib/http2.(*serverConn).readFrames(0xc000ba1500)
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/server.go:526 +0x91
created by github.com/hertz-contrib/http2.(*serverConn).serve
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/server.go:625 +0x4d2
child 1 exited with error: exit status 2

Expected behavior

Please check the code.

Hertz version:

0.6.0

@li-jin-gou
Member

I suspect that the use of conn after the request has ended is causing this problem.

@li-jin-gou
Member

If you have time, please take a look. cc @Duslia @wzekin

@Duslia
Member

Duslia commented Mar 29, 2023

Which version or commit of HTTP2 are you using? If it isn't the latest, please upgrade to the latest commit and try again. Also, please describe how you use it and share code that reproduces the crash.

@Duslia added the WaitingForInfo label Mar 29, 2023
@kolinfluence
Author

kolinfluence commented Mar 30, 2023

@Duslia

    github.com/hertz-contrib/http2 v0.1.5

Regardless of how it's used, it should not crash.
There was no memory overflow error or overloading.

I think @li-jin-gou is right, given the error message. I can't reproduce this on demand, though, because it runs in a production environment and only live traffic triggers the error.

There are two possibilities I can think of:

  1. I use the bytedance gopool package (I think it wraps the connection somehow, but maybe not), and the error points to a crashed goroutine.
  2. I ran 0.5.x for a month without this issue; it only appeared after upgrading to 0.6, so it could come from something newly added.

new error!!!

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xb8265c]

goroutine 53803 [running]:
github.com/cloudwego/hertz/pkg/network/standard.(*Conn).Flush(0xc00241ce20?)
        /root/go/pkg/mod/github.com/cloudwego/hertz@v0.6.0/pkg/network/standard/connection.go:496 +0x1c
github.com/cloudwego/hertz/pkg/network/standard.(*Conn).Write(0xc00242cb80, {0xc001181000, 0x2f, 0x1000})
        /root/go/pkg/mod/github.com/cloudwego/hertz@v0.6.0/pkg/network/standard/connection.go:107 +0x2d
bufio.(*Writer).Flush(0xc0013cddc0)
        /usr/local/go/src/bufio/bufio.go:628 +0x62
github.com/hertz-contrib/http2.(*bufferedWriter).Flush(0xc001499218)
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/http2.go:300 +0x3b
github.com/hertz-contrib/http2.(*serverConn).Flush(0xc00008f400?)
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/server.go:402 +0x1d
github.com/hertz-contrib/http2.flushFrameWriter.writeFrame(...)
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/write.go:88
github.com/hertz-contrib/http2.(*serverConn).writeFrameAsync(0xc000b76a80, {{0x145f908?, 0x1d50c20?}, 0x0?, 0x0?})
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/server.go:554 +0x67
created by github.com/hertz-contrib/http2.(*serverConn).startFrameWrite
        /root/go/pkg/mod/github.com/hertz-contrib/http2@v0.1.5/server.go:949 +0x32a
child 1 exited with error: exit status 2
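
Both traces show the panic on goroutines started inside http2's serverConn (readFrames and writeFrameAsync). In Go, recover only catches a panic raised on the same goroutine, so a recover in handler or pool code would not stop these panics from terminating the process. The following is a minimal standard-library sketch of that rule; runGuarded is a hypothetical helper written for illustration and is not part of Hertz, http2, or gopool.

```go
package main

import "fmt"

// runGuarded (hypothetical helper) runs fn on a new goroutine and turns a
// panic on that goroutine into an error. The deferred recover has to live
// inside the goroutine itself; a recover in the caller would never see it.
func runGuarded(fn func()) error {
	done := make(chan error, 1)
	go func() {
		defer func() {
			if r := recover(); r != nil {
				done <- fmt.Errorf("recovered: %v", r)
				return
			}
			done <- nil
		}()
		fn()
	}()
	return <-done
}

func main() {
	var p *struct{ n int }
	// Writing through a nil pointer raises the same kind of runtime panic
	// as in the traces above, but here it is recovered on its own goroutine.
	err := runGuarded(func() { p.n = 1 })
	fmt.Println(err)
}
```

Because the goroutines in the traces belong to the library, this kind of guard can only be added there, which is why the crash cannot be contained from application code alone.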

@li-jin-gou
Member

cc @Duslia

@Duslia
Member

Duslia commented Mar 31, 2023

It's hard to debug if it cannot be reproduced. First, update http2 to the latest commit and watch it for a while. If the crash happens again, please build a minimal demo that reproduces it, based on your business logic. Usually this kind of crash shows up under high concurrency or after a handler or connection error. In the meantime, I'll try to find the bug from the stack traces.

@Duslia
Member

Duslia commented Mar 31, 2023

Do you use ctx.GetConn().Close() to close the connection?

@wzekin
Contributor

wzekin commented Mar 31, 2023

Hello, this issue was caused by a goroutine calling the write interface after the connection had been closed. Today we reviewed the exposed interfaces and found that the problem may come from your calls to ctx.GetConn().Close(), which close the connection. We have therefore implemented a fallback version that can completely solve this issue. This branch can be used now for an urgent fix, but if you are not in a hurry, you can wait for us to verify it and release a new version.

@kolinfluence
Author

@wzekin OK, I can wait. Please update it.
@Duslia
Yes, I use ctx.GetConn().Close() in a lot of places, almost everywhere.

Please let me know when the update will happen, because this is being used in production now.

@Duslia
Member

Duslia commented Apr 1, 2023

Hello. If you are using it in a production environment, could you please register here?

@kolinfluence
Author

@Duslia I will after this issue is fixed, because right now it can still crash.

@kolinfluence
Author

kolinfluence commented Apr 11, 2023

@Duslia @wzekin
When will this be fixed? It's quite a serious bug.

@li-jin-gou
Member

@Duslia
Member

Duslia commented Apr 12, 2023

@kolinfluence Could you register now? We can set up a Lark group to support.

@welkeyever
Member

@kolinfluence Could you register now? We can set up a Lark group to support.

@kolinfluence Hi~ Please let me know if you have hertz in your production environment🤗


5 participants