Postgres memleak caused by prepared statements growing forever when prepareCache is overflowing #1143
@silles79 Is there some way you can confirm this on the Postgres side of things? I suspect something similar is happening, but I was hoping to see evidence of it there.
@silles79 Ok, so only the connection that created the prepared statements can see what's in that table for them. I modified my application to wake up and print out the contents of that table every 10 seconds and even with these values set to 10:
I'm seeing the count in that table keep growing.
Need to work on a better minimized reproduction but this is what I do:
Even though the previous transaction has been completed and we aren't keeping the prepared statements around, they are all still there in the next transaction. It prints out 2000 prepared statements from the previous transaction.

```scala
// def inQuery(names: List[String]): Query[names.type, String] =
//   sql"select state from pg_stat_activity where state IN (${text.values.list(names)})".query(text)

// def psStatementCount: Query[Void, (Int, String, String, Instant)] =
//   sql"select pg_backend_pid(), name, statement, prepare_time from pg_prepared_statements order by prepare_time"
//     .query(int4 *: text *: text *: offsetDtInstantCodec)

_ <- Stream.eval {
  withTxFromSessionPool(pool) { (s, tx) =>
    1.to(2000).toList.traverse { x =>
      val values = 1.to(x).toList.map(_.toString)
      multiview.MultiviewRepository.inQuery(values)(s, tx).void
    }
  }
}
pgStatementDump = Stream.awakeEvery[IO](10.seconds).evalMap { _ =>
  withTxFromSessionPool(pool) { (s, tx) =>
    multiview.MultiviewRepository.getPreparedStatementCount(s, tx).map { entries =>
      println(s"COUNT IS ${entries.size}")
      entries.take(10).foreach { case (pid, name, statement, time) =>
        println(s"  ${pid} $time $name ${statement.take(100)}")
      }
    }
  }
}
```
With debug on, I see Skunk sending the close message. But then, after that transaction has closed, we still see the statement on the server.
Ok, here's a full reproduction: Run a Postgres container. We're going to first customize the config file to show debug5 so we can see as much of the message protocol as possible:
In a separate window, run the attached test case. In two separate sessions, we issue a prepared statement, and then query the prepared statements table:

```scala
//> using dep org.tpolecat::skunk-core:1.0.0-M8
import cats.effect.IOApp
import cats.effect.ExitCode
import cats.effect.IO
import cats.syntax.all.*
import skunk.*
import skunk.codec.all.*
import skunk.implicits.*
import cats.effect.kernel.Resource
import java.time.OffsetDateTime
import org.typelevel.otel4s.trace.Tracer.Implicits.noop

object SkunkMain extends IOApp {

  def inQuery(names: List[String]): Query[names.type, String] =
    sql"select state from pg_stat_activity where state IN (${text.values.list(names)})".query(text)

  val psPreparedStatements: Query[Void, (Int, String, String, OffsetDateTime)] =
    sql"select pg_backend_pid(), name, statement, prepare_time from pg_prepared_statements order by prepare_time"
      .query(int4 *: text *: text *: timestamptz)

  private def databaseSessionPool: Resource[IO, Resource[IO, Session[IO]]] =
    Session.pooled[IO](
      host = "localhost",
      port = 5432,
      user = "postgres",
      database = "postgres",
      password = "mysecretpassword".some,
      max = 1,
      commandCache = 0,
      queryCache = 0,
      parseCache = 0,
      debug = true
    )

  def run(args: List[String]): IO[ExitCode] = {
    val activeStates = List("active")
    databaseSessionPool
      .use { sessions =>
        sessions.use { session =>
          IO.println("Preparing and using query") >>
            session.execute(inQuery(activeStates))(activeStates)
        } >>
          sessions.use { session =>
            session.execute(psPreparedStatements).flatMap { results =>
              IO.println(s"Prepared Statement Count: ${results.size}") >>
                results.traverse(IO.println)
            }
          }
      }
      .as(ExitCode.Success)
  }
}
```

You'll see that Skunk sends the Close("P") command for portal_2.
On the postgres side, you can see the bind to portal_2 occurring but there is no corresponding close message:
It's not evidence of much, but it's interesting that I see one and not the other. I took a Wireshark capture and confirmed Skunk's side of the protocol. So unless someone else has some ideas, I'm going to reach out to the Postgres mailing list.
Here is a Wireshark capture of the exchange.
Ok, I think I have found the issue. We are closing portals (Close('P')) but not statements (Close('S')). A portal is a bound prepared statement that is ready to execute. We only actually clean up statements when Protocol.cleanup is called, and that is only called when the session itself is closed.
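For reference, the frontend Close message in the Postgres wire protocol distinguishes the two cases by a single tag byte. Below is a minimal encoding sketch of my own (not Skunk's internals), following the documented message layout:

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// A frontend Close message is: Byte1('C'), Int32 length (including itself),
// then Byte1('S') for a prepared statement or Byte1('P') for a portal,
// followed by the NUL-terminated name. Closing only the 'P' variant frees
// the portal but leaves the named statement alive on the server.
def closeMessage(kind: Char, name: String): Array[Byte] = {
  require(kind == 'S' || kind == 'P', "kind must be 'S' (statement) or 'P' (portal)")
  val nameBytes = name.getBytes(StandardCharsets.UTF_8)
  val len = 4 + 1 + nameBytes.length + 1 // length field + kind + name + NUL
  val buf = ByteBuffer.allocate(1 + len)
  buf.put('C'.toByte)    // message type: Close
  buf.putInt(len)        // length, big-endian, includes itself
  buf.put(kind.toByte)   // 'S' or 'P'
  buf.put(nameBytes)
  buf.put(0.toByte)      // NUL terminator
  buf.array()
}
```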
There is a further bug which is exacerbated if you set queryCache/commandCache to low values. If you set the cache to 10 but create 100 prepared statements in your session, even calling cleanup can only close the 10 statements still in the cache; the 90 that were already evicted are leaked on the server.
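One way to picture the fix for that second bug: give the statement cache an eviction callback, so an evicted entry is closed on the server instead of silently forgotten. This is an illustrative sketch, not Skunk's actual cache implementation; `onEvict` stands in for sending Close('S') or DEALLOCATE:

```scala
import scala.collection.mutable

// Bounded LRU-style cache that notifies a callback on eviction, so the
// owner can release the server-side resource before forgetting the entry.
final class EvictingCache[K, V](max: Int, onEvict: (K, V) => Unit) {
  private val entries = mutable.LinkedHashMap.empty[K, V] // insertion-ordered

  def get(k: K): Option[V] = entries.remove(k).map { v =>
    entries.put(k, v) // move to most-recently-used position
    v
  }

  def put(k: K, v: V): Unit = {
    entries.remove(k)
    entries.put(k, v)
    while (entries.size > max) {
      val (oldK, oldV) = entries.head // least-recently-used entry
      entries.remove(oldK)
      onEvict(oldK, oldV) // here a real client would Close('S') the statement
    }
  }

  def size: Int = entries.size
}
```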
I guess part of what I don't understand is how folks are using prepared statements in their code. In addition to tons of our queries being dynamically generated, leading to lots of unique statements, you also have the problem with IN queries: `select * from foo where id IN (?)` and `select * from foo where id IN (?, ?)` are two different prepared statements. I'm not saying my app doesn't have any stable queries, but the dynamic ones far outweigh the stable ones. I see a couple of possible fixes.
As a workaround I replaced the list codecs: `where id in (${SomeCodecs.id.list(ids.size)})` with `where ccbp.id = ANY(${intKeyArrayCodec[Product.Id]})`. `intKeyArrayCodec` is just a wrapper using `Codecs.array`. This reduced our Postgres memory usage many-fold.
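The reason this workaround is so effective: an IN list expands to one placeholder per element, so every distinct list size yields a distinct SQL text and therefore a distinct prepared statement, while `= ANY($1)` has a single array placeholder regardless of list size. A sketch with hypothetical helper functions (not Skunk API):

```scala
// Generate the SQL text a client would prepare for an n-element IN list:
// one positional placeholder per element, so the text varies with n.
def inListSql(n: Int): String =
  s"select * from foo where id in (${(1 to n).map(i => s"$$$i").mkString(", ")})"

// The array form has one placeholder, so the text never varies.
def anyArraySql: String =
  "select * from foo where id = any($1)"

// 1000 distinct batch sizes -> 1000 distinct statement texts with IN, 1 with ANY.
val inTexts  = (1 to 1000).map(inListSql).toSet
val anyTexts = (1 to 1000).map(_ => anyArraySql).toSet
```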
I can confirm: yes, when the connection is closed the server releases the session's prepared statements. My issue is that we never close the connections (unless we deploy a new app), because we have an infinite stream. It takes a few days and the Postgres db runs out of RAM.
I was looking at that too. I was curious whether there were any performance drawbacks to it, but it looks like Postgres rewrites the IN statement to an array one anyway.
Fixes memory leak in prepared statement cache. Fixes #1143
Our Postgres db keeps running out of RAM.
We have an infinite stream of events from Kafka, which we groupBy(number = 1000) and then go to the db to fetch in batches.
We have a few queries which are parameterized with lists, like:

```scala
sql"""select ...
      where id in (${SomeCodecs.id.list(ids.size)});
   """
```
We use a session pool with prepareCacheSize = 1024.
We have about 25 queries
I think what happens is that we have more prepared statements (say about 25 × 1000) than the cache size, and when the local cache runs out of space it just drops the entry from the local cache without deallocating it on the server.
When the same query with the same number of list parameters comes up again, Skunk just prepares it again since it's no longer in the cache, so the number of prepared statements on the Postgres server keeps growing until it either dies or we restart the app.
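That mechanism can be modeled in a few lines, with a toy cache standing in for Skunk's parseCache (all names here are hypothetical): evicting locally without deallocating means every cache miss allocates a new named statement on the server, so server-side statements grow without bound while the local cache stays at its limit.

```scala
import scala.collection.mutable

val cacheSize   = 1024
val localCache  = mutable.LinkedHashMap.empty[String, String] // sql text -> statement name
val serverStmts = mutable.Set.empty[String]                   // named statements held by the server
var nextName    = 0

def execute(sql: String): Unit =
  localCache.get(sql) match {
    case Some(_) => () // cache hit: the existing named statement is reused
    case None =>
      nextName += 1
      val name = s"stmt_$nextName"
      serverStmts += name // Parse: the server allocates a new named statement
      localCache.put(sql, name)
      if (localCache.size > cacheSize)
        localCache.remove(localCache.head._1) // dropped locally, but never deallocated!
  }

// 25 query shapes x 1000 distinct IN-list sizes = 25000 unique SQL texts
for (q <- 1 to 25; n <- 1 to 1000) execute(s"query $q with an IN list of $n ids")
```

The local cache ends up holding 1024 entries, while the modeled server holds all 25000 statements, which matches the unbounded memory growth described above.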
There are a few things I could/will do on our side, but I thought it made sense to report this as a bug.