Accepted, but we can revisit more complex solutions if we learn something that indicates that effort is worth it.
In August 2024, Andy Croll opened an issue1 describing sqlite file corruption related to solid queue. After investigation, we were able to reproduce corruption under certain circumstances when forking a process with open sqlite databases.2
SQLite is known to not be fork-safe3, so this was not entirely surprising though it was the first time your author had personally seen corruption in the wild. The corruption became much more likely after the sqlite3-ruby gem improved its memory management with respect to open statements4 in v2.0.0.
Advice from upstream contributors5 is, essentially: don't fork if you have open database connections. Or, if you have forked, don't call sqlite3_close
on those connections and thereby leak some amount of memory in the child process. Neither of these options are ideal, see below.
- Open writable database connections carried across a
fork()
will automatically be closed in the child process to mitigate the risk of corrupting the database file. - These connections will be incompletely closed ("discarded") which will result in a one-time memory leak in the child process.
First, the gem will register an "after fork" handler via Process._fork
that will close any open writable database connections in the child process. This is a best-effort attempt to avoid corruption, but it is not guaranteed to prevent corruption in all cases. Any connections closed by this handler will also emit a warning to let users know what's happening.
Second, the sqlite3-ruby gem will store the ID of the process that opened each database connection. If, when a writable database is closed (either explicitly with Database#close
or implicitly via GC or after-fork callback) the current process ID is different from the original process, then we "discard" the connection.
"Discard" here means:
sqlite3_close_v2
is not called on the database, because it is unsafe to do so per sqlite instructions3.- Open file descriptors associated with the database are closed.
- Any memory that can be freed safely is recovered.
- But some memory will be lost permanently (a one-time "memory leak").
- The
Database
object acts "closed", including returningtrue
from#closed?
. - Related
Statement
objects are rendered unusable and will raise an exception if used.
Note that readonly databases are being treated as "fork safe" and are not affected by these changes.
The positive consequence is that we remove a potential cause of database corruption for applications that fork with active sqlite database connections.
The negative consequence is that, for each discarded connection, some memory will be permanently lost (leaked) in the child process. We consider this to be an acceptable tradeoff given the risk of data loss.
This is the advice5 given by the upstream maintainers of sqlite, and so was the first thing we tried to implement in Rails in rails/rails#529316. That first simple implementation was not thread safe, however, and in order to make it thread-safe it would be necessary to pause all sqlite database activity, close the open connections, and then fork. At least one Rails core team member was not happy that this would interfere with database connections in the parent, and the complexity of a thread-safe solution seemed high, so this work was paused.
Sqlite offers a configuration option to specify custom memory functions for malloc et al. It seems possible that the sqlite3-ruby gem could implement a custom arena that would be used by sqlite so that in a new process, after forking, all the memory underlying the sqlite Ruby objects could be discarded in a single operation.
I think this approach is promising, but complex and risky. Sqlite is a complex library and uses shared memory in addition to the traditional heap. Would throwing away the heap memory (the arena) result in a segfault or other undefined behaviors or corruption? Determining the answer to that question feels expensive in and of itself, and any solution along these lines would not be supported by the sqlite authors. We can explore this space if the memory leak from discarded connections turns out to be a large source of pain.
Footnotes
-
SQLite queue database corruption · Issue #324 · rails/solid_queue ↩
-
flavorjones/2024-09-13-sqlite-corruption: Temporary repo, reproduction of sqlite database corruption. ↩
-
How To Corrupt An SQLite Database File: §2.6 Carrying an open database connection across a fork() ↩ ↩2
-
Always call sqlite3_finalize in deallocate func by haileys · Pull Request #392 · sparklemotion/sqlite3-ruby ↩
-
SQLite Forum: Correct way of carrying connections over forked processes ↩ ↩2
-
SQLite3Adapter: Ensure fork-safety by flavorjones · Pull Request #52931 · rails/rails ↩