-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add sqlite-lines extension #519
Conversation
Thanks, Ralf! I see in the checks at https://github.com/r-dbi/RSQLite/actions/runs/10479285293/job/29024577294?pr=519#step:14:503 :
What do you mean by "could not get fileio to work"? That other repo seems to have quite a few more GitHub stars -- not the best metric, but still a signal. |
Thanks for putting this together @rfhb! One question I have: Would it be possible to add a The reason I ask: I have published 12+ SQLite extensions that can be used in a variety of environments. They all have "bindings" to different programming languages and library. For example, you can But RSQLite is the exception. I do like how you have an allow-list of SQLite extensions with I know this is a big ask, and out-of-scope for this PR. But in case you all are interested, I can make a separate issue and add info there. If so, I'd be open to publishing my extensions on CRAN! |
…ent in upgrade.R (was inadvertently disabled)
Thanks, inadvertently I had left disabled parts in |
Alex, I like the idea of packaging the extra extensions in separate R packages. I remember tweaking the SQLite code to essentially disable dynamic loading. I also think we want to use R to load the shared library provided by the separate R package, and then pass a function pointer into the main RSQLite package for registering the extension. Would you like to draft an implementation? Happy to review! Ralf, there shouldn't be a need to activate GHA on your fork, you can always push to this PR. On the other hand, I don't see a reason why activating GHA on the fork would fail. |
Thank you @asg017 Alex! I would favour a generic solution that enables users to decide on extensions while minimising the efforts for the maintainer. Best for @krlmlr Kirill to express a view on what would be best in line with plans and principles for feature development for
From my perspective this would be great to advance and perhaps broaden the discussion of options, thank you. |
I realise that compilation fails under MS Windows because R uses mingw gcc, which does not support GNU getdelim() functions. No need to act on this PR, and I can close if you want, considering your earlier comment on dynamic loading - which in R worked for me with e.g.
|
I am closing this because |
This is a proposal to add the
lines_read()
function as per source code by @asg017 at https://github.com/asg017/sqlite-lines/ with MIT licence. The documentation is updated, and simple tests are included. Localdevtools::test()
completed without errors.As per @asg017's README, "sqlite-lines is great for line-oriented datasets, like ndjson or JSON Lines, when paired with SQLite's JSON support." Thus, I see interesting and important use cases for this extension, providing large performance improvements over loading data using e.g.
DBI::dbAppendTable()
. Also, other DMSs have similarSQL
functions for reading data from file, includingduckdb
.I acknowledge this PR pulls in source files from outside the SQLite repository, thus enlarging the risk surface of
RSQLite
; please let me know about any policy and source control requirements (I am not a C/C++ developer and cannot offer source code quality review).Many thanks for considering!
PS. I could not get the derived
fileio_scan()
function working as available at https://github.com/nalgeon/sqlean/blob/main/docs/fileio.md