Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion on Database plans/ORM #10

Open
adenner opened this issue Nov 20, 2022 · 22 comments
Open

Discussion on Database plans/ORM #10

adenner opened this issue Nov 20, 2022 · 22 comments
Labels
question Further information is requested

Comments

@adenner
Copy link

adenner commented Nov 20, 2022

For ease of development, any objections to using Entity Framework and sqlsql lite? By using ef it will make it far easier to switch to a "real" database like mysql, mssql et.al.

@benrick
Copy link
Member

benrick commented Nov 21, 2022

No objections at all. In fact, I was going to recommend we start with SQLite during development regardless of how we store data long-term. In-memory and SQLite have basically no up-front cost or setup, which is perfect for us, since we don't have dev containers, etc. configured yet.

@benrick benrick added the question Further information is requested label Nov 21, 2022
@kirkbrauer
Copy link
Member

I think long-term we should probably look into PostgreSQL and Npgsql + EF Core. I'm currently using this in production, it's a super powerful solution for most general use cases. However, SQLite is a great place to start development, single file DB for the win!

@edmistond
Copy link
Contributor

I'll reiterate my somewhat more ephemeral Mastodon comments here - I think SQLite's great for getting some code down to start, but I'm not sure it would work well if we need to implement background workers outside the main process... which, just based on what I know so far about ActivityPub and Mastodon generally, I am assuming we will.

I agree that Postgres is a great choice for long-term, though. I wouldn't want to tie this to Azure (or AWS) since I suspect that would be a disincentive for a lot of folks, which kind of argues against something like CosmosDB.

@benrick
Copy link
Member

benrick commented Nov 22, 2022

Yes, I think SQLite is a good starting point, but the background workers will require some kind of storage that isn't as tied to the file system.

And I agree that we don't want to add anything that requires any specific provider (Azure, AWS, etc.). Someone should be able to host this on a PC in their house if they want to (no idea why they would).

@adenner
Copy link
Author

adenner commented Nov 22, 2022

Both azure's cosmos and aws's aurora have compatibility and support for postgres in one way or another. For the homelab situation there is always k8s or docker compose.

@kirkbrauer
Copy link
Member

kirkbrauer commented Nov 22, 2022

@adenner there's also other options available like Azure Database for PostgreSQL and Amazon RDS for PostgreSQL. Looking at the docs for Mastodon, it appears that they use Postgres and Redis for their own data store and cache.

Perhaps it would be a good idea to make this project schema-compatible with any original Mastodon Postgres databases?

Ruby on Rails has a similar ORM to EF Core, so it wouldn't be too difficult to replicate their migrations

@benrick
Copy link
Member

benrick commented Nov 22, 2022

Yeah, I think there's a good chance postgres will be our answer. Obviously, it's also the choice mastodon uses.

@benrick
Copy link
Member

benrick commented Nov 22, 2022

Yeah, I was figuring that we'd either need to support a mastodon database directly, or have some kind of conversion to/from in order for us to be compatible.

@benrick
Copy link
Member

benrick commented Nov 22, 2022

Have a look at #18 @adenner and @kirkbrauer . I'm happy to have us start with SQLite or Postgres. Whichever someone wants to put in first, we can use.

@kirkbrauer
Copy link
Member

@benrick you know what, the more I think about it, I think we should just start with Postgres. That would make our lives a lot easier and makes it much easier to run regression tests comparing it to the original DB when ready.

@edmistond
Copy link
Contributor

I'm fine with either. Postgres runs great in a Docker container for dev purposes so it shouldn't be a heavy lift to set it up.

@benrick
Copy link
Member

benrick commented Nov 22, 2022

Yeah @kirkbrauer , I was leaning the same way. I had Postgres installed in PR 18, but removed it before sending the PR. 😆

@benrick
Copy link
Member

benrick commented Nov 22, 2022

The other aspect of this is ORM discussion. I do like Entity Framework, but I'm thinking about how quick a lighter solution can be. Any of you work with Dapper?

@edmistond
Copy link
Contributor

@benrick I've used Dapper on and off since it was a single file you just could pull into your project... some might say I've been doing this stuff for too long now. 😂 Big fan of it though, it made my life a lot easier.

I had kind of the same thought about performance; the one option I could see is taking a more CQRS-y approach where you have separate read/write EF contexts and all the reads go through AsNoTracking()... but that gets to be a lot to set up and maintain, and if I'm honest I feel like taking a CQRS approach also brings in Mediatr and then you end up going down a huge rabbit hole of indirection. Nice library, just feels like overkill sometimes. :)

@benrick
Copy link
Member

benrick commented Nov 22, 2022 via email

@kirkbrauer
Copy link
Member

@edmistond yeah, I totally feel you. My current CQRS implementation is nice, but we have a whole abstraction library on top of MediatR, which is another thing to maintain. The developer experience is amazing, but you need to make sure that the abstractions you build work for you and aren't too fine-grained.

@benrick EF Core these days has pretty good performance, there are a few libraries like Ardalis.Specification are a really nice way to make your queries testable and re-usable. I think the main reason to go with EF Core is because that will draw in the larger .NET community because it's basically the de-facto standard for most projects, even if it isn't the most performant.

@adenner
Copy link
Author

adenner commented Nov 22, 2022

There is something to be said about not prematurely optimizing. The other argument for ef core is that it's longterm support and viability is almost guaranteed

@devployment
Copy link

I would love to see SqlLite as a first class citizen. It's always a hassle to be dependant on a certain db engine for running such things either on my home lab, or on a really small VPS. Usually I don't want to deal with the things that come with a "proper" database.

If SQLite would be a first class citizen, I could imagine running Smilodon as my own / or small community instance, and use https://litestream.io/ for peace of mind to backup my database. No servers, no worries on how to backup my database server and what not.

Never used Litestream, but stumbled upon while searching for alternatives to the traditional database solutions for a side project I'm currently researching for.

@kirkbrauer
Copy link
Member

@devployment yeah, SQLite is pretty nice. Since we're using EF Core, potentially we could shift to SQLite in the future. For the reference implementation, I went ahead and did it in PostgreSQL since it's easy to compare to the original Mastodon schema and has great support.

@benrick
Copy link
Member

benrick commented Nov 27, 2022

Yeah, we decided on Postgres to support existing Mastodon databases. In theory, we could support either/both.

@benrick
Copy link
Member

benrick commented Nov 27, 2022

@devployment here's an article that shows how you can set up postgres and sqlite

https://blog.jetbrains.com/dotnet/2022/08/24/entity-framework-core-and-multiple-database-providers/

@edmistond
Copy link
Contributor

I've been fascinated in following Litestream and have wanted to find a good excuse to use it in a project for a long time. 😄

While I think the reference implementation with Postgres is the first priority, I'm hopeful that with the bulk of the database being tables and only having a small number of views/materialized views, porting shouldn't be overly difficult.

My bigger concern would be handling background jobs and the associated job queue. I believe we can embed background workers into the main web API application (I've never tried implementing this personally) but trying to handle an external queue like Redis/Rabbit for larger installs and some kind of internal one for single-file deployments might be a challenge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

5 participants