Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More robust constraint/sequence diffing. #42

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from

Conversation

danbornside
Copy link
Contributor

@danbornside danbornside commented Jun 23, 2022

This PR hopefully improves beam-automigration hadnling of most types of constraint, and in how it handles sequences; so that they can be used in more cases and with fewer surprises.

The main idea is to make the internal types used in beam-automigrate more accurately reflect the kind of changes it can actually do, and to use keys in maps that more usefully reflect the salient parts of the schema. One rarely cares for the name of a constraint, but only the way that it constrains the data in the database.

Most of the linecount in this PR is working through the fallout of fixing the way beam-automigrate uses set operations to instead use maps as described above.

This has a secondary (benificial?) affect of making beam-automigrate a bit more robust agains superficial differences in the schema, such as when a column is renamed that was used in some constraint; since beam-automigrate now no longer cares what constraints are named.

still TODO

  • needs rebased on latest develop branch
  • non SERIAL defaults are being reset every time
    • make a real AST for the sql types.
    • "fix" "parser" to expect (and discard) typecasts
    • there are some possibly helpful notes in getAllDefaults that might inform a more robust solution.
    • this all probably has to do with whatever pgSerialTyColumnType was doing.
  • document new featues in README
  • fix code examples in haddocs to work correctly (particurly in D.B.A.Annotated)
  • make tests compile/kinda work
  • sort imports?
  • add constraint key data to TableConstraintRemoved to facilitate unsafe migrations when the constraint name is not known at the haskell level.
  • read 'primary key cannot be null' check to validator? (see comment on deleted validateRemoveColumnConstraint)

@danbornside
Copy link
Contributor Author

This PR should also resolve a couple of open issues, #10 #38 and #40

Comment on lines -574 to +670
( tbl
{ dbAnnotatedConstraints =
let colPairs =
concatMap
( \case
(References ours theirs) ->
zipWith
(,)
(fieldAsColumnNames (ours (tableSettings e)))
[ColumnName (theirs (tableSettings externalEntity) ^. Beam.fieldName)]
)
us
tName = externalEntity ^. dbEntityDescriptor . dbEntityName
conname = T.intercalate "_" (tName : map (columnName . snd) colPairs) <> "_fkey"
in S.insert
(ForeignKey conname (TableName tName) (S.fromList colPairs) onDelete onUpdate)
(dbAnnotatedConstraints tbl)
}
( tbl Lens.& Lens.over (_dbAnnotatedConstraints . _foreignKeyConstraints) (<> M.singleton (mkNullableForeignKeyConstraint e ourColumn externalEntity theirColumn) options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like these bits need to go in a lens blog post somewhere :). Crushing it with the line count :D.

@ali-abrar ali-abrar mentioned this pull request Jun 25, 2022
@madeline-os madeline-os self-assigned this Jul 21, 2022
danbornside and others added 6 commits August 3, 2022 13:19
Table constraints are now keyed according to what they constrain instead
of all of their attributes, including names.

No "type of all constraints, one at a time".  separate types for PRIMARY
KEY, UNIQUE, and FOREIGN KEY constraints; all split into the "match" and
"extra" parts.  All such constraints are now collected into a record of
maps, with the key being the match part of that constraint type and all
other data in the value of the map.

In most places for constraint/sequence, the name is wrapped in a Maybe,
to instruct beam-automigrate to allow postgres to choose the name, or
use the name from the db for existing, matching entity.

Sequences are now (optionally) owned by "SERIAL" columns using the built
in "OWNED BY" syntax in postgres.  A serial column can have an optional
name; beam-migrate will learn the name chosen by postgres at startup
when it already exists.

string parsing of DEFAULT constraints are now handled with a self
contained attoparsec based parser in `D.B.A.Parser`.  It's only parsing
the fragment of sql that beam-automigrate needs to understand to do it's
job; but is a good start for gradually improving its precision as the
need arises (future check-constraint support, as an example)

More liberal use of `"lens"` library to make some code a little easier
to understand.

Fixes dodgy catalog sql query for retrieving foreign key constraints.
how did this ever work?
Before ("NULLDEFAULT"):
        CREATE TABLE cities (capital BOOL NOT NULLDEFAULT
        'false'::boolean, city VARCHAR NOT NULL, location VARCHAR NOT
        NULL);

After:
        CREATE TABLE cities (capital BOOL NOT NULL DEFAULT
        'false'::boolean, city VARCHAR NOT NULL , location VARCHAR NOT
        NULL );
Comment on lines +91 to +92
DefaultExpr t -> DefaultExpr $ normalizeDefaultExpr t
Autoincrement mseq -> Autoincrement mseq
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea whether this is valid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about why autoincrement is special. Aren't they also just regular expressions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the specialness is mostly down to the postgres column ownership concept for sequences. In postgres, sequences can optionally be owned by a column; if the owning column of a sequence is dropped, the sequence is automatically dropped, too.

I don't think that's too critically important, other than the fact that in beam, SqlSerial is somewhat implied to mean serial or bigserial column type, which automatically creates an owned sequence of some indeterminate name.

There are potential reasons why someone might not want this all the time, though; sequences can be used to get values before the corresponding row is created, or the same sequence could be used for several tables.

These two "ways" of thinking about sequences are encoded in BA types as the AutoIncrement column constraint and the separate Sequence schema map.

a sequence might be known wanted by virtue of the SqlSerial but not have a known name (so that it may be added to the Sequences map in Schema), because it hasn't been crated yet; alternatively, a sequence might be known of, but not have an owner (and may even not appear in any default constraint), so we still want to have the map of sequences.

If SqlSerial was not a thing at all, then I think the special treatment of AutoIncrement is probably also unneeded, but as long as BA needs to support SqlSerial, the special autoincrement feature is neccessary to encode the "unknown/uncreated yet" sequences owned by the autoincrement column.

@@ -172,32 +199,75 @@ newtype DbEnum a
= DbEnum a
deriving (Show, Eq, Typeable, Enum, Bounded, Generic)

instance Semigroup Table where
(Table c1 t1) <> (Table c2 t2) = Table (c1 <> c2) (t1 <> t2)
newtype ConstraintName = ConstraintName { unConsraintName :: Text }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
newtype ConstraintName = ConstraintName { unConsraintName :: Text }
newtype ConstraintName = ConstraintName { unConstraintName :: Text }

typo

-- We only ever represent two kinds of sequence, either the sequence is owned by a column which has a default of nextval(sequence); or else the sequence is not owned, and must have a name.
data DefaultConstraint
= DefaultExpr Text
| Autoincrement (Maybe SequenceName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it make sense to represent Autoincrement as different from Default "nextval(whatever)"?

@@ -19,6 +19,9 @@ module Database.Beam.AutoMigrate.Validity
)
where

import Control.Lens (Lens', (^.), lens, at, (<<.=), (.=), ix, preuse, preview, _Nothing, set, _Unwrapped)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too much lens.

-- then Left $ InvalidRemoveColumnConstraint (Qualified tName colName) reason
-- else Right ()
-- SomeForeignKey {} -> Right ()
-- SomeUnique {} -> Right ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code

Column ->
Either ApplyFailed (Maybe Column)
changeColumnNullable newNullable col = pure . Just $
-- TODO: validateRemoveColumnConstraint!!!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

@@ -596,7 +648,7 @@ withExistingSequence ::
SequenceName ->
Edit ->
Schema ->
(Sequence -> Either ApplyFailed (Maybe Sequence)) ->
(Maybe Sequence -> Either ApplyFailed (Maybe (Maybe Sequence))) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's clarify what the Maybe Sequence and the Maybe (Maybe Sequence) mean here

@ryantrinkle
Copy link
Member

@danbornside It looks like this improves a bunch of stuff; thanks! I left a few comments which I'd like to resolve before we merge.

danbornside and others added 5 commits August 11, 2022 10:07
Previous commits introduce a bug where no migration is assumed if both
versions of a table have a primary key at all,  This commit fixes the
case when both versions have a PK but with different columns.

This also exposed a related bug that primary key migrations must be
handled differently to unique key migrations.  a table can have only one
PK, which has further affects on when foreign key constraints must be
added or removed.
Otherwise, SequenceSetOwner will be executed on the new name before it has been renamed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants