Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expect all file mode transitions to be represented by FileMode sections #93

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jgilchrist
Copy link

@jgilchrist jgilchrist commented Jan 27, 2025

An alternative approach to #92 for fixing various issues in jj.

Based on a discussion from Discord where it was decided that scm-record should expect that any file mode transitions have a Section::FileMode. This PR also implements UI/UX improvements in scm-record such as auto-unselecting the file deletion mode transition if any of the deleted lines are un-selected, but that could be done as a follow-up if the approach works out.

See #92 (comment) for some rationale regarding the removal of SelectedContents::Absent, .get_file_mode() and the introduction of SelectedChanges.

Note also that the internal consumers of .get_selected_changes() are not yet finished - for example, apply_changes is not completely passing the tests yet. If there's agreement on the approach then I can update these too.

scm-record/src/types.rs Outdated Show resolved Hide resolved
@jgilchrist jgilchrist force-pushed the push-psqlpxytkoqk branch 2 times, most recently from c41f6e1 to 70b170e Compare January 27, 2025 17:37
scm-record/src/types.rs Outdated Show resolved Hide resolved
/// The file's contents.
pub contents: SelectedContents<'a>,
}

/// The contents of a file selected as part of the record operation.
#[derive(Clone, Debug, Hash, Eq, PartialEq, PartialOrd, Ord)]
pub enum SelectedContents<'a> {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be reduced to a Option<String> at this point I feel like.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose so although it means moving the binary/text handling stuff to the consumer side. You'd also need a wrapper struct if you still wanted a .push_str() that turns it from None to Some("...")

@30350n
Copy link

30350n commented Jan 27, 2025

Oh also, could you maybe restore all those changed tests for now?

Keeping this as minimal as possible for now would make it much easier to review/understand ^^

@jgilchrist
Copy link
Author

@30350n I've pushed a new version which removes the snapshot changes and updates the naming as you suggested.

@jgilchrist jgilchrist force-pushed the push-psqlpxytkoqk branch 2 times, most recently from 1df58ec to 88a1027 Compare January 27, 2025 19:31
@jgilchrist
Copy link
Author

@arxanas The tests were passing, but the changes were reverted in response to #93 (comment). If you'd like me to push the test changes again I'm happy to do so.

@jgilchrist
Copy link
Author

Commit ff00611 includes the changes discussed in Discord to automatically select/deselect file mode changes and contents changes (e.g. automatically de-selecting a deletion if any of the removed lines are de-selected, since the file needs to exist to contain those lines).

scm-diff-editor/src/lib.rs Outdated Show resolved Hide resolved
@arxanas arxanas self-assigned this Feb 8, 2025
@arxanas
Copy link
Owner

arxanas commented Feb 9, 2025

Hi @jgilchrist and @30350n, thanks for working on this and sorry for the delay in reviewing. I appreciate all the thinking and coding you've done so far 🙏

Actionable items

Here's my latest proposal based on an analysis from the perspective of interleaved deltas (see later):

  • Make File::file_mode into an enum with an Absent case.
    • We can either make FileMode into an enum, or wrap it in something else.
    • This isn't the only possible solution, but I think it's the simplest route forward.
  • Remove the special FileMode::absent() value.
  • In jj: always provide a meaningful FileMode value.
  • Remove Section::FileMode::before, as it's misleading and duplicates information. The "before" value is File::file_mode.
    • Note: Logical renames (via File::old_path) and Binary sections suffer from the same logical issues, although I don't know if there are any specific bugs right now. Ideally, we would clean them up, too.
  • Delete File::get_file_mode and unconditionally return the FileMode as part of the return value of File::get_selected_contents in some way.

In comparison to other proposals:

Analysis

The mental models in scm-record are confusing. I think it's because it's inadvertently used multiple diff-related concepts:

  • Fundamentally: scm-record is an implementation of interleaved deltas/"weaves".
    • That gives us some related mental models to use.
    • I should have tried to frame the problem in terms of source control more directly to begin with, rather than inventing my own data structures and algorithms.
  • Fundamentally: scm-record's UI involves selecting diffs.
    • The caller is actually required to compute the diffs and provide them as input for scm-record to render them. Yet the output must ultimately become snaphots, not diffs (for traditional DVCS, at least). That's a bit of an impedance mismatch.
  • Incidentally: scm-record has to scale to large files and avoid touching files on disk when not necessary.
    • That's why SelectedContents::Unchanged and SelectedContents::Binary exist. I was trying really hard to not literally copy contents to and from memory, but instead reference them in a symbolic way.

I think this is the non-obvious characterization of the scm-record data types:

  • The logical input to is two snapshots (commits) which induce one diff.
    • But the caller provides one snapshot and a diff, and leaves the other snapshot to be inferred.
    • (Note that it's probably not feasible to pass two snapshots and make scm-record calculate the diff, since the caller will know the domain-specific details of how to diff two snapshots.)
  • The logical output is three snapshots which induce two diffs.
    • I think this is what ends up being confusing in practice when we try to actually use the API.

To categorize some existing data types as "snapshot" vs "diff":

  • FileMode: snapshot
  • SelectedContents:
    • snapshot for the first member of the returned tuple
    • diff for the second member — this is likely a design mistake, but I don't think any real callers use the second member, so there may be no observable issues
  • Section: diff (including text, binary, and file mode)
  • Commit: snapshot
    • In particular, it has a message.
    • Not currently used in jj.
    • I believe there is no currently way to render any commit message other than the first, even though there are potentially 2–3 commits involved at present.

Related ideas

Some other things I'm thinking about to help clarify the design (but aren't necessarily relevant for this specific work):

  • Multi-way splits (feature: multi-way splits #73):
    • We probably want to assign "diffs" among any number of Commits (by index?).
    • The contents of snapshot $n$ is the accumulation of all the diffs from among commits $0 \ldots n$.
      • Note: This is not how get_selected_contents currently works!
  • Multiple commit messages:
    • We'd need to support assigning one message per output commit.
  • Manual textual edits (feature: add option to edit the right side of the diff #83):
    • We'd need to support selecting snapshots rather than diffs in some cases. It's not clear how that would interact with diffs assigned to commits that are "after" such a snapshot.
  • Binary edits:
    • We already have the problem that the caller needs to figure out what selecting a binary file section means via an out-of-band mechanism. The same issue kind of applies to file modes.
  • Resolving merge conflicts:
    • No ideas at present.
  • Copying and renaming files:
    • The act of "copying"/"renaming" could probably be separately selected from any content changes, just like file mode changes.

Previous thoughts

Basically, I think I was wrong here: #62 (comment). The ultimate goal is that we should try to implement a snapshot-oriented model of get_selected_contents() and try to do any diffing separately from that.

  • I wrote "file mode transition" (diff), but it's the wrong way of thinking about it. We logically want to return the "file mode" (snapshot) in all cases.
  • The underlying problem is that we don't have a way of knowing if the file mode changed unless we also have the original file mode, which is why I proposed returning the transition.
    • What I proposed technically gives us the information to resolve the issue, under the previous assumption that File::file_mode can't really be trusted.
    • But it's confusing and makes it seem like get_selected_contents should be returning diffs, not snapshots, because it's embedding information about the base snapshot into the return value.

Copy link
Owner

@arxanas arxanas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking as "changes requested" to remove this from my review queue; you can request review again later.

@jgilchrist
Copy link
Author

Thanks for the detailed review @arxanas and no problem at all on the delay! I'll take a stab at your proposed changes over the next few days, I think everything is clear to me but I might pop up on Discord with a question or two as they come up!

@jgilchrist
Copy link
Author

jgilchrist commented Feb 9, 2025

@arxanas I've taken a stab at your suggestions, and made the corresponding changes in the jj pull request. For now I've left the commits separate just so it's easy to see what's changed from the previous approach, but will squash if things look good.

  • Make File::file_mode into an enum with an Absent case.
    We can either make FileMode into an enum, or wrap it in something else.
    This isn't the only possible solution, but I think it's the simplest route forward.
    Remove the special FileMode::absent() value.

This is done. I wasn't sure what to call the non-absent variant. Rwx is very much a placeholder name. Happy to go with suggestions if you have any!

  • In jj: always provide a meaningful FileMode value.

Done as part of the jj PR, but this was already being done. It's mostly scm-record tests that were passing None.

  • Delete File::get_file_mode and unconditionally return the FileMode as part of the return value of File::get_selected_contents in some way.

This was already done as part of the previous changes (aside from 'unconditionally'), so I've kept the approach the same there, i.e. introducing SelectedChanges to return both SelectedContents and the file mode. The name doesn't really make sense now though. Again, happy to take suggestions.

@jgilchrist jgilchrist force-pushed the push-psqlpxytkoqk branch 2 times, most recently from 51b4574 to 21e5b9b Compare February 9, 2025 15:32
@jgilchrist jgilchrist force-pushed the push-psqlpxytkoqk branch 3 times, most recently from 11aab04 to d6f8419 Compare February 9, 2025 15:59
Self(0)
}
/// The default Unix permissions for files.
pub const FILE_DEFAULT: FileMode = FileMode::Rwx(0o100644);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this so that the tests that previously passed None don't each have to dupe 100644.

}
}

impl From<FileMode> for usize {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These didn't seem to be being used, either here, in the tests, or in jj. Now that 0 isn't a sentinel value there isn't an infallible conversion to FileMode anyway, so I've removed it. But if I'm missing something, I can add these back in.

let text = format!("File mode changed from {before} to {after}");

let text = match mode {
// TODO: It would be nice to render this as 'file was created with mode x' but we don't have access
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arxanas Thoughts on this? It's not currently rendered this way so happy to leave this as is, but feels like this might improve the UX a bit.

@jgilchrist jgilchrist requested a review from arxanas February 9, 2025 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants