Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Resource Feature #44

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

Update Resource Feature #44

wants to merge 25 commits into from

Conversation

gkostin1966
Copy link
Contributor

Update resource feature with two scenarios: update everything (data and metadata) and update metadata only.

@gkostin1966 gkostin1966 self-assigned this Feb 28, 2025
@ssciolla
Copy link
Contributor

ssciolla commented Mar 3, 2025

The message bus in dor/cli/repo needs to be updated I think. This might be an opportunity to move the bootstrap function there into a separate file in service_layer and reference it in the BDD tests? Or it could be done later in a separate PR.

Copy link
Contributor

@ssciolla ssciolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool how this all came together. I'm leaving some comments and will follow up separately. Looking forward to discussing.

.gitignore Outdated
features/scratch/
tests/test_storage/
tests/test_descriptor_generator/
tests/test_workspaces/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
tests/test_workspaces/

test_descriptor_generator took the places of test_workspaces.

Copy link
Contributor

@ssciolla ssciolla Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out I was wrong! tests/test_workspaces is used by test_translocator.py. ☹️ This doesn't have to be fixed here, but we should add it back to .gitignore eventually.


from dor.providers.models import PackageResource, StructMapType

### IGNORE THIS FILE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this comment be removed?

incoming_resource.data_files,
)

if incoming_resource.struct_maps and incoming_resource.struct_maps[0].type == StructMapType.PHYSICAL:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the physical struct map always be first?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, yes.

Comment on lines +3 to +6
class MinterProvider(ABC):
@abstractmethod
def mint(self) -> str:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I'd probably have just called it Minter. It's not providing a minter, it's providing an ID, right?

def _(
path_data: PathData, unit_of_work: AbstractUnitOfWork, message_bus: MemoryMessageBus
):
"""a package containing all the scanned pages, OCR, and metadata."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this comment be the same as what's in the given?

Comment on lines +43 to +45
@abstractmethod
def log(self, id: str, reversed: bool = True) -> list[VersionInfo]:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the underlying command is log, but I'd like to see this method be named something more specific, like get_versions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like reversed is actually used anywhere?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike the re-using versions, because it's not versions. It's a log of versions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see your point, that's fine. I just wish the method said something about how it has something to do with version data. Maybe that's implied by logging an object, but feels vague at this point.

class VersionInfo:
version: int
author: str
date: datetime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd opt for datetime or timestamp. I find it confusing when only date is used for both date and time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks as if you're using the data straight from ROCFL, so maybe this would be some work to change...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We can't (readily) translate from the rocfl log back into discrete agent/contact, and we only care about the version number at the moment.

Comment on lines +165 to +191
def log(self, id: str, reversed: bool = True) -> list[VersionInfo]:
version_log = []
args: list[str | Path] = ["rocfl", "-r", self.storage_path, "log", "-r", id]
info = {}
try:
result = subprocess.run(args, check=True, capture_output=True)
for line in result.stdout.decode("utf-8").split("\n"):
if not line.strip():
if info:
version_log.append(VersionInfo(**info))
continue

if line.startswith("Version "):
info = {}
info['version'] = int(line.split(" ")[-1])
else:
key, value = [ v.strip() for v in line.strip().split(":", 1) ]
info[key.lower()] = value

return version_log
except CalledProcessError as e:
staged_version_not_found_message = (
f"Not found: {id} does not have a staged version."
)
if staged_version_not_found_message in e.stderr.decode():
return False
raise RepositoryGatewayError() from e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see tests for this and (less importantly) the FakeRepositoryGateway.log. Happy to chip in with a PR if you want me to.


def log(self, id: str, reversed: bool = True) -> list[VersionInfo]:
return reversed([
VersionInfo(version=v.number, author=v.contributor, message=v.message)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work if you don't provide date?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. We'll have to fake something in the fake logging.

tests/test_workspaces/

Co-authored-by: Sam Sciolla <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants