SMaHT ingestion related work. #293

Merged Nov 30, 2023 (50 commits)
Commits
ada2c8f
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 7, 2023
cc03d52
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 7, 2023
7beaa3d
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 8, 2023
3de7611
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 8, 2023
33b64ae
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 9, 2023
701384f
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 9, 2023
d49fd69
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 9, 2023
4e21b85
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 9, 2023
4e55f29
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 9, 2023
23d29e8
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 11, 2023
d80817e
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 12, 2023
7260521
Changes to bundle_utils/etc related to SMaHT ingestion work.
dmichaels-harvard Nov 12, 2023
676b7ce
Minor sheet_utils updates related to SMaHT ingestion.
dmichaels-harvard Nov 13, 2023
352c801
lint fixes
dmichaels-harvard Nov 13, 2023
28b780b
SMaHT ingestion work.
dmichaels-harvard Nov 14, 2023
8d3d432
Support ordering in sheet_utils.
dmichaels-harvard Nov 15, 2023
728c5bb
Minor sheet_utils updates related to SMaHT ingestion.
dmichaels-harvard Nov 15, 2023
bb7ebc9
Minor sheet_utils updates related to SMaHT ingestion.
dmichaels-harvard Nov 15, 2023
beb9cf7
Minor sheet_utils updates related to SMaHT ingestion.
dmichaels-harvard Nov 15, 2023
023bc78
Minor sheet_utils updates related to SMaHT ingestion.
dmichaels-harvard Nov 16, 2023
5f432bf
Minor sheet_utils updates related to SMaHT ingestion; handle missing …
dmichaels-harvard Nov 16, 2023
8e2141d
Minor sheet_utils updates related to SMaHT ingestion; array split rel…
dmichaels-harvard Nov 16, 2023
df58904
flake8 fixes
dmichaels-harvard Nov 17, 2023
74f504b
Minor sheet_utils updates related to SMaHT ingestion; handle missing …
dmichaels-harvard Nov 17, 2023
7efcf66
merge in master
dmichaels-harvard Nov 17, 2023
b5ecf1a
flake8 fixes
dmichaels-harvard Nov 18, 2023
72057e1
Added split_string to dcicutils.
dmichaels-harvard Nov 19, 2023
4b6f527
Added split_string to dcicutils.
dmichaels-harvard Nov 19, 2023
17d1d4d
Added merge_objects to misc_utils.
dmichaels-harvard Nov 19, 2023
5aaaaed
Added remove_empty_properties to misc_utils.
dmichaels-harvard Nov 19, 2023
7fd4037
Added zip_utils.
dmichaels-harvard Nov 19, 2023
9c5712c
Minor fix to temporary_file in zip_utils.
dmichaels-harvard Nov 19, 2023
ff2a607
Added right_trim_tuple to misc_utils.
dmichaels-harvard Nov 19, 2023
bc27b13
Added right_trim_tuple to misc_utils.
dmichaels-harvard Nov 19, 2023
9b0df7a
Added right_trim_tuple to misc_utils.
dmichaels-harvard Nov 20, 2023
4d40798
Added to_float and to_integer to misc_utils.
dmichaels-harvard Nov 20, 2023
471f05b
Added to_boolean to misc_utils
dmichaels-harvard Nov 20, 2023
a7e50d1
Added to_enum to misc_utils
dmichaels-harvard Nov 20, 2023
8c945ba
Updated rst doc file with zip_utils.
dmichaels-harvard Nov 20, 2023
c303955
Added data_readers.
dmichaels-harvard Nov 22, 2023
eb9ccf3
Comments
dmichaels-harvard Nov 22, 2023
7037476
Minor update to merge_objects.
dmichaels-harvard Nov 22, 2023
19612b1
Minor updates to Excel class in data_readers.
dmichaels-harvard Nov 27, 2023
cd0f512
Updated dcicutils.rst
dmichaels-harvard Nov 27, 2023
843b4e0
Minor update to misc_utils.load_json_if
dmichaels-harvard Nov 27, 2023
5db0985
Added test for merge_objects
dmichaels-harvard Nov 27, 2023
59673ca
Comments and tests for misc_utils.merge_objects.
dmichaels-harvard Nov 27, 2023
b27428f
lint fixes
dmichaels-harvard Nov 27, 2023
c0ba3d4
Minor update to misc_utils.right_trim
dmichaels-harvard Nov 30, 2023
35296b7
Update version and CHANGELOG.rst.
dmichaels-harvard Nov 30, 2023
Minor sheet_utils updates related to SMaHT ingestion; handle missing refs.
dmichaels-harvard committed Nov 16, 2023
commit 5f432bf42039d195875a8e87d98ed4c501bbb5e3
18 changes: 13 additions & 5 deletions dcicutils/bundle_utils.py
@@ -150,9 +150,10 @@ class RefHint(TypeHint):
     def __str__(self):
         return f"<RefHint {self.schema_name} context={self.context}>"

-    def __init__(self, schema_name: str, context: TypeHintContext):
+    def __init__(self, schema_name: str, required: bool, context: TypeHintContext):
         self.schema_name = schema_name
         self.context = context
+        self.required = required
         super().__init__()

     def apply_hint(self, value):
@@ -163,6 +164,8 @@ def apply_hint(self, value):
         return value

     def _apply_ref_hint(self, value):
+        if not value and self.required:
+            raise ValidationProblem(f"Missing required {self.schema_name} reference")
         if self.is_array:
             for item in value:
                 if item and not self.context.validate_ref(item_type=self.schema_name, item_ref=item):
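The guard added to `_apply_ref_hint` can be illustrated in isolation. The following is a minimal sketch, not the dcicutils implementation: `check_required_ref` is a hypothetical helper, and `ValidationProblem` is redefined here as a plain exception (an assumption) so the snippet runs on its own.

```python
class ValidationProblem(Exception):
    """Stand-in for the ValidationProblem used in bundle_utils (assumed to be an Exception)."""


def check_required_ref(value, schema_name: str, required: bool):
    """Hypothetical helper mirroring the guard added in this commit."""
    # An empty value (None, "", []) is only an error when the reference is required.
    if not value and required:
        raise ValidationProblem(f"Missing required {schema_name} reference")
    # Otherwise the value is passed through unchanged for further validation.
    return value
```

With this sketch, an empty cell for an optional reference passes through untouched, while an empty cell for a required reference raises immediately, before any per-item reference validation is attempted.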
@@ -364,7 +367,8 @@ def set_path_value(cls, datum: Union[List, Dict], path: ParsedHeader, value: Any
             cls.set_path_value(datum[key], more_path, value)

     @classmethod
-    def find_type_hint_for_subschema(cls, subschema: Any, context: Optional[TypeHintContext] = None):
+    def find_type_hint_for_subschema(cls, subschema: Any, required: bool = False,
+                                     context: Optional[TypeHintContext] = None):
         if subschema is not None:
             t = subschema.get('type')
             if t == 'string':
@@ -374,14 +378,14 @@ def find_type_hint_for_subschema(cls, subschema: Any, context: Optional[TypeHint
                     return EnumHint(mapping)
                 link_to = subschema.get('linkTo')
                 if link_to and context.schema_exists(link_to):
-                    return RefHint(schema_name=link_to, context=context)
+                    return RefHint(schema_name=link_to, required=required, context=context)
                 return StringHint()
             elif t in ('integer', 'number'):
                 return NumHint(declared_type=t)
             elif t == 'boolean':
                 return BoolHint()
             elif t == 'array':
-                array_type_hint = cls.find_type_hint_for_subschema(subschema.get("items"), context)
+                array_type_hint = cls.find_type_hint_for_subschema(subschema.get("items"), required=required, context=context)
                 if type(array_type_hint) == RefHint:
                     array_type_hint.is_array = True
                 return array_type_hint
@@ -399,7 +403,8 @@ def finder(subheader, subschema):
                     if subschema.get('type') == 'object':
                         subsubschema = subschema.get('properties', {}).get(key1)
                         if not other_headers:
-                            hint = cls.find_type_hint_for_subschema(subsubschema, context=context)
+                            required = key1 and subschema and key1 in subschema.get('required', [])
+                            hint = cls.find_type_hint_for_subschema(subsubschema, required=required, context=context)
                             if hint:
                                 return hint
                             else:
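How the `required` flag is derived from the parent schema's `required` list can be sketched as follows. `is_required` and the sample schema are hypothetical and for illustration only; the commit computes the value inline. Unlike the inline expression, which can evaluate to a falsy non-boolean (for example an empty string when `key1` is empty), this sketch always returns a `bool`.

```python
def is_required(schema: dict, key: str) -> bool:
    """Hypothetical helper: True if `key` is listed in the schema's 'required' array."""
    # A JSON-schema object lists its mandatory property names under "required";
    # a missing "required" entry is treated as an empty list.
    return bool(key) and key in schema.get("required", [])


# Illustrative schema only; not taken from the SMaHT schemas.
sample_schema = {
    "type": "object",
    "required": ["submitter"],
    "properties": {
        "submitter": {"type": "string", "linkTo": "User"},
        "lab": {"type": "string", "linkTo": "Lab"},
    },
}
```

Here `is_required(sample_schema, "submitter")` is true and `is_required(sample_schema, "lab")` is false, so only a missing `submitter` reference would be reported.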
@@ -655,6 +660,9 @@ def check_flattened_row(self, row: Dict, *, tab_name: str, row_number: int, prot
         patch_item = copy.deepcopy(prototype)
         for column_number, column_value in enumerate(row.values()):
             parsed_value = ItemTools.parse_item_value(column_value, apply_heuristics=self.apply_heuristics)
+            if len(row) > column_number and (list(row.keys())[column_number] or "").endswith("#"):
+                if isinstance(parsed_value, str):
+                    parsed_value = [value.strip() for value in parsed_value.split(ARRAY_VALUE_DELIMITER) if value]
             type_hint = type_hints[column_number]
             if type_hint:
                 try:
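The handling of column headers ending in `#` added in `check_flattened_row` can be tried out standalone. This is a sketch under assumptions: `split_array_cell` is a hypothetical helper, and the value `"|"` for `ARRAY_VALUE_DELIMITER` is assumed here because its definition is not part of this diff.

```python
ARRAY_VALUE_DELIMITER = "|"  # assumption: the real constant is defined elsewhere in dcicutils


def split_array_cell(header, parsed_value):
    """Hypothetical helper mirroring the array split added in this commit."""
    # Only columns whose header ends in "#" are treated as array-valued,
    # and only string cells are split; other parsed types pass through.
    if (header or "").endswith("#") and isinstance(parsed_value, str):
        # Empty segments are dropped first, then each remaining segment is stripped.
        return [value.strip() for value in parsed_value.split(ARRAY_VALUE_DELIMITER) if value]
    return parsed_value
```

Because empty segments are filtered before stripping, a whitespace-only segment survives as an empty string in the resulting list.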