Commit

exposing additional_caller_info_text argument in ModularTokenizerOp.__call__() (#379)

* added embedding heads for use in core

* general table-based benchmark ingestion for PPI

* clarify EncoderEmbeddingOutputHead status of work in progress

* clarify EncoderEmbeddingOutputHead status of work in progress

* exposing additional_caller_info_text in ModularTokenizerOp

---------

Co-authored-by: VADIM RATNER [email protected] <[email protected]>
floccinauc and VADIM RATNER [email protected] authored Nov 13, 2024
1 parent 5f95441 commit 8c7d90b
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions fuse/data/tokenizers/modular_tokenizer/op.py
@@ -240,6 +240,7 @@ def __call__(
verbose (Optional[int], optional): verbosity level. 0: no notification, 1: warning notification, 2: warning with partial data, 3: warning
with full data. Defaults to 1.
validate_ends_with_eos (Optional[bool], optional): if not None, overrides self._validate_ends_with_eos
additional_caller_info_text (Optional[str]): information about the caller to add to error messages
key_out_encoding_per_meta: optional key out. If set to a string will put in it the per-meta-instruction encoded parts as a list of Encoding elements
Raises:
@@ -446,6 +447,7 @@ def __call__(
verbose: Optional[int] = 1,
validate_ends_with_eos: Optional[bool] = None,
key_out_scalars: Optional[str] = None,
additional_caller_info_text: Optional[str] = "",
) -> NDict:
"""_summary_
@@ -468,6 +470,7 @@ def __call__(
if provided, will write to:
`sample_dict[f'{key_out_scalars}.values]` - a 1D torch tensor with all the scalars values
`sample_dict[f'{key_out_scalars}.valid_mask]` - a 1D torch boolean tensor representing which elements have scalar values
additional_caller_info_text (Optional[str]): information about the caller to add to error messages
Returns:
NDict: _description_
@@ -492,6 +495,7 @@ def __call__(
on_unknown=on_unknown,
verbose=verbose,
validate_ends_with_eos=validate_ends_with_eos,
additional_caller_info_text=additional_caller_info_text,
key_out_encoding_per_meta=key_in
+ ".per_meta_part_encoding", # using the key_in as base for the name because key_out_* are optional
)
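
The change above threads a caller-supplied string from the op's `__call__` down to the tokenizer so that it can appear in error messages. A minimal sketch of that pass-through pattern follows. It is an illustration only: `InnerTokenizer`, `TokenizerOp`, `max_len`, and the error text are hypothetical stand-ins and are not the fuse-med-ml API; only the parameter name `additional_caller_info_text` and its `""` default come from the diff.

```python
from typing import List, Optional


class InnerTokenizer:
    """Hypothetical stand-in for the wrapped tokenizer (not the fuse API)."""

    def __init__(self, max_len: int) -> None:
        self.max_len = max_len

    def encode(
        self, text: str, additional_caller_info_text: Optional[str] = ""
    ) -> List[str]:
        tokens = text.split()
        if len(tokens) > self.max_len:
            # Append the caller-supplied context so the failing call site
            # can be identified from the error message alone.
            raise ValueError(
                f"sequence of {len(tokens)} tokens exceeds max_len={self.max_len}. "
                f"{additional_caller_info_text or ''}"
            )
        return tokens


class TokenizerOp:
    """Hypothetical op that forwards the caller info unchanged."""

    def __init__(self, tokenizer: InnerTokenizer) -> None:
        self._tokenizer = tokenizer

    def __call__(
        self,
        sample: dict,
        key_in: str,
        key_out: str,
        additional_caller_info_text: Optional[str] = "",
    ) -> dict:
        # The op adds nothing of its own; it only passes the text through.
        sample[key_out] = self._tokenizer.encode(
            sample[key_in],
            additional_caller_info_text=additional_caller_info_text,
        )
        return sample
```

In this sketch, a caller such as a data-loading pipeline would pass a short description of itself (for example the dataset or task name), and any tokenization error raised further down would then carry that description.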