Add extraction code for processing fusion caller data #229

jarbesfeld · 2025-01-16T20:04:52Z

Feature description

The translators are able to standardize fusion caller output to AssayedFusion objects. In #228 I created pydantic classes for the relevant callers (I plan on ultimately dropping support for FusionMap and MapSplice since there is no online documentation).

We should develop a series of extraction methods that convert fusion caller output into instances of these pydantic classes, to enable downstream standardization. I realized that @jsstevenson implemented a similar feature in the MAVE work for processing score set records, so I think a similar approach can be implemented here. For example, if we had a CSV file containing 100 detected fusions from JAFFA, we could create a list of 100 JAFFA objects using the following code:

import csv
from pathlib import Path

# JAFFA is the pydantic class added in #228
path = Path("../jaffa_results.csv")
fusions_list: list[JAFFA] = []
# Map the space-separated column headers to the model's field names
column_rename = {
    "fusion genes": "fusion_genes",
    "spanning reads": "spanning_reads",
    "spanning pairs": "spanning_pairs",
}
with path.open(newline="") as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        renamed = {column_rename.get(key, key): value for key, value in row.items()}
        fusions_list.append(JAFFA(**renamed))

Example output for the first item in the list:

{'type': <Caller.JAFFA: 'JAFFA'>,
 'fusion_genes': 'RP4-777O23.3:AC005154.6',
 'chrom1': 'chr7',
 'base1': 30550636,
 'chrom2': 'chr7',
 'base2': 30574881,
 'rearrangement': True,
 'classification': 'HighConfidence',
 'inframe': 'FALSE',
 'spanning_reads': 7,
 'spanning_pairs': 1602}

We could then iterate through this list, calling from_jaffa on each item to create the standardized AssayedFusion objects.
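Since the same read-rename-construct loop would be needed for every caller, it could be factored into one helper. The following is only a minimal sketch of that idea, not the implementation planned for this issue: `extract_rows` and `DemoCall` are hypothetical names, and `DemoCall` is a plain dataclass standing in for a caller class such as JAFFA so that the example runs without pydantic. The CSV content is made up for the demonstration.

```python
import csv
import tempfile
from collections.abc import Callable
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, TypeVar

T = TypeVar("T")


def extract_rows(
    path: Path,
    model: Callable[..., T],
    column_rename: Optional[dict[str, str]] = None,
) -> list[T]:
    """Build one ``model`` instance per CSV row (hypothetical helper)."""
    rename = column_rename or {}
    with path.open(newline="") as csvfile:
        return [
            # Rename headers that differ from field names, keep the rest as-is
            model(**{rename.get(key, key): value for key, value in row.items()})
            for row in csv.DictReader(csvfile)
        ]


@dataclass
class DemoCall:
    """Stand-in for a caller class such as JAFFA; not the real model."""

    fusion_genes: str
    spanning_reads: str  # stays a string here; a dataclass does no coercion


# Write a tiny made-up CSV so the sketch runs end to end.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.csv"
    demo_path.write_text("fusion genes,spanning reads\nGENE1:GENE2,7\n")
    calls = extract_rows(
        demo_path,
        DemoCall,
        {"fusion genes": "fusion_genes", "spanning reads": "spanning_reads"},
    )
```

Because `model` is just a callable that accepts keyword arguments, the same helper should work with any of the caller classes, with only the rename mapping changing per caller.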

Use case

This will make standardization of the fusion data more efficient and allow each record to be validated before it is standardized.
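To illustrate what such validation checks might look like in practice, here is a rough sketch that builds objects row by row and records which rows fail, rather than stopping at the first bad record. Everything in it is an assumption for illustration: `DemoCall` is a stand-in dataclass with a hand-written check (a pydantic class would perform its own validation), `build_with_report` is a hypothetical helper, and the second and third rows are invented to trigger failures.

```python
from dataclasses import dataclass


@dataclass
class DemoCall:
    """Stand-in for a caller class; not one of the real models."""

    chrom1: str
    base1: int

    def __post_init__(self) -> None:
        # Mimic type coercion plus a simple range check.
        self.base1 = int(self.base1)  # raises ValueError for non-numeric text
        if self.base1 < 1:
            raise ValueError("base1 must be a positive coordinate")


def build_with_report(rows):
    """Return (valid objects, [(row number, error message), ...])."""
    valid, errors = [], []
    for number, row in enumerate(rows, start=1):
        try:
            valid.append(DemoCall(**row))
        except (TypeError, ValueError) as exc:
            # Keep the row number so the bad record can be traced in the file
            errors.append((number, str(exc)))
    return valid, errors


rows = [
    {"chrom1": "chr7", "base1": "30550636"},
    {"chrom1": "chr7", "base1": "not-a-number"},  # invented bad value
    {"chrom1": "chr7", "base1": "0"},  # invented out-of-range value
]
valid, errors = build_with_report(rows)
```

Collecting errors with their row numbers is one possible design; failing fast on the first invalid row would be equally reasonable if partial output is not useful.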

Acceptance Criteria

Extraction methods have been created for each pydantic class, and the attributes in the pydantic classes have been updated as needed.

Proposed solution

No response

Alternatives considered

No response

Implementation details

No response

Potential Impact

No response

Additional context

No response

Contribution

Yes, I can create a PR for this feature.


jarbesfeld added the enhancement (New feature or request) and priority:medium (Medium priority) labels Jan 16, 2025

jarbesfeld self-assigned this Jan 16, 2025

jarbesfeld mentioned this issue Jan 17, 2025

feat!: Add methods for extracting data from fusion callers #230

Merged

jarbesfeld closed this as completed in #230 Jan 27, 2025
