Support earlier notebook versions (#13)
* feat: decode v4.* notebooks

The v4.4 decoder is reused for all notebooks with major version 4, because
their differences do not affect how the notebooks are rendered:
- v4.5 requires that each cell has a unique ID
- versions < v4.3 do not have the 'code_cell.metadata.execution' field, which holds the code's execution time.
  They also lack 'raw_cell.metadata.jupyter.source_hidden', which controls whether the source is hidden.
  This has a default behaviour in 'nb' and is probably not that important anyway.
  Finally, they lack the 'metadata.title' field, which is not currently used for v4.4 notebooks either.
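
The reuse described above amounts to selecting a decoder by major version only. A minimal sketch, not the package's actual code: `registry`, `majorVersion` and `lookup` are hypothetical names, and the string value stands in for a real decoder instance.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// registry maps a major nbformat version to a decoder.
// Hypothetical: the string is a stand-in for a real decoder value.
var registry = map[int]string{
	4: "v4.4 decoder",
}

// majorVersion extracts the major component from a version such as "4.2".
func majorVersion(ver string) (int, error) {
	major := strings.SplitN(ver, ".", 2)[0]
	n, err := strconv.Atoi(major)
	if err != nil {
		return 0, fmt.Errorf("invalid version %q: %w", ver, err)
	}
	return n, nil
}

// lookup ignores the minor version, so v4.0 through v4.5 share one decoder.
func lookup(ver string) (string, error) {
	major, err := majorVersion(ver)
	if err != nil {
		return "", err
	}
	d, ok := registry[major]
	if !ok {
		return "", fmt.Errorf("no decoder for version %s", ver)
	}
	return d, nil
}

func main() {
	for _, ver := range []string{"4.0", "4.4", "4.5", "5.0"} {
		d, err := lookup(ver)
		fmt.Println(ver, d, err)
	}
}
```

Keying on the major version keeps the lookup table small. A minor version that did change rendering would need its own entry and a finer-grained key.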

* feat: decode v3.0 notebooks

Prior to v4.0:
- top-level 'worksheets' contained multiple worksheets with the actual 'cells'
- the 'execute_result' output was called 'pyout'
- error output was called pyerr
- code cell 'source' was called 'input'; execution_count was called prompt_number
- the mime-bundle explicitly defined keys for all MIME types it supported and
  had to be decoded differently
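
The renames listed above can be illustrated with a small key-mapping helper. This is only a sketch of the naming differences: the commit decodes v3 notebooks with a dedicated decoder rather than rewriting JSON, and `upgradeCell`, `cellKeys` and `outputTypes` are hypothetical names. The v4 output type names (`execute_result`, `error`) are those of the nbformat v4 schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cellKeys maps v3 code-cell keys to their v4 names.
var cellKeys = map[string]string{
	"input":         "source",
	"prompt_number": "execution_count",
}

// outputTypes maps v3 output types to their v4 names.
var outputTypes = map[string]string{
	"pyout": "execute_result",
	"pyerr": "error",
}

// upgradeCell renames v3 keys and output types in a raw code cell.
// Hypothetical helper, for illustration only.
func upgradeCell(raw []byte) (map[string]interface{}, error) {
	var cell map[string]interface{}
	if err := json.Unmarshal(raw, &cell); err != nil {
		return nil, fmt.Errorf("upgrade cell: %w", err)
	}
	for old, name := range cellKeys {
		if v, ok := cell[old]; ok {
			cell[name] = v
			delete(cell, old)
		}
	}
	// A missing or non-array "outputs" yields a nil slice and the loop is skipped.
	outputs, _ := cell["outputs"].([]interface{})
	for _, o := range outputs {
		out, ok := o.(map[string]interface{})
		if !ok {
			continue
		}
		if t, ok := out["output_type"].(string); ok {
			if name, known := outputTypes[t]; known {
				out["output_type"] = name
			}
		}
	}
	return cell, nil
}

func main() {
	raw := []byte(`{"cell_type":"code","input":"1+1","prompt_number":3,"outputs":[{"output_type":"pyout"}]}`)
	cell, err := upgradeCell(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cell["source"], cell["execution_count"], cell["outputs"])
}
```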

BREAKING: the decode.Decoder interface now includes an ExtractCells method to handle the
deprecation of top-level 'worksheets'
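
To show why ExtractCells has to be version-aware, here is a minimal sketch of two implementations with the same signature as the new interface method. The function names are hypothetical, and flattening all worksheets into one slice is an assumption about the desired behaviour, not something the commit message states.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractCellsV4 reads cells from the top-level "cells" array (v4.x layout).
func extractCellsV4(data []byte) ([]json.RawMessage, error) {
	var nb struct {
		Cells []json.RawMessage `json:"cells"`
	}
	if err := json.Unmarshal(data, &nb); err != nil {
		return nil, fmt.Errorf("extract cells: %w", err)
	}
	return nb.Cells, nil
}

// extractCellsV3 collects cells from every entry in "worksheets" (pre-v4 layout).
// Assumption: worksheets are concatenated in document order.
func extractCellsV3(data []byte) ([]json.RawMessage, error) {
	var nb struct {
		Worksheets []struct {
			Cells []json.RawMessage `json:"cells"`
		} `json:"worksheets"`
	}
	if err := json.Unmarshal(data, &nb); err != nil {
		return nil, fmt.Errorf("extract cells: %w", err)
	}
	var cells []json.RawMessage
	for _, ws := range nb.Worksheets {
		cells = append(cells, ws.Cells...)
	}
	return cells, nil
}

func main() {
	v3 := []byte(`{"worksheets":[{"cells":[{"cell_type":"code"},{"cell_type":"markdown"}]}]}`)
	v4 := []byte(`{"cells":[{"cell_type":"code"}]}`)

	c3, err := extractCellsV3(v3)
	fmt.Println(len(c3), err)
	c4, err := extractCellsV4(v4)
	fmt.Println(len(c4), err)
}
```

Returning raw messages lets the caller keep a single loop that unmarshals each cell, regardless of where the cells were stored.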

* refactor: extract common schema structs

* feat: support v1.0 and v2.0 notebooks

Turns out, v1 and v2 only differ in how Jupyter interprets them,
not in the schema itself. We can use the same decoder we use
for v3.

* chore: update version.go to reflect current release version

* refactor: create 1 decoder instance per package

This does not change any logic, but multiple instances seem unnecessary.
bevzzz authored Feb 26, 2024
1 parent e69229b commit 7525745
Showing 7 changed files with 874 additions and 64 deletions.
18 changes: 16 additions & 2 deletions decode/decode.go
@@ -44,8 +44,13 @@ func (n *notebook) UnmarshalJSON(data []byte) error {
 		return fmt.Errorf("%s: notebook metadata: %w", ver, err)
 	}

-	n.cells = make([]schema.Cell, len(n.Notebook.Cells))
-	for i, raw := range n.Notebook.Cells {
+	cells, err := d.ExtractCells(data)
+	if err != nil {
+		return fmt.Errorf("%s: extract cells: %w", ver, err)
+	}
+
+	n.cells = make([]schema.Cell, len(cells))
+	for i, raw := range cells {
 		c := cell{meta: meta, decoder: d}
 		if err := json.Unmarshal(raw, &c); err != nil {
 			return fmt.Errorf("%s: %w", ver, err)
@@ -78,7 +83,16 @@ func (c *cell) UnmarshalJSON(data []byte) error {
 // Decoder implementations are version-aware and decode cell contents and metadata
 // based on the respective JSON schema definition.
 type Decoder interface {
+	// ExtractCells accesses the array of notebook cells.
+	//
+	// Prior to v4.0 cells were not a part of the top level structure,
+	// and were contained in "worksheets" instead.
+	ExtractCells(data []byte) ([]json.RawMessage, error)
+
+	// DecodeMeta decodes version-specific metadata.
 	DecodeMeta(data []byte) (schema.NotebookMetadata, error)
+
+	// DecodeCell decodes raw cell data to a version-specific implementation.
 	DecodeCell(v map[string]interface{}, data []byte, meta schema.NotebookMetadata) (schema.Cell, error)
 }
