Support earlier notebook versions (#13)

* feat: decode v4.* notebooks

  The decoder for v4.4 is reused for all notebooks with major version 4, because their differences do not affect how the notebooks are rendered:
  - v4.5 requires that each cell has a unique ID
  - versions < v4.3 do not have the 'code_cell.metadata.execution' field, which holds the code's execution time. They also do not have 'raw_cell.metadata.jupyter.source_hidden', which controls whether the source is hidden. This has a default behaviour in 'nb' and is probably not that important anyway. Finally, they lack the 'metadata.title' field, which is currently unused even in v4.4 notebooks.

* feat: decode v3.0 notebooks

  Prior to v4.0:
  - top-level 'worksheets' contained multiple worksheets with the actual 'cells'
  - execution_results was called pyout
  - error output was called pyerr
  - code cell 'source' was called 'input'; execution_count was called prompt_number
  - mime-bundle explicitly defined keys for all mime-types it supported and had to be decoded differently

  BREAKING: the decode.Decoder interface now includes an ExtractCells method to handle the deprecation of top-level 'worksheets'

* refactor: extract common schema structs

* feat: support v1.0 and v2.0 notebooks

  Turns out, v1 and v2 only differ in how Jupyter interprets them, not in the schema itself. We can use the same decoder we use for v3.

* chore: update version.go to reflect current release version

* refactor: create 1 decoder instance per package

  This has no effect on the logic, but multiple instances seem unnecessary.
bevzzz · Feb 26, 2024 · 7525745
1 parent e69229b
Showing 7 changed files with 874 additions and 64 deletions.
diff --git a/decode/decode.go b/decode/decode.go
@@ -44,8 +44,13 @@ func (n *notebook) UnmarshalJSON(data []byte) error {
 		return fmt.Errorf("%s: notebook metadata: %w", ver, err)
 	}
 
-	n.cells = make([]schema.Cell, len(n.Notebook.Cells))
-	for i, raw := range n.Notebook.Cells {
+	cells, err := d.ExtractCells(data)
+	if err != nil {
+		return fmt.Errorf("%s: extract cells: %w", ver, err)
+	}
+
+	n.cells = make([]schema.Cell, len(cells))
+	for i, raw := range cells {
 		c := cell{meta: meta, decoder: d}
 		if err := json.Unmarshal(raw, &c); err != nil {
 			return fmt.Errorf("%s: %w", ver, err)
@@ -78,7 +83,16 @@ func (c *cell) UnmarshalJSON(data []byte) error {
 // Decoder implementations are version-aware and decode cell contents and metadata
 // based on the respective JSON schema definition.
 type Decoder interface {
+	// ExtractCells accesses the array of notebook cells.
+	//
+	// Prior to v4.0 cells were not a part of the top level structure,
+	// and were contained in "worksheets" instead.
+	ExtractCells(data []byte) ([]json.RawMessage, error)
+
+	// DecodeMeta decodes version-specific metadata.
 	DecodeMeta(data []byte) (schema.NotebookMetadata, error)
+
+	// DecodeCell decodes raw cell data to a version-specific implementation.
 	DecodeCell(v map[string]interface{}, data []byte, meta schema.NotebookMetadata) (schema.Cell, error)
 }
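
To illustrate why ExtractCells had to become part of the Decoder interface, here is a minimal sketch of the two layouts described in the commit message. The function names extractCellsV4 and extractCellsV3 are hypothetical and are not taken from this repository; flattening all worksheets into one slice, in order, is also an assumption about how multiple worksheets could be handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// extractCellsV4 reads the top-level "cells" array (v4.x layout).
// Hypothetical helper, not the repository's implementation.
func extractCellsV4(data []byte) ([]json.RawMessage, error) {
	var nb struct {
		Cells []json.RawMessage `json:"cells"`
	}
	if err := json.Unmarshal(data, &nb); err != nil {
		return nil, err
	}
	return nb.Cells, nil
}

// extractCellsV3 collects the cells of every worksheet (pre-v4.0 layout).
// Assumption: worksheets are flattened into a single slice, in order.
func extractCellsV3(data []byte) ([]json.RawMessage, error) {
	var nb struct {
		Worksheets []struct {
			Cells []json.RawMessage `json:"cells"`
		} `json:"worksheets"`
	}
	if err := json.Unmarshal(data, &nb); err != nil {
		return nil, err
	}
	var cells []json.RawMessage
	for _, ws := range nb.Worksheets {
		cells = append(cells, ws.Cells...)
	}
	return cells, nil
}

func main() {
	// A made-up v3-style document with two worksheets of one cell each.
	v3 := []byte(`{"worksheets":[{"cells":[{"cell_type":"code"}]},{"cells":[{"cell_type":"markdown"}]}]}`)
	cells, err := extractCellsV3(v3)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(cells)) // prints 2
}
```

Returning []json.RawMessage in both cases keeps the per-cell decoding loop in notebook.UnmarshalJSON identical across versions; only the lookup of the cell array differs.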