Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

import FileAttachment #325

Merged
merged 4 commits into from
Dec 6, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
25 changes: 7 additions & 18 deletions docs/javascript/files.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,13 @@

TK Should this be called “working with data”?

You can load files the built-in `FileAttachment` function or the standard [`fetch`](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch) API. We recommend `FileAttachment` because it supports many common data formats, including CSV, TSV, JSON, Apache Arrow, and SQLite. For example, here’s how to load a CSV file:
You can load files the built-in `FileAttachment` function. This is available by default in Markdown, but you can import it like so:
Copy link
Contributor

@cinxmo cinxmo Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can load files the built-in `FileAttachment` function. This is available by default in Markdown, but you can import it like so:
You can load files with the built-in `FileAttachment` function. This is available by default in Markdown, but you can import it like so:

going to continue reviewing tomorrow


```js echo
import {FileAttachment} from "npm:@observablehq/stdlib";
```

`FileAttachment` supports many common data formats, including CSV, TSV, JSON, Apache Arrow, and SQLite. For example, here’s how to load a CSV file:

```js echo
const gistemp = FileAttachment("gistemp.csv").csv({typed: true});
Expand Down Expand Up @@ -34,23 +40,6 @@ const gistemp = FileAttachment("gistemp.csv").csv().then((D) => D.map(coerceRow)

TK An explanation of why coercing types as early as possible is important.

## Fetch

Here’s `fetch` for comparison.

```js run=false
import {autoType, csvParse} from "npm:d3-dsv";

const gistemp = fetch("./gistemp.csv").then(async (response) => {
if (!response.ok) throw new Error(`fetch error: ${response.status}`);
return csvParse(await response.text(), autoType);
});
```

Use `fetch` if you prefer to stick to the web standards, you don’t mind writing a little extra code. 🥴 Also, you’ll need to use `fetch` to load files from imported ES modules; `FileAttachment` only works within Markdown.

**Caution:** If you use `fetch` for a local file, the path *must* start with `./`, `../`, or `/`. This allows us to distinguish between local files and absolute URLs. But that’s a little silly, right? Because unlike `import`, you can’t `fetch` a bare module specifier, so we could be more generous and detect URLs using `/^\w+:/` instead.

## Supported formats

The following type-specific methods are supported:
Expand Down
2 changes: 1 addition & 1 deletion src/client/main.js
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ export function define(cell) {
v.define(outputs.length ? `cell ${id}` : null, inputs, body);
variables.push(v);
for (const o of outputs) variables.push(main.variable(true).define(o, [`cell ${id}`], (exports) => exports[o]));
for (const f of files) registerFile(f.name, {url: f.path, mimeType: f.mimeType});
for (const f of files) registerFile(f.name, f);
for (const d of databases) registerDatabase(d.name, d);
}

Expand Down
5 changes: 3 additions & 2 deletions src/client/stdlib/databaseClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ export function DatabaseClient(name) {
return new DatabaseClientImpl(name, token);
}

const DatabaseClientImpl = class DatabaseClient {
class DatabaseClientImpl {
#token;

constructor(name, token) {
Expand Down Expand Up @@ -71,8 +71,9 @@ const DatabaseClientImpl = class DatabaseClient {
async sql() {
return this.query(...this.queryTag.apply(this, arguments));
}
};
}

Object.defineProperty(DatabaseClientImpl, "name", {value: "DatabaseClient"}); // prevent mangling
DatabaseClient.prototype = DatabaseClientImpl.prototype; // instanceof

function coerceBuffer(d) {
Expand Down
19 changes: 11 additions & 8 deletions src/client/stdlib/fileAttachment.js
Original file line number Diff line number Diff line change
@@ -1,16 +1,18 @@
const files = new Map();

export function registerFile(name, file) {
if (file == null) files.delete(name);
else files.set(name, file);
const url = String(new URL(name, location.href));
if (file == null) files.delete(url);
else files.set(url, file);
}

export function FileAttachment(name) {
export function FileAttachment(name, base = location.href) {
if (new.target !== undefined) throw new TypeError("FileAttachment is not a constructor");
const file = files.get((name = `${name}`));
const url = String(new URL(name, base));
const file = files.get(url);
if (!file) throw new Error(`File not found: ${name}`);
const {url, mimeType} = file;
return new FileAttachmentImpl(url, name, mimeType);
const {path, mimeType} = file;
return new FileAttachmentImpl(path, name.split("/").pop(), mimeType);
}

async function remote_fetch(file) {
Expand Down Expand Up @@ -85,16 +87,17 @@ class AbstractFile {
}
}

const FileAttachmentImpl = class FileAttachment extends AbstractFile {
class FileAttachmentImpl extends AbstractFile {
constructor(url, name, mimeType) {
super(name, mimeType);
Object.defineProperty(this, "_url", {value: url});
}
async url() {
return (await this._url) + "";
}
};
}

Object.defineProperty(FileAttachmentImpl, "name", {value: "FileAttachment"}); // prevent mangling
FileAttachment.prototype = FileAttachmentImpl.prototype; // instanceof

class ZipArchive {
Expand Down
4 changes: 2 additions & 2 deletions src/files.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,9 @@ export function getLocalPath(sourcePath: string, name: string): string | null {

export function fileReference(name: string, sourcePath: string): FileReference {
return {
name,
name: relativeUrl(sourcePath, name),
mimeType: mime.getType(name),
path: relativeUrl(sourcePath, resolvePath("_file", sourcePath, name))
path: relativeUrl(sourcePath, join("_file", name))
};
}

Expand Down
21 changes: 10 additions & 11 deletions src/javascript.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,6 @@ import {findAwaits} from "./javascript/awaits.js";
import {resolveDatabases} from "./javascript/databases.js";
import {findDeclarations} from "./javascript/declarations.js";
import {findFeatures} from "./javascript/features.js";
import {rewriteFetches} from "./javascript/fetches.js";
import {defaultGlobals} from "./javascript/globals.js";
import {findExports, findImportDeclarations, findImports} from "./javascript/imports.js";
import {createImportResolver, rewriteImports} from "./javascript/imports.js";
import {findReferences} from "./javascript/references.js";
Expand All @@ -20,9 +18,11 @@ export interface DatabaseReference {
}

export interface FileReference {
/** The relative path from the page to the original file (e.g., "./test.txt"). */
name: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name is still the relative path from the page to the file?

docs/
├── earthquakes.csv
└── sub
    └── index.md

and in docs/sub/index.md we have FileAttachment("../earthquakes.csv"), name is "../earthquakes.csv" vs "./earthquakes.csv"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can double check but there is a difference between Feature.name, FileAttachment.name in generated code, and FileReference.name. It’s all a bit hard to keep straight.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right! 👍 Sorry, I was confused. I’ve updated the comment.

/** The MIME type, if known; derived from the file extension. */
mimeType: string | null;
/** The relative path from the document to the file in _file */
/** The relative path from the page to the file in _file (e.g., "../_file/sub/test.txt"). */
path: string;
}

Expand Down Expand Up @@ -93,7 +93,6 @@ export function transpileJavaScript(input: string, options: ParseOptions): Pendi
output.insertRight(input.length, "\n))");
}
await rewriteImports(output, node, sourcePath, createImportResolver(root, "_import"));
rewriteFetches(output, node, sourcePath);
const result = `${node.async ? "async " : ""}(${inputs}) => {
${String(output)}${node.declarations?.length ? `\nreturn {${node.declarations.map(({name}) => name)}};` : ""}
}`;
Expand Down Expand Up @@ -143,7 +142,7 @@ export interface JavaScriptNode {
}

function parseJavaScript(input: string, options: ParseOptions): JavaScriptNode {
const {globals = defaultGlobals, inline = false, root, sourcePath} = options;
const {inline = false, root, sourcePath} = options;
// First attempt to parse as an expression; if this fails, parse as a program.
let expression = maybeParseExpression(input, parseOptions);
if (expression?.type === "ClassExpression" && expression.id) expression = null; // treat named class as program
Expand All @@ -152,16 +151,16 @@ function parseJavaScript(input: string, options: ParseOptions): JavaScriptNode {
const body = expression ?? Parser.parse(input, parseOptions);
const exports = findExports(body);
if (exports.length) throw syntaxError("Unexpected token 'export'", exports[0], input); // disallow exports
const references = findReferences(body, globals);
findAssignments(body, references, globals, input);
const declarations = expression ? null : findDeclarations(body as Program, globals, input);
const {imports, fetches} = findImports(body, root, sourcePath);
const features = findFeatures(body, root, sourcePath, references, input);
const references = findReferences(body);
findAssignments(body, references, input);
Copy link
Contributor

@cinxmo cinxmo Dec 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have good test coverage for transpileJavaScript but it would be nice to have specific unit tests for the find<...> functions. I can work that if it's helpful nvm there are comprehensive test cases at the end

const declarations = expression ? null : findDeclarations(body as Program, input);
const {imports, features: importedFeatures} = findImports(body, root, sourcePath);
const features = findFeatures(body, sourcePath, references, input);
return {
body,
declarations,
references,
features: [...features, ...fetches],
features: [...features, ...importedFeatures],
imports,
expression: !!expression,
async: findAwaits(body).length > 0
Expand Down
7 changes: 4 additions & 3 deletions src/javascript/assignments.ts
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
import type {Expression, Node, Pattern, VariableDeclaration} from "acorn";
import type {Expression, Identifier, Node, Pattern, VariableDeclaration} from "acorn";
import {simple} from "acorn-walk";
import {defaultGlobals} from "./globals.js";
import {syntaxError} from "./syntaxError.js";

export function findAssignments(node: Node, references: Node[], globals: Set<string>, input: string): void {
export function findAssignments(node: Node, references: Identifier[], input: string): void {
function checkConst(node: Expression | Pattern | VariableDeclaration) {
switch (node.type) {
case "Identifier":
if (references.includes(node)) throw syntaxError(`Assignment to external variable '${node.name}'`, node, input);
if (globals.has(node.name)) throw syntaxError(`Assignment to global '${node.name}'`, node, input);
if (defaultGlobals.has(node.name)) throw syntaxError(`Assignment to global '${node.name}'`, node, input);
break;
case "ObjectPattern":
node.properties.forEach((node) => checkConst(node.type === "Property" ? node.value : node));
Expand Down
5 changes: 3 additions & 2 deletions src/javascript/declarations.ts
Original file line number Diff line number Diff line change
@@ -1,11 +1,12 @@
import type {Identifier, Pattern, Program} from "acorn";
import {defaultGlobals} from "./globals.js";
import {syntaxError} from "./syntaxError.js";

export function findDeclarations(node: Program, globals: Set<string>, input: string): Identifier[] {
export function findDeclarations(node: Program, input: string): Identifier[] {
const declarations: Identifier[] = [];

function declareLocal(node: Identifier) {
if (globals.has(node.name) || node.name === "arguments") {
if (defaultGlobals.has(node.name) || node.name === "arguments") {
throw syntaxError(`Global '${node.name}' cannot be redefined`, node, input);
}
declarations.push(node);
Expand Down
63 changes: 29 additions & 34 deletions src/javascript/features.ts
Original file line number Diff line number Diff line change
@@ -1,55 +1,50 @@
import type {Identifier, Literal, Node, TemplateLiteral} from "acorn";
import type {CallExpression, Identifier, Literal, Node, TemplateLiteral} from "acorn";
import {simple} from "acorn-walk";
import {getLocalPath} from "../files.js";
import type {Feature} from "../javascript.js";
import {syntaxError} from "./syntaxError.js";

export function findFeatures(
node: Node,
root: string,
sourcePath: string,
references: Identifier[],
input: string
): Feature[] {
export function findFeatures(node: Node, path: string, references: Identifier[], input: string): Feature[] {
const features: Feature[] = [];

simple(node, {
CallExpression(node) {
const {
callee,
arguments: args,
arguments: [arg]
} = node;

const {callee} = node;
// Ignore function calls that are not references to the feature. For
// example, if there’s a local variable called Secret, that will mask the
// built-in Secret and won’t be considered a feature.
if (
callee.type !== "Identifier" ||
(callee.name !== "Secret" && callee.name !== "FileAttachment" && callee.name !== "DatabaseClient") ||
!references.includes(callee)
) {
return;
}

// Forbid dynamic calls.
if (args.length !== 1 || !isStringLiteral(arg)) {
throw syntaxError(`${callee.name} requires a single literal string argument`, node, input);
}

// Forbid file attachments that are not local paths.
const value = getStringLiteralValue(arg);
if (callee.name === "FileAttachment" && !getLocalPath(sourcePath, value)) {
throw syntaxError(`non-local file path: "${value}"`, node, input);
}

features.push({type: callee.name, name: value});
if (callee.type !== "Identifier" || !references.includes(callee)) return;
const {name: type} = callee;
if (type !== "Secret" && type !== "FileAttachment" && type !== "DatabaseClient") return;
features.push(getFeature(type, node, path, input));
}
});

return features;
}

export function getFeature(type: Feature["type"], node: CallExpression, path: string, input: string): Feature {
const {
arguments: args,
arguments: [arg]
} = node;

// Forbid dynamic calls.
if (args.length !== 1 || !isStringLiteral(arg)) {
throw syntaxError(`${type} requires a single literal string argument`, node, input);
}

// Forbid file attachments that are not local paths; normalize the path.
let name: string | null = getStringLiteralValue(arg);
if (type === "FileAttachment") {
const localPath = getLocalPath(path, name);
if (!localPath) throw syntaxError(`non-local file path: ${name}`, node, input);
name = localPath;
}

return {type, name};
}

export function isStringLiteral(node: any): node is Literal | TemplateLiteral {
return (
node &&
Expand Down
74 changes: 0 additions & 74 deletions src/javascript/fetches.ts

This file was deleted.

Loading