Specification for streaming massive heterogeneous 3D geospatial datasets.
3D Tiles has entered the Open Geospatial Consortium (OGC) Community Standard process.
Created by the Cesium team and built on glTF.
Editor: Patrick Cozzi, @pjcozzi, [email protected].
Cesium Composer converters | Cesium |
---|---|
CyberCity3D | virtualcitySYSTEMS |
Cityzenith | Fraunhofer |
Vricon | Federal Office of Topography swisstopo |
Bentley ContextCapture | Bentley MicroStation (in progress) |
aero3Dpro | Entwine |
GeoRocket 3DPS | OSGJS (in progress) |
CSIRO Data61 | GameSim Conform |
SiteSee (using three.js) | Safe FME |
Peaxy | Prototype Point Cloud Converter |
VirtualGIS | LOPoCS and py3dtiles |
iTowns 2 | osm-cesium-3d-tiles |
geopipe | |
3D Digital Territory Lab | Çeşme 3D City Model |
- NYC by AGI
- 3D Swiss Federal Geoportal with 3 million buildings by Swisstopo and AGI
- Bentley ContextCapture
- aero3Dpro
- virtualcityMAP by virtualcitySYSTEMS
- 10.1 million buildings in North Rhine-Westphalia (34.098 km²)
- Textured buildings
- Textured buildings + point clouds
- building solar potential
- Berlin Atlas of Economy (switch to 3D and zoom in)
- Downtown Miami by CyberCity3D and AGI
- Entwine demos, including ~4.7 billion points in NYC
- AEROmetrex
- VirtualGIS: 2200 Miles of Pipeline
- UrbISOnline: 230,000 buildings Brussels (article)
Also see the 3D Tiles Showcases video on YouTube.
- Resources
- Spec status
- Introduction
- Tile metadata
- tileset.json
- Tile formats
- Declarative styling
- Roadmap Q&A
- Acknowledgments
- Data credits
- Introducing 3D Tiles - the motivation for and principles of 3D Tiles. Read this first if you are new to 3D Tiles.
- The Next Generation of 3D Tiles - future plans for 3D Tiles.
- Cesium implementation
- Download Cesium 1.35 or later and check out the Sandcastle examples labeled '3D Tiles'.
- Roadmap.
- Sample data
- 3d-tiles-samples - sample tilesets for learning how to use 3D Tiles
- Simple 3D tilesets used in the Cesium unit tests.
- Tools
- 3d-tiles-tools - upcoming tools for debugging, analyzing, and validating 3D Tiles tilesets.
- Selected Talks
- 3D Tiles in Action (pdf) at FOSS4G 2017.
- Point Clouds with 3D Tiles (pdf) at the OGC Technical Committee Meeting (June 2017).
- The Open Cesium 3D Tiles Specification: Bringing Massive Geospatial 3D Scenes to the Web (pptx, example tilesets) at Web3D 2016. 90-minute technical tutorial.
- 3D Tiles: Beyond 2D Tiling (pdf, video) at FOSS4G NA 2016.
- 3D Tiles motivation and ecosystem update (pdf) at the OGC Technical Committee Meeting (March 2016).
- 3D Tiles intro (pdf) at the Cesium BOF at SIGGRAPH 2015.
- Selected Articles
- OneSky Using Cesium / 3D Tiles For Volumetric Airspace Visualization. April 2018.
- Draco Compressed Meshes with glTF and 3D Tiles. April 2018.
- OGC Testbed-13: 3D Tiles and I3S Interoperability and Performance ER. March 2018.
- Historic Pharsalia Cabin Point Cloud Using Cesium & 3D Tiles. February 2018.
- Cesium's Participation in OGC Testbed 13. February 2018.
- Adaptive Subdivision of 3D Tiles. August 2017.
- Aerometrex and 3D Tiles. July 2017.
- Duke Using 3D Tiles for Excavation in Vulci. May 2017.
- GERST Engineers, Agisoft PhotoScan, and 3D Tiles. May 2017.
- Skipping Levels of Detail. May 2017.
- Infrastructure Visualisation using 3D Tiles. April 2017.
- SiteSee Photogrammetry and 3D Tiles. April 2017.
- Optimizing Spatial Subdivisions in Practice. April 2017.
- Optimizing Subdivisions in Spatial Data Structures. March 2017.
- What's new in 3D Tiles?. March 2017.
- Streaming 3D Capture Data using 3D Tiles. March 2017.
- Visualizing Massive Models using 3D Tiles. February 2017.
- Bringing City Models to Life with 3D Tiles. September 2016.
- Using Quantization with 3D Models. August 2016.
- News
- 3D Tiles thread on the Cesium forum - get the latest 3D Tiles news and ask questions here.
The 3D Tiles spec is pre-1.0 (indicated by "version": "0.0" in tileset.json). We expect a draft 1.0 version and the Cesium implementation to stabilize in 2017; see the remaining items.
Draft 1.0 Plans
Topic | Status |
---|---|
tileset.json: The tileset's spatial hierarchy | ✅ Solid base, will add features as needed |
Batched 3D Model (*.b3dm): Textured terrain and surfaces, 3D building exteriors and interiors, massive models, ... | ✅ Solid base, only minor, if any, changes expected |
Instanced 3D Model (*.i3dm): Trees, windmills, bolts, ... | ✅ Solid base, only minor, if any, changes expected |
Point Cloud (*.pnts): Massive amount of points | ✅ Solid base, only minor, if any, changes expected |
Vector Data (*.vctr): Polygons, polylines, and placemarks | ⚪ In progress, #124 |
Composite (*.cmpt): Combine heterogeneous tile formats | ✅ Solid base, only minor, if any, changes expected |
Declarative Styling: Style features using per-feature metadata | ✅ Solid base, will add features/functions as needed, #2 |
Post Draft 1.0 Plans
Topic | Status |
---|---|
Terrain v2 | ⚪ Not started; quantized-mesh is a good starting point. In the meantime, folks are using Batched 3D Model |
OpenStreetMap | ⚪ Not started; currently, folks are using Batched 3D Model |
Stars | ⚪ Not started |
For spec work in progress, watch this repo and browse the issues.
In 3D Tiles, a tileset is a set of tiles organized in a spatial data structure, the tree. Each tile has a bounding volume completely enclosing its contents. The tree has spatial coherence; the content for child tiles is completely inside the parent's bounding volume. To allow flexibility, the tree can be any spatial data structure with spatial coherence, including k-d trees, quadtrees, octrees, and grids.
To support tight fitting volumes for a variety of datasets from regularly divided terrain to cities not aligned with a line of longitude or latitude to arbitrary point clouds, the bounding volume may be an oriented bounding box, a bounding sphere, or a geographic region defined by minimum and maximum longitudes, latitudes, and heights.
TODO: include a screenshot of each bounding volume type.
A tile references a feature or set of features, such as 3D models representing buildings or trees, points in a point cloud, or polygons, polylines, and points in a vector dataset. These features may be batched together into essentially a single feature to reduce client-side load time and WebGL draw call overhead.
The metadata for each tile, not the actual contents, is defined in JSON. For example:
{
"boundingVolume": {
"region": [
-1.2419052957251926,
0.7395016240301894,
-1.2415404171917719,
0.7396563300150859,
0,
20.4
]
},
"geometricError": 43.88464075650763,
"refine" : "ADD",
"content": {
"boundingVolume": {
"region": [
-1.2418882438584018,
0.7395016240301894,
-1.2415422846940714,
0.7396461198389616,
0,
19.4
]
},
"url": "2/0/0.b3dm"
},
"children": [...]
}
The boundingVolume.region property is an array of six numbers that define the bounding geographic region in WGS84 / EPSG:4326 coordinates with the order [west, south, east, north, minimum height, maximum height]. Longitudes and latitudes are in radians, and heights are in meters above (or below) the WGS84 ellipsoid. Besides region, other bounding volumes, such as box and sphere, may be used.
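Because region values are stored in radians, tools that display or log them in degrees need a conversion. The following is a minimal sketch; regionToDegrees and the field names of the returned object are hypothetical and not part of the spec.

```javascript
// Hypothetical helper (not part of the 3D Tiles spec): converts a
// boundingVolume.region array from radians to degrees for display.
function regionToDegrees(region) {
    var toDegrees = 180.0 / Math.PI;
    return {
        west: region[0] * toDegrees,
        south: region[1] * toDegrees,
        east: region[2] * toDegrees,
        north: region[3] * toDegrees,
        minimumHeight: region[4], // heights are already in meters
        maximumHeight: region[5]
    };
}

// The region from the tile example above
var exampleRegion = regionToDegrees([
    -1.2419052957251926, 0.7395016240301894,
    -1.2415404171917719, 0.7396563300150859,
    0, 20.4
]);
```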
The geometricError property is a nonnegative number that defines the error, in meters, introduced if this tile is rendered and its children are not. At runtime, the geometric error is used to compute Screen-Space Error (SSE), i.e., the error measured in pixels. The SSE determines Hierarchical Level of Detail (HLOD) refinement, i.e., if a tile is sufficiently detailed for the current view or if its children should be considered.
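The spec does not prescribe how SSE is computed from the geometric error. As an illustration only, one common formulation for a perspective camera projects the error at the tile's distance onto the screen. The function names, parameters, and the threshold of 16 pixels below are assumptions, not part of the spec.

```javascript
// Sketch of one common SSE formulation (an assumption, not mandated by the spec).
// geometricError: meters; distance: meters from the camera to the tile;
// screenHeight: viewport height in pixels; fovy: vertical field of view in radians.
function screenSpaceError(geometricError, distance, screenHeight, fovy) {
    // Pixels covered by one meter at the given distance
    var pixelsPerMeter = screenHeight / (2.0 * distance * Math.tan(fovy / 2.0));
    return geometricError * pixelsPerMeter;
}

// A tile's children are considered when its SSE exceeds the allowed maximum.
function shouldRefine(tile, distance, screenHeight, fovy, maximumScreenSpaceError) {
    var sse = screenSpaceError(tile.geometricError, distance, screenHeight, fovy);
    return sse > maximumScreenSpaceError;
}

var exampleTile = { geometricError: 43.88464075650763 };
var refineExample = shouldRefine(exampleTile, 500.0, 1080, Math.PI / 3.0, 16.0);
```

With this formulation, moving the camera farther away lowers the SSE, so distant tiles stop refining sooner.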
An optional viewerRequestVolume property (not shown above) defines a volume, using the same schema as boundingVolume, that the viewer must be inside of before the tile's content will be requested and before the tile will be refined based on geometricError. See the Viewer request volume section.
The refine property is a string that is either "REPLACE" for replacement refinement or "ADD" for additive refinement. It is required for the root tile of a tileset; it is optional for all other tiles. When refine is omitted, it is inherited from the parent tile.
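The inheritance rule can be sketched as a top-down traversal. This is an illustrative sketch only; resolveRefine and the resolvedRefine field it writes onto each tile are hypothetical names.

```javascript
// Hypothetical helper: stores the effective refinement mode of every tile
// in a "resolvedRefine" field (a made-up name used only for illustration).
function resolveRefine(tile, parentRefine) {
    var refine = (tile.refine !== undefined) ? tile.refine : parentRefine;
    if (refine !== 'ADD' && refine !== 'REPLACE') {
        // Reached when the root tile omits refine or a tile has an invalid value
        throw new Error('refine must be "ADD" or "REPLACE" and is required on the root tile');
    }
    tile.resolvedRefine = refine;
    var children = tile.children || []; // leaf tiles may omit children
    for (var i = 0; i < children.length; ++i) {
        resolveRefine(children[i], refine);
    }
}

var exampleRoot = {
    refine: 'ADD',
    children: [{ children: [{ refine: 'REPLACE', children: [{}] }] }]
};
resolveRefine(exampleRoot);
```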
The content property is an object that contains metadata about the tile's content and a link to the content. content.url is a string that points to the tile's contents with an absolute or relative url. In the example above, the url, 2/0/0.b3dm, has a TMS tiling scheme, {z}/{y}/{x}.extension, but this is not required; see the roadmap Q&A.
The url can be another tileset.json file to create a tileset of tilesets. See External tilesets.
A file extension is not required for content.url. A content's tile format can be identified by the magic field in its header, or otherwise as an external tileset if the content is JSON.
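Identifying content without relying on the file extension could look like the sketch below. It assumes that the magic field is the first four bytes of the payload and that each format's magic string matches its file extension from the status table; identifyContent and its return values are hypothetical.

```javascript
// Assumption: each magic string matches the corresponding file extension.
var KNOWN_MAGICS = ['b3dm', 'i3dm', 'pnts', 'vctr', 'cmpt'];

// Hypothetical helper: returns the tile format, 'external tileset' for JSON
// content, or 'unknown' when neither applies.
function identifyContent(arrayBuffer) {
    var headerLength = Math.min(4, arrayBuffer.byteLength);
    var header = new Uint8Array(arrayBuffer, 0, headerLength);
    var magic = String.fromCharCode.apply(null, header);
    if (KNOWN_MAGICS.indexOf(magic) !== -1) {
        return magic;
    }
    // Not a known binary tile: treat it as an external tileset if it parses as JSON
    try {
        JSON.parse(new TextDecoder('utf-8').decode(new Uint8Array(arrayBuffer)));
        return 'external tileset';
    } catch (e) {
        return 'unknown';
    }
}
```

A real loader would go on to validate the rest of the header; this sketch only shows the dispatch step.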
content.boundingVolume defines an optional bounding volume similar to the top-level boundingVolume property. But unlike the top-level boundingVolume property, content.boundingVolume is a tightly fit bounding volume enclosing just the tile's contents. This is used for replacement refinement; boundingVolume provides spatial coherence and content.boundingVolume enables tight view frustum culling. The screenshot below shows the bounding volumes for the root tile for Canary Wharf. boundingVolume, shown in red, encloses the entire area of the tileset; content.boundingVolume, shown in blue, encloses just the four features (models) in the root tile.
content.boundingVolume is optional. When it is not defined, the tile's bounding volume is still used for culling (see Grids).
An optional transform property (not shown above) defines a 4x4 affine transformation matrix that transforms the tile's content, boundingVolume, and viewerRequestVolume as described in the Tile transform section.
children is an array of objects that define child tiles. See the section below.
3D Tiles use a right-handed Cartesian coordinate system, that is, the cross product of x and y yields z. 3D Tiles define the z axis as up for local Cartesian coordinate systems (see the Tile transform section). A tileset's global coordinate system will often be WGS84 coordinates, but it doesn't have to be, e.g., a power plant may be defined fully in its local coordinate system for use with a modeling tool without a geospatial context.
b3dm and i3dm tiles embed glTF. According to the glTF spec, glTF uses a right-handed coordinate system and defines the y axis as up. By default, embedded models are considered to be y-up, but in order to support a variety of source data, including models defined directly in WGS84 coordinates, embedded glTF models may be defined as x-up, y-up, or z-up with the asset.gltfUpAxis property of tileset.json. In general, an implementation should transform glTF assets to z-up at runtime to be consistent with the z-up coordinate system of the bounding volume hierarchy.
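As an illustration of the z-up conversion, the sketch below rotates a position according to gltfUpAxis. The specific rotations (90 degrees about x for y-up, -90 degrees about y for x-up) are one conventional choice that keeps the system right-handed; they are an assumption here, and toZUp is a hypothetical name.

```javascript
// Hypothetical helper: rotates a position [x, y, z] from the model's up-axis
// convention into a z-up convention.
function toZUp(position, gltfUpAxis) {
    var x = position[0];
    var y = position[1];
    var z = position[2];
    switch (gltfUpAxis) {
        case 'X':
            return [-z, y, x]; // rotate -90 degrees about y: +x becomes +z
        case 'Z':
            return [x, y, z]; // already z-up
        case 'Y':
        default:
            return [x, -z, y]; // rotate +90 degrees about x: +y becomes +z
    }
}

// A point one unit "up" in a y-up model ends up one unit along +z.
var upInZUp = toZUp([0, 1, 0], 'Y');
```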
The units for all linear distances are meters.
All angles are in radians.
3D Tiles do not explicitly store Cartographic coordinates (longitude, latitude, and height); these values are implicit in WGS84 Cartesian coordinates, which are efficient for the GPU to render since they do not require a non-affine coordinate transformation. A 3D Tiles tileset can include application-specific metadata, such as Cartographic coordinates, but the semantics are not part of the 3D Tiles specification.
To support local coordinate systems, e.g., so a building tileset inside a city tileset can be defined in its own coordinate system, and a point cloud tileset inside the building could, again, be defined in its own coordinate system, each tile has an optional transform property.
The transform property is a 4x4 affine transformation matrix, stored in column-major order, that transforms from the tile's local coordinate system to the parent tile's coordinate system, or tileset's coordinate system in the case of the root tile.
The transform property applies to:
- tile.content
  - Each feature's position.
  - Each feature's normal should be transformed by the top-left 3x3 matrix of the inverse-transpose of transform to account for correct vector transforms when scale is used.
  - content.boundingVolume, except when content.boundingVolume.region is defined, which is explicitly in WGS84 / EPSG:4326 coordinates.
- tile.boundingVolume, except when tile.boundingVolume.region is defined, which is explicitly in WGS84 / EPSG:4326 coordinates.
- tile.viewerRequestVolume, except when tile.viewerRequestVolume.region is defined, which is explicitly in WGS84 / EPSG:4326 coordinates.
The transform property does not apply to geometricError, i.e., the scale defined by transform does not scale the geometric error; the geometric error is always defined in meters.
When transform is not defined, it defaults to the identity matrix:
[
1.0, 0.0, 0.0, 0.0,
0.0, 1.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0,
0.0, 0.0, 0.0, 1.0
]
The transformation from each tile's local coordinate system to the tileset's global coordinate system is computed by a top-down traversal of the tileset and post-multiplying a child's transform with its parent's transform like a traditional scene graph or node hierarchy in computer graphics.
The following JavaScript code shows how to compute this using Cesium's Matrix4 and Matrix3 types.
// Assumes Cesium's defined, Matrix3, and Matrix4 are in scope
function computeTransforms(tileset) {
    var t = tileset.root;
    var transformToRoot = defined(t.transform) ? Matrix4.fromArray(t.transform) : Matrix4.IDENTITY;
    computeTransform(t, transformToRoot);
}

function computeTransform(tile, transformToRoot) {
    // Apply 4x4 transformToRoot to this tile's positions and bounding volumes

    // Normals use the inverse-transpose of the top-left 3x3 so they stay correct when scale is used
    var inverseTransform = Matrix4.inverse(transformToRoot, new Matrix4());
    var normalTransform = Matrix4.getRotation(inverseTransform, new Matrix3());
    normalTransform = Matrix3.transpose(normalTransform, normalTransform);
    // Apply 3x3 normalTransform to this tile's normals

    // Leaf tiles may leave children undefined
    var children = defined(tile.children) ? tile.children : [];
    var length = children.length;
    for (var k = 0; k < length; ++k) {
        var child = children[k];
        var childToRoot = defined(child.transform) ? Matrix4.fromArray(child.transform) : Matrix4.clone(Matrix4.IDENTITY);
        childToRoot = Matrix4.multiplyTransformation(transformToRoot, childToRoot, childToRoot);
        computeTransform(child, childToRoot);
    }
}
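For readers without Cesium at hand, the same traversal can be sketched in plain JavaScript on column-major arrays. This is an illustrative sketch, not the reference implementation; multiply, collectTransforms, and IDENTITY are hypothetical names.

```javascript
var IDENTITY = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];

// Multiplies two 4x4 column-major matrices and returns a * b.
// In column-major storage, element (row, col) lives at index col * 4 + row.
function multiply(a, b) {
    var result = new Array(16);
    for (var col = 0; col < 4; ++col) {
        for (var row = 0; row < 4; ++row) {
            var sum = 0.0;
            for (var k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            result[col * 4 + row] = sum;
        }
    }
    return result;
}

// Top-down traversal: each tile's transform is post-multiplied onto its
// parent's computed transform. Results are appended to "out" in visit order.
function collectTransforms(tile, parentToRoot, out) {
    var local = tile.transform || IDENTITY;
    var toRoot = multiply(parentToRoot, local);
    out.push(toRoot);
    var children = tile.children || []; // leaf tiles may omit children
    for (var i = 0; i < children.length; ++i) {
        collectTransforms(children[i], toRoot, out);
    }
    return out;
}
```

Calling collectTransforms(tileset.root, IDENTITY, []) returns one computed matrix per tile, parents before children.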
For an example of the computed transforms (transformToRoot in the code above) for a tileset, consider:
The computed transform for each tile is:
- T0: [T0]
- T1: [T0][T1]
- T2: [T0][T2]
- T3: [T0][T1][T3]
- T4: [T0][T1][T4]
The positions and normals in a tile's content may also have tile-specific transformations applied to them before the tile's transform (before indicates post-multiplying for affine transformations). Some examples are:
- b3dm and i3dm tiles embed glTF, which defines its own node hierarchy, where each node has a transform. These are applied before tile.transform.
- i3dm's Feature Table defines per-instance position, normals, and scales. These are used to create per-instance 4x4 affine transform matrices that are applied to each instance before tile.transform.
- Compressed attributes, such as POSITION_QUANTIZED in the Feature Tables for i3dm, pnts, and vctr, and NORMAL_OCT16P in pnts should be decompressed before any other transforms.
Therefore, the full computed transforms for the above example are:
- T0: [T0]
- T1: [T0][T1]
- T2: [T0][T2][pnts-specific Feature Table properties-derived transform]
- T3: [T0][T1][T3][b3dm-specific transform, including the glTF node hierarchy]
- T4: [T0][T1][T4][i3dm-specific transform, including per-instance Feature Table properties-derived transform and the glTF node hierarchy]
A tile's viewerRequestVolume can be used for combining heterogeneous datasets, and can be combined with external tilesets.
The following example has a building in a b3dm tile and a point cloud inside the building in a pnts tile. The point cloud tile's boundingVolume is a sphere with a radius of 1.25. It also has a larger sphere with a radius of 15 for the viewerRequestVolume. Since the geometricError is zero, the point cloud tile's content is always rendered (and initially requested) when the viewer is inside the large sphere defined by viewerRequestVolume.
"children": [{
"transform": [
4.843178171884396, 1.2424271388626869, 0, 0,
-0.7993325488216595, 3.1159251367235608, 3.8278032889280675, 0,
0.9511533376784163, -3.7077466670407433, 3.2168186118075526, 0,
1215001.7612985559, -4736269.697480114, 4081650.708604793, 1
],
"boundingVolume": {
"box": [
0, 0, 6.701,
3.738, 0, 0,
0, 3.72, 0,
0, 0, 13.402
]
},
"geometricError": 32,
"content": {
"url": "building.b3dm"
}
}, {
"transform": [
0.968635634376879, 0.24848542777253732, 0, 0,
-0.15986650990768783, 0.6231850279035362, 0.7655606573007809, 0,
0.19023066741520941, -0.7415493329385225, 0.6433637229384295, 0,
1215002.0371330238, -4736270.772726648, 4081651.6414821907, 1
],
"viewerRequestVolume": {
"sphere": [0, 0, 0, 15]
},
"boundingVolume": {
"sphere": [0, 0, 0, 1.25]
},
"geometricError": 0,
"content": {
"url": "points.pnts"
}
}]
TODO: screenshot showing the request vs. bounding volume
For more on request volumes, see the sample tileset and demo video.
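The request test for the point cloud tile above can be sketched as a point-in-sphere check. This sketch assumes that tile.transform is already the tile's computed transform to the tileset's coordinate system (true here only if the parent transforms are identity) and that any scale in it is uniform; the function names are hypothetical.

```javascript
// Transforms a point by a 4x4 column-major affine matrix.
function transformPoint(m, p) {
    return [
        m[0] * p[0] + m[4] * p[1] + m[8] * p[2] + m[12],
        m[1] * p[0] + m[5] * p[1] + m[9] * p[2] + m[13],
        m[2] * p[0] + m[6] * p[1] + m[10] * p[2] + m[14]
    ];
}

// Hypothetical helper: true when the viewer is inside the tile's
// viewerRequestVolume sphere ([x, y, z, radius] in the tile's local coordinates).
function viewerInsideRequestSphere(tile, viewerPosition) {
    var m = tile.transform || [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1];
    var sphere = tile.viewerRequestVolume.sphere;
    var center = transformPoint(m, sphere);
    // Assumes uniform scale: the length of the first column scales the radius
    var scale = Math.sqrt(m[0] * m[0] + m[1] * m[1] + m[2] * m[2]);
    var radius = sphere[3] * scale;
    var dx = viewerPosition[0] - center[0];
    var dy = viewerPosition[1] - center[1];
    var dz = viewerPosition[2] - center[2];
    return dx * dx + dy * dy + dz * dz <= radius * radius;
}
```

A runtime would request points.pnts only once this check passes for the current camera position.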
tileset.json defines a tileset. Here is a subset of the tileset.json used for Canary Wharf (also see the complete tileset.json):
{
"asset" : {
"version": "0.0",
"tilesetVersion": "e575c6f1-a45b-420a-b172-6449fa6e0a59",
"gltfUpAxis": "Y"
},
"properties": {
"Height": {
"minimum": 1,
"maximum": 241.6
}
},
"geometricError": 494.50961650991815,
"root": {
"boundingVolume": {
"region": [
-0.0005682966577418737,
0.8987233516605286,
0.00011646582098558159,
0.8990603398325034,
0,
241.6
]
},
"geometricError": 268.37878244706053,
"content": {
"url": "0/0/0.b3dm",
"boundingVolume": {
"region": [
-0.0004001690908972599,
0.8988700116775743,
0.00010096729722787196,
0.8989625664878067,
0,
241.6
]
}
},
"children": [..]
}
}
The top-level object in tileset.json has four properties: asset, properties, geometricError, and root.
asset is an object containing properties with metadata about the entire tileset. Its version property is a string that defines the 3D Tiles version. The version defines the JSON schema for tileset.json and the base set of tile formats. The tilesetVersion property is an optional string that defines an application-specific version of a tileset, e.g., for when an existing tileset is updated. The gltfUpAxis property is an optional string that specifies the up-axis of glTF models contained in the tileset.
properties is an object containing objects for each per-feature property in the tileset. This tileset.json snippet is for 3D buildings, so each tile has building models, and each building model has a Height property (see Batch Table). The name of each object in properties matches the name of a per-feature property, and defines its minimum and maximum numeric values, which are useful, for example, for creating color ramps for styling.
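As a sketch of how minimum and maximum can drive a color ramp, the helper below maps a Height value onto a blue-to-red gradient. The ramp colors and the name heightToColor are assumptions made for illustration.

```javascript
// Hypothetical helper: maps a value onto a blue (minimum) to red (maximum) ramp
// using the range published in tileset.json's properties object.
function heightToColor(height, range) {
    var span = range.maximum - range.minimum;
    var t = (span > 0.0) ? (height - range.minimum) / span : 0.0;
    t = Math.min(Math.max(t, 0.0), 1.0); // clamp values outside the range
    return {
        red: Math.round(255 * t),
        green: 0,
        blue: Math.round(255 * (1.0 - t))
    };
}

// Range taken from the Canary Wharf tileset.json above
var heightRange = { minimum: 1, maximum: 241.6 };
var tallest = heightToColor(241.6, heightRange);
```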
geometricError is a nonnegative number that defines the error, in meters, when the tileset is not rendered.
root is an object that defines the root tile using the JSON described in the above section. root.geometricError is not the same as tileset.json's top-level geometricError. tileset.json's geometricError is the error when the entire tileset is not rendered; root.geometricError is the error when only the root tile is rendered.
root.children is an array of objects that define child tiles. Each child tile has a boundingVolume fully enclosed by its parent tile's boundingVolume and, generally, a geometricError less than its parent tile's geometricError. For leaf tiles, the length of this array is zero, and children may not be defined.
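A conversion tool could check the geometricError guideline with a simple traversal. The sketch below only reports tiles whose error exceeds their parent's, since the text says "generally" rather than "always"; findIncreasingErrors and its path format are hypothetical.

```javascript
// Hypothetical helper: returns the paths of tiles whose geometricError is
// greater than their parent's. These are reported as warnings, not errors.
function findIncreasingErrors(tile, path, warnings) {
    path = path || 'root';
    warnings = warnings || [];
    var children = tile.children || []; // leaf tiles may omit children
    for (var i = 0; i < children.length; ++i) {
        var child = children[i];
        var childPath = path + '.children[' + i + ']';
        if (child.geometricError > tile.geometricError) {
            warnings.push(childPath);
        }
        findIncreasingErrors(child, childPath, warnings);
    }
    return warnings;
}
```

Passing tileset.root returns an empty array when every child refines its parent.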
See schema for the detailed JSON schema for tileset.json.
See the Q&A below for how tileset.json will scale to a massive number of tiles.
To create a tree of trees, a tile's content.url can point to an external tileset (another tileset.json). This enables, for example, storing each city in a tileset and then having a global tileset of tilesets.
When a tile points to an external tileset, the tile
- Cannot have any children; tile.children must be undefined or an empty array.
- Has several properties that match the external tileset's root tile:
  - root.geometricError === tile.geometricError,
  - root.refine === tile.refine, and
  - root.boundingVolume === tile.content.boundingVolume (or root.boundingVolume === tile.boundingVolume when tile.content.boundingVolume is undefined).
  - root.viewerRequestVolume === tile.viewerRequestVolume or root.viewerRequestVolume is undefined.
- Cannot be used to create cycles, for example, by pointing to the same tileset.json containing the tile or by pointing to another tileset.json that then points back to the tileset.json containing the tile.
- Both the tile's transform and root tile's transform are applied. For example, in the following tileset referencing an external tileset, the computed transform for T3 is [T0][T1][T2][T3].
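The no-cycles rule can be checked with a depth-first search over the references between tilesets. In this sketch, the references map (from a tileset.json URL to the external tileset URLs its tiles point to) is a hypothetical structure a tool would build while loading; hasCycle is a hypothetical name.

```javascript
// Hypothetical helper: returns true if following external tileset references
// from "start" ever leads back to a tileset that is still being visited.
function hasCycle(references, start) {
    var visiting = {}; // tilesets on the current path
    var done = {};     // tilesets fully explored without finding a cycle

    function visit(url) {
        if (visiting[url]) {
            return true;
        }
        if (done[url]) {
            return false;
        }
        visiting[url] = true;
        var next = references[url] || [];
        for (var i = 0; i < next.length; ++i) {
            if (visit(next[i])) {
                return true;
            }
        }
        visiting[url] = false;
        done[url] = true;
        return false;
    }

    return visit(start);
}

var exampleReferences = {
    'world/tileset.json': ['city-a/tileset.json', 'city-b/tileset.json'],
    'city-a/tileset.json': [],
    'city-b/tileset.json': []
};
var exampleHasCycle = hasCycle(exampleReferences, 'world/tileset.json');
```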
As described above, the tree has spatial coherence; each tile has a bounding volume completely enclosing its contents, and the content for child tiles is completely inside the parent's bounding volume. This does not imply that a child's bounding volume is completely inside its parent's bounding volume. For example:
Bounding sphere for a terrain tile.
Bounding spheres for the four child tiles. The children's content is completely inside the parent's bounding volume, but the children's bounding volumes are not since they are not tightly fit.
The tree defined in tileset.json by root and, recursively, its children, can define different types of spatial data structures. In addition, any combination of tile formats and refinement approach (replacement or additive) can be used, enabling a lot of flexibility to support heterogeneous datasets.
It is up to the conversion tool that generates tileset.json to define an optimal tree for the dataset. A runtime engine, such as Cesium, is generic and will render any tree defined by tileset.json. Here are brief descriptions of how 3D Tiles can represent various spatial data structures.
A k-d tree is created when each tile has two children separated by a splitting plane parallel to the x, y, or z axis (or longitude, latitude, height). The split axis is often round-robin rotated as levels increase down the tree, and the splitting plane may be selected using the median split, surface area heuristics, or other approaches.
Example k-d tree. Note the non-uniform subdivision.
Note that a k-d tree does not have uniform subdivision like typical 2D geospatial tiling schemes and, therefore, can create a more balanced tree for sparse and non-uniformly distributed datasets.
3D Tiles enable variations on k-d trees such as multi-way k-d trees where, at each leaf of the tree, there are multiple splits along an axis. Instead of having two children per tile, there are n children.
A quadtree is created when each tile has four uniformly subdivided children (e.g., using the center longitude and latitude) similar to typical 2D geospatial tiling schemes. Empty child tiles can be omitted.
3D Tiles enable quadtree variations such as non-uniform subdivision and tight bounding volumes (as opposed to bounding, for example, the full 25% of the parent tile, which is wasteful for sparse datasets).
Quadtree with tight bounding volumes around each child.
For example, here is the root tile and its children for Canary Wharf. Note the bottom left, where the bounding volume does not include the water on the left where no buildings will appear:
3D Tiles also enable other quadtree variations such as loose quadtrees, where child tiles overlap but spatial coherence is still preserved, i.e., a parent tile completely encloses all of its children. This approach can be useful to avoid splitting features, such as 3D models, across tiles.
Quadtree with non-uniform and overlapping tiles.
Below, the green buildings are in the left child and the purple buildings are in the right child. Note that the tiles overlap so the two green and one purple building in the center are not split.
An octree extends a quadtree by using three orthogonal splitting planes to subdivide a tile into eight children. Like quadtrees, 3D Tiles allow variations to octrees such as non-uniform subdivision, tight bounding volumes, and overlapping children.
Traditional octree subdivision.
Non-uniform octree subdivision for a point cloud using additive refinement. Point Cloud of the Church of St Marie at Chappes, France by Prof. Peter Allen, Columbia University Robotics Lab. Scanning by Alejandro Troccoli and Matei Ciocarlie
3D Tiles enable uniform, non-uniform, and overlapping grids by supporting an arbitrary number of child tiles. For example, here is a top-down view of a non-uniform overlapping grid of Cambridge:
3D Tiles take advantage of empty tiles: those tiles that have a bounding volume, but no content. Since a tile's content property does not need to be defined, empty non-leaf tiles can be used to accelerate non-uniform grids with hierarchical culling. This essentially creates a quadtree or octree without hierarchical levels of detail (HLOD).
Each tile's content.url property points to a tile that is one of the formats listed in the Status section above.
A tileset can contain any combination of tile formats. 3D Tiles may also support different formats in the same tile using a Composite tile.
Buildings colored by height using declarative styling.
3D Tiles include concise declarative styling defined with JSON and expressions written in a small subset of JavaScript augmented for styling.
Styles generally define a feature's show and color (RGB and translucency) using an expression based on a feature's properties, for example:
{
"color" : "(${Temperature} > 90) ? color('red') : color('white')"
}
This colors features with a temperature above 90 as red and the others as white.
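To make the evaluation concrete, the sketch below substitutes ${name} variables with feature properties and evaluates the result as JavaScript. This is a toy illustration under stated assumptions: a real implementation would parse the expression rather than hand it to the Function constructor, the color() stand-in simply returns the color name, and evaluateStyleExpression is a hypothetical name. It should only be used with trusted style strings.

```javascript
// Toy evaluator (not how a production runtime works): rewrites ${name}
// into a property lookup and evaluates the expression as plain JavaScript.
function evaluateStyleExpression(expression, feature) {
    var source = expression.replace(/\$\{(\w+)\}/g, function (match, name) {
        return 'feature[' + JSON.stringify(name) + ']';
    });
    // Stand-in for the styling language's color() function
    function color(name) {
        return name;
    }
    var evaluate = new Function('feature', 'color', 'return ' + source + ';');
    return evaluate(feature, color);
}

var styleColor = "(${Temperature} > 90) ? color('red') : color('white')";
var hotFeatureColor = evaluateStyleExpression(styleColor, { Temperature: 95 });
```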
For complete details, see the Declarative Styling spec.
- General Q&A
- Technical Q&A
- How do 3D Tiles support heterogeneous datasets?
- Will tileset.json be part of the final 3D Tiles spec?
- How do I request the tiles for Level n?
- Will 3D Tiles support horizon culling?
- Is screen-space error the only metric used to drive refinement?
- How are cracks between tiles with vector data handled?
- When using replacement refinement, can multiple children be combined into one request?
- How can additive refinement be optimized?
- What compressed texture formats do 3D Tiles use?
We expect the initial 3D Tiles spec to evolve until fall 2016. If you are OK with things changing, then yes, jump in.
No, 3D Tiles are a general spec for streaming massive heterogeneous 3D geospatial datasets. The Cesium team started this initiative because we need an open format optimized for streaming 3D content to Cesium. AGI, the founder of Cesium, is also developing tools for creating 3D Tiles. We expect to see other visualization engines and conversion tools use 3D Tiles.
glTF, the runtime asset format for WebGL, is an open standard for 3D models from Khronos (the same group that does WebGL and COLLADA). Cesium uses glTF as its 3D model format, and the Cesium team contributes heavily to the glTF spec and open-source COLLADA2GLTF converter. We recommend using glTF in Cesium for individual assets, e.g., an aircraft, a character, or a 3D building.
We created 3D Tiles for streaming massive geospatial datasets where a single glTF model would be prohibitive. Given that glTF is optimized for rendering, that Cesium has a well-tested glTF loader, and that there are existing conversion tools for glTF, 3D Tiles use glTF for some tile formats such as b3dm (used for 3D buildings). We created a binary glTF extension (KHR_binary_glTF) in order to embed glTF into binary tiles and avoid base64-encoding or multiple file overhead.
Taking this approach allows us to improve Cesium, glTF, and 3D Tiles at the same time, e.g., when we add mesh compression to glTF, it benefits 3D models in Cesium, the glTF ecosystem, and 3D Tiles.
A common use case for 3D buildings is to stream a city dataset, color each building based on one or more properties (e.g., the building's height), and then hide a few buildings and replace them with high-resolution 3D buildings. With 3D Tiles, this type of editing can be done at runtime.
The general case runtime editing of geometry on a building, vector data, etc., and then efficiently saving those changes in a 3D Tile will be possible, but is not the initial focus. However, styling is much easier since it can be applied at runtime without modification to the 3D Tiles tree and is part of the initial work.
Yes, a quantized-mesh-like tile would fit well with 3D Tiles and allow Cesium to use the same streaming code (we say quantized-mesh-like because some of the metadata, e.g., for bounding volumes and horizon culling, may be organized differently or moved to tileset.json).
However, since Cesium already streams terrain well, we are not focused on this in the short-term.
Yes, there is an opportunity to provide an optimized base layer of terrain and imagery (similar to how a 3D model contains both geometry and textures). There is also the open research problem of how to tile imagery for 3D. In 2D, only one LOD (z layer) is used for a given view. In 3D, especially when looking towards the horizon, tiles from multiple LODs are adjacent to each other. How do we make the seams look good? This will likely require tool and runtime support.
As with terrain, since Cesium already streams imagery well, we are not focused on this in the short-term.
In many cases, yes. KML regions and network links are a clunky approach to streaming massive 3D geospatial datasets on the web. 3D Tiles are built for the web and optimized for streaming; true HLOD is used; polygons do not need to be triangulated; and so on.
Geospatial datasets are heterogeneous: 3D buildings are different from terrain, which is different from point clouds, which are different from vector data, and so on.
3D Tiles support heterogeneous data by allowing different tile formats in a tileset, e.g., a tileset may contain tiles for 3D buildings, tiles for instanced 3D trees, and tiles for point clouds, all using different tile formats.
3D Tiles also support heterogeneous datasets by concatenating different tile formats into one tile using the Composite tile format. In the example above, a tile may have a short header followed by the content for the 3D buildings, instanced 3D trees, and point clouds.
Supporting heterogeneous datasets with both inter-tile (different tile formats in the same tileset) and intra-tile (different tile formats in the same Composite tile) options allows conversion tools to make trade-offs between number of requests, optimal type-specific subdivision, and how visible/hidden layers are streamed.
Yes. There will always be a need to know metadata about the tileset and about tiles that are not yet loaded, e.g., so only visible tiles can be requested. However, when scaling to millions of tiles, a single tileset.json with metadata for the entire tree would be prohibitively large.
3D Tiles already support trees of trees. content.url can point to another tileset.json, which enables conversion tools to chunk up a tileset into any number of tileset.json files that reference each other.
There are a few other ways we may solve this:
- Moving subtree metadata to the tile payload instead of tileset.json. Each tile would have a header with, for example, the bounding volumes of each child, and perhaps grandchildren, and so on.
- Explicit tile layout like those of traditional tiling schemes (e.g., TMS's z/y/x). The challenge is that this implicitly assumes a spatial subdivision, whereas 3D Tiles are general enough to support quadtrees, octrees, k-d trees, and so on. There is likely to be a balance where two or three explicit tiling schemes can cover common cases to complement the generic spatial data structures.
More generally, how do 3D Tiles support the use case for when the viewer is zoomed in very close to terrain, for example, and we do not want to load all the parent tiles toward the root of the tree; instead, we want to skip right to the high-resolution tiles needed for the current 3D view?
This 3D Tiles topic needs additional research, but the answer is basically the same as above: either the skeleton of the tree can be quickly traversed to find the desired tiles or an explicit layout scheme will be used for specific subdivisions.
Since horizon culling is useful for terrain, 3D Tiles will likely support the metadata needed for it. We haven't considered it yet since our initial work with 3D Tiles is for 3D buildings where horizon culling is not effective.
At runtime, a tile's geometricError is used to compute the Screen-Space Error (SSE) to drive refinement. We expect to expand this, for example, by using the Virtual Multiresolution Screen Space Error (VMSSE), which takes occlusion into account. This can be done at runtime without streaming additional tile metadata. Similarly, fog can also be used to tolerate increases to the SSE in the distance.
However, we do anticipate other metadata for driving refinement. SSE may not be appropriate for all datasets; for example, points of interest may be better served with on/off distances and a label collision factor computed at runtime. Note that the viewer's height above the ground is rarely a good metric for 3D since 3D supports arbitrary views.
See #15.
Unlike 2D, in 3D, we expect adjacent tiles to be from different LODs so, for example, in the distance, lower resolution tiles are used. Adjacent tiles from different LODs can lead to an artifact called cracking where there are gaps between tiles. For terrain, this is generally handled by dropping skirts slightly angled outward around each tile to fill the gap. For 3D buildings, this is handled by extending the tile boundary to fully include buildings on the edge; see above. For vector data, this is an open research problem that we need to solve. This could involve boundary-aware simplification or runtime stitching.
Often when using replacement refinement, a tile's children are not rendered until all children are downloaded (an exception, for example, is unstructured data such as point clouds, where clipping planes can be used to mask out parts of the parent tile where the children are loaded; naively using the same approach for terrain or an arbitrary 3D model results in cracking or other artifacts between the parent and child).
We may design 3D Tiles to support downloading all children in a single request by allowing tileset.json to point to a subset of a file for a tile's content, similar to glTF buffer and bufferView. HTTP/2 will also make the overhead of multiple requests less important.
See #9.
Compared to replacement refinement, additive refinement has a size advantage because it doesn't duplicate data in the original dataset. However, it has a disadvantage when there are expensive tiles to render near the root and the view is zoomed in close. In this case, for example, the entire root tile may be rendered, but perhaps only one feature or even no features are visible.
3D Tiles can optimize this by storing an optional spatial data structure in each tile. For example, a tile could contain a simple 2x2 grid, and if the tile's bounding volume is not completely inside the view frustum, each box in the grid is checked against the frustum, and only those inside or intersecting are rendered.
See #11.
3D Tiles will support the same texture compression that glTF will support. In addition, we need to consider how well GPU formats compress compared to, for example, JPEG. Some desktop game engines stream JPEG, then decompress and recompress to a GPU format in a thread. The CPU overhead for this approach may be too high for JavaScript and Web Workers.
The screenshots in this spec use awesome CyberCity3D buildings and the Bing Maps base layer.