This document was originally designed in the Immersive Web Working Group to build upon the WebXR Device API. Ownership of this content has been moved to the Immersive Web Community Group for further incubation.
The purpose of this document is to describe a design for enabling developers to hit test against the real-world environment. For context, it may be helpful to have first read about WebXR Session Establishment, Spatial Tracking, Input Mechanisms, and the Augmented Reality Module.
"Hit testing" (aka "raycasting") is the process of finding intersections between 3D geometry and a ray, comprised of an origin and direction. Conceptually, hit testing can be done against virtual 3D geometry or real-world 3D geometry. As WebXR does not have any knowledge of the developer's 3D scene graph, it does not provide APIs for virtual hit testing. It does, however, have information about the real-world and provides a method for developers to hit test against it. Most commonly in WebXR, developers will hit test using XRInputSources or the XRReferenceSpace of type "viewer" to track where a cursor should be drawn on hand-held devices, or even to bounce a virtual object off real-world geometry. In WebXR, 'inline' and 'immersive-vr' sessions are limited to performing virtual hit tests, while 'immersive-ar' sessions can perform both virtual and real-world hit tests.
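To make the idea of intersecting a ray with geometry concrete, here is a minimal sketch of a hit test against a single infinite plane. It is purely illustrative: the `intersectRayPlane` helper and the plain-object ray and plane shapes are assumptions made for this example and are not part of WebXR, which performs real-world hit tests inside the user agent.

```javascript
// Illustrative only: plain objects stand in for rays and planes.
// ray:   { origin: {x, y, z}, direction: {x, y, z} }  (direction normalized)
// plane: { point: {x, y, z}, normal: {x, y, z} }      (normal normalized)
function dot(a, b) {
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

function intersectRayPlane(ray, plane) {
  const denom = dot(ray.direction, plane.normal);
  // A ray parallel to the plane never hits it.
  if (Math.abs(denom) < 1e-6) {
    return null;
  }
  const toPlane = {
    x: plane.point.x - ray.origin.x,
    y: plane.point.y - ray.origin.y,
    z: plane.point.z - ray.origin.z,
  };
  const distance = dot(toPlane, plane.normal) / denom;
  // A negative distance means the plane lies behind the ray origin.
  if (distance < 0) {
    return null;
  }
  return {
    distance,
    point: {
      x: ray.origin.x + ray.direction.x * distance,
      y: ray.origin.y + ray.direction.y * distance,
      z: ray.origin.z + ray.direction.z * distance,
    },
  };
}

// A ray starting 1.5 units above a floor plane and pointing straight down.
const floor = { point: { x: 0, y: 0, z: 0 }, normal: { x: 0, y: 1, z: 0 } };
const downRay = { origin: { x: 0, y: 1.5, z: 0 }, direction: { x: 0, y: -1, z: 0 } };
const floorHit = intersectRayPlane(downRay, floor);
```

A 3D engine's virtual hit test typically repeats this kind of intersection against every candidate surface and sorts the hits by distance.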
The main use cases enabled by the hit testing API include:
- Showing an object that appears to track the real-world surfaces at which the device or controller is pointed.
  - Often, AR apps want to display something that appears to stick to real-world surfaces as the user moves the pointing device. The object's position should reflect the most up-to-date knowledge of the real world as of the displayed frame.
  - Frequency: this action is done every single frame.
- Placing a virtual object in the real world.
  - In order for virtual objects to appear to be anchored in the real world, they must be placed at the same height as the real-world objects (the floor, a table, a wall, ...).
  - Frequency: this action is usually done in response to user input and can potentially happen on every frame.
Hit testing against an application's virtual scene elements is explicitly out of scope for this API.
Applications could potentially use hit testing to estimate the location of real-world geometry (for example, by performing hit tests with dozens of rays). This use case is not directly supported by the API, but will not be actively blocked.
Since the hit test API can potentially be used to extract data about the user's environment, similarly to real-world-geometry APIs (albeit at lower fidelity), UAs should be careful about controlling access to the API. The specific mechanisms for achieving this are out of scope for this explainer.
As an alternative to using the hit test API, applications could perform arbitrary hit tests themselves using data obtained from real-world-geometry APIs. Because of that, it is unclear whether a web-exposed hit test would be useful, and feedback from early adopters of the API will be especially important.
A key challenge with enabling real-world hit testing in WebXR is that computing real-world hit test results can be performance-impacting and dependent on secondary threads in many of the underlying implementations. However, from a developer perspective, out-of-date asynchronous hit test results are often less than useful.
WebXR addresses this challenge through the use of the XRHitTestSource and XRTransientInputHitTestSource interfaces, which serve as handles to hit test subscriptions. The presence of a hit test source signals to the user agent that the developer intends to query hit test results in subsequent XRFrames. The user agent can then precompute hit test results based on the properties of a hit test source such that each XRFrame will be bundled with all "subscribed" hit test results.
To create an XRHitTestSource, developers call the XRSession.requestHitTestSource() function. This function accepts an XRHitTestOptionsInit dictionary with the following key-value pairs:
- space is required and is the XRSpace to be tracked by the hit test source. As this XRSpace updates its location each frame, the XRHitTestSource will move with it.
- entityTypes - see limiting results to specific entities section.
- offsetRay is optional and, if provided, is the XRRay from which the hit test should be performed. The ray's coordinates are defined with space as the origin. If an offsetRay is not provided, hit testing will be performed using a ray coincident with the space origin and pointing in the "forward" direction (see Rays section).
In this example, an XRHitTestSource is created slightly above the center of the "viewer" XRReferenceSpace. This is because the developer is planning to draw UI elements along the bottom of the hand-held AR device's immersive view while still wanting to give the perception of a centered cursor. For more information, see Rendering cursors and highlights in the Input Explainer.
let viewerHitTestSource = null;
let viewerSpace = ...; // XRReferenceSpace obtained via
                       // a call to XRSession.requestReferenceSpace("viewer");
let hitTestOptionsInit = {
  space : viewerSpace,
  offsetRay : new XRRay({y: 0.5})
};
xrSession.requestHitTestSource(hitTestOptionsInit).then((hitTestSource) => {
  viewerHitTestSource = hitTestSource;
  // Store some additional data on just created hit test source
  // by extending the object:
  viewerHitTestSource.appContext = { options : hitTestOptionsInit };
});
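The rule for the optional offsetRay can be sketched as a small helper that reports which ray a set of options would effectively use. This is a hedged illustration rather than how a user agent implements it: `resolveHitTestRay` and `DEFAULT_RAY` are hypothetical names, plain objects stand in for XRRay so the snippet runs outside a browser, and "forward" is assumed to be the negative Z axis, in line with the description of the XRRay matrix.

```javascript
// Hypothetical helper, not part of WebXR. Plain objects stand in for XRRay.
// Assumption: the default ray starts at the space origin and points down -Z.
const DEFAULT_RAY = Object.freeze({
  origin: Object.freeze({ x: 0, y: 0, z: 0, w: 1 }),
  direction: Object.freeze({ x: 0, y: 0, z: -1, w: 0 }),
});

function resolveHitTestRay(options) {
  // "space" is the only required key of the options dictionary.
  if (!options || !options.space) {
    throw new TypeError("hit test options require a 'space'");
  }
  return {
    space: options.space,
    // Fall back to the default ray when no offsetRay was supplied.
    offsetRay: options.offsetRay || DEFAULT_RAY,
    usesDefaultRay: !options.offsetRay,
  };
}
```

Calling `resolveHitTestRay({ space: viewerSpace })` would report the default ray, while passing an explicit offsetRay returns that ray unchanged, expressed relative to the space.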
While asynchronous hit test source creation is useful in many scenarios, it is problematic for transient input sources. If an XRHitTestSource is requested in response to the inputsourceschange event using the XRSession.requestHitTestSource() API, it may take several frames before the resulting hit test source would be able to provide hit test results, by which time the input source may no longer exist. This might be the case even with a one-frame delay between the hit test source creation request and its creation. However, because of the potential performance impacts mentioned in section Real-world hit testing, it is important that WebXR not perform hit tests for input sources the developer does not need.
To address this issue and still enable web applications to request hit test sources for transient input sources, applications can use the XRSession.requestHitTestSourceForTransientInput() method:
let transientInputHitTestSource = null;
let hitTestOptionsInit = {
  profile : 'generic-touchscreen',
  offsetRay : new XRRay()
};
xrSession.requestHitTestSourceForTransientInput(hitTestOptionsInit).then((hitTestSource) => {
  transientInputHitTestSource = hitTestSource;
  // Store some additional data on just created hit test source
  // by extending the object:
  transientInputHitTestSource.context = { options : hitTestOptionsInit };
});
The XRSession.requestHitTestSourceForTransientInput() method accepts a dictionary with the following key-value pairs:
- profile is required and specifies the input profile name (see input profile names) that the transient input source must match in order to be considered for a hit test once it is created (for example in response to the user input).
- entityTypes - see limiting results to specific entities section.
- offsetRay is optional and specifies an XRRay for which the hit test should be performed. The ray will be interpreted as if relative to targetRaySpace of the transient input source that matches the profile mentioned above.
To get synchronous hit test results for a particular frame, developers call XRFrame.getHitTestResults(), passing in an XRHitTestSource as the hitTestSource parameter. This function will return a FrozenArray<XRHitTestResult> in which XRHitTestResults are ordered by distance along the XRRay used to perform the hit test, with the nearest in the 0th position. If no results exist, the array will have a length of zero. The XRHitTestResult interface exposes a method, getPose(XRSpace baseSpace), that can be used to query the result's pose. If, in the current frame, the XRSpace passed in as the baseSpace parameter cannot be located relative to the hit test result, the function will return null.
// Hit test source returned from a call to XRSession.requestHitTestSource(...):
let hitTestSource = ...;
function updateScene(timestamp, xrFrame) {
  // Scene update logic ...
  let hitTestResults = xrFrame.getHitTestResults(hitTestSource);
  if (hitTestResults.length > 0) {
    // Do something with the results
  }
  // Other scene update logic ...
}
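As one possible way to "do something with the results", the sketch below takes the nearest result, asks for its pose in a chosen base space, and uses it to place a cursor. The `getNearestHitPose` and `updateReticle` helpers and the `reticle` object (with `visible` and `matrix` fields) are hypothetical and stand in for whatever the application's 3D engine provides; only getHitTestResults(), getPose(), and the pose's transform matrix come from WebXR.

```javascript
// Returns the pose of the nearest hit expressed in baseSpace, or null when
// there is no hit or the hit cannot be located relative to baseSpace.
function getNearestHitPose(frame, hitTestSource, baseSpace) {
  const results = frame.getHitTestResults(hitTestSource);
  if (results.length === 0) {
    return null;
  }
  // Results are ordered by distance, so index 0 is the nearest hit.
  return results[0].getPose(baseSpace);
}

// "reticle" is a hypothetical scene object owned by the application.
function updateReticle(reticle, frame, hitTestSource, baseSpace) {
  const pose = getNearestHitPose(frame, hitTestSource, baseSpace);
  reticle.visible = pose !== null;
  if (pose) {
    // The pose's transform matrix can be handed to the rendering engine.
    reticle.matrix = pose.transform.matrix;
  }
}
```

Calling `updateReticle(...)` once per frame from the application's frame callback keeps the cursor attached to the most recent hit and hides it when nothing is hit.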
In order to obtain hit test results for transient input source hit test subscriptions in a particular frame, developers call XRFrame.getHitTestResultsForTransientInput(), passing in an XRTransientInputHitTestSource as the hitTestSource parameter. This function will return a FrozenArray<XRTransientInputHitTestResult>. Each element of the array contains the input source that was used to obtain the results in its inputSource attribute, and the actual hit test results in FrozenArray<XRHitTestResult> results, ordered by the distance along the ray used to perform the hit test, with the closest result at the 0th position.
// Hit test source returned from a call to
// XRSession.requestHitTestSourceForTransientInput(...):
let transientInputHitTestSource = ...;
function updateScene(timestamp, xrFrame) {
  // Scene update logic ...
  let hitTestResultsPerInputSource = xrFrame.getHitTestResultsForTransientInput(transientInputHitTestSource);
  hitTestResultsPerInputSource.forEach(resultsPerInputSource => {
    if (!isInteresting(resultsPerInputSource.inputSource)) {
      return; // Application can perform additional
              // filtering based on the input source.
    }
    if (resultsPerInputSource.results.length > 0) {
      // Do something with the results
    }
  });
  // Other scene update logic ...
}
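Building on the loop above, a small helper can gather the nearest locatable hit for each transient input source in a frame. This is only a sketch: `collectNearestTransientHits` is a hypothetical name, and the returned `{ inputSource, pose }` shape is an assumption chosen for the example.

```javascript
// Collects, for each transient input source with results in this frame,
// the pose of its nearest hit expressed in baseSpace.
function collectNearestTransientHits(frame, transientHitTestSource, baseSpace) {
  const hits = [];
  const perInputSource = frame.getHitTestResultsForTransientInput(transientHitTestSource);
  for (const entry of perInputSource) {
    if (entry.results.length === 0) {
      continue; // This input source did not hit anything.
    }
    // Results are ordered by distance, so index 0 is the nearest hit.
    const pose = entry.results[0].getPose(baseSpace);
    if (pose) {
      hits.push({ inputSource: entry.inputSource, pose });
    }
  }
  return hits;
}
```

On a touchscreen device, each returned entry would correspond to one active touch that currently points at a detected real-world entity.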
Hit test results returned from the underlying platform can carry information about the real-world entity that caused the hit test result to be present. Examples of such entities include planes and feature points. The application can specify what kind of entities should be used for a particular hit test subscription by setting the entityTypes key in XRHitTestOptionsInit / XRTransientInputHitTestOptionsInit:
let hitTestOptionsInit = {
  space : xrSpace,
  entityTypes : ["plane", "point"],
  offsetRay : new XRRay()
};
let transientInputHitTestOptionsInit = {
  profile : "generic-touchscreen",
  entityTypes : ["plane"],
  offsetRay : new XRRay()
};
Multiple values in the array set for the entityTypes key are treated as a logical "or" filter. For example, entityTypes : ["plane", "point"] means that the arrays returned from XRFrame.getHitTestResults() / XRFrame.getHitTestResultsForTransientInput() will contain hit test results based on real-world planes, as well as results based on characteristic points detected in the user's environment; those are the hit test results whose entities satisfy the condition (type == "plane") or (type == "point"), assuming that type holds the type of the given entity. If the application does not set a value for the entityTypes key when requesting a hit test source, a default value of ["plane"] will be used.
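The "or" semantics and the default value can be modelled with a few lines of plain JavaScript. This is a conceptual sketch only: the filtering happens inside the user agent, and the `type` field on the sample hits, as well as the `entityMatches` and `filterHits` helpers, are hypothetical stand-ins for information that is not exposed on XRHitTestResult.

```javascript
// Default used when the application does not specify entityTypes.
const DEFAULT_ENTITY_TYPES = ["plane"];

// True when the entity type satisfies the logical "or" over entityTypes.
function entityMatches(entityType, entityTypes = DEFAULT_ENTITY_TYPES) {
  return entityTypes.includes(entityType);
}

// Keeps only the hits whose (hypothetical) type passes the filter.
function filterHits(hits, entityTypes) {
  return hits.filter((hit) => entityMatches(hit.type, entityTypes));
}

// Sample data: one hit on a plane and one on a feature point.
const sampleHits = [
  { type: "point", distance: 0.5 },
  { type: "plane", distance: 1.0 },
];
const planeOnly = filterHits(sampleHits); // falls back to ["plane"]
const planeOrPoint = filterHits(sampleHits, ["plane", "point"]);
```

Here `planeOnly` contains just the plane hit, while `planeOrPoint` keeps both hits in their original order.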
To allow applications to unsubscribe from hit test sources, both XRHitTestSource and XRTransientInputHitTestSource expose a cancel() method:
let hitTestSource = ...; // Obtained from XRSession.requestHitTestSource(...).
// Unsubscribe from hit test:
hitTestSource.cancel();
// hitTestSource will no longer be usable to obtain the results,
// might as well set it to null to avoid mistakes.
hitTestSource = null;
let hitTestSourceForTransientInput = ...; // Obtained from XRSession.requestHitTestSourceForTransientInput(...).
// Unsubscribe from hit test for transient input:
hitTestSourceForTransientInput.cancel();
// hitTestSourceForTransientInput will no longer be usable to obtain the results,
// might as well set it to null to avoid mistakes.
hitTestSourceForTransientInput = null;
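When an application holds several subscriptions, it can be convenient to release them together, for example when the user leaves a placement mode. The helper below is a minimal sketch under stated assumptions: `cancelHitTestSources` is a hypothetical name, and the try/catch reflects the assumption that cancel() may throw if a source is no longer active.

```javascript
// Cancels every hit test source in the array and empties it.
// Returns the number of sources that were cancelled without error.
function cancelHitTestSources(sources) {
  let cancelled = 0;
  for (const source of sources) {
    if (!source) {
      continue; // Skip slots that were already cleared.
    }
    try {
      source.cancel();
      cancelled += 1;
    } catch (error) {
      // Assumption: cancel() may throw if the source is no longer active.
      // There is nothing left to release in that case.
    }
  }
  // Drop all references so stale sources cannot be queried by mistake.
  sources.length = 0;
  return cancelled;
}
```

Both XRHitTestSource and XRTransientInputHitTestSource objects can be stored in the same array, since the helper only relies on their cancel() method.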
An XRRay object includes both an origin and a direction, both given as DOMPointReadOnlys. The origin represents a 3D coordinate in space with a w component that must be equal to 1, and the direction represents a normalized 3D directional vector with a w component that must be equal to 0. The XRRay also defines a matrix which represents the transform from a ray originating at [0, 0, 0] and extending down the negative Z axis to the ray described by the XRRay's origin and direction. This is useful for positioning graphical representations of the ray.
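To illustrate what such a matrix contains, the sketch below builds one from an origin and a direction in plain JavaScript. It is not the algorithm a user agent is required to use: `rayToMatrix` is a hypothetical helper, the matrix is assumed to be stored in column-major order, and the roll around the ray is chosen arbitrarily, since any rotation about the ray axis still maps the canonical ray onto the requested one.

```javascript
function normalize(v) {
  const length = Math.hypot(v.x, v.y, v.z);
  return { x: v.x / length, y: v.y / length, z: v.z / length };
}

function cross(a, b) {
  return {
    x: a.y * b.z - a.z * b.y,
    y: a.z * b.x - a.x * b.z,
    z: a.x * b.y - a.y * b.x,
  };
}

// Builds a column-major 4x4 matrix that maps the canonical ray (origin at
// [0, 0, 0], pointing down -Z) onto the ray given by origin and direction.
function rayToMatrix(origin, direction) {
  const d = normalize(direction);
  // -Z must map to the ray direction, so the Z axis is its opposite.
  const zAxis = { x: -d.x, y: -d.y, z: -d.z };
  // Pick a helper axis that is not parallel to the ray (arbitrary roll).
  const helper = Math.abs(zAxis.y) < 0.999
    ? { x: 0, y: 1, z: 0 }
    : { x: 1, y: 0, z: 0 };
  const xAxis = normalize(cross(helper, zAxis));
  const yAxis = cross(zAxis, xAxis);
  return new Float32Array([
    xAxis.x, xAxis.y, xAxis.z, 0,
    yAxis.x, yAxis.y, yAxis.z, 0,
    zAxis.x, zAxis.y, zAxis.z, 0,
    origin.x, origin.y, origin.z, 1,
  ]);
}
```

Applying the result to the point [0, 0, 0, 1] yields the ray origin, and applying it to the direction [0, 0, -1, 0] yields the normalized ray direction, which is what a renderer needs to draw a pointer beam along the ray.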
A key component of creating realistic presence in XR experiences is the ability to know whether a hit test intersects virtual or real-world geometry. For example, developers might want to put a virtual object somewhere in the real world, but only if a different virtual object isn't already present. In future spec revisions, when real-world occlusion is possible with WebXR, developers will likely be able to create virtual buttons that are only "clickable" if there is no physical object in the way.
There are a handful of techniques which can be used to determine a combined hit test result. For example, a developer may choose to weight hit test results differently if a user is already interacting with a particular object. In this explainer, a simple example of combining hit test results is provided: if a virtual hit-test is found it is returned, otherwise the sample returns the closest real-world hit test result. Because WebXR does not have any knowledge of the developer's 3D scene graph, this sample uses the XRFrame.getPose() function to create a ray and passes it into the 3D engine's virtual hit test function.
function getCombinedHitTestResult(frame, inputSource, hitTestSource) {
  // Try to get virtual hit test result
  if (inputSource) {
    let inputSourcePose = frame.getPose(inputSource.targetRaySpace, xrReferenceSpace);
    if (inputSourcePose) {
      let virtualHitTestResult = scene.virtualHitTest(new XRRay(inputSourcePose.transform));
      // Only return when the virtual hit test found something (this assumes
      // the engine returns a falsy value on a miss); otherwise fall through
      // to the real-world hit test below.
      if (virtualHitTestResult) {
        return {
          result : virtualHitTestResult,
          virtualTarget : virtualHitTestResult.target
        };
      }
    }
  }
  // Try to get real-world hit test result
  if (hitTestSource) {
    let realHitTestResults = frame.getHitTestResults(hitTestSource);
    if (realHitTestResults && realHitTestResults.length > 0) {
      return { result : realHitTestResults[0] };
    }
  }
  return {};
}
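As an example of a different weighting, an application might prefer whichever hit is closer to the ray origin, so that a real-world surface in front of a virtual object takes precedence. The sketch below assumes each hit has already been reduced to a plain `position` in a common space (for a real-world result this could come from the position of the pose returned by getPose()); `pickNearerHit` and the hit shape are hypothetical.

```javascript
function distanceBetween(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Returns whichever hit lies closer to rayOrigin, or null if neither exists.
// Each hit is a hypothetical { position: {x, y, z}, ... } object.
function pickNearerHit(rayOrigin, virtualHit, realHit) {
  if (!virtualHit) {
    return realHit || null;
  }
  if (!realHit) {
    return virtualHit;
  }
  const virtualDistance = distanceBetween(rayOrigin, virtualHit.position);
  const realDistance = distanceBetween(rayOrigin, realHit.position);
  // Ties favor the virtual hit, matching the sample's virtual-first order.
  return virtualDistance <= realDistance ? virtualHit : realHit;
}
```

Whether nearest-wins or virtual-first is the better policy depends on the experience; nearest-wins approximates occlusion by real-world surfaces, at the cost of requiring both hit tests every time.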
This feature will help a website map the user's physical environment with a somewhat low level of accuracy. The specification allows a UA to restrict the usage of hit test subscriptions based on an XRRay. Using a lot of rays would allow the site to build a more detailed view of the environment, and the UA may decide that, after a certain number, further requests are superfluous.
This feature is blocked by default for third-party contexts and can be controlled via a Feature Policy flag.
This is a partial IDL and is considered additive to the core IDL found in the main explainer.
//
// Session
//
partial interface XRSession {
  Promise<XRHitTestSource> requestHitTestSource(XRHitTestOptionsInit options);
  Promise<XRTransientInputHitTestSource> requestHitTestSourceForTransientInput(XRTransientInputHitTestOptionsInit options);
};
//
// Frame
//
partial interface XRFrame {
  FrozenArray<XRHitTestResult> getHitTestResults(XRHitTestSource hitTestSource);
  FrozenArray<XRTransientInputHitTestResult> getHitTestResultsForTransientInput(XRTransientInputHitTestSource hitTestSource);
};
//
// Hit Testing Options
//
enum XRHitTestTrackableType {
  "point",
  "plane"
};
dictionary XRHitTestOptionsInit {
  required XRSpace space;
  FrozenArray<XRHitTestTrackableType> entityTypes;
  XRRay offsetRay = new XRRay();
};
dictionary XRTransientInputHitTestOptionsInit {
  required DOMString profile;
  FrozenArray<XRHitTestTrackableType> entityTypes;
  XRRay offsetRay = new XRRay();
};
//
// Hit Test Sources
//
[SecureContext, Exposed=Window]
interface XRHitTestSource {
  undefined cancel();
};
[SecureContext, Exposed=Window]
interface XRTransientInputHitTestSource {
  undefined cancel();
};
//
// Hit Test Results
//
[SecureContext, Exposed=Window]
interface XRHitTestResult {
  XRPose? getPose(XRSpace baseSpace);
};
[SecureContext, Exposed=Window]
interface XRTransientInputHitTestResult {
  [SameObject] readonly attribute XRInputSource inputSource;
  readonly attribute FrozenArray<XRHitTestResult> results;
};
//
// Geometric Primitives
//
[SecureContext, Exposed=Window,
 Constructor(optional DOMPointInit origin, optional DOMPointInit direction),
 Constructor(XRRigidTransform transform)]
interface XRRay {
  [SameObject] readonly attribute DOMPointReadOnly origin;
  [SameObject] readonly attribute DOMPointReadOnly direction;
  [SameObject] readonly attribute Float32Array matrix;
};