HTML

Living Standard — Last Updated 16 July 2024

1. 2.4 URLs
2. 2.5 Fetching resources

2.4 URLs

2.4.1 Terminology

A string is a valid non-empty URL if it is a valid URL string but it is not the empty string.

A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing ASCII whitespace from it, it is a valid URL string.

A string is a valid non-empty URL potentially surrounded by spaces if, after stripping leading and trailing ASCII whitespace from it, it is a valid non-empty URL.

This specification defines the URL about:legacy-compat as a reserved, though unresolvable, about: URL, for use in DOCTYPEs in HTML documents when needed for compatibility with XML tools. [ABOUT]

This specification defines the URL about:html-kind as a reserved, though unresolvable, about: URL, that is used as an identifier for kinds of media tracks. [ABOUT]

This specification defines the URL about:srcdoc as a reserved, though unresolvable, about: URL, that is used as the URL of iframe srcdoc documents. [ABOUT]

The fallback base URL of a Document object document is the URL record obtained by running these steps:

If document is an iframe srcdoc document, then:
1. Assert: document's about base URL is non-null.
2. Return document's about base URL.
If document's URL matches about:blank and document's about base URL is non-null, then return document's about base URL.
Return document's URL.

The document base URL of a Document object is the URL record obtained by running these steps:

If there is no base element that has an href attribute in the Document, then return the Document's fallback base URL.
Otherwise, return the frozen base URL of the first base element in the Document that has an href attribute, in tree order.
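The two algorithms above can be sketched as follows. This is a minimal model, not an implementation of any real DOM API: the `Document` dataclass, its field names, and the use of `urlsplit` to approximate "matches about:blank" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass
class Document:
    """Hypothetical stand-in for a Document object (not a real DOM API)."""
    url: str
    is_iframe_srcdoc: bool = False
    about_base_url: Optional[str] = None
    # Frozen base URLs of the base elements that have an href, in tree order.
    frozen_base_urls: List[str] = field(default_factory=list)


def matches_about_blank(url: str) -> bool:
    # Approximation: scheme "about", path "blank", no host or credentials.
    # Query and fragment are deliberately ignored.
    parts = urlsplit(url)
    return parts.scheme == "about" and parts.path == "blank" and parts.netloc == ""


def fallback_base_url(document: Document) -> str:
    if document.is_iframe_srcdoc:
        assert document.about_base_url is not None
        return document.about_base_url
    if matches_about_blank(document.url) and document.about_base_url is not None:
        return document.about_base_url
    return document.url


def document_base_url(document: Document) -> str:
    # No base element with an href: fall back.
    if not document.frozen_base_urls:
        return fallback_base_url(document)
    # Otherwise the first such base element wins.
    return document.frozen_base_urls[0]
```

In this model, an about:blank document with a non-null about base URL resolves relative URLs against that about base URL, while any document with a base element that has an href uses the first such element's frozen base URL.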

A URL matches about:blank if its scheme is "about", its path contains a single string "blank", its username and password are the empty string, and its host is null.

Such a URL's query and fragment can be non-null. For example, the URL record created by parsing "about:blank?foo#bar" matches about:blank.

A URL matches about:srcdoc if its scheme is "about", its path contains a single string "srcdoc", its query is null, its username and password are the empty string, and its host is null.

Matches about:srcdoc requires the URL's query to be null because it is not possible to create an iframe srcdoc document whose URL has a non-null query, unlike Documents whose URL matches about:blank. In other words, URLs that match about:srcdoc vary only in their fragment.
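As an illustration of the two matching definitions, the sketch below models a URL record as a small dataclass. The `URLRecord` type and its field names are assumptions for this example; they mirror the components named in the text rather than any particular library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class URLRecord:
    """Hypothetical, simplified URL record holding only the fields the text mentions."""
    scheme: str
    path: List[str]
    username: str = ""
    password: str = ""
    host: Optional[str] = None
    query: Optional[str] = None
    fragment: Optional[str] = None


def _is_bare_about_url(url: URLRecord, name: str) -> bool:
    return (
        url.scheme == "about"
        and url.path == [name]
        and url.username == ""
        and url.password == ""
        and url.host is None
    )


def matches_about_blank(url: URLRecord) -> bool:
    # Query and fragment may be non-null.
    return _is_bare_about_url(url, "blank")


def matches_about_srcdoc(url: URLRecord) -> bool:
    # Only the fragment may vary; the query has to be null.
    return _is_bare_about_url(url, "srcdoc") and url.query is None
```

For example, a record equivalent to "about:blank?foo#bar" satisfies `matches_about_blank`, while a record with path "srcdoc" and a non-null query does not satisfy `matches_about_srcdoc`.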

2.4.2 Parsing URLs

Parsing a URL is the process of taking a string and obtaining the URL record that it represents. While this process is defined in URL, the HTML standard defines several wrappers to abstract base URLs and encodings. [URL]

Most new APIs should use the parse a URL algorithm. Older APIs and HTML elements might have reason to use encoding-parse a URL instead. When a custom base URL is needed or no base URL is desired, the URL parser can be used directly.

To parse a URL, given a string url, relative to a Document object or environment settings object environment, run these steps. They return failure or a URL.

Let baseURL be environment's document base URL, if environment is a Document object; otherwise environment's API base URL.
Return the result of applying the URL parser to url, with baseURL.

To encoding-parse a URL, given a string url, relative to a Document object or environment settings object environment, run these steps. They return failure or a URL.

Let encoding be UTF-8.
If environment is a Document object, then set encoding to environment's character encoding.
Otherwise, if environment's relevant global object is a Window object, set encoding to environment's relevant global object's associated Document's character encoding.
Let baseURL be environment's document base URL, if environment is a Document object; otherwise environment's API base URL.
Return the result of applying the URL parser to url, with baseURL and encoding.

To encoding-parse-and-serialize a URL, given a string url, relative to a Document object or environment settings object environment, run these steps. They return failure or a string.

Let url be the result of encoding-parsing a URL given url, relative to environment.
If url is failure, then return failure.
Return the result of applying the URL serializer to url.
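The three wrappers differ only in how they choose the base URL and the encoding. The sketch below shows that structure. Python's `urllib.parse` is used purely as a rough stand-in for the URL parser and serializer (it does not implement the URL Standard), and the `Document` and `EnvironmentSettings` dataclasses, their field names, and `stand_in_url_parser` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union
from urllib.parse import SplitResult, quote, urljoin, urlsplit


@dataclass
class Document:
    document_base_url: str
    character_encoding: str = "utf-8"


@dataclass
class EnvironmentSettings:
    api_base_url: str
    # The associated Document when the global object is a Window, else None.
    window_document: Optional[Document] = None


Environment = Union[Document, EnvironmentSettings]


def stand_in_url_parser(url: str, base: str, encoding: str = "utf-8") -> Optional[SplitResult]:
    """Rough stand-in for the URL parser; None plays the role of failure."""
    resolved = urlsplit(urljoin(base, url))
    if not resolved.scheme:
        return None
    # The encoding only influences how the query is percent-encoded.
    query = quote(resolved.query.encode(encoding, errors="replace"), safe="=&%+")
    return resolved._replace(query=query)


def _base_url(environment: Environment) -> str:
    if isinstance(environment, Document):
        return environment.document_base_url
    return environment.api_base_url


def parse_a_url(url: str, environment: Environment) -> Optional[SplitResult]:
    return stand_in_url_parser(url, _base_url(environment))


def encoding_parse_a_url(url: str, environment: Environment) -> Optional[SplitResult]:
    encoding = "utf-8"
    if isinstance(environment, Document):
        encoding = environment.character_encoding
    elif environment.window_document is not None:
        encoding = environment.window_document.character_encoding
    return stand_in_url_parser(url, _base_url(environment), encoding)


def encoding_parse_and_serialize_a_url(url: str, environment: Environment) -> Optional[str]:
    parsed = encoding_parse_a_url(url, environment)
    if parsed is None:
        return None
    return parsed.geturl()
```

With a Document whose character encoding is windows-1252, the query of "c?q=caf\u00e9" is percent-encoded as `%E9` by the encoding-aware wrappers, whereas `parse_a_url` always encodes it as UTF-8 (`%C3%A9`).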

2.4.3 Dynamic changes to base URLs

When a document's document base URL changes, all elements in that document are affected by a base URL change.

The following are base URL change steps, which run when an element is affected by a base URL change (as defined by DOM):

If the element creates a hyperlink

If the URL identified by the hyperlink is being shown to the user, or if any data derived from that URL is affecting the display, then the href attribute's value should be reparsed, relative to the element's node document and the UI updated appropriately.

For example, the CSS :link/:visited pseudo-classes might have been affected.

If the hyperlink has a ping attribute and its URL(s) are being shown to the user, then the ping attribute's tokens should be reparsed, relative to the element's node document and the UI updated appropriately.

If the element is a q, blockquote, ins, or del element with a cite attribute

If the URL identified by the cite attribute is being shown to the user, or if any data derived from that URL is affecting the display, then the cite attribute's value should be reparsed, relative to the element's node document and the UI updated appropriately.

Otherwise

The element is not directly affected.

For instance, changing the base URL doesn't affect the image displayed by img elements, although subsequent accesses of the src IDL attribute from script will return a new absolute URL that might no longer correspond to the image being shown.

2.5 Fetching resources

2.5.1 Terminology

A response whose type is "basic", "cors", or "default" is CORS-same-origin. [FETCH]

A response whose type is "opaque" or "opaqueredirect" is CORS-cross-origin.

A response's unsafe response is its internal response if it has one, and the response itself otherwise.

To create a potential-CORS request, given a url, destination, corsAttributeState, and an optional same-origin fallback flag, run these steps:

Let mode be "no-cors" if corsAttributeState is No CORS, and "cors" otherwise.
If same-origin fallback flag is set and mode is "no-cors", set mode to "same-origin".
Let credentialsMode be "include".
If corsAttributeState is Anonymous, set credentialsMode to "same-origin".
Let request be a new request whose URL is url, destination is destination, mode is mode, credentials mode is credentialsMode, and whose use-URL-credentials flag is set.
Return request.
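A minimal sketch of these steps, returning the constructed request, is shown below. The `Request` dataclass and the use of plain strings for the CORS settings attribute states are assumptions for illustration; a real request has many more fields.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical, heavily simplified request."""
    url: str
    destination: str
    mode: str
    credentials_mode: str
    use_url_credentials: bool = True


def create_potential_cors_request(
    url: str,
    destination: str,
    cors_attribute_state: str,  # "No CORS", "Anonymous", or "Use Credentials"
    same_origin_fallback: bool = False,
) -> Request:
    mode = "no-cors" if cors_attribute_state == "No CORS" else "cors"
    if same_origin_fallback and mode == "no-cors":
        mode = "same-origin"
    credentials_mode = "include"
    if cors_attribute_state == "Anonymous":
        credentials_mode = "same-origin"
    return Request(url, destination, mode, credentials_mode, use_url_credentials=True)
```

Note that the same-origin fallback flag only has an effect in the No CORS state; in the other two states the mode is already "cors".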

2.5.2 Determining the type of a resource

The Content-Type metadata of a resource must be obtained and interpreted in a manner consistent with the requirements of MIME Sniffing. [MIMESNIFF]

The computed MIME type of a resource must be found in a manner consistent with the requirements given in MIME Sniffing. [MIMESNIFF]

The rules for sniffing images specifically, the rules for distinguishing if a resource is text or binary, and the rules for sniffing audio and video specifically are also defined in MIME Sniffing. These rules return a MIME type as their result. [MIMESNIFF]

It is imperative that the rules in MIME Sniffing be followed exactly. When a user agent uses different heuristics for content type detection than the server expects, security problems can occur. For more details, see MIME Sniffing. [MIMESNIFF]

2.5.3 Extracting character encodings from `meta` elements

The algorithm for extracting a character encoding from a meta element, given a string s, is as follows. It either returns a character encoding or nothing.

Let position be a pointer into s, initially pointing at the start of the string.
Loop: Find the first seven characters in s after position that are an ASCII case-insensitive match for the word "charset". If no such match is found, return nothing.
Skip any ASCII whitespace that immediately follows the word "charset" (there might not be any).
If the next character is not a U+003D EQUALS SIGN (=), then move position to point just before that next character, and jump back to the step labeled loop.
Skip any ASCII whitespace that immediately follows the equals sign (there might not be any).
Process the next character as follows:

- If it is a U+0022 QUOTATION MARK character (") and there is a later U+0022 QUOTATION MARK character (") in s
- If it is a U+0027 APOSTROPHE character (') and there is a later U+0027 APOSTROPHE character (') in s
  Return the result of getting an encoding from the substring that is between this character and the next earliest occurrence of this character.
- If it is an unmatched U+0022 QUOTATION MARK character (")
- If it is an unmatched U+0027 APOSTROPHE character (')
- If there is no next character
  Return nothing.
- Otherwise
  Return the result of getting an encoding from the substring that consists of this character up to but not including the first ASCII whitespace or U+003B SEMICOLON character (;), or the end of s, whichever comes first.

This algorithm is distinct from those in the HTTP specifications (for example, HTTP doesn't allow the use of single quotes and requires supporting a backslash-escape mechanism that is not supported by this algorithm). While the algorithm is used in contexts that, historically, were related to HTTP, the syntax as supported by implementations diverged some time ago. [HTTP]
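The extraction steps above can be sketched as follows. One simplification is labeled as an assumption: the function returns the raw label substring (or None for "nothing") and leaves the final "getting an encoding" lookup, which is defined elsewhere, to the caller. The function name is hypothetical.

```python
import re
from typing import Optional

ASCII_WHITESPACE = "\t\n\x0c\r "
# re.ASCII keeps the case-insensitive match restricted to ASCII letters.
CHARSET = re.compile("charset", re.IGNORECASE | re.ASCII)


def extract_charset_label(s: str) -> Optional[str]:
    position = 0
    while True:  # the step labeled "loop"
        match = CHARSET.search(s, position)
        if match is None:
            return None
        i = match.end()
        while i < len(s) and s[i] in ASCII_WHITESPACE:
            i += 1
        if i >= len(s) or s[i] != "=":
            # Resume the search just before the character that was not "=".
            position = i
            continue
        i += 1
        while i < len(s) and s[i] in ASCII_WHITESPACE:
            i += 1
        if i >= len(s):
            return None  # no next character
        if s[i] in "\"'":
            closing = s.find(s[i], i + 1)
            if closing == -1:
                return None  # unmatched quote
            return s[i + 1:closing]
        end = i
        while end < len(s) and s[end] not in ASCII_WHITESPACE and s[end] != ";":
            end += 1
        return s[i:end]
```

For instance, `extract_charset_label('text/html; charset = "windows-1252"; x')` yields the label between the quotes, and an input whose only "charset" is followed by an unmatched quote yields None.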

2.5.4 CORS settings attributes

A CORS settings attribute is an enumerated attribute with the following keywords and states:

Keyword	State	Brief description
`anonymous`	Anonymous	Requests for the element will have their mode set to "`cors`" and their credentials mode set to "`same-origin`".
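(the empty string)	Anonymous	Requests for the element will have their mode set to "`cors`" and their credentials mode set to "`same-origin`".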
`use-credentials`	Use Credentials	Requests for the element will have their mode set to "`cors`" and their credentials mode set to "`include`".

The attribute's missing value default is the No CORS state, and its invalid value default is the Anonymous state. For the purposes of reflection, the canonical keyword for the Anonymous state is the anonymous keyword.

The majority of fetches governed by CORS settings attributes will be done via the create a potential-CORS request algorithm.

For more modern features, where the request's mode is always "cors", certain CORS settings attributes have been repurposed to have a slightly different meaning, wherein they only impact the request's credentials mode. To perform this translation, we define the CORS settings attribute credentials mode for a given CORS settings attribute to be determined by switching on the attribute's state:

No CORS or Anonymous: "same-origin"
Use Credentials: "include"
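To make the attribute's state resolution and the credentials mode mapping concrete, here is a small sketch. The enum and function names are hypothetical; the keyword comparison is ASCII case-insensitive, as is usual for enumerated attributes, and `None` stands for a missing attribute.

```python
import string
from enum import Enum
from typing import Optional

_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


class CorsState(Enum):
    NO_CORS = "No CORS"
    ANONYMOUS = "Anonymous"
    USE_CREDENTIALS = "Use Credentials"


def cors_settings_state(attribute_value: Optional[str]) -> CorsState:
    if attribute_value is None:
        return CorsState.NO_CORS  # missing value default
    keyword = attribute_value.translate(_ASCII_LOWER)
    if keyword in ("anonymous", ""):
        return CorsState.ANONYMOUS
    if keyword == "use-credentials":
        return CorsState.USE_CREDENTIALS
    return CorsState.ANONYMOUS  # invalid value default


def cors_settings_credentials_mode(state: CorsState) -> str:
    # No CORS and Anonymous both map to "same-origin".
    if state is CorsState.USE_CREDENTIALS:
        return "include"
    return "same-origin"
```

An unrecognized value such as "yes" therefore behaves like `anonymous`, while omitting the attribute entirely selects the No CORS state.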

2.5.5 Referrer policy attributes

A referrer policy attribute is an enumerated attribute. Each referrer policy, including the empty string, is a keyword for this attribute, mapping to a state of the same name.

The attribute's missing value default and invalid value default are both the empty string state.

The impact of these states on the processing model of various fetches is defined in more detail throughout this specification, in Fetch, and in Referrer Policy. [FETCH] [REFERRERPOLICY]

Several signals can contribute to which processing model is used for a given fetch; a referrer policy attribute is only one of them. In general, the order in which these signals are processed is:

First, the presence of a noreferrer link type;
Then, the value of a referrer policy attribute;
Then, the presence of any meta element with name attribute set to referrer;
Finally, the `Referrer-Policy` HTTP header.

2.5.6 Nonce attributes

A nonce content attribute represents a cryptographic nonce ("number used once") which can be used by Content Security Policy to determine whether or not a given fetch will be allowed to proceed. The value is text. [CSP]

Elements that have a nonce content attribute ensure that the cryptographic nonce is only exposed to script (and not to side-channels like CSS attribute selectors) by taking the value from the content attribute, moving it into an internal slot named [[CryptographicNonce]], exposing it to script via the HTMLOrSVGElement interface mixin, and setting the content attribute to the empty string. Unless otherwise specified, the slot's value is the empty string.

element.nonce: Returns the value set for element's cryptographic nonce. If the setter was not used, this will be the value originally found in the nonce content attribute.
element.nonce = value: Updates element's cryptographic nonce value.

The nonce IDL attribute must, on getting, return the value of this element's [[CryptographicNonce]]; and on setting, set this element's [[CryptographicNonce]] to the given value.

Note how the setter for the nonce IDL attribute does not update the corresponding content attribute. This, as well as the below setting of the nonce content attribute to the empty string when an element becomes browsing-context connected, is meant to prevent exfiltration of the nonce value through mechanisms that can easily read content attributes, such as selectors. Learn more in issue #2369, where this behavior was introduced.

The following attribute change steps are used for the nonce content attribute:

If element does not include HTMLOrSVGElement, then return.
If localName is not nonce or namespace is not null, then return.
If value is null, then set element's [[CryptographicNonce]] to the empty string.
Otherwise, set element's [[CryptographicNonce]] to value.

Whenever an element including HTMLOrSVGElement becomes browsing-context connected, the user agent must execute the following steps on the element:

Let CSP list be element's shadow-including root's policy container's CSP list.
If CSP list contains a header-delivered Content Security Policy, and element has a nonce content attribute attr whose value is not the empty string, then:
1. Let nonce be element's [[CryptographicNonce]].
2. Set an attribute value for element using "nonce" and the empty string.
3. Set element's [[CryptographicNonce]] to nonce.
   If element's [[CryptographicNonce]] were not restored, it would be the empty string at this point, because the attribute change steps ran when the content attribute was set to the empty string.

The cloning steps for elements that include HTMLOrSVGElement must set the [[CryptographicNonce]] slot on the copy to the value of the slot on the element being cloned.
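The interplay between the content attribute, the internal slot, and the IDL attribute can be modeled as below. This is a toy model under stated assumptions: `ModelElement`, its methods, and the boolean `has_header_delivered_csp` parameter are hypothetical simplifications, and namespaces and the HTMLOrSVGElement check are omitted.

```python
from typing import Dict, Optional


class ModelElement:
    """Toy element with a nonce content attribute and a [[CryptographicNonce]] slot."""

    def __init__(self) -> None:
        self.attributes: Dict[str, str] = {}
        self._cryptographic_nonce = ""  # the slot defaults to the empty string

    # Content attribute handling, which triggers the attribute change steps.
    def set_attribute(self, name: str, value: str) -> None:
        self.attributes[name] = value
        self._attribute_change_steps(name, value)

    def remove_attribute(self, name: str) -> None:
        self.attributes.pop(name, None)
        self._attribute_change_steps(name, None)

    def _attribute_change_steps(self, local_name: str, value: Optional[str]) -> None:
        if local_name != "nonce":
            return
        self._cryptographic_nonce = "" if value is None else value

    # IDL attribute: reads and writes the slot only, never the content attribute.
    @property
    def nonce(self) -> str:
        return self._cryptographic_nonce

    @nonce.setter
    def nonce(self, value: str) -> None:
        self._cryptographic_nonce = value

    def becomes_browsing_context_connected(self, has_header_delivered_csp: bool) -> None:
        if has_header_delivered_csp and self.attributes.get("nonce"):
            nonce = self._cryptographic_nonce
            self.set_attribute("nonce", "")    # hides the value from selectors
            self._cryptographic_nonce = nonce  # restore what the change steps cleared

    def clone(self) -> "ModelElement":
        copy = ModelElement()
        copy.attributes = dict(self.attributes)
        copy._cryptographic_nonce = self._cryptographic_nonce
        return copy
```

After connection under a header-delivered policy, the content attribute reads as the empty string while `nonce` still returns the original value, and assigning to `nonce` never touches the content attribute.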

2.5.7 Lazy loading attributes

A lazy loading attribute is an enumerated attribute with the following keywords and states:

Keyword	State	Brief description
`lazy`	Lazy	Used to defer fetching a resource until some conditions are met.
`eager`	Eager	Used to fetch a resource immediately; the default state.

The attribute directs the user agent to fetch a resource immediately or to defer fetching until some conditions associated with the element are met, according to the attribute's current state.

The attribute's missing value default and invalid value default are both the Eager state.

The will lazy load element steps, given an element element, are as follows:

If scripting is disabled for element, then return false.

This is an anti-tracking measure, because if a user agent supported lazy loading when scripting is disabled, it would still be possible for a site to track a user's approximate scroll position throughout a session, by strategically placing images in a page's markup such that a server can track how many images are requested and when.
If element's lazy loading attribute is in the Lazy state, then return true.
Return false.
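The will lazy load element steps, together with the attribute's defaults, reduce to a short predicate. In this sketch the function names are hypothetical, `None` stands for a missing attribute, and whether scripting is enabled is passed in as a plain boolean.

```python
import string
from typing import Optional

_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def lazy_loading_state(attribute_value: Optional[str]) -> str:
    # Missing and invalid values both fall back to the Eager state.
    if attribute_value is not None and attribute_value.translate(_ASCII_LOWER) == "lazy":
        return "Lazy"
    return "Eager"


def will_lazy_load(attribute_value: Optional[str], scripting_enabled: bool) -> bool:
    if not scripting_enabled:
        return False  # anti-tracking measure described above
    return lazy_loading_state(attribute_value) == "Lazy"
```

So `loading="lazy"` only defers fetching when scripting is enabled for the element; with scripting disabled the element loads eagerly regardless of the attribute.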

Each img and iframe element has associated lazy load resumption steps, initially null.

For img and iframe elements that will lazy load, these steps are run from the lazy load intersection observer's callback or when their lazy loading attribute is set to the Eager state. This causes the element to continue loading.

Each Document has a lazy load intersection observer, which is initially null but can be set to an IntersectionObserver instance.

To start intersection-observing a lazy loading element element, run these steps:

Let doc be element's node document.
If doc's lazy load intersection observer is null, set it to a new IntersectionObserver instance, initialized as follows:

The intention is to use the original value of the IntersectionObserver constructor. However, we're forced to use the JavaScript-exposed constructor in this specification, until Intersection Observer exposes low-level hooks for use in specifications. See bug w3c/IntersectionObserver#464 which tracks this. [INTERSECTIONOBSERVER]
- The callback is these steps, with arguments entries and observer:
  1. For each entry in entries using a method of iteration which does not trigger developer-modifiable array accessors or iteration hooks:
    1. Let resumptionSteps be null.
    2. If entry.isIntersecting is true, then set resumptionSteps to entry.target's lazy load resumption steps.
    3. If resumptionSteps is null, then return.
    4. Stop intersection-observing a lazy loading element for entry.target.
    5. Set entry.target's lazy load resumption steps to null.
    6. Invoke resumptionSteps.
    The intention is to use the original value of the isIntersecting and target getters. See w3c/IntersectionObserver#464. [INTERSECTIONOBSERVER]
- The options is an IntersectionObserverInit dictionary with the following dictionary members: «[ "scrollMargin" → lazy load scroll margin ]»
  
  This allows for fetching the image during scrolling, when it does not yet — but is about to — intersect the viewport.
  
  The lazy load scroll margin suggestions imply dynamic changes to the value, but the IntersectionObserver API does not support changing the scroll margin. See issue w3c/IntersectionObserver#428.
Call doc's lazy load intersection observer's observe method with element as the argument.

The intention is to use the original value of the observe method. See w3c/IntersectionObserver#464. [INTERSECTIONOBSERVER]

To stop intersection-observing a lazy loading element element, run these steps:

Let doc be element's node document.
Assert: doc's lazy load intersection observer is not null.
Call doc's lazy load intersection observer's unobserve method with element as the argument.

The intention is to use the original value of the unobserve method. See w3c/IntersectionObserver#464. [INTERSECTIONOBSERVER]

The lazy load scroll margin is an implementation-defined value, but with the following suggestions to consider:

Set a minimum value that most often results in the resources being loaded before they intersect the viewport under normal usage patterns for the given device.
The typical scrolling speed: increase the value for devices with faster typical scrolling speeds.
The current scrolling speed or momentum: the UA can attempt to predict where the scrolling will likely stop, and adjust the value accordingly.
The network quality: increase the value for slow or high-latency connections.
User preferences can influence the value.

It is important for privacy that the lazy load scroll margin not leak additional information. For example, the typical scrolling speed on the current device could be imprecise so as to not introduce a new fingerprinting vector.

2.5.8 Blocking attributes

A blocking attribute explicitly indicates that certain operations should be blocked on the fetching of an external resource. The operations that can be blocked are represented by possible blocking tokens, which are the strings listed in the following table:

Possible blocking token	Description
"`render`"	The element is potentially render-blocking.

In the future, there might be more possible blocking tokens.

A blocking attribute must have a value that is an unordered set of unique space-separated tokens, each of which is a possible blocking token. The supported tokens of a blocking attribute are the possible blocking tokens. Any element can have at most one blocking attribute.

The blocking tokens set for an element el are the result of the following steps:

Let value be the value of el's blocking attribute, or the empty string if no such attribute exists.
Set value to value, converted to ASCII lowercase.
Let rawTokens be the result of splitting value on ASCII whitespace.
Return a set containing the elements of rawTokens that are possible blocking tokens.

An element is potentially render-blocking if its blocking tokens set contains "render", or if it is implicitly potentially render-blocking, which will be defined at the individual elements. By default, an element is not implicitly potentially render-blocking.
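The blocking tokens set steps can be sketched as follows. The function names and the `implicitly_render_blocking` parameter are hypothetical; `None` stands for an element without a blocking attribute.

```python
import re
import string
from typing import Optional, Set

POSSIBLE_BLOCKING_TOKENS = frozenset({"render"})
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)
_ASCII_WHITESPACE_RUN = re.compile("[\t\n\x0c\r ]+")


def blocking_tokens_set(attribute_value: Optional[str]) -> Set[str]:
    value = "" if attribute_value is None else attribute_value
    value = value.translate(_ASCII_LOWER)
    # Split on ASCII whitespace, dropping the empty strings re.split can produce.
    raw_tokens = [token for token in _ASCII_WHITESPACE_RUN.split(value) if token]
    return {token for token in raw_tokens if token in POSSIBLE_BLOCKING_TOKENS}


def is_potentially_render_blocking(
    attribute_value: Optional[str], implicitly_render_blocking: bool = False
) -> bool:
    return "render" in blocking_tokens_set(attribute_value) or implicitly_render_blocking
```

Unknown tokens are simply dropped, so a value such as "Render foo" still produces a set containing only "render".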

2.5.9 Fetch priority attributes

A fetch priority attribute is an enumerated attribute with the following keywords and states:

Keyword	State	Brief description
`high`	high	Signals a high-priority fetch relative to other resources with the same destination.
`low`	low	Signals a low-priority fetch relative to other resources with the same destination.
`auto`	auto	Signals automatic determination of fetch priority relative to other resources with the same destination.

The attribute's missing value default and invalid value default are both the auto state.
