In computer
hypertext
, a URI fragment is a
string
of
characters that refers to a
resource
that is subordinate to another, primary resource. The primary resource is identified by a
Uniform Resource Identifier
(URI), and the fragment identifier points to the subordinate resource.
The fragment identifier introduced by a
hash mark #
is the optional last part of a
URL
for a document. It is typically used to identify a portion of that document. The generic syntax is specified in
RFC 3986. The hash mark separator in URIs is not part of the fragment identifier.
Basics
In URIs, a hash mark
#
introduces the optional fragment at the end of the URI. The generic
RFC 3986 syntax for URIs also allows an optional
query part introduced by a question mark
?
. In URIs with a query and a fragment, the fragment follows the query. Query parts depend on the URI scheme and are evaluated by the server; for example,
http:
supports queries, unlike
ftp:
. Fragments depend on the document
MIME type and are evaluated by the client (
web browser
). Clients are not supposed to send URI fragments to servers when they retrieve a document.
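The split between query and fragment can be illustrated with a minimal sketch using Python's standard `urllib.parse` module. The URI below is a made-up example, not one taken from the text; the point is only that the query sits between `?` and `#`, and that the part a client would send to the server excludes the fragment.

```python
from urllib.parse import urlsplit, urldefrag

# Hypothetical example URI with both a query and a fragment.
uri = "http://www.example.org/search?q=fragment#results"

parts = urlsplit(uri)
# The query is the text between "?" and "#"; the fragment follows "#".
print(parts.query)     # q=fragment
print(parts.fragment)  # results

# urldefrag separates the URI a client would request from the fragment
# that stays on the client side.
request_uri, fragment = urldefrag(uri)
print(request_uri)  # http://www.example.org/search?q=fragment
```

Note that the `#` separator itself appears in neither value, matching the rule that the hash mark is not part of the fragment identifier.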
A URI ending with
#
is permitted by the generic syntax and is a kind of empty fragment. In MIME document types such as
text/html
or any XML type, empty identifiers to match this syntactically legal construct are not permitted. Web browsers typically display the top of the document for an empty fragment.
The fragment identifier functions differently from the rest of the URI: its processing is exclusively
client-side, with no participation from the
web server
, though the server typically helps to determine the MIME type, and the MIME type determines the processing of fragments. When an
agent (such as a web browser)
requests a
web resource from a web server, the agent sends the URI to the server, but does not send the fragment. Instead, the agent waits for the server to send the resource, and then the agent processes the resource according to the document type and fragment value.
In an HTML web page, the agent will look for an anchor identified with an HTML tag that includes an
id=
or
name=
attribute equal to the fragment identifier.
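The lookup described above can be sketched with Python's standard `html.parser` module. This is a simplified illustration, not how any particular browser is implemented: the class name `AnchorFinder` and the helper `find_target` are hypothetical, and preferring an `id` match over a legacy `<a name="...">` match is an assumption made for the sketch.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Record the first element whose id (or, for <a>, name) equals the fragment."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.id_match = None    # tag name of the first id="..." match
        self.name_match = None  # "a" if an <a name="..."> matched

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.id_match is None and attributes.get("id") == self.fragment:
            self.id_match = tag
        if (self.name_match is None and tag == "a"
                and attributes.get("name") == self.fragment):
            self.name_match = tag


def find_target(html, fragment):
    """Return the tag name of the element the fragment points at, or None."""
    finder = AnchorFinder(fragment)
    finder.feed(html)
    finder.close()
    # Assumption: an id match wins over the legacy <a name="..."> form.
    return finder.id_match or finder.name_match


page = '<h1 id="top">Title</h1><a name="bar"></a><p id="bar">text</p><a name="old">x</a>'
print(find_target(page, "bar"))      # p
print(find_target(page, "old"))      # a
print(find_target(page, "missing"))  # None
```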
Examples
* In URIs for MIME
text/html
pages such as
http://www.example.org/foo.html#bar
the fragment refers to the element with
id="bar"
.
** Graphical Web browsers typically scroll to position pages so that the top of the element identified by the fragment id is aligned with the top of the viewport; thus fragment identifiers are often used in tables of contents.
** The appearance of the identified element can be changed through the
:target
CSS pseudoclass.
Wikipedia
uses this to highlight the selected reference. Notably, CSS
display: block
can be used to show content only when it is the target, with
display: none
hiding it otherwise.
** The
name
attribute of the
<a> element served the same purpose, but is now obsolete in favor of the
id
attribute, which can be applied to any element.
* In all
XML
document types including
XHTML
, fragments corresponding to an
xml:id
or similar
id
attribute follow the
Name
syntax and begin with a letter, underscore, or colon. Notably, they cannot begin with a digit or hyphen.
**
xml:id
is one of the few generic XML attributes (another is
xml:lang
) that can be used without explicitly declaring a namespace. In XHTML
id
can also be used and seems to be preferred, because XHTML was specified before
xml:id
existed.
* In XML applications, fragment identifiers in a certain syntax can be
XPointers; for example, the fragment identifier in the URI
http://www.example.org/foo.xml#xpointer(//Rube)
refers to all XML elements named "Rube" in the document identified by the URI
http://www.example.org/foo.xml. An XPointer processor, given that URI, would obtain a representation of the document (such as by requesting it from the Internet) and would return a representation of the document's "Rube" elements.
* In
RDF vocabularies, such as
RDFS,
OWL, or
SKOS
, fragment identifiers are used to identify resources in the same
XML Namespace
, but do not necessarily correspond to a specific part of a document. For example,
http://www.w3.org/2004/02/skos/core#broader
identifies the concept "broader" in SKOS Core vocabulary, but it does not refer to a specific part of the resource identified by
http://www.w3.org/2004/02/skos/core
, a complete RDF file in which the semantics of this specific concept are declared, along with other concepts in the same vocabulary.
* In URIs for MIME
text/plain
documents,
RFC 5147 specifies a fragment identifier for the character and line positions and ranges within the document using the keywords "
char
" and "
line
", and an integrity check can be added, either "
length
" or "
md5
". Browser support seems lacking. Because RFC 5147 counts positions from zero, the following example identifies lines 11 through 20 of a text document:
**
http://example.com/document.txt#line=10,20
* In URIs for MIME
text/csv
documents,
RFC 7111 specifies a fragment identifier as a selector for rows, columns, and cells using the keywords "
row
", "
col
", and "
cell
". For example:
**
http://example.com/data.csv#row=4
– Selects the 4th row.
**
http://example.com/data.csv#col=2
– Selects the 2nd column.
**
http://example.com/data.csv#row=5-7
– Selects three consecutive rows starting with the 5th row.
**
http://example.com/data.csv#row=5-*
– Selects all rows starting with the 5th row.
**
http://example.com/data.csv#cell=4,1-6,2
– Selects a region that starts at the 4th row and the 1st column and ends at the 6th row and the 2nd column.
* In URIs for MIME audio/*, image/*, video/* documents, very few have defined fragments or fragment semantics. The Media Fragments URI 1.0 (basic) syntax supports addressing a media resource along two dimensions (temporal and spatial) using the keywords
t
and
xywh
, and Media Fragments URI 1.0 (advanced) adds
track
and
id
. Therefore, one can use the following media fragments URI in the
src
attribute of the
audio
or
video
HTML5
element:
**
http://example.com/foo.mp4#t=10,20
(this indicates the time interval starting at 10 seconds and ending before 20 seconds)
**
http://example.com/bar.webm#t=40,80&xywh=160,120,320,240
** The specification also allows hours, minutes (must be 2 digits), and seconds (must be 2 digits) to be given using colons, and milliseconds using a decimal point. Other time schemes can be specified through prefixes, with
npt:
(Normal Play Time) being the default.
** Other websites use the fragment part to pass some extra information to scripts running on them – for example,
Google Video understands permalinks in the format of
#01h25m30s
to start playing at the specified position, and
YouTube
uses similar code such as
#t=3m25s
.
* In
JavaScript
, the fragment identifier of the current HTML or XHTML page can be accessed in the "hash" property
location.hash
(JavaScript can also be used with other document types). With the rise of
AJAX
, some websites use fragment identifiers to emulate the back button behavior of browsers for page changes that do not require a reload, or to emulate subpages.
** For example,
Gmail
uses a single URL for almost every interface (mailboxes, individual mails, search results, settings); the fragment is used to make these interfaces directly linkable.
**
Adobe Flash
websites can use the fragment part to inform the user about the state of the website or
web application
, and to facilitate
deep linking, commonly with the help of the SWFAddress
JavaScript library
.
* A URI that links to a
JSON
document can specify a pointer to a specific value.
** For example, a URL ending in
#/foo
could be used to extract the value of the key foo from a key-value pair in the document.
* In URIs for MIME
application/pdf
documents, PDF viewers recognize a number of fragment identifiers. For instance, a URL ending in
.pdf#page=35
will cause most readers to open the PDF and scroll to page 35. Several other parameters are possible, including
#nameddest=
(similar to HTML anchors),
#search="word1 word2"
,
#zoom=
, etc. Multiple parameters can be combined with ampersands:
**
http://example.org/doc.pdf#view=fitb&nameddest=Chapter3
.
* In
SVG, fragments are allowed to specify arguments such as
viewBox()
,
preserveAspectRatio()
, and
transform()
.
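Several of the syntaxes listed above (Media Fragments and the PDF parameters) write the fragment as name=value pairs joined with ampersands. The following is a minimal sketch of splitting such a fragment with Python's standard library. The helper names `fragment_parameters` and `temporal_range` are hypothetical, the temporal helper only understands plain seconds (no colon notation or `npt:` prefix), and reusing the query-string parser here is a convenience assumption rather than something the specifications prescribe.

```python
from urllib.parse import urlsplit, parse_qsl


def fragment_parameters(uri):
    """Split a fragment such as 't=40,80&xywh=160,120,320,240' into a dict."""
    fragment = urlsplit(uri).fragment
    # parse_qsl splits on "&" and "=" and percent-decodes the values.
    return dict(parse_qsl(fragment, keep_blank_values=True))


def temporal_range(uri):
    """Return (start, end) in seconds for a plain-seconds t= value, else None."""
    value = fragment_parameters(uri).get("t")
    if value is None:
        return None
    start, _, end = value.partition(",")
    # A missing start means "from the beginning"; a missing end is left open.
    return (float(start) if start else 0.0, float(end) if end else None)


print(fragment_parameters("http://example.com/bar.webm#t=40,80&xywh=160,120,320,240"))
print(temporal_range("http://example.com/foo.mp4#t=10,20"))
print(fragment_parameters("http://example.org/doc.pdf#view=fitb&nameddest=Chapter3"))
```

Because the fragment never reaches the server, this kind of parsing would happen in the client or in a script running on the page.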
Proposals
Several proposals have been made for fragment identifiers for use with plain text documents (which cannot store anchor metadata), or to refer to locations within HTML documents in which the author has not used anchor tags:
* As of September 2012 the Media Fragments URI 1.0 (basic) is a
W3C
Recommendation.
* Chrome versions 80 and above
and Firefox versions 131 and above implement
W3C
's
WICG ''Text Fragments,'' so
#:~:text=foo
will cause the browser to search for
foo
, highlight the matching text, and scroll to it. Besides the start and end of the text to match, the snippet can also specify a context: text that must precede or follow
foo
but will not be highlighted (for example,
#:~:text=night-,vision
finds 'vision' preceded by 'night').
* The
Python Package Index appends the
MD5
hash of a file to the URL as a fragment identifier. Since MD5 is a
broken hash function, the hash cannot be relied on to ensure the
integrity
of the package.
*:
https://pypi.python.org ... zodbbrowser-0.3.1.tar.gz#md5=38dc89f294b24691d3f0d893ed3c119c
* A
hash-bang fragment is a fragment starting with an exclamation mark
!
. It was used in a now-deprecated approach to index dynamic
single-page applications. An
exclamation mark
is illegal in
HTML4, XHTML, and XML identifiers, granting a certain degree of separation from that functionality. However, it is allowed in
HTML5
.
** Between 2009 and 2015,
Google Webmaster Central proposed and then recommended an "AJAX crawling scheme"
using an initial exclamation mark in fragment identifiers for stateful
AJAX
pages:
http://example.com/page?query#!state
** Under this scheme, crawlers requested such pages with
#!
replaced by
?_escaped_fragment_=
** Hash-bang URIs have been considered problematic by a number of writers including Jeni Tennison at the W3C because they make pages inaccessible to those who do not have
JavaScript
activated in their browser. They also break
HTTP referer
headers as browsers are not allowed to send the fragment identifier in the Referer header.
** In 2015, Google deprecated its hash-bang AJAX crawling proposal, recommending instead the use of
progressive enhancement
and
HTML5
's
history.pushState()
method.
**
Mozilla Foundation
employee Gervase Markham has proposed a fragment identifier for searching, of the form
#!s!search terms
. Adding a number after the s (
#!s10!
) indicates that the browser should search for the ''n''th occurrence of the search term. A negative number (
#!s-3!
) starts searching backwards from the end of the document. A
Greasemonkey script is available to add this functionality to compatible browsers.
**:
http://example.com/index.html#!s3!search terms
* Erik Wilde and Marcel Baschnagel of the
ETH Zurich
extend this to also identify fragments in plain text documents using
regular expressions
, with the keyword "
match
". They also describe a prototype implementation as an extension for the
Firefox
browser. For example, the following would find the case-insensitive text "RFC" anywhere in the document:
*:
http://example.com/document.txt#match=[rR][fF][cC]
* K. Yee of the
Foresight Institute
proposes "extended fragment identifiers" delimited with
colons and a keyword to differentiate them from anchor identifiers. A text search fragment identifier with "fragment specification scheme" id "
words
" is the first proposal in this scheme. The following example would search a document for the first occurrence of the string "some context for a search term" and then highlight the words "search term":
*:
http://example.com/index.html#:words:some-context-for-a-(search-term)
** Chrome version 80 implemented a comparable text-search fragment, though with the WICG Text Fragments syntax (#:~:text=) rather than this one.
* The LiveURLs project proposed a fragment identifier format for referring to a region of text within a page, of the form
#FWS+C
, where ''F'' is the length of the first word (up to five characters), ''W'' is the first word itself, ''S'' is the length of the selected text and ''C'' is a 32-bit
CRC of the selected text. They implemented a variant of this scheme as an extension for the Firefox browser, using the form
#LFWS+C
, where ''L'' is the length of the fragment itself, in two
hex digits. Linking to the word "Fragment" using the implemented variant would yield:
*:
http://example.com/index.html#115Fragm8+-52f89c4c
* Up until Firefox 5, Firefox supported XPath links such as #xpath:/html/body/div
, which could be used in conjunction with a bookmarklet such as http://antimatter15.com/wp/2009/11/xpath-bookmark-bookmarklet/ to link within HTML documents that lacked proper IDs. This feature was removed as part of a code cleanup (https://bugzilla.mozilla.org/show_bug.cgi?id=457102).
* In the
ePub
electronic book format, the EPUB Canonical Fragment Identifier (epubcfi,
2011-2017) defines a
W3C
/
IDPF-standardized method for referencing arbitrary content using fragment identifiers to locate non-anchored text ranges via document structure and
pattern matching
. These dynamic deep links assist in locating content after text is updated and are used, for example, in
Apple Books.
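As an illustration of the Text Fragments syntax mentioned in the list above, the following is a minimal sketch that assembles a `#:~:text=` fragment from a start term, an optional end term, and optional prefix and suffix context. The function names are hypothetical and the encoding choice is an assumption: every reserved character is percent-encoded, and the hyphen is encoded as well because the syntax uses it to mark context terms.

```python
from urllib.parse import quote


def encode_term(text):
    # Percent-encode reserved characters; "-" is also encoded because the
    # directive syntax uses it to mark prefix and suffix context.
    return quote(text, safe="").replace("-", "%2D")


def text_fragment(start, end=None, prefix=None, suffix=None):
    """Build a fragment of the form #:~:text=[prefix-,]start[,end][,-suffix]."""
    parts = []
    if prefix is not None:
        parts.append(encode_term(prefix) + "-")
    parts.append(encode_term(start))
    if end is not None:
        parts.append(encode_term(end))
    if suffix is not None:
        parts.append("-" + encode_term(suffix))
    return "#:~:text=" + ",".join(parts)


print(text_fragment("foo"))                     # #:~:text=foo
print(text_fragment("vision", prefix="night"))  # #:~:text=night-,vision
print(text_fragment("search term"))             # #:~:text=search%20term
```

The second call reproduces the 'vision' preceded by 'night' example; whether a given browser honours the result depends on its Text Fragments support.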
See also
*
Query string
*
URI normalization
*
URL
(Uniform Resource Locator)
*
Clean URL
*
URI scheme
External links
* W3C
Media Fragments Working Group, establishing a URI syntax and semantics to address media fragments in audiovisual material (such as a region in an image or a sub-clip of a video)
* MediaMixer
Community Portal collects presentations, tutorials, use cases and demonstrators related to use of Media Fragment technology