A uniform resource locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably. URLs most commonly reference web pages (HTTP/HTTPS) but are also used for file transfer (FTP), email (mailto), database access (JDBC), and many other applications.
Most web browsers display the URL of a web page above the page in an address bar. A typical URL could have the form http://www.example.com/index.html, which indicates a protocol (http), a hostname (www.example.com), and a file name (index.html).
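The three parts named above can be pulled out of the example URL programmatically. The following is a minimal sketch using Python's standard urllib.parse module; the variable names are illustrative only, and the file name is taken here as the last segment of the path, which is an assumption that holds for this simple example.

```python
from posixpath import basename
from urllib.parse import urlsplit

# Split the example URL from the text into its components.
parts = urlsplit("http://www.example.com/index.html")

protocol = parts.scheme            # the part before "://"
hostname = parts.hostname          # the host named after the double slash
file_name = basename(parts.path)   # last segment of the path "/index.html"

print(protocol)   # http
print(hostname)   # www.example.com
print(file_name)  # index.html
```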
History
Uniform Resource Locators were defined in 1994 by Tim Berners-Lee, the inventor of the World Wide Web, and the URI working group of the Internet Engineering Task Force (IETF), as an outcome of collaboration started at the IETF Living Documents birds of a feather session in 1992.
The format combines the pre-existing system of domain names (created in 1985) with file path syntax, where slashes are used to separate directory and filenames. Conventions already existed where server names could be prefixed to complete file paths, preceded by a double slash (//).
Berners-Lee later expressed regret at the use of dots to separate the parts of the domain name within URIs, wishing he had used slashes throughout, and also said that, given the colon following the first component of a URI, the two slashes before the domain name were unnecessary.
Early WorldWideWeb collaborators including Berners-Lee originally proposed the use of UDIs: Universal Document Identifiers. An early (1993) draft of the HTML Specification referred to "Universal" Resource Locators. This was dropped some time between June 1994 and October 1994 (draft-ietf-uri-url-08.txt). In his book Weaving the Web, Berners-Lee emphasizes his preference for the original inclusion of "universal" in the expansion rather than the word "uniform", to which it was later changed, and he gives a brief account of the contention that led to the change.
Syntax
Every HTTP URL conforms to the syntax of a generic URI.
A web browser will usually dereference a URL by performing an HTTP request to the specified host, by default on port number 80. URLs using the https scheme require that requests and responses be made over a secure connection to the website.
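As a rough illustration of the default-port rule, the sketch below picks the port a client would connect to when the URL does not name one. The helper effective_port and the DEFAULT_PORTS table are hypothetical names introduced for this example; port 80 for http comes from the text above, while 443 for https is the well-known default and is not stated in this section.

```python
from urllib.parse import urlsplit

# Assumed defaults: 80 for http (per the text), 443 for https (well-known default).
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the explicit port in the URL, else the scheme's default (or None)."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://www.example.com/index.html"))   # 80
print(effective_port("http://www.example.com:8080/"))        # 8080
print(effective_port("https://www.example.com/"))            # 443
```

An explicit port in the URL always takes precedence; the table is consulted only when none is given.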
Internationalized URL
Internet users are distributed throughout the world using a wide variety of languages and alphabets, and expect to be able to create URLs in their own local alphabets. An Internationalized Resource Identifier (IRI) is a form of URL that includes Unicode characters. All modern browsers support IRIs. The parts of the URL requiring special treatment for different alphabets are the domain name and path.
The domain name in the IRI is known as an Internationalized Domain Name (IDN). Web and Internet software automatically convert the domain name into Punycode usable by the Domain Name System; for example, the Chinese URL http://例子.卷筒纸 becomes http://xn--fsqu00a.xn--3lr804guic/. The xn-- prefix indicates that the label originally contained non-ASCII characters.
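The domain-name conversion can be tried with Python's built-in "idna" codec, as in the sketch below. This is only an approximation of what browsers do: the codec implements the older IDNA 2003 rules, and real software may apply newer rules or extra checks. The variable names are illustrative.

```python
# Convert the example domain from the text to its ASCII (Punycode) form.
unicode_host = "例子.卷筒纸"

# Each non-ASCII label is encoded separately and prefixed with "xn--".
ascii_host = unicode_host.encode("idna").decode("ascii")
labels = ascii_host.split(".")

# Decoding reverses the conversion, recovering the original name.
round_trip = ascii_host.encode("ascii").decode("idna")

print(ascii_host)   # compare with the ASCII form quoted in the text above
print(round_trip)
```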
The URL path name can also be specified by the user in the local writing system. If not already encoded, it is converted to UTF-8, and any characters not part of the basic URL character set are escaped as hexadecimal using percent-encoding; for example, the Japanese URL http://example.com/引き割り.html becomes http://example.com/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html. The target computer decodes the address and displays the page.
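The path conversion described above can be reproduced with the quote and unquote functions from Python's standard urllib.parse module. This is a minimal sketch applied to the path of the Japanese example only; by default quote encodes text as UTF-8 and leaves the slash unescaped.

```python
from urllib.parse import quote, unquote

# Path component of the Japanese example URL from the text.
path = "/引き割り.html"

# Each non-ASCII character becomes its UTF-8 bytes, written as %XX hex escapes.
encoded = quote(path)
print(encoded)  # /%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html

# The receiving side reverses the escaping to recover the original path.
decoded = unquote(encoded)
print(decoded == path)  # True
```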
Protocol-relative URLs
Protocol-relative links (PRL), also known as protocol-relative URLs (PRURL), are URLs that have no protocol specified. For example, //example.com will use the protocol of the current page, typically HTTP or HTTPS.
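The way a protocol-relative URL inherits the protocol of the current page can be sketched with urljoin from Python's standard urllib.parse module. The two page addresses below are made-up examples standing in for "the current page".

```python
from urllib.parse import urljoin

link = "//example.com"  # protocol-relative URL from the text

# Hypothetical current pages, one served over HTTPS and one over HTTP.
from_https_page = urljoin("https://www.example.org/page.html", link)
from_http_page = urljoin("http://www.example.org/page.html", link)

print(from_https_page)  # https://example.com
print(from_http_page)   # http://example.com
```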
See also
* Hyperlink
* PURL (Persistent URL)
* CURIE (Compact URI)
* URI fragment
* Internet resource locator (IRL)
* Internationalized Resource Identifier (IRI)
* Clean URL
* Typosquatting
* Uniform Resource Identifier
* URI normalization
* Use of slashes in networking
External links
* URL specification at WHATWG
* URL splitter that splits any URI into its parts