History
– Uniform Resource Locators (URLs) were defined in RFC 1738 in 1994 by Tim Berners-Lee and the URI working group of the Internet Engineering Task Force (IETF).
– Early collaborators proposed the use of Universal Document Identifiers (UDIs) before settling on URLs.
– The term ‘universal’ was originally preferred over ‘uniform’ in the expansion of URL, but it was later changed.
– Berners-Lee expressed regret at using dots to separate parts of the domain name within URIs and wished he had used slashes throughout.
Syntax
– Every HTTP URL conforms to the syntax of a generic URI.
– The URI generic syntax consists of five components: scheme, authority, path, query, and fragment.
– URI schemes should be registered with the Internet Assigned Numbers Authority (IANA).
– The authority component can include userinfo, host, and port subcomponents.
– The path component consists of path segments separated by a slash and can resemble a file system path.
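The component breakdown above can be illustrated with a minimal sketch using Python's standard `urllib.parse` module. The URL is a hypothetical example chosen only so that every component is present, and the dictionary names are illustrative. Note that `username` returns only the user name part of the userinfo subcomponent.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen so that all five generic components appear.
url = "https://user@www.example.com:8080/docs/guide/index.html?lang=en&page=2#intro"

parts = urlsplit(url)

# The five generic URI components.
components = {
    "scheme": parts.scheme,      # "https"
    "authority": parts.netloc,   # "user@www.example.com:8080"
    "path": parts.path,          # "/docs/guide/index.html"
    "query": parts.query,        # "lang=en&page=2"
    "fragment": parts.fragment,  # "intro"
}

# Subcomponents of the authority.
authority = {
    "userinfo": parts.username,  # user name part of the userinfo
    "host": parts.hostname,
    "port": parts.port,          # parsed as an integer
}

# Path segments are separated by slashes; drop the empty string
# that precedes the leading slash.
path_segments = parts.path.split("/")[1:]

for name, value in {**components, **authority}.items():
    print(f"{name}: {value}")
print("segments:", path_segments)
```

Other languages offer comparable parsers; this one is shown only because it ships with the Python standard library.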
Usage
– URLs are commonly used to reference web pages (HTTP/HTTPS).
– URLs are also used for file transfer (FTP), email (mailto), database access (JDBC), and other applications.
– Most web browsers display the URL of a web page in an address bar.
– A typical URL includes a protocol, hostname, and file name.
– In http and https URIs, the last part of the path is often named pathinfo and is used to select dynamic content.
Internationalized URL
– An Internationalized Resource Identifier (IRI) is a form of URL that includes Unicode characters.
– The domain name in an IRI is known as an Internationalized Domain Name (IDN).
– Web and Internet software automatically converts IDNs into Punycode usable by the Domain Name System.
– The URL path name can be specified in the user’s local writing system and is converted to UTF-8.
– Characters not part of the basic URL character set are escaped using percent-encoding.
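The two conversions described above can be sketched in Python. This is an illustration under stated assumptions, not a description of how any particular browser works: the host name `bücher.example` and the path `/wiki/Zürich` are hypothetical, and Python's built-in `idna` codec implements the older IDNA 2003 rules, which may differ from current browser behavior for some names.

```python
from urllib.parse import quote

# Hypothetical internationalized host name and path.
host = "bücher.example"
path = "/wiki/Zürich"

# Convert the IDN to its Punycode (ASCII-compatible) form for DNS lookup.
ascii_host = host.encode("idna").decode("ascii")

# Encode the path as UTF-8 and percent-encode the bytes that fall
# outside the basic URL character set; "/" is left unescaped by default.
ascii_path = quote(path)

print(ascii_host)  # xn--bcher-kva.example
print(ascii_path)  # /wiki/Z%C3%BCrich
```

The character `ü` becomes the two UTF-8 bytes `C3 BC`, which is why it appears as `%C3%BC` in the escaped path.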
Related Standards
– URL is a specific type of Uniform Resource Identifier (URI).
– Other related standards include URI and URN.
– The World Wide Web relies on the use of URLs to locate resources.
– The URL specification is licensed under CC BY 4.0.
– Organizations such as the WHATWG (whatwg.org) maintain and develop URL standards.
A Uniform Resource Locator (URL), colloquially known as an address on the Web, is a reference to a resource that specifies its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI), although many people use the two terms interchangeably. URLs are most commonly used to reference web pages (HTTP/HTTPS) but are also used for file transfer (FTP), email (mailto), database access (JDBC), and many other applications.
Uniform Resource Locator | |
---|---|
Abbreviation | URL |
Status | Published |
First published | 1994 |
Latest version | Living Standard 2023 |
Organization | Internet Engineering Task Force (IETF) |
Committee | Web Hypertext Application Technology Working Group (WHATWG) |
Series | Request for Comments (RFC) |
Editors | Anne van Kesteren |
Authors | Tim Berners-Lee |
Base standards | |
Related standards | URI, URN |
Domain | World Wide Web |
License | CC BY 4.0 |
Website | url |
Most web browsers display the URL of a web page above the page in an address bar. A typical URL could have the form http://www.example.com/index.html, which indicates a protocol (http), a hostname (www.example.com), and a file name (index.html).
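As a brief sketch, the three parts named above can be pulled out of the example URL with Python's standard library. The variable names are illustrative, and treating the last path segment as a file name is an assumption that holds for this URL but not for every URL.

```python
import posixpath
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com/index.html")

protocol = parts.scheme                     # "http"
hostname = parts.hostname                   # "www.example.com"
file_name = posixpath.basename(parts.path)  # last path segment: "index.html"

print(protocol, hostname, file_name)
```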