The basic URL structure consists of a scheme name defining the access method and a scheme-specific part separated by a colon:
<scheme>:<scheme-specific-part>
where the scheme is often, but not necessarily, the same as the underlying network protocol (ftp and http are, for example, but mailto and file are not).
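The split at the first colon can be sketched in a few lines of Python. The helper name split_scheme is hypothetical and chosen only for illustration; it does not validate the characters of the scheme.

```python
def split_scheme(url: str) -> tuple[str, str]:
    """Split a URL at its first colon into (scheme, scheme-specific part)."""
    scheme, colon, rest = url.partition(":")
    if not colon or not scheme:
        raise ValueError(f"no scheme found in {url!r}")
    # Scheme names are case-insensitive; lower case is the usual written form.
    return scheme.lower(), rest


for example in (
    "http://example.com/",
    "news:alt.hypertext",
    "file:///directory/subdirectory/file",
):
    print(split_scheme(example))
```

Only the first colon is significant here, so later colons (for example in front of a port) stay in the scheme-specific part.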
Possible URL components, for example with https:
      |----------------- scheme-specific part ------------------|
https://max:[email protected]:8080/index.html?p1=A&p2=B#ressource
\___/   \_/ \_______________/ \__/\_________/ \_______/ \_______/
  |      |          |          |       |          |         |
scheme⁺ user  password@host   port    path      query   fragment
⁺ (equal to network protocol here)
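As a minimal sketch, the same decomposition can be reproduced with urlsplit from Python's standard library. The URL below is shaped like the example above, but the password "secret" and the host "www.example.com" are assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URL: the password and host name are illustrative assumptions.
URL = "https://max:secret@www.example.com:8080/index.html?p1=A&p2=B#ressource"

parts = urlsplit(URL)
components = {
    "scheme": parts.scheme,
    "user": parts.username,
    "password": parts.password,
    "host": parts.hostname,
    "port": parts.port,        # converted to an int by urlsplit
    "path": parts.path,
    "query": parts.query,      # without the leading "?"
    "fragment": parts.fragment,  # without the leading "#"
}

for name, value in components.items():
    print(f"{name:9} {value}")
```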
with mailto:
mailto:[email protected]
\____/ \_______________/
  |            |
scheme⁺  email address according to RFC 5322
⁺ (no network protocol here).
with news (neither a network protocol nor a host address appears in this example):
news:alt.hypertext
\__/ \___________/
 |         |
scheme  newsgroup name
with file:
file:///directory/subdirectory/file
\__/   \__________________________/
 |                  |
scheme   path to a local file in the file system of the computer interpreting the URL
Strictly speaking, this scheme has the form file://<host>/<path>, but the host part is rarely used in practice: because the file scheme offers no way to specify a network protocol for accessing the file, it can hardly be used meaningfully over a network. File URLs are used, for example, in the Java programming language to access local files. Depending on the browser, opening file links is often possible only after special client-side configuration or with the help of add-ons.
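The file://<host>/<path> form can be illustrated with a short Python sketch. The helper name split_file_url and the host "localhost" in the second call are assumptions for illustration only.

```python
from urllib.parse import urlsplit


def split_file_url(url: str) -> tuple[str, str]:
    """Return (host, path) of a file URL; the host is usually empty."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    return parts.netloc, parts.path


# Three slashes: "//" introduces the (empty) host, the third "/" starts the path.
print(split_file_url("file:///directory/subdirectory/file"))
# The host part may be given explicitly, although this is rare.
print(split_file_url("file://localhost/directory/subdirectory/file"))
```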
Scheme
Specifies the technical method by which the resource is addressed. This is usually, but not necessarily, the same as the network protocol used to reach the resource. Examples are http, https and ftp, but also mailto (for writing an e-mail) and file (for accessing local files).
Scheme-specific part
Which details are required or permitted depends on the scheme. In most cases this part begins with the character string //, but some schemes define only the colon. The following examples refer to the Hypertext Transfer Protocol (HTTP).
User and password
If required, login information consisting of a user name (user) and a password (password) can also be transmitted. The two are separated from each other by a colon and placed in front of the host, separated from it by an at sign (@).
Although HTTP was chosen for this example, specifying the user name and password as part of the URL is not part of the HTTP specification. Current browsers accept this URL syntax, but ask the user whether they really want to log in with the specified data. Internet Explorer 6 (from Windows XP SP2) and later versions are an exception: they reject this URL syntax outright as invalid. A registry entry can force them to behave like their predecessors up to version 5.5, which accepted the login data without asking and passed it directly to the server.
For some other protocols, such as FTP, however, specifying user data in the form shown is perfectly correct and covered by the standards.
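The role of the two separators can be sketched as follows. The function name split_userinfo and the sample values are hypothetical; the sketch only splits the string and does no percent-decoding.

```python
def split_userinfo(authority: str):
    """Split 'user:password@host' into (user, password, host).

    Missing parts are returned as None.
    """
    # The last "@" separates the login information from the host.
    userinfo, at, host = authority.rpartition("@")
    if not at:
        return None, None, authority
    # The first ":" separates the user name from the password.
    user, colon, password = userinfo.partition(":")
    return user, (password if colon else None), host


print(split_userinfo("max:secret@ftp.example.com"))
print(split_userinfo("max@ftp.example.com"))
print(split_userinfo("ftp.example.com"))
```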
Host
The host component is written as an IPv4 address in decimal notation separated by periods, as an IPv6 address in hexadecimal notation separated by colons and enclosed in square brackets, or as a fully qualified domain name (FQDN).
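The three host notations can be told apart with Python's standard library, as in this sketch. The function name classify_host is hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_host(url: str) -> str:
    """Return 'IPv4', 'IPv6', 'name' or 'none' for the host part of a URL."""
    host = urlsplit(url).hostname  # square brackets around IPv6 are removed
    if host is None:
        return "none"
    try:
        return f"IPv{ipaddress.ip_address(host).version}"
    except ValueError:
        # Not an IP literal, so it is treated as a host or domain name.
        return "name"


for example in (
    "http://192.0.2.1/",
    "http://[2001:db8::1]:8080/",
    "http://www.example.com/",
):
    print(example, "->", classify_host(example))
```

The square brackets are needed because the colons inside an IPv6 address would otherwise be indistinguishable from the colon that introduces the port.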
Port
The port specifies the TCP port to which the connection is made. If no port is specified, the default port of the respective protocol is used, for example 80 for HTTP, 443 for HTTPS and 21 for FTP.
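The fallback to a default port might be sketched like this. The table DEFAULT_PORTS and the function effective_port are hypothetical names, and the table contains only the three protocols mentioned above.

```python
from urllib.parse import urlsplit

# Only the defaults named in the text; a real table would be longer.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def effective_port(url: str):
    """Return the explicit port of a URL, else the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    # None is returned for schemes without a known default.
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("https://www.example.com/"))
print(effective_port("http://www.example.com:8080/"))
```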
Path
The path identifies a specific resource on the server; it may, for example, correspond to the directory structure of the target system and name a file or directory. The path can also be empty. An empty path is equivalent to a single slash and can optionally be replaced by one.
How the path is interpreted (file or directory; deliver a text file or execute a script) is left to the server. A typical example of this freedom is the behavior when a client requests the path /: depending on its configuration, the server may deliver the contents of a particular file (such as /index.html, /README or /HEADER) without this being apparent to the requesting client. Alternatively, depending on the protocol, the server can explicitly redirect to this resource or output a directory listing.
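The equivalence of an empty path and a single slash can be expressed in one line. The function name request_path is a hypothetical choice for this sketch.

```python
from urllib.parse import urlsplit


def request_path(url: str) -> str:
    """Return the path of a URL, treating an empty path as "/"."""
    return urlsplit(url).path or "/"


# Both spellings address the same resource.
print(request_path("http://example.com"))
print(request_path("http://example.com/"))
```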
Query
→ Main article: Query String
In the case of HTTP, a query string can follow the path, separated from it by a question mark. It can be used to transfer additional information that is processed further on the server or client side.
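A query string in the common key=value form, with pairs joined by &, can be decoded with parse_qs from Python's standard library. This is a sketch; the function name query_parameters and the host in the sample URL are assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def query_parameters(url: str) -> dict:
    """Return the query parameters of a URL as a dict of value lists."""
    # Values are lists because a key may appear more than once.
    return parse_qs(urlsplit(url).query)


print(query_parameters("http://www.example.com/index.html?p1=A&p2=B"))
```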
Fragment
→ Main article: Fragment identifier
After a hash sign (#), a part of the resource can be referenced, typically an anchor in an HTML page to which the browser scrolls automatically once the page has loaded. In the fictional document used here, the URL http://example.com/dokument.html#absatz3 would cause the browser to scroll to the beginning of the third paragraph.
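Separating the fragment from the rest of the URL can be sketched with urldefrag from Python's standard library, using the fictional URL from the text.

```python
from urllib.parse import urldefrag

# urldefrag returns the URL without the fragment, and the fragment itself.
url, fragment = urldefrag("http://example.com/dokument.html#absatz3")

print(url)
print(fragment)
```

With HTTP, the part before the hash sign is what the client requests; the fragment is evaluated by the client after the resource has been retrieved.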