Most people are familiar with URLs: a URL is a web address used to direct users to websites on the Internet. But what is a URI? The concept was conceived by the founder of the World Wide Web, Tim Berners-Lee. When he first used the term in RFC 1630, it still stood for Universal Resource Identifier. Since then, partly through publications by the World Wide Web Consortium (W3C), URI has become established as the abbreviation for Uniform Resource Identifier. The original idea, however, has not changed.

What is the Uniform Resource Identifier (URI)?

The Uniform Resource Identifier (URI) is intended to identify abstract or physical resources on the Internet. What these resources are can vary with the situation: a website, for example, but email senders and recipients can also be identified via URI. Applications use the unambiguous designation to identify a resource or to request data from it.

Protocols such as HTTP or FTP can work on this basis because the form of identification is predefined by the URI syntax. From a URI, a system can read where and how certain information is to be found.

URI Syntax

A URI consists of up to five parts. However, only two of these are mandatory.

  • scheme: Names the protocol or naming convention being used.
  • authority: Identifies the host (typically a domain), optionally with user information and a port.
  • path: Gives the exact path to the resource.
  • query: Carries additional, non-hierarchical parameters, typically as key=value pairs.
  • fragment: Refers to a part of the resource.

Only scheme and path must appear in every identifier. In the URI syntax, all components are listed in sequence and separated by specific, predefined characters.

scheme :// authority path ? query # fragment
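As a minimal sketch of this layout, Python's standard urllib.parse module can assemble a URI from its five components; urlunsplit inserts the separators shown above. The component values are purely illustrative.

```python
from urllib.parse import urlunsplit

# Components in order: scheme, authority, path, query, fragment.
full = urlunsplit(
    ("https", "example.org", "/test/test1", "search=test-question", "part2")
)
print(full)

# With an empty authority, no "//" is emitted for a scheme such as mailto.
short = urlunsplit(("mailto", "", "user@example.org", "", ""))
print(short)
```

Empty components are simply left out, together with their separator.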

The double forward slashes after the first colon are only required if an authority component is present. The authority can also contain user information, which is separated from the domain by an @ symbol, and a port number, which is separated from the domain by a colon.
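The sub-parts of the authority can be inspected with Python's urllib.parse, as a rough sketch. The user name alice and the port 8443 are assumptions made up for this illustration.

```python
from urllib.parse import urlsplit

# Hypothetical URI with user information and a port in the authority.
parts = urlsplit("https://alice@example.org:8443/test/test1")

print(parts.netloc)    # the whole authority component
print(parts.username)  # user information before the "@"
print(parts.hostname)  # the domain
print(parts.port)      # the port after the colon, as an integer
```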

A typical web address is a good example: "https://example.org/test/test1?search=test-question#part2"

  • scheme: https
  • authority: example.org
  • path: /test/test1
  • query: search=test-question
  • fragment: part2
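The same breakdown can be reproduced programmatically. The following sketch uses Python's standard urllib.parse module, which is one possible tool among many and not something the URI standard prescribes.

```python
from urllib.parse import urlsplit, parse_qs

uri = "https://example.org/test/test1?search=test-question#part2"
parts = urlsplit(uri)

print(parts.scheme)    # protocol
print(parts.netloc)    # authority
print(parts.path)      # path, including its leading slash
print(parts.query)     # raw query string
print(parts.fragment)  # fragment

# The query string can be decoded further into key/value pairs.
params = parse_qs(parts.query)
print(params)
```

Note that the parser reports the path with its leading slash, since the slash belongs to the path rather than to the authority.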

In the example, the URI refers to one part of a web page. The resource is retrieved via HTTPS from the host identified as example.org, at the specified path and with the search query applied; the fragment (part2) then points to one part of it. An email address can also be identified with a Uniform Resource Identifier: "mailto:user@example.org".

  • scheme: mailto
  • path: user@example.org

In this case, the URI contains only the mandatory components. Other kinds of resources can be identified with this syntax as well, such as files or even telephone numbers.
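A short sketch with Python's urllib.parse shows what such scheme-and-path identifiers look like once split. The telephone number is a made-up placeholder for illustration.

```python
from urllib.parse import urlsplit

mail = urlsplit("mailto:user@example.org")
phone = urlsplit("tel:+1-201-555-0123")  # hypothetical number

for parts in (mail, phone):
    # Authority, query and fragment are all empty strings here.
    print(parts.scheme, parts.path, repr(parts.netloc))
```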

Note

Although the path is a mandatory component of every URI, its content can be empty. In other words, “http://example.org” is a valid URI with an empty path.
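The difference between an empty path and the root path "/" is easy to overlook. As a small check, assuming Python's urllib.parse as the parser:

```python
from urllib.parse import urlsplit

no_slash = urlsplit("http://example.org")
with_slash = urlsplit("http://example.org/")

print(repr(no_slash.path))    # empty path
print(repr(with_slash.path))  # root path, a single slash
```

Many HTTP clients treat the two forms as equivalent when sending a request, but syntactically only the first has an empty path.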

URI schemes, in other words the first part of every URI, are managed by the IANA. Although anyone can define their own schemes, those registered with the organization are recognized throughout the Internet. Well-known schemes include http, https, ftp, mailto, file and tel.

URI Reference

To avoid writing (and storing) a complete URI every time, many applications use a shortened form of the syntax. For the shortened form to be understood correctly, there must always be a fully specified base URI, against which the references are resolved internally. A distinction is therefore made between absolute and relative references. An absolute URI works independently of context and always includes a scheme. A relative reference is the actual short form: it specifies only the deviation from the base URI. For this reason, a relative reference must always be in the same namespace as the base URI.

A relative reference does not specify a scheme. So that relative references can be told apart from absolute URIs, no colon may appear in the first segment of the path, because the part before the colon would otherwise be interpreted as a scheme. There are three types of relative reference, each recognizable by a marker at the beginning:

  • A relative path reference begins without a forward slash.
  • An absolute path reference begins with a forward slash.
  • A network-path reference begins with two forward slashes.
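How the three reference types resolve against a base URI can be sketched with urljoin from Python's urllib.parse. The base URI and the reference strings (test2, /other/page, cdn.example.net) are assumptions chosen for illustration.

```python
from urllib.parse import urljoin

base = "https://example.org/test/test1"

# Relative path reference: replaces the last segment of the base path.
rel_path = urljoin(base, "test2")

# Absolute path reference: replaces the whole path, keeps scheme and authority.
abs_path = urljoin(base, "/other/page")

# Network-path reference: replaces authority and path, keeps only the scheme.
net_path = urljoin(base, "//cdn.example.net/lib.js")

# A colon in the first segment is read as a scheme, so "a:b" is not resolved;
# prefixing "./" keeps it a relative path reference.
colon_plain = urljoin(base, "a:b")
colon_dotted = urljoin(base, "./a:b")

for result in (rel_path, abs_path, net_path, colon_plain, colon_dotted):
    print(result)
```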

URI vs. URL vs. URN

There is a lot of confusion regarding the very similar-sounding abbreviations URI, URL and URN. The uncertainty is compounded by the fact that all three concepts are technically related to each other. The Uniform Resource Locator indicates where a resource is located. For this reason, URLs are used when surfing the Internet to navigate to specific websites. In contrast, the Uniform Resource Name is location-independent and permanently designates a resource. Thus, while URLs are primarily known in the form of web addresses, a URN can, for example, take the form of an ISBN to permanently identify a book.

URL and URN follow the URI syntax, so both are subsets of URI: every URL and every URN is a URI. The converse does not hold: a Uniform Resource Identifier is not necessarily a URL or a URN.
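That both kinds of identifier share the same syntax can be illustrated by running a URL and a URN through the same parser. This is only a sketch using Python's urllib.parse, and the ISBN value is an illustrative example.

```python
from urllib.parse import urlsplit

url = urlsplit("https://example.org/test/test1")
urn = urlsplit("urn:isbn:0451450523")  # illustrative ISBN

# Both split into the same components; the URN simply has no authority.
print(url.scheme, repr(url.netloc), url.path)
print(urn.scheme, repr(urn.netloc), urn.path)
```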
