The session initiation protocol (SIP) is responsible for initiating, managing and terminating real-time audio and video connections. It is used most often in IP telephony.

What is SIP?

Video conferencing, instant messaging, file sharing, IP-based telephone calls and other forms of real-time communication are part of everyday life for many of us. These technologies have had a major impact on how we communicate with friends, family and colleagues. An important part of all these applications is the session initiation protocol, or SIP for short. SIP is responsible for initiating, managing and terminating audio or video conversations that use VoIP (Voice over Internet Protocol). It is a signaling protocol designed for IP networks, which makes it an essential part of real-time communication.

The introduction of SIP, whose current version is defined in RFC 3261, made internet-based telephony a viable alternative to traditional telephone calls, which rely on hardware-based telephony systems. With SIP, users enjoy increased mobility and benefit from cost savings. Since RFC 3261 was published in 2002, SIP has become increasingly important and has replaced traditional fixed-line telephone systems in many places.

The session initiation protocol is text based and, in many ways, similar to HTTP (hypertext transfer protocol), which is used on the web, and SMTP (simple mail transfer protocol), which is used for email.
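
Because SIP is text based, a request can be assembled much like an HTTP request: a start line, header lines and an empty line, followed by an optional body. The following Python sketch illustrates this layout. The function name build_request, the example.com addresses and all header values are assumptions made for illustration; a real user agent needs far more state handling and validation.

```python
def build_request(method, uri, headers, body=""):
    """Assemble a SIP request as text: start line, headers, blank line, body."""
    # Content-Length counts the bytes of the body, not its characters.
    all_headers = {**headers, "Content-Length": str(len(body.encode("utf-8")))}
    lines = [f"{method} {uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # Lines end with CRLF; an empty line separates the headers from the body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# Placeholder values for illustration only.
invite = build_request(
    "INVITE",
    "sip:bob@example.com",
    {
        "Via": "SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards": "70",
        "To": "Bob <sip:bob@example.com>",
        "From": "Alice <sip:alice@example.com>;tag=1928301774",
        "Call-ID": "a84b4c76e66710@client.example.com",
        "CSeq": "314159 INVITE",
    },
)
print(invite)
```

Printing the result shows a message that reads almost like an HTTP request, except that the start line ends with SIP/2.0 and the target is a SIP address.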

What does SIP do?

Like the other two protocols, SIP is an application-layer protocol; RFC 3261 describes it as an application-layer control protocol. Its task of setting up and tearing down sessions corresponds to what the OSI model assigns to the fifth layer, the session layer, which is why it is often placed there. SIP works much like the switchboard of a telephone company. On a switchboard, operators ensure that a conversation between two people can be initiated. During the conversation, the connection is maintained. When both parties are finished, the connection is terminated and the line is freed up for other calls. This is exactly what SIP does. The session initiation protocol is not responsible for the other aspects of communication.
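
The switchboard picture of initiating, maintaining and terminating a connection can be sketched as a small state machine. This is a deliberately simplified toy model and not the transaction state machine from the SIP specification: the state names and the function run_call are hypothetical, and only one successful call flow (INVITE, an optional 180 Ringing, 200 OK, ACK, BYE) is covered.

```python
# Simplified call lifecycle; the state names are hypothetical labels.
TRANSITIONS = {
    ("idle", "INVITE"): "calling",
    ("calling", "180"): "ringing",
    ("calling", "200"): "answered",
    ("ringing", "200"): "answered",
    ("answered", "ACK"): "established",
    ("established", "BYE"): "terminated",
}


def run_call(events, state="idle"):
    """Apply a sequence of SIP methods or status codes and return the final state."""
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not expected in state {state!r}")
        state = TRANSITIONS[key]
    return state


print(run_call(["INVITE", "180", "200", "ACK", "BYE"]))  # terminated
```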

With SIPS (session initiation protocol secure), SIP can also manage secure and encrypted conversations: SIP messages are then sent over TLS-encrypted connections. Since the signaling and the transfer of the conversation itself are handled separately, both data flows can in principle be encrypted independently of each other. SIP does not transfer the conversation itself; other protocols are used alongside it. These include the real-time transport protocol (RTP), which carries the audio and video data, and the session description protocol (SDP), which describes the session and makes the IP addresses and ports of the participants available.

How does SIP work?

SIP is based on a traditional client-server architecture. The base protocol works with requests and responses, with SIP acting as an intermediary between the connected devices. This means it can work with almost every internet-connected device. Requests are sent by clients, known as user agent clients (UAC), and responses are returned by the corresponding servers, known as user agent servers (UAS). Via the SIP trunk interface, phone numbers are published. However, other protocols are responsible for the actual transfer of data. Other components for SIP communication include proxy servers and gateways.
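
The request and response model shows up in the first line of every SIP message: a request starts with the method, while a response starts with the protocol version followed by a status code. The sketch below tells the two apart. The function name parse_start_line and the sample lines are assumptions for illustration, and the parser skips the stricter syntax checks a real implementation would need.

```python
def parse_start_line(line):
    """Classify the first line of a SIP message as a request or a response."""
    parts = line.strip().split(" ", 2)
    if len(parts) != 3:
        raise ValueError(f"not a SIP start line: {line!r}")
    if parts[0] == "SIP/2.0":
        # Status line: SIP/2.0 <status code> <reason phrase>
        return {"kind": "response", "status": int(parts[1]), "reason": parts[2]}
    if parts[2] == "SIP/2.0":
        # Request line: <method> <request URI> SIP/2.0
        return {"kind": "request", "method": parts[0], "uri": parts[1]}
    raise ValueError(f"not a SIP start line: {line!r}")


print(parse_start_line("INVITE sip:bob@example.com SIP/2.0"))
print(parse_start_line("SIP/2.0 486 Busy Here"))
```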

The session description protocol describes which type of connection is possible and under which conditions, for example which media types and which encoding methods, known as codecs, both sides support. The SDP also specifies the network addresses and ports to be used. Once all this has been established, protocols such as RTP ensure the data is transferred. When the session ends, SIP terminates the connection.
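
To make the role of SDP more concrete, the sketch below reads the network address, the media port and the offered codecs from a small SDP body. The sample body, its addresses and the function name parse_sdp are assumptions for illustration. Only the c=, m= and a=rtpmap lines are evaluated, and a c= line inside a media section simply overwrites the session-level address, which a full parser would treat separately.

```python
def parse_sdp(text):
    """Extract the connection address, media streams and codecs from an SDP body."""
    result = {"address": None, "media": []}
    for raw in text.splitlines():
        line = raw.strip()
        if len(line) < 2 or line[1] != "=":
            continue  # not a "<type>=<value>" line
        kind, value = line[0], line[2:]
        if kind == "c":
            # c=<network type> <address type> <connection address>
            result["address"] = value.split()[2]
        elif kind == "m":
            # m=<media> <port> <transport> <format list>
            media, port, proto, *formats = value.split()
            result["media"].append(
                {"type": media, "port": int(port), "proto": proto,
                 "formats": formats, "codecs": {}}
            )
        elif kind == "a" and value.startswith("rtpmap:") and result["media"]:
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            payload, encoding = value[len("rtpmap:"):].split(None, 1)
            result["media"][-1]["codecs"][payload] = encoding
    return result


# Sample body with placeholder values.
SAMPLE_SDP = """v=0
o=alice 2890844526 2890844526 IN IP4 198.51.100.1
s=-
c=IN IP4 198.51.100.1
t=0 0
m=audio 49170 RTP/AVP 0 8
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
"""

session = parse_sdp(SAMPLE_SDP)
print(session["address"], session["media"][0]["port"], session["media"][0]["codecs"])
```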

How is SIP addressed?

SIP uses uniform resource identifiers (URIs) and the domain name system (DNS) to address participants. The addresses assigned to participants are similar to typical email addresses, prefixed with the scheme sip: or sips:, for example sip:alice@example.com. As with an email address, a SIP address is made up of two parts. The first part is the username or a telephone number, and the second part is the domain or host of the corresponding network. Telephone numbers are common on devices which offer an interface to traditional telephone networks.
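
The two-part structure of a SIP address can be illustrated with a short parser. This is a minimal sketch: the function name parse_sip_uri and the sample addresses are assumptions, and the regular expression ignores passwords, URI parameters, headers and IPv6 hosts, all of which the full URI grammar allows.

```python
import re

# scheme ":" [ user "@" ] host [ ":" port ]; anything after that is ignored.
SIP_URI = re.compile(
    r"^(?P<scheme>sips?):(?:(?P<user>[^@;]+)@)?(?P<host>[^:;?]+)(?::(?P<port>\d+))?"
)


def parse_sip_uri(uri):
    """Split a SIP or SIPS address into scheme, user, host and optional port."""
    match = SIP_URI.match(uri)
    if match is None:
        raise ValueError(f"not a SIP URI: {uri!r}")
    parts = match.groupdict()
    parts["port"] = int(parts["port"]) if parts["port"] else None
    return parts


print(parse_sip_uri("sip:alice@example.com"))
print(parse_sip_uri("sips:+4930123456@gateway.example.net:5061;user=phone"))
```

The second address uses a telephone number as the user part, as is common on gateways to traditional telephone networks.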

What are SIP requests?

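SIP defines various requests, each of which is answered with one or more responses. The response codes are modeled on HTTP status codes. SIP requests are divided into simple SIP requests, which are defined in the core specification, and expanded SIP requests, which were added by later extensions.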

Simple SIP requests

  • ACK confirms that the final response to an INVITE request has been received.
  • BYE signals the correct termination of a session.
  • CANCEL withdraws a pending request.
  • INVITE sends a request to a server to create a session.
  • OPTIONS queries the capabilities of another device or server.
  • REGISTER registers the current address of a device with a service provider.

Expanded SIP requests

  • INFO sends application-level information during a session without changing the state of the session.
  • MESSAGE sends a text message to a device.
  • NOTIFY informs a subscriber when the state of a subscribed event changes.
  • PRACK acknowledges a provisional (1xx) response.
  • REFER asks the recipient to contact another participant, for example to transfer a current call.
  • SUBSCRIBE requests notifications about particular events, which are then delivered via NOTIFY when they occur.
  • UPDATE changes the parameters of a session, for example the media streams, without changing the status of the call itself.
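
The grouping from the two lists above can be expressed as a simple lookup, for example to decide whether a device that only implements the simple requests can handle an incoming method. The names SIMPLE_METHODS, EXPANDED_METHODS and method_category are assumptions for illustration; the comparison is case-sensitive because SIP method names are written in upper case.

```python
# Grouping taken from the two lists above.
SIMPLE_METHODS = {"ACK", "BYE", "CANCEL", "INVITE", "OPTIONS", "REGISTER"}
EXPANDED_METHODS = {"INFO", "MESSAGE", "NOTIFY", "PRACK", "REFER", "SUBSCRIBE", "UPDATE"}


def method_category(method):
    """Return 'simple', 'expanded' or 'unknown' for a SIP method name."""
    if method in SIMPLE_METHODS:
        return "simple"
    if method in EXPANDED_METHODS:
        return "expanded"
    return "unknown"


print(method_category("INVITE"))  # simple
print(method_category("REFER"))   # expanded
```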

What are SIP responses?

SIP responses are es­sen­tial­ly answers to the requests listed above. You can divide these responses into six cat­e­gories:

  • 1xx provides provisional status information before the server sends its final response.
  • 2xx shows that a request was successful.
  • 3xx gives information about any possible or necessary forwarding.
  • 4xx shows that a request could not be processed, for example because it was faulty or cannot be fulfilled by this server.
  • 5xx shows that the server failed to process an apparently valid request.
  • 6xx shows a global failure: the request cannot be fulfilled by any server.
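
Since the category is determined by the first digit of the status code, it can be derived with an integer division. The sketch below assumes the hypothetical names RESPONSE_CLASSES and response_class; the class labels follow the wording used in RFC 3261.

```python
RESPONSE_CLASSES = {
    1: "provisional",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
    6: "global failure",
}


def response_class(status_code):
    """Map a three-digit SIP status code to its response category."""
    if not 100 <= status_code <= 699:
        raise ValueError(f"not a SIP status code: {status_code}")
    return RESPONSE_CLASSES[status_code // 100]


print(response_class(180))  # provisional
print(response_class(486))  # client error
```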

What’s the dif­fer­ence between SIP and VoIP?

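Even though SIP and VoIP are closely related and are often mentioned together, they are not the same. VoIP is the general term for technology that carries voice calls over IP networks, whereas SIP is one of the protocols used to implement it. SIP initiates, maintains and ends connections. The actual transfer of data packets across various network types and servers is handled by other protocols such as RTP.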
