Selenium tutorial: how to use the cross-browser testing framework

In web development today, it’s essential to prepare an application for the different display sizes of its users or visitors. With the responsive approach, a published web project automatically adjusts to the technical requirements of the respective device; this approach has established itself as the uncontested standard solution. Just as important for modern websites as flexibility regarding display size is optimal browser compatibility: for end users, it should make no difference whether they access a site with Firefox, Chrome, Safari, or a different web client.

It’s not surprising then that cross-browser testing has played a major role in programming web projects for many years. One of the most popular tools for performing these tests is the Selenium framework, released by ThoughtWorks. We take a closer look at it in this tutorial.

What is Selenium or Selenium WebDriver?

To streamline the development of a time-and-cost application written in Python, Jason Huggins created the JavaScriptTestRunner in 2004, the core element of the web testing framework now known as Selenium. Initially, the tool was only used internally by ThoughtWorks, the software company where Huggins worked at the time. After he moved to Google in 2007, he continued developing the open-source software (Apache 2.0 license). Following a merger with the WebDriver API, the testing framework was given its current name: Selenium or Selenium WebDriver.

Today’s version of Selenium is based exclusively on HTML and JavaScript. It enables developers to record and test interactions with a web application and to repeat automated tests as often as necessary. The key components that make this testing process possible are:

  • Selenium Core: The core module contains the basic functionality of the framework, including the JavaScriptTestRunner as well as the underlying test command API.
  • Selenium IDE: Selenium IDE is the development environment of the testing framework. Among other things, it serves as the basis for the browser extension for Chrome and Firefox, which is required for recording and running tests.
  • Selenium WebDriver: WebDriver is the key interface for simulating user interactions in any browser – whether it’s Firefox, Chrome, Edge, Safari or Internet Explorer. Since 2018, the API has been an official W3C standard.
  • Selenium Grid: Selenium Grid is an extension of WebDriver (or rather of its predecessor, Selenium Remote Control, RC) that enables tests to run simultaneously on multiple servers. This can reduce the overall test duration significantly.

Where is Selenium WebDriver used?

Selenium is an established name in the world of testing and serves as the foundation for various programs in this field. The best-known examples include the end-to-end framework Protractor, designed specifically for testing Angular and AngularJS applications, which uses the WebDriver API to simulate user interactions. The test automation software Appium, designed for native and hybrid mobile apps, also uses the standardized interface to perform tests quickly and conveniently in the chosen programming language.

The well-known, cloud-based web service BrowserStack also utilizes Selenium. The service is developed in India and is available as various paid subscription packages following a free trial phase. It uses the testing framework as a basis for its automated desktop and mobile tests.

Note

Because it is open-source software, Selenium (or Selenium WebDriver) can also be used independently of the programs mentioned above. Thanks to the various components combined in the framework, this is no problem at all, provided you have the required skills.

Selenium WebDriver tutorial: How to use the framework for your web tests

Anyone can run their own test cases with Selenium without depending on an external service or software vendor, and without needing advanced programming skills. The framework is designed to keep browser testing straightforward and lets you write test scripts in various languages, including JavaScript, Java, C#, Ruby, and Python. In the following Selenium tutorial, we’ll demonstrate how to set up and use the framework with a Python example.

Note

The following steps of this Selenium WebDriver Tutorial require a current version of Python to be installed.

Step 1: Install Selenium

Clients as well as a range of libraries are available for using Selenium WebDriver. These are known collectively as language bindings and form the basis of the framework or testing process. However, you only need to install the client drivers and libraries for the language you wish to write your script in.

In our Selenium WebDriver tutorial, we need the language bindings for Python, which are installed as standard with the Python package manager “pip” and the following command:

pip install selenium

Alternatively, download the source package via the link above, unpack the tar.gz archive, and use the setup.py file to run the installation. In this case, the required command is as follows:

python setup.py install
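Whichever installation route you choose, it can be useful to confirm that the bindings are actually visible to your Python interpreter. The following is a minimal sketch using only the standard library; the function name installed_selenium_version is our own and not part of Selenium.

```python
from importlib import metadata


def installed_selenium_version():
    """Return the installed Selenium version string, or None if it is missing."""
    try:
        return metadata.version("selenium")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_selenium_version()
    print(f"Selenium {version}" if version else "Selenium is not installed")
```

If the script reports that Selenium is not installed, check that pip and the interpreter you are running belong to the same Python environment.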

Next, download and install the browser driver that is to be connected with Selenium. For example, Firefox requires the driver “geckodriver”; you can find its official releases in the following GitHub repository. A complete list of drivers for the most common browsers (Chrome, Edge, Internet Explorer, Safari etc.) is accessible in the official Selenium documentation.
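A common stumbling block at this point is a driver that has been downloaded but cannot be found on the system path. The sketch below checks for the executable with the standard library only; the helper name find_driver is hypothetical, and "geckodriver" is simply the Firefox example from the text.

```python
import shutil


def find_driver(executable="geckodriver"):
    """Return the full path of the driver executable on PATH, or None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_driver()
    if path:
        print(f"geckodriver found at {path}")
    else:
        print("geckodriver is not on PATH")
```

The same check works for other drivers by passing their executable name, for example find_driver("chromedriver").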

Note

If you want to use Selenium via remote connection, you’ll also need to install the Selenium server components. The corresponding installation packages can be found on the official download page of the framework (in the “Selenium Server (Grid)” section). Please bear in mind that running the server application requires an up-to-date version of the Java runtime environment (JRE).
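As a rough sketch of what a remote connection can look like in Python: the session is opened with webdriver.Remote and pointed at the server’s address. The address http://localhost:4444 is an assumption (a server running locally on its default port), and the helper names fetch_title and run_in_parallel are our own. The second helper illustrates how the same check might be sent to several browsers at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: a Selenium server (Grid) is running locally on its default port.
GRID_URL = "http://localhost:4444"


def fetch_title(browser_name, url, grid_url=GRID_URL):
    """Open url in a remote browser session and return the page title."""
    # Imported here so the module can be loaded without Selenium installed.
    from selenium import webdriver

    options_by_browser = {
        "firefox": webdriver.FirefoxOptions,
        "chrome": webdriver.ChromeOptions,
    }
    options = options_by_browser[browser_name]()
    driver = webdriver.Remote(command_executor=grid_url, options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()


def run_in_parallel(job, browser_names, url):
    """Run job(browser_name, url) for every browser at the same time."""
    workers = max(1, len(browser_names))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(job, name, url) for name in browser_names}
        return {name: future.result() for name, future in futures.items()}
```

With a server running, run_in_parallel(fetch_title, ["firefox", "chrome"], "http://www.google.com") would return one page title per browser. Because the job is passed in as a parameter, the parallel logic can also be tried out with a plain function that needs no browser.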

Step 2: Choose the right development environment (IDE)

After you’ve met the preconditions for using Selenium WebDriver, you can start writing testing scripts for your web project. Your usual code editor is generally suitable for this purpose. But in this case, we recommend using a Python IDE (or a development environment for the language you are working with) to maximize productivity. Well-known and popular solutions include the following:

  • PyCharm: The Python IDE PyCharm is available as a free, open-source community edition or as a paid professional version. You can write scripts for Selenium tests with either variant. Windows, Linux, and macOS are all supported.
  • PyDev: PyDev is a Python plug-in for the development environment Eclipse, which is essentially intended for the development of Java applications. The extension can either be downloaded and installed via the project page or directly using the update manager. PyDev and Eclipse run on all common Windows, macOS, and Linux systems.

Step 3: Create Python script for browser testing (Firefox) with Selenium

Once you have your desired solution ready, you can start writing individual scripts for automating browser interactions – with the help of Selenium classes and functions. In this Selenium tutorial, we create an example Python script for Firefox, which automatically opens the Google search engine in the Mozilla browser, enters a search term, and then analyzes and records the results. Seen in code form, these automated steps appear like this:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Start Firefox session
driver = webdriver.Firefox()
driver.implicitly_wait(30)  # wait up to 30 seconds when locating elements
driver.maximize_window()

# Open web application
driver.get("http://www.google.com")

# Locate textbox
# (the ID and class name used below reflect Google's markup when this
# tutorial was written and may have changed since)
search_field = driver.find_element(By.ID, "lst-ib")
search_field.clear()

# Enter and confirm search term
search_field.send_keys("search term")
search_field.submit()

# Collect the search results shown following the search
# using the method find_elements with the By.CLASS_NAME locator
lists = driver.find_elements(By.CLASS_NAME, "_Rm")

# Run through the first ten elements and print their inner HTML
for listitem in lists[:10]:
    print(listitem.get_attribute("innerHTML"))

# Close browser window
driver.quit()

The meaning of the individual script components is summarized in the table below:

Code:
  from selenium import webdriver
  from selenium.webdriver.common.by import By
  from selenium.webdriver.common.keys import Keys
Meaning: First, the WebDriver module is loaded in order to provide the classes and methods for supporting different browsers. The By class supplies the locator strategies (ID, class name, and so on). The Keys class makes special keys available for simulating keyboard entries; this script does not strictly need it, because the search is confirmed with submit().

Code:
  driver = webdriver.Firefox()
  driver.implicitly_wait(30)
  driver.maximize_window()
Meaning: In the second step, a Firefox instance is created that can later be controlled with Selenium commands. An implicit wait of 30 seconds is set, which is the maximum time Selenium keeps trying when locating elements; the browser window is also maximized.

Code:
  driver.get("http://www.google.com")
Meaning: The script now opens the Google search page that provides the basis for the automated user interactions.

Code:
  search_field = driver.find_element(By.ID, "lst-ib")
  search_field.clear()
Meaning: Once the search engine is opened, the script looks for the Google search textbox, identified by the ID attribute “lst-ib”. As soon as it has been located, the search field is cleared with the method clear().

Code:
  search_field.send_keys("search term")
  search_field.submit()
Meaning: The text “search term” is entered and confirmed using the method submit().

Code:
  lists = driver.find_elements(By.CLASS_NAME, "_Rm")
Meaning: The individual search results are listed as <a> elements. The method find_elements collects all elements with the class name “_Rm” in a list so that the script can work with them.

Code:
  for listitem in lists[:10]:
      print(listitem.get_attribute("innerHTML"))
Meaning: In the last active step, the script prints the search results, limited to the first ten entries with an <a> tag.

Code:
  driver.quit()
Meaning: The final line of code closes the browser instance.

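The loop that prints the first ten results can also be factored into a small helper, so that the limiting logic can be checked without starting a browser. This is only a sketch: first_inner_html and FakeElement are hypothetical names, and FakeElement merely imitates the one WebElement method used here.

```python
from itertools import islice


def first_inner_html(elements, limit=10):
    """Return the innerHTML of at most `limit` elements, in order."""
    return [el.get_attribute("innerHTML") for el in islice(elements, limit)]


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (browser-free checks)."""

    def __init__(self, html):
        self._html = html

    def get_attribute(self, name):
        return self._html if name == "innerHTML" else None


fake_results = [FakeElement(f"<b>result {n}</b>") for n in range(15)]
print(len(first_inner_html(fake_results)))  # prints 10
```

In the real script, the list returned by the class-name lookup would be passed in place of fake_results.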
Note

The example used in this Selenium tutorial shows how well the framework is suited to browser testing. It also demonstrates that Selenium WebDriver is an interesting option for web scraping with Python, another versatile use of the testing suite. More detailed information about this method of collecting web data can be found in our article on web scraping.

Selenium WebDriver: not suitable for all scenarios

Selenium (or Selenium WebDriver) offers first-rate tools for collecting important website data and simulating user interactions. But the open-source framework is not suitable for all areas of application, as the developers make clear in their list of “worst cases”. This list includes the following scenarios and types of website content that cannot be tested or recorded with Selenium, or only with difficulty:

  • Captchas: Captchas were developed specifically to protect against bots and spam and therefore cannot be automated with Selenium. They should be deactivated during testing or scraping, or temporarily replaced with an alternative element.
  • File downloads: Although you can start downloading files in Selenium instances with a simulated link click, the API does not display the progress of the download process.
  • HTTP status codes: Selenium has certain weaknesses when it comes to handling HTTP status codes. But these disadvantages can be offset by using an additional proxy, if necessary.
  • Login with third-party services: From social media platforms and cloud services to email accounts – signing into third-party providers via a Selenium session is not recommended. On the one hand, the providers of these services offer their own APIs for test purposes. On the other, testing with the framework in these cases is very cumbersome.
  • Performance testing: Selenium WebDriver is not suitable for pure performance tests, since the framework is simply not designed for them.
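For the file download case, one possible workaround is to watch the download folder from the test script instead of relying on the WebDriver API. The sketch below polls a directory until the expected file is present and no temporary download files remain. The helper name wait_for_download is our own, and the temporary suffixes are assumptions based on how Firefox (.part) and Chrome (.crdownload) mark unfinished downloads.

```python
import time
from pathlib import Path

# Assumption: unfinished downloads carry one of these temporary suffixes.
TEMP_SUFFIXES = (".part", ".crdownload")


def wait_for_download(directory, filename, timeout=30.0, poll_interval=0.5):
    """Poll directory until filename exists and no temporary files remain.

    Returns the Path of the finished file or raises TimeoutError.
    """
    directory = Path(directory)
    target = directory / filename
    deadline = time.monotonic() + timeout
    while True:
        in_progress = any(p.suffix in TEMP_SUFFIXES for p in directory.iterdir())
        if target.is_file() and not in_progress:
            return target
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{filename} did not finish downloading within {timeout} seconds"
            )
        time.sleep(poll_interval)
```

A test would trigger the download with a simulated click and then call wait_for_download with the browser’s download folder and the expected file name. This only confirms that the file has arrived; it still does not report progress.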
