How Does Selenium Work with WebDriver: A Guide

Written by Charan Sai Dasagrandhi | Jul 17, 2020 7:26:36 PM

WebDriver is a browser automation framework built on open-source APIs. It accepts commands, sends those commands to a browser, and interacts with the application under test. WebDriver is one of the most widely used automation tools for web applications and supports all major browsers, including Chrome, Firefox, Internet Explorer and Edge. This article explains what Selenium WebDriver is and how it works.

Selenium WebDriver Architecture

From a functional perspective, WebDriver is a public interface. The reference variable (driver) is declared with the interface type, and any object assigned to this variable must be an instance of a class (ChromeDriver or another browser driver class) that implements the interface.

You can instantiate the ChromeDriver class by creating the driver object:

WebDriver driver = new ChromeDriver();
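To make the interface relationship concrete, here is a minimal sketch in plain Java with no Selenium dependency. SketchDriver, SketchChromeDriver and SketchFirefoxDriver are hypothetical stand-ins for WebDriver, ChromeDriver and FirefoxDriver; the real classes launch browsers, while these only record what they were asked to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the WebDriver interface.
interface SketchDriver {
    void get(String url);
    List<String> log();
}

// Hypothetical stand-in for a browser-specific class such as ChromeDriver.
class SketchChromeDriver implements SketchDriver {
    private final List<String> log = new ArrayList<>();
    @Override public void get(String url) { log.add("chrome:get " + url); }
    @Override public List<String> log() { return log; }
}

// A second implementation of the same interface.
class SketchFirefoxDriver implements SketchDriver {
    private final List<String> log = new ArrayList<>();
    @Override public void get(String url) { log.add("firefox:get " + url); }
    @Override public List<String> log() { return log; }
}

class InterfaceSketch {
    // The test logic depends only on the interface type,
    // so any implementing class can be passed in.
    static String openLoginPage(SketchDriver driver) {
        driver.get("https://example.test/login"); // placeholder URL
        return driver.log().get(0);
    }

    public static void main(String[] args) {
        SketchDriver driver = new SketchChromeDriver();
        System.out.println(openLoginPage(driver));
        driver = new SketchFirefoxDriver(); // same variable, different browser class
        System.out.println(openLoginPage(driver));
    }
}
```

Because the variable is typed as the interface, switching browsers only changes the class named after new; the rest of the test code stays the same.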

Figure: Selenium WebDriver Automation Framework

The figure above shows a Selenium Client Library language working with a browser driver and a browser. In this case, ChromeDriver is the browser driver and Chrome is the browser. The architecture and the functions remain the same across all other Client Libraries, browser drivers and browsers.

How Does Selenium Work? 

What Does the Selenium Client Library Do? 

Selenium provides client libraries for multiple programming languages, such as Java, Python, .NET, Ruby and more. These libraries are Selenium's language bindings, and they are what allow Selenium to support multiple languages. Which client library you use depends on the programming language of the test scripts. For example, if your test scripts are written in Java, you must use the Java language bindings to work with the browser driver.

How Does JSON Wire Protocol Work? 

JSON is a lightweight data-interchange format that lets clients and servers exchange requests and responses. JSON Wire Protocol transfers requests between the Client Library and the browser driver's HTTP server through REST APIs.

Sample JSON input format:

Because servers do not understand programming languages, JSON Wire Protocol uses the process of serialization (converting object data to JSON format) and de-serialization (converting JSON format to object). JSON Wire Protocol has REST APIs working over HTTP.
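As an illustration of serialization and de-serialization, the sketch below converts a flat map of command parameters to a JSON object and back using only the Java standard library. In JSON Wire Protocol the navigate command carries a single url parameter; the class name JsonSketch, the helper names and the example URL are assumptions made for this sketch, and the real Selenium bindings use a full JSON library rather than hand-written code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonSketch {
    // Matches "key":"value" pairs, allowing backslash-escaped characters inside the quotes.
    private static final Pattern PAIR = Pattern.compile(
            "\"((?:[^\"\\\\]|\\\\.)*)\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Serialization: flat map of string parameters -> JSON object text.
    static String toJson(Map<String, String> params) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append(',');
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    // De-serialization: JSON object text with string values -> map.
    static Map<String, String> fromJson(String json) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            out.put(unescape(m.group(1)), unescape(m.group(2)));
        }
        return out;
    }

    // Escapes only backslashes and double quotes, which is enough for this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Reverses escape(): a backslash means "take the next character literally".
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                sb.append(s.charAt(++i));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("url", "https://example.test/login"); // placeholder URL
        String json = toJson(params);
        System.out.println(json);
        System.out.println(fromJson(json));
    }
}
```

The sketch handles only flat objects with string values. Nested objects, numbers and arrays would need a real JSON parser.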

For every Selenium command, there is a corresponding REST API endpoint in JSON Wire Protocol. Each command in a Selenium script is sent as an HTTP request, using methods such as GET and POST, to the HTTP server of the browser driver, which passes it on to the browser. The browser interacts with the elements of the application, and the response is sent back to the client through the same channel.
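The command-to-endpoint mapping can be pictured as a lookup table. The sketch below lists a handful of routes as they appear in the JSON Wire Protocol documentation; the class name CommandTableSketch and the command labels used as map keys are assumptions for this illustration, not identifiers from Selenium's source code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CommandTableSketch {
    // Command label -> { HTTP method, path template }.
    static final Map<String, String[]> ROUTES = new LinkedHashMap<>();
    static {
        ROUTES.put("newSession",    new String[] {"POST",   "/session"});
        ROUTES.put("get",           new String[] {"POST",   "/session/:sessionId/url"});
        ROUTES.put("getCurrentUrl", new String[] {"GET",    "/session/:sessionId/url"});
        ROUTES.put("getTitle",      new String[] {"GET",    "/session/:sessionId/title"});
        ROUTES.put("quit",          new String[] {"DELETE", "/session/:sessionId"});
    }

    // Resolves a command label to "METHOD /path" for a given session id.
    static String resolve(String command, String sessionId) {
        String[] route = ROUTES.get(command);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + command);
        }
        return route[0] + " " + route[1].replace(":sessionId", sessionId);
    }

    public static void main(String[] args) {
        // "abc123" is a made-up session id.
        for (String command : ROUTES.keySet()) {
            System.out.println(command + " -> " + resolve(command, "abc123"));
        }
    }
}
```

Note that the same path can serve two commands: a POST to /session/:sessionId/url navigates to a page, while a GET on that path reads the current URL.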

Browser Drivers

Each browser has a specific browser driver. Browser drivers are published by the browser vendors and can be found through the download links on Selenium's official site. Set the path of the driver before the object is created as follows:

System.setProperty("webdriver.chrome.driver", "E:\\chromedriver.exe");

WebDriver driver = new ChromeDriver();

Now, initiate the test or program to start the interaction between the Selenium Client Library and the real browser. The browser driver is the key component here: it receives the HTTP requests sent over JSON Wire Protocol through its HTTP server and sends the processed requests to the real browser. This is where the interaction with elements takes place in any operation.

What's the Process for Real Browsers? 

Testers and developers manually interact with the application elements to perform testing or a transaction to ensure the quality of the applications. The entire process can be automated with the Selenium WebDriver architecture. With this architecture, the browser-specific driver receives commands as HTTP requests on its HTTP server and interacts with the real browser.

Real browsers receive the requests from their browser driver and operate on the application elements according to the request. The browser responds to its browser driver with the operation's output. Then, the browser driver returns the output details to the client through the same channel.
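In JSON Wire Protocol, the output that travels back to the client is a JSON object with sessionId, status and value fields, where a status of 0 means success. The sketch below reads the status from such a response and names a few of the documented codes. The class name ResponseSketch and the sample response strings are assumptions for this illustration, and the regular expression is a shortcut that a real client would replace with a JSON parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResponseSketch {
    private static final Pattern STATUS = Pattern.compile("\"status\"\\s*:\\s*(\\d+)");

    // Pulls the numeric status out of a JSON Wire Protocol style response.
    static int statusOf(String responseJson) {
        Matcher m = STATUS.matcher(responseJson);
        if (!m.find()) {
            throw new IllegalArgumentException("No status field in response");
        }
        return Integer.parseInt(m.group(1));
    }

    // Names a small subset of the status codes listed in the protocol documentation.
    static String describe(int status) {
        switch (status) {
            case 0:  return "Success";
            case 7:  return "NoSuchElement";
            case 10: return "StaleElementReference";
            case 13: return "UnknownError";
            case 21: return "Timeout";
            default: return "Other(" + status + ")";
        }
    }

    public static void main(String[] args) {
        // Hand-written sample responses, not captured from a real driver.
        String ok = "{\"sessionId\":\"abc123\",\"status\":0,\"value\":null}";
        String missing = "{\"sessionId\":\"abc123\",\"status\":7,\"value\":{\"message\":\"no such element\"}}";
        System.out.println(describe(statusOf(ok)));
        System.out.println(describe(statusOf(missing)));
    }
}
```

In the real client libraries, a non-zero status is turned into an exception in the test script, which is how a failure inside the browser surfaces in the test code.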

End-to-End Scenario

Imagine the following setup:

  • The application login page is opened by a test written in Java, the Selenium Client Library language
  • ChromeDriver is the browser driver
  • Chrome is the browser

System.setProperty("webdriver.chrome.driver", "E:\\chromedriver.exe");

WebDriver driver = new ChromeDriver();

driver.get("https://HereYourApplicationURL.com");

In the above scenario, when the test is run, the Selenium Client Library converts the request to open the application's login page to JSON format and hands it to JSON Wire Protocol. The process will continue as follows:

  1. JSON Wire Protocol sends the request to the HTTP server of ChromeDriver through a REST API over HTTP
  2. ChromeDriver sends the received request to the Chrome browser
  3. The Chrome browser opens the application login page at the provided URL
  4. The Chrome browser communicates the response back to ChromeDriver after executing the requested operation on the application
  5. ChromeDriver sends this response back to the client through the same channel
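The round trip in these steps can be imitated with the JDK alone (Java 11 or later). In the sketch below, a small local HTTP server plays the part of ChromeDriver and an HTTP client plays the part of the Selenium Client Library. No real browser is involved: the session id "demo", the class name EndToEndSketch and the example URL are assumptions, and the fake driver only checks the request and answers with a success status.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class EndToEndSketch {
    // Starts a stand-in "browser driver" HTTP server on a free local port.
    static HttpServer startFakeDriver() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/session/demo/url", exchange -> {
            // Steps 1-2: the driver's HTTP server receives the JSON request body.
            String body = new String(exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
            // Step 3 would happen here: a real driver forwards the command to the browser.
            boolean ok = "POST".equals(exchange.getRequestMethod()) && body.contains("\"url\"");
            // Steps 4-5: the result travels back over the same HTTP exchange.
            String reply = "{\"sessionId\":\"demo\",\"status\":" + (ok ? 0 : 13) + ",\"value\":null}";
            byte[] bytes = reply.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(bytes);
            }
        });
        server.start();
        return server;
    }

    // Plays the client library's part: serialize the command and POST it.
    // Assumes the URL contains no quotes or backslashes, so no escaping is done.
    static String sendGet(int port, String url) throws IOException, InterruptedException {
        String json = "{\"url\":\"" + url + "\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:" + port + "/session/demo/url"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        HttpServer driver = startFakeDriver();
        try {
            int port = driver.getAddress().getPort();
            System.out.println(sendGet(port, "https://example.test/login"));
        } finally {
            driver.stop(0); // release the port and let the JVM exit
        }
    }
}
```

With a real ChromeDriver, the client library would first create a session and use the session id returned by the driver in every later request path.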

About The Author

Ajay works as a Senior Test Engineer with V-Soft Digital and has more than six years of software testing experience. His expertise includes agile process methodology, security testing, performance testing, ITIL and ServiceNow. Ajay has hands-on experience in various testing areas, including functional and automation testing (web & mobile apps), web services, GUI and database testing.