How to scrape data and automate things using selenium.

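Selenium: The selenium package provides the Python language bindings for Selenium WebDriver and is used to automate web browser interaction from Python. Older releases supported Python 2.7 and 3.5+; current Selenium 4 releases run on Python 3 only, so check the package page for the exact minimum version.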

Selenium requires a driver to interface with the chosen browser. The Selenium server is a separate Java program, and it is only needed when you drive a browser on a remote machine. Java Runtime Environment (JRE) 1.6 or a newer version is recommended to run the Selenium server.

To install selenium: pip install selenium  

The selenium.webdriver module provides all the WebDriver implementations. Supported implementations include Firefox, Chrome, Edge, Safari, IE, and Remote. The Keys class provides the special keys of the keyboard, such as RETURN, F1, and ALT.

What is a web driver? 

WebDriver drives a browser natively, as a user would, either locally or on a remote machine through the Selenium server, which marks a leap forward in browser automation. The name Selenium WebDriver refers both to the language bindings and to the implementations of the individual browser-controlling code; the whole is commonly called just WebDriver. It is designed as a simple, concise programming interface with a compact object-oriented API.

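Note: After installing selenium, you also need a driver for your browser, such as chromedriver for Chrome or geckodriver for Firefox. Selenium 4.6 and later ship with Selenium Manager, which can download a matching driver automatically.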

There are various strategies to locate elements on a page. You can use the most appropriate one for your case. Selenium provides the following methods to locate elements on a page:

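  • find_element_by_id
  • find_element_by_name
  • find_element_by_xpath
  • find_element_by_class_name
  • find_element_by_css_selector

In Selenium 4 these helper methods were deprecated and later removed (from version 4.3). The same strategies are now written as find_element(By.ID, value), find_element(By.NAME, value), find_element(By.XPATH, value), find_element(By.CLASS_NAME, value), and find_element(By.CSS_SELECTOR, value).

Locating by XPath: XPath is the language used for locating nodes in an XML document. As HTML can be an implementation of XML (XHTML), Selenium users can leverage this powerful language to target elements in their web applications. XPath supports the simple methods of locating by id or name attributes and extends them with new possibilities, such as locating the third checkbox on the page.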
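
As a rough sketch of the locator strategies above, the snippet below wraps a lookup in a small helper. It assumes the Selenium 4 style, where a lookup is written find_element(By.<STRATEGY>, value). The helper name find_by, the URL, and the id, name, and CSS values are made up for illustration; the XPath is one way to express "the third checkbox on the page".

```python
# XPath for "the third checkbox on the page", counted in document order.
THIRD_CHECKBOX_XPATH = "(//input[@type='checkbox'])[3]"


def find_by(driver, locator):
    """Return the first element matching a (strategy, value) locator pair."""
    strategy, value = locator
    return driver.find_element(strategy, value)


def main():
    # Selenium is imported inside main() so that find_by can also be tried
    # against any object that offers find_element (for example a test stub).
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get("https://example.com/form")  # hypothetical page
        heading = find_by(driver, (By.ID, "title"))            # hypothetical id
        email = find_by(driver, (By.NAME, "email"))            # hypothetical name
        nav_link = find_by(driver, (By.CSS_SELECTOR, "a.nav"))  # hypothetical selector
        checkbox = find_by(driver, (By.XPATH, THIRD_CHECKBOX_XPATH))
        print(heading.text, email.tag_name, nav_link.text, checkbox.is_selected())
    finally:
        driver.quit()  # always close the browser, even after an error
```

Calling main() opens a real browser. If a locator matches nothing, find_element raises NoSuchElementException, so the hypothetical values need to be replaced with ones that exist on your page.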

How to navigate links using selenium?
First, we need to import the web driver and create a web driver object. The usual way to navigate to a link is to call the get method on that object.

Syntax: driver.get(url)
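
A minimal sketch of that flow, assuming Chrome is installed and a matching driver is available. The helper name page_title and the URL are only examples.

```python
def page_title(driver, url):
    """Load url in the browser controlled by driver and return the page title."""
    driver.get(url)      # by default, returns once the page has finished loading
    return driver.title  # in Python, title is a property, not a method call


def main():
    # Imported here so page_title can also be tried without a browser installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # create the web driver object
    try:
        print(page_title(driver, "https://www.python.org"))
    finally:
        driver.quit()  # close the browser and stop the driver
```

Calling main() starts a browser window, loads the page, prints its title, and closes the browser again.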


How to scrape data from YouTube?


How to find XPath of elements?

  1. Right-click the element you are interested in on the web page.
  2. Click on the Inspect option.
  3. In the developer tools, right-click the highlighted HTML code for that element.
  4. Select the Copy option, then select the Copy XPath option.

How to automate a website using selenium?

send_keys( ):- The send_keys method types text into an element, most often an input field of a form, but also any other element that can receive keyboard input. The text is added to whatever the field already contains, so call clear( ) first if you want to replace the existing value.

send_keys(Keys.ENTER):- Passing Keys.ENTER presses the Enter key, for example to submit a form.
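
The two calls can be combined as in the sketch below. The helper name fill_field is made up, and the example assumes that the search box on python.org is still named "q"; adjust the locator for your own page.

```python
def fill_field(field, text, *special_keys):
    """Replace the content of an input element with text, then press any extra keys."""
    field.clear()                         # send_keys appends, so empty the field first
    field.send_keys(text, *special_keys)  # send_keys accepts several values in one call


def main():
    # Imported here so fill_field can also be tried against a stand-in object.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get("https://www.python.org")
        search_box = driver.find_element(By.NAME, "q")  # assumed field name
        fill_field(search_box, "pycon", Keys.ENTER)      # type the query and press Enter
        print(driver.current_url)
    finally:
        driver.quit()
```

Calling main() types the query into the search box and submits it with the Enter key.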



Important selenium methods:-
  1. Browser Methods
  2. WebElement Methods
  3. Navigation Methods
  4. Wait Methods
  5. Switch Methods

The names below are the Python ones. Where the Java bindings use a different name, it is given in parentheses.

  • Browser Methods: Group of methods that perform actions on the browser.
      1. close( ): Closes the current active window.
      2. get(url): Loads a new web page.
      3. current_url (getCurrentUrl( ) in Java): A string holding the URL of the current page.
      4. page_source (getPageSource( ) in Java): The complete page source.
      5. title (getTitle( ) in Java): The current page title.
      6. quit( ): Stops the driver and closes every window associated with it.
  • WebElement Methods: Group of methods that perform actions on web elements. A web element is often just called an element.
      1. find_element( ) (findElement( ) in Java): An important method, because we have to find the WebElement first before performing any action on it.
      2. click( ): Clicks the element.
      3. text (getText( ) in Java): The visible text of the element.
      4. get_attribute(name) (getAttribute( ) in Java): Returns the attribute's current value, or None if there isn't a value.
      5. is_displayed( ) (isDisplayed( ) in Java): Returns a boolean telling whether the element is displayed.
      6. is_enabled( ) (isEnabled( ) in Java): Returns True if the element is enabled and False if it is disabled.
  • Navigation Methods: Group of methods used for navigation.
      1. refresh( ) (navigate( ).refresh( ) in Java): Refreshes the current page, thereby reloading all WebElements.
      2. back( ) (navigate( ).back( ) in Java): Moves back a single page in the browser's history.
      3. forward( ) (navigate( ).forward( ) in Java): Moves forward one page in the browser's history.
  • Wait Methods:
      1. implicitly_wait(seconds) (implicit wait): Tells WebDriver how long to keep looking for an element that is not immediately present before giving up.
      2. WebDriverWait(driver, timeout).until(condition) (explicit wait): The custom one. It is used when execution should pause until some condition is met or the timeout expires.
      3. FluentWait: A Java class that adds a custom polling interval and a list of exceptions to ignore. In Python the same options are the poll_frequency and ignored_exceptions arguments of WebDriverWait.
  • Switch Methods: These methods are used for switching between frames, windows, and alerts.
      1. switch_to.frame(id/name) (formerly switch_to_frame): Identifies a frame by its id or name and switches the focus to that frame.
      2. switch_to.alert (formerly switch_to_alert): Switches the focus to the alert.
      3. switch_to.window(handle) (formerly switch_to_window): Switches between windows, using a handle taken from window_handles.
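
A rough Python sketch of a few of the methods above, assuming Selenium 4 naming (current_url, back( ), switch_to.window, WebDriverWait). The helper names and URLs are made up for illustration, and only an explicit wait is shown, since mixing implicit and explicit waits is generally discouraged.

```python
def page_info(driver):
    """Collect values that the Python bindings expose as properties."""
    return {
        "url": driver.current_url,
        "title": driver.title,
        "source_length": len(driver.page_source),
    }


def back_then_forward(driver):
    """Step back one page in history, then forward again; return both URLs."""
    driver.back()
    previous = driver.current_url
    driver.forward()
    return previous, driver.current_url


def switch_to_other_window(driver):
    """Move focus to the first window that is not the current one."""
    current = driver.current_window_handle
    for handle in driver.window_handles:
        if handle != current:
            driver.switch_to.window(handle)
            return handle
    return None  # only one window is open


def wait_for(driver, locator, timeout=10):
    """Explicit wait: return the element once it is present in the page.

    Raises TimeoutException if it does not appear within timeout seconds.
    """
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(locator)
    )


def main():
    # Imported here so the helpers above can be tried against a stand-in object.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumes Chrome and a matching driver are available
    try:
        driver.get("https://www.python.org")
        heading = wait_for(driver, (By.TAG_NAME, "h1"))
        print(page_info(driver), heading.is_displayed())
        driver.get("https://www.python.org/about/")
        print(back_then_forward(driver))
    finally:
        driver.quit()
```

Calling main() runs the helpers against a real browser. switch_to_other_window is not called there because the example opens only one window.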


Thank you 😊 for reading. Please read other blogs, and share with your friends and family. Also if you have any queries, please comment.

