Web Scraping Using Python

Aghera Shyamkumar Bhupendrakumar
2 min read · Jul 26, 2021


What is Web Scraping?

Web scraping is an automated method for extracting large amounts of data from websites. It collects the unstructured data found on web pages and stores it in a structured form.

Web Scraping

How can you extract data from a website?

You can follow these steps to extract data from a website.

  1. Find the URL that you want to scrape
  2. Inspect the page
  3. Find the data you want to extract
  4. Write the code
  5. Run the code and extract the data
  6. Store the data in the required format

So, first find the URL that you want to scrape, inspect the page, and identify which data you want to extract and which division (div element) contains it.
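To make the "find the data and its division" step concrete, here is a minimal sketch that runs Beautiful Soup on a small hard-coded HTML string instead of a live page. The markup, the class names (card, name, price, rating), and the helper extract_rows are all invented for illustration; a real site will use its own class names, which you find by inspecting the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a product listing page.
# The class names are invented for this example.
SAMPLE_HTML = """
<div class="card"><div class="name">Laptop A</div><div class="price">50,000</div><div class="rating">4.5</div></div>
<div class="card"><div class="name">Laptop B</div><div class="price">62,000</div></div>
<div class="card"><div class="name">Laptop C</div><div class="price">45,000</div><div class="rating">4.1</div></div>
"""


def extract_rows(html):
    """Return one dict per card that has a name, a price, and a rating."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="card"):
        name = card.find("div", class_="name")
        price = card.find("div", class_="price")
        rating = card.find("div", class_="rating")
        # find() returns None when nothing matches, so skip incomplete cards
        if name is None or price is None or rating is None:
            continue
        rows.append({
            "name": name.get_text(strip=True),
            "price": price.get_text(strip=True),
            "rating": rating.get_text(strip=True),
        })
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

In this sample, the second card has no rating, so it is skipped and only two rows come back. Skipping the whole card keeps every row complete, which matters later when the rows become columns of a table.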

I used the following libraries for data scraping.

  • Selenium for driving the Chrome browser (through ChromeDriver)
  • Beautiful Soup for parsing the HTML and extracting the data
  • Pandas for data manipulation

For this practical, I scraped data from the Flipkart website.

The following is the code that I wrote and executed to scrape the data.

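from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
# Path to the ChromeDriver executable. In Selenium 4, pass it instead as
# webdriver.Chrome(service=Service(path)), with Service imported from
# selenium.webdriver.chrome.service.
driver = webdriver.Chrome("C:/Users/HP/Downloads/chromedriver")
products = []
prices = []
ratings = []
specifications = []
# Load the laptop listing page and grab the rendered HTML
driver.get("https://www.flipkart.com/laptops/pr?sid=6bo%2Cb5g&marketplace=FLIPKART&p%5B%5D=facets.price_range.from%3D40000&p%5B%5D=facets.price_range.to%3DMax")
content = driver.page_source
driver.quit()
soup = BeautifulSoup(content, "html.parser")
# Each product card sits in a div with these classes
for element in soup.find_all('div', attrs={'class': '_1AtVbE col-12-12'}):
    name = element.find('div', attrs={'class': '_4rR01T'})
    price = element.find('div', attrs={'class': '_30jeq3 _1_WHN1'})
    rating = element.find('div', attrs={'class': '_3LWZlK'})
    specification = element.find('div', attrs={'class': 'fMghEO'})
    # find() returns None for a missing field; skip such cards entirely
    # so that the four lists stay the same length
    if name is None or price is None or rating is None or specification is None:
        continue
    products.append(name.text)
    prices.append(price.text)
    ratings.append(rating.text)
    specifications.append(specification.text)
# Build a table from the collected columns and save it as CSV
df = pd.DataFrame({'ProductName': products, 'Price': prices, 'Rating': ratings, 'Specifications': specifications})
df.to_csv('products.csv', index=False, encoding='utf-8')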

After executing this code, I got results like the ones below.

Results of web scraping

We can store this data in a structured manner, for example in a CSV file, and then use it efficiently.
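As a rough illustration of what "using the data" can look like once it is in CSV form, the sketch below writes a few rows to CSV text and reads them back with typed values. It uses Python's standard-library csv module rather than pandas so that it runs without extra installs. The rows, the price format with thousands separators, and the helpers to_csv_text and load_products are assumptions made for this example, not output from the scraper.

```python
import csv
import io

# Hypothetical scraped rows; the values are invented for illustration.
rows = [
    {"ProductName": "Laptop A", "Price": "50,000", "Rating": "4.5"},
    {"ProductName": "Laptop C", "Price": "45,000", "Rating": "4.1"},
]


def to_csv_text(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["ProductName", "Price", "Rating"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def load_products(csv_text):
    """Parse CSV text back into dicts with numeric price and rating."""
    products = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        products.append({
            "name": record["ProductName"],
            # Scraped prices are text; drop the thousands separators first
            "price": int(record["Price"].replace(",", "")),
            "rating": float(record["Rating"]),
        })
    return products


csv_text = to_csv_text(rows)
products = load_products(csv_text)
cheapest = min(products, key=lambda p: p["price"])
print(cheapest["name"], cheapest["price"])
```

Scraped values arrive as strings, so converting price and rating to numbers is what makes sorting, filtering, and comparisons possible. The same cleaning step applies if you load the file with pandas instead.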
