What Role Does Web Scraping Play in Fashion E-Commerce?

May 12, 2022

Web data extraction, or web scraping, is a form of data mining that extracts data from websites and converts it into structured data for further analysis. Although plenty of data is accessible online, obtaining data that is authentic, relevant, and current is not easy. Web scraping is the collection technique used to obtain it.

In practice, the procedure involves consolidating the relevant data from a given website and analyzing it.

Online data extraction can be done manually. By automating the procedure, however, businesses can collect more data in much less time.
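
To make the idea of turning a web page into structured data concrete, here is a minimal sketch that uses only Python's standard library. The HTML snippet, the class names, and the `ItemParser` helper are hypothetical and were invented for this illustration; they do not come from any real website.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this illustration only.
SAMPLE_HTML = """
<ul>
  <li class="item" data-id="001"><span class="name">Slim Jeans</span><span class="price">$29.99</span></li>
  <li class="item" data-id="002"><span class="name">Regular Jeans</span><span class="price">$39.99</span></li>
</ul>
"""


class ItemParser(HTMLParser):
    """Collects one dict per <li class="item"> element."""

    def __init__(self):
        super().__init__()
        self.records = []    # structured output, one dict per item
        self._field = None   # class of the <span> currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "item":
            # A new item starts: create its record with the id attribute.
            self.records.append({"id": attrs.get("data-id")})
        elif tag == "span" and self.records:
            # Remember which field the upcoming text belongs to.
            self._field = attrs.get("class")

    def handle_data(self, data):
        if self._field and data.strip():
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


parser = ItemParser()
parser.feed(SAMPLE_HTML)
records = parser.records
print(records)
```

Each list item becomes one dictionary, which is the kind of structured record that can later be loaded into a table or a CSV file.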

Examples of Data Scraping Method Uses

Web scraping is a valuable tool for businesses in diverse areas and with different requirements. The method can be used, for instance, to access industry statistics, generate leads, and conduct market research. Let’s look at a few examples of its commercial uses.

  • Data Science & Data Analytics: Collecting Machine Learning training data and improving enterprise databases.
  • Corporate Communication: Gathering news about the business.
  • Finance: Collecting financial data.
  • Sales & Marketing: Price comparison, SEO, product description searching, lead generation, consumer sentiment monitoring, and website testing.
  • Strategy: Market research.

Study Objective

  • Gather data from a fashion e-commerce website.
  • Apply the web scraping method using the Requests and BeautifulSoup libraries.
  • Gather all data including product name, product id, and product price.
  • Compare pricing for a particular product category.
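
The last objective, comparing prices within a product category, could be sketched as follows once the prices are numeric. The rows and the `summarize_prices` helper are hypothetical and exist only for this illustration; the category names and prices are not real scraped values.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, price in USD) rows, invented for illustration.
rows = [
    ("men_jeans_slim", 29.99),
    ("men_jeans_slim", 39.99),
    ("men_jeans_skinny", 24.99),
    ("men_jeans_skinny", 19.99),
    ("men_jeans_skinny", 30.02),
]


def summarize_prices(rows):
    """Group prices by category and return count, min, max, and mean for each."""
    grouped = defaultdict(list)
    for category, price in rows:
        grouped[category].append(price)
    return {
        category: {
            "count": len(prices),
            "min": min(prices),
            "max": max(prices),
            "mean": round(mean(prices), 2),
        }
        for category, prices in grouped.items()
    }


summary = summarize_prices(rows)
for category, stats in summary.items():
    print(category, stats)
```

The same summary could be produced with a pandas `groupby`; the plain-Python version is used here only to keep the sketch free of dependencies.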

Company Studied

The fashion e-commerce website studied was H&M, a Swedish multinational fashion company present in 74 markets with over 5,000 stores.

H&M's business idea is fashion and quality at the best price in a sustainable way. Along with its key brand, the H&M Group owns the Monki, COS, Weekday, Arket, Afound, & Other Stories, and Sellpy brands.

Code Time

Importing the libraries

from bs4 import BeautifulSoup
import requests
import pandas as pd
from datetime import datetime
import numpy as np

# URL to scrape
# url = 'https://www2.hm.com/en_us/men/products/jeans.html'

# Changed here to fetch all products from all pages in a single request
url = 'https://www2.hm.com/en_us/men/products/jeans.html?sort=stock&image-size=small&image=model&offset=0&page-size=102'
# Identify the request as coming from a browser
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

# Store the response to this request in a variable
page = requests.get(url, headers=headers)

soup = BeautifulSoup(page.text, 'html.parser')

# Get all the information from the site's product listing

products = soup.find('ul', class_='products-listing small')

product_list = products.find_all('article', class_='hm-product-item')

# Quick check: article code of the fourth product
product_list[3].get('data-articlecode')

# product id

product_id = [p.get('data-articlecode') for p in product_list]


# product category

product_category = [p.get('data-category') for p in product_list]


# product name (strip=True removes surrounding whitespace and newlines)

product_list = products.find_all('a', class_='link')
product_name = [p.get_text(strip=True) for p in product_list]

# price

product_list = products.find_all('span', class_='price regular')

product_price = [p.get_text(strip=True) for p in product_list]


# Insert into a pandas DataFrame

data = pd.DataFrame([product_id, product_category, product_name, product_price]).T
data.columns = ['product_id', 'product_category', 'product_name', 'product_price']

# Time at which the scrape was run

data["scrapy_datetime"] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')

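# Remove the $ character from the product_price column
# (regex=False treats '$' as a literal character, not as the regex end-of-string anchor)

data['product_price'] = data['product_price'].str.replace('$', '', regex=False)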
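
The price-cleaning step deserves a closer look, because `$` is also a regular-expression anchor. The sketch below uses only the standard library to show the difference between a regex replacement and a literal one, and then converts price strings to numbers. The `parse_price` helper is hypothetical, and it assumes US-style formatting (comma as thousands separator, dot as decimal separator).

```python
import re
from decimal import Decimal

# As a regex, '$' matches the empty position at the end of the string,
# so a regex-based replacement leaves the dollar sign in place.
regex_result = re.sub("$", "", "$29.99")

# A literal replacement removes the character itself.
literal_result = "$29.99".replace("$", "")


def parse_price(text):
    """Convert a scraped price string into a Decimal.

    Hypothetical helper: assumes US-style formatting and returns None
    when the text contains no number.
    """
    cleaned = text.replace("$", "").replace(",", "").strip()
    match = re.search(r"\d+(?:\.\d+)?", cleaned)
    return Decimal(match.group()) if match else None


print(repr(regex_result), repr(literal_result))
print(parse_price("\n$ 29.99\n"), parse_price("$1,299.00"), parse_price("Sold out"))
```

Using `Decimal` rather than `float` is a design choice for this sketch: it avoids binary rounding surprises when prices are later compared or summed.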

Final Code Results

sample data


Once the data is saved in CSV format, you can easily run an exploratory analysis to identify which products changed price, based on the date and time at which each scrape was executed.
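
As a rough sketch of such an analysis, the example below compares two CSV snapshots and reports the products whose price changed. The snapshot contents, product ids, and dates are hypothetical; the column names mirror the DataFrame built above, and the prices are assumed to have been converted to plain numbers already.

```python
import csv
import io

# Hypothetical snapshots from two scraping runs, invented for illustration.
SNAPSHOT_DAY_1 = """product_id,product_name,product_price,scrapy_datetime
0001,Slim Jeans,29.99,2022-05-10 09:00:00
0002,Regular Jeans,39.99,2022-05-10 09:00:00
0003,Skinny Jeans,24.99,2022-05-10 09:00:00
"""

SNAPSHOT_DAY_2 = """product_id,product_name,product_price,scrapy_datetime
0001,Slim Jeans,29.99,2022-05-11 09:00:00
0002,Regular Jeans,34.99,2022-05-11 09:00:00
0004,Relaxed Jeans,44.99,2022-05-11 09:00:00
"""


def load_prices(csv_text):
    """Map product_id to price for one snapshot."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["product_id"]: float(row["product_price"]) for row in reader}


def price_changes(old, new):
    """Return {product_id: (old_price, new_price)} for products present
    in both snapshots whose price differs."""
    return {
        pid: (old[pid], new[pid])
        for pid in old.keys() & new.keys()
        if old[pid] != new[pid]
    }


changes = price_changes(load_prices(SNAPSHOT_DAY_1), load_prices(SNAPSHOT_DAY_2))
print(changes)
```

Products that appear in only one snapshot are ignored here; listing added or discontinued items would be a natural extension.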

For fashion e-commerce data scraping, contact X-Byte Enterprise Crawling or ask for a free quote!