How Do Python and BeautifulSoup Analyze Competitor’s Prices on eBay?

Monitoring competitors’ prices is one of the most important applications of web scraping in retail and e-commerce. Done correctly, it can bring in additional revenue while helping the store stay in the game and avoid being caught off guard by what a competitor does.

This can be accomplished with a simple script. We’ll use Python and BeautifulSoup to extract the data, and we’ll keep an eye on eBay prices.

To begin, this is the basic code we need to fetch an eBay search page and set up BeautifulSoup so that we can search the page for useful data using CSS selectors.

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

# Browser-like User-Agent so the request resembles a normal browser visit
headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=iphone &_sacat=0&LH_TitleDesc=0&Model=Apple%20iPhone%208&_sop=12&LH_PrefLoc=0&rt=nc&Storage%20Capacity=64%20GB&_dcat=9355'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

We also pass a User-Agent header so that the request looks like it comes from a browser, which lowers the chance of getting blocked.
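The request above assumes that eBay always answers with a usable page. A minimal sketch of a more defensive fetch is shown here; the function name fetch_html, the optional session argument, and the 10-second timeout are illustrative assumptions, not part of the original script.

```python
import requests

# Same User-Agent string as in the snippet above.
HEADERS = {
    'User-Agent': ('Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) '
                   'AppleWebKit/601.3.9 (KHTML, like Gecko) '
                   'Version/9.0.2 Safari/601.3.9')
}


def fetch_html(url, session=None, timeout=10):
    """Return the page body as bytes, raising on HTTP errors or timeouts."""
    # `session` may be a requests.Session (or any object with a compatible
    # .get method); by default the requests module itself is used.
    client = session or requests
    response = client.get(url, headers=HEADERS, timeout=timeout)
    # raise_for_status() turns 4xx/5xx responses into requests.HTTPError,
    # so a blocked or failed request is not parsed as if it were results.
    response.raise_for_status()
    return response.content
```

The returned bytes can be handed to BeautifulSoup(html, 'lxml') exactly as before.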

Let us take a look at the eBay search results for iPhones, specifically the iPhone 8 with 64GB of storage.

When we inspect the page, we notice that each item’s HTML is contained within an element that has the class s-item.

We can use this class to split the HTML page into cards, each of which holds the information for a single item.

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=iphone &_sacat=0&LH_TitleDesc=0&Model=Apple%20iPhone%208&_sop=12&LH_PrefLoc=0&rt=nc&Storage%20Capacity=64%20GB&_dcat=9355'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Each search result sits in an element with the class s-item
for item in soup.select('.s-item'):
    try:
        print('----------------------------------------')
        print(item.select('.s-item__title')[0].get_text().strip())
        print(item.select('.s-item__subtitle')[0].get_text())
        print(item.select('.s-item__reviews')[0].get_text())
        print(item.select('.clipped')[0].get_text())
        print(item.select('.s-item__reviews-count span')[0].get_text())
        print(item.select('.s-item__price')[0].get_text())
        print(item.select('.s-item__logisticsCost')[0].get_text())
        print(item.select('.s-item__location')[0].get_text())
        print('----------------------------------------')
    except Exception as e:
        # A card missing one of these fields raises IndexError; skip the rest of it
        # raise e
        print('')

Run the script:

python3 PriceTracker.py

On closer inspection, you will see that the code can pick the fields out of each HTML card because the product title always has the s-item__title class and the price always has the s-item__price class. Now, let us fetch the ratings and the number of reviews:

print(item.select('.s-item__title')[0].get_text().strip()) 
print(item.select('.s-item__subtitle')[0].get_text()) 
print(item.select('.s-item__reviews')[0].get_text()) 
print(item.select('.clipped')[0].get_text()) 
print(item.select('.s-item__reviews-count span')[0].get_text()) 
print(item.select('.s-item__price')[0].get_text()) 
print(item.select('.s-item__logisticsCost')[0].get_text()) 
print(item.select('.s-item__location')[0].get_text())

We also fetch the location and the shipping price, so all the necessary data is covered. The complete code looks like this:

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.ebay.com/sch/i.html?_from=R40&_nkw=iphone &_sacat=0&LH_TitleDesc=0&Model=Apple%20iPhone%208&_sop=12&LH_PrefLoc=0&rt=nc&Storage%20Capacity=64%20GB&_dcat=9355'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Walk over every result card and print its fields
for item in soup.select('.s-item'):
    try:
        print('----------------------------------------')
        print(item.select('.s-item__title')[0].get_text().strip())
        print(item.select('.s-item__subtitle')[0].get_text())
        print(item.select('.s-item__reviews')[0].get_text())
        print(item.select('.clipped')[0].get_text())
        print(item.select('.s-item__reviews-count span')[0].get_text())
        print(item.select('.s-item__price')[0].get_text())
        print(item.select('.s-item__logisticsCost')[0].get_text())
        print(item.select('.s-item__location')[0].get_text())
        print('----------------------------------------')
    except Exception as e:
        # A card missing one of these fields raises IndexError; skip the rest of it
        # raise e
        print('')

When you execute the code, it prints the required data for each listing.
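To compare prices rather than just print them, it helps to collect each card as a dictionary and turn the price text into a number. The following is a minimal sketch under stated assumptions: SAMPLE_HTML is hypothetical markup that only mimics the class names used in this article, the helper names are illustrative, and for a price range such as "$X to $Y" the sketch simply keeps the lower amount.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical, heavily simplified markup that only mimics the class names
# used in this article; real eBay pages are larger and change over time.
SAMPLE_HTML = """
<ul>
  <li class="s-item">
    <h3 class="s-item__title">Apple iPhone 8 64GB Unlocked</h3>
    <span class="s-item__price">$189.99</span>
    <span class="s-item__logisticsCost">+$5.00 shipping</span>
  </li>
  <li class="s-item">
    <h3 class="s-item__title">Apple iPhone 8 64GB Space Gray</h3>
    <span class="s-item__price">$1,049.00 to $1,199.00</span>
  </li>
</ul>
"""


def text_or_none(card, selector):
    """Return the stripped text of the first match, or None if absent."""
    node = card.select_one(selector)
    return node.get_text().strip() if node else None


def parse_price(text):
    """Return the first dollar amount in `text` as a float, or None."""
    if not text:
        return None
    match = re.search(r'\$\s*([\d,]+(?:\.\d+)?)', text)
    return float(match.group(1).replace(',', '')) if match else None


def extract_items(html):
    """Turn every .s-item card into a dict; missing fields become None."""
    # html.parser ships with Python; the 'lxml' parser used above works too.
    soup = BeautifulSoup(html, 'html.parser')
    rows = []
    for card in soup.select('.s-item'):
        rows.append({
            'title': text_or_none(card, '.s-item__title'),
            'price': parse_price(text_or_none(card, '.s-item__price')),
            'shipping': text_or_none(card, '.s-item__logisticsCost'),
        })
    return rows


for row in extract_items(SAMPLE_HTML):
    print(row)
```

Because select_one returns None instead of raising an error, a card with a missing field still yields a row, which the try/except version above would cut short.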

In more complicated implementations, you may even need to rotate the User-Agent string so that eBay doesn’t see every request coming from the same browser.

Going one step further, you’ll find that eBay can simply block your IP address, which defeats all of these efforts. This is frustrating, and it is the point at which most web crawling projects fail.
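User-Agent rotation can be sketched in a few lines. The pool below is illustrative: its first entry is the string used earlier in this article, the other two are assumed examples of common desktop browsers, and random_headers is a hypothetical helper name.

```python
import random

# Illustrative pool; extend it with whatever browser strings you need.
USER_AGENTS = [
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 '
    '(KHTML, like Gecko) Version/9.0.2 Safari/601.3.9',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0',
]


def random_headers(rng=random):
    """Build request headers with a User-Agent picked at random."""
    # `rng` can be a seeded random.Random instance for repeatable runs.
    return {'User-Agent': rng.choice(USER_AGENTS)}
```

Each request can then be sent with requests.get(url, headers=random_headers()) instead of a fixed headers dictionary.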

Getting Around IP Blocks

Investing in a private rotating proxy service like Proxies API can often make the difference between a successful, pain-free web scraping operation that gets the work done consistently and one that never really works.

Plus, with the offer of 1000 free API calls, there is almost nothing to lose by trying our rotating proxy and comparing notes. Integration takes a single line, so it is hardly disruptive.

Our Proxies API provides a simple API that can instantly fix all IP Blocking issues.

  • Billions of high-speed rotating proxies located around the globe.
  • Automatic IP rotation.
  • Automatic User-Agent string rotation.
  • Automatic CAPTCHA solving technology.

Several clients have used this simple API to solve the problem of IP blocks.

The whole thing can be accessed through a simple API call from any programming language, as seen below.

curl "https://xbyte.io/?key=API_KEY&url=https://example.com"
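The same call can be prepared from Python. This is only a sketch based on the curl example: the endpoint and the key and url parameters are taken from it, while the helper name build_proxy_url and the choice to percent-encode the target URL are assumptions, so check the service's own documentation before relying on them.

```python
from urllib.parse import urlencode

API_ENDPOINT = 'https://xbyte.io/'  # endpoint as it appears in the curl example


def build_proxy_url(api_key, target_url):
    """Return the API URL carrying the `key` and `url` query parameters."""
    # urlencode percent-escapes the target URL so that its own '?' and '&'
    # characters are not mistaken for parameters of the proxy API.
    # Whether the service expects this encoding is an assumption.
    return API_ENDPOINT + '?' + urlencode({'key': api_key, 'url': target_url})


print(build_proxy_url('API_KEY', 'https://example.com'))
```

The resulting URL can be passed to requests.get in place of the eBay URL used earlier.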