Goal of the mini-project
The aim here is to verify that Product schema (JSON-LD) is implemented correctly on example.co.uk after the migration to Adobe Commerce (Magento).
The script crawls your chosen product URLs and reports whether required fields like price, brand, sku, and availability are present.
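For reference, a healthy product page carries a block like this in its HTML. This is a minimal illustrative example (not markup from the live site) showing the kind of Product JSON-LD the script looks for:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Product",
  "sku": "EX-001",
  "brand": { "@type": "Brand", "name": "ExampleBrand" },
  "offers": {
    "@type": "Offer",
    "price": "49.99",
    "priceCurrency": "GBP",
    "availability": "https://schema.org/InStock"
  }
}
</script>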
Step 1 – Open a terminal
Click the black terminal icon on the Pi desktop.
Step 2 – Check Python 3
python3 --version
You should see something like Python 3.9.2 (any 3.7+ is fine).
Step 3 – Install libraries
sudo apt update
pip3 install requests beautifulsoup4
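To confirm both libraries installed correctly, you can run this one-liner (it just imports them and prints a message):
python3 -c "import requests, bs4; print('libraries OK')"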
Step 4 – Create a working folder
mkdir ~/schema_check
cd ~/schema_check
Step 5 – Create the script file
nano check_schema.py
Then paste this entire script:
import requests, json, csv, time
from bs4 import BeautifulSoup

# ---------- configuration ----------
# Put your product URLs here (you can add as many as you like)
urls = [
    "https://www.example.co.uk/example-product-1",
    "https://www.example.co.uk/example-product-2"
]

# Fields you want to confirm exist in the Product schema
required_fields = ["name", "brand", "sku", "price", "priceCurrency", "availability"]

# Optional delay between requests (seconds)
delay = 2

# Some servers block the default python-requests user agent,
# so send a browser-like one
headers = {"User-Agent": "Mozilla/5.0 (X11; Linux armv7l) schema-check"}

# ---------- functions ----------
def extract_product_schema(url):
    """Return the first Product JSON-LD object found on the page, or None."""
    try:
        r = requests.get(url, headers=headers, timeout=15)
        soup = BeautifulSoup(r.text, "html.parser")
        for tag in soup.find_all("script", type="application/ld+json"):
            try:
                data = json.loads(tag.string)
                # Some pages wrap several schema objects in a list
                if isinstance(data, list):
                    for item in data:
                        if item.get("@type") == "Product":
                            return item
                elif data.get("@type") == "Product":
                    return data
            except Exception:
                continue  # skip script tags that aren't valid JSON
    except Exception as e:
        print(f"Error fetching {url}: {e}")
    return None

def check_fields(product_json):
    # Look for each field as a quoted JSON key, so that e.g. "price"
    # isn't wrongly counted as present just because "priceCurrency" is
    found = json.dumps(product_json)
    return [f for f in required_fields if f'"{f}"' not in found]

# ---------- main ----------
results = []
for u in urls:
    print(f"Checking {u} ...")
    product = extract_product_schema(u)
    if not product:
        print(f"❌ No Product schema found: {u}")
        results.append([u, "No Product schema", ""])
    else:
        missing = check_fields(product)
        if missing:
            print(f"⚠ Missing: {', '.join(missing)}")
            results.append([u, "Missing fields", ", ".join(missing)])
        else:
            print("✅ All key fields present")
            results.append([u, "All fields present", ""])
    time.sleep(delay)

# ---------- save to CSV ----------
with open("schema_results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["URL", "Status", "Missing Fields"])
    writer.writerows(results)

print("\nDone! Results saved to schema_results.csv")
Save and exit:
- Ctrl + O, Enter → save
- Ctrl + X → exit
Step 6 – Edit your URLs
Later, open the script again (nano check_schema.py) and replace the two example links with your 10–50 product URLs.
Each URL must be inside quotes and separated by commas.
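For instance, the list in the script might look like this (using the sample product pages from the example output below):

urls = [
    "https://www.example.co.uk/football-goal.html",
    "https://www.example.co.uk/tennis-net.html",
    "https://www.example.co.uk/baseball-bat.html"
]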
Step 7 – Run the script
python3 check_schema.py
It will:
- Fetch each page
- Extract the Product JSON-LD
- Report any missing fields
- Save a summary to schema_results.csv in the same folder
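Terminal output will look something like this (illustrative, based on the script's print statements):

Checking https://www.example.co.uk/example-product-1 ...
✅ All key fields present
Checking https://www.example.co.uk/example-product-2 ...
⚠ Missing: priceCurrency

Done! Results saved to schema_results.csv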
Step 8 – View the results
cat schema_results.csv
or open the file in LibreOffice Calc / Excel.
Example output:
URL,Status,Missing Fields
https://www.example.co.uk/football-goal.html,All fields present,
https://www.example.co.uk/tennis-net.html,Missing fields,"priceCurrency, availability"
https://www.example.co.uk/baseball-bat.html,No Product schema,
Optional tweaks
- Increase delay = 2 to delay = 5 if you test hundreds of URLs (avoids rate limits).
- You can import hundreds of URLs from a CSV by editing the script; a minimal sketch follows this list.
- Re-run anytime to confirm schema fixes.
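Here is that CSV sketch. It assumes a file called urls.csv in the same folder with one product URL per row, first column (the filename and layout are assumptions, not part of the original script). Replace the urls = [...] block with:

# Assumed: urls.csv sits next to the script, one URL per row, first column
with open("urls.csv", newline="") as f:
    urls = [row[0].strip() for row in csv.reader(f) if row and row[0].strip()]

The script already imports csv, so nothing else needs to change.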
Quick recap
1. Open terminal (click icon)
2. Check Python: python3 --version
3. Install deps: pip3 install requests beautifulsoup4
4. Make folder: mkdir ~/schema_check && cd ~/schema_check
5. Create script: nano check_schema.py
6. Edit URLs inside the script
7. Run it: python3 check_schema.py
8. View results: cat schema_results.csv
That’s it. Job done.
You’ve now got a simple tool that checks your product schema in seconds. No fancy platforms. No monthly fees. Just a Raspberry Pi doing proper work.
Run it whenever you push changes. Catch broken schema before Google does. Keep your rich results intact.
The script sits there, ready to go. Update your URLs. Hit run. Get answers.
This is what proper validation looks like – fast, local, and under your control.
Next steps?
- Bookmark this guide for when you migrate sites
- Test 50 products now, then spot-check monthly
- If you need the CSV import version, you know where I am
Your structured data matters. Now you can actually prove it’s working.
Go check your products. Then sleep better knowing your schema’s solid.
Questions? Issues? The comments are open.