March 24, 2026

How to Monitor AI Model Deprecations in Real-Time

7 min read

Tom Shaked


LLM providers deprecate models with relatively short notice, and there's no official alert system. Teams find out when their API calls start returning errors, not when the deprecation notice goes live. The window between deprecation announcement and model shutdown is typically 3–6 months, but only if you're actively checking the right documentation pages.

The Problem

Model deprecations are routine. OpenAI deprecated gpt-3.5-turbo-0301, gpt-4-0314, and others with 3–6 months notice. That notice lived on a documentation page, not in your inbox. The cost of missing one is real: production systems break when deprecated models are sunset. Migrations take time. Finding out from an API error is the worst possible discovery method.

Teams need to know as early as possible—ideally the day the deprecation notice goes up, not the day the model gets turned off. A week of warning means a frantic rewrite. Three months of warning means you can plan the migration properly. The difference between those two scenarios is having an automated monitor that you actually trust.

What to Monitor

These are the pages worth tracking:

platform.openai.com/docs/deprecations — OpenAI's official deprecation log with model names, deprecation dates, and recommended replacements.

platform.openai.com/docs/models — The full model catalog with status indicators, pricing, and context window info.

ai.google.dev/gemini-api/docs/models — Gemini model availability and status, including experimental and stable tiers.

x.ai/api — xAI's model listing and pricing.

For each of these, the fields that matter are: model name, deprecation date (or announcement date), shutdown date, and recommended replacement model. The structure differs across providers, which is part of why this problem is hard to solve at scale.
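Those fields can be normalized into one record type regardless of which provider's page they came from. A minimal sketch — the class name, field names, and example values here are illustrative, not any provider's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class DeprecationRecord:
    provider: str                 # e.g. "openai", "google", "xai"
    model: str                    # model identifier as listed on the page
    announced: Optional[str]      # ISO date the deprecation notice appeared
    shutdown: Optional[str]       # ISO date the model stops serving requests
    replacement: Optional[str]    # recommended migration target, if any

# Illustrative record — the dates are placeholders, not real ones
record = DeprecationRecord(
    provider="openai",
    model="gpt-4-0314",
    announced="2025-01-01",
    shutdown="2025-06-01",
    replacement="gpt-4o",
)
```

Normalizing early keeps the diffing and alerting code downstream identical across providers, even though each parser is different.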

Building It with Python

Let's start simple and work toward something production-ready.

Step 1: Fetch the deprecations page

import requests
from bs4 import BeautifulSoup

response = requests.get(
    "https://platform.openai.com/docs/deprecations",
    headers={"User-Agent": "Mozilla/5.0"},
    timeout=30,
)
soup = BeautifulSoup(response.text, "html.parser")
print(soup.get_text()[:500])

This won't work. OpenAI's docs are React-rendered. The deprecation table isn't in the initial HTML response—it's built client-side. You'll get the page shell but not the data.

Step 2: Add JavaScript rendering with Playwright

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://platform.openai.com/docs/deprecations")
    page.wait_for_load_state("networkidle")
    content = page.content()
    browser.close()

print(content[:1000])

This is better. Playwright loads a real browser and waits for the network to settle. But OpenAI sits behind Cloudflare. Headless Chromium often gets blocked. You might get through once or twice, then hit a 403 or a CAPTCHA.

Step 3: Add stealth and extract the data

from playwright.sync_api import sync_playwright
from playwright_stealth import stealth_sync
from bs4 import BeautifulSoup

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    stealth_sync(page)
    page.goto("https://platform.openai.com/docs/deprecations")
    page.wait_for_load_state("networkidle")
    html = page.content()
    browser.close()

soup = BeautifulSoup(html, "html.parser")

# Find the deprecation table — selector is fragile and will break
table = soup.find("table")
if table:
    rows = table.find_all("tr")
    for row in rows:
        cols = row.find_all(["th", "td"])
        print([col.get_text(strip=True) for col in cols])

Stealth helps bypass basic headless detection. You'll get further. But parsing the exact structure is fragile. Selectors change when OpenAI redesigns their docs page. Your table parser breaks silently, and you don't notice until you check the logs two weeks later.
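One cheap defense against the silent-breakage problem is to fail loudly: wrap the table lookup so a missing or empty table raises instead of quietly yielding nothing. A sketch along those lines (`parse_deprecation_table` is a hypothetical helper, not part of any library):

```python
from bs4 import BeautifulSoup

def parse_deprecation_table(html: str) -> list[list[str]]:
    """Parse the first table on the page, failing loudly if it disappears."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        # Raise instead of returning [] so a page redesign surfaces
        # immediately rather than reading as "no deprecations".
        raise RuntimeError("Deprecation table not found — page layout changed?")
    rows = []
    for tr in table.find_all("tr"):
        cells = [c.get_text(strip=True) for c in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    if len(rows) < 2:  # expect a header row plus at least one data row
        raise RuntimeError("Deprecation table unexpectedly empty")
    return rows
```

An exception from the scraper is something your scheduler can catch and escalate; an empty list looks exactly like a healthy run.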

Step 4: Track changes over time

import hashlib
import json
from datetime import datetime

def snapshot_page(content):
    return {
        "timestamp": datetime.now().isoformat(),
        "hash": hashlib.sha256(content.encode()).hexdigest(),
        "content": content
    }

def detect_new_deprecations(current_snapshot, previous_snapshot):
    if current_snapshot["hash"] != previous_snapshot["hash"]:
        print(f"[{datetime.now()}] Deprecation page changed")
        # diff the content to find what's new
        return True
    return False

Hash-based diffing tells you when something changed, but not what changed. A page redesign triggers an alert even if no deprecations were added. You need smarter diff logic—ideally parsing the actual model data and comparing field by field.
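Field-by-field comparison is straightforward once each snapshot is parsed into a dict keyed by model name. A sketch of that diff logic — the shape of the per-model dicts is an assumption, and fields dropped from the current snapshot aren't reported:

```python
def diff_models(previous: dict[str, dict], current: dict[str, dict]) -> dict:
    """Compare two parsed snapshots keyed by model name, field by field."""
    added = {m: current[m] for m in current.keys() - previous.keys()}
    removed = {m: previous[m] for m in previous.keys() - current.keys()}
    changed = {
        m: {k: (previous[m].get(k), v)               # (old value, new value)
            for k, v in current[m].items() if previous[m].get(k) != v}
        for m in current.keys() & previous.keys()
        if current[m] != previous[m]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

With this, "a new model appeared in the deprecation table" and "the page's footer was restyled" are no longer the same event.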

Step 5: Send an alert

import requests

def send_slack_alert(webhook_url, message):
    requests.post(webhook_url, json={"text": message}, timeout=10)

# Example usage
send_slack_alert(
    "https://hooks.slack.com/services/YOUR/WEBHOOK/URL",
    "⚠️ OpenAI deprecations page has changed — check platform.openai.com/docs/deprecations"
)

This works. But you'll get alerts for every page redesign, every typo fix, every CSS change. Real deprecations get buried in the noise.

Where It Gets Hard to Maintain

OpenAI's Cloudflare protection means your Playwright script works until it doesn't. Detection rules update on Cloudflare's schedule, not yours. And when you do get blocked, the failure is silent: no alert fires, and nothing tells you the monitor has stopped seeing the page.

The deprecation page structure changes when OpenAI redesigns their docs. Your CSS selectors break without warning. You need to manually update the selectors every time, which defeats the purpose of automation.

You need to monitor multiple providers—OpenAI, Google, xAI, Anthropic, others. Each has a different page structure, different bot detection, different rendering patterns. One solution doesn't scale.

Scheduling the monitor reliably is its own problem. Cron jobs fail. Servers go down. A silent failure means you miss the deprecation notice you were trying to catch. Now you're building retry logic, error handling, and monitoring for the monitor.
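The retry piece, at least, is small to sketch. A wrapper like this (hypothetical, with backoff parameters picked arbitrarily) turns a flaky fetch into a loud failure instead of a silent miss:

```python
import time

def fetch_with_retry(fetch, max_attempts=4, base_delay=5.0):
    """Call `fetch` with exponential backoff; re-raise after the last
    attempt so the failure is loud instead of silent."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # surface the failure — a silent miss defeats the monitor
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)
```

But retries only cover transient failures. Persistent blocks, broken selectors, and dead cron jobs still need their own detection, which is where the maintenance burden compounds.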

False positives create alert fatigue. A page change that isn't a deprecation triggers a notification. Your team stops trusting the alerts. Real deprecations get ignored because they're mixed in with noise.

Making It Production-Ready

There are two paths: single-page extraction and parallel monitoring across multiple sources.

Single-page extraction

from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

result = nimble.extract(
    url="https://platform.openai.com/docs/deprecations",
    render=True,
    driver="vx10",
    formats=["markdown"]
)

print(result.data.markdown)

This gives you consistent, rendered markdown from the deprecations page. No Cloudflare bypass needed. No selector maintenance. The page is parsed and structured automatically. You get a clean markdown representation every time.

Monitoring all providers in parallel with async

from nimble_python import Nimble
import time
import hashlib
from datetime import datetime

nimble = Nimble(api_key="YOUR_API_KEY")

pages = [
    {"provider": "OpenAI", "url": "https://platform.openai.com/docs/deprecations"},
    {"provider": "OpenAI Models", "url": "https://platform.openai.com/docs/models"},
    {"provider": "Gemini", "url": "https://ai.google.dev/gemini-api/docs/models"},
    {"provider": "xAI", "url": "https://x.ai/api"},
]

# Submit all at once
tasks = []
for page in pages:
    response = nimble.extract_async(
        url=page["url"],
        render=True,
        driver="vx10",
        formats=["markdown"]
    )
    tasks.append({**page, "task_id": response.task_id})
    print(f"Submitted: {page['provider']} → {response.task_id}")

# Collect results and check for changes
previous_hashes = {}  # load from persistent storage in practice

for task in tasks:
    while True:
        status = nimble.tasks.get(task["task_id"])
        if status.task.state == "success":
            result = nimble.tasks.results(task["task_id"])
            content = result.data.markdown
            current_hash = hashlib.sha256(content.encode()).hexdigest()

            if task["url"] in previous_hashes and previous_hashes[task["url"]] != current_hash:
                print(f"[{datetime.now()}] CHANGE DETECTED: {task['provider']} — {task['url']}")
                # send Slack alert, store diff, trigger review workflow

            previous_hashes[task["url"]] = current_hash
            break
        elif status.task.state == "failed":
            print(f"Failed: {task['provider']}")
            break
        time.sleep(2)

This submits all extraction jobs in parallel. Nimble handles the rendering, bot detection, and parsing. You get structured markdown back from each provider. You can diff across time, alert on real changes, and eliminate false positives by parsing the actual data.

For scheduled runs, Nimble's Managed Service handles execution and delivery. No infrastructure to maintain. No cron jobs. No servers. You define the schedule once, and it runs reliably.

Getting Started with Nimble

Install the SDK:

pip install nimble_python

Extract a single page:

from nimble_python import Nimble

nimble = Nimble(api_key="YOUR_API_KEY")

result = nimble.extract(
    url="https://platform.openai.com/docs/deprecations",
    render=True,
    driver="vx10"
)

print(result.data.markdown)

Extract starts at $0.90 per 1,000 URLs. Free trial: 5,000 pages.

Sign up at https://app.nimbleway.com/signup

Continue Exploring

Deprecations are one piece of the picture. These posts cover other LLM data worth tracking and how to collect it.
