# Scraper Proxy Configuration System

## Overview

This document describes the design and implementation plan for a centralized scraper configuration system that provides:

1. **Per-scraper proxy settings** - Configure different proxies for different scrapers
2. **Centralized cookie management** - Store cookies in the database instead of in files
3. **FlareSolverr integration** - Test connections and refresh Cloudflare cookies
4. **Cookie upload support** - Upload cookies from browser extensions for authenticated access
5. **Unified Settings UI** - A single place to manage all scraper configurations

|
## Background

### Problem Statement

- Proxy settings are not configurable per module
- Cookies are stored in scattered JSON files
- There is no UI to test FlareSolverr connections or manage cookies
- Adding new forums requires code changes
- There is no visibility into cookie freshness or scraper health

### Solution

A new `scrapers` database table that:

- Stores configuration for all automated scrapers
- Provides per-scraper proxy settings
- Centralizes cookie storage with merge logic
- Syncs automatically with platform configurations
- Exposes management via the Settings UI

---
## Database Schema

### Table: `scrapers`

```sql
CREATE TABLE scrapers (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    type TEXT NOT NULL,              -- 'direct', 'proxy', 'forum', 'cli_tool'
    module TEXT,                     -- Python module name, NULL for cli_tool
    base_url TEXT,                   -- Primary URL for the scraper
    target_platform TEXT,            -- 'instagram', 'snapchat', 'tiktok', NULL for forums/cli
    enabled INTEGER DEFAULT 1,       -- Enable/disable scraper

    -- Proxy settings
    proxy_enabled INTEGER DEFAULT 0,
    proxy_url TEXT,                  -- e.g., "socks5://user:pass@host:port"

    -- Cloudflare/cookie settings
    flaresolverr_required INTEGER DEFAULT 0,
    cookies_json TEXT,               -- JSON blob of cookies
    cookies_updated_at TEXT,         -- ISO timestamp of last cookie update

    -- Test status
    last_test_at TEXT,               -- ISO timestamp of last test
    last_test_status TEXT,           -- 'success', 'failed', 'timeout'
    last_test_message TEXT,          -- Error message if failed

    -- Module-specific settings
    settings_json TEXT,              -- Additional JSON settings per scraper

    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
```

### Column Definitions

| Column | Type | Description |
|--------|------|-------------|
| `id` | TEXT | Unique identifier (e.g., 'imginn', 'forum_phun') |
| `name` | TEXT | Display name shown in the UI |
| `type` | TEXT | One of: 'direct', 'proxy', 'forum', 'cli_tool' |
| `module` | TEXT | Python module name (e.g., 'imginn_module'), NULL for CLI tools |
| `base_url` | TEXT | Primary URL for the service |
| `target_platform` | TEXT | Platform this scraper downloads from (instagram, snapchat, tiktok, NULL) |
| `enabled` | INTEGER | 1=enabled, 0=disabled |
| `proxy_enabled` | INTEGER | 1=use proxy, 0=direct connection |
| `proxy_url` | TEXT | Proxy URL (http, https, socks5 supported) |
| `flaresolverr_required` | INTEGER | 1=needs FlareSolverr for Cloudflare bypass |
| `cookies_json` | TEXT | JSON array of cookie objects |
| `cookies_updated_at` | TEXT | When cookies were last updated |
| `last_test_at` | TEXT | When the connection was last tested |
| `last_test_status` | TEXT | Result of the last test: 'success', 'failed', 'timeout' |
| `last_test_message` | TEXT | Error message from the last failed test |
| `settings_json` | TEXT | Module-specific settings as JSON |

### Scraper Types

| Type | Description | Examples |
|------|-------------|----------|
| `direct` | Downloads directly from the platform | instagram, tiktok, snapchat, coppermine |
| `proxy` | Uses a proxy service to download | imginn, fastdl, toolzu |
| `forum` | Forum scraper | forum_phun, forum_hqcelebcorner, forum_picturepub |
| `cli_tool` | Command-line tool wrapper | ytdlp, gallerydl |

### Target Platforms

The `target_platform` field indicates which platform the scraper actually downloads content from:

| Scraper | Target Platform | Notes |
|---------|-----------------|-------|
| imginn | instagram | Proxy service for Instagram |
| fastdl | instagram | Proxy service for Instagram |
| toolzu | instagram | Proxy service for Instagram |
| snapchat | snapchat | Direct via Playwright scraper |
| instagram | instagram | Direct via Instaloader |
| tiktok | tiktok | Direct via yt-dlp internally |
| coppermine | NULL | Not a social platform |
| forum_* | NULL | Not a social platform |
| ytdlp | NULL | Generic tool, multiple platforms |
| gallerydl | NULL | Generic tool, multiple platforms |

---

## Seed Data

Initial scrapers to populate on first run:

| id | name | type | module | base_url | target_platform | flaresolverr_required |
|----|------|------|--------|----------|-----------------|----------------------|
| imginn | Imginn | proxy | imginn_module | https://imginn.com | instagram | 1 |
| fastdl | FastDL | proxy | fastdl_module | https://fastdl.app | instagram | 1 |
| toolzu | Toolzu | proxy | toolzu_module | https://toolzu.com | instagram | 1 |
| snapchat | Snapchat Direct | direct | snapchat_scraper | https://snapchat.com | snapchat | 0 |
| instagram | Instagram (Direct) | direct | instaloader_module | https://instagram.com | instagram | 0 |
| tiktok | TikTok | direct | tiktok_module | https://tiktok.com | tiktok | 0 |
| coppermine | Coppermine | direct | coppermine_module | https://hqdiesel.net | NULL | 1 |
| forum_phun | Phun.org | forum | forum_downloader | https://forum.phun.org | NULL | 1 |
| forum_hqcelebcorner | HQCelebCorner | forum | forum_downloader | https://hqcelebcorner.com | NULL | 0 |
| forum_picturepub | PicturePub | forum | forum_downloader | https://picturepub.net | NULL | 0 |
| ytdlp | yt-dlp | cli_tool | NULL | NULL | NULL | 0 |
| gallerydl | gallery-dl | cli_tool | NULL | NULL | NULL | 0 |

### Notes on Seed Data

1. **Snapchat**: Uses the direct Playwright-based scraper with optional proxy support (configured per scraper in the Scrapers settings page)

2. **Forums**: Derived from existing `forum_threads` table entries and cookie files

3. **Excluded scrapers**: YouTube and Bilibili are NOT included - they are on-demand downloaders used from the Video Downloader page, not scheduled scrapers

---

## Auto-Sync Logic

The scrapers table stays in sync with platform configurations automatically:

### When Forums Change

- New forum added in Forums settings → create a scraper entry with `type='forum'`
- Forum removed from settings → remove the scraper entry

### When Modules Are Enabled/Disabled

- Module enabled → ensure a scraper entry exists
- Module disabled → the scraper entry remains, but with `enabled=0`

### No Manual Add/Delete

- The Scrapers UI does NOT have Add or Delete buttons
- Scrapers are managed through their respective platform configuration pages
- The Scrapers UI only manages proxy settings, testing, and cookies
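
The forum-sync rule above can be sketched as a small reconciliation function. This is a sketch, not the implementation: the `forums` dict shape and the function name are assumptions, and only the columns from the schema in this document are used.

```python
import sqlite3

def sync_forum_scrapers(conn: sqlite3.Connection, forums: dict) -> None:
    """Reconcile scraper rows of type='forum' with the configured forums.

    `forums` maps a forum id to {'name': ..., 'base_url': ...} (assumed shape).
    """
    cur = conn.cursor()
    # Add a scraper row for any newly configured forum
    for forum_id, cfg in forums.items():
        cur.execute(
            "INSERT OR IGNORE INTO scrapers (id, name, type, module, base_url) "
            "VALUES (?, ?, 'forum', 'forum_downloader', ?)",
            (forum_id, cfg['name'], cfg['base_url']),
        )
    # Remove scraper rows whose forum was deleted from settings
    cur.execute("SELECT id FROM scrapers WHERE type = 'forum'")
    for (scraper_id,) in cur.fetchall():
        if scraper_id not in forums:
            cur.execute("DELETE FROM scrapers WHERE id = ?", (scraper_id,))
    conn.commit()
```

`INSERT OR IGNORE` keeps existing rows (and their cookies/proxy settings) intact when the same forum is synced again.
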
---

## Cookie Management

### Storage Format

Cookies are stored as JSON in the `cookies_json` column:

```json
{
  "cookies": [
    {
      "name": "cf_clearance",
      "value": "abc123...",
      "domain": ".imginn.com",
      "path": "/",
      "expiry": 1735689600
    },
    {
      "name": "session_id",
      "value": "xyz789...",
      "domain": "imginn.com",
      "path": "/",
      "expiry": -1
    }
  ],
  "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36..."
}
```

### Cookie Merge Logic

**CRITICAL**: When updating cookies, MERGE with the existing set - never wipe it:

```python
def merge_cookies(existing_cookies: list, new_cookies: list) -> list:
    """
    Merge new cookies into existing, preserving non-updated cookies.

    This ensures:
    - Cloudflare cookies (cf_clearance, __cf_bm) get refreshed
    - Site session/auth cookies are preserved
    - No data loss on test/refresh
    """
    # Index existing by name
    cookie_map = {c['name']: c for c in existing_cookies}

    # Update/add from new cookies
    for cookie in new_cookies:
        cookie_map[cookie['name']] = cookie

    return list(cookie_map.values())
```
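
For illustration, refreshing Cloudflare cookies through `merge_cookies` replaces only the refreshed entries and leaves the site session untouched (the function is repeated here so the example is self-contained):

```python
def merge_cookies(existing_cookies: list, new_cookies: list) -> list:
    # Same logic as above: index by name, overwrite or add, keep the rest
    cookie_map = {c['name']: c for c in existing_cookies}
    for cookie in new_cookies:
        cookie_map[cookie['name']] = cookie
    return list(cookie_map.values())

existing = [
    {'name': 'cf_clearance', 'value': 'old'},
    {'name': 'session_id', 'value': 'keep-me'},
]
refreshed = [{'name': 'cf_clearance', 'value': 'new'}]

# cf_clearance is refreshed, session_id survives
merged = {c['name']: c['value'] for c in merge_cookies(existing, refreshed)}
```
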

### Cookie Sources

1. **FlareSolverr** - Automated Cloudflare bypass; returns CF cookies
2. **Upload** - User uploads JSON from a browser extension (EditThisCookie, Cookie-Editor)
3. **Module** - Some modules save cookies during operation
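
Uploaded exports do not always match the storage format above; Cookie-Editor, for example, emits an `expirationDate` field rather than `expiry`. A normalization sketch for the upload path - the export field names handled here are assumptions about the extension formats:

```python
def normalize_uploaded_cookie(raw: dict) -> dict:
    """Map a browser-extension cookie export to the internal storage shape.

    Assumes exports carry name/value/domain/path plus either 'expiry' or
    'expirationDate'; session cookies are stored with expiry == -1.
    """
    expiry = raw.get('expiry', raw.get('expirationDate', -1))
    return {
        'name': raw['name'],
        'value': raw['value'],
        'domain': raw.get('domain', ''),
        'path': raw.get('path', '/'),
        'expiry': int(expiry) if expiry != -1 else -1,
    }
```
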

### Cookie File Migration

Existing cookie files to migrate on first run:

| File | Scraper ID |
|------|------------|
| `cookies/coppermine_cookies.json` | coppermine |
| `cookies/imginn_cookies.json` | imginn |
| `cookies/fastdl_cookies.json` | fastdl |
| `cookies/snapchat_cookies.json` | snapchat |
| `cookies/forum_cookies_phun.org.json` | forum_phun |
| `cookies/forum_cookies_HQCelebCorner.json` | forum_hqcelebcorner |
| `cookies/forum_cookies_PicturePub.json` | forum_picturepub |

---

## Proxy Configuration

### Supported Proxy Formats

```
http://host:port
http://user:pass@host:port
https://host:port
https://user:pass@host:port
socks5://host:port
socks5://user:pass@host:port
```
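
Downstream clients want the proxy in different shapes (a `requests`-style proxies dict vs. Playwright's `proxy` option, which splits credentials out of the server URL), so one stored `proxy_url` can be adapted per client. A sketch using only standard-library parsing:

```python
from urllib.parse import urlsplit

def requests_proxies(proxy_url: str) -> dict:
    """Shape used by requests: the same proxy URL for both schemes."""
    return {'http': proxy_url, 'https': proxy_url}

def playwright_proxy(proxy_url: str) -> dict:
    """Playwright expects credentials separated from the server URL."""
    parts = urlsplit(proxy_url)
    proxy = {'server': f"{parts.scheme}://{parts.hostname}:{parts.port}"}
    if parts.username:
        proxy['username'] = parts.username
        proxy['password'] = parts.password or ''
    return proxy
```
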

### FlareSolverr Proxy Integration

When a scraper has `proxy_enabled=1`, the proxy is passed to FlareSolverr:

```python
payload = {
    "cmd": "request.get",
    "url": url,
    "maxTimeout": 120000
}
if proxy_url:
    payload["proxy"] = {"url": proxy_url}
```

**Important**: Cloudflare cookies are tied to the IP address that solved the challenge. If FlareSolverr uses a proxy, subsequent requests MUST use the same proxy or the cookies will be invalid.

### Per-Module Proxy Usage

| Module | How the Proxy is Used |
|--------|-----------------------|
| coppermine_module | `requests` session `proxies` dict |
| imginn_module | Playwright `proxy` option |
| fastdl_module | Playwright `proxy` option |
| toolzu_module | Playwright `proxy` option |
| snapchat_scraper | Playwright `proxy` option (optional, configured in the Scrapers page) |
| instaloader_module | Instaloader `proxy` parameter |
| tiktok_module | yt-dlp `--proxy` flag |
| forum_downloader | Playwright `proxy` option + requests |
| ytdlp | `--proxy` flag |
| gallerydl | `--proxy` flag |

---

## API Endpoints

### GET /api/scrapers

List all scrapers, with an optional type filter.

**Query Parameters:**

- `type` (optional): Filter by type ('direct', 'proxy', 'forum', 'cli_tool')

**Response:**

```json
{
  "scrapers": [
    {
      "id": "imginn",
      "name": "Imginn",
      "type": "proxy",
      "module": "imginn_module",
      "base_url": "https://imginn.com",
      "target_platform": "instagram",
      "enabled": true,
      "proxy_enabled": false,
      "proxy_url": null,
      "flaresolverr_required": true,
      "cookies_count": 23,
      "cookies_updated_at": "2025-12-01T10:30:00",
      "cookies_fresh": true,
      "last_test_at": "2025-12-01T10:30:00",
      "last_test_status": "success",
      "last_test_message": null
    }
  ]
}
```

### GET /api/scrapers/{id}

Get a single scraper configuration.

### PUT /api/scrapers/{id}

Update scraper settings.

**Request Body:**

```json
{
  "enabled": true,
  "proxy_enabled": true,
  "proxy_url": "socks5://user:pass@host:port",
  "base_url": "https://new-domain.com"
}
```
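
Whatever framework `api.py` uses, the update handler should whitelist the mutable columns so a request body cannot overwrite test status or stored cookies. A framework-neutral sketch (the field list follows the request body above; anything beyond it is an assumption):

```python
# Only the columns the PUT endpoint is allowed to change
ALLOWED_UPDATE_FIELDS = {'enabled', 'proxy_enabled', 'proxy_url', 'base_url'}

def apply_scraper_update(payload: dict) -> dict:
    """Filter a PUT body down to the editable columns, coercing flags to 0/1."""
    updates = {k: v for k, v in payload.items() if k in ALLOWED_UPDATE_FIELDS}
    for flag in ('enabled', 'proxy_enabled'):
        if flag in updates:
            updates[flag] = 1 if updates[flag] else 0
    return updates
```

The filtered dict can then be turned into a parameterized `UPDATE scrapers SET ...` statement.
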

### POST /api/scrapers/{id}/test

Test the connection via FlareSolverr (if required) and save cookies on success.

**Response:**

```json
{
  "success": true,
  "message": "Connection successful, 23 cookies saved",
  "cookies_count": 23
}
```

### POST /api/scrapers/{id}/cookies

Upload cookies from a JSON file. Merges with existing cookies.

**Request Body:**

```json
{
  "cookies": [
    {"name": "session", "value": "abc123", "domain": ".example.com"}
  ]
}
```

**Response:**

```json
{
  "success": true,
  "message": "Merged 5 cookies (total: 28)",
  "cookies_count": 28
}
```

### DELETE /api/scrapers/{id}/cookies

Clear all cookies for a scraper.

---

## Frontend UI

### Settings > Scrapers Tab

The Scrapers tab displays all scrapers grouped by type/platform:

```
┌───────────────────────────────────────────────────────────────────────┐
│ Settings > Scrapers                                                   │
├───────────────────────────────────────────────────────────────────────┤
│ Filter: [All Types ▼]                                                 │
│                                                                       │
│ ─── Instagram Proxies ──────────────────────────────────────────────  │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ ● Imginn                                             [Enabled ▼]  │ │
│ │   https://imginn.com                                              │ │
│ │   ☐ Use Proxy  [                                    ]             │ │
│ │   Cloudflare: Required │ Cookies: ✓ Fresh (2h ago, 23 cookies)    │ │
│ │   [Test Connection] [Upload Cookies] [Clear Cookies]              │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ─── Direct ─────────────────────────────────────────────────────────  │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ ● Instagram (Direct)                                 [Enabled ▼]  │ │
│ │   https://instagram.com                                           │ │
│ │   ☐ Use Proxy  [                                    ]             │ │
│ │   Cloudflare: Not Required │ Cookies: ✓ 12 cookies                │ │
│ │   [Test Connection] [Upload Cookies] [Clear Cookies]              │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ─── Forums ─────────────────────────────────────────────────────────  │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ ● Phun.org                                           [Enabled ▼]  │ │
│ │   https://forum.phun.org                                          │ │
│ │   ☐ Use Proxy  [                                    ]             │ │
│ │   Cloudflare: Required │ Cookies: ⚠ Expired (3 days)              │ │
│ │   [Test Connection] [Upload Cookies] [Clear Cookies]              │ │
│ └───────────────────────────────────────────────────────────────────┘ │
│                                                                       │
│ ─── CLI Tools ──────────────────────────────────────────────────────  │
│                                                                       │
│ ┌───────────────────────────────────────────────────────────────────┐ │
│ │ ● yt-dlp                                             [Enabled ▼]  │ │
│ │   Generic video downloader                                        │ │
│ │   ☐ Use Proxy  [                                    ]             │ │
│ │   [Test Connection] [Upload Cookies]                              │ │
│ └───────────────────────────────────────────────────────────────────┘ │
└───────────────────────────────────────────────────────────────────────┘
```

### Button Visibility

| Button | When Shown |
|--------|------------|
| Test Connection | Always |
| Upload Cookies | Always |
| Clear Cookies | When cookies exist |

### No Add/Delete Buttons

Scrapers are NOT added or deleted from this UI. They are managed through:

- Forums settings (for forum scrapers)
- Platform settings (for all other scrapers)

This UI only manages:

- Enable/disable
- Proxy configuration
- Cookie testing/upload/clear

---

## Module Integration

### Common Pattern

All modules follow this pattern to load their scraper configuration:

```python
class SomeModule:
    def __init__(self, unified_db=None, scraper_id='some_scraper', ...):
        self.db = unified_db
        self.scraper_id = scraper_id

        # Load config from DB
        self.config = self.db.get_scraper(scraper_id) if self.db else {}

        # Check if enabled
        if not self.config.get('enabled', True):
            raise ScraperDisabledError(f"{scraper_id} is disabled")

        # Get base URL from DB (not hardcoded)
        self.base_url = self.config.get('base_url', 'https://default.com')

        # Get proxy config
        self.proxy_url = None
        if self.config.get('proxy_enabled') and self.config.get('proxy_url'):
            self.proxy_url = self.config['proxy_url']

        # Initialize CloudflareHandler with DB storage
        self.cf_handler = CloudflareHandler(
            module_name=self.scraper_id,
            scraper_id=self.scraper_id,
            unified_db=self.db,
            proxy_url=self.proxy_url,
            ...
        )
```

### CloudflareHandler Changes

```python
class CloudflareHandler:
    def __init__(self,
                 module_name: str,
                 scraper_id: str = None,   # For DB cookie storage
                 unified_db=None,          # DB reference
                 proxy_url: str = None,    # Proxy support
                 cookie_file: str = None,  # DEPRECATED: backwards compat
                 ...):
        self.scraper_id = scraper_id
        self.db = unified_db
        self.proxy_url = proxy_url

    def get_cookies_via_flaresolverr(self, url: str, max_retries: int = 2) -> bool:
        payload = {
            "cmd": "request.get",
            "url": url,
            "maxTimeout": 120000
        }
        # Add proxy if configured
        if self.proxy_url:
            payload["proxy"] = {"url": self.proxy_url}

        # ... rest of implementation

        # On success, merge cookies (don't replace)
        if success:
            existing = self.load_cookies_from_db()
            merged = self.merge_cookies(existing, new_cookies)
            self.save_cookies_to_db(merged)

    def load_cookies_from_db(self) -> list:
        if self.db and self.scraper_id:
            config = self.db.get_scraper(self.scraper_id)
            if config and config.get('cookies_json'):
                data = json.loads(config['cookies_json'])
                return data.get('cookies', [])
        return []

    def save_cookies_to_db(self, cookies: list, user_agent: str = None):
        if self.db and self.scraper_id:
            data = {
                'cookies': cookies,
                'user_agent': user_agent
            }
            self.db.update_scraper_cookies(self.scraper_id, json.dumps(data))

    def merge_cookies(self, existing: list, new: list) -> list:
        cookie_map = {c['name']: c for c in existing}
        for cookie in new:
            cookie_map[cookie['name']] = cookie
        return list(cookie_map.values())
```

---

## Scheduler Integration

The scheduler uses the scrapers table to determine what to run:

```python
def run_scheduled_downloads(self):
    # Get all enabled scrapers
    scrapers = self.db.get_all_scrapers()
    enabled_scrapers = [s for s in scrapers if s['enabled']]

    for scraper in enabled_scrapers:
        if scraper['type'] == 'forum':
            self.run_forum_download(scraper['id'])
        elif scraper['id'] == 'coppermine':
            self.run_coppermine_download()
        elif scraper['id'] == 'instagram':
            self.run_instagram_download()
        elif scraper['id'] == 'tiktok':
            self.run_tiktok_download()
        # etc.
```
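
The if/elif chain grows with every scraper; an alternative worth considering is a dispatch table keyed by scraper id, with forum scrapers sharing one runner keyed by type. A sketch with hypothetical runner names, written as a free function so the dispatch logic is testable in isolation:

```python
def run_scheduled_downloads(db, runners_by_id: dict, run_forum_download) -> None:
    """Dispatch each enabled scraper to its runner.

    `runners_by_id` maps scraper id -> zero-arg callable (hypothetical names);
    forum scrapers all go through run_forum_download(scraper_id).
    """
    for scraper in db.get_all_scrapers():
        if not scraper['enabled']:
            continue
        if scraper['type'] == 'forum':
            run_forum_download(scraper['id'])
        elif scraper['id'] in runners_by_id:
            runners_by_id[scraper['id']]()
```

Adding a new direct scraper then means adding one registry entry rather than another `elif` branch.
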

---

## Migration Plan

### Step 1: Create Table

Add to `unified_database.py`:

```python
def _create_scrapers_table(self):
    self.cursor.execute('''
        CREATE TABLE IF NOT EXISTS scrapers (
            id TEXT PRIMARY KEY,
            name TEXT NOT NULL,
            type TEXT NOT NULL,
            module TEXT,
            base_url TEXT,
            target_platform TEXT,
            enabled INTEGER DEFAULT 1,
            proxy_enabled INTEGER DEFAULT 0,
            proxy_url TEXT,
            flaresolverr_required INTEGER DEFAULT 0,
            cookies_json TEXT,
            cookies_updated_at TEXT,
            last_test_at TEXT,
            last_test_status TEXT,
            last_test_message TEXT,
            settings_json TEXT,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP,
            updated_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
    ''')
```

### Step 2: Seed Initial Data

```python
def _seed_scrapers(self):
    scrapers = [
        ('imginn', 'Imginn', 'proxy', 'imginn_module', 'https://imginn.com', 'instagram', 1),
        ('fastdl', 'FastDL', 'proxy', 'fastdl_module', 'https://fastdl.app', 'instagram', 1),
        ('toolzu', 'Toolzu', 'proxy', 'toolzu_module', 'https://toolzu.com', 'instagram', 1),
        ('snapchat', 'Snapchat Direct', 'direct', 'snapchat_scraper', 'https://snapchat.com', 'snapchat', 0),
        ('instagram', 'Instagram (Direct)', 'direct', 'instaloader_module', 'https://instagram.com', 'instagram', 0),
        ('tiktok', 'TikTok', 'direct', 'tiktok_module', 'https://tiktok.com', 'tiktok', 0),
        ('coppermine', 'Coppermine', 'direct', 'coppermine_module', 'https://hqdiesel.net', None, 1),
        ('forum_phun', 'Phun.org', 'forum', 'forum_downloader', 'https://forum.phun.org', None, 1),
        ('forum_hqcelebcorner', 'HQCelebCorner', 'forum', 'forum_downloader', 'https://hqcelebcorner.com', None, 0),
        ('forum_picturepub', 'PicturePub', 'forum', 'forum_downloader', 'https://picturepub.net', None, 0),
        ('ytdlp', 'yt-dlp', 'cli_tool', None, None, None, 0),
        ('gallerydl', 'gallery-dl', 'cli_tool', None, None, None, 0),
    ]

    for s in scrapers:
        self.cursor.execute('''
            INSERT OR IGNORE INTO scrapers
            (id, name, type, module, base_url, target_platform, flaresolverr_required)
            VALUES (?, ?, ?, ?, ?, ?, ?)
        ''', s)
```

### Step 3: Migrate Cookies

```python
def _migrate_cookies_to_db(self):
    cookie_files = {
        'coppermine': '/opt/media-downloader/cookies/coppermine_cookies.json',
        'imginn': '/opt/media-downloader/cookies/imginn_cookies.json',
        'fastdl': '/opt/media-downloader/cookies/fastdl_cookies.json',
        'snapchat': '/opt/media-downloader/cookies/snapchat_cookies.json',
        'forum_phun': '/opt/media-downloader/cookies/forum_cookies_phun.org.json',
        'forum_hqcelebcorner': '/opt/media-downloader/cookies/forum_cookies_HQCelebCorner.json',
        'forum_picturepub': '/opt/media-downloader/cookies/forum_cookies_PicturePub.json',
    }

    for scraper_id, cookie_file in cookie_files.items():
        if os.path.exists(cookie_file):
            try:
                with open(cookie_file, 'r') as f:
                    data = json.load(f)

                # Store in DB
                self.cursor.execute('''
                    UPDATE scrapers
                    SET cookies_json = ?, cookies_updated_at = ?
                    WHERE id = ?
                ''', (json.dumps(data), datetime.now().isoformat(), scraper_id))

                self.logger.info(f"Migrated cookies for {scraper_id}")
            except Exception as e:
                self.logger.error(f"Failed to migrate cookies for {scraper_id}: {e}")
```

### Step 4: Migrate Snapchat proxy_domain

```python
def _migrate_snapchat_proxy_domain(self):
    # Get current proxy_domain from settings
    settings = self.get_setting('snapchat')
    if settings and 'proxy_domain' in settings:
        proxy_domain = settings['proxy_domain']
        base_url = f"https://{proxy_domain}"

        self.cursor.execute('''
            UPDATE scrapers SET base_url = ? WHERE id = 'snapchat'
        ''', (base_url,))

        # Remove from settings (now in the scrapers table)
        del settings['proxy_domain']
        self.save_setting('snapchat', settings)
```

---

## Implementation Order

| Step | Task | Files to Modify |
|------|------|-----------------|
| 1 | Database schema + migration | `unified_database.py` |
| 2 | Backend API endpoints | `api.py` |
| 3 | CloudflareHandler proxy + DB storage + merge logic | `cloudflare_handler.py` |
| 4 | Frontend Scrapers tab | `ScrapersTab.tsx`, `Settings.tsx`, `api.ts` |
| 5 | Update coppermine_module (test case) | `coppermine_module.py` |
| 6 | Test end-to-end | - |
| 7 | Update remaining modules | `imginn_module.py`, `fastdl_module.py`, `toolzu_module.py`, `snapchat_scraper.py`, `instaloader_module.py`, `tiktok_module.py`, `forum_downloader.py` |
| 8 | Update scheduler | `scheduler.py` |
| 9 | Cookie file cleanup | Remove old cookie files after verification |

---

## Testing Checklist

### Database

- [ ] Table created on first run
- [ ] Seed data populated correctly
- [ ] Cookies migrated from files
- [ ] Snapchat proxy_domain migrated

### API

- [ ] GET /api/scrapers returns all scrapers
- [ ] GET /api/scrapers?type=forum filters correctly
- [ ] PUT /api/scrapers/{id} updates settings
- [ ] POST /api/scrapers/{id}/test works with FlareSolverr
- [ ] POST /api/scrapers/{id}/test works with a proxy
- [ ] POST /api/scrapers/{id}/cookies merges correctly
- [ ] DELETE /api/scrapers/{id}/cookies clears cookies

### Frontend

- [ ] Scrapers tab displays all scrapers
- [ ] Grouping by type works
- [ ] Filter dropdown works
- [ ] Enable/disable toggle works
- [ ] Proxy checkbox and URL input work
- [ ] Test Connection button works
- [ ] Upload Cookies button works
- [ ] Clear Cookies button works
- [ ] Cookie status shows correctly (fresh/expired/none)

### Modules

- [ ] coppermine_module loads config from DB
- [ ] coppermine_module uses proxy when configured
- [ ] coppermine_module uses cookies from DB
- [ ] All other modules updated and working

### Scheduler

- [ ] Only runs enabled scrapers
- [ ] Passes the correct scraper_id to modules

---

## Rollback Plan

If issues occur:

1. **Database**: The old cookie files are preserved as backups
2. **Modules**: Can fall back to reading cookie files if the DB fails
3. **API**: Add backwards compatibility for old endpoints if needed

---

## Future Enhancements

Potential additions not in the initial scope:

1. **Rotating proxies** - Support proxy pools with rotation
2. **Proxy health monitoring** - Track proxy success/failure rates
3. **Auto-refresh cookies** - Background job to refresh expiring cookies
4. **Cookie export** - Download cookies as JSON for backup
5. **Scraper metrics** - Track download success rates per scraper