# Media Downloader - GUI Design & Implementation Plan

**Version:** 1.0

**Date:** October 25, 2025

**Status:** Planning Phase

---

## Table of Contents

1. [Executive Summary](#executive-summary)
2. [Current System Analysis](#current-system-analysis)
3. [GUI Architecture Options](#gui-architecture-options)
4. [Recommended Approach](#recommended-approach)
5. [Technology Stack](#technology-stack)
6. [Implementation Phases](#implementation-phases)
7. [Feature Roadmap](#feature-roadmap)
8. [API Specification](#api-specification)
9. [UI/UX Design](#uiux-design)
10. [Database Integration](#database-integration)
11. [Real-time Updates](#real-time-updates)
12. [Security Considerations](#security-considerations)
13. [Development Timeline](#development-timeline)

---

## Executive Summary

The Media Downloader GUI project aims to create a modern, user-friendly web interface for managing automated media downloads from multiple platforms (Instagram, TikTok, Snapchat, Forums). The GUI will be modeled after the proven **backup-central** architecture, using a Node.js/Express backend with a vanilla JavaScript frontend.

### Key Goals:

- **Maintain existing Python backend** - Preserve all battle-tested scraping logic
- **Modern web interface** - Real-time updates, responsive design, dark/light themes
- **Easy management** - Visual account configuration, manual triggers, scheduler control
- **Enterprise-grade** - Similar to backup-central's polished UI and reliability

---

## Current System Analysis

### Existing Architecture

```
media-downloader.py (Python Orchestrator)
├── Unified Database (SQLite with WAL mode)
│   ├── downloads table (1,183+ records)
│   ├── forum_threads, forum_posts
│   ├── scheduler_state, download_queue
│   └── File hash deduplication (NEW)
│
├── Platform Modules (16 modules)
│   ├── instaloader_module.py (Instagram via API)
│   ├── fastdl_module.py (Instagram web scraper)
│   ├── imginn_module.py (Instagram alternative)
│   ├── toolzu_module.py (High-res Instagram 1920x1440)
│   ├── snapchat_scraper.py (direct Playwright scraper)
│   ├── tiktok_module.py (yt-dlp wrapper)
│   └── forum_downloader.py (7 forum types)
│
├── Subprocess Wrappers (Playwright automation)
│   ├── fastdl_subprocess_wrapper.py
│   ├── imginn_subprocess_wrapper.py
│   ├── toolzu_subprocess_wrapper.py
│   ├── snapchat_subprocess_wrapper.py
│   └── forum_subprocess_wrapper.py
│
├── Support Systems
│   ├── scheduler.py (randomized intervals, persistent state)
│   ├── move_module.py (file operations + deduplication)
│   ├── pushover_notifier.py (push notifications)
│   ├── download_manager.py (multi-threaded downloads)
│   └── unified_database.py (connection pooling, WAL mode)
│
└── Configuration
    └── config/settings.json (100+ parameters)
```

### Current Capabilities

**Supported Platforms:**

- Instagram (4 methods: InstaLoader, FastDL, ImgInn, Toolzu)
- TikTok (via yt-dlp)
- Snapchat Stories
- Forums (XenForo, vBulletin, phpBB, Discourse, IPB, MyBB, SMF)

**Advanced Features:**

- Quality upgrade merging (FastDL + Toolzu)
- File hash deduplication (SHA256-based)
- Timestamp preservation (EXIF metadata)
- Randomized scheduler intervals
- Pushover notifications with thumbnails
- Immich photo library integration
- Cookie-based authentication
- 2captcha CAPTCHA solving
- Browser automation (Playwright)

**Statistics:**

- 19,100+ lines of production Python code
- 1,183+ downloads tracked
- 213 files with SHA256 hashes
- 30 duplicate groups detected
- 8 database tables with 17 indexes

---

## GUI Architecture Options

### Option 1: Hybrid Approach ⭐ **RECOMMENDED**

**Architecture:**

```
┌─────────────────────────────────────┐
│ Node.js Web GUI                     │
│ - Express.js API server             │
│ - Vanilla JS frontend               │
│ - Real-time WebSocket updates       │
│ - Chart.js analytics                │
└──────────────┬──────────────────────┘
               │ REST API + WebSocket
               ▼
┌─────────────────────────────────────┐
│ Existing Python Backend             │
│ - All platform downloaders          │
│ - Database layer                    │
│ - Scheduler                         │
│ - Browser automation                │
└─────────────────────────────────────┘
```

**Pros:**

- ✅ Preserves all battle-tested scraping logic
- ✅ Modern, responsive web UI
- ✅ Lower risk, faster development (4-8 weeks)
- ✅ Python ecosystem better for web scraping
- ✅ Can develop frontend and API simultaneously

**Cons:**

- ⚠️ Two codebases to maintain (Node.js + Python)
- ⚠️ Inter-process communication overhead

---

### Option 2: Full Node.js Rewrite

**Architecture:**

```
┌─────────────────────────────────────┐
│ Full Node.js/TypeScript Stack       │
│ - Express/Fastify API               │
│ - React/Next.js frontend            │
│ - Playwright Node.js bindings       │
│ - Prisma ORM                        │
└─────────────────────────────────────┘
```

**Pros:**

- ✅ Unified JavaScript/TypeScript codebase
- ✅ Modern tooling, better IDE support
- ✅ Easier for full-stack JS developers

**Cons:**

- ❌ 3-6 months minimum development time
- ❌ Need to reimplement all platform scraping
- ❌ Risk of losing subtle platform-specific fixes
- ❌ No instaloader equivalent in Node.js
- ❌ Complex authentication flows need rediscovery

**Verdict:** Only consider this option if planning a long-term open-source project with JavaScript contributors.

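To make the two-codebase trade-off concrete, the process boundary that the hybrid option (Option 1) introduces can be sketched as below. This is a minimal sketch under stated assumptions, not the project's bridge: the platform allowlist, the username pattern, and the names `buildDownloadArgs` and `runDownload` are hypothetical, while the `--platform` and `--username` flags follow the Python Integration example in the Technology Stack section.

```javascript
// Hypothetical sketch of the Node.js -> Python boundary in the hybrid option.
const { spawn } = require('child_process');

// Assumed allowlist of platform keys; the real CLI values may differ.
const PLATFORMS = new Set(['instagram', 'tiktok', 'snapchat', 'forums']);

// Assumed username shape: letters, digits, dot, underscore, 1-64 characters.
const USERNAME_RE = /^[A-Za-z0-9._]{1,64}$/;

// Validate the inputs and build the argument vector for the Python CLI.
function buildDownloadArgs(platform, username) {
  if (!PLATFORMS.has(platform)) {
    throw new Error(`Unsupported platform: ${platform}`);
  }
  if (!USERNAME_RE.test(username)) {
    throw new Error(`Invalid username: ${username}`);
  }
  return ['media-downloader.py', '--platform', platform, '--username', username];
}

// Spawn the Python process and resolve with its exit code and captured stdout.
function runDownload(platform, username, pythonPath = 'python3') {
  const args = buildDownloadArgs(platform, username);
  return new Promise((resolve, reject) => {
    const child = spawn(pythonPath, args, { stdio: ['ignore', 'pipe', 'pipe'] });
    let stdout = '';
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.on('error', reject); // e.g. the interpreter was not found
    child.on('close', (code) => resolve({ code, stdout }));
  });
}
```

Validating before spawning keeps malformed input out of the Python CLI, and passing arguments as an array avoids shell interpretation. A full rewrite (Option 2) would remove this boundary at the cost of reimplementing everything behind it.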
---

### Option 3: Simple Dashboard (Quickest)

**Architecture:**

```
Node.js Dashboard (read-only)
├── Reads SQLite database directly
├── Displays stats, history, schedules
├── Tails Python logs
└── No control features (view-only)
```

**Timeline:** 1-2 weeks

**Use Case:** Quick visibility without control features

---

## Recommended Approach

### **Hybrid Architecture with Backup-Central Design Pattern**

After analyzing `/opt/backup-central`, we recommend adopting its proven architecture:

**Backend Stack:**

- Express.js (HTTP server)
- WebSocket (ws package) for real-time updates
- SQLite3 (reuse existing unified database)
- Winston (structured logging)
- node-cron (scheduler coordination)
- Helmet + Compression (security & performance)

**Frontend Stack:**

- **Vanilla JavaScript** (no React/Vue - faster, simpler)
- Chart.js (analytics visualizations)
- Font Awesome (icons)
- Inter font (modern typography)
- Mobile-responsive CSS
- Dark/Light theme support

**Why Backup-Central's Approach:**

1. Proven in production
2. Simple to understand and maintain
3. Fast loading (no framework overhead)
4. Real-time updates work flawlessly
5. Beautiful, modern UI without complexity

---

## Technology Stack

### Backend (Node.js)

```json
{
  "dependencies": {
    "express": "^4.18.2",
    "ws": "^8.14.2",
    "sqlite3": "^5.1.7",
    "winston": "^3.18.3",
    "node-cron": "^4.2.1",
    "compression": "^1.8.1",
    "helmet": "^8.1.0",
    "dotenv": "^17.2.3",
    "express-session": "^1.18.2",
    "jsonwebtoken": "^9.0.2"
  }
}
```

### Frontend (Vanilla JS)

```html
```

### Python Integration

```javascript
// Subprocess execution for Python backend
const { spawn } = require('child_process');

// PYTHON_PATH comes from .env (see Environment Variables); falls back to python3 on PATH
const PYTHON = process.env.PYTHON_PATH || 'python3';

function triggerDownload(platform, username) {
  // Arguments are passed as an array, so no shell is involved
  return spawn(PYTHON, [
    'media-downloader.py',
    '--platform', platform,
    '--username', username
  ]);
}
```

---

## Implementation Phases

### **Phase 1: Backend API Foundation** (Week 1-2)

**Deliverables:**

```
media-downloader-gui/
├── server.js (Express + WebSocket)
├── .env.example
├── package.json
└── lib/
    ├── db-helper.js (SQLite wrapper)
    ├── python-bridge.js (subprocess manager)
    ├── logger.js (Winston)
    └── api-v1/
        ├── downloads.js
        ├── accounts.js
        ├── stats.js
        ├── scheduler.js
        └── config.js
```

**API Endpoints:**

- `GET /api/downloads` - Query download history
- `GET /api/downloads/recent` - Last 100 downloads
- `POST /api/downloads/trigger` - Manual download trigger
- `GET /api/accounts` - List all configured accounts
- `POST /api/accounts` - Add new account
- `PUT /api/accounts/:id` - Update account
- `DELETE /api/accounts/:id` - Remove account
- `GET /api/stats` - Platform statistics
- `GET /api/scheduler/status` - Scheduler state
- `POST /api/scheduler/start` - Start scheduler
- `POST /api/scheduler/stop` - Stop scheduler
- `GET /api/config` - Read configuration
- `PUT /api/config` - Update configuration
- `GET /api/logs` - Tail Python logs
- `WS /api/live` - Real-time updates

---

### **Phase 2: Core Frontend UI** (Week 3-4)

**Dashboard Layout:**

```
┌─────────────────────────────────────────────────────┐
│ Header: Media Downloader | [Theme] [Profile] [⚙️]   │
├─────────────────────────────────────────────────────┤
│ Platform Cards                                      │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐              │
│ │Instagram │ │  TikTok  │ │ Snapchat │              │
│ │ 523 DL   │ │  87 DL   │ │ 142 DL   │              │
│ │ ▶️ Trigger│ │ ▶️ Trigger│ │ ▶️ Trigger│              │
│ └──────────┘ └──────────┘ └──────────┘              │
├─────────────────────────────────────────────────────┤
│ Recent Downloads (Live Feed)                        │
│ 🟢 evalongoria_20251025... (Instagram/evalongoria)  │
│ 🟢 20251025_TikTok... (TikTok/evalongoria)          │
│ ⚠️ Duplicate skipped: photo.jpg (hash match)        │
├─────────────────────────────────────────────────────┤
│ Statistics (Chart.js)                               │
│ 📊 Downloads per Platform | 📈 Timeline Graph       │
└─────────────────────────────────────────────────────┘
```

**Components:**

1. **Dashboard** (`public/index.html`)
   - Platform overview cards
   - Live download feed (WebSocket)
   - Quick stats
2. **Accounts Manager** (`public/accounts.html`)
   - Add/Edit/Delete Instagram usernames
   - Add/Edit/Delete TikTok accounts
   - Add/Edit/Delete Forum configurations
   - Per-account interval settings
3. **Download History** (`public/history.html`)
   - Searchable table
   - Filter by platform/source/date
   - Thumbnail previews
   - Duplicate indicators
4. **Scheduler Control** (`public/scheduler.html`)
   - Enable/Disable scheduler
   - View next run times
   - Adjust global intervals
   - Force run specific tasks
5. **Configuration Editor** (`public/config.html`)
   - JSON editor with validation
   - Platform-specific settings
   - Notification configuration
   - Immich integration settings
6. **Logs Viewer** (`public/logs.html`)
   - Tail Python application logs
   - Filter by level (DEBUG/INFO/WARNING/ERROR)
   - Search functionality
   - Auto-scroll toggle

---

### **Phase 3: Advanced Features** (Week 5-6)

**Real-time Features:**

```javascript
// WebSocket message types (abbreviated examples; the API Specification
// section lists the full payloads)
const exampleMessages = [
  {
    type: 'download_start',
    platform: 'instagram',
    username: 'evalongoria',
    content_type: 'story'
  },
  {
    type: 'download_complete',
    platform: 'instagram',
    filename: 'evalongoria_20251025_123456.jpg',
    file_size: 245678,
    duplicate: false
  },
  {
    type: 'duplicate_detected',
    filename: 'photo.jpg',
    existing_file: 'photo_original.jpg',
    platform: 'instagram'
  },
  {
    type: 'scheduler_update',
    task_id: 'instagram:evalongoria',
    next_run: '2025-10-25T23:00:00Z'
  }
];
```

**Features:**

- Live download progress bars
- Duplicate detection alerts
- Scheduler countdown timers
- Platform health indicators
- Download speed metrics

---

### **Phase 4: Polish & Deploy** (Week 7-8)

**Final Touches:**

- Mobile-responsive design
- Dark mode implementation
- Keyboard shortcuts
- Toast notifications (success/error)
- Loading skeletons
- Error boundary handling
- Performance optimization
- Security hardening
- Documentation
- Deployment scripts

---

## Feature Roadmap

### **MVP Features** (Phase 1-2)

- ✅ View download history
- ✅ See platform statistics
- ✅ Manual download triggers
- ✅ Account management (CRUD)
- ✅ Real-time download feed
- ✅ Dark/Light theme
- ✅ Mobile responsive

### **Enhanced Features** (Phase 3)

- 🔄 Scheduler control (start/stop/adjust)
- 🔄 Configuration editor
- 🔄 Logs viewer
- 🔄 Advanced search/filtering
- 🔄 Duplicate management UI
- 🔄 Download queue management

### **Future Features** (Phase 4+)

- 📋 Batch operations (delete/retry multiple)
- 📋 Download rules engine (auto-skip based on criteria)
- 📋 Analytics dashboard (trends, insights)
- 📋 Export/Import configurations
- 📋 Webhook integrations
- 📋 Multi-user support with authentication
- 📋 API key management
- 📋 Browser screenshot viewer (see Playwright automation)
- 📋 Cookie editor (manage authentication)

---

## API Specification

### REST API Endpoints

#### Downloads

**GET /api/downloads**

```javascript
// Query downloads with filters
GET /api/downloads?platform=instagram&limit=50&offset=0

Response:
{
  "total": 1183,
  "downloads": [
    {
      "id": 1,
      "url": "https://...",
      "url_hash": "sha256...",
      "platform": "instagram",
      "source": "evalongoria",
      "content_type": "story",
      "filename": "evalongoria_20251025_123456.jpg",
      "file_path": "/opt/immich/md/social media/instagram/...",
      "file_size": 245678,
      "file_hash": "sha256...",
      "post_date": "2025-10-25T12:34:56Z",
      "download_date": "2025-10-25T12:35:00Z",
      "status": "completed",
      "metadata": {}
    }
  ]
}
```

**POST /api/downloads/trigger**

```javascript
// Trigger manual download
POST /api/downloads/trigger
{
  "platform": "instagram",
  "username": "evalongoria",
  "content_types": ["stories", "posts"]
}

Response:
{
  "status": "started",
  "job_id": "instagram_evalongoria_1729900000",
  "message": "Download started in background"
}
```

#### Accounts

**GET /api/accounts**

```javascript
GET /api/accounts?platform=instagram

Response:
{
  "instagram": [
    {
      "username": "evalongoria",
      "enabled": true,
      "check_interval_hours": 6,
      "content_types": {
        "posts": true,
        "stories": true,
        "reels": false
      }
    }
  ],
  "tiktok": [...],
  "snapchat": [...]
}
```

**POST /api/accounts**

```javascript
POST /api/accounts
{
  "platform": "instagram",
  "username": "newuser",
  "check_interval_hours": 12,
  "content_types": {
    "posts": true,
    "stories": false
  }
}

Response:
{
  "success": true,
  "account": { ... }
}
```

#### Statistics

**GET /api/stats**

```javascript
GET /api/stats

Response:
{
  "platforms": {
    "instagram": {
      "total": 523,
      "completed": 520,
      "failed": 3,
      "duplicates": 15,
      "total_size": 1234567890
    },
    "tiktok": { ... },
    "snapchat": { ...
    }
  },
  "recent_activity": {
    "last_24h": 45,
    "last_7d": 312
  }
}
```

#### Scheduler

**GET /api/scheduler/status**

```javascript
GET /api/scheduler/status

Response:
{
  "running": true,
  "tasks": [
    {
      "task_id": "instagram:evalongoria",
      "last_run": "2025-10-25T12:00:00Z",
      "next_run": "2025-10-25T18:00:00Z",
      "interval_hours": 6,
      "status": "active"
    }
  ]
}
```

#### Configuration

**GET /api/config**

```javascript
GET /api/config

Response:
{
  "instagram": { ... },
  "tiktok": { ... },
  "pushover": { ... },
  "immich": { ... }
}
```

**PUT /api/config**

```javascript
PUT /api/config
{
  "instagram": {
    "enabled": true,
    "check_interval_hours": 8
  }
}

Response:
{
  "success": true,
  "config": { ... }
}
```

### WebSocket Events

**Client → Server:**

```javascript
// Subscribe to live updates
{
  "action": "subscribe",
  "channels": ["downloads", "scheduler", "duplicates"]
}
```

**Server → Client:**

```javascript
// Download started
{
  "type": "download_start",
  "timestamp": "2025-10-25T12:34:56Z",
  "platform": "instagram",
  "username": "evalongoria"
}

// Download completed
{
  "type": "download_complete",
  "timestamp": "2025-10-25T12:35:00Z",
  "platform": "instagram",
  "filename": "evalongoria_20251025_123456.jpg",
  "file_size": 245678,
  "duplicate": false
}

// Duplicate detected
{
  "type": "duplicate_detected",
  "timestamp": "2025-10-25T12:35:05Z",
  "filename": "photo.jpg",
  "existing_file": {
    "filename": "photo_original.jpg",
    "platform": "instagram",
    "source": "evalongoria"
  }
}
```

---

## UI/UX Design

### Design System (Inspired by Backup-Central)

**Colors:**

```css
:root {
  /* Light Theme */
  --primary-color: #2563eb;
  --secondary-color: #64748b;
  --success-color: #10b981;
  --warning-color: #f59e0b;
  --error-color: #ef4444;
  --bg-color: #f8fafc;
  --card-bg: #ffffff;
  --text-color: #1e293b;
  --border-color: #e2e8f0;
}

[data-theme="dark"] {
  /* Dark Theme */
  --primary-color: #3b82f6;
  --bg-color: #0f172a;
  --card-bg: #1e293b;
  --text-color: #f1f5f9;
  --border-color: #334155;
}
```

**Typography:**

```css
font-family: 'Inter',
  -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
```

**Components:**

- Cards with subtle shadows
- Rounded corners (8px border-radius)
- Smooth transitions (0.3s ease)
- Gradient accents on hover
- Loading skeletons
- Toast notifications (top-right)

---

## Database Integration

### Database Access Strategy

**Read Operations (Node.js):**

```javascript
// Direct SQLite reads for fast queries.
// Note: this uses better-sqlite3 (synchronous API), which is not in the
// dependency list under Technology Stack; the sqlite3 package listed there
// has a different, callback-based API.
const Database = require('better-sqlite3');
const db = new Database('/opt/media-downloader/database/media_downloader.db', {
  readonly: true // Node.js only reads; writes go through Python
});

const downloads = db.prepare(`
  SELECT * FROM downloads
  WHERE platform = ?
  ORDER BY download_date DESC
  LIMIT ?
`).all('instagram', 50);
```

**Write Operations (Python):**

```javascript
// Route through Python backend for consistency
const { spawn } = require('child_process');

function addAccount(platform, username) {
  // Update config.json
  // Trigger Python process to reload config
}
```

**Why This Approach:**

- Python maintains database writes (consistency)
- Node.js reads for fast UI queries
- No duplicate database logic
- Writes reuse the Python backend's existing connection pooling

---

## Real-time Updates

### WebSocket Architecture

**Server-Side (Node.js):**

```javascript
const WebSocket = require('ws');
const readline = require('readline');
const { spawn } = require('child_process');

// `server` is the http.Server that the Express app is attached to.
// `path` matches the documented WS /api/live endpoint.
const wss = new WebSocket.Server({ server, path: '/api/live' });

// Broadcast to all connected clients
function broadcast(message) {
  const payload = JSON.stringify(message);
  wss.clients.forEach(client => {
    if (client.readyState === WebSocket.OPEN) {
      client.send(payload);
    }
  });
}

// Watch Python logs for events
const pythonProcess = spawn('python3', ['media-downloader.py', '--daemon']);

// Read stdout line by line: raw 'data' chunks can split or merge log lines
readline.createInterface({ input: pythonProcess.stdout }).on('line', (line) => {
  // parseLogEvent is not defined in this plan; it stands for the log-to-event parser
  const event = parseLogEvent(line);
  if (event) broadcast(event);
});
```

**Client-Side (JavaScript):**

```javascript
const ws = new WebSocket('ws://localhost:3000/api/live');

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);

  switch(data.type) {
    case 'download_complete':
      addToDownloadFeed(data);
      updateStats();
      showToast(`Downloaded ${data.filename}`, 'success');
      break;
    case 'duplicate_detected':
      showToast(`Duplicate skipped: ${data.filename}`, 'warning');
      break;
  }
};
```

---

## Security Considerations

### Authentication (Optional for Single-User)

**Simple Auth:**

- Environment variable password
- Session-based auth (express-session)
- No registration needed

**Enhanced Auth (Future):**

- TOTP/2FA (speakeasy)
- Passkeys (WebAuthn)
- JWT tokens
- Per-user configurations

### API Security

```javascript
const helmet = require('helmet');
const cors = require('cors');                    // not yet in the dependency list
const rateLimit = require('express-rate-limit'); // not yet in the dependency list

// Helmet for security headers
app.use(helmet());

// CORS configuration.
// Browsers reject a wildcard origin combined with credentials, so cross-origin
// access stays disabled unless ALLOWED_ORIGINS is set.
app.use(cors({
  origin: process.env.ALLOWED_ORIGINS?.split(',') || false,
  credentials: true
}));

// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per windowMs
});
app.use('/api/', limiter);
```

### Environment Variables

```bash
# .env
NODE_ENV=production
PORT=3000
SESSION_SECRET=random_secret_key
PYTHON_PATH=/opt/media-downloader/venv/bin/python3
DATABASE_PATH=/opt/media-downloader/database/media_downloader.db
CONFIG_PATH=/opt/media-downloader/config/settings.json
```

---

## Development Timeline

### **Estimated Timeline: 8 Weeks**

**Week 1-2: Backend API**

- Express server setup
- Database integration
- Python subprocess bridge
- Basic API endpoints
- WebSocket setup

**Week 3-4: Core Frontend**

- Dashboard layout
- Platform cards
- Download feed
- Account management UI
- Basic stats

**Week 5-6: Advanced Features**

- Real-time updates
- Scheduler control
- Config editor
- Logs viewer
- Search/filtering

**Week 7-8: Polish**

- Mobile responsive
- Dark mode
- Error handling
- Testing
- Documentation
- Deployment

---

## Next Steps

### Immediate Actions:

1. **✅ File Hash Deduplication** - COMPLETED
   - Added SHA256 hashing to unified_database.py
   - Implemented automatic duplicate detection in move_module.py
   - Created utilities for backfilling and managing hashes
   - Scanned 213 existing files and found 30 duplicate groups
2. **✅ Directory Cleanup** - COMPLETED
   - Moved test files to `tests/` directory
   - Moved one-time scripts to `archive/`
   - Organized utilities in `utilities/` directory
   - Removed obsolete documentation
3. **📋 Begin GUI Development**
   - Initialize Node.js project
   - Set up Express server
   - Create basic API endpoints
   - Build dashboard prototype

---

## References

- **Backup-Central:** `/opt/backup-central` - Reference implementation
- **Python Backend:** `/opt/media-downloader/media-downloader.py`
- **Database Schema:** `/opt/media-downloader/modules/unified_database.py`
- **Existing Docs:** `/opt/media-downloader/archive/` (old GUI plans)

---

## Appendix

### Directory Structure After Cleanup

```
/opt/media-downloader/
├── media-downloader.py (main application)
├── setup.py (installation script)
├── INSTALL.md (installation guide)
├── GUI_DESIGN_PLAN.md (this document)
├── requirements.txt
├── config/
│   └── settings.json
├── database/
│   ├── media_downloader.db
│   └── scheduler_state.db
├── modules/ (16 Python modules)
│   ├── unified_database.py
│   ├── scheduler.py
│   ├── move_module.py
│   ├── instaloader_module.py
│   ├── fastdl_module.py
│   ├── imginn_module.py
│   ├── toolzu_module.py
│   ├── snapchat_module.py
│   ├── tiktok_module.py
│   ├── forum_downloader.py
│   └── ... (10 more modules)
├── utilities/
│   ├── backfill_file_hashes.py
│   ├── cleanup_database_filenames.py
│   └── scan_and_hash_files.py
├── archive/ (old docs, one-time scripts)
│   ├── HIGH_RES_DOWNLOAD.md
│   ├── SNAPCHAT_*.md
│   ├── TOOLZU-TIMESTAMPS.md
│   ├── WEB_GUI_*.md (4 old GUI docs)
│   ├── cleanup_last_week.py
│   ├── merge-quality-upgrade.py
│   ├── reset_database.py
│   └── debug_snapchat.py
├── tests/ (7 test scripts)
│   ├── test_all_notifications.py
│   ├── test_pushover.py
│   └── ... (5 more tests)
├── subprocess wrappers/ (5 wrappers)
│   ├── fastdl_subprocess_wrapper.py
│   ├── imginn_subprocess_wrapper.py
│   ├── toolzu_subprocess_wrapper.py
│   ├── snapchat_subprocess_wrapper.py
│   └── forum_subprocess_wrapper.py
├── venv/ (Python virtual environment)
├── logs/ (application logs)
├── temp/ (temporary download directories)
└── ... (other directories)
```

---

**End of Document**

For questions or updates, refer to this document as the single source of truth for GUI development planning.