Media Downloader - GUI Design & Implementation Plan
Version: 1.0
Date: October 25, 2025
Status: Planning Phase
Table of Contents
- Executive Summary
- Current System Analysis
- GUI Architecture Options
- Recommended Approach
- Technology Stack
- Implementation Phases
- Feature Roadmap
- API Specification
- UI/UX Design
- Database Integration
- Real-time Updates
- Security Considerations
- Development Timeline
Executive Summary
The Media Downloader GUI project aims to create a modern, user-friendly web interface for managing automated media downloads from multiple platforms (Instagram, TikTok, Snapchat, Forums). The GUI will be modeled after the proven backup-central architecture, using a Node.js/Express backend with a vanilla JavaScript frontend.
Key Goals:
- Maintain existing Python backend - Preserve all battle-tested scraping logic
- Modern web interface - Real-time updates, responsive design, dark/light themes
- Easy management - Visual account configuration, manual triggers, scheduler control
- Enterprise-grade - Similar to backup-central's polished UI and reliability
Current System Analysis
Existing Architecture
media-downloader.py (Python Orchestrator)
├── Unified Database (SQLite with WAL mode)
│ ├── downloads table (1,183+ records)
│ ├── forum_threads, forum_posts
│ ├── scheduler_state, download_queue
│ └── File hash deduplication (NEW)
│
├── Platform Modules (16 modules)
│ ├── instaloader_module.py (Instagram via API)
│ ├── fastdl_module.py (Instagram web scraper)
│ ├── imginn_module.py (Instagram alternative)
│ ├── toolzu_module.py (High-res Instagram 1920x1440)
│ ├── snapchat_scraper.py (direct Playwright scraper)
│ ├── tiktok_module.py (yt-dlp wrapper)
│ └── forum_downloader.py (7 forum types)
│
├── Subprocess Wrappers (Playwright automation)
│ ├── fastdl_subprocess_wrapper.py
│ ├── imginn_subprocess_wrapper.py
│ ├── toolzu_subprocess_wrapper.py
│ ├── snapchat_subprocess_wrapper.py
│ └── forum_subprocess_wrapper.py
│
├── Support Systems
│ ├── scheduler.py (randomized intervals, persistent state)
│ ├── move_module.py (file operations + deduplication)
│ ├── pushover_notifier.py (push notifications)
│ ├── download_manager.py (multi-threaded downloads)
│ └── unified_database.py (connection pooling, WAL mode)
│
└── Configuration
└── config/settings.json (100+ parameters)
Current Capabilities
Supported Platforms:
- Instagram (4 methods: InstaLoader, FastDL, ImgInn, Toolzu)
- TikTok (via yt-dlp)
- Snapchat Stories
- Forums (XenForo, vBulletin, phpBB, Discourse, IPB, MyBB, SMF)
Advanced Features:
- Quality upgrade merging (FastDL + Toolzu)
- File hash deduplication (SHA256-based)
- Timestamp preservation (EXIF metadata)
- Randomized scheduler intervals
- Pushover notifications with thumbnails
- Immich photo library integration
- Cookie-based authentication
- 2captcha CAPTCHA solving
- Browser automation (Playwright)
Statistics:
- 19,100+ lines of production Python code
- 1,183+ downloads tracked
- 213 files with SHA256 hashes
- 30 duplicate groups detected
- 8 database tables with 17 indexes
GUI Architecture Options
Option 1: Hybrid Approach ⭐ RECOMMENDED
Architecture:
┌─────────────────────────────────────┐
│ Node.js Web GUI │
│ - Express.js API server │
│ - Vanilla JS frontend │
│ - Real-time WebSocket updates │
│ - Chart.js analytics │
└──────────────┬──────────────────────┘
│ REST API + WebSocket
▼
┌─────────────────────────────────────┐
│ Existing Python Backend │
│ - All platform downloaders │
│ - Database layer │
│ - Scheduler │
│ - Browser automation │
└─────────────────────────────────────┘
Pros:
- ✅ Preserves all battle-tested scraping logic
- ✅ Modern, responsive web UI
- ✅ Lower risk, faster development (4-8 weeks)
- ✅ Python ecosystem better for web scraping
- ✅ Can develop frontend and API simultaneously
Cons:
- ⚠️ Two codebases to maintain (Node.js + Python)
- ⚠️ Inter-process communication overhead
Option 2: Full Node.js Rewrite
Architecture:
┌─────────────────────────────────────┐
│ Full Node.js/TypeScript Stack │
│ - Express/Fastify API │
│ - React/Next.js frontend │
│ - Playwright Node.js bindings │
│ - Prisma ORM │
└─────────────────────────────────────┘
Pros:
- ✅ Unified JavaScript/TypeScript codebase
- ✅ Modern tooling, better IDE support
- ✅ Easier for full-stack JS developers
Cons:
- ❌ 3-6 months minimum development time
- ❌ Need to reimplement all platform scraping
- ❌ Risk of losing subtle platform-specific fixes
- ❌ No instaloader equivalent in Node.js
- ❌ Complex authentication flows need rediscovery
Verdict: Only consider this option if planning a long-term open-source project with JavaScript contributors.
Option 3: Simple Dashboard (Quickest)
Architecture:
Node.js Dashboard (read-only)
├── Reads SQLite database directly
├── Displays stats, history, schedules
├── Tails Python logs
└── No control features (view-only)
Timeline: 1-2 weeks
Use Case: Quick visibility without control features
Recommended Approach
Hybrid Architecture with Backup-Central Design Pattern
After analyzing /opt/backup-central, we recommend adopting its proven architecture:
Backend Stack:
- Express.js (HTTP server)
- WebSocket (ws package) for real-time updates
- SQLite3 (reuse existing unified database)
- Winston (structured logging)
- node-cron (scheduler coordination)
- Helmet + Compression (security & performance)
Frontend Stack:
- Vanilla JavaScript (no React/Vue - faster, simpler)
- Chart.js (analytics visualizations)
- Font Awesome (icons)
- Inter font (modern typography)
- Mobile-responsive CSS
- Dark/Light theme support
Why Backup-Central's Approach:
- Proven in production
- Simple to understand and maintain
- Fast loading (no framework overhead)
- Real-time updates work flawlessly
- Beautiful, modern UI without complexity
Technology Stack
Backend (Node.js)
{
"dependencies": {
"express": "^4.18.2",
"ws": "^8.14.2",
"sqlite3": "^5.1.7",
"winston": "^3.18.3",
"node-cron": "^4.2.1",
"compression": "^1.8.1",
"helmet": "^8.1.0",
"dotenv": "^17.2.3",
"express-session": "^1.18.2",
"jsonwebtoken": "^9.0.2"
}
}
Frontend (Vanilla JS)
<!-- Libraries -->
<script src="chart.min.js"></script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700">
Python Integration
// Subprocess execution for Python backend
const { spawn } = require('child_process');
function triggerDownload(platform, username) {
return spawn('python3', [
'media-downloader.py',
'--platform', platform,
'--username', username
]);
}
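A slightly fuller sketch of the bridge follows, assuming the `--platform` and `--username` flags shown above and the `PYTHON_PATH` variable from the environment section. The platform allow-list, the username pattern, and the names `buildDownloadArgs` and `runDownload` are illustrative assumptions, not part of the existing code.

```javascript
const { spawn } = require('child_process');

// Assumption: the platform values the CLI accepts; adjust to match media-downloader.py.
const PLATFORMS = new Set(['instagram', 'tiktok', 'snapchat', 'forums']);

// Build the argument vector separately so it can be validated and tested
// without starting a process.
function buildDownloadArgs(platform, username) {
  if (!PLATFORMS.has(platform)) {
    throw new Error(`Unknown platform: ${platform}`);
  }
  // The character class excludes "-" and "/", so a username can never be
  // read as a command-line option or a path.
  if (!/^[A-Za-z0-9._]{1,64}$/.test(username)) {
    throw new Error(`Invalid username: ${username}`);
  }
  return ['media-downloader.py', '--platform', platform, '--username', username];
}

// Resolve with the exit code once the Python process finishes.
function runDownload(platform, username) {
  const args = buildDownloadArgs(platform, username);
  const python = process.env.PYTHON_PATH || 'python3';
  return new Promise((resolve, reject) => {
    // Inherit stdio so Python output lands in the server's own log.
    const child = spawn(python, args, { stdio: 'inherit' });
    child.on('error', reject); // e.g. interpreter not found
    child.on('close', (code) => resolve(code));
  });
}
```

Passing arguments as an array (rather than a shell string) means the username is never interpreted by a shell; the validation step only guards against values the Python CLI itself might misread.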
Implementation Phases
Phase 1: Backend API Foundation (Week 1-2)
Deliverables:
media-downloader-gui/
├── server.js (Express + WebSocket)
├── .env.example
├── package.json
└── lib/
├── db-helper.js (SQLite wrapper)
├── python-bridge.js (subprocess manager)
├── logger.js (Winston)
└── api-v1/
├── downloads.js
├── accounts.js
├── stats.js
├── scheduler.js
└── config.js
API Endpoints:
- GET /api/downloads - Query download history
- GET /api/downloads/recent - Last 100 downloads
- POST /api/downloads/trigger - Manual download trigger
- GET /api/accounts - List all configured accounts
- POST /api/accounts - Add new account
- PUT /api/accounts/:id - Update account
- DELETE /api/accounts/:id - Remove account
- GET /api/stats - Platform statistics
- GET /api/scheduler/status - Scheduler state
- POST /api/scheduler/start - Start scheduler
- POST /api/scheduler/stop - Stop scheduler
- GET /api/config - Read configuration
- PUT /api/config - Update configuration
- GET /api/logs - Tail Python logs
- WS /api/live - Real-time updates
Phase 2: Core Frontend UI (Week 3-4)
Dashboard Layout:
┌─────────────────────────────────────────────────────┐
│ Header: Media Downloader | [Theme] [Profile] [⚙️] │
├─────────────────────────────────────────────────────┤
│ Platform Cards │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │Instagram │ │ TikTok │ │ Snapchat │ │
│ │ 523 DL │ │ 87 DL │ │ 142 DL │ │
│ │ ▶️ Trigger│ │ ▶️ Trigger│ │ ▶️ Trigger│ │
│ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────┤
│ Recent Downloads (Live Feed) │
│ 🟢 evalongoria_20251025... (Instagram/evalongoria) │
│ 🟢 20251025_TikTok... (TikTok/evalongoria) │
│ ⚠️ Duplicate skipped: photo.jpg (hash match) │
├─────────────────────────────────────────────────────┤
│ Statistics (Chart.js) │
│ 📊 Downloads per Platform | 📈 Timeline Graph │
└─────────────────────────────────────────────────────┘
Components:
- Dashboard (public/index.html)
  - Platform overview cards
  - Live download feed (WebSocket)
  - Quick stats
- Accounts Manager (public/accounts.html)
  - Add/Edit/Delete Instagram usernames
  - Add/Edit/Delete TikTok accounts
  - Add/Edit/Delete Forum configurations
  - Per-account interval settings
- Download History (public/history.html)
  - Searchable table
  - Filter by platform/source/date
  - Thumbnail previews
  - Duplicate indicators
- Scheduler Control (public/scheduler.html)
  - Enable/Disable scheduler
  - View next run times
  - Adjust global intervals
  - Force run specific tasks
- Configuration Editor (public/config.html)
  - JSON editor with validation
  - Platform-specific settings
  - Notification configuration
  - Immich integration settings
- Logs Viewer (public/logs.html)
  - Tail Python application logs
  - Filter by level (DEBUG/INFO/WARNING/ERROR)
  - Search functionality
  - Auto-scroll toggle
Phase 3: Advanced Features (Week 5-6)
Real-time Features:
// WebSocket message types
{
type: 'download_start',
platform: 'instagram',
username: 'evalongoria',
content_type: 'story'
}
{
type: 'download_complete',
platform: 'instagram',
filename: 'evalongoria_20251025_123456.jpg',
file_size: 245678,
duplicate: false
}
{
type: 'duplicate_detected',
filename: 'photo.jpg',
existing_file: {
filename: 'photo_original.jpg',
platform: 'instagram',
source: 'evalongoria'
}
}
{
type: 'scheduler_update',
task_id: 'instagram:evalongoria',
next_run: '2025-10-25T23:00:00Z'
}
Features:
- Live download progress bars
- Duplicate detection alerts
- Scheduler countdown timers
- Platform health indicators
- Download speed metrics
Phase 4: Polish & Deploy (Week 7-8)
Final Touches:
- Mobile-responsive design
- Dark mode implementation
- Keyboard shortcuts
- Toast notifications (success/error)
- Loading skeletons
- Error boundary handling
- Performance optimization
- Security hardening
- Documentation
- Deployment scripts
Feature Roadmap
MVP Features (Phase 1-2)
- ✅ View download history
- ✅ See platform statistics
- ✅ Manual download triggers
- ✅ Account management (CRUD)
- ✅ Real-time download feed
- ✅ Dark/Light theme
- ✅ Mobile responsive
Enhanced Features (Phase 3)
- 🔄 Scheduler control (start/stop/adjust)
- 🔄 Configuration editor
- 🔄 Logs viewer
- 🔄 Advanced search/filtering
- 🔄 Duplicate management UI
- 🔄 Download queue management
Future Features (Phase 4+)
- 📋 Batch operations (delete/retry multiple)
- 📋 Download rules engine (auto-skip based on criteria)
- 📋 Analytics dashboard (trends, insights)
- 📋 Export/Import configurations
- 📋 Webhook integrations
- 📋 Multi-user support with authentication
- 📋 API key management
- 📋 Browser screenshot viewer (see Playwright automation)
- 📋 Cookie editor (manage authentication)
API Specification
REST API Endpoints
Downloads
GET /api/downloads
// Query downloads with filters
GET /api/downloads?platform=instagram&limit=50&offset=0
Response:
{
"total": 1183,
"downloads": [
{
"id": 1,
"url": "https://...",
"url_hash": "sha256...",
"platform": "instagram",
"source": "evalongoria",
"content_type": "story",
"filename": "evalongoria_20251025_123456.jpg",
"file_path": "/opt/immich/md/social media/instagram/...",
"file_size": 245678,
"file_hash": "sha256...",
"post_date": "2025-10-25T12:34:56Z",
"download_date": "2025-10-25T12:35:00Z",
"status": "completed",
"metadata": {}
}
]
}
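One way the filter parameters above could be turned into a parameterized SQL statement is sketched below. The filterable column names are taken from the response fields; the function name `buildDownloadsQuery`, the default page size of 50, and the cap of 500 are assumptions made for illustration.

```javascript
// Columns the client may filter on; anything else in the query string is ignored.
const FILTERS = ['platform', 'source', 'content_type', 'status'];

// Parse an integer from a query-string value, falling back when it is missing or invalid.
function toInt(value, fallback) {
  const n = Number.parseInt(value, 10);
  return Number.isNaN(n) ? fallback : n;
}

// Translate a query object (e.g. req.query) into SQL text plus bound parameters.
function buildDownloadsQuery(query) {
  const where = [];
  const params = [];
  for (const column of FILTERS) {
    if (typeof query[column] === 'string' && query[column] !== '') {
      where.push(`${column} = ?`);
      params.push(query[column]);
    }
  }
  // Clamp paging so a client cannot request the whole table at once.
  const limit = Math.min(Math.max(toInt(query.limit, 50), 1), 500);
  const offset = Math.max(toInt(query.offset, 0), 0);
  const sql =
    'SELECT * FROM downloads' +
    (where.length ? ` WHERE ${where.join(' AND ')}` : '') +
    ' ORDER BY download_date DESC LIMIT ? OFFSET ?';
  return { sql, params: [...params, limit, offset] };
}
```

Only column names from the fixed list ever reach the SQL text; user-supplied values are always bound as parameters.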
POST /api/downloads/trigger
// Trigger manual download
POST /api/downloads/trigger
{
"platform": "instagram",
"username": "evalongoria",
"content_types": ["stories", "posts"]
}
Response:
{
"status": "started",
"job_id": "instagram_evalongoria_1729900000",
"message": "Download started in background"
}
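The request check and the response above might be produced along these lines. The `job_id` layout (`<platform>_<username>_<unix seconds>`) is inferred from the example value, and the 202/400 status codes, the error body, and the names `makeJobId` and `handleTrigger` are assumptions rather than a settled contract.

```javascript
// Assumption: job_id = <platform>_<username>_<unix seconds>, as in the example above.
function makeJobId(platform, username, now = new Date()) {
  const seconds = Math.floor(now.getTime() / 1000);
  return `${platform}_${username}_${seconds}`;
}

// Validate a trigger request body and build the HTTP status and JSON response.
function handleTrigger(body, now = new Date()) {
  const { platform, username, content_types: contentTypes = [] } = body || {};
  if (typeof platform !== 'string' || typeof username !== 'string') {
    return { status: 400, body: { status: 'error', message: 'platform and username are required' } };
  }
  if (!Array.isArray(contentTypes)) {
    return { status: 400, body: { status: 'error', message: 'content_types must be an array' } };
  }
  return {
    status: 202, // accepted: the download itself runs in the background
    body: {
      status: 'started',
      job_id: makeJobId(platform, username, now),
      message: 'Download started in background',
    },
  };
}
```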
Accounts
GET /api/accounts
GET /api/accounts?platform=instagram
Response:
{
"instagram": [
{
"username": "evalongoria",
"enabled": true,
"check_interval_hours": 6,
"content_types": {
"posts": true,
"stories": true,
"reels": false
}
}
],
"tiktok": [...],
"snapchat": [...]
}
POST /api/accounts
POST /api/accounts
{
"platform": "instagram",
"username": "newuser",
"check_interval_hours": 12,
"content_types": {
"posts": true,
"stories": false
}
}
Response:
{
"success": true,
"account": { ... }
}
Statistics
GET /api/stats
GET /api/stats
Response:
{
"platforms": {
"instagram": {
"total": 523,
"completed": 520,
"failed": 3,
"duplicates": 15,
"total_size": 1234567890
},
"tiktok": { ... },
"snapchat": { ... }
},
"recent_activity": {
"last_24h": 45,
"last_7d": 312
}
}
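As a rough illustration of how the per-platform counters could be derived, the sketch below aggregates rows shaped like the `/api/downloads` items. The status value `'failed'` and the function name `summarizeByPlatform` are assumptions; the `duplicates` counter is left out because the row fields shown in this document do not say how a duplicate is recorded.

```javascript
// Fold download rows into { platform: { total, completed, failed, total_size } }.
function summarizeByPlatform(rows) {
  const platforms = {};
  for (const row of rows) {
    // Create the counters the first time a platform is seen.
    const stats = (platforms[row.platform] ??= {
      total: 0,
      completed: 0,
      failed: 0,
      total_size: 0,
    });
    stats.total += 1;
    if (row.status === 'completed') stats.completed += 1;
    if (row.status === 'failed') stats.failed += 1; // assumed status value
    stats.total_size += row.file_size || 0;
  }
  return platforms;
}
```

In practice the same numbers would more likely come from a single `GROUP BY platform` query; the in-memory version is shown only because it makes the response shape easy to see.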
Scheduler
GET /api/scheduler/status
GET /api/scheduler/status
Response:
{
"running": true,
"tasks": [
{
"task_id": "instagram:evalongoria",
"last_run": "2025-10-25T12:00:00Z",
"next_run": "2025-10-25T18:00:00Z",
"interval_hours": 6,
"status": "active"
}
]
}
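The relationship between `last_run`, `interval_hours`, and `next_run` in the example, plus the countdown timer planned for Phase 3, can be sketched as follows. This ignores the randomized intervals the Python scheduler applies, so it is only an approximation of what the backend reports; both function names are hypothetical.

```javascript
// next_run = last_run + interval_hours, returned as an ISO 8601 UTC string.
function computeNextRun(lastRunIso, intervalHours) {
  const last = new Date(lastRunIso);
  if (Number.isNaN(last.getTime())) {
    throw new Error(`Invalid date: ${lastRunIso}`);
  }
  return new Date(last.getTime() + intervalHours * 3600 * 1000).toISOString();
}

// Format the time remaining until next_run as "HH:MM:SS" for a countdown display.
function formatCountdown(nextRunIso, now = new Date()) {
  // Never show a negative countdown once the run time has passed.
  const remaining = Math.max(0, Math.floor((new Date(nextRunIso) - now) / 1000));
  const pad = (n) => String(n).padStart(2, '0');
  const hours = Math.floor(remaining / 3600);
  const minutes = Math.floor((remaining % 3600) / 60);
  const seconds = remaining % 60;
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}`;
}
```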
Configuration
GET /api/config
GET /api/config
Response:
{
"instagram": { ... },
"tiktok": { ... },
"pushover": { ... },
"immich": { ... }
}
PUT /api/config
PUT /api/config
{
"instagram": {
"enabled": true,
"check_interval_hours": 8
}
}
Response:
{
"success": true,
"config": { ... }
}
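Because the PUT body above carries only the keys being changed, the server needs to merge it into the existing configuration rather than overwrite it. A minimal sketch of such a merge, assuming nested objects are merged and arrays and scalars are replaced outright (`mergeConfig` is a hypothetical helper name):

```javascript
const isPlainObject = (v) => v !== null && typeof v === 'object' && !Array.isArray(v);

// Recursively merge `patch` into a copy of `current`.
// Nested objects are merged; arrays and scalar values are replaced.
function mergeConfig(current, patch) {
  const result = { ...current };
  for (const [key, value] of Object.entries(patch)) {
    // Skip keys that could tamper with the prototype chain.
    if (key === '__proto__' || key === 'constructor' || key === 'prototype') continue;
    result[key] =
      isPlainObject(value) && isPlainObject(current[key])
        ? mergeConfig(current[key], value)
        : value;
  }
  return result;
}
```

The input objects are not modified, which makes it straightforward to validate the merged result before anything is written to disk.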
WebSocket Events
Client → Server:
// Subscribe to live updates
{
"action": "subscribe",
"channels": ["downloads", "scheduler", "duplicates"]
}
Server → Client:
// Download started
{
"type": "download_start",
"timestamp": "2025-10-25T12:34:56Z",
"platform": "instagram",
"username": "evalongoria"
}
// Download completed
{
"type": "download_complete",
"timestamp": "2025-10-25T12:35:00Z",
"platform": "instagram",
"filename": "evalongoria_20251025_123456.jpg",
"file_size": 245678,
"duplicate": false
}
// Duplicate detected
{
"type": "duplicate_detected",
"timestamp": "2025-10-25T12:35:05Z",
"filename": "photo.jpg",
"existing_file": {
"filename": "photo_original.jpg",
"platform": "instagram",
"source": "evalongoria"
}
}
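The subscribe message implies that the server filters events per client. One possible way to do that is sketched below; the mapping from event type to channel name is an assumption based on the channel names in the subscribe example, and `applySubscribe` and `shouldDeliver` are hypothetical helpers.

```javascript
// Assumption: how event types map onto the channel names used in "subscribe".
const CHANNEL_BY_TYPE = {
  download_start: 'downloads',
  download_complete: 'downloads',
  duplicate_detected: 'duplicates',
  scheduler_update: 'scheduler',
};

// Apply a client's subscribe message to its current set of channels.
function applySubscribe(subscriptions, message) {
  if (message.action !== 'subscribe' || !Array.isArray(message.channels)) {
    return subscriptions; // ignore anything that is not a well-formed subscribe
  }
  return new Set([...subscriptions, ...message.channels]);
}

// Decide whether an event should be sent to a client with these subscriptions.
function shouldDeliver(subscriptions, event) {
  const channel = CHANNEL_BY_TYPE[event.type];
  return channel !== undefined && subscriptions.has(channel);
}
```

A server could keep one `Set` per connection and call `shouldDeliver` inside its broadcast loop before sending.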
UI/UX Design
Design System (Inspired by Backup-Central)
Colors:
:root {
/* Light Theme */
--primary-color: #2563eb;
--secondary-color: #64748b;
--success-color: #10b981;
--warning-color: #f59e0b;
--error-color: #ef4444;
--bg-color: #f8fafc;
--card-bg: #ffffff;
--text-color: #1e293b;
--border-color: #e2e8f0;
}
[data-theme="dark"] {
/* Dark Theme */
--primary-color: #3b82f6;
--bg-color: #0f172a;
--card-bg: #1e293b;
--text-color: #f1f5f9;
--border-color: #334155;
}
Typography:
font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
Components:
- Cards with subtle shadows
- Rounded corners (8px border-radius)
- Smooth transitions (0.3s ease)
- Gradient accents on hover
- Loading skeletons
- Toast notifications (top-right)
Database Integration
Database Access Strategy
Read Operations (Node.js):
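// Direct SQLite reads for fast queries
// Note: this uses the synchronous better-sqlite3 API, so better-sqlite3 must be
// added to package.json (the sqlite3 package listed there has a callback API)
const Database = require('better-sqlite3');
// Open read-only: all writes go through the Python backend
const db = new Database('/opt/media-downloader/database/media_downloader.db', { readonly: true });
const downloads = db.prepare(`
  SELECT * FROM downloads
  WHERE platform = ?
  ORDER BY download_date DESC
  LIMIT ?
`).all('instagram', 50);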
Write Operations (Python):
// Route through Python backend for consistency
const { spawn } = require('child_process');
function addAccount(platform, username) {
// Update config/settings.json
// Trigger Python process to reload config
}
Why This Approach:
- Python maintains database writes (consistency)
- Node.js reads for fast UI queries
- No duplicate database logic
- Leverages existing connection pooling
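The configuration-file half of the write path could look roughly like the sketch below, which rewrites the settings file through a temporary file and a rename so that the Python process never reads a half-written file. The layout `settings[platform].accounts` is purely an assumption for illustration; the real structure of settings.json is not described in this document, and signalling Python to reload is left out.

```javascript
const fs = require('fs');
const path = require('path');

// Read the settings file, apply `mutate`, and write the result back via a
// temp file + rename in the same directory.
function updateSettings(configPath, mutate) {
  const settings = JSON.parse(fs.readFileSync(configPath, 'utf8'));
  const updated = mutate(settings) || settings; // mutate may edit in place
  const tmpPath = path.join(
    path.dirname(configPath),
    `.${path.basename(configPath)}.${process.pid}.tmp`
  );
  fs.writeFileSync(tmpPath, JSON.stringify(updated, null, 2) + '\n');
  fs.renameSync(tmpPath, configPath);
  return updated;
}

// Assumption: accounts are stored as settings[platform].accounts = [{ username, ... }].
function addAccount(configPath, platform, username) {
  return updateSettings(configPath, (settings) => {
    const section = (settings[platform] ??= {});
    const accounts = (section.accounts ??= []);
    // Adding the same username twice is a no-op.
    if (!accounts.some((account) => account.username === username)) {
      accounts.push({ username, enabled: true });
    }
  });
}
```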
Real-time Updates
WebSocket Architecture
Server-Side (Node.js):
const WebSocket = require('ws');
const readline = require('readline');
const { spawn } = require('child_process');
// `server` is the http.Server Express listens on; the path matches the client URL
const wss = new WebSocket.Server({ server, path: '/api/live' });
// Broadcast to all connected clients
function broadcast(message) {
  wss.clients.forEach(client => {
    if (client.readyState === WebSocket.OPEN) {
      client.send(JSON.stringify(message));
    }
  });
}
// Watch Python logs for events
const pythonProcess = spawn('python3', ['media-downloader.py', '--daemon']);
// stdout arrives in arbitrary chunks, so split it into complete lines first
readline.createInterface({ input: pythonProcess.stdout }).on('line', (line) => {
  // Parse log output and broadcast events
  const event = parseLogEvent(line);
  if (event) broadcast(event);
});
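`parseLogEvent` is referenced above but not defined anywhere in this plan. A minimal sketch is given here under a loudly stated assumption: the Python side would print one line per event in the form `EVENT {json}`. That prefix and format are invented for illustration and would need to be agreed with (or added to) the Python logger.

```javascript
// Assumption: the Python side prints one line per event in the form
//   EVENT {"type": "download_complete", ...}
// Any other log line is ignored.
const EVENT_PREFIX = 'EVENT ';

// Turn one line of Python output into an event object, or null if the line
// is an ordinary log message.
function parseLogEvent(line) {
  const text = line.trim();
  if (!text.startsWith(EVENT_PREFIX)) return null;
  try {
    const event = JSON.parse(text.slice(EVENT_PREFIX.length));
    // Only forward objects that carry a string `type` field.
    return event && typeof event.type === 'string' ? event : null;
  } catch {
    return null; // malformed JSON: treat as an ordinary log line
  }
}
```

Emitting structured lines from Python keeps the Node side free of regular expressions tied to human-readable log wording, which tends to change over time.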
Client-Side (JavaScript):
const ws = new WebSocket('ws://localhost:3000/api/live');
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
switch(data.type) {
case 'download_complete':
addToDownloadFeed(data);
updateStats();
showToast(`Downloaded ${data.filename}`, 'success');
break;
case 'duplicate_detected':
showToast(`Duplicate skipped: ${data.filename}`, 'warning');
break;
}
};
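The client above does not handle a dropped connection. A small reconnect helper with exponential backoff might look like the following; the delay values and the names `reconnectDelay` and `connectWithRetry` are assumptions, and the socket factory is injected so the logic does not depend on a particular WebSocket implementation.

```javascript
// Delay before reconnect attempt N (0-based): 1s, 2s, 4s, ... capped at 30s.
function reconnectDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Open a socket and reopen it with increasing delays whenever it closes.
// `createSocket` returns a WebSocket-like object; `schedule` defaults to setTimeout.
function connectWithRetry(createSocket, onMessage, schedule = setTimeout) {
  let attempt = 0;
  function open() {
    const ws = createSocket();
    ws.onopen = () => {
      attempt = 0; // reset the backoff after a successful connection
    };
    ws.onmessage = (event) => onMessage(JSON.parse(event.data));
    ws.onclose = () => schedule(open, reconnectDelay(attempt++));
  }
  open();
}
```

In the browser this could be called as `connectWithRetry(() => new WebSocket('ws://localhost:3000/api/live'), handleEvent)`, where `handleEvent` would contain the `switch` shown above.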
Security Considerations
Authentication (Optional for Single-User)
Simple Auth:
- Environment variable password
- Session-based auth (express-session)
- No registration needed
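For the environment-variable password, the comparison itself is worth doing in constant time. A minimal sketch using only Node's built-in `crypto` module follows; the function name `passwordMatches` is hypothetical, and the variable that would hold the expected password (for example a `GUI_PASSWORD` entry in `.env`) is not defined anywhere in this plan.

```javascript
const crypto = require('crypto');

// Compare a submitted password with the configured one in constant time.
// Hashing both sides first yields equal-length buffers, which
// crypto.timingSafeEqual requires.
function passwordMatches(submitted, expected) {
  if (typeof submitted !== 'string' || typeof expected !== 'string' || expected === '') {
    return false; // an unset password never matches
  }
  const a = crypto.createHash('sha256').update(submitted).digest();
  const b = crypto.createHash('sha256').update(expected).digest();
  return crypto.timingSafeEqual(a, b);
}
```

On a successful match the login route would then mark the express-session session as authenticated.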
Enhanced Auth (Future):
- TOTP/2FA (speakeasy)
- Passkeys (WebAuthn)
- JWT tokens
- Per-user configurations
API Security
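const helmet = require('helmet');
const cors = require('cors'); // not yet listed in package.json
const rateLimit = require('express-rate-limit'); // not yet listed in package.json
// Helmet for security headers
app.use(helmet());
// CORS configuration: browsers reject credentialed requests when the allowed
// origin is "*", so default to same-origin only unless ALLOWED_ORIGINS is set
app.use(cors({
  origin: process.env.ALLOWED_ORIGINS?.split(',') || false,
  credentials: true
}));
// Rate limiting
const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100 // limit each IP to 100 requests per windowMs
});
app.use('/api/', limiter);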
Environment Variables
# .env
NODE_ENV=production
PORT=3000
SESSION_SECRET=random_secret_key
PYTHON_PATH=/opt/media-downloader/venv/bin/python3
DATABASE_PATH=/opt/media-downloader/database/media_downloader.db
CONFIG_PATH=/opt/media-downloader/config/settings.json
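At startup the server could read these variables once and fail fast on obviously bad values. The sketch below assumes `process.env` has already been populated (for example by calling `require('dotenv').config()` first); the function name `loadEnv`, the defaults, and the rule that rejects the placeholder secret in production are illustrative assumptions.

```javascript
// Read settings from an env-like object, applying defaults and rejecting
// values that would only cause confusing errors later.
function loadEnv(env = process.env) {
  const port = Number.parseInt(env.PORT || '3000', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  const nodeEnv = env.NODE_ENV || 'development';
  // Refuse to run in production with a missing or placeholder session secret.
  if (
    nodeEnv === 'production' &&
    (!env.SESSION_SECRET || env.SESSION_SECRET === 'random_secret_key')
  ) {
    throw new Error('SESSION_SECRET must be set to a real random value in production');
  }
  return {
    nodeEnv,
    port,
    sessionSecret: env.SESSION_SECRET,
    pythonPath: env.PYTHON_PATH || 'python3',
    databasePath: env.DATABASE_PATH,
    configPath: env.CONFIG_PATH,
  };
}
```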
Development Timeline
Estimated Timeline: 8 Weeks
Week 1-2: Backend API
- Express server setup
- Database integration
- Python subprocess bridge
- Basic API endpoints
- WebSocket setup
Week 3-4: Core Frontend
- Dashboard layout
- Platform cards
- Download feed
- Account management UI
- Basic stats
Week 5-6: Advanced Features
- Real-time updates
- Scheduler control
- Config editor
- Logs viewer
- Search/filtering
Week 7-8: Polish
- Mobile responsive
- Dark mode
- Error handling
- Testing
- Documentation
- Deployment
Next Steps
Immediate Actions:
- ✅ File Hash Deduplication - COMPLETED
  - Added SHA256 hashing to unified_database.py
  - Implemented automatic duplicate detection in move_module.py
  - Created utilities for backfilling and managing hashes
  - Scanned 213 existing files and found 30 duplicate groups
- ✅ Directory Cleanup - COMPLETED
  - Moved test files to tests/ directory
  - Moved one-time scripts to archive/
  - Organized utilities in utilities/ directory
  - Removed obsolete documentation
- 📋 Begin GUI Development
  - Initialize Node.js project
  - Set up Express server
  - Create basic API endpoints
  - Build dashboard prototype
References
- Backup-Central: /opt/backup-central - Reference implementation
- Python Backend: /opt/media-downloader/media-downloader.py
- Database Schema: /opt/media-downloader/modules/unified_database.py
- Existing Docs: /opt/media-downloader/archive/ (old GUI plans)
Appendix
Directory Structure After Cleanup
/opt/media-downloader/
├── media-downloader.py (main application)
├── setup.py (installation script)
├── INSTALL.md (installation guide)
├── GUI_DESIGN_PLAN.md (this document)
├── requirements.txt
├── config/
│ └── settings.json
├── database/
│ ├── media_downloader.db
│ └── scheduler_state.db
├── modules/ (16 Python modules)
│ ├── unified_database.py
│ ├── scheduler.py
│ ├── move_module.py
│ ├── instaloader_module.py
│ ├── fastdl_module.py
│ ├── imginn_module.py
│ ├── toolzu_module.py
│ ├── snapchat_module.py
│ ├── tiktok_module.py
│ ├── forum_downloader.py
│ └── ... (10 more modules)
├── utilities/
│ ├── backfill_file_hashes.py
│ ├── cleanup_database_filenames.py
│ └── scan_and_hash_files.py
├── archive/ (old docs, one-time scripts)
│ ├── HIGH_RES_DOWNLOAD.md
│ ├── SNAPCHAT_*.md
│ ├── TOOLZU-TIMESTAMPS.md
│ ├── WEB_GUI_*.md (4 old GUI docs)
│ ├── cleanup_last_week.py
│ ├── merge-quality-upgrade.py
│ ├── reset_database.py
│ └── debug_snapchat.py
├── tests/ (7 test scripts)
│ ├── test_all_notifications.py
│ ├── test_pushover.py
│ └── ... (5 more tests)
├── subprocess wrappers/ (5 wrappers)
│ ├── fastdl_subprocess_wrapper.py
│ ├── imginn_subprocess_wrapper.py
│ ├── toolzu_subprocess_wrapper.py
│ ├── snapchat_subprocess_wrapper.py
│ └── forum_subprocess_wrapper.py
├── venv/ (Python virtual environment)
├── logs/ (application logs)
├── temp/ (temporary download directories)
└── ... (other directories)
End of Document
For questions or updates, refer to this document as the single source of truth for GUI development planning.