# Code Improvements Implementation Guide

**Generated**: 2025-11-09
**Estimated Total Time**: 7-11 hours
**Tasks**: 18

---

## PHASE 1: CRITICAL SECURITY (Priority: HIGHEST)
### 1. Fix Token Exposure in URLs ⏱️ 45min

**Problem**: Tokens passed as query parameters are exposed in server logs, browser history, and `Referer` headers.

**Current Code** (`web/frontend/src/lib/api.ts:558-568`):
```typescript
getMediaThumbnailUrl(filePath: string, mediaType: 'image' | 'video') {
  const token = localStorage.getItem('auth_token')
  const tokenParam = token ? `&token=${encodeURIComponent(token)}` : ''
  return `${API_BASE}/media/thumbnail?file_path=${encodeURIComponent(filePath)}&media_type=${mediaType}${tokenParam}`
}
```

**Solution**: Use session cookies for media endpoints.

**Backend Changes**:
```python
# web/backend/api.py - Remove the token query parameter, rely on cookie auth
@app.get("/api/media/thumbnail")
async def get_media_thumbnail(
    request: Request,
    file_path: str,
    media_type: str,
    current_user: Dict = Depends(get_current_user_from_cookie)  # Cookie only
):
    # Removed: token: str = None parameter
    pass
```

**Frontend Changes**:
```typescript
// web/frontend/src/lib/api.ts
getMediaThumbnailUrl(filePath: string, mediaType: 'image' | 'video') {
  // No token handling - the browser sends the session cookie automatically
  return `${API_BASE}/media/thumbnail?file_path=${encodeURIComponent(filePath)}&media_type=${mediaType}`
}
```

**Testing**:
- [ ] Thumbnails still load after login
- [ ] 401 returned when not authenticated
- [ ] No tokens visible in browser Network tab URLs
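The `get_current_user_from_cookie` dependency is assumed to live elsewhere in the backend; at its core it must read the session cookie and verify the token's signature. A minimal stdlib sketch of that verification step, assuming HMAC-signed tokens of the form `payload.signature` (names and token format here are illustrative, not the project's actual API):

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"<GENERATE-STRONG-SECRET>"  # placeholder; load from config in practice

def sign_token(payload: dict) -> str:
    """Create an HMAC-signed token: base64(json) + '.' + hex signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_token(token: str) -> Optional[dict]:
    """Return the payload if the signature checks out, else None."""
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrong-secret token
    return json.loads(base64.urlsafe_b64decode(body))
```

A FastAPI dependency would pull the token out of `request.cookies`, call a verifier like this, and raise a 401 on `None`.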
---

### 2. Add Path Traversal Validation ⏱️ 30min

**Problem**: File paths from the frontend are not validated, risking `../../../etc/passwd` attacks.

**Solution**: Create a path validation utility.

**New File** (`web/backend/security.py`):
```python
from pathlib import Path
from fastapi import HTTPException

def validate_file_path(file_path: str, allowed_base: Path) -> Path:
    """
    Validate a file path to prevent directory traversal.

    Args:
        file_path: User-provided file path
        allowed_base: Base directory the file must be under

    Returns:
        Resolved Path object

    Raises:
        HTTPException: If path traversal is detected
    """
    try:
        # Resolve to an absolute path (collapses "..")
        real_path = Path(file_path).resolve()
        allowed_base = allowed_base.resolve()
    except (OSError, ValueError) as e:
        raise HTTPException(status_code=400, detail=f"Invalid file path: {e}")

    # Check that the path is under the allowed base.
    # Note: a plain startswith() string check would wrongly accept sibling
    # directories like /downloads-evil; is_relative_to() (Python 3.9+) does not.
    if not real_path.is_relative_to(allowed_base):
        raise HTTPException(
            status_code=403,
            detail="Access denied: Path traversal detected"
        )

    # Check that the file exists
    if not real_path.exists():
        raise HTTPException(status_code=404, detail="File not found")

    return real_path
```

(The try/except covers only path resolution; a blanket `except Exception` would swallow the 403/404 `HTTPException`s and re-raise them as 400.)

**Usage in endpoints**:
```python
from web.backend.security import validate_file_path

@app.get("/api/media/preview")
async def get_media_preview(file_path: str, ...):
    # Validate the path before touching the filesystem
    downloads_base = Path("/opt/media-downloader/downloads")
    safe_path = validate_file_path(file_path, downloads_base)

    # Use safe_path from here on
    return FileResponse(safe_path)
```

**Testing**:
- [ ] Normal paths work: `/downloads/user/image.jpg`
- [ ] Traversal blocked: `/downloads/../../etc/passwd` → 403
- [ ] Absolute paths outside the base blocked: `/etc/passwd` → 403
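One subtlety with the traversal check: comparing resolved paths with a string `startswith()` accepts sibling directories that merely share the prefix, while `Path.is_relative_to()` (Python 3.9+) does not. A quick stdlib demonstration (paths are illustrative):

```python
from pathlib import Path

base = Path("/opt/media-downloader/downloads")

# A sibling directory sharing the prefix fools a string comparison...
evil = Path("/opt/media-downloader/downloads-evil/file.jpg")
assert str(evil).startswith(str(base))   # prefix check wrongly passes
assert not evil.is_relative_to(base)     # is_relative_to correctly rejects

# ...while genuine traversal is caught once the path is resolved
traversal = Path("/opt/media-downloader/downloads/../../../etc/passwd")
assert not traversal.resolve().is_relative_to(base)
```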
---

### 3. Add CSRF Protection ⏱️ 40min

**Problem**: No CSRF tokens; POST/PUT/DELETE endpoints are vulnerable to cross-site request forgery.

**Solution**: Add CSRF middleware.

**Install dependency**:
```bash
pip install starlette-csrf
```

**Backend Changes** (`web/backend/api.py`):
```python
from starlette_csrf import CSRFMiddleware

# Add after other middleware
app.add_middleware(
    CSRFMiddleware,
    secret="<GENERATE-STRONG-SECRET>",  # Load a strong secret from configuration
    cookie_name="csrftoken",
    header_name="X-CSRFToken",
    cookie_secure=True,      # HTTPS only in production
    cookie_httponly=False,   # JS must read it for the SPA double-submit
    cookie_samesite="strict"
)
```

**Frontend Changes** (`web/frontend/src/lib/api.ts`):
```typescript
private async request<T>(
  method: string,
  endpoint: string,
  data?: any
): Promise<T> {
  const token = localStorage.getItem('auth_token')

  // Get the CSRF token from the cookie
  const csrfToken = document.cookie
    .split('; ')
    .find(row => row.startsWith('csrftoken='))
    ?.split('=')[1]

  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
  }

  if (token) {
    headers['Authorization'] = `Bearer ${token}`
  }

  // Attach the CSRF token to state-changing requests
  if (method !== 'GET' && csrfToken) {
    headers['X-CSRFToken'] = csrfToken
  }

  // ... rest of request
}
```

**Testing**:
- [ ] GET requests work without a CSRF token
- [ ] POST/PUT/DELETE work with a CSRF token
- [ ] POST/PUT/DELETE fail (403) without a CSRF token
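What the middleware enforces is the classic double-submit check: the token stored in the cookie must match the token echoed back in the `X-CSRFToken` header. A stdlib sketch of that comparison (the function name is illustrative, not starlette-csrf's internals):

```python
import hmac
from http.cookies import SimpleCookie
from typing import Optional

def csrf_matches(cookie_header: str, header_token: Optional[str]) -> bool:
    """Double-submit check: cookie token must equal the X-CSRFToken header."""
    cookies = SimpleCookie()
    cookies.load(cookie_header)
    morsel = cookies.get("csrftoken")
    if morsel is None or not header_token:
        return False
    # Constant-time comparison avoids leaking token bytes via timing
    return hmac.compare_digest(morsel.value, header_token)
```

A cross-site attacker can force the browser to *send* the cookie but cannot *read* it, so it cannot forge a matching header.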
---

### 4. Add Rate Limiting to Endpoints ⏱️ 20min

**Problem**: Rate limiting is configured but not applied to most routes.

**Solution**: Add `@limiter.limit()` decorators.

**Current State** (`web/backend/api.py:320-325`):
```python
limiter = Limiter(
    key_func=get_remote_address,
    default_limits=["200/minute"]
)
# But not applied to routes!
```

**Fix - add to all sensitive endpoints**:
```python
# Auth endpoints - strict
@app.post("/api/auth/login")
@limiter.limit("5/minute")  # Add this
async def login(credentials: LoginRequest, request: Request):
    pass

# Config updates - moderate
@app.put("/api/settings/config")
@limiter.limit("30/minute")  # Add this
async def update_config(...):
    pass

# Download triggers - moderate
@app.post("/api/scheduler/trigger")
@limiter.limit("10/minute")  # Add this
async def trigger_download(...):
    pass

# Media endpoints already have limits - verify they work
@app.get("/api/media/thumbnail")
@limiter.limit("5000/minute")  # Already present ✓
async def get_media_thumbnail(...):
    pass
```

Note: slowapi's `@limiter.limit()` requires each decorated endpoint to accept a `request: Request` parameter.

**Testing**:
- [ ] Login limited to 5 attempts/minute
- [ ] Repeated config updates return 429 after the limit
- [ ] Rate limit resets after the time window
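Under the hood, a limit like `5/minute` is just a counter per key per time window. A self-contained sliding-window sketch (not slowapi's actual implementation; `now` is injectable to make it testable):

```python
import time
from collections import defaultdict, deque
from typing import Optional

class SlidingWindowLimiter:
    """Allow at most `limit` hits per `window` seconds for each key."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # key -> timestamps of recent hits

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        # Drop hits that have fallen out of the window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit -> respond 429
        q.append(now)
        return True
```

In production the counters must live in a shared store (e.g. Redis) rather than per-process memory, otherwise each worker enforces its own separate limit.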
---

### 5. Add Input Validation on Config Updates ⏱️ 35min

**Problem**: Config updates lack validation, so invalid values can be persisted.

**Solution**: Use Pydantic models for validation.

**Create validation models** (`web/backend/models.py`):
```python
from typing import Optional

from pydantic import BaseModel, Field, validator

class PushoverConfig(BaseModel):
    enabled: bool
    user_key: Optional[str] = Field(None, min_length=30, max_length=30)
    api_token: Optional[str] = Field(None, min_length=30, max_length=30)
    priority: int = Field(0, ge=-2, le=2)
    sound: str = Field("pushover", regex="^[a-z_]+$")

    @validator('user_key', 'api_token')
    def validate_keys(cls, v):
        if v and not v.isalnum():
            raise ValueError("Keys must be alphanumeric")
        return v

class SchedulerConfig(BaseModel):
    enabled: bool
    interval_hours: int = Field(24, ge=1, le=168)  # 1 hour to 1 week
    randomize: bool = True
    randomize_minutes: int = Field(30, ge=0, le=180)

class ConfigUpdate(BaseModel):
    pushover: Optional[PushoverConfig]
    scheduler: Optional[SchedulerConfig]
    # ... other config sections
```

(Pydantic v1 syntax; under Pydantic v2 use `@field_validator` and `pattern=` instead of `regex=`.)

**Use in the endpoint**:
```python
@app.put("/api/settings/config")
@limiter.limit("30/minute")
async def update_config(
    request: Request,
    config: ConfigUpdate,  # Pydantic validates on parse
    current_user: Dict = Depends(get_current_user)
):
    # config is already validated - safe to use
    pass
```

**Testing**:
- [ ] Valid config updates succeed
- [ ] Invalid values return 422 with details
- [ ] SQL injection payloads rejected by the field constraints
- [ ] XSS payloads rejected by the field constraints
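For intuition, the `SchedulerConfig` bounds above amount to plain range checks; stripped of Pydantic, the equivalent looks like this (a sketch mirroring the bounds above, not project code):

```python
from dataclasses import dataclass

@dataclass
class SchedulerConfigPlain:
    enabled: bool
    interval_hours: int = 24
    randomize: bool = True
    randomize_minutes: int = 30

    def __post_init__(self):
        # Same bounds as Field(24, ge=1, le=168): 1 hour to 1 week
        if not 1 <= self.interval_hours <= 168:
            raise ValueError("interval_hours must be between 1 and 168")
        # Same bounds as Field(30, ge=0, le=180)
        if not 0 <= self.randomize_minutes <= 180:
            raise ValueError("randomize_minutes must be between 0 and 180")
```

The advantage of the Pydantic version is that FastAPI turns these `ValueError`s into structured 422 responses automatically.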
---

## PHASE 2: PERFORMANCE (Priority: HIGH)

### 6. Add Database Indexes ⏱️ 15min

**Problem**: Missing composite index for deduplication queries.

**Solution**: Add indexes in `unified_database.py`.

```python
# modules/unified_database.py - in _create_indexes()
def _create_indexes(self, cursor):
    """Create indexes for better query performance"""

    # Existing indexes...

    # NEW: Composite partial index for deduplication
    cursor.execute('''
        CREATE INDEX IF NOT EXISTS idx_file_hash_platform
        ON downloads(file_hash, platform)
        WHERE file_hash IS NOT NULL
    ''')

    # NEW: Expression index for metadata searches
    # (requires an SQLite build with the JSON functions)
    cursor.execute('''
        CREATE INDEX IF NOT EXISTS idx_metadata_media_id
        ON downloads(json_extract(metadata, '$.media_id'))
        WHERE metadata IS NOT NULL
    ''')
```

**Testing**:
```sql
EXPLAIN QUERY PLAN
SELECT * FROM downloads
WHERE file_hash = 'abc123' AND platform = 'fastdl';
-- Should show "USING INDEX idx_file_hash_platform"
```
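The `EXPLAIN QUERY PLAN` check can be scripted against an in-memory database to confirm the partial index is actually chosen (schema trimmed to the relevant columns):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE downloads (id INTEGER PRIMARY KEY, file_hash TEXT, platform TEXT)"
)
con.execute("""
    CREATE INDEX IF NOT EXISTS idx_file_hash_platform
    ON downloads(file_hash, platform)
    WHERE file_hash IS NOT NULL
""")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM downloads "
    "WHERE file_hash = 'abc123' AND platform = 'fastdl'"
).fetchall()
# The plan's detail column should name the partial index
assert any("idx_file_hash_platform" in row[-1] for row in plan)
```

The equality on `file_hash` implies `file_hash IS NOT NULL`, which is why the planner is allowed to use the partial index here.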
---

### 7. Fix JSON Metadata Searches ⏱️ 45min

**Problem**: `LIKE '%...%'` searches over the JSON metadata column force full table scans.

**Current Code** (`modules/unified_database.py:576-590`):
```python
cursor.execute('''
    SELECT ... WHERE metadata LIKE ? OR metadata LIKE ?
''', (f'%"media_id": "{media_id}"%', f'%"media_id"%{media_id}%'))
```

**Solution Option 1**: Extract `media_id` into its own column (BEST)
```python
# One-time migration: add the column and index it
cursor.execute('ALTER TABLE downloads ADD COLUMN media_id TEXT')
cursor.execute('CREATE INDEX idx_media_id ON downloads(media_id)')

# When inserting:
media_id = metadata_dict.get('media_id')
cursor.execute('''
    INSERT INTO downloads (..., metadata, media_id)
    VALUES (..., ?, ?)
''', (json.dumps(metadata_dict), media_id))

# The lookup becomes an indexed equality search:
cursor.execute('SELECT * FROM downloads WHERE media_id = ?', (media_id,))
```

**Solution Option 2**: Use `json_extract` (built in from SQLite 3.38; earlier builds need the JSON1 extension)
```python
cursor.execute('''
    SELECT * FROM downloads
    WHERE json_extract(metadata, '$.media_id') = ?
''', (media_id,))
```

Pair Option 2 with the `idx_metadata_media_id` expression index from Task 6; without it, `json_extract` still scans every row.
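Option 1 also needs a one-time backfill for rows that already exist. A sketch against the schema above (column names as in the guide; in practice wrap it in a transaction):

```python
import json
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE downloads (id INTEGER PRIMARY KEY, metadata TEXT)")
con.execute(
    "INSERT INTO downloads (metadata) VALUES (?)",
    (json.dumps({"media_id": "m-42", "title": "clip"}),),
)

# Migration: add the column, backfill it from the JSON blob, then index it
con.execute("ALTER TABLE downloads ADD COLUMN media_id TEXT")
rows = con.execute("SELECT id, metadata FROM downloads").fetchall()
for row_id, meta in rows:
    media_id = json.loads(meta).get("media_id") if meta else None
    con.execute("UPDATE downloads SET media_id = ? WHERE id = ?", (media_id, row_id))
con.execute("CREATE INDEX IF NOT EXISTS idx_media_id ON downloads(media_id)")
con.commit()
```

Parsing the JSON in Python keeps the backfill independent of whether the SQLite build ships the JSON functions.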
---

### 8. Add Redis Result Caching ⏱️ 60min

**Requires**: A running Redis server
**Install**: `pip install redis`

**Setup** (`web/backend/cache.py`):
```python
import json
from functools import wraps

import redis

redis_client = redis.Redis(
    host='localhost',
    port=6379,
    decode_responses=True
)

def cache_result(ttl: int = 300):
    """
    Decorator to cache async function results in Redis.

    Args:
        ttl: Time to live in seconds
    """
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            # Build a cache key from the function name and arguments
            key = f"cache:{func.__name__}:{hash(str(args) + str(kwargs))}"

            # Try the cache first
            cached = redis_client.get(key)
            if cached is not None:
                return json.loads(cached)

            # Cache miss: execute and store
            result = await func(*args, **kwargs)
            redis_client.setex(key, ttl, json.dumps(result))
            return result
        return wrapper
    return decorator
```

**Usage**:
```python
from web.backend.cache import cache_result

@app.get("/api/stats/platforms")
@cache_result(ttl=300)  # Cache for 5 minutes
async def get_platform_stats():
    # Expensive database query
    return stats
```
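One caveat with the key scheme above: Python's builtin `hash()` is randomized per process for strings, so `hash(str(args) + str(kwargs))` yields different keys after every restart and across workers, which silently defeats the cache. A stable digest fixes that:

```python
import hashlib

def cache_key(func_name: str, *args, **kwargs) -> str:
    """Build a cache key that is stable across processes and restarts."""
    # Sort kwargs so {'a': 1, 'b': 2} and {'b': 2, 'a': 1} hash identically
    payload = repr((args, sorted(kwargs.items())))
    digest = hashlib.sha256(payload.encode()).hexdigest()[:16]
    return f"cache:{func_name}:{digest}"
```

Drop this in where `cache_result` builds `key`, in place of the `hash(...)` expression.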
---

## PHASE 3-5: Additional Tasks

Due to space constraints, see the separate files:
- `docs/IMPLEMENTATION_CODE_QUALITY.md` - Tasks 9-12
- `docs/IMPLEMENTATION_RELIABILITY.md` - Tasks 13-16
- `docs/IMPLEMENTATION_UI.md` - Tasks 17-18

---

## Quick Start Checklist

**Today (30-60 min):**
- [ ] Task 2: Path validation (30min) - highest security ROI
- [ ] Task 4: Rate limiting (20min) - easy win
- [ ] Task 6: Database indexes (15min) - instant performance boost

**This Week (2-3 hours):**
- [ ] Task 1: Token exposure fix
- [ ] Task 3: CSRF protection
- [ ] Task 5: Input validation

**Next Week (4-6 hours):**
- [ ] Performance optimizations (Tasks 7-8)
- [ ] Code quality improvements (Tasks 9-12)

**Later (2-3 hours):**
- [ ] Reliability improvements (Tasks 13-16)
- [ ] UI enhancements (Tasks 17-18)