media-downloader/docs/REFACTORING_GUIDE.md
Todd 0d7b2b1aab Initial commit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 22:42:55 -04:00

# Code Refactoring Guide
**Version:** 6.52.38
**Date:** 2025-12-05
**Status:** In Progress - Gradual Migration

---
## Overview
This document describes the refactoring infrastructure added to address the critical technical debt issues identified in the comprehensive code review (`docs/COMPREHENSIVE_CODE_REVIEW.md`).
## Changes Introduced
### 1. New Core Infrastructure (`web/backend/core/`)
#### `core/config.py` - Unified Configuration Manager
- **Purpose:** Single source of truth for all configuration values
- **Benefits:** Replaces the four or more separate config-loading approaches previously in use
- **Usage:**
```python
from web.backend.core.config import settings
# Access configuration
db_path = settings.DB_PATH
timeout = settings.PROCESS_TIMEOUT_MEDIUM
media_base = settings.MEDIA_BASE_PATH
```
**Priority Hierarchy:**
1. Environment variables (highest)
2. .env file values
3. Database settings
4. Hardcoded defaults (lowest)
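
The priority order above can be sketched as a layered lookup. This is a minimal illustration under stated assumptions, not the actual `core/config.py`: the `resolve_setting` helper, its parameters, and all sample values are hypothetical.

```python
import os
from typing import Any, Mapping, Optional


def resolve_setting(
    key: str,
    dotenv_values: Mapping[str, str],
    db_settings: Mapping[str, Any],
    defaults: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Any:
    """Return the first value found, checking sources from highest to lowest priority."""
    env = os.environ if environ is None else environ
    # Order mirrors the hierarchy: environment, .env file, database, defaults.
    for source in (env, dotenv_values, db_settings, defaults):
        if key in source:
            return source[key]
    raise KeyError(f"No value configured for {key!r}")


# Hypothetical sample data for each layer.
defaults = {"PROCESS_TIMEOUT_MEDIUM": 300, "DB_PATH": "data/app.db"}
db_settings = {"PROCESS_TIMEOUT_MEDIUM": 600}
dotenv_values = {"DB_PATH": "/srv/app.db"}
fake_environ = {"PROCESS_TIMEOUT_MEDIUM": "900"}

timeout = resolve_setting(
    "PROCESS_TIMEOUT_MEDIUM", dotenv_values, db_settings, defaults, fake_environ
)
db_path = resolve_setting("DB_PATH", dotenv_values, db_settings, defaults, fake_environ)
```

Values coming from the environment or a `.env` file are strings, so a real configuration manager would likely also need a type-conversion step per setting.
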
---
#### `core/exceptions.py` - Custom Exception Classes
- **Purpose:** Replace broad `except Exception` with specific exceptions
- **Benefits:** Better error handling, debugging, and HTTP status code mapping
- **Usage:**
```python
from web.backend.core.exceptions import (
    DatabaseError,
    DatabaseQueryError,
    RecordNotFoundError,
    DownloadError,
    NetworkError,
    ValidationError,
    handle_exceptions,
)

# Raising specific exceptions
if not record:
    raise RecordNotFoundError("Download not found", {"id": download_id})

# Using decorator for automatic HTTP conversion
@router.get("/api/something")
@handle_exceptions
async def get_something():
    # Exceptions automatically converted to proper HTTP responses
    pass
```
```
**Exception Mapping:**
| Exception | HTTP Status |
|-----------|-------------|
| ValidationError | 400 |
| AuthError | 401 |
| InsufficientPermissionsError | 403 |
| RecordNotFoundError | 404 |
| DuplicateRecordError | 409 |
| RateLimitError | 429 |
| DatabaseError | 500 |
| NetworkError | 502 |
| PlatformUnavailableError | 503 |
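
One way such a mapping could be implemented is to attach the status code to each exception class and translate at the boundary. The sketch below is an assumption about the approach, not the contents of `core/exceptions.py`: the `AppError` base class, the flat hierarchy, the `to_http` helper, and the response body shape are all hypothetical, and only four of the exceptions from the table are shown.

```python
from typing import Any, Dict, Optional, Tuple


class AppError(Exception):
    """Hypothetical common base class; the real hierarchy may differ."""

    status_code = 500

    def __init__(self, message: str, details: Optional[Dict[str, Any]] = None):
        super().__init__(message)
        self.message = message
        self.details = details or {}


class ValidationError(AppError):
    status_code = 400


class RecordNotFoundError(AppError):
    status_code = 404


class DatabaseError(AppError):
    status_code = 500


class NetworkError(AppError):
    status_code = 502


def to_http(exc: Exception) -> Tuple[int, Dict[str, Any]]:
    """Translate an exception into (status, body), defaulting to 500."""
    if isinstance(exc, AppError):
        return exc.status_code, {"error": exc.message, "details": exc.details}
    # Unknown exceptions are reported without leaking internal details.
    return 500, {"error": "Internal server error", "details": {}}


status, body = to_http(RecordNotFoundError("Download not found", {"id": 7}))
```

A decorator such as `handle_exceptions` could wrap an endpoint in `try`/`except` and pass any caught exception through a function like this before building the HTTP response.
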
---
#### `core/dependencies.py` - Shared Dependencies
- **Purpose:** Centralized FastAPI dependencies for authentication and services
- **Benefits:** Consistent auth behavior across all routers
- **Usage:**
```python
from typing import Dict

from fastapi import APIRouter, Depends

from web.backend.core.dependencies import (
    get_current_user,
    get_current_user_optional,
    get_current_user_media,
    require_admin,
    get_database,
    get_settings_manager,
    get_app_state,
)

router = APIRouter()

@router.get("/api/protected")
async def protected_endpoint(current_user: Dict = Depends(get_current_user)):
    # User is authenticated
    pass

@router.delete("/api/admin-only")
async def admin_endpoint(current_user: Dict = Depends(require_admin)):
    # User must be admin
    pass
```
---
#### `core/responses.py` - Standardized Response Format
- **Purpose:** Consistent response structure and date handling
- **Benefits:** Uniform API contract, ISO 8601 dates everywhere
- **Usage:**
```python
from web.backend.core.responses import (
    success,
    error,
    paginated,
    to_iso8601,
    from_iso8601,
    now_iso8601,
)

# Success response
return success(data={"id": 1}, message="Created successfully")
# Output: {"success": true, "message": "Created successfully", "data": {"id": 1}}

# Paginated response
return paginated(items=results, total=100, page=1, page_size=20)
# Output: {"items": [...], "total": 100, "page": 1, "page_size": 20, "has_more": true}

# Date formatting
timestamp = now_iso8601()  # "2025-12-05T10:30:00Z"
dt = from_iso8601("2025-12-05T10:30:00Z")  # datetime object
```
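
For orientation, helpers with the output shapes shown above could look roughly like this. It is a standard-library sketch that only reuses the function names from the usage example; the bodies, the `has_more` rule, and the treatment of naive datetimes as UTC are assumptions rather than the actual `core/responses.py`.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


def to_iso8601(dt: datetime) -> str:
    """Format a datetime as UTC ISO 8601 with a trailing 'Z'."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def from_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 string; a trailing 'Z' is treated as UTC."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def success(data: Any = None, message: Optional[str] = None) -> Dict[str, Any]:
    """Build the success envelope shown in the usage example."""
    return {"success": True, "message": message, "data": data}


def paginated(items: List[Any], total: int, page: int, page_size: int) -> Dict[str, Any]:
    """Build a paginated envelope for a 1-indexed page number."""
    return {
        "items": items,
        "total": total,
        "page": page,
        "page_size": page_size,
        # More pages remain while the rows covered so far are fewer than the total.
        "has_more": page * page_size < total,
    }


first_page = paginated(items=[], total=100, page=1, page_size=20)
last_page = paginated(items=[], total=100, page=5, page_size=20)
```
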
---
### 2. Modular Routers (`web/backend/routers/`)
#### Structure
```
web/backend/routers/
├── __init__.py
├── auth.py # Authentication endpoints
├── health.py # Health check endpoints
└── (more to be added)
```
#### Creating New Routers
```python
# Example: routers/downloads.py
from typing import Dict

from fastapi import APIRouter, Depends

from ..core.dependencies import get_current_user
from ..core.exceptions import handle_exceptions

router = APIRouter(prefix="/api/downloads", tags=["Downloads"])

@router.get("/")
@handle_exceptions
async def list_downloads(current_user: Dict = Depends(get_current_user)):
    # Implementation
    pass
```
---
### 3. Pydantic Models (`web/backend/models/`)
#### `models/api_models.py`
- **Purpose:** Centralized request/response models with validation
- **Benefits:** Type safety, automatic validation, documentation
- **Usage:**
```python
from fastapi import APIRouter

from web.backend.models.api_models import (
    LoginRequest,
    DownloadResponse,
    BatchDeleteRequest,
    PaginatedResponse,
)

router = APIRouter()

@router.post("/batch-delete")
async def batch_delete(request: BatchDeleteRequest):
    # request.file_paths is validated as List[str] with min 1 item
    pass
```
---
### 4. Base Instagram Downloader (`modules/instagram/`)
#### `modules/instagram/base.py`
- **Purpose:** Extract common functionality from FastDL, ImgInn, Toolzu modules
- **Benefits:** Expected 60-70% code reduction across the three modules once they are migrated, consistent behavior, easier maintenance
#### Common Features Extracted:
- Cookie management (database and file-based)
- FlareSolverr/Cloudflare bypass integration
- Rate limiting and batch delays
- Browser management (Playwright)
- Download tracking
- Logging standardization
#### Usage:
```python
from modules.instagram.base import BaseInstagramDownloader


class MyDownloader(BaseInstagramDownloader):
    SCRAPER_ID = "my_scraper"
    BASE_URL = "https://example.com"

    def _get_content_urls(self, username, content_type):
        # Implementation specific to this scraper
        pass

    def _parse_content(self, html, content_type):
        # Implementation specific to this scraper
        pass

    def _extract_download_url(self, item):
        # Implementation specific to this scraper
        pass
```
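
The subclass above only fills in scraper-specific hooks; the shared workflow is meant to live in the base class. A rough sketch of that template-method arrangement follows. `SketchBaseDownloader`, `collect_download_urls`, `_fetch`, and `BATCH_DELAY_SECONDS` are hypothetical names for illustration and are not taken from `modules/instagram/base.py`.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class SketchBaseDownloader(ABC):
    """Hypothetical stand-in for BaseInstagramDownloader; not the real class."""

    SCRAPER_ID = "base"
    BATCH_DELAY_SECONDS = 0.0  # assumed knob for the shared batch delay

    def collect_download_urls(self, username: str, content_type: str) -> List[str]:
        """Shared workflow: fetch each page, parse items, extract URLs, then pause."""
        urls: List[str] = []
        for page_url in self._get_content_urls(username, content_type):
            html = self._fetch(page_url)
            for item in self._parse_content(html, content_type):
                urls.append(self._extract_download_url(item))
            time.sleep(self.BATCH_DELAY_SECONDS)  # common rate limiting lives here
        return urls

    def _fetch(self, url: str) -> str:
        # Placeholder: a real base class would go through its browser and cookies.
        return f"<html data-src='{url}'></html>"

    @abstractmethod
    def _get_content_urls(self, username: str, content_type: str) -> List[str]:
        """Return the page URLs to scrape for this user and content type."""

    @abstractmethod
    def _parse_content(self, html: str, content_type: str) -> List[Dict[str, Any]]:
        """Return the items found in one page of HTML."""

    @abstractmethod
    def _extract_download_url(self, item: Dict[str, Any]) -> str:
        """Return the downloadable URL for one parsed item."""


class DemoDownloader(SketchBaseDownloader):
    SCRAPER_ID = "demo"
    BASE_URL = "https://example.com"

    def _get_content_urls(self, username, content_type):
        return [f"{self.BASE_URL}/{username}/{content_type}"]

    def _parse_content(self, html, content_type):
        return [{"src": html}]

    def _extract_download_url(self, item):
        return item["src"]


urls = DemoDownloader().collect_download_urls("someuser", "posts")
```

Because the hooks are declared abstract, a subclass that forgets one of them fails at instantiation rather than midway through a download run.
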
---
## Migration Plan
### Phase 1: Infrastructure (Complete)
- [x] Create `core/config.py` - Unified configuration
- [x] Create `core/exceptions.py` - Custom exceptions
- [x] Create `core/dependencies.py` - Shared dependencies
- [x] Create `core/responses.py` - Response standardization
- [x] Create `models/api_models.py` - Pydantic models
- [x] Create `modules/instagram/base.py` - Base class
### Phase 2: Router Migration (In Progress)
- [x] Create `routers/auth.py`
- [x] Create `routers/health.py`
- [ ] Create `routers/downloads.py`
- [ ] Create `routers/media.py`
- [ ] Create `routers/scheduler.py`
- [ ] Create `routers/face_recognition.py`
- [ ] Create `routers/recycle.py`
- [ ] Create `routers/review.py`
- [ ] Create `routers/video.py`
- [ ] Create remaining routers
### Phase 3: Module Refactoring (Pending)
- [ ] Refactor `fastdl_module.py` to use base class
- [ ] Refactor `imginn_module.py` to use base class
- [ ] Refactor `toolzu_module.py` to use base class
- [ ] Update tests
### Phase 4: Cleanup (Pending)
- [ ] Replace broad exception handlers gradually
- [ ] Migrate sync HTTP to async httpx
- [ ] Remove deprecated code
- [ ] Update documentation
---
## Backwards Compatibility
The new infrastructure is designed for gradual migration:
1. **api.py remains functional** - The monolithic file continues to work
2. **New routers can be added incrementally** - Include in main app as ready
3. **Base classes are optional** - Existing modules work unchanged
4. **No breaking changes** - All existing API contracts preserved
---
## Testing
When migrating an endpoint to a router:
1. Create the router file
2. Move endpoint code
3. Update imports to use new core modules
4. Add `@handle_exceptions` decorator
5. Test endpoint manually
6. Add unit tests
7. Remove from api.py when confident
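
For step 6, one lightweight option is to call the handler coroutine directly instead of going through HTTP. The sketch below uses only the standard library; `get_download`, `FAKE_DB`, and the local `RecordNotFoundError` are hypothetical stand-ins, not code from this project.

```python
import asyncio
import unittest


class RecordNotFoundError(Exception):
    """Stand-in for web.backend.core.exceptions.RecordNotFoundError."""


FAKE_DB = {1: {"id": 1, "filename": "clip.mp4"}}  # hypothetical fixture


async def get_download(download_id: int) -> dict:
    """Hypothetical migrated handler, with routing and auth stripped away."""
    record = FAKE_DB.get(download_id)
    if not record:
        raise RecordNotFoundError("Download not found")
    return {"success": True, "data": record}


class GetDownloadTests(unittest.TestCase):
    def test_returns_record(self):
        result = asyncio.run(get_download(1))
        self.assertEqual(result["data"]["filename"], "clip.mp4")

    def test_missing_record_raises(self):
        with self.assertRaises(RecordNotFoundError):
            asyncio.run(get_download(999))
```

Tests at this level check the handler logic only; routing, dependencies, and the `@handle_exceptions` conversion would still need the manual check from step 5 or a test that exercises the full app.
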
---
## Files Created
| File | Purpose | Lines |
|------|---------|-------|
| `web/backend/core/__init__.py` | Core module init | 1 |
| `web/backend/core/config.py` | Configuration manager | 95 |
| `web/backend/core/exceptions.py` | Custom exceptions | 250 |
| `web/backend/core/dependencies.py` | Shared dependencies | 150 |
| `web/backend/core/responses.py` | Response formatting | 140 |
| `web/backend/routers/__init__.py` | Routers init | 1 |
| `web/backend/routers/auth.py` | Auth endpoints | 170 |
| `web/backend/routers/health.py` | Health endpoints | 300 |
| `web/backend/models/__init__.py` | Models init | 1 |
| `web/backend/models/api_models.py` | Pydantic models | 350 |
| `web/backend/services/__init__.py` | Services init | 1 |
| `modules/instagram/__init__.py` | Instagram module init | 2 |
| `modules/instagram/base.py` | Base downloader class | 400 |

**Total new code:** ~1,860 lines

---
## Next Steps
1. **Immediate:** Test routers with current api.py
2. **Short-term:** Migrate remaining routers gradually
3. **Medium-term:** Refactor Instagram modules to use base class
4. **Long-term:** Replace all broad exception handlers, add async HTTP
---
## Related Documentation
- `docs/COMPREHENSIVE_CODE_REVIEW.md` - Full code review
- `docs/TECHNICAL_DEBT_ANALYSIS.md` - Original technical debt analysis
- `docs/FEATURE_ROADMAP_2025.md` - Feature roadmap