291
docs/PLAN-standardized-filenames.md
Normal file
@@ -0,0 +1,291 @@
# Plan: Standardized Filename Format with EXIF Metadata

## Overview
Standardize filenames across all download platforms to one consistent format, and store descriptive metadata (title, caption, description) in the file's embedded metadata (EXIF for images, container tags for videos) rather than in the filename.

### Target Filename Format
```
{source}_{YYYYMMDD}_{HHMMSS}_{media_id}.{ext}
```
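
A filename can be checked against this format with a regular expression. The sketch below is illustrative only: the function name `parse_standardized_filename` is hypothetical, and it assumes the date and time fields are always eight and six digits, so they can anchor the split even when `source` or `media_id` contain underscores.

```python
import re
from typing import Optional

# Hypothetical pattern for {source}_{YYYYMMDD}_{HHMMSS}_{media_id}.{ext}.
# The fixed-width date/time pair anchors the match; source is matched lazily.
STANDARD_NAME = re.compile(
    r'(?P<source>.+?)_(?P<date>\d{8})_(?P<time>\d{6})_(?P<media_id>.+)\.(?P<ext>[^.]+)'
)


def parse_standardized_filename(filename: str) -> Optional[dict]:
    """Return the filename components, or None if the name is not in the target format."""
    match = STANDARD_NAME.fullmatch(filename)
    return match.groupdict() if match else None
```

Names that already match (the Instagram and Snapchat examples in this plan) return a dict of components; names in an older layout return `None` and would need renaming.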

### Current vs Target by Platform

| Platform | Current Format | Status |
|----------|---------------|--------|
| Instagram | `evalongoria_20251016_123456_18529350958013602.jpg` | Already correct |
| Snapchat | `evalongoria_20251113_140600_Xr8sJ936p31PrqwxCaDKQ.mp4` | Already correct |
| TikTok | `20251218_title here_7585297468103855391_0.mp4` | Needs change |
| YouTube | `title [video_id].mp4` | Needs change |
| Dailymotion | `title_video_id.mp4` | Needs change |
| Bilibili | `title_video_id.mp4` | Needs change |
| Erome | `title_video_id.mp4` | Needs change |

### User Preferences (Confirmed)
- **Migration**: Migrate existing files to new format
- **Video metadata**: Use ffmpeg remux (fast, no re-encoding)
- **Missing date**: Use existing filesystem timestamp
- **Channel folders**: Organize video downloads by channel subfolder (except TikTok)

### Target Directory Structure

Videos (except TikTok) will be organized by channel:
```
/opt/immich/md/youtube/{channel_name}/{filename}.mp4
/opt/immich/md/dailymotion/{channel_name}/{filename}.mp4
/opt/immich/md/bilibili/{channel_name}/{filename}.mp4
/opt/immich/md/erome/{channel_name}/{filename}.mp4
```

TikTok stays flat (no channel folders):
```
/opt/immich/md/tiktok/{filename}.mp4
```

Example:
- Before: `/opt/immich/md/youtube/20251112_Video Title_abc123.mp4`
- After: `/opt/immich/md/youtube/snapthefamous/snapthefamous_20251112_abc123.mp4`

### Existing Metadata Status

yt-dlp already embeds: `title`, `artist`, `date`, `comment` (URL), `description`, `synopsis`

| Platform | Has Embedded Metadata? | Migration Action |
|----------|----------------------|------------------|
| YouTube | Yes (verified via ffprobe) | Rename only |
| Dailymotion | Yes (yt-dlp) | Rename only |
| Bilibili | Yes (verified via ffprobe) | Rename only |
| Erome | Yes (yt-dlp) | Rename only |
| TikTok | No | Rename + write metadata |
| Instagram | No | Rename + write metadata |
| Snapchat | No | Filename already OK, add metadata |

**Key insight:** Existing files have embedded metadata but the lightbox doesn't READ it.
The lightbox only shows database fields, not actual file metadata.

---

## Phase 1: Create Shared Metadata Utilities

**New file:** `/opt/media-downloader/modules/metadata_utils.py`

### Functions:
- `write_image_metadata(file_path, metadata)` - Write to EXIF via exiftool
- `write_video_metadata(file_path, metadata)` - Write via ffmpeg remux
- `read_file_metadata(file_path)` - Read existing metadata
- `generate_standardized_filename(source, date, media_id, ext)` - Generate standard filename
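
As a rough illustration of the last function, the sketch below builds a name in the target format. It assumes `date` is a `datetime` object and that `ext` may arrive with or without a leading dot; neither detail is fixed by this plan.

```python
from datetime import datetime


def generate_standardized_filename(source: str, date: datetime, media_id: str, ext: str) -> str:
    """Build {source}_{YYYYMMDD}_{HHMMSS}_{media_id}.{ext} (sketch, not the final implementation)."""
    stamp = date.strftime('%Y%m%d_%H%M%S')
    # Accept both 'jpg' and '.jpg' so callers can pass Path.suffix directly (assumption)
    clean_ext = ext.lstrip('.')
    return f'{source}_{stamp}_{media_id}.{clean_ext}'
```

Called with the values from the Instagram example in this plan, it reproduces `evalongoria_20251016_123456_18529350958013602.jpg`.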

### EXIF Fields for Images:
- `ImageDescription`: title/caption
- `XPComment`: full description
- `Artist`: source/uploader
- `DateTimeOriginal`: post date
- `UserComment`: source URL
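
One possible shape for `write_image_metadata()` is to map a metadata dict onto these tags and pass them to exiftool as `-TAG=value` arguments. The dict keys (`title`, `description`, `artist`, `url`, `date`) and the helper name `build_exiftool_command` are assumptions made for this sketch; `date` is assumed to be a `datetime`.

```python
import subprocess
from datetime import datetime

# Assumed mapping from metadata dict keys to the EXIF tags listed above
IMAGE_TAG_MAP = {
    'title': 'ImageDescription',
    'description': 'XPComment',
    'artist': 'Artist',
    'url': 'UserComment',
}


def build_exiftool_command(file_path, metadata: dict) -> list:
    """Build the exiftool argument list; empty or missing values are skipped."""
    cmd = ['exiftool', '-overwrite_original']
    for key, tag in IMAGE_TAG_MAP.items():
        value = metadata.get(key)
        if value:
            cmd.append(f'-{tag}={value}')
    date = metadata.get('date')
    if date:
        # EXIF dates use colons in the date part: YYYY:MM:DD HH:MM:SS
        cmd.append('-DateTimeOriginal=' + date.strftime('%Y:%m:%d %H:%M:%S'))
    cmd.append(str(file_path))
    return cmd


def write_image_metadata(file_path, metadata: dict) -> None:
    """Write the tags in place; raises CalledProcessError if exiftool fails."""
    subprocess.run(build_exiftool_command(file_path, metadata), check=True, capture_output=True)
```

Passing the arguments as a list avoids shell quoting problems with captions that contain quotes or newlines.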

### Video Metadata Fields:
- `title`, `artist`, `description`, `comment`, `date`
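
A remux-based `write_video_metadata()` could look roughly like the following: copy all streams without re-encoding into a temporary file while setting the tags, then replace the original. The helper name `build_remux_command` and the `.tmp` naming scheme are assumptions for this sketch, and `-map 0` (keep every stream) may need adjusting for files with streams the output container cannot hold.

```python
import os
import subprocess
from pathlib import Path

VIDEO_TAGS = ('title', 'artist', 'description', 'comment', 'date')


def build_remux_command(src: Path, tmp: Path, metadata: dict) -> list:
    """Build an ffmpeg stream-copy command that keeps existing tags and adds new ones."""
    cmd = ['ffmpeg', '-y', '-i', str(src), '-map', '0', '-map_metadata', '0', '-c', 'copy']
    for tag in VIDEO_TAGS:
        value = metadata.get(tag)
        if value:
            cmd += ['-metadata', f'{tag}={value}']
    cmd.append(str(tmp))
    return cmd


def write_video_metadata(file_path, metadata: dict) -> None:
    """Remux into a sibling temp file, then atomically swap it over the original."""
    src = Path(file_path)
    # Keep the real extension last so ffmpeg can infer the container format
    tmp = src.with_name(f'{src.stem}.tmp{src.suffix}')
    subprocess.run(build_remux_command(src, tmp, metadata), check=True, capture_output=True)
    os.replace(tmp, src)
```

Because the temp file is only swapped in after ffmpeg exits successfully, a failed remux leaves the original file untouched.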

---

## Phase 2: Update Instagram Modules (Caption Storage)

Currently the caption is extracted but then discarded. Store it in the `downloads.metadata` JSON field instead.
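
As a minimal sketch of the storage step, the helper below merges a caption into whatever JSON is already held in `downloads.metadata`. The function name `merge_caption` and the `caption` key are assumptions; the actual schema of that JSON is not described in this plan.

```python
import json
from typing import Optional


def merge_caption(existing_json: Optional[str], caption: str) -> str:
    """Return the metadata JSON with the caption added, preserving other keys."""
    data = json.loads(existing_json) if existing_json else {}
    if caption:
        data['caption'] = caption  # hypothetical key name
    # ensure_ascii=False keeps emoji and non-Latin captions readable in the DB
    return json.dumps(data, ensure_ascii=False)
```

The returned string would then be written back to the row by the module's existing database code.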

**Files:**
- `/opt/media-downloader/modules/imginn_module.py` - Extract caption in `_download_post()`
- `/opt/media-downloader/modules/fastdl_module.py` - Extract in download methods
- `/opt/media-downloader/modules/toolzu_module.py` - Extract caption if available

---

## Phase 3: Update Universal Video Downloader

**File:** `/opt/media-downloader/modules/universal_video_downloader.py`

**Note:** yt-dlp already embeds metadata via `--add-metadata` (line 1104). We need to:
1. Change the filename format
2. Add channel subfolder to output path

### Changes:

1. **Sanitize channel name** for folder:
```python
import re


def sanitize_channel_name(name: str) -> str:
    """Sanitize channel name for use as folder name."""
    if not name:
        return 'unknown'
    # Remove characters that are invalid on common filesystems
    sanitized = re.sub(r'[<>:"/\\|?*]', '', name)
    # Limit length, then trim dots/spaces so truncation cannot leave a trailing one
    sanitized = sanitized.strip('. ')[:50].strip('. ')
    return sanitized or 'unknown'
```

2. **Update output template** to include channel folder:
```python
from pathlib import Path

import yt_dlp

# Get channel name from video info first (url and output_dir come from the caller)
info = yt_dlp.YoutubeDL({'quiet': True}).extract_info(url, download=False)
channel = sanitize_channel_name(info.get('uploader') or info.get('channel'))

# Create channel subfolder
channel_dir = Path(output_dir) / channel
channel_dir.mkdir(parents=True, exist_ok=True)

# Entry for the yt-dlp options dict; %(upload_date)s expands to YYYYMMDD (no time component)
ydl_opts = {
    'outtmpl': f'{channel_dir}/%(uploader)s_%(upload_date)s_%(id)s.%(ext)s',
}
```

**No additional metadata writing needed** - yt-dlp already embeds title, artist, description, date.

---

## Phase 4: Update TikTok Module

**File:** `/opt/media-downloader/modules/tiktok_module.py`

Change filename from:
```python
filename = f"{date_str}_{clean_title}_{video_id}_{idx}.{ext}"
```

To:
```python
filename = f"{username}_{date_str}_{video_id}.{ext}"
```

**TikTok NEEDS metadata writing** - unlike yt-dlp platforms, TikTok downloads don't have embedded metadata.
Call `write_video_metadata()` after download with title, description, username.

---

## Phase 5: Create Migration Script

**New file:** `/opt/media-downloader/scripts/migrate_filenames.py`

### Functionality:
1. Query `file_inventory` for all files
2. Parse current filename to extract components
3. Look up metadata in DB (`downloads`, `video_downloads`)
4. Generate new standardized filename
5. **For videos (except TikTok)**: Create channel subfolder and move file
6. Rename file if needed
7. Update `file_inventory.filename` and `file_inventory.file_path`
8. Write metadata to file EXIF/ffmpeg (for TikTok/Instagram only)
9. Create backup list for rollback
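
The backup list in step 9 could be kept as a JSON-lines log of old and new paths, written before each move so an interrupted run can still be undone. The sketch below shows one way to do that; the function names and log format are assumptions, and reverting the `file_inventory` rows is left out.

```python
import json
import shutil
from pathlib import Path


def record_rename(log_path, old_path, new_path) -> None:
    """Append one old -> new mapping to the rollback log (one JSON object per line)."""
    with open(log_path, 'a', encoding='utf-8') as log:
        log.write(json.dumps({'old': str(old_path), 'new': str(new_path)}) + '\n')


def rollback(log_path) -> int:
    """Move files back to their original paths, newest entry first. Returns the count restored."""
    lines = Path(log_path).read_text(encoding='utf-8').splitlines()
    restored = 0
    for line in reversed(lines):
        entry = json.loads(line)
        old_path, new_path = Path(entry['old']), Path(entry['new'])
        # Skip entries whose move never happened or was already undone
        if new_path.exists() and not old_path.exists():
            old_path.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(new_path), str(old_path))
            restored += 1
    return restored
```

Logging before the move means the log may contain entries for moves that never completed; `rollback()` tolerates that by checking which path currently exists.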

### Video Migration (Channel Folders):
```python
import shutil
from pathlib import Path

# For YouTube, Dailymotion, Bilibili, Erome videos
if platform in ['youtube', 'dailymotion', 'bilibili', 'erome']:
    # Get channel from video_downloads table, falling back to embedded metadata
    channel = get_channel_from_db(video_id) or extract_from_embedded_metadata(file_path)
    channel_safe = sanitize_channel_name(channel)

    # New path: /opt/immich/md/youtube/channelname/file.mp4
    new_dir = Path(base_dir) / platform / channel_safe
    new_dir.mkdir(parents=True, exist_ok=True)

    new_path = new_dir / new_filename
    shutil.move(old_path, new_path)
```

### Missing date handling:
- Use file's `mtime` (modification time)
- Format as `YYYYMMDD_HHMMSS`
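
The fallback above amounts to a few lines of standard-library code. In this sketch the function name `timestamp_from_mtime` is hypothetical, and the timestamp is rendered in the machine's local time zone, which is an assumption since the plan does not say which zone filenames use.

```python
from datetime import datetime
from pathlib import Path


def timestamp_from_mtime(file_path) -> str:
    """Format the file's modification time as YYYYMMDD_HHMMSS (local time)."""
    mtime = Path(file_path).stat().st_mtime
    return datetime.fromtimestamp(mtime).strftime('%Y%m%d_%H%M%S')
```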

### Missing channel handling:
- Read from `video_downloads.uploader` in database
- Fall back to reading embedded metadata via ffprobe
- Last resort: use "unknown" folder
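
The three-step fallback can be sketched as below. It assumes the channel is stored in the `artist` tag that yt-dlp embeds, that ffprobe is invoked with JSON output, and that the database lookup has already been done by the caller; all function names here are hypothetical.

```python
import json
from typing import Optional


def ffprobe_command(file_path) -> list:
    """Argument list that makes ffprobe print container-level tags as JSON."""
    return ['ffprobe', '-v', 'quiet', '-print_format', 'json', '-show_format', str(file_path)]


def channel_from_ffprobe(ffprobe_json: str) -> Optional[str]:
    """Pull the artist tag out of ffprobe's JSON output (tag names vary in case)."""
    tags = json.loads(ffprobe_json).get('format', {}).get('tags', {})
    lowered = {key.lower(): value for key, value in tags.items()}
    return lowered.get('artist') or None


def resolve_channel(db_uploader: Optional[str], ffprobe_json: Optional[str]) -> str:
    """Database value first, then embedded metadata, then the 'unknown' folder."""
    if db_uploader:
        return db_uploader
    if ffprobe_json:
        return channel_from_ffprobe(ffprobe_json) or 'unknown'
    return 'unknown'
```

The result would still go through `sanitize_channel_name()` before being used as a folder name.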

---

## Phase 6: Update move_module.py

**File:** `/opt/media-downloader/modules/move_module.py`

After moving file, call metadata writer:
```python
if is_image:
    write_image_metadata(dest, {'title': caption, 'artist': source, ...})
elif is_video:
    write_video_metadata(dest, {...})
```

---

## Phase 7: Add Metadata Display to Lightbox ✅ COMPLETED

**Status:** Implemented on 2025-12-21

The EnhancedLightbox now displays embedded metadata from video files.

### What was implemented:
- **Backend**: `GET /api/media/embedded-metadata` endpoint using ffprobe/exiftool
- **Frontend**: Fetches metadata when Details panel is opened
- **Display**: Shows Title and Description from embedded file metadata

### Files modified:
- `/opt/media-downloader/web/backend/routers/media.py` - Added endpoint
- `/opt/media-downloader/web/frontend/src/components/EnhancedLightbox.tsx` - Added UI

---

## Implementation Order

1. ~~Phase 7: Add metadata display to lightbox~~ ✅ DONE
2. Phase 1: Create `metadata_utils.py` (foundation)
3. Phase 3: Update universal video downloader (filename + channel folders)
4. Phase 4: Update TikTok module (filename only, no channel folders)
5. Phase 2: Update Instagram modules (caption storage)
6. Phase 6: Update move_module.py
7. Phase 5: Create and run migration script (last - after all new code works)

---

## Files Summary

### New files:
- `/opt/media-downloader/modules/metadata_utils.py`
- `/opt/media-downloader/scripts/migrate_filenames.py`

### Modified files:
- `/opt/media-downloader/modules/universal_video_downloader.py`
- `/opt/media-downloader/modules/tiktok_module.py`
- `/opt/media-downloader/modules/imginn_module.py`
- `/opt/media-downloader/modules/fastdl_module.py`
- `/opt/media-downloader/modules/toolzu_module.py`
- `/opt/media-downloader/modules/move_module.py`
- `/opt/media-downloader/web/frontend/src/components/EnhancedLightbox.tsx`
- `/opt/media-downloader/web/backend/routers/media.py`

---

## Pages Using EnhancedLightbox (Automatic Benefits)

These pages use EnhancedLightbox and will automatically get embedded metadata display:
- VideoDownloader.tsx (history section)
- Downloads.tsx
- Media.tsx
- Review.tsx
- RecycleBin.tsx
- Discovery.tsx
- Notifications.tsx
- Dashboard.tsx

**No additional changes needed** - updating EnhancedLightbox updates all pages.

---

## Pages with Custom Video Modals (Need Separate Updates)

**1. DownloadQueue.tsx** (custom Video Player Modal):
- Currently shows: title, channel_name, upload_date from database
- For completed downloads: Add embedded metadata display (title, description)
- For queued items: No file exists yet, keep using DB fields

**2. CelebrityDiscovery.tsx** (inline video elements):
- Consider adding metadata info panel or tooltip
- Lower priority - mainly for browsing/discovery, not viewing downloads

---

## Version
This will be version **11.17.0** (minor release - new feature)