media-downloader/docs/PLAN-standardized-filenames.md
Todd 0d7b2b1aab Initial commit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 22:42:55 -04:00

Plan: Standardized Filename Format with EXIF Metadata

Overview

Standardize filenames across all download platforms to a consistent format while storing descriptive metadata (title, caption, description) in file EXIF/metadata rather than filenames.

Target Filename Format

{source}_{YYYYMMDD}_{HHMMSS}_{media_id}.{ext}
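
A minimal sketch of how this format could be produced. The function name and parameters follow the `generate_standardized_filename(source, date, media_id, ext)` signature listed in Phase 1; passing `date` as a `datetime` and lower-casing the extension are assumptions, not decisions recorded in this plan.

```python
from datetime import datetime


def generate_standardized_filename(source: str, date: datetime, media_id: str, ext: str) -> str:
    """Build {source}_{YYYYMMDD}_{HHMMSS}_{media_id}.{ext}."""
    stamp = date.strftime("%Y%m%d_%H%M%S")
    # Accept "jpg" or ".JPG" and normalise to a lower-case extension (assumption)
    clean_ext = ext.lstrip(".").lower()
    return f"{source}_{stamp}_{media_id}.{clean_ext}"


print(generate_standardized_filename(
    "evalongoria", datetime(2025, 10, 16, 12, 34, 56), "18529350958013602", "jpg"
))
# evalongoria_20251016_123456_18529350958013602.jpg
```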

Current vs Target by Platform

| Platform | Current Format | Status |
|---|---|---|
| Instagram | evalongoria_20251016_123456_18529350958013602.jpg | Already correct |
| Snapchat | evalongoria_20251113_140600_Xr8sJ936p31PrqwxCaDKQ.mp4 | Already correct |
| TikTok | 20251218_title here_7585297468103855391_0.mp4 | Needs change |
| YouTube | title [video_id].mp4 | Needs change |
| Dailymotion | title_video_id.mp4 | Needs change |
| Bilibili | title_video_id.mp4 | Needs change |
| Erome | title_video_id.mp4 | Needs change |

User Preferences (Confirmed)

  • Migration: Migrate existing files to new format
  • Video metadata: Use ffmpeg remux (fast, no re-encoding)
  • Missing date: Use existing filesystem timestamp
  • Channel folders: Organize video downloads by channel subfolder (except TikTok)

Target Directory Structure

Videos (except TikTok) will be organized by channel:

/opt/immich/md/youtube/{channel_name}/{filename}.mp4
/opt/immich/md/dailymotion/{channel_name}/{filename}.mp4
/opt/immich/md/bilibili/{channel_name}/{filename}.mp4
/opt/immich/md/erome/{channel_name}/{filename}.mp4

TikTok stays flat (no channel folders):

/opt/immich/md/tiktok/{filename}.mp4

Example:

  • Before: /opt/immich/md/youtube/20251112_Video Title_abc123.mp4
  • After: /opt/immich/md/youtube/snapthefamous/snapthefamous_20251112_abc123.mp4

Existing Metadata Status

yt-dlp already embeds: title, artist, date, comment (URL), description, synopsis

| Platform | Has Embedded Metadata? | Migration Action |
|---|---|---|
| YouTube | Yes (verified via ffprobe) | Rename only |
| Dailymotion | Yes (yt-dlp) | Rename only |
| Bilibili | Yes (verified via ffprobe) | Rename only |
| Erome | Yes (yt-dlp) | Rename only |
| TikTok | No | Rename + write metadata |
| Instagram | No | Rename + write metadata |
| Snapchat | No | Filename already OK, add metadata |

Key insight: Files downloaded through yt-dlp already carry embedded metadata, but the lightbox did not read it; it showed only database fields, not the metadata stored in the file. Phase 7 (completed) adds that display.


Phase 1: Create Shared Metadata Utilities

New file: /opt/media-downloader/modules/metadata_utils.py

Functions:

  • write_image_metadata(file_path, metadata) - Write to EXIF via exiftool
  • write_video_metadata(file_path, metadata) - Write via ffmpeg remux
  • read_file_metadata(file_path) - Read existing metadata
  • generate_standardized_filename(source, date, media_id, ext) - Generate standard filename

EXIF Fields for Images:

  • ImageDescription: title/caption
  • XPComment: full description
  • Artist: source/uploader
  • DateTimeOriginal: post date
  • UserComment: source URL
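
As an illustration of the field mapping above, here is a hedged sketch of how `write_image_metadata` might assemble an exiftool call. The metadata dictionary keys (`title`, `description`, `artist`, `date`, `url`) and the helper `build_exiftool_args` are assumptions for this example; only the EXIF tag names come from the list above. `DateTimeOriginal` expects a value such as `2025:10:16 12:34:56`.

```python
import subprocess
from typing import Dict, List

# Hypothetical metadata keys mapped to the EXIF tags listed in this plan
EXIF_TAGS = {
    "title": "ImageDescription",
    "description": "XPComment",
    "artist": "Artist",
    "date": "DateTimeOriginal",
    "url": "UserComment",
}


def build_exiftool_args(file_path: str, metadata: Dict[str, str]) -> List[str]:
    """Return the exiftool argument list; empty or missing values are skipped."""
    args = ["exiftool", "-overwrite_original"]
    for key, tag in EXIF_TAGS.items():
        value = metadata.get(key)
        if value:
            args.append(f"-{tag}={value}")
    args.append(file_path)
    return args


def write_image_metadata(file_path: str, metadata: Dict[str, str]) -> None:
    """Write the tags in place; raises CalledProcessError if exiftool fails."""
    subprocess.run(build_exiftool_args(file_path, metadata), check=True)
```

Passing the arguments as a list avoids shell quoting problems with captions that contain quotes or newlines.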

Video Metadata Fields:

  • title, artist, description, comment, date
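
The remux approach can be sketched as follows, assuming ffmpeg is on the PATH. ffmpeg cannot rewrite a file in place, so this sketch writes to a temporary sibling file with the same extension and then swaps it in. The helper `build_ffmpeg_remux_args` and the temporary naming scheme are assumptions for illustration.

```python
import os
import subprocess
from pathlib import Path
from typing import Dict, List

# Video metadata fields named in this plan
VIDEO_KEYS = ("title", "artist", "description", "comment", "date")


def build_ffmpeg_remux_args(src: str, dst: str, metadata: Dict[str, str]) -> List[str]:
    """Copy all streams without re-encoding and set container-level tags."""
    args = ["ffmpeg", "-y", "-i", src, "-map", "0", "-c", "copy"]
    for key in VIDEO_KEYS:
        value = metadata.get(key)
        if value:
            args += ["-metadata", f"{key}={value}"]
    args.append(dst)
    return args


def write_video_metadata(file_path: str, metadata: Dict[str, str]) -> None:
    """Remux to a temp file next to the original, then replace the original."""
    src = Path(file_path)
    # Keep the original suffix so ffmpeg picks the same container format
    tmp = src.with_name(f"{src.stem}.tmp{src.suffix}")
    subprocess.run(build_ffmpeg_remux_args(str(src), str(tmp), metadata), check=True)
    os.replace(tmp, src)
```

Because `check=True` raises before `os.replace` runs, a failed remux leaves the original file untouched.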

Phase 2: Update Instagram Modules (Caption Storage)

Currently the caption is extracted but then discarded. Store it in the downloads.metadata JSON field instead.

Files:

  • /opt/media-downloader/modules/imginn_module.py - Extract caption in _download_post()
  • /opt/media-downloader/modules/fastdl_module.py - Extract in download methods
  • /opt/media-downloader/modules/toolzu_module.py - Extract caption if available
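
A small sketch of the storage step, assuming `downloads.metadata` holds a JSON object serialized as text. The `caption` key and the helper name `merge_caption` are hypothetical; the actual schema is not described in this plan.

```python
import json
from typing import Optional


def merge_caption(existing_metadata: Optional[str], caption: Optional[str]) -> str:
    """Add the caption to the existing JSON metadata without dropping other keys."""
    data = json.loads(existing_metadata) if existing_metadata else {}
    if caption:
        data["caption"] = caption.strip()
    # ensure_ascii=False keeps emoji and non-Latin captions readable in the DB
    return json.dumps(data, ensure_ascii=False)
```

Each Instagram module would call this with whatever it already stores for the row, so existing keys survive the update.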

Phase 3: Update Universal Video Downloader

File: /opt/media-downloader/modules/universal_video_downloader.py

Note: yt-dlp already embeds metadata via --add-metadata (line 1104). We need to:

  1. Change the filename format
  2. Add channel subfolder to output path

Changes:

  1. Sanitize channel name for folder:
import re

def sanitize_channel_name(name: str) -> str:
    """Sanitize channel name for use as folder name."""
    if not name:
        return 'unknown'
    # Remove characters that are invalid in file or folder names
    sanitized = re.sub(r'[<>:"/\\|?*]', '', name)
    # Limit length first, then strip, so the result never ends in a dot or space
    sanitized = sanitized[:50].strip('. ')
    return sanitized or 'unknown'
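  2. Update output template to include channel folder:
from pathlib import Path
import yt_dlp

# Get channel name from video info first (metadata only, no download)
with yt_dlp.YoutubeDL({'quiet': True}) as ydl:
    info = ydl.extract_info(url, download=False)
channel = sanitize_channel_name(info.get('uploader') or info.get('channel'))

# Create channel subfolder
channel_dir = Path(output_dir) / channel
channel_dir.mkdir(parents=True, exist_ok=True)

# Use the sanitized channel in the filename so it matches the folder name.
# A literal '%' must be doubled inside a yt-dlp output template.
# Note: %(upload_date)s is YYYYMMDD only (no time component).
prefix = str(channel_dir / channel).replace('%', '%%')
ydl_opts = {'outtmpl': f'{prefix}_%(upload_date)s_%(id)s.%(ext)s'}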

No additional metadata writing needed - yt-dlp already embeds title, artist, description, date.


Phase 4: Update TikTok Module

File: /opt/media-downloader/modules/tiktok_module.py

Change filename from:

filename = f"{date_str}_{clean_title}_{video_id}_{idx}.{ext}"

To:

filename = f"{username}_{date_str}_{video_id}.{ext}"

TikTok NEEDS metadata writing - unlike yt-dlp platforms, TikTok downloads don't have embedded metadata. Call write_video_metadata() after download with title, description, username.


Phase 5: Create Migration Script

New file: /opt/media-downloader/scripts/migrate_filenames.py

Functionality:

  1. Query file_inventory for all files
  2. Parse current filename to extract components
  3. Look up metadata in DB (downloads, video_downloads)
  4. Generate new standardized filename
  5. For videos (except TikTok): Create channel subfolder and move file
  6. Rename file if needed
  7. Update file_inventory.filename and file_inventory.file_path
  8. Write metadata to the file via exiftool/ffmpeg (TikTok, Instagram, and Snapchat only; yt-dlp platforms already have it)
  9. Create backup list for rollback
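
Step 2 of the list above could look roughly like this for the two legacy formats that are unambiguous in the platform table (TikTok and YouTube). The regular expressions and the function name `parse_legacy_filename` are assumptions based only on the example filenames in this plan; the `title_video_id` formats are not parsed here because the title itself may contain underscores, so those are better resolved through the database lookup in step 3.

```python
import re
from typing import Dict, Optional

# 20251218_title here_7585297468103855391_0.mp4
TIKTOK_RE = re.compile(
    r"^(?P<date>\d{8})_(?P<title>.*)_(?P<media_id>\d+)_(?P<idx>\d+)\.(?P<ext>\w+)$"
)
# title [video_id].mp4
YOUTUBE_RE = re.compile(r"^(?P<title>.*) \[(?P<media_id>[^\]]+)\]\.(?P<ext>\w+)$")

PATTERNS = {"tiktok": TIKTOK_RE, "youtube": YOUTUBE_RE}


def parse_legacy_filename(platform: str, filename: str) -> Optional[Dict[str, str]]:
    """Return the named components, or None if the name does not match."""
    pattern = PATTERNS.get(platform)
    match = pattern.match(filename) if pattern else None
    return match.groupdict() if match else None
```

Returning `None` for unknown shapes lets the migration script skip and log a file rather than guess at its components.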

Video Migration (Channel Folders):

import shutil
from pathlib import Path

# For YouTube, Dailymotion, Bilibili, Erome videos
if platform in ('youtube', 'dailymotion', 'bilibili', 'erome'):
    # Get channel from video_downloads table, falling back to embedded metadata
    channel = get_channel_from_db(video_id) or extract_from_embedded_metadata(old_path)
    channel_safe = sanitize_channel_name(channel)

    # New path: /opt/immich/md/youtube/channelname/file.mp4
    new_dir = Path(base_dir) / platform / channel_safe
    new_dir.mkdir(parents=True, exist_ok=True)

    new_path = new_dir / new_filename
    # Never overwrite a file that already exists at the destination
    if new_path.exists():
        raise FileExistsError(new_path)
    shutil.move(str(old_path), str(new_path))
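
The backup list for rollback (step 9) could be kept as a simple append-only journal written before each move. This is a sketch under assumptions: the JSON Lines format and the names `move_with_journal` and `rollback` are hypothetical, and the `file_inventory` update would need its own undo step, which is not shown.

```python
import json
import shutil
from pathlib import Path


def move_with_journal(old_path, new_path, journal_path) -> None:
    """Record old -> new in the journal, then move the file."""
    old_path, new_path = Path(old_path), Path(new_path)
    new_path.parent.mkdir(parents=True, exist_ok=True)
    # Write the entry first so an interrupted run can still be undone
    with open(journal_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"old": str(old_path), "new": str(new_path)}) + "\n")
    shutil.move(str(old_path), str(new_path))


def rollback(journal_path) -> int:
    """Move files back in reverse order; returns how many were restored."""
    lines = Path(journal_path).read_text(encoding="utf-8").splitlines()
    restored = 0
    for line in reversed(lines):
        entry = json.loads(line)
        new, old = Path(entry["new"]), Path(entry["old"])
        # Skip entries whose move never happened or was already undone
        if new.exists() and not old.exists():
            old.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(new), str(old))
            restored += 1
    return restored
```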

Missing date handling:

  • Use file's mtime (modification time)
  • Format as YYYYMMDD_HHMMSS
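
A minimal sketch of the mtime fallback. It formats the timestamp in the server's local time zone, which is an assumption; the plan does not say whether local time or UTC is intended. The function name `date_from_mtime` is hypothetical.

```python
import os
from datetime import datetime


def date_from_mtime(file_path: str) -> str:
    """Return the file's modification time as YYYYMMDD_HHMMSS (local time)."""
    mtime = os.path.getmtime(file_path)
    return datetime.fromtimestamp(mtime).strftime("%Y%m%d_%H%M%S")
```

Since renaming or moving a file within the same filesystem does not change its mtime, the value stays stable if the migration is re-run.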

Missing channel handling:

  • Read from video_downloads.uploader in database
  • Fall back to reading embedded metadata via ffprobe
  • Last resort: use "unknown" folder
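
The fallback chain above might be expressed like this. Reading the `artist` tag is an assumption based on the note that yt-dlp embeds the uploader as `artist`; the function names `read_embedded_tags` and `resolve_channel` are hypothetical, and ffprobe is assumed to be on the PATH.

```python
import json
import subprocess
from typing import Dict, Optional


def read_embedded_tags(file_path: str) -> Dict[str, str]:
    """Return container-level tags from ffprobe, with lower-cased keys."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", file_path],
        capture_output=True, text=True, check=True,
    )
    tags = json.loads(result.stdout).get("format", {}).get("tags", {})
    return {key.lower(): value for key, value in tags.items()}


def resolve_channel(db_uploader: Optional[str], tags: Dict[str, str]) -> str:
    """Prefer the database value, then the embedded artist tag, then 'unknown'."""
    return (db_uploader or "").strip() or (tags.get("artist") or "").strip() or "unknown"
```

A caller would only run `read_embedded_tags` when the database value is empty, which avoids spawning ffprobe for every file.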

Phase 6: Update move_module.py

File: /opt/media-downloader/modules/move_module.py

After moving a file, call the metadata writer:

if is_image:
    write_image_metadata(dest, {'title': caption, 'artist': source, ...})
elif is_video:
    write_video_metadata(dest, {...})

Phase 7: Add Metadata Display to Lightbox (COMPLETED)

Status: Implemented on 2025-12-21

The EnhancedLightbox now displays embedded metadata from video files.

What was implemented:

  • Backend: GET /api/media/embedded-metadata endpoint using ffprobe/exiftool
  • Frontend: Fetches metadata when Details panel is opened
  • Display: Shows Title and Description from embedded file metadata

Files modified:

  • /opt/media-downloader/web/backend/routers/media.py - Added endpoint
  • /opt/media-downloader/web/frontend/src/components/EnhancedLightbox.tsx - Added UI

Implementation Order

  1. Phase 7: Add metadata display to lightbox (DONE)
  2. Phase 1: Create metadata_utils.py (foundation)
  3. Phase 3: Update universal video downloader (filename + channel folders)
  4. Phase 4: Update TikTok module (filename only, no channel folders)
  5. Phase 2: Update Instagram modules (caption storage)
  6. Phase 6: Update move_module.py
  7. Phase 5: Create and run migration script (last - after all new code works)

Files Summary

New files:

  • /opt/media-downloader/modules/metadata_utils.py
  • /opt/media-downloader/scripts/migrate_filenames.py

Modified files:

  • /opt/media-downloader/modules/universal_video_downloader.py
  • /opt/media-downloader/modules/tiktok_module.py
  • /opt/media-downloader/modules/imginn_module.py
  • /opt/media-downloader/modules/fastdl_module.py
  • /opt/media-downloader/modules/toolzu_module.py
  • /opt/media-downloader/modules/move_module.py
  • /opt/media-downloader/web/frontend/src/components/EnhancedLightbox.tsx
  • /opt/media-downloader/web/backend/routers/media.py

Pages Using EnhancedLightbox (Automatic Benefits)

These pages use EnhancedLightbox and will automatically get embedded metadata display:

  • VideoDownloader.tsx (history section)
  • Downloads.tsx
  • Media.tsx
  • Review.tsx
  • RecycleBin.tsx
  • Discovery.tsx
  • Notifications.tsx
  • Dashboard.tsx

No additional changes needed - updating EnhancedLightbox updates all pages.


Pages with Custom Video Modals (Need Separate Updates)

1. DownloadQueue.tsx (custom Video Player Modal):

  • Currently shows: title, channel_name, upload_date from database
  • For completed downloads: Add embedded metadata display (title, description)
  • For queued items: No file exists yet, keep using DB fields

2. CelebrityDiscovery.tsx (inline video elements):

  • Consider adding metadata info panel or tooltip
  • Lower priority - mainly for browsing/discovery, not viewing downloads

Version

This will be version 11.17.0 (minor release - new feature)