media-downloader/docs/CACHE_BUILDER.md
Todd 0d7b2b1aab Initial commit
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 22:42:55 -04:00

Media Cache Builder

Overview

The Media Cache Builder is a background service that pre-generates thumbnails and caches metadata for all media files in the system. This significantly improves performance by:

  • Pre-generating thumbnails: Thumbnails are created in advance rather than on-demand when viewing media
  • Caching metadata: Resolution, file size, duration, and format information is extracted and cached
  • Reducing API latency: The media gallery and downloads pages load faster because they read cached thumbnails and metadata instead of generating them per request

Components

1. Background Worker Script

Location: /opt/media-downloader/modules/thumbnail_cache_builder.py

This Python script scans all media files in /opt/immich/md and:

  • Generates 300x300 pixel thumbnails for images and videos
  • Extracts metadata (width, height, duration, format)
  • Stores thumbnails in /opt/media-downloader/database/thumbnails.db
  • Stores metadata in /opt/media-downloader/database/media_metadata.db
  • Skips files that are already cached and haven't been modified
  • Runs with low priority (Nice=19, IOSchedulingClass=idle) to avoid impacting system performance
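
The skip rule above can be sketched as a lookup against the thumbnails table. This is a minimal illustration, not the script's actual code: the table definition is copied from the Database Schema section, while `path_hash`, `needs_processing`, and `record_cached` are hypothetical helpers, and hashing the file path with SHA-256 is an assumption about how `file_hash` is derived.

```python
import hashlib
import os
import sqlite3
import tempfile

# Table definition copied from the Database Schema section of this document.
SCHEMA = """
CREATE TABLE IF NOT EXISTS thumbnails (
    file_hash TEXT PRIMARY KEY,
    file_path TEXT NOT NULL,
    thumbnail_data BLOB NOT NULL,
    created_at TEXT,
    file_mtime REAL
);
"""


def path_hash(file_path):
    # Assumption: the cache key is a SHA-256 of the path; the real script may differ.
    return hashlib.sha256(file_path.encode("utf-8")).hexdigest()


def needs_processing(conn, file_path):
    """Return True when the file has no cache row or its mtime has changed."""
    row = conn.execute(
        "SELECT file_mtime FROM thumbnails WHERE file_hash = ?",
        (path_hash(file_path),),
    ).fetchone()
    if row is None:
        return True
    return row[0] != os.path.getmtime(file_path)


def record_cached(conn, file_path, thumbnail_data):
    """Store a thumbnail together with the file's current mtime."""
    conn.execute(
        "INSERT OR REPLACE INTO thumbnails "
        "(file_hash, file_path, thumbnail_data, created_at, file_mtime) "
        "VALUES (?, ?, ?, datetime('now'), ?)",
        (path_hash(file_path), file_path, thumbnail_data, os.path.getmtime(file_path)),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    fd, tmp_path = tempfile.mkstemp(suffix=".jpg")
    os.close(fd)
    print(needs_processing(conn, tmp_path))  # no cache row yet
    record_cached(conn, tmp_path, b"placeholder thumbnail bytes")
    print(needs_processing(conn, tmp_path))  # cached and unmodified
    os.utime(tmp_path, (1, 1))               # simulate a later modification
    print(needs_processing(conn, tmp_path))  # mtime no longer matches
    os.unlink(tmp_path)
```

Comparing the stored `file_mtime` with the file's current modification time avoids re-reading file contents on every run, at the cost of missing edits that preserve the mtime.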

2. Systemd Service

Location: /etc/systemd/system/media-cache-builder.service

A oneshot systemd service that runs the cache builder script.

Resource Limits:

  • CPU quota: 50% of one core
  • I/O scheduling: idle priority
  • Nice level: 19 (lowest CPU priority)

3. Systemd Timer

Location: /etc/systemd/system/media-cache-builder.timer

Automatically runs the cache builder daily at 3:00 AM with a randomized delay of up to 30 minutes.

Schedule:

  • Daily at 3:00 AM
  • Persistent (a run missed while the system was off starts at the next boot)
  • Random delay: 0-30 minutes
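
To make the schedule concrete, the sketch below computes when the next run would fall for a given moment. It only illustrates the "daily at 3:00 AM plus up to 30 minutes" rule described above; systemd performs this calculation itself, and `next_run` is a hypothetical helper, not part of the project.

```python
import random
from datetime import datetime, timedelta


def next_run(now, max_delay_minutes=30, rng=random):
    """Return the next 03:00 after `now`, plus a random delay of 0 to max_delay_minutes."""
    base = now.replace(hour=3, minute=0, second=0, microsecond=0)
    if base <= now:
        # Today's 03:00 slot has already passed, so use tomorrow's.
        base += timedelta(days=1)
    delay = timedelta(seconds=rng.uniform(0, max_delay_minutes * 60))
    return base + delay


if __name__ == "__main__":
    # Falls between 03:00 and 03:30 on the following day.
    print(next_run(datetime(2026, 3, 29, 22, 42, 55)))
```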

API Endpoints

Get Cached Metadata

GET /api/media/metadata?file_path=/path/to/file

Returns cached metadata for a media file:

{
  "file_path": "/opt/immich/md/instagram/user/image.jpg",
  "width": 1920,
  "height": 1080,
  "file_size": 245678,
  "duration": null,
  "format": "JPEG",
  "cached": true,
  "cached_at": "2025-10-30T22:36:45.123"
}
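
A client needs to URL-encode the `file_path` query parameter, since media paths contain slashes and may contain spaces. The following sketch uses only the standard library. It assumes the API is reachable at `http://localhost:8000` (the address used in the curl examples under Troubleshooting) and needs no authentication; `metadata_url` and `fetch_metadata` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same host and port as the curl examples in the Troubleshooting section.
BASE_URL = "http://localhost:8000"


def metadata_url(file_path, base_url=BASE_URL):
    """Build the metadata endpoint URL with a safely encoded file_path."""
    query = urllib.parse.urlencode({"file_path": file_path})
    return f"{base_url}/api/media/metadata?{query}"


def fetch_metadata(file_path, base_url=BASE_URL, timeout=10):
    """Fetch and decode the cached metadata JSON for one file."""
    with urllib.request.urlopen(metadata_url(file_path, base_url), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(metadata_url("/opt/immich/md/instagram/user/image.jpg"))
```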

Trigger Cache Rebuild

POST /api/media/cache/rebuild

Manually triggers a cache rebuild in the background:

{
  "success": true,
  "message": "Cache rebuild started in background"
}

Get Cache Statistics

GET /api/media/cache/stats

Returns statistics about the cache:

{
  "thumbnails": {
    "exists": true,
    "count": 2126,
    "size_bytes": 52428800
  },
  "metadata": {
    "exists": true,
    "count": 2126,
    "size_bytes": 204800
  }
}
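
The byte counts in this response are easier to read once converted to binary units. The sketch below summarizes a stats payload shaped like the example above. `human_size` and `summarize` are hypothetical helpers, and treating a count mismatch as something worth flagging is an assumption, not documented API behavior.

```python
def human_size(num_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def summarize(stats):
    """Condense the /api/media/cache/stats payload into a few readable fields."""
    thumbs, meta = stats["thumbnails"], stats["metadata"]
    return {
        "thumbnail_db": human_size(thumbs["size_bytes"]),
        "metadata_db": human_size(meta["size_bytes"]),
        # Assumption: both caches normally cover the same set of files.
        "counts_match": thumbs["count"] == meta["count"],
    }


if __name__ == "__main__":
    example = {
        "thumbnails": {"exists": True, "count": 2126, "size_bytes": 52428800},
        "metadata": {"exists": True, "count": 2126, "size_bytes": 204800},
    }
    print(summarize(example))
```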

Manual Usage

Run Cache Builder Manually

# Run directly
sudo /usr/bin/python3 /opt/media-downloader/modules/thumbnail_cache_builder.py

# Or via systemd
sudo systemctl start media-cache-builder.service

Check Service Status

# Check if timer is active
sudo systemctl status media-cache-builder.timer

# View logs
sudo journalctl -u media-cache-builder.service -f

# Check when next run is scheduled
systemctl list-timers media-cache-builder.timer

Enable/Disable Automatic Runs

# Disable daily automatic runs
sudo systemctl stop media-cache-builder.timer
sudo systemctl disable media-cache-builder.timer

# Re-enable daily automatic runs
sudo systemctl enable media-cache-builder.timer
sudo systemctl start media-cache-builder.timer

Database Schema

Thumbnails Database

Location: /opt/media-downloader/database/thumbnails.db

CREATE TABLE thumbnails (
    file_hash TEXT PRIMARY KEY,
    file_path TEXT NOT NULL,
    thumbnail_data BLOB NOT NULL,
    created_at TEXT,
    file_mtime REAL
);
CREATE INDEX idx_file_path ON thumbnails(file_path);

Metadata Database

Location: /opt/media-downloader/database/media_metadata.db

CREATE TABLE media_metadata (
    file_hash TEXT PRIMARY KEY,
    file_path TEXT NOT NULL,
    width INTEGER,
    height INTEGER,
    file_size INTEGER,
    duration REAL,
    format TEXT,
    created_at TEXT,
    file_mtime REAL
);
CREATE INDEX idx_meta_file_path ON media_metadata(file_path);
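
Because `file_path` is indexed, a row can be looked up by path without knowing how `file_hash` was computed. The sketch below runs against an in-memory database built from the schema above. `lookup_metadata` is a hypothetical helper and the inserted row is made-up sample data.

```python
import sqlite3

# Schema copied from this section.
SCHEMA = """
CREATE TABLE media_metadata (
    file_hash TEXT PRIMARY KEY,
    file_path TEXT NOT NULL,
    width INTEGER,
    height INTEGER,
    file_size INTEGER,
    duration REAL,
    format TEXT,
    created_at TEXT,
    file_mtime REAL
);
CREATE INDEX idx_meta_file_path ON media_metadata(file_path);
"""


def lookup_metadata(conn, file_path):
    """Return the cached metadata for a path as a dict, or None if not cached."""
    conn.row_factory = sqlite3.Row
    row = conn.execute(
        "SELECT width, height, file_size, duration, format, created_at "
        "FROM media_metadata WHERE file_path = ?",
        (file_path,),
    ).fetchone()
    return dict(row) if row is not None else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Sample row for illustration only; the hash value is a placeholder.
    conn.execute(
        "INSERT INTO media_metadata VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        ("example-hash", "/opt/immich/md/a.jpg", 1920, 1080, 245678, None,
         "JPEG", "2025-10-30T22:36:45.123", 0.0),
    )
    print(lookup_metadata(conn, "/opt/immich/md/a.jpg"))
    print(lookup_metadata(conn, "/opt/immich/md/missing.jpg"))
```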

Performance

Typical Performance

  • Processing rate: 15-25 files/second (varies by file size and type)
  • Memory usage: ~900MB - 1GB during operation
  • CPU usage: Limited to 50% of one core
  • I/O priority: Idle (won't interfere with normal operations)

For 2,000 files:

  • Time: ~2-3 minutes
  • Thumbnail cache size: ~50-100MB
  • Metadata cache size: ~200-500KB
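
As a rough cross-check of these figures, dividing the file count by the stated processing rate gives the pure processing time. The sketch assumes a constant rate and ignores startup, directory scanning, and database writes, which may be why the quoted 2-3 minutes sits at or above the upper end of the result. `estimate_seconds` is a hypothetical helper.

```python
def estimate_seconds(file_count, files_per_second):
    """Estimate processing time assuming a constant throughput."""
    return file_count / files_per_second


if __name__ == "__main__":
    fastest = estimate_seconds(2000, 25)  # 80.0 seconds at 25 files/second
    slowest = estimate_seconds(2000, 15)  # about 133 seconds at 15 files/second
    print(f"{fastest:.0f}-{slowest:.0f} seconds")
```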

Logs

Location: /opt/media-downloader/logs/thumbnail_cache_builder.log

The cache builder logs detailed progress information:

  • Total files processed
  • Thumbnails created
  • Metadata cached
  • Files skipped (already cached)
  • Errors encountered
  • Processing rate and ETA

View logs:

# Live tail
tail -f /opt/media-downloader/logs/thumbnail_cache_builder.log

# Via systemd journal
sudo journalctl -u media-cache-builder.service -f

Troubleshooting

Service Fails to Start

Check logs:

sudo journalctl -xeu media-cache-builder.service

Common issues:

  • Missing dependencies (PIL/Pillow, ffmpeg)
  • Permission issues accessing media directory
  • Database corruption

Thumbnails Not Appearing

  1. Check if cache builder has run:

    sudo systemctl status media-cache-builder.service
    
  2. Manually trigger rebuild:

    curl -X POST http://localhost:8000/api/media/cache/rebuild
    
  3. Check cache stats:

    curl http://localhost:8000/api/media/cache/stats
    

High Memory Usage

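The cache builder can use 900 MB to 1 GB of RAM during operation, which is normal for image processing. The service's low CPU and I/O priority keeps it from competing with other services for those resources, but the resource limits listed above do not cap memory.

To limit the impact of memory usage, you can:

  • Reduce the batch size (requires editing the script)
  • Run manually during off-peak hours instead of using the timer (this shifts the memory use to a quieter period rather than reducing it)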

Corrupted or Invalid Images

Some files may fail to process (shown in error logs). This is normal for:

  • Corrupted downloads
  • Unsupported formats
  • Incomplete files

These errors don't stop the cache builder from processing other files.
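
One common way to get this behavior is to catch exceptions per file, log them, and continue. The sketch below illustrates that pattern under stated assumptions. It is not taken from the script: `process_all` and the `process_one` callback are hypothetical, and the logger name is borrowed from the log file name.

```python
import logging

logger = logging.getLogger("thumbnail_cache_builder")


def process_all(paths, process_one):
    """Run process_one on every path, counting failures instead of aborting."""
    stats = {"processed": 0, "errors": 0}
    for path in paths:
        try:
            process_one(path)
            stats["processed"] += 1
        except Exception as exc:
            # A corrupted or unsupported file is logged and skipped.
            stats["errors"] += 1
            logger.warning("Failed to process %s: %s", path, exc)
    return stats


if __name__ == "__main__":
    def fake_process(path):
        # Stand-in for real thumbnail generation; fails on one sample name.
        if path.endswith("broken.jpg"):
            raise ValueError("cannot identify image file")

    print(process_all(["a.jpg", "broken.jpg", "b.mp4"], fake_process))
```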

Integration with Frontend

The frontend automatically:

  • Uses cached thumbnails when available
  • Falls back to on-demand generation on a cache miss
  • Shows resolution from cache in lightbox (no need to load image first)

No frontend changes are required; caching is transparent to users.
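
The cache-then-fallback behavior can be sketched as a small lookup function. This is an illustration of the pattern only: `get_thumbnail` is hypothetical, the cache is modeled as a plain dict, and whether the real fallback path writes its result back to the cache is not stated here, so the sketch leaves the cache untouched.

```python
def get_thumbnail(file_path, cache, generate):
    """Return (thumbnail_bytes, from_cache), generating on demand on a miss."""
    data = cache.get(file_path)
    if data is not None:
        return data, True
    # Cache miss: fall back to on-demand generation.
    return generate(file_path), False


if __name__ == "__main__":
    cache = {"/opt/immich/md/a.jpg": b"cached-bytes"}

    def generate(path):
        # Stand-in for real thumbnail generation.
        return b"generated-bytes"

    print(get_thumbnail("/opt/immich/md/a.jpg", cache, generate))
    print(get_thumbnail("/opt/immich/md/new.jpg", cache, generate))
```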

Future Enhancements

Potential improvements:

  • Progressive thumbnail generation (prioritize recently viewed files)
  • Cleanup of thumbnails for deleted files
  • Configurable thumbnail sizes
  • Batch processing with configurable batch sizes
  • Real-time generation triggered by downloads
  • Cache warming based on user access patterns