takeone-youtube-clone/app/Services/LlmLyricsService.php
ghassan f98e5415a3 Add lyrics pipeline, playlist views, admin toggles, and player polish
Lyrics pipeline (Whisper + Demucs + description alignment):
- New GenerateLyricsJob runs WhisperX with VAD filtering and forced word
  alignment, writes per-track JSON to NAS.
- New DecorateLyricsJob calls the active LLM provider to bake one to
  several emojis into each line (heavy decoration prompt).
- LyricsDescriptionParser strips heading content, section markers, and
  emoji-decoration from a song's description while preserving every
  language verbatim.
- correct_whisper_with_description aligner: strong-match anchors only,
  vocal-region-aware gap-fill so missing verses land on actual singing.
- Owner UI for generate/regenerate/edit/delete in the player gear.

Admin pages:
- /admin/lyrics    toggles for VAD, vocal gap-fill, Demucs, master
- /admin/gpu       extracted GPU section, encoder picker, FFmpeg path
- /admin/backup    extracted users-and-settings export/import
- /admin/settings  now AI/LLM only with provider list and Test button
- /admin/nas-storage hosts NAS settings, repair, disable flow, browser
- Shared partials/settings-styles for a uniform look across pages.

Playlist view tracking:
- Migration adds playlists.view_count and playlist_views dedup table.
- Playlist::bumpViewIfNew increments per device with a one-hour window.
- Tracked from /playlists/{id}, /playlists/share/{token}, /ps/{token},
  and /videos/{id}?playlist={token}.  Dispatched after-response so it
  never blocks the page render.
- Loading a playlist on the video page now runs one query instead of
  the four the old getNextVideo/getPreviousVideo path triggered.
- View counts shown on every playlist card and the playlist hero.
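
The per-device, one-hour dedup described above can be sketched roughly as follows. This is a minimal in-memory illustration, not the code behind Playlist::bumpViewIfNew: the class name ViewDeduper, the string device id, and the choice to measure the window from the last counted view are all assumptions, and the real version presumably reads and writes the playlist_views table instead of an array.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the playlist_views dedup table.
final class ViewDeduper
{
    /** @var array<string,int> Unix time of the last counted view per "playlist:device" key */
    private array $lastCounted = [];

    /** @var array<int,int> view count per playlist id */
    private array $counts = [];

    public function __construct(private int $windowSeconds = 3600) {}

    /** Count the view unless this device was already counted inside the window. */
    public function bumpIfNew(int $playlistId, string $deviceId, int $now): bool
    {
        $key  = $playlistId . ':' . $deviceId;
        $last = $this->lastCounted[$key] ?? null;

        if ($last !== null && $now - $last < $this->windowSeconds) {
            return false; // same device, still inside the window: ignore
        }

        $this->lastCounted[$key]   = $now;
        $this->counts[$playlistId] = ($this->counts[$playlistId] ?? 0) + 1;
        return true;
    }

    public function viewCount(int $playlistId): int
    {
        return $this->counts[$playlistId] ?? 0;
    }
}

// Usage: two hits from one device within an hour count once.
$dedupe = new ViewDeduper();
$first  = $dedupe->bumpIfNew(7, 'device-a', 1000);
$repeat = $dedupe->bumpIfNew(7, 'device-a', 1000 + 1800);
$other  = $dedupe->bumpIfNew(7, 'device-b', 1000 + 1800);
$later  = $dedupe->bumpIfNew(7, 'device-a', 1000 + 3600);
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real implementation would more likely take the current time from the framework.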

Player polish:
- Floating mini-player is draggable, persists its position in
  localStorage, clamps to viewport on resize.
- Mini-player disabled entirely on mobile (viewports narrower than 768px).
- New gear-menu Mini Player toggle (persists in localStorage) lets the
  user disable both scroll-activation and SPA-nav-activation.
- Close button keeps media playing when used on the player's own page.
- SPA navigator now swaps a #page-scripts container so per-page JS
  (channel tabs, etc.) gets re-executed after content swaps.

Storage layout:
- Runtime data moved from /storage/* to /data/* and gitignored.
- /ml/venv, /ml/cache, /ml/__pycache__ excluded.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 22:01:47 +03:00

267 lines
13 KiB
PHP

<?php
namespace App\Services;
use App\Models\Setting;
use Illuminate\Support\Facades\Cache;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;
/**
* Optional LLM helper for the lyrics pipeline.
*
* Supports multiple providers configured in Admin → Settings → AI / LLM
* (local Ollama, hosted Anthropic Claude, or any OpenAI-compatible endpoint).
* Picks the provider flagged "Active" and dispatches the request through the
* matching adapter. Results are cached for 30 days, so a regenerate doesn't
* re-bill a hosted API or re-hit the local model.
*/
class LlmLyricsService
{
public function isEnabled(): bool
{
return Setting::get('llm_enabled', 'false') === 'true'
&& $this->activeProvider() !== null;
}
public function cleanLyricsEnabled(): bool
{
return $this->isEnabled() && Setting::get('llm_clean_lyrics', 'true') === 'true';
}
public function decorateEnabled(): bool
{
return $this->isEnabled() && Setting::get('llm_decorate_lyrics', 'false') === 'true';
}
/** The currently selected provider config, or null if none. */
public function activeProvider(): ?array
{
$providers = json_decode((string) Setting::get('llm_providers', '[]'), true) ?: [];
if (! $providers) return null;
$activeId = (string) Setting::get('llm_active_id', '');
foreach ($providers as $p) {
if (($p['id'] ?? null) === $activeId) {
$kind = $p['kind'] ?? 'ollama';
// An Ollama provider doesn't need a key; the others do.
if ($kind !== 'ollama' && trim((string) ($p['api_key'] ?? '')) === '') return null;
if (trim((string) ($p['model'] ?? '')) === '') return null;
return $p;
}
}
return null;
}
/** Returns clean lyric lines extracted by the LLM, or [] on any failure. */
public function cleanDescription(?string $description): array
{
if (! $description || ! $this->cleanLyricsEnabled()) return [];
$provider = $this->activeProvider();
// v2 cache key: the prompt was rewritten to stop dropping English/Thai
// lyric lines that happened to carry leading/trailing emoji decoration.
$cacheKey = 'llm_lyrics_clean_v2:' . ($provider['id'] ?? '') . ':' . sha1($description);
return Cache::remember($cacheKey, now()->addDays(30), function () use ($description) {
$prompt = "Extract the SUNG lyric lines from this song description, preserving every\n"
. "language exactly as written. Songs are often MULTILINGUAL (e.g. mixed English\n"
. "and Thai, English and Italian, English and Arabic) — KEEP EVERY LANGUAGE.\n\n"
. "KEEP a line when it contains real lyric words, even if it's wrapped in or\n"
. " punctuated by emojis. Example: '🛡️💻 Met behind the firewalls 🌌' → KEEP.\n"
. " Strip ONLY the emojis themselves; the lyric words stay untouched.\n"
. "DROP a line ONLY when it is one of:\n"
. " • the song title or artist credit\n"
. " • a pure section header (Verse / Chorus / Bridge / Verso / Ritornello /\n"
. " Pre-Chorus / Outro / Intro / 副歌 / 후렴 / كورس / ท่อน / etc.) — typically\n"
. " one or two words, possibly numbered\n"
. " • an instrument or production note inside 【…】 or 〔…〕 brackets\n"
. " • a row that is ONLY emojis / separators / decorative symbols with no words\n"
. " • commentary or social-media call-to-action (subscribe, follow, link in bio)\n\n"
. "Hard rules:\n"
. " - DO NOT translate. DO NOT re-script (no romanising Thai/Arabic, no converting\n"
. " English to Thai). The output of each kept line must be in the SAME language\n"
. " and script as the original line.\n"
. " - DO NOT merge or split lines. One source lyric line → one output entry.\n"
. " - Preserve original punctuation (drop only the emojis).\n"
. " - Maintain the original order.\n\n"
. "Respond with ONLY a JSON array of strings. No prose, no markdown, no code fence.\n\n"
. "DESCRIPTION:\n" . $description;
$raw = $this->call($prompt, 8192);
if ($raw === '') return [];
$raw = trim(preg_replace('/^```(?:json)?\s*|\s*```$/m', '', $raw));
$arr = json_decode($raw, true);
if (! is_array($arr)) return [];
$out = [];
foreach ($arr as $line) {
$line = trim((string) $line);
if ($line === '') continue;
$out[] = $line;
}
return $out;
});
}
/**
* Rewrite each lyric line with heavy, expressive emoji styling. Emojis go
* inside the line AND at the end; multiple per line where it fits. The
* original words are NEVER changed — emojis are layered on top.
*
* Returns [zeroBasedIndex => decoratedLineText]; lines the LLM skipped or
* reworded are omitted. The caller swaps line.text and re-distributes the
* word timings across the new tokens.
*/
public function decorateLines(array $lines): array
{
if (! $lines || ! $this->decorateEnabled()) return [];
$provider = $this->activeProvider();
$cacheKey = 'llm_lyrics_deco_v3:' . ($provider['id'] ?? '') . ':' . sha1(json_encode($lines));
return Cache::remember($cacheKey, now()->addDays(30), function () use ($lines) {
$numbered = [];
foreach ($lines as $i => $l) $numbered[] = ($i + 1) . '. ' . $l;
$prompt = "Decorate the following song lyrics with heavy, expressive emoji styling.\n\n"
. "Strict instructions:\n"
. "- Add emojis to almost every line (rich and visually striking, not minimal).\n"
. "- Place emojis both WITHIN lines and AT THE END where they enhance meaning.\n"
. "- Use 2-4 emojis per line on average, more on emotional peaks.\n"
. "- Match emojis to the line's specific emotion, action, image, or vibe.\n"
. "- VARIETY IS CRITICAL. Across the WHOLE song you must use a wide palette:\n"
. " • Aim for 30+ distinct emojis across the song.\n"
. " • Never reuse the same emoji on two adjacent lines.\n"
. " • Do NOT lean on the same 5-6 staples (🔥💔✨🎵❤️). Reach for less obvious\n"
. " ones that fit: ⚡🌊🌙🕯️🪞🥀🦋🌪️🗡️👁️🩸🦅🌀💎🪽🌑🪐🩹🌹🫧🌧️🔮🧨🪞🛡️\n"
. " ⚔️🏹🪄💫🥷🧿🪙🥀🎭🩰🪦⛓️🌌🚪🧊🌠💢🪶🩷🫀🪐🕊️ and many others.\n"
. "- Keep the original lyrics 100% UNCHANGED — no rewriting, no translation, no\n"
. " re-spelling, no script conversion. Preserve every original word verbatim.\n"
. "- Style should feel bold, dramatic, pop-star, Gen Z, visually addictive — like a\n"
. " designed lyric post or viral TikTok caption.\n"
. "- Do NOT add section headers, titles, intros, or any new lines. Every input line\n"
. " must map to exactly one output line, in the same order, with the same words.\n\n"
. "Output format: ONLY a JSON object mapping the 1-based line number to the fully\n"
. "decorated line text. No prose, no markdown, no code fence.\n\n"
. "LINES:\n" . implode("\n", $numbered);
$raw = $this->call($prompt, 8192);
if ($raw === '') return [];
$raw = trim(preg_replace('/^```(?:json)?\s*|\s*```$/m', '', $raw));
$obj = json_decode($raw, true);
if (! is_array($obj)) return [];
$out = [];
foreach ($obj as $k => $v) {
if (! is_string($v)) continue;
$v = trim($v);
if ($v === '') continue;
$idx = ((int) $k) - 1;
if ($idx < 0 || ! isset($lines[$idx])) continue;
// Cheap safety check: the original words must survive verbatim
// (the LLM should only LAYER emojis on top). Drop the decoration
// if any original letter or digit is missing or out of order.
if (! self::preservesOriginal($lines[$idx], $v)) continue;
$out[$idx] = $v;
}
return $out;
});
}
/**
* Verify the decorated line still contains every alphanumeric character of
* the original (in the same order). Stops the LLM from quietly rewording
* a line — we keep only decorations that strictly add emojis on top.
*/
private static function preservesOriginal(string $orig, string $decorated): bool
{
$strip = fn (string $s) => mb_strtolower(preg_replace('/[^\p{L}\p{N}]+/u', '', $s) ?? '');
$a = $strip($orig);
$b = $strip($decorated);
if ($a === '') return true;
// Sequential subsequence check: every char of $a must appear in $b in order.
$aLen = mb_strlen($a); $bLen = mb_strlen($b);
$j = 0;
for ($i = 0; $i < $aLen; $i++) {
$needle = mb_substr($a, $i, 1);
$found = false;
for (; $j < $bLen; $j++) {
if (mb_substr($b, $j, 1) === $needle) { $found = true; $j++; break; }
}
if (! $found) return false;
}
return true;
}
/** Dispatch to the active provider's adapter. */
private function call(string $prompt, int $maxTokens): string
{
$p = $this->activeProvider();
if (! $p) return '';
try {
// Default to Ollama when 'kind' is absent, matching activeProvider().
return match ($p['kind'] ?? 'ollama') {
'anthropic' => $this->callAnthropic($p, $prompt, $maxTokens),
'openai' => $this->callOpenAI($p, $prompt, $maxTokens),
default => $this->callOllama($p, $prompt, $maxTokens),
};
} catch (\Throwable $e) {
Log::error('LLM call failed: ' . $e->getMessage(), ['provider' => $p['name'] ?? '?']);
return '';
}
}
private function callOllama(array $p, string $prompt, int $maxTokens): string
{
$endpoint = rtrim((string) ($p['endpoint'] ?? 'http://localhost:11434'), '/');
$resp = Http::timeout(180)->acceptJson()->post($endpoint . '/api/chat', [
'model' => $p['model'],
'messages' => [['role' => 'user', 'content' => $prompt]],
'stream' => false,
'options' => ['num_predict' => $maxTokens, 'temperature' => 0.2],
]);
if (! $resp->successful()) {
Log::warning('Ollama API error', ['status' => $resp->status(), 'body' => substr($resp->body(), 0, 500)]);
return '';
}
$j = $resp->json();
return (string) ($j['message']['content'] ?? '');
}
private function callAnthropic(array $p, string $prompt, int $maxTokens): string
{
$endpoint = rtrim((string) ($p['endpoint'] ?? 'https://api.anthropic.com'), '/');
$resp = Http::timeout(120)->withHeaders([
'x-api-key' => (string) $p['api_key'],
'anthropic-version' => '2023-06-01',
'content-type' => 'application/json',
])->post($endpoint . '/v1/messages', [
'model' => $p['model'],
'max_tokens' => $maxTokens,
'messages' => [['role' => 'user', 'content' => $prompt]],
]);
if (! $resp->successful()) {
Log::warning('Anthropic API error', ['status' => $resp->status(), 'body' => substr($resp->body(), 0, 500)]);
return '';
}
$j = $resp->json();
return (string) ($j['content'][0]['text'] ?? '');
}
private function callOpenAI(array $p, string $prompt, int $maxTokens): string
{
$endpoint = rtrim((string) ($p['endpoint'] ?? 'https://api.openai.com'), '/');
$resp = Http::timeout(120)->withToken((string) $p['api_key'])
->acceptJson()
->post($endpoint . '/v1/chat/completions', [
'model' => $p['model'],
'messages' => [['role' => 'user', 'content' => $prompt]],
'max_tokens' => $maxTokens,
'temperature' => 0.2,
]);
if (! $resp->successful()) {
Log::warning('OpenAI API error', ['status' => $resp->status(), 'body' => substr($resp->body(), 0, 500)]);
return '';
}
$j = $resp->json();
return (string) ($j['choices'][0]['message']['content'] ?? '');
}
}
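
The post-processing in decorateLines() can be tried outside Laravel with a standalone sketch like the one below. It mirrors the three steps in the class (strip a stray code fence, decode the JSON object, keep only entries whose letters and digits survive in order), but the function names lettersOnly, keepsWords, and parseDecorations are hypothetical and not part of the service, and the sample reply is invented for illustration. It assumes PHP 7.4+ with the mbstring extension.

```php
<?php
declare(strict_types=1);

// Hypothetical standalone helpers; names are illustrative only.

/** Lowercase the string and drop everything that is not a letter or digit. */
function lettersOnly(string $s): string
{
    return mb_strtolower(preg_replace('/[^\p{L}\p{N}]+/u', '', $s) ?? '');
}

/** True when every letter/digit of $orig appears in $decorated, in order. */
function keepsWords(string $orig, string $decorated): bool
{
    $haystack = mb_str_split(lettersOnly($decorated));
    $total    = count($haystack);
    $j        = 0;

    foreach (mb_str_split(lettersOnly($orig)) as $needle) {
        while ($j < $total && $haystack[$j] !== $needle) {
            $j++; // skip characters the decoration added
        }
        if ($j === $total) {
            return false; // ran out of decorated text before matching
        }
        $j++; // consume the matched character
    }
    return true;
}

/** Parse a raw model reply into [zeroBasedIndex => decoratedLine]. */
function parseDecorations(string $raw, array $lines): array
{
    // Remove a leading/trailing Markdown fence if the model added one anyway.
    $raw = trim(preg_replace('/^```(?:json)?\s*|\s*```$/m', '', $raw) ?? '');
    $obj = json_decode($raw, true);
    if (! is_array($obj)) {
        return [];
    }

    $out = [];
    foreach ($obj as $k => $v) {
        $idx = ((int) $k) - 1; // reply keys are 1-based line numbers
        if (! is_string($v) || trim($v) === '' || ! isset($lines[$idx])) {
            continue;
        }
        if (keepsWords($lines[$idx], $v)) {
            $out[$idx] = trim($v);
        }
    }
    return $out;
}

// Usage: line 2 is rejected because the model changed "me" to "you".
$lines = ['Met behind the firewalls', 'Hold me tonight'];
$reply = "```json\n" . json_encode([
    '1' => 'Met behind 🛡️ the firewalls 🌌',
    '2' => 'Hold you tonight 💔',
], JSON_UNESCAPED_UNICODE) . "\n```";

$result = parseDecorations($reply, $lines);
```

Because the check is a subsequence test on letters and digits only, added emojis, spaces, and punctuation never cause a rejection, while a dropped or substituted word does.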