An open media-meta-server and wire protocol. It indexes media that lives on remote storage or CDNs, identifies and enriches it against TMDB, and hands clients a direct, playable URL — without ever proxying, transcoding, or storing a single media byte.
Sphynx is two things, and this guide is split to match. Part I — The Protocol is the wire contract: the JSON any client or server must speak, independent of language or implementation. Part II — The Reference Server is how this implementation (sphynx-server) happens to work — setup, indexing, enrichment, the admin surface. Build a client or your own server against Part I; run or extend the reference server with Part II.
The wire contract: discovery, auth, browse, the Item shape, resolve, playstate, markers, events, errors — plus how to implement it.
Setup & config, indexing & identification, TMDB enrichment, the admin API, and source drivers — implementation, not contract.
A copy-paste walkthrough: login → source → scan → resolve → play.
Every protocol endpoint with request/response shapes. (The server-only admin API lives in Part II.)
How the reference server turns messy filenames + folders into a confident identity.
Marker editing, Accept-Language negotiation, more storage drivers, and a built-in critic-rating source.
What's the spec, and what's just this server? The line matters if you're building a client or a third-party backend — only the left column is the contract you must honor. Everything on the right is how the reference server happens to work, and a conforming server may do all of it completely differently (or not at all).
| In the spec (the wire protocol, must honor) | Reference-server only (not the spec) |
|---|---|
| Discovery & capabilities (/v1/info) · auth & sessions (incl. passkeys) · browse (libraries/items, sort/filter) · the Item shape (images/variants, cast, markers, extra) · resolve (+ tracks/streams) · playstate & per-user state · home shelves · changes feed + tombstones · markers (bi-directional) · events (SSE) · search (optional) · errors. Defined as value types in sphynx-protocol/; this is the JSON any two implementations agree on. | The admin API (/v1/admin/*: library/source/item/user/settings CRUD, scan, enrich) · the web admin page (/admin) · the Extensions framework (Diagnostics; the ffprobe media probe) · source drivers (http/local/webdav/smb/ftp/torbox) · TMDB identification & enrichment · the filename/folder parser · background maintenance & per-source auto-refresh. Implementation detail of sphynx-server: useful, but never assumed by a client. |
What is Sphynx?
Sphynx is a drop-in alternative to the backend role that Jellyfin and Plex play — minus the media byte plane. It does library indexing, content identification, metadata enrichment, users/auth, playstate, and intro-marker lookup, then resolves an item to a direct location the client fetches itself.
The single differentiator: Sphynx assumes the media is remote and direct-streamable, so the entire transcode/segment/serve subsystem those servers carry simply does not exist here. No bytes ever transit the server.
It was designed side-by-side with Ocelot, a native Apple media player whose direct-stream-only engine consumes Sphynx through a thin adapter — but the protocol is provider-neutral and client-agnostic. Items are keyed by TMDB id wherever possible, so any client or third-party server can implement either side.
The one-sentence mental model: the server is brain, not muscle. It answers "what do you have?" and "where exactly is this file right now?" — the client does the streaming.
Core concepts
| Principle | What it means |
|---|---|
| Control plane only | /resolve describes where the bytes are (a direct URL + headers, plus an optional expiry only when the source's link is time-bounded), called late at play time, never cached from a browse response. The server stores only the source reference — never a resolved URL — and resolves fresh on every play. It moves no bytes. |
| TMDB-centric identity | Items carry a TMDB id where one exists. It is the join key for artwork, intro markers, and cross-server interoperability. |
| Forward-compatible JSON | Unknown fields are ignored; new enum-like string values decode to an .unknown case instead of breaking older clients. |
| Neutral units | Time is always seconds (floating point) on the wire. Wall-clock timestamps are RFC 3339 strings. |
| Open, neutral metadata | Every canonical field is optional with a fixed meaning/unit; a client maps only the name. Anything beyond the canonical set rides in an open extra bag. |
| Boring, proven auth | Password hashing (bcrypt), short access token + rotating refresh token, device-scoped sessions, per-user row scoping. |
Repository layout
The project (reckloon/Sphynx-Media) is a monorepo containing two Swift packages plus the specs:
| Path | What it is |
|---|---|
| sphynx-protocol/ | The wire contract as pure, dependency-free Swift value types (Foundation-only). Builds for every Apple platform + Linux. Used directly by the reference server; available for any client that prefers to reuse the types rather than hand-map the JSON. |
| sphynx-server/ | The reference server, a Hummingbird 2 app. Uses the protocol types directly as request/response bodies, so it cannot drift from the wire format. |
| docs/ | The endpoint reference (API.md). The full narrative (protocol, server design, and extending) is this guide. |
On consuming sphynx-protocol. The package is the canonical, dependency-free definition of the wire types, shared by the reference server so it can't drift from the spec. A client MAY consume it to get those types for free, but isn't required to — the wire is plain JSON, so a client can implement the protocol directly from this documentation (Ocelot does the latter, hand-mapping the JSON with its own small Decodable types). Because the package isn't at the repository root it can't be added as a SwiftPM URL dependency; a client that does want to consume it adds it via a local path dependency against a checkout today, and it can be mirrored to its own repo if standalone distribution is ever needed.
API: the wire surface (wire contract)
The wire protocol surface — the endpoints any client or conforming server speaks. (Server-specific endpoints, like the whole admin API, are in Part II.) Also available as Markdown in the repo at docs/API.md.
- Base path: /v1
- Bodies: application/json
- Times/durations: seconds (floating point)
- Auth: Authorization: Bearer <accessToken> on everything except /v1/info and the /v1/auth/{login,refresh,logout} calls
- Device scoping: send a stable per-install X-Sphynx-Device: <opaque> header
Conventions
| Aspect | Rule |
|---|---|
| Unknown JSON fields | Ignored (forward-compatible) |
| Unknown enum strings | Decoded as an .unknown value, never an error |
| Errors | Consistent envelope (see Errors) + standard HTTP status |
| IDs | Opaque strings — treat as cookies, don't parse |
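The two forward-compatibility rules in the table above can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: `LibraryKind`, `Library`, and `decode_library` are hypothetical names, and the enum lists only a subset of the kinds the protocol mentions.

```python
from dataclasses import dataclass
from enum import Enum


class LibraryKind(Enum):
    """Illustrative subset of an open string enum (names are hypothetical)."""
    MOVIES = "movies"
    TV_SHOWS = "tvShows"
    UNKNOWN = "unknown"

    @classmethod
    def _missing_(cls, value):
        # Any string this client does not know decodes to UNKNOWN, never an error.
        return cls.UNKNOWN


@dataclass
class Library:
    id: str
    title: str
    kind: LibraryKind


def decode_library(obj: dict) -> Library:
    # Read only the fields we know; any extra keys in obj are simply ignored.
    return Library(id=obj["id"], title=obj["title"], kind=LibraryKind(obj["kind"]))
```

A payload such as `{"id": "lib_1", "title": "Podcasts", "kind": "podcasts", "futureField": 1}` decodes without error: the extra key is dropped and the kind becomes `LibraryKind.UNKNOWN`, which the client can map to a default presentation.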
Discovery
Confirm a URL is a Sphynx server and learn its capabilities.
// 200
{ "product":"Sphynx", "serverName":"Sphynx Reference Server", "id":"srv_reference",
"version":"0.2.5", "protocol":["v1"],
"capabilities": { "search":false, "playstate":true, "candidates":true, "events":true, "passkeys":false, "deviceAuth":true, "webAuth":true,
"metadata": { "markers":"readwrite", "images":"read" },
"fields": ["id","type","title","tmdbId","year","images","placeholder","dateAdded","updatedAt","seriesId","seriesTitle","seasonIndex","episodeIndex","childCount","parentId","collectionId","collectionTitle","extra","overview","runtime","genres","communityRating","officialRating","cast","originalTitle","sortTitle","tagline","status","premiereDate","endDate","studios","directors","writers","countries","tags","trailers","chapters","versions","externalIds","resumePosition","watched","playCount","isFavorite","userRating","lastPlayedAt"],
"browse": { "sorts":["added","name","rating"], "filters":["genre","year","unwatched"] },
"playstateReportInterval":5 } }
A client treats unknown capability keys as ignorable and missing booleans as false. Notable keys:
- metadata: the bi-directional access policy, a per-field map of none | read | readwrite (open enum). A field absent from the map is none: readable if served, but not contributable.
- playstateReportInterval (seconds): the server's preferred client playback-report cadence. A client reporting progress periodically SHOULD use it (default ~5s if absent). Push-only: the server never polls.
- events: advertises the additive server→client event stream (SSE); absent ⇒ the client falls back to polling.
- passkeys: advertises passwordless passkey (WebAuthn) sign-in; absent/false ⇒ no Relying Party is configured, so the client hides passkey affordances and uses password login.
- webAuth: advertises the OAuth-style web authorization flow, a same-device hosted-login sign-in for clients that can't add the server's host to an Associated Domains entitlement.
fields — the coverage advertisement (highly recommended on both sides). It lists the canonical Item fields the server can populate — distinct from metadata, which is the read/write access policy. The protocol strongly recommends (SHOULD) that a server list every field it serves here, and that a client read it to inform the user of unsupported features (e.g. hide a "Trailers" row when fields omits trailers). An absent/empty list means "coverage not advertised" — the client assumes nothing and just renders whatever each item carries. (The reference server advertises its full set, including chapters — filled by the media-probe extension when an item is probed. The one canonical field it never fills is criticRating, since TMDB has no critic score; see the Item shape.)
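As a rough sketch of how a client might read the /v1/info response under these rules, the helpers below take the decoded JSON as a plain dict. The function names are illustrative assumptions; only the key names come from the example response.

```python
def capability_flag(info: dict, name: str) -> bool:
    """Missing (or non-boolean) capability values are treated as False."""
    return info.get("capabilities", {}).get(name) is True


def serves_field(info: dict, field: str):
    """True/False when coverage is advertised, None when it is not."""
    fields = info.get("capabilities", {}).get("fields")
    if not fields:
        return None  # absent or empty list: coverage not advertised, assume nothing
    return field in fields


def report_interval(info: dict, default: float = 5.0) -> float:
    """Preferred playback-report cadence in seconds, with the ~5s fallback."""
    value = info.get("capabilities", {}).get("playstateReportInterval")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)
    return default
```

A client could hide its "Trailers" row only when `serves_field(info, "trailers")` is exactly `False`; a `None` result means it should render whatever each item carries.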
Authentication
Body { "username":"…", "password":"…" }. Optional header X-Sphynx-Device.
// 200
{ "accessToken":"…", "refreshToken":"…", "expiresIn":3600, "refreshExpiresIn":2592000,
"user": { "id":"u_…", "displayName":"admin" } }
expiresIn is the access-token lifetime (seconds); the optional refreshExpiresIn is the refresh-token lifetime, so a client can pre-empt a forced re-login. Both login and refresh return them.
401 unauthorized — invalid username or password.
Body { "refreshToken":"…" }. Returns a new token pair; the presented refresh token is rotated. Rotation keeps a ~60s reuse grace window: the immediately-previous token presented inside the window idempotently returns the current pair, so a concurrent-refresh race or a lost rotation response can't strand the session. Revocation still wins, and a token two or more rotations back never resolves. Same response shape as login. 401 on an invalid, expired, or already-rotated token (outside the window).
Body { "refreshToken":"…", "allDevices":false }. Revokes the presented refresh token's session. allDevices:true revokes every session on the same device id. 204 No Content on success (idempotent).
The caller's active sign-in sessions (devices), newest-active first. 200 → { "sessions":[{ "id","deviceId","current","createdAt","lastActiveAt","expiresAt" }] }. current flags the requesting session. Powers the "signed-in devices" list on /user.
Sign out one of the caller's own devices. 204; idempotent; scoped to the caller. Revoking the current session signs this device out on its next request.
The authenticated user plus that user's effective permissions. Where /v1/info advertises what the server supports, this reflects what this user may actually do (permissions are granted per-user by the admin).
// 200
{ "user": { "id":"u_…", "displayName":"Bob" },
"permissions": ["library.read", "metadata.markers.write"],
"metadata": { "markers":"readwrite", "images":"read" } }
- permissions: the user's effective permission keys (see Permissions). The admin holds all of them implicitly. Treat unknown keys as opaque and ignore them (forward-compatible).
- metadata: a per-field metadata-access view (server policy narrowed to this user's write permissions), kept for the contribute affordance.
A client should use this (not /v1/info) to decide which affordances to show (browse, contribute markers, edit metadata, …).
Change the authenticated user's own password. Body { "currentPassword":"…", "newPassword":"…" }. 204 on success; 401 if the current password is wrong. The presenting session stays valid.
Profile-picker sign-in
A pre-auth "who's watching" chooser: the /user sign-in page can render tappable profiles instead of a username field. Served only when the admin enables the signInUserList setting (off by default) — otherwise the routes are 404, so a server never enumerates its accounts before sign-in. Credentials, roles, and admin status are never included.
The list of pickable profiles. 200 → { "users":[{ "username":"alice", "displayName":"Alice", "avatarURL":"/v1/auth/directory/u_…/avatar" }] }, sorted by display name (case-insensitive). avatarURL is absent when the user has no avatar (the client shows an initial). Picking a profile prefills username for POST /v1/auth/login; passwordless sign-in still goes through the discoverable passkey ceremony. 404 not_found when signInUserList is off.
The profile picture for a chooser entry, served without a token so the picker can render it pre-auth. Gated by the same signInUserList setting. 404 when the directory is disabled or the user has no avatar.
Passkeys (WebAuthn)
Passwordless sign-in with a WebAuthn passkey (Touch ID / Face ID, a synced platform passkey, or a hardware key). Available only when capabilities.passkeys is true — i.e. an admin has configured a Relying Party. When it's false the /v1/auth/passkeys/* routes are absent (404) and clients use password login.
Each ceremony is two calls: begin returns a server-generated challengeId plus a standard WebAuthn publicKey options object; finish echoes that challengeId with the authenticator's response. A challengeId is single-use and short-lived (≈5 min). The publicKey options and the authenticator credential objects are the standard W3C WebAuthn JSON shapes — what navigator.credentials (browser) or ASAuthorization (Apple) produce/consume — so build them with the platform API, not by hand.
Enrollment requires a session (you add a passkey while logged in); signing in with a passkey is public. Passwords remain the bootstrap/fallback credential.
For an end-to-end walkthrough — server Relying-Party setup, the begin/finish ceremony model, and the per-platform client methods (navigator.credentials on the web, ASAuthorization on Apple) — see the passkeys implementation guide.
Begin enrolling a passkey for the authenticated user (no body). 200 → { "challengeId":"pkc_…", "publicKey": { "challenge","rp","user","pubKeyCredParams","timeout","attestation" } }. Pass publicKey to navigator.credentials.create().
Body { "challengeId":"pkc_…", "label":"iPhone", "credential": {…} } (label optional). 201 returns the stored PasskeyInfo; 400 if the attestation can't be verified or the challenge is invalid/expired.
Begin a passwordless sign-in (no body). The options omit allowCredentials, so the authenticator offers its discoverable passkeys for this RP. 200 → { "challengeId":"pkc_…", "publicKey": { "challenge","rpId","timeout","userVerification" } }. Pass publicKey to navigator.credentials.get().
Body { "challengeId":"pkc_…", "credential": {…} }. On a verified assertion returns the same token pair as /v1/auth/login (scoped to X-Sphynx-Device). 401 if the assertion fails or the credential is unknown; 400 if the challenge is invalid/expired.
The caller's passkeys, newest first. 200 → { "passkeys":[{ "id":"pk_…", "label", "createdAt", "lastUsedAt", "backedUp" }] }. id is the opaque pk_… used in the management URLs below (not the raw credential id); key material is never returned.
Rename a passkey. Body { "label":"…" } (non-empty). 200 returns the updated PasskeyInfo; 404 if it isn't one of the caller's passkeys.
Remove a passkey. 204; 404 if it isn't one of the caller's passkeys.
Configuration. Passkeys stay off until an admin sets the Relying Party in Settings (passkeyRelyingPartyID, optional passkeyRelyingPartyName / passkeyRelyingPartyOrigin). The RP id is the bare domain the server is reached at (no scheme/port, e.g. media.example.com); the origin defaults to https://<rpId>. These must match the client's origin or every ceremony fails — a WebAuthn constraint, not a Sphynx one.
Device authorization (QR / code sign-in)
Passwordless sign-in for TVs and limited-input clients — an RFC 8628-style device-authorization grant. The device shows a QR + short code; the user approves on a second device where they're already signed in (typically via a passkey); the device polls and receives the same TokenResponse as any login. Advertised via capabilities.deviceAuth.
The device begins (send X-Sphynx-Device; optional body { "label":"Living Room TV" }). 200 → { "deviceCode", "userCode":"WXYZ-2345", "verificationUri", "verificationUriComplete", "interval", "expiresIn" }. The device renders a QR of verificationUriComplete and shows userCode for manual entry; deviceCode is the secret it polls with (never shown).
The device polls with { "deviceCode" } every interval seconds. Until approved: 400 with error code authorization_pending (keep polling) / expired_token / invalid_grant. Once approved: 200 with a real TokenResponse. Single-use — a second claim fails.
pending confirms which device ({ "label", "expiresIn" }); approve takes { "userCode" } → 204. The approver authenticates however they like — a passkey makes this "scan → Face ID → done." The reference server hosts a browser approval page at GET /link; a native client can call approve straight from its own passkey-authenticated session.
The QR must point somewhere the phone can reach. verificationUri is the server's public base URL + /link, derived from passkeyRelyingPartyOrigin (else SPHYNX_PASSKEY_RP_ID → https://<id>, else http://<host>:<port>). For sign-in from a phone off the LAN, set your public domain (the same origin you use for passkeys) — otherwise the QR resolves to http://<host>:9410, which a phone on cellular can't open.
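The device-side polling loop described in this section might look like the sketch below. The HTTP call is left to a caller-supplied `poll` function, a hypothetical helper that posts the body and returns `(status, payload)`, where `payload` is the TokenResponse on 200 and otherwise the error code string extracted from the error envelope. The `sleep` and `clock` parameters exist only to make the loop testable.

```python
import time


def poll_for_tokens(poll, device_code, interval, expires_in,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll every `interval` seconds until approved, failed, or expired."""
    deadline = clock() + expires_in
    while clock() < deadline:
        status, payload = poll({"deviceCode": device_code})
        if status == 200:
            return payload  # a TokenResponse, same shape as any login
        if payload != "authorization_pending":
            # expired_token / invalid_grant: polling again will not help.
            raise RuntimeError(f"device sign-in failed: {payload}")
        sleep(interval)
    raise TimeoutError("device code expired before approval")
```

Because a successful claim is single-use, the caller should store the returned tokens immediately rather than polling again.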
Web authorization flow (OAuth-style)
A seamless same-device web sign-in for clients that can't add the server's host to an Associated Domains entitlement — the self-hosted case, where the app can't be re-signed per server, so platform passkeys and universal-link callbacks aren't available. It mirrors the OAuth 2.0 authorization-code grant but returns to the app via a custom URL scheme instead of a universal link, so it needs no Associated Domains and no per-owner app signing. The client drives it with ASWebAuthenticationSession (or the platform equivalent). Advertised via capabilities.webAuth.
PKCE is recommended. Before starting, the client generates a high-entropy code_verifier and sends its challenge (code_challenge = BASE64URL(SHA256(verifier)) with code_challenge_method=S256). The code can then only be redeemed by the client that holds the verifier, which matters because any app could register the same custom scheme. plain is accepted but discouraged. state is an opaque value the client generates and must verify on return (it's echoed back unchanged) to tie the redirect to the request it started.
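For illustration, the S256 challenge described above can be computed with the Python standard library alone. The function names are hypothetical; the 32-byte verifier length is one reasonable choice that falls within the 43 to 128 character range RFC 7636 allows.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 32 random bytes encode to 43 URL-safe characters.
    return secrets.token_urlsafe(32)


def s256_challenge(verifier: str) -> str:
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # BASE64URL without '=' padding, as RFC 7636 specifies.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client keeps the verifier in memory, sends `s256_challenge(verifier)` as code_challenge with code_challenge_method=S256, and later presents the verifier itself as codeVerifier when exchanging the code.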
The client opens this URL in a web-authentication session. The server renders its normal login page; on a successful sign-in the page redirects to redirect_uri?code=<authCode>&state=<state>. Query params: redirect_uri (required — where the code is delivered; must pass the server's allowlist, see below — a bad target renders an error page, 400, not a login form), state (recommended; echoed back on the redirect), code_challenge (recommended PKCE challenge), code_challenge_method (S256 recommended, or plain; defaults to plain if a challenge is sent without it).
The hosted login page submits credentials here. Body { "username", "password", "redirectUri", "state"?, "codeChallenge"?, "codeChallengeMethod"? }. On success returns { "redirectTo":"<redirect_uri>?code=…&state=…" } and the page navigates there. The browser never receives a session token — only the short, single-use code.
The client redeems the code for a session. Honors X-Sphynx-Device for session scoping (the session is scoped to the exchanging client's device, not the browser's). Body { "code":"<authCode>", "codeVerifier":"<verifier>" } — codeVerifier is required when the flow used PKCE; omit it otherwise. 200 → a full TokenResponse (same shape as /v1/auth/login). 400 with error code invalid_grant for an unknown, expired, already-used code, or failed PKCE verification — the code is single-use (consumed on first exchange) and short-lived (~60s TTL).
redirect_uri allowlisting. To prevent open redirects, redirect_uri is validated server-side. With an allowlist configured (the webAuthRedirectAllowlist setting — a newline/comma-separated list of exact URIs or scheme prefixes such as ocelot://auth), the redirect must equal, or begin with, a listed entry. With no allowlist (the default), app custom schemes are accepted (a deep link can't be an open redirect to an arbitrary web origin) while http(s) targets are rejected — an operator must allowlist a web origin to permit it. PKCE is what binds the code to the legitimate client.
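The allowlist rule above can be sketched as a small check. This is an assumption-laden illustration, not the reference server's code: `parse_allowlist` and `redirect_allowed` are hypothetical names, and matching is a plain string-prefix comparison.

```python
import re
from urllib.parse import urlsplit


def parse_allowlist(setting: str) -> list[str]:
    # The setting is a newline/comma-separated list of exact URIs or prefixes.
    return [entry.strip() for entry in re.split(r"[,\n]", setting) if entry.strip()]


def redirect_allowed(redirect_uri: str, allowlist: list[str]) -> bool:
    if allowlist:
        # Must equal, or begin with, a listed entry.
        return any(redirect_uri.startswith(entry) for entry in allowlist)
    scheme = urlsplit(redirect_uri).scheme.lower()
    # No allowlist: accept app custom schemes, reject http(s) and scheme-less targets.
    return bool(scheme) and scheme not in ("http", "https")
```

One consequence of prefix matching in this sketch: an entry such as `https://app.example` would also match `https://app.example.evil.test`, so web-origin entries are safer when they end with a slash or a full path.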
Browse
The top-level collections a user can browse.
// 200
{ "libraries": [ { "id":"lib_…", "title":"Movies", "kind":"movies" } ] }
kind is an open string enum (movies, tvShows, homeVideos, musicVideos, music, audiobooks, boxSets, collection, other, …); clients map unknown kinds to a default. Browsing a kind: collection library is a read-through view that aggregates every box-set tile across all libraries the user can read (newest first, served via items?parent=); each tile still opens to its members the usual way.
Music & audiobooks. The reference server doesn't implement audio (no music identification — TMDB is film/TV only), but the protocol fully models it so another server can, with no wire changes. It reuses the same primitives: libraries (kind: music / audiobooks), the parent/child tree, resolve, playstate. Item types artist → album → track and audiobook → chapter (nested via parentId); ordering rides on artistName, albumTitle, discNumber, trackNumber (for an audiobook: author→artistName, book→albumTitle, chapter №→trackNumber). Lossless / hi-res is expressed on resolve's described streams — MediaStream carries codec + sampleRate + bitDepth + bitRate, so { codec:"flac", sampleRate:96000, bitDepth:24 } renders as "FLAC 24/96"; FLAC-vs-MP3 alternatives use the same multi-version picker. Full contract in API.md → Music & audiobooks.
Children of a container.
| Param | Default | Meaning |
|---|---|---|
| parent | required | A library id (top-level items) or an item id (its children) |
| detail | skeleton | skeleton (tile fields) or full (adds enrichment) |
| limit | 50 | Page size (1–200) |
| cursor | (none) | Opaque pagination cursor from a previous nextCursor |
| sort | added | A library's top level: added, name, or rating |
| order | (by sort) | asc or desc (default: name asc, added/rating desc) |
| genre | (none) | Top level only: keep items carrying this genre |
| year | (none) | Top level only: keep items of this release year |
| unwatched | (none) | true ⇒ drop items the caller has marked watched |
The valid sort keys and filter params are advertised in capabilities.browse ({ "sorts":[…], "filters":[…] }), so a client builds its sort/filter UI from the contract instead of guessing.
// 200
{ "items": [ { "id":"it_…", "type":"movie", "title":"…", "year":2008 } ],
"nextCursor": "b2Zmc2V0OjUw",
"totalCount": 947, "pageSize": 50 }
An absent nextCursor means the end of the list. totalCount is the structural total under this parent matching genre/year — the full set the cursor paginates, so a client shows "1–50 of 947"; it does not account for the per-user unwatched post-filter. pageSize echoes the effective limit after the server's clamp (both are on /v1/items; the home feeds omit them). Items fold the caller's per-user state (resumePosition, watched, playCount, isFavorite, lastPlayedAt). sort/genre/year apply to a library's top level; children of an item keep their natural order.
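A minimal sketch of cursor pagination over /v1/items follows. The network call is abstracted behind `fetch_page`, a hypothetical caller-supplied function that performs the GET with the given query parameters and returns the decoded JSON body.

```python
def iter_items(fetch_page, parent, limit=50, **filters):
    """Yield every item under `parent`, following nextCursor until it is absent."""
    cursor = None
    while True:
        params = {"parent": parent, "limit": limit, **filters}
        if cursor is not None:
            params["cursor"] = cursor  # opaque: pass back exactly as received
        page = fetch_page(params)
        yield from page.get("items", [])
        cursor = page.get("nextCursor")
        if not cursor:
            break  # an absent nextCursor marks the end of the list
```

For example, `iter_items(fetch_page, "lib_…", sort="name", genre="Drama")` walks the whole filtered listing; totalCount from the first page can drive a "1–50 of 947" label while the remaining pages load.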
Set the caller's per-user state for an item. Body (any subset) { "watched":true, "isFavorite":true, "rating":8.5 } → 200 with the item, state folded in. 403 if the caller can't read the item's library. Play count + last-played are tracked from playback (a non-failed stop bumps them); watched/isFavorite/rating are explicit here. watched:true also clears the caller's resume (same as DELETE /v1/playstate/{id}) so resumePosition reads 0 and the item leaves Continue Watching — mark-watched implies finished (Jellyfin PlayedItems / Plex scrobble). rating is the caller's own 0–10 score (a 5-star UI sends stars ×2), folded back as userRating; 0 clears it, out-of-range is a 400 — distinct from the crowd's communityRating and the press's criticRating.
A single item. 404 not_found if absent. See Item shape.
Extras / bonus content
Trailers, featurettes, deleted scenes, and behind-the-scenes clips arrive as their own typed items (type = trailer / featurette / deletedScene / behindTheScenes), nested under their movie or show via parentId — never as a standalone library tile, never as a real season. The split is deliberate: the server classifies, the client presents. The server promises only the type and the parentId; layout is entirely the client's call.
To consume them, browse a title's children (GET /v1/items?parent=<movieId|seriesId>) — the listing mixes a show's season rows with its extras — then partition by type. From that grouping a client may render extras however it likes; all of these are valid:
- A "Bonus / Extras" shelf on the detail screen, optionally sub-grouped into Trailers / Deleted Scenes / Featurettes / Behind the Scenes. The recommended default.
- A pseudo-season per category: show "Deleted Scenes" or "Featurettes" as if it were a season row beside Season 1, Season 2, … This is a client-side rendering choice only: the server emits no season container for extras, and their real parentId is the title.
- A dedicated "Extras" library/view: collect extras across titles into their own section, composed client-side by walking each title's children. The server exposes no separate extras library or endpoint.
Play an extra like any leaf (GET /v1/resolve/<id>); it carries no TMDB id, only filename-derived title/container. Treat the extras type set as open — render an unknown type as a generic extra rather than assuming the list is fixed. Specials (a real season with seasonIndex: 0) are not extras — they're aired episodes the server enriches from TMDB; don't conflate the two.
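Partitioning a title's children by type, as described above, might be sketched like this. It assumes that real season rows carry the type string "season" (the section implies this but does not spell it out), so the structural types are a parameter; unknown types land in a generic "extra" bucket instead of being dropped.

```python
from collections import defaultdict

KNOWN_EXTRAS = {"trailer", "featurette", "deletedScene", "behindTheScenes"}


def partition_children(children, structural_types=("season",)):
    """Split a title's children into season rows and extras grouped by type."""
    seasons, extras = [], defaultdict(list)
    for item in children:
        kind = item.get("type")
        if kind in structural_types:
            seasons.append(item)
        else:
            # The extras type set is open: render an unknown type as a generic extra.
            extras[kind if kind in KNOWN_EXTRAS else "extra"].append(item)
    return seasons, dict(extras)
```

The returned grouping is enough to build any of the layouts listed above, for instance one shelf per key of the extras dict.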
Search (optional)
Search is optional, and on purpose. A server advertises whether it implements server-side search through capabilities.search; when that is false the /v1/search endpoint is simply absent. The reference server ships search:false and implements no server-side search — it leaves search to the client, which is usually the better place for it.
Why the client is the right place to search. A client already mirrors the whole library locally — it browses the tree and stays current through the /v1/changes delta feed. So it can search that local cache instantly, offline, with zero server round-trip, and rank results however suits its UI. The server holding a second search index would only duplicate what the client already has.
Encouraged client-side approaches, simplest to richest:
- In-memory match: a substring or fuzzy match over the cached items' titles (and originalTitle, cast, genres). Trivial to ship; enough for most libraries.
- Local store query: if the client persists its catalogue (SQLite, Core Data, …), search becomes an indexed query over its own database, with typo-tolerance, prefix matching, and field weighting for free.
- Semantic / natural-language: go further with embeddings or an on-device model. Ocelot ships a proprietary on-device LLM search over its synced catalogue, answering natural-language queries ("that 90s heist movie with the twist") locally, with no server cost or privacy trade-off.
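The simplest of these approaches, an in-memory substring match, could look like the sketch below. It assumes cached items are plain dicts with title, originalTitle, and a genres list of strings; the ranking (title prefix, then title substring, then genre) is an arbitrary illustrative choice, since ranking is left to the client.

```python
def search_local(items, query, limit=20):
    """Case-insensitive substring search over a locally cached catalogue."""
    needle = query.casefold().strip()
    if not needle:
        return []
    scored = []
    for item in items:
        title = (item.get("title") or "").casefold()
        original = (item.get("originalTitle") or "").casefold()
        genres = [g.casefold() for g in item.get("genres") or []]
        if title.startswith(needle):
            rank = 0
        elif needle in title or needle in original:
            rank = 1
        elif any(needle in g for g in genres):
            rank = 2
        else:
            continue
        scored.append((rank, title, item))
    # Sort by rank, then alphabetically by title; never compare the dicts themselves.
    scored.sort(key=lambda entry: entry[:2])
    return [item for _, _, item in scored[:limit]]
```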
The protocol standardizes only the shape, so that a server which does choose to offer search is drop-in interchangeable. GET /v1/search?q=query (params: q required, optional type ItemType filter, limit, cursor) returns a SearchResponse shaped like /v1/items — { "items":[…], "nextCursor":"…", "query":"…" } — so the client reuses the same result rendering. How matching and ranking work is left entirely to the implementer; the contract fixes only the request params and the response shape.
Markers (bi-directional)
Timeline-segment markers are item-level (shared across a server's clients) and gated by capabilities.metadata["markers"]. See Extending for the contribution model (e.g. a client bridging TheIntroDB).
A marker maps a segment type to a { start, end } window (seconds; end optional for open-ended). The four well-known types — recap, intro, credits, preview — power "Skip Recap / Skip Intro / Next Episode" affordances. The type space is open: a server or extension may contribute any segment type (e.g. sponsor), and clients ignore types they don't recognise. On the wire it's a flat object keyed by type.
// 200
{ "markers": { "recap":{"start":0,"end":30}, "intro":{"start":75,"end":145},
"credits":{"start":9120}, "preview":{"start":9150,"end":9180} },
"source":"theintrodb", "confidence":0.95, "authoritative":false,
"updatedAt":"2026-06-27T12:00:00Z", "stale":false }
404 if the server doesn't offer markers, or none are stored. stale:true means the markers are older than the server's freshness window and a client with a data source should re-fetch and PUT updated ones. Authoritative markers are never stale.
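To show how a player might turn the flat markers object into a "Skip Intro" style affordance, here is a small sketch. The function name is hypothetical, and the tie-break for overlapping windows (the latest-starting one wins) is an assumption of this example, not something the protocol prescribes.

```python
def active_marker(markers, position):
    """Return (type, window) for the segment containing `position`, else None."""
    best = None
    for kind, window in markers.items():
        start = window.get("start", 0.0)
        end = window.get("end")  # an absent end means the segment is open-ended
        if position >= start and (end is None or position < end):
            # Among overlapping windows, prefer the one that started most recently.
            if best is None or start > best[1].get("start", 0.0):
                best = (kind, window)
    return best
```

A client would call this on each progress tick and show a skip button only for the types it recognises, ignoring any others.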
Body { "markers": {…}, "source":"…", "confidence":0.9 } → 200 with the stored markers info.
- 403 forbidden if the server is read-only for markers, or the user lacks the metadata.markers.write permission (permissions are per-user; the admin always has it). Check GET /v1/auth/me.
- 409 conflict if authoritative markers exist and the caller isn't admin: a best-effort client contribution may not clobber server-detected/admin data.
Contributed markers also appear in the /resolve descriptor's markers.
Resolve
The late-bound handoff: turns an item into a direct, playable location. Called at play time, never cached from browse.
// 200
{ "url":"https://cdn.example/movie.mkv", "headers":{}, "container":"mkv",
"terminal":true }
- url: DIRECT location; the client streams this itself. Resolved fresh on every call and never stored; Sphynx keeps only the item's source reference.
- headers: headers the client must send when fetching url.
- terminal: if true, url is the driver's final location: fetch it directly, with no further Sphynx resolve step. This is the driver's own assertion about what it produced, not the result of probing the origin; it says nothing about ordinary HTTP redirects (your HTTP client follows those normally) or timing (resolution is always fresh at play time). Absent/false means you must resolve url yourself before fetching. Every built-in driver emits true.
- ttl: optional. When the source hands back a time-bounded link (e.g. a signed CDN URL), how many seconds it stays valid; the server passes the driver's value straight through and never persists it. The built-in http/local drivers return plain, non-expiring URLs, so ttl is absent. Absent = no expiry.
- tracks: optional. The selection indices (preferredAudio / preferredSubtitle / copyableAudio) plus, once the media has been probed, the full per-track detail: streams (each { index, kind, codec, language, title, channels, isDefault, isForced }) and externalSubtitles (sidecar { url, language, format }), enough to render an "Audio: English 5.1 / Subtitles: Spanish" picker without demuxing the file. streams / externalSubtitles are absent until the item is probed; enable the media-probe extension and probe an item to populate them (the result caches and folds in here).
- candidates: optional. Ranked fallback locations: if url fails, try these in order ({ url, headers, priority }, lower priority first). The reference server populates them from the title's other versions (so a client can fall back to another quality/edition) and advertises capabilities.candidates: true; absent for a single-file item. Driver-supplied true mirrors (same file, alternate hosts) lead the list.
- markers: optional.
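The fallback order implied by url and candidates can be sketched as follows. The actual fetch is abstracted behind `try_open`, a hypothetical caller-supplied probe that returns True when the player can start from a location; treating a missing priority as 0 is also an assumption of this sketch.

```python
def ordered_locations(descriptor):
    """Primary url first, then candidates by ascending priority."""
    primary = {"url": descriptor["url"], "headers": descriptor.get("headers", {})}
    fallbacks = sorted(descriptor.get("candidates", []),
                       key=lambda c: c.get("priority", 0))  # stable: ties keep server order
    return [primary] + [{"url": c["url"], "headers": c.get("headers", {})}
                        for c in fallbacks]


def first_playable(descriptor, try_open):
    """Return the first location the probe accepts, or None if all fail."""
    for location in ordered_locations(descriptor):
        if try_open(location["url"], location["headers"]):
            return location
    return None
```

If every location fails, re-resolving the item is the natural next step, since resolution is always fresh at play time.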
Multi-version / editions. When one title is backed by more than one file (4K + 1080p, Director's Cut + Theatrical), the server collapses them into a single item (grouped by title + year) carrying a best-first versions array — [{ id, label, resolution?, edition?, dynamicRange?, container?, size? }] — instead of duplicate tiles. versions appears only when there's a real choice (≥2 files). A plain resolve plays versions[0] (the default/highest quality); a client shows a picker and plays a specific one with GET /v1/resolve/{id}?version=<vid> (an unknown vid is a 404, never a silent fallback). Each id is opaque and stable across re-scans. The reference server detects versions from filenames (2160p/4K, 1080p, HDR10/DV, Director's Cut, Extended, Remux, …); a field-rich server may populate them from a probe.
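A minimal sketch of how a client could build the resolve URL for a picked version. The `resolveURL` helper and the example host are hypothetical; the path and the version query parameter are the ones described above. It assumes the server is mounted at the origin root, since any path on `base` is replaced.

```swift
import Foundation

/// Builds the resolve URL for an item, optionally pinned to one version id.
/// Ids are opaque and passed through untouched.
func resolveURL(base: URL, itemId: String, versionId: String? = nil) -> URL? {
    var components = URLComponents(url: base, resolvingAgainstBaseURL: false)
    components?.path = "/v1/resolve/\(itemId)"
    if let versionId = versionId {
        components?.queryItems = [URLQueryItem(name: "version", value: versionId)]
    }
    return components?.url
}

let base = URL(string: "https://sphynx.example")!   // hypothetical server
let plain = resolveURL(base: base, itemId: "it_42")
let pinned = resolveURL(base: base, itemId: "it_42", versionId: "v_4k")
```

Omitting the version id plays versions[0]; a client that sends one should be ready for a 404 if the id is unknown.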
Avoid ttl unless you genuinely need it — highly recommended on both sides. Because resolve is already called fresh at play time and never cached, freshness is handled by simply re-resolving — ttl adds a moving part with its own failure mode (a link that expires mid-session).
- Server: do not set ttl unless the underlying link is truly time-bounded (a signed/expiring CDN or object-store URL). For plain, non-expiring URLs leave it absent, as the built-in http/local drivers do.
- Client: don't build logic around ttl. Resolve at play time and, if a fetch fails, just re-resolve. The one case it earns its keep is a long extended pause (e.g. paused for hours, then resumed) where a signed URL obtained at play-start may have expired; there, use ttl to pre-emptively re-resolve before resuming.
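The long-pause case is the only place a client needs ttl, and it reduces to one comparison. The sketch below is illustrative: `shouldReResolve` and its `safetyMargin` cushion are hypothetical client-side choices, not protocol fields.

```swift
import Foundation

/// Decides whether to re-resolve before resuming after a pause.
/// `ttl` is the descriptor's optional lifetime in seconds; nil means no expiry.
/// `safetyMargin` is a hypothetical cushion so a link is not used at the
/// very edge of its lifetime.
func shouldReResolve(resolvedAt: Date, ttl: TimeInterval?, now: Date,
                     safetyMargin: TimeInterval = 30) -> Bool {
    guard let ttl = ttl else { return false }   // absent ttl = no expiry
    return now.timeIntervalSince(resolvedAt) >= ttl - safetyMargin
}
```

A player would call this from its resume handler and, on true, fetch a fresh descriptor before continuing; every other failure path stays "fetch failed, re-resolve".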
404 not_found (no such item) / no_media_source (item's source unavailable).
Playstate
Per-user resume tracking, row-scoped to the authenticated subject — a user only ever reads/writes their own state. Positions are in seconds. All require auth.
Body { "position":12.5 } → 204.
Body { "position":1342.5, "paused":false } → 204.
Body { "position":1290.0, "duration":1290.0, "failed":false } → 204. On failed:true the server does not overwrite the stored resume point — a misfire (the playhead never advanced past startup) can't clobber a good position — and nothing below applies. duration is the playing media's full length in seconds as the player knows it, and clients SHOULD send it: it beats the item's nominal metadata runtime when classifying the stop (TMDB lists a TV episode's broadcast slot, so a "25-minute" episode is often a ~21-minute file — without duration, finishing the file reads as ~86% and never marks watched). A non-failed stop is resolved against duration (falling back to runtime), per user: stopping in the last 5% (≥95%) marks it watched, clears resume (drops out of Continue Watching), and counts the play (scrobble-at-the-end, like Jellyfin/Plex); stopping in the first 5% (≤5%) marks it unwatched, clears resume, and does not count a play (a false start is discarded); anything in between stores the resume point and counts the play. With no usable length at all, every non-failed stop is a partial watch.
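The stop classification described above can be sketched as a pure function. This is an illustration of the stated thresholds, not the reference server's code; `StopOutcome` and `classifyStop` are hypothetical names.

```swift
enum StopOutcome: Equatable {
    case ignored                   // failed:true, stored resume point untouched
    case watched                   // resume cleared, play counted
    case unwatched                 // resume cleared, play not counted
    case partial(resume: Double)   // resume stored, play counted
}

/// `duration` is the length the player reported; `runtime` is the item's
/// nominal metadata runtime. Both are in seconds and both may be missing.
func classifyStop(position: Double, duration: Double?, runtime: Double?,
                  failed: Bool) -> StopOutcome {
    if failed { return .ignored }
    // Prefer the player's duration; fall back to the metadata runtime.
    let lengths = [duration, runtime].compactMap { $0 }.filter { $0 > 0 }
    guard let length = lengths.first else {
        return .partial(resume: position)   // no usable length at all
    }
    let fraction = position / length
    if fraction >= 0.95 { return .watched }
    if fraction <= 0.05 { return .unwatched }
    return .partial(resume: position)
}
```

With the episode example from the text, a stop at 1290 s classifies as watched when duration is 1290 s, but as a partial watch when only a 1500 s runtime is known.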
200 → { "position":1342.5, "updatedAt":"…" }. No stored state → { "position":0, … } ("from start").
Batch read. 200 → { "states": { "it_1": { "position":…, "updatedAt":… } } }. Items with no stored state are omitted.
Clear resume / remove from Continue Watching. Deletes the caller's stored playstate for the item, so its resumePosition reads back as 0 and it drops out of /v1/home/continue. 204; idempotent. Only ever affects the caller's own row.
Reset the caller's entire watch history (cross-device). Clears all stored resume positions and per-item state (watched flag, play count, last-played) for the authenticated user, everywhere. Only ever affects the caller's own rows; idempotent. 200 → { "cleared": 12 } (rows removed). This is the "reset my watch history" action on the /user page.
resumePosition is folded into item responses for the authenticated user as a convenience snapshot — but it does not move updatedAt, so a cached value can be stale. /v1/playstate is the authoritative source: read it (single or batch) when you need the current position (e.g. to resume playback), and use the folded resumePosition for display hints only.
The typed home feed: the ordered shelves that make up the user's home screen. 200 → { "shelves": [ { "id", "title", "kind", "aspect", "items":[…] } ] }. Each shelf carries a kind (open enum: continueWatching, recentlyAdded, favorites, plus the layout kinds genre and releaseDecade) and an aspect (portrait | landscape | square) telling the client the tile shape — so which rows are landscape is contract, not convention. Empty shelves are omitted; each shelf shows a capped preview, paged in full via its per-row endpoint below. Continue Watching is unified — there is deliberately no nextUp kind: the next unwatched episode of a show you've started is merged into continueWatching, one row, never a separate "Next Up".
For a genre / releaseDecade shelf the row parameter rides in Shelf.id as genre:<Name> (e.g. genre:Action) or decade:<startYear> (e.g. decade:1980), so a client pages it without extra state. Both are open-enum additions: a client that doesn't recognise the kind can still render the shelf's title + items, or skip it. The home layout is configurable (see below) — the rows above are the default, not a fixed set.
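To make the Shelf.id convention concrete, here is a hedged sketch of extracting the row parameter on the client. `ShelfRow` and `shelfRow` are hypothetical; anything the client does not recognise falls through to a neutral case, so the shelf can still be rendered from its title and items.

```swift
enum ShelfRow: Equatable {
    case genre(String)
    case decade(Int)
    case other          // fixed rows, or kinds this client does not know
}

/// Extracts the row parameter that rides in Shelf.id for the
/// genre and releaseDecade kinds.
func shelfRow(kind: String, id: String) -> ShelfRow {
    // Split at the first colon only, so a genre name may itself contain ":".
    let parts = id.split(separator: ":", maxSplits: 1,
                         omittingEmptySubsequences: false)
    guard parts.count == 2 else { return .other }
    let value = String(parts[1])
    switch (kind, String(parts[0])) {
    case ("genre", "genre") where !value.isEmpty:
        return .genre(value)
    case ("releaseDecade", "decade"):
        return Int(value).map(ShelfRow.decade) ?? .other
    default:
        return .other
    }
}
```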
The full, paginated Continue Watching row: the user's in-progress items (stored position > 0) plus the next unwatched episode of each show they've started — one unified list, most-recently-played first, each with resumePosition folded in (0 for a next-up episode). Cursor-paginated; detail selects skeleton/full. Returns the same ItemsResponse shape as /v1/items. Resume wins (an in-progress episode suppresses its show's next-up); specials (season 0) don't generate a next-up.
The server only stores and exposes the data (per-user position + updatedAt, ordered by recency) — the client owns presentation and policy: it has each item's runtime, so it decides what counts as "finished", whether to hide it, how to sort.
Recently Added: top-level items (movies + series) newest first, per-user state folded in. Cursor-paginated; detail selects skeleton/full. Same ItemsResponse shape.
The caller's favourited items, most-recently-played first. Cursor-paginated; same ItemsResponse shape.
Page a genre row — top-level items tagged with <genre>, newest first. name is required (400 if absent). Cursor-paginated; detail selects skeleton/full. Same ItemsResponse shape.
Page a decade row — top-level items released in the ten years from <year> (e.g. start=1980 → 1980–1989), newest first. start is required (400 if absent). Cursor-paginated; same ItemsResponse shape.
The genres available for building genre rows across the caller's libraries. 200 → { "genres": ["Action", …] }.
Configurable home layout
The home screen is a layout of ordered shelves: there's an admin default every user sees, plus an optional per-user override. Both are expressed as a list of HomeShelfDTO:
{ "id":"genre:Action", "kind":"genre", "title":"Action",
"genre":"Action", "decade":null, "aspect":"portrait", "enabled":true }
id/kind/title/aspect mirror the shelf in GET /v1/home; genre and decade carry the row parameter for those kinds (null otherwise); enabled lets a layout keep but hide a row. A config response is { "shelves": [HomeShelfDTO, …], "customized": <bool> } — customized is true when a per-user layout overrides the default (drives a "Reset to default" button). A request body is { "shelves": [HomeShelfDTO, …] }. Empty rows (a genre or decade with nothing in the user's libraries) are dropped from GET /v1/home automatically, so a layout can list more rows than render.
Read / replace / clear the caller's own layout (a HomeConfigResponse). DELETE reverts to the admin default (customized returns to false). The default layout all users see is managed by the admin via GET/PUT /v1/admin/home, with GET /v1/admin/genres → { "genres": [String] } listing every genre present for building genre rows (see the admin API).
A person's filmography: the distinct movies and series the person is credited in the cast of (the inverse of an item's cast array), for a person-detail screen. personId is a cast-entry id of the form pe_<tmdbId>. Returns the standard ItemsResponse with images.primary and the normal projection, cursor-paginated and gated by the same per-library read permissions; sorted newest-first by premiere/production date then title. The lookup is cast-only (crew aren't stored with person ids). A well-formed id always returns 200 with a possibly-empty list (the server keeps no person registry); 404 is reserved for a malformed id.
Incremental sync without re-listing the library. Params: since (epoch seconds or an RFC 3339 timestamp; default 0 = full sync), plus cursor/limit/detail. 200 → { "changes":[…items…], "tombstones":[{ "id", "deletedAt" }], "until":"…", "nextCursor"? }. changes are items whose client-rendered data changed after since (same notion as updatedAt — not playstate), in change-time order, permission-filtered to readable libraries. tombstones are deletions in the window (id-only, returned in full, not permission-filtered — the item is gone, so nothing to leak); drop those ids from the cache. The loop: start since=0, drain all pages of a window via nextCursor with the same since, then store until and pass it as the next since — until carries sub-second precision so the loop is gap-free.
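The sync loop above might look like the following sketch. Transport is injected as a closure so the paging logic stands alone; `ChangesPage` reduces items to ids for brevity, and taking until from the final page of a window is an assumption (the text does not say whether it differs between pages).

```swift
// A simplified page of the changes feed; `changedIds` stands in for full items.
struct ChangesPage {
    var changedIds: [String]
    var tombstoneIds: [String]
    var until: String
    var nextCursor: String?
}

/// Drains one sync window: pages with the same `since` until `nextCursor`
/// is absent, then returns the `until` to store as the next `since`.
func drainWindow(since: String,
                 fetch: (_ since: String, _ cursor: String?) throws -> ChangesPage,
                 apply: (ChangesPage) -> Void) rethrows -> String {
    var cursor: String? = nil
    while true {
        let page = try fetch(since, cursor)
        apply(page)   // upsert changes, drop tombstoned ids from the cache
        guard let next = page.nextCursor else { return page.until }
        cursor = next
    }
}
```

A client would persist the returned value only after the whole window has been applied, so an interrupted sync simply restarts from the previous since.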
Events (server-sent)
An additive server→client stream of live updates over Server-Sent Events. It exists so a client can keep UI fresh — continue-watching, now-playing, watched/favorite sync, a "library changed" nudge — without polling, and it never replaces the access-controlled REST endpoints. Advertised by capabilities.events; a client that ignores it (or a server that doesn't offer it) keeps working by polling.
Opens a text/event-stream. The connection is scoped to the authenticated subject, and each event is filtered by access: per-user events reach only the subject's own connections; item/library events reach only connections that may read that library (a null library is admin-only — the same fail-closed rule as item reads). A comment heartbeat (: ping) is sent ~every 15s (SPHYNX_EVENTS_HEARTBEAT) to keep the connection warm; clients reconnect with the browser EventSource default.
# An EventSource-style stream (one blank-line-delimited frame each):
: connected

event: playstate
data: {"type":"playstate","itemId":"it_42","position":531.0,"ts":1719536400.1}

event: useritemstate
data: {"type":"useritemstate","itemId":"it_42","watched":true,"isFavorite":false,"playCount":3,"ts":1719536402.5}

event: library
data: {"type":"library","libraryId":"lib_tv","action":"scanned","ts":1719536500.0}

: ping
Each non-comment frame carries event: <type> and a one-line JSON data: payload with a stable type discriminator and ts (epoch seconds). Unknown types and unknown fields are ignorable, so new event kinds stay forward-compatible; nil fields are omitted.
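For clients without a built-in EventSource, the framing above can be parsed by hand. The sketch below is a simplified illustration: `SSEFrame` and `parseSSE` are hypothetical names, it assumes LF line endings, and it handles only the event and data fields used here.

```swift
struct SSEFrame: Equatable {
    var event: String
    var data: String
}

/// Parses a text/event-stream buffer into frames. Comment lines (starting
/// with ":"), such as ": ping", are skipped. A frame is emitted only once
/// its terminating blank line has been seen.
func parseSSE(_ text: String) -> [SSEFrame] {
    var frames: [SSEFrame] = []
    var event = "message"            // the SSE default event name
    var dataLines: [String] = []

    for rawLine in text.split(separator: "\n", omittingEmptySubsequences: false) {
        let line = String(rawLine)
        if line.isEmpty {            // a blank line ends the current frame
            if !dataLines.isEmpty {
                frames.append(SSEFrame(event: event,
                                       data: dataLines.joined(separator: "\n")))
            }
            event = "message"
            dataLines = []
            continue
        }
        if line.hasPrefix(":") { continue }   // comment or heartbeat
        let parts = line.split(separator: ":", maxSplits: 1,
                               omittingEmptySubsequences: false)
        let field = String(parts[0])
        var value = parts.count > 1 ? String(parts[1]) : ""
        if value.hasPrefix(" ") { value.removeFirst() }
        if field == "event" {
            event = value
        } else if field == "data" {
            dataLines.append(value)
        }
    }
    return frames
}

let sampleStream = """
: connected

event: playstate
data: {"type":"playstate","itemId":"it_42","position":531.0,"ts":1719536400.1}

: ping

event: library
data: {"type":"library","libraryId":"lib_tv","action":"scanned","ts":1719536500.0}


"""
let frames = parseSSE(sampleStream)
```

Each frame's data is then decoded as JSON and dispatched on its type, skipping types the client does not recognise.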
| type | Audience | Emitted when | Key fields |
|---|---|---|---|
| playstate | subject | start / progress / stop reported | itemId, position |
| useritemstate | subject | watched/favorite set, or a play recorded on stop | itemId, watched, isFavorite, playCount |
| markers | library readers | a marker contribution is stored | itemId, libraryId |
| library | library readers | a scan completes, or a library is added/updated/removed | libraryId, action |
| heartbeat | — | keep-alive | sent as an SSE comment, not a data: frame |
Events are liveness, not a second source of truth. markers / library are nudges: on receipt a client re-fetches via the normal access-controlled endpoint (/v1/home/recent, /v1/items/{id}/markers) rather than trusting the event payload as data. This keeps the stream cheap and the access rules in exactly one place.
Extending the stream. type is an open discriminator and unknown types/fields are ignorable, so the event set grows with no wire-version bump and no new capability — a server just emits another type (or new fields) and older clients skip what they don't recognise. Keep two invariants: every frame carries a stable type + ts (nil fields omitted), and the event stays a nudge (small payload, re-fetch the authoritative endpoint) so a client that doesn't handle it yet loses nothing. The one per-event decision is audience — scope it to the subject for personal data or to library-readers for shared data, so delivery stays fail-closed. capabilities.events stays a single boolean and deliberately doesn't enumerate types, because clients never depend on a specific event existing.
Errors
Every non-2xx response uses this envelope:
{ "error": { "code":"unauthorized", "message":"Token expired.", "retryable":false } }
Clients branch on code, not message. Codes in use: unauthorized, forbidden, not_found, no_media_source, rate_limited, server_error, unavailable, and the open values bad_request and conflict. Unknown codes must be tolerated.
The optional error.retryAfter (seconds) is a backoff hint — set only where the server knows one (rate_limited / HTTP 429, unavailable / HTTP 503), and omitted otherwise. When present, the same value is also sent as the standard HTTP Retry-After header. Prefer honoring it over guessing; treat its absence as "no specific guidance".
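A hedged sketch of decoding the envelope and branching on code. The `Recovery` policy is a hypothetical client choice; in particular, using retryable as the fallback for unknown codes is an assumption about how a client might interpret that flag.

```swift
import Foundation

struct ErrorEnvelope: Decodable {
    struct Body: Decodable {
        let code: String          // open enum: keep the raw string
        let message: String?      // for humans only, never branched on
        let retryable: Bool?
        let retryAfter: Double?   // seconds, optional backoff hint
    }
    let error: Body
}

enum Recovery: Equatable {
    case refreshToken
    case retry(after: Double?)    // nil = no specific guidance
    case giveUp
}

func recovery(for envelope: ErrorEnvelope) -> Recovery {
    let error = envelope.error
    switch error.code {
    case "unauthorized":
        return .refreshToken
    case "rate_limited", "unavailable":
        return .retry(after: error.retryAfter)
    default:
        // Unknown codes must be tolerated: fall back to the retryable flag.
        return error.retryable == true ? .retry(after: error.retryAfter) : .giveUp
    }
}

func decodeEnvelope(_ json: String) throws -> ErrorEnvelope {
    try JSONDecoder().decode(ErrorEnvelope.self, from: Data(json.utf8))
}
```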
Item shape
All fields except id, title, type are optional; the server sends what it has. The canonical set is deliberately broad — matching what mainstream clients display — so a client can rely on these names; anything a field-rich server adds beyond them rides in extra. A skeleton item carries the tile fields (images, placeholder, year, dateAdded) and omits the heavier enrichment (overview, genres, ratings, cast, studios, …).
{
"id":"it_…", "type":"movie", "title":"Blade Runner 2049", "tmdbId":"335984",
"originalTitle":"…", "sortTitle":"…", "tagline":"…",
"overview":"…", "year":2017, "runtime":9840.0,
"images": { "primary":"…", "backdrop":"…", "thumb":"…", "logo":"…", "banner":"…" },
"placeholder": { "url":"…/tiny.jpg" },
"seriesId":"…", "seriesTitle":"…", "seasonIndex":1, "episodeIndex":3, "childCount":10,
"versions":[ { "id":"v_…", "label":"4K · HDR10 · Remux", "resolution":"4K", "container":"mkv" } ],
"parentId":"it_…", "collectionId":"it_…", "collectionTitle":"…",
"genres":["Sci-Fi"], "communityRating":8.0, "criticRating":88, "officialRating":"R",
"cast":[ { "id":"pe_…", "name":"Ryan Gosling", "role":"K", "imageURL":"…",
"placeholder":{"url":"…/tiny.jpg"} } ],
"directors":["…"], "writers":["…"], "studios":["…"], "countries":["…"], "tags":["…"],
"trailers":["https://…"], "chapters":[ { "start":0.0, "title":"Intro" } ],
"status":"Released", "premiereDate":"2017-10-06", "endDate":"…", "dateAdded":"2026-06-27T12:00:00Z",
"externalIds": { "imdb":"tt1856101", "tvdb":"…" },
"resumePosition":1342.5, "watched":true, "playCount":3, "isFavorite":true, "userRating":9.0, "lastPlayedAt":"2026-06-27T12:00:00Z",
"updatedAt":"2026-06-27T12:00:00Z",
"extra": { "anything":[1,2,3] }
}
Every field above except id, title, and type is optional and omitted when empty; a client renders gracefully without any of them. The reference server fills the TMDB-derived ones (overview, genres, communityRating, officialRating, runtime, tagline, studios, directors/writers, externalIds.imdb, premiereDate/endDate, status, sortTitle, tags, trailers, cast (for both movies and TV series/episodes), images incl. logo/banner, plus parentId/collectionId). officialRating is the content certification ("PG-13" / "TV-MA"), from the US entry of TMDB's release_dates (movies) / content_ratings (TV). chapters are filled for any item probed by the media-probe extension (ffprobe -show_chapters); chapters are container-level, so TMDB has none. The one canonical field it never fills is criticRating: TMDB exposes only an audience score (vote_average → communityRating), not a critic aggregate like Metacritic/RT, so that needs a different source (e.g. an OMDb-backed extension) or rides in extra. externalIds is an open map keyed by namespace (imdb, tvdb, …); dateAdded powers a "Recently Added" view.
resumePosition, watched, playCount, isFavorite, userRating, and lastPlayedAt are per-user state, folded in for the authenticated caller (set via PUT /v1/items/{id}/state). Like resumePosition they are not enrichment, so they appear in both skeleton and full and never make a skeleton look "enriched".
Image roles. images carries several neutral roles, all optional, each with a defined orientation so a client knows which to use for a portrait tile vs a landscape one:
- primary: portrait poster (~2:3). Exception: an episode's primary is its landscape still (episodes have no poster).
- backdrop: landscape (~16:9), large; full-bleed hero / background.
- thumb: landscape (~16:9), card-sized; for horizontal tiles & rows (e.g. Continue Watching). It is not a small poster.
- logo: transparent title logo (wide).
- banner: wide banner strip.
The reference server fills, per type: movies / series → primary (poster) + backdrop and thumb (both from the TMDB backdrop, large and card-sized) + logo/banner when TMDB has them; seasons → season poster as primary + backdrop/thumb inherited from the show's wide art; episodes → the still as primary/thumb (already landscape) + the show's backdrop. So every enriched item carries both a portrait option (primary, except episodes) and a landscape option (thumb + backdrop). A client building a horizontal row uses thumb (card) or backdrop (full-bleed); a portrait grid uses primary.
Per-image variants. Alongside the flat URL fields, images.variants is an optional map keyed by role name (primary, backdrop, …) carrying per-image metadata — a low-res placeholder to blur up from and an aspect hint (width ÷ height: ~0.667 portrait, ~1.778 landscape) — so a client can lay out and blur-up each image independently, not just the poster. In blurhash mode each role carries its own hash (a landscape backdrop blurs up from its own BlurHash, not the poster's). The flat role fields stay the URL source of truth (a client reading only images.primary keeps working); variants is additive and the map is open (unknown role keys tolerated). The reference server fills a variant for every role it serves; width/height are reserved for servers that know exact dimensions.
updatedAt (RFC 3339) is the last change to client-rendered data for the item (title, images, enrichment, markers…) — the max of the server's per-field change times. A client can diff this one value to decide "changed since I cached it?" without comparing every field. It excludes per-user playstate (resumePosition), so progress reports don't invalidate the cache.
placeholder is a self-describing one-of that may carry any low-res form. The reference server emits the blurHash form by default — a compact BlurHash string the client decodes locally, generated for every image (poster, backdrop, still, logo, banner, and cast faces) by a lazy background pass — and can be switched to the url form (a small pre-sized image link) or off via the low-res images extension; the protocol equally allows a future form. Clients should support both blurHash and url, using whichever the server sent, and fall back to a plain background for forms they don't recognize.
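One way to model the self-describing one-of on the client is an enum with a catch-all case, sketched below. The `Placeholder` type is hypothetical, and checking blurHash before url is an arbitrary choice. It assumes every form arrives as a JSON object, as the url form does in the Item example.

```swift
import Foundation

enum Placeholder: Decodable, Equatable {
    case blurHash(String)
    case url(String)
    case unknown            // a future form: fall back to a plain background

    private enum CodingKeys: String, CodingKey { case blurHash, url }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        if let hash = try container.decodeIfPresent(String.self, forKey: .blurHash) {
            self = .blurHash(hash)
        } else if let link = try container.decodeIfPresent(String.self, forKey: .url) {
            self = .url(link)
        } else {
            self = .unknown
        }
    }
}

func decodePlaceholder(_ json: String) throws -> Placeholder {
    try JSONDecoder().decode(Placeholder.self, from: Data(json.utf8))
}
```

A renderer then switches over the three cases: decode the hash locally, load the small image, or draw a plain background.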
Open metadata (extra)
The canonical fields above are the neutral contract: each has a fixed meaning and unit; a client only maps the name to whatever it calls the field internally. For anything beyond the canonical set, an item may carry an extra object of arbitrary server-defined metadata. A client reads the keys it understands and ignores the rest. Together with the forward-compatibility rules, this is what lets a server serve whatever metadata it wants while older clients keep working. extra is omitted entirely when empty.
Planned
Search (capabilities.search) is the one protocol-defined capability left optional and unimplemented here — but it's a deliberate non-goal for this server rather than a to-do; the Search section explains why the client is the better place for it. (Ranked candidates in the /resolve descriptor are implemented and advertised capabilities.candidates: true.)
Implementing the Protocol
The reference server is one implementation; the wire contract is the real product. This section is for anyone building a different client (a player, a web UI) or a different server (a third-party backend) that needs to interoperate. The HTTP shapes in the API Reference are the contract; here we cover the rules that make independent implementations interoperate.
Principles to honor on both sides
| Rule | Why it matters |
|---|---|
| HTTPS + JSON, base path /<version> | Breaking changes bump the version (/v2); additive changes never do. |
| Opaque IDs | Treat ids as cookies. Never parse, never assume a format. it_… / lib_… prefixes are conventions, not contracts. |
| Seconds everywhere | All positions/durations are floating-point seconds. Convert to your internal unit only at the boundary. |
| RFC 3339 timestamps | Wall-clock times (updatedAt, token expiry context) are RFC 3339 strings (a profile of ISO 8601). |
| Cursor pagination | cursor in, nextCursor out. Absent nextCursor = end. Cursors are opaque. |
| Consistent error envelope | Branch on error.code, never error.message. |
Building a client
A minimal Sphynx client needs to walk this path. Each step maps to one endpoint:
- Discover: GET /v1/info (no auth). Confirm product == "Sphynx" and that protocol includes a version you speak before showing a login screen. Read capabilities to decide which features to surface; in particular it is highly recommended to read capabilities.fields and inform the user of unsupported features (e.g. grey out or hide a "Trailers" row when trailers isn't listed) rather than silently showing empty UI.
- Authenticate: POST /v1/auth/login with a stable X-Sphynx-Device header. Store both tokens. When a call returns 401 with code: "unauthorized", call POST /v1/auth/refresh once, then retry; if refresh also fails, send the user back to login.
- Browse: GET /v1/libraries, then GET /v1/items?parent=…. Render tiles from detail=skeleton; fetch detail=full (or GET /v1/items/{id}) when the user opens an item. Page with nextCursor.
- Resolve at play time: GET /v1/resolve/{id} the moment the user hits play, never earlier. Stream url yourself, sending any headers. If a fetch fails, just re-resolve; don't build logic around ttl (it's best avoided except for long extended pauses). If terminal is true, url is the final location: fetch it directly, no further resolve step needed.
- Report playstate: POST start / periodic progress / stop. On a failed/aborted start, send stop with failed:true so you don't clobber a good resume point.
- Contribute (optional): check GET /v1/auth/me (not /v1/info) for your effective write access before showing a "fix this" affordance; PUT markers when granted.
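The "refresh once, then retry" rule from the Authenticate step can be sketched independently of any HTTP stack. Everything here is illustrative: `CallResult`, `SessionError`, and `withRefreshOnce` are hypothetical names, and the closures stand in for real network calls (which would normally be async).

```swift
enum CallResult<Value> {
    case success(Value)
    case unauthorized          // HTTP 401 with code "unauthorized"
}

enum SessionError: Error { case loginRequired }

/// Runs `call`; on an unauthorized result, refreshes exactly once and
/// retries. If the refresh or the retry fails, the user must log in again.
func withRefreshOnce<Value>(call: () throws -> CallResult<Value>,
                            refresh: () throws -> Bool) throws -> Value {
    if case .success(let value) = try call() { return value }
    guard try refresh() else { throw SessionError.loginRequired }
    if case .success(let value) = try call() { return value }
    throw SessionError.loginRequired
}
```

Capping the refresh at one attempt per call avoids a retry loop when the refresh token itself has been revoked.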
Adapter pattern. A client that already models media internally needs only a thin translation layer — it never exposes its internal types. Map resolve.url + headers to a ready-to-stream playback request; position fields to your internal time unit; placeholder.{blurHash,url} to a self-describing low-res type; playstate/* to your start/progress/stop hooks; markers.intro to a "Skip Intro" affordance. Ocelot speaks Sphynx through exactly one such adapter, implementing the wire directly from this documentation rather than importing a package.
Optionally sharing the protocol types (Swift clients)
This documentation is the client contract — its field names, units (seconds), and JSON shapes are what you map against, and a thin hand-written Decodable layer is trivial and safe given the optional fields and unknown-tolerant enums (Ocelot does exactly this). If you'd rather not hand-roll the types and your client is Swift, you may instead depend on the sphynx-protocol package to reuse the server's exact value types. Because the package isn't at the repo root, add it as a local path dependency against a checkout:
// Package.swift
dependencies: [
.package(path: "../Sphynx-Media/sphynx-protocol")
],
targets: [
.target(name: "MyClient", dependencies: [
.product(name: "SphynxProtocol", package: "sphynx-protocol")
])
]
The package is Foundation-only and dependency-free, so it builds on every Apple platform and Linux without dragging in a server stack.
Building a server
Any HTTP server can speak Sphynx. To interoperate with existing clients, implement at least the spine — discovery, auth, browse, resolve — and advertise honestly in /v1/info:
- GET /v1/info must be unauthenticated and return product: "Sphynx", a stable id, and a protocol array. Advertise only capabilities you actually implement; clients treat a missing capability as false/none.
- Auth shape is fixed but the crypto is yours. Issue a short-lived access token + a rotating refresh token; rotate the refresh token on every use and invalidate the old one. Scope tokens to X-Sphynx-Device so one device can be revoked alone. Hash passwords with a proven KDF (the reference server uses bcrypt). Row-scope all per-user data to the token's subject.
- Resolve is the whole point. Return a direct, client-fetchable url plus any headers. Never proxy or redirect bytes through your process. Leave ttl absent unless your link is genuinely time-bounded (a signed/expiring CDN URL); since clients re-resolve fresh each play, a ttl on a non-expiring URL only adds a failure mode (see the resolve note).
- Pick your access policy. capabilities.metadata is per-field none|read|readwrite. A field absent from the map is none. If you accept contributions, also implement the per-user grant model so /v1/auth/me can report effective access.
- Serve what you have. Every Item field except id/title/type is optional. Omit what you lack; put anything non-canonical in extra. Don't invent required fields; clients won't expect them.
- Advertise your coverage (highly recommended). List every canonical Item field you can populate in capabilities.fields. It costs nothing and lets clients tell the user what your server backs up front, instead of discovering gaps by inspecting items. Keep it honest: list only fields you actually fill.
The reference server's trick: its request/response bodies are the SphynxProtocol value types, so it physically cannot drift from the wire format. If you build in Swift, do the same and the compiler enforces conformance for you.
Forward compatibility — the rules that let everyone evolve
These four rules are what make independent client/server versions interoperate. Implement them on both sides:
- Ignore unknown top-level fields. A newer server may add fields your decoder doesn't model. Drop them silently; never error.
- Tolerate unknown enum strings. type, kind, capabilities.metadata values, error codes are all open enums. Decode an unrecognized value to an .unknown(raw) case and map it to a neutral default; never crash.
- Treat all metadata as optional. The absence of a field means "not available," not "error." Render gracefully without it.
- Read extra opportunistically. Pull keys you understand from item.extra; ignore the rest. This is how a server exposes new metadata to new clients without breaking old ones: no negotiation, no versioning.
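A hand-written decoding layer that follows the first three rules could look like this sketch. It is not the sphynx-protocol package's actual types: `ItemType` and `Item` are hypothetical, only "movie" is taken from the Item example, and the other raw strings are assumed. Decoding extra is left out because it needs a generic JSON value type.

```swift
import Foundation

// An open enum: unrecognised strings land in .unknown instead of failing.
enum ItemType: Decodable, Equatable {
    case movie, series, season, episode
    case unknown(String)

    init(from decoder: Decoder) throws {
        let raw = try decoder.singleValueContainer().decode(String.self)
        switch raw {
        case "movie": self = .movie
        case "series": self = .series      // assumed wire value
        case "season": self = .season      // assumed wire value
        case "episode": self = .episode    // assumed wire value
        default: self = .unknown(raw)
        }
    }
}

// Only id, title, and type are required; everything else is optional.
// Synthesized Decodable ignores keys the struct does not declare.
struct Item: Decodable {
    let id: String
    let title: String
    let type: ItemType
    let year: Int?
    let overview: String?
}

let futureJSON = """
{ "id":"it_1", "type":"hologram", "title":"Alien",
  "futureField": { "anything": [1, 2, 3] } }
"""
let futureItem = try JSONDecoder().decode(Item.self, from: Data(futureJSON.utf8))
```

The payload above uses a type and a top-level field this client has never seen, yet it decodes without error.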
Open questions the protocol intentionally leaves to implementations: whether ttl is advisory or enforced and how a client re-resolves mid-session; who owns failover ordering when candidates exist; image size hints (?w=600) vs. one canonical size; whether markers are inlined in /resolve or fetched separately; fallback identity (IMDB/TVDB/hash) when TMDB can't resolve; and a future binary wire encoding. Don't hard-code assumptions about these — they may shape future versions.
Protocol extension points (wire contract)
Sphynx is designed so that capabilities can grow without breaking anyone. This section covers the wire-level extension plane — metadata: reading new fields and client-contributing them. The other two planes are server-side and live in Part II: the resolve descriptor (new storage backends via drivers, plus resolve-time transforms like transcoding/subtitles) and client-side features the server stays thin under (co-watch / SharePlay).
The throughline: the protocol's canonical fields are neutral and optional, and everything else is open. A server serves whatever metadata it has; a client consumes what it understands and ignores the rest.
0. The access model
GET /v1/info advertises a per-field access policy. MetadataAccess is an open enum: none | read | readwrite (+ unknown future values). A field absent from the map means none — the client may still read whatever the server includes on an item, but there's no contribution endpoint advertised. The reference server sets it from config (SPHYNX_MARKERS_ACCESS).
capabilities.metadata advertises what the server supports. The write itself is gated by a per-user permission the admin grants (e.g. metadata.markers.write), set via PUT /v1/admin/users/{id}/permissions, so two users on the same server can have different write access. A client learns its own effective permissions + access from GET /v1/auth/me. See the Permissions table for the full model.
A client should check /v1/auth/me before offering a "fix this"/"contribute" affordance (not /v1/info), and gracefully degrade when its effective access is read/none. Effective write = server advertises the field readwrite and the user holds the field's write permission (the admin always does).
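The effective-write rule can be written as a small predicate. This is a sketch under assumptions: the shape of the /v1/auth/me response is not shown in this section, so the permission set and admin flag are hypothetical inputs, and the permission name pattern generalises the metadata.markers.write example.

```swift
enum MetadataAccess: Equatable {
    case noAccess, read, readwrite
    case unknown(String)

    /// A field absent from capabilities.metadata means no access.
    init(raw: String?) {
        guard let raw = raw else { self = .noAccess; return }
        switch raw {
        case "none": self = .noAccess
        case "read": self = .read
        case "readwrite": self = .readwrite
        default: self = .unknown(raw)   // future values: treat as not writable
        }
    }
}

/// Effective write = the server advertises the field as readwrite AND the
/// user holds the field's write permission (the admin always does).
func canContribute(field: String,
                   advertised: [String: String],   // capabilities.metadata
                   permissions: Set<String>,       // grants from /v1/auth/me (assumed shape)
                   isAdmin: Bool) -> Bool {
    guard MetadataAccess(raw: advertised[field]) == .readwrite else { return false }
    return isAdmin || permissions.contains("metadata.\(field).write")
}
```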
1. Reading new metadata — no extension needed
Clients need no extension to read new metadata. Two mechanisms guarantee it:
- Canonical fields are optional and neutral. When the server starts sending a field a client already models (say officialRating), the client just reads it. The meaning and unit are fixed by the protocol; the client only maps the name.
- Open extra bag + forward-compatible decoding. Anything outside the canonical set rides in item.extra (arbitrary JSON). Unknown top-level fields are ignored; unknown enum strings decode to .unknown(...).
// A server exposing data beyond the canonical schema:
{ "id":"it_1", "type":"movie", "title":"Alien",
"extra": { "imdbId":"tt0078748", "tagline":"In space…", "dolbyVision":true } }
A client that knows extra.dolbyVision shows an HDR badge; one that doesn't ignores it. No versioning, no negotiation, no breakage.
2. Client-side contribution (write)
When a server advertises a field as readwrite, clients may contribute metadata back. This is how data that must be sourced client-side reaches the server and gets shared with everyone.
Worked example: TheIntroDB → Sphynx
TheIntroDB requires a client-only integration: a server must not call it. Sphynx respects that by never fetching it server-side; instead, the client bridges it. TheIntroDB's /v3/media read API is free and unauthenticated; an API key only matters for contributing your own submissions back to TheIntroDB. The client-only rule is about call origin (a server must not fetch it for clients), not about credentials. The bridge works like this:
- Client plays an item with a tmdbId.
- Client (per TheIntroDB's terms) fetches intro/credit markers from TheIntroDB.
- Client checks capabilities.metadata["markers"] == "readwrite" (and its own grant via /v1/auth/me).
- Client contributes them:
PUT /v1/items/{itemId}/markers
Authorization: Bearer <token>
Content-Type: application/json
{ "markers": { "recap":{"start":0,"end":30}, "intro":{"start":75,"end":145},
"credits":{"start":9120} },
"source":"theintrodb", "confidence":0.95 }
Sphynx stores them item-level, so every other client on that server now gets the markers — in GET /v1/items/{id}/markers and folded into GET /v1/resolve/{id} — without each client having to call TheIntroDB.
The contribution carries provenance (source, confidence, contributing user). Client contributions are best-effort: the server records them but marks them non-authoritative, and refuses to let a client overwrite authoritative markers (409 Conflict). Handle 403 (read-only / not granted), 409 (would clobber authoritative), and 404 (not offered) by degrading gracefully.
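A sketch of the client side of a contribution: building the request body and mapping the response status to a graceful outcome. The type names and the `outcome(forStatus:)` helper are hypothetical; the body shape and status codes are the ones described above.

```swift
import Foundation

struct MarkerSpan: Codable, Equatable {
    let start: Double
    let end: Double?          // omitted from the JSON when nil (e.g. credits)
}

struct MarkerContribution: Codable {
    let markers: [String: MarkerSpan]
    let source: String
    let confidence: Double
}

enum ContributionOutcome: Equatable {
    case stored
    case readOnly        // 403: read-only field or permission not granted
    case notOffered      // 404: the server does not offer contributions
    case wouldClobber    // 409: authoritative markers already exist
    case failed(Int)
}

func outcome(forStatus status: Int) -> ContributionOutcome {
    switch status {
    case 200..<300: return .stored
    case 403: return .readOnly
    case 404: return .notOffered
    case 409: return .wouldClobber
    default: return .failed(status)
    }
}

let contribution = MarkerContribution(
    markers: ["recap": MarkerSpan(start: 0, end: 30),
              "intro": MarkerSpan(start: 75, end: 145),
              "credits": MarkerSpan(start: 9120, end: nil)],
    source: "theintrodb",
    confidence: 0.95)
let contributionBody = try JSONEncoder().encode(contribution)
```

None of the non-stored outcomes should surface as an error to the user; the client simply keeps using the markers locally.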
How the server is built
The reference server is a pipeline of small subsystems, each a seam that could later split into its own service. Bytes never enter the pipeline — every stage works on metadata only.
| Subsystem | Role |
|---|---|
| Sources | Admin-configured remote locations to index, each with a driver (http, local, webdav, …) plus non-secret config and withheld secrets. |
| Indexer | Walks a source's driver to list raw entries, then diffs against the catalog to add / remove / update — a cheap, metadata-only pass. The source is the source of truth. |
| Identifier | Turns a raw entry (filename + folder layout) into a confident TMDB id (+ season/episode). The load-bearing part — everything downstream keys off a correct id. |
| Enricher | Given a TMDB id, fetches overview, runtime, genres, ratings, artwork, and cast — cached with a freshness window. |
| Catalog | The queryable item store the API reads from: identity, enrichment, and parent/child structure (series → season → episode). |
| Resolver | The late-bound handoff: asks the item's driver for a direct, fetchable URL at play time. Stores no URL. |
| Users / Playstate | Accounts, auth, per-user permissions, and resume positions — all row-scoped to the authenticated subject. |
| API | The thin HTTP layer that speaks the Sphynx protocol, reusing the protocol value types directly as request/response bodies so the server can't drift from the wire format. |
The two subsystems that carry the most design risk are the Identifier (matching the right title is the perennial media-server hard problem) and the Source/driver model (each new backend is just a new driver). Everything else is well-trodden.
Setup & Run
This section takes you from a clean machine to a running Sphynx server answering requests. No prior Swift experience required.
Prerequisites
- A Swift 6 toolchain.
  - macOS: install Xcode (16+ ships a Swift 6 toolchain). After installing, run xcode-select --install for the command-line tools.
  - Linux: install the Swift toolchain from swift.org/install, or just use the Docker workflow below.
- git — to clone the repository.
- curl — to exercise the API from the terminal (any HTTP client works).
- (Optional) Docker — for the Linux build/test/run loop without installing Swift locally.
- (Optional) A TMDB v3 API key — only needed to auto-identify and enrich titles. Get one free at themoviedb.org. Without it, the server still runs; you just add items manually.
Verify your toolchain:
# Should report Swift 6.x or newer
swift --version
macOS + iCloud Desktop gotcha. If you check the repo out under ~/Desktop or ~/Documents while iCloud Drive sync is on, the .build directory can be synced mid-compile and code-signing fails. Clone somewhere not synced (e.g. ~/code/ or ~/.local/), or move .build off the synced tree with a symlink.
Install & build
# 1. Clone the monorepo (gets both packages + docs)
git clone https://github.com/reckloon/Sphynx-Media.git
cd Sphynx-Media/sphynx-server
# 2. Build it (first build resolves dependencies — may take a few minutes)
swift build
# 3. (Optional) run the test suite
swift test
The server depends on its sibling sphynx-protocol package via the local path ../sphynx-protocol. Cloning the whole monorepo keeps that sibling layout intact — don't move the folders apart.
First run
# From the sphynx-server directory. Set an admin password for the examples below;
# omit SPHYNX_ADMIN_PASSWORD and the server prints a generated one to the log.
SPHYNX_ADMIN_PASSWORD=changeme swift run SphynxServer
# → serves on http://0.0.0.0:9410
In another terminal, confirm it's alive. /v1/info is unauthenticated — it's how a client verifies "this URL is a Sphynx server" before showing any login UI:
curl -s localhost:9410/v1/info
// abbreviated — capabilities.fields and playstateReportInterval omitted here;
// see the full /v1/info example under Discovery
{
"product": "Sphynx",
"serverName": "Sphynx Reference Server",
"id": "srv_reference",
"version": "0.2.5",
"protocol": ["v1"],
"capabilities": { "search": false, "playstate": true, "candidates": true, "events": true, "passkeys": false,
"metadata": { "markers": "readwrite", "images": "read" } }
}
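A client-side check of that response might look like the following Python sketch. Which fields a client should test is an assumption here: the sketch uses `product` and `protocol` because they appear in the sample, and reads the marker capability the same way the contribution flow does. Both function names are hypothetical.

```python
def looks_like_sphynx(info: dict, wanted_protocol: str = "v1") -> bool:
    """True when a decoded /v1/info body identifies a server speaking our protocol."""
    return (
        info.get("product") == "Sphynx"
        and wanted_protocol in info.get("protocol", [])
    )


def can_contribute_markers(info: dict) -> bool:
    """True when the server advertises markers as readwrite."""
    metadata = info.get("capabilities", {}).get("metadata", {})
    return metadata.get("markers") == "readwrite"
```

A client would run the first check before showing any login UI, and the second before attempting a marker write.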
On first run the server bootstraps the admin account (username admin by default). There is no default password: set SPHYNX_ADMIN_PASSWORD to choose one, or omit it and the server generates a strong random password and prints it once to the log — copy it from there. Change it later via POST /v1/auth/password.
Set the admin credentials on the first run (SPHYNX_ADMIN_USERNAME / SPHYNX_ADMIN_PASSWORD only apply when bootstrapping a fresh database). If you don't set a password, grab the generated one from the startup log. See Configuration.
Configuration
Configuration is split in two. A handful of startup / secret values stay environment variables (they're needed before the database exists, or are secrets). Everything else is a persisted runtime setting: the server seeds it from the environment (or a built-in default) on first run, then stores it in the database — so from then on you change it through the admin settings API (and any GUI built on it), not env vars.
SPHYNX_PORT=9000 SPHYNX_ADMIN_PASSWORD='a-strong-secret' swift run SphynxServer
Startup & secrets — environment only
| Variable | Default | Purpose |
|---|---|---|
| SPHYNX_HOST | 0.0.0.0 | Bind address |
| SPHYNX_PORT | 9410 | Bind port |
| SPHYNX_DB_PATH | data/sphynx.sqlite | SQLite file path. The default is a persistent on-disk DB (WAL; the directory is created if missing). Use the special value :memory: for an ephemeral in-memory DB that's lost on restart. In Docker, point this at a mounted volume (see Docker / Linux). |
| SPHYNX_ADMIN_USERNAME | admin | Bootstrap admin (first run only) |
| SPHYNX_ADMIN_PASSWORD | (none) | Bootstrap admin password. Unset ⇒ a strong random one is generated and printed once to the log |
| SPHYNX_TMDB_API_KEY | (empty) | First-run seed only for the TMDB v3 key (empty disables identification/enrichment). The key read at boot is then DB-authoritative and GUI-managed under Settings via GET/PATCH /v1/admin/tmdb; a change applies on the next restart. Not environment-only. |
| SPHYNX_PLAYSTATE_REPORT_INTERVAL | 5 | Seconds; the preferred client progress-report cadence advertised as capabilities.playstateReportInterval |
| SPHYNX_EVENTS_HEARTBEAT | 15 | Seconds between keep-alive pings on the event stream (GET /v1/events) |
Runtime settings — seeded from env on first run, then edited via the API
These are read/written at GET/PATCH /v1/admin/settings. The env var (shown for the first-run seed) and the JSON field name are listed. Most changes apply immediately; a few (e.g. token TTLs) take effect on the next restart. Durations are in seconds.
| Setting (JSON) | First-run env seed | Default | Purpose |
|---|---|---|---|
| serverName | SPHYNX_SERVER_NAME | Sphynx Reference Server | Name reported by /v1/info |
| serverID | SPHYNX_SERVER_ID | srv_reference | Stable id reported by /v1/info |
| accessTokenTTL | SPHYNX_ACCESS_TTL | 3600 | Access-token lifetime |
| refreshTokenTTL | SPHYNX_REFRESH_TTL | 2592000 (30d) | Refresh-token lifetime |
| enrichmentTTL | SPHYNX_ENRICH_TTL | 7776000 (90d) | Server-owned enrichment freshness |
| metadataLanguage | SPHYNX_METADATA_LANGUAGE | en-US | TMDB metadata language (language-COUNTRY); normalizes enriched titles/overviews. Set this before building genre/decade home rows: genres are stored in this language, so a genre row only matches when its name does (e.g. a Spanish library needs a Comedia row, not Comedy; a row that doesn't match shows up empty and is dropped from the home feed). Changing the language later applies live, but re-translate existing titles with Reset enrichment first, then rebuild the rows so the genre names line up. |
| avatarMaxBytes | SPHYNX_AVATAR_MAX_BYTES | 2000000 (2 MB) | Max accepted user-avatar upload size |
| markersAccess | SPHYNX_MARKERS_ACCESS | readwrite | none / read / readwrite (writes still granted per-user) |
| markersStaleAfter | SPHYNX_MARKERS_STALE_AFTER | 604800 (7d) | Age after which markers are reported stale |
| playstateRetention | SPHYNX_PLAYSTATE_RETENTION | 31536000 (365d) | Playstate retention before purge |
| maintenanceInterval | SPHYNX_MAINTENANCE_INTERVAL | 86400 (1d) | Background maintenance interval; 0 disables it |
| passkeyRelyingPartyID | SPHYNX_PASSKEY_RP_ID | (empty) | Passkey (WebAuthn) Relying Party id: the bare domain the server is reached at (no scheme/port). Empty disables passkeys (capabilities.passkeys=false) |
| passkeyRelyingPartyName | SPHYNX_PASSKEY_RP_NAME | (server name) | Display name the authenticator shows during enrollment |
| passkeyRelyingPartyOrigin | SPHYNX_PASSKEY_ORIGIN | https://<rpId> | Expected client origin (with scheme) for ceremony verification. Also the public base URL for the device-code QR / /link page: set it (or the RP id) to a domain a phone can reach, or remote QR sign-in falls back to http://<host>:<port> and won't work off-LAN. |
| webAuthRedirectAllowlist | SPHYNX_WEB_REDIRECT_ALLOWLIST | (empty) | Allowlist of permitted redirect_uris for the web authorization flow: a newline/comma-separated list of exact URIs or scheme prefixes (e.g. ocelot://auth). Empty ⇒ app custom schemes are accepted and http(s) targets rejected |
| signInUserList | SPHYNX_SIGN_IN_USER_LIST | false | When true, exposes the pre-auth profile-picker directory (GET /v1/auth/directory); off by default so a server never enumerates accounts before sign-in |
Env seeds once. After first run the database is authoritative for these settings, so changing their env vars later has no effect — edit them via PATCH /v1/admin/settings instead. (Startup/secret vars above are read from the environment every boot.)
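The seed-once rule can be pictured with a small Python sketch. It is an illustration of the behaviour described above, not the server's code: the `db` dict stands in for the settings table, and `load_setting` is a hypothetical name.

```python
def load_setting(db: dict, env: dict, name: str, env_var: str, default):
    """Return a runtime setting, seeding it from env (or the default) only once."""
    if name not in db:
        # First run: nothing persisted yet, so the environment gets one chance.
        db[name] = env.get(env_var, default)
    # Every later boot: the persisted value wins, whatever the environment says.
    return db[name]
```

For example, booting with `SPHYNX_SERVER_NAME=Attic` and later with `SPHYNX_SERVER_NAME=Cellar` still yields `Attic` on the second boot, because the database already holds a value.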
Web admin page. The server hosts a small admin page at /admin (e.g. http://localhost:9410/admin): sign in as the admin and manage the server in your browser, no env vars or curl needed. It's a static page that drives the /v1/admin/* API.
An always-visible Activity panel tops every page with catalog coverage (items in source vs in database vs enriched), live scan/enrich status, and a Next runs indicator showing when each scheduled background task (enrichment refresh, library index, BlurHash generation, media probe) next fires (from GET /v1/admin/overview + /status). A Breakdown section expands the totals into items per library and enriched by category (collection/movie/series/season/episode and the extras kinds), so the enriched gap is self-explanatory: extras (trailers, featurettes, deleted scenes) index but never enrich, shown with an extras tag rather than a deficit. The per-type counts are exhaustive and sum to the catalog totals.
Seven tabs:
- Libraries: add / delete shelves and the storage sources that feed them (add / scan / delete sources, one clean connection form per driver, mapping a Movies and a TV library per source).
- Users: every account with a per-permission editor including per-library scoping, avatars, and password resets.
- Items: correct a title by hand. Edit metadata with per-field locks against re-scan, re-identify/re-enrich against TMDB, and re-map placement: change type, move to another library, set season/episode, or nest under the right series/season.
- Collections: build manual box sets by hand, grouping movies or series into your own collections alongside the auto-discovered TMDB ones; backed by /v1/admin/collections.
- Home: arrange the default home-screen layout every user sees (order, enable/disable, and add genre / decade rows), backed by GET/PUT /v1/admin/home.
- Settings: the runtime settings above, plus the TMDB API key.
- Extensions: the Diagnostics module (Database / Logs), the optional Media probe (with its own background-probe interval and a "Run now" button), and the Low-res images module (the tile placeholder form, blurhash / url / off, with its own generation interval, a "Generate now" button, and a live progress indicator). Each extension owns its own schedule; intervals take fractional seconds (0 = manual-only).
A companion /user page lets each non-admin user manage their own display name, profile picture, password, and watch-history reset, and (when signInUserList is enabled) renders the opt-in profile-picker sign-in.
For a throwaway test server that leaves nothing behind, set SPHYNX_DB_PATH=:memory:. The database (and your admin account, libraries, items) vanish on restart.
Docker / Linux
There are two ways to run the server in Docker. Almost everyone wants the first one — a pre-built image you just download. The second is for contributors who want to build from source.
Option A — Run the published image (recommended)
The CI pipeline publishes a slim, multi-arch image to GitHub's container registry (GHCR) on every release tag. It's built for linux/amd64 and linux/arm64, so the same name works on a NAS, a VPS, a Raspberry Pi, or Apple Silicon — Docker pulls the matching arch automatically. No source checkout, no compiling.
# Quickest possible smoke test
docker run -p 9410:9410 \
-v sphynx-data:/data \
-e SPHYNX_DB_PATH=/data/sphynx.sqlite \
-e SPHYNX_ADMIN_PASSWORD='secret' \
ghcr.io/reckloon/sphynx-server:latest
# then:
curl http://localhost:9410/v1/info
For anything beyond a smoke test, use a Compose file instead (it keeps your settings in one place) — the README's 2-minute setup has a copy-paste docker-compose.yml that points at this image.
Pin a version instead of riding latest if you want predictable updates — every release also publishes :MAJOR.MINOR and an exact :MAJOR.MINOR.PATCH tag (e.g. ghcr.io/reckloon/sphynx-server:0.4.0).
Updating. Pull the new image and recreate the container — that's the whole process:
docker compose pull
docker compose up -d
Your catalog and accounts live in the sphynx-data volume, not in the container, so they survive untouched. On boot the server runs any new database migrations forward automatically, so a newer image upgrades an older database with no manual step. Note this is roll-forward only: there are no down-migrations, so if you ever need to roll back to an older image after a schema change, snapshot the volume first.
Option B — Build the image from source
For contributors, or if you'd rather not use the published image. The build context must be the parent directory (it needs both packages); the provided files set that up:
# From the sphynx-server directory — builds the image and runs it
docker compose -f docker-compose.build.yml up --build
# then:
curl http://localhost:9410/v1/info
# Or run the test suite inside a Swift Linux container (needs Docker)
./scripts/test-linux.sh # runs `swift test` inside swift:6.3-noble
To build the image by hand:
# Note the trailing `..` — context is the PARENT dir with BOTH packages
docker build -f sphynx-server/Dockerfile -t sphynx-server ..
docker run -p 9410:9410 -e SPHYNX_ADMIN_PASSWORD='secret' sphynx-server
Persist the database. The DB lives at SPHYNX_DB_PATH, which defaults to a path inside the container — so without a volume it's wiped on every rebuild/recreate (the catalog and admin account vanish). The provided docker-compose.yml handles this: it mounts a named volume sphynx-data at /data and sets SPHYNX_DB_PATH=/data/sphynx.sqlite. If you run with bare docker run, add -v sphynx-data:/data -e SPHYNX_DB_PATH=/data/sphynx.sqlite yourself.
Docker Desktop on macOS: always use docker build / docker compose (which stream the source as a tar), not bind-mounted source. Mounted source triggers the virtiofs "Resource deadlock avoided" (EDEADLK) failures that plague Swift builds under Docker Desktop.
Your First Library
A full, copy-paste walkthrough. By the end you'll have logged in, created a library, indexed media from a source, and resolved an item to a direct playable URL. Replace localhost:9410 with your server if different.
1. Log in
Authenticate as the bootstrapped admin to get an access token. Send a stable per-install X-Sphynx-Device header so this device can be revoked independently later.
curl -sX POST localhost:9410/v1/auth/login \
-H 'Content-Type: application/json' \
-H 'X-Sphynx-Device: my-laptop-001' \
-d '{"username":"admin","password":"changeme"}'
{ "accessToken": "eyJ…", "refreshToken": "rt_…", "expiresIn": 3600,
"user": { "id": "u_…", "displayName": "admin" } }
Save the accessToken — every other call needs Authorization: Bearer <accessToken>. For convenience in a shell:
TOKEN="eyJ…" # paste your accessToken
AUTH="Authorization: Bearer $TOKEN"
2. Create a library
Libraries are the top-level collections users browse. Admin-only.
curl -sX POST localhost:9410/v1/admin/libraries \
-H "$AUTH" -H 'Content-Type: application/json' \
-d '{"title":"Movies","kind":"movies"}'
# → { "id": "lib_…", "title": "Movies", "kind": "movies" }
3. Add a source
A source is where the media actually lives, plus how to reach it. The indexer reads a source's manifest (a JSON list of entries — metadata, never bytes) and feeds the named library. Here's an HTTP source pointed at a manifest:
curl -sX POST localhost:9410/v1/admin/sources \
-H "$AUTH" -H 'Content-Type: application/json' \
-d '{ "label":"My CDN", "driver":"http", "libraryId":"lib_…",
"baseURL":"https://cdn.example",
"manifestURL":"https://cdn.example/manifest.json" }'
# → { "id": "src_…", "label": "My CDN", "driver": "http", "config": { … } }
In the web admin you don't paste a library id. Turn on the fixed Movies / TV Shows libraries, then on each source just tick the content types it holds — Sphynx routes movies and episodes to the matching enabled library for you. At the API this is the libraryMap { "movie":"lib_…", "tv":"lib_…" } (a source can feed both at once), or a single libraryId as shown above.
The manifest is a simple document; key is resolved into a direct URL (relative to baseURL, or absolute):
{ "items": [
{ "key": "BigBuckBunny_320x180.mp4", "title": "Big Buck Bunny", "type": "movie", "year": 2008 },
{ "key": "Breaking.Bad.S01E01.mkv", "container": "mkv" }
] }
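How a key becomes a direct URL can be sketched in Python as follows. This is an assumption-laden illustration: `direct_url` is a hypothetical name, and percent-encoding the relative key is a choice of this sketch rather than documented driver behaviour.

```python
from urllib.parse import quote, urlsplit


def direct_url(base_url: str, key: str) -> str:
    """Resolve a manifest key: absolute http(s) keys pass through, others join baseURL."""
    if urlsplit(key).scheme in ("http", "https"):
        return key
    # Relative key: join onto baseURL, escaping spaces and other unsafe characters.
    return base_url.rstrip("/") + "/" + quote(key.lstrip("/"))
```

With the manifest above and a baseURL of `https://cdn.example`, the first entry resolves to `https://cdn.example/BigBuckBunny_320x180.mp4`.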
No CDN handy? Use the local driver — set config.rootPath to a folder on disk and the indexer walks it, deriving identity from the folder layout: Title (Year)/file for movies, Show (Year)/Season N/file for TV. .strm files are followed at resolve time to their contained URL.
{ "label":"NAS Movies", "driver":"local", "libraryId":"lib_…",
"config": { "rootPath": "/srv/media/movies" } }
The local driver does not serve files — it is for testing a library on the same machine only. Sphynx is metadata-only: resolve hands the client a location and never moves bytes. A plain media file under a local source resolves to a file:// path, reachable only by a player running on the server host itself — so a phone, TV, or another computer can't play it. To serve a local media folder to other devices, run a file-serving service over it (a Samba/SMB share, a WebDAV server, or any HTTP file server) and use the matching smb / webdav / http driver, so resolve returns a network-reachable URL the file server — not Sphynx — actually serves. (local with .strm files is fine, since those resolve to whatever URL they contain.)
Other backends (WebDAV, SMB, FTP) configure through two open maps — config for non-secret settings and secrets for credentials (never returned, never logged):
{ "label":"NAS", "driver":"webdav", "libraryId":"lib_…",
"config": { "baseURL":"https://nas.example/remote.php/dav" },
"secrets": { "username":"alice", "password":"•••" } }
4. Scan & enrich
Index the source: fetch its manifest, diff against the catalog, apply adds/updates/removes.
curl -sX POST localhost:9410/v1/admin/sources/src_…/scan -H "$AUTH"
# → { "sourceId":"src_…", "scanned":12, "added":3, "updated":1, "removed":0, "enriched":3 }
If SPHYNX_TMDB_API_KEY is set, the scan also identifies each entry against TMDB (movies and TV) and enriches posters, overviews, cast, and — for TV — builds a series → season → episode tree with season posters and episode stills. The enriched count reflects that (0 when TMDB isn't configured).
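The diff step of a scan can be pictured as a set comparison. The Python sketch below is illustrative only: `diff_scan` is a hypothetical name, and the per-entry "fingerprint" (anything that changes when the entry changes, such as a size or an ETag) is an assumption, not a documented field.

```python
def diff_scan(catalog: dict, listing: dict) -> dict:
    """Compare source keys against catalog keys; values are change fingerprints."""
    added = sorted(k for k in listing if k not in catalog)
    removed = sorted(k for k in catalog if k not in listing)
    updated = sorted(
        k for k in listing if k in catalog and catalog[k] != listing[k]
    )
    return {"added": added, "updated": updated, "removed": removed}
```

Because the source is the source of truth, anything in the catalog that the listing no longer contains is reported for removal.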
Don't have a manifest source? You can add a single item by hand pointing at any direct URL — great for a first end-to-end test:
curl -sX POST localhost:9410/v1/admin/items -H "$AUTH" \
-H 'Content-Type: application/json' \
-d '{"title":"Big Buck Bunny","container":"mp4",
"sourceKey":"https://download.blender.org/peach/bigbuckbunny_movies/BigBuckBunny_320x180.mp4"}'
When sourceKey is an absolute URL, you can omit sourceId entirely.
5. Browse & resolve
List libraries, then browse a library's children with parent=:
curl -s localhost:9410/v1/libraries -H "$AUTH"
curl -s "localhost:9410/v1/items?parent=lib_…" -H "$AUTH"
# → { "items": [ { "id":"it_…", "type":"movie", "title":"Big Buck Bunny", "year":2008 } ], … }
Finally, the heart of Sphynx — resolve an item to a direct, playable location at play time. The client streams this URL itself; the server moved no bytes.
curl -s localhost:9410/v1/resolve/it_… -H "$AUTH"
# → { "url":"https://…/BigBuckBunny_320x180.mp4", "container":"mp4",
# "terminal":true }
That's the entire spine: login → browse → resolve → play. Everything else (playstate, markers, enrichment, multiple users) layers on top of these calls.
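The same spine, seen from a client, might look like this Python sketch. The `send` callable is a hypothetical transport (method, path, headers, JSON body → decoded JSON reply) so the flow can be read and tested without a network; the paths, headers, and response fields are the ones used in the walkthrough. The library id is passed in because the shape of the /v1/libraries response is not shown here.

```python
from typing import Callable, Optional

# Hypothetical transport: send(method, path, headers, json_body) -> decoded JSON.
Send = Callable[[str, str, dict, Optional[dict]], dict]


def first_playable_url(send: Send, username: str, password: str,
                       device_id: str, library_id: str) -> str:
    """Walk login, browse, resolve and return the direct URL of the first item."""
    session = send(
        "POST", "/v1/auth/login",
        {"Content-Type": "application/json", "X-Sphynx-Device": device_id},
        {"username": username, "password": password},
    )
    auth = {"Authorization": "Bearer " + session["accessToken"]}
    page = send("GET", "/v1/items?parent=" + library_id, auth, None)
    item_id = page["items"][0]["id"]
    resolved = send("GET", "/v1/resolve/" + item_id, auth, None)
    # The player fetches this URL itself; the server never touches the bytes.
    return resolved["url"]
```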
Identifier & Parser
The Identifier turns a raw entry — a filename and the folders above it — into a confident identity: a movie title and year, or a series with a season and episode. It is the load-bearing subsystem; everything downstream (enrichment, the catalog tree, resolve) keys off a correct identity, so the parser is deliberately heuristic, self-contained, and exhaustively unit-tested.
All example names in this section are invented for illustration. The parser cares about shapes (where the year sits, what a season folder looks like), never about a specific title, so substitute your own freely.
Role & the three layers
Parsing happens entirely on text — no bytes, no network. A single source-relative key like Greyport (2019)/Greyport.2019.2160p.mkv is enough. The work is split across three cooperating layers, smallest first:
| Layer | Sees | Job |
|---|---|---|
FilenameParser | One filename | Clean a release filename into a title + optional year: normalise separators, drop release junk, find the year. |
EpisodeParser | One filename | Detect a TV marker (S01E02, 1x05) and split off the series title. |
PathParser / FolderName | The whole relative path | The orchestrator. Lets the clean, canonical identity carried by folders win over a messy or foreign filename, and recognises season folders, absolute/date episodes, and library buckets. |
PathParser.parse is the production entry point (the Indexer calls it for every entry). It builds on the filename primitives rather than replacing them, so each layer stays independently testable. Its result is one of two shapes:
movie(title: "Greyport", year: 2019)
episode(series: "The Tin Lantern", season: 2, episode: 5,
episodeTitle: "The First Light"?, year: 2018?)
Movies: title & year
A release filename is cleaned in four steps:
- Take the last path component and strip the extension. A trailing token is only treated as an extension when it is 1–4 characters and contains a letter, so .mkv strips, but a decimal (1.5) or a bare year (.1999) does not. Stacked container extensions like .mkv.strm are both removed.
- Normalise separators to spaces. Dots, underscores, dashes, plus signs and all bracket characters become spaces, so Greyport.2019.2160p and Greyport_2019_2160p tokenise identically.
- Find the release year. The year is the first plausible four-digit token (1900 … next year) that is not the leading token. The "not leading" rule is what lets a title that starts with a number keep it.
- Drop release junk from the tokens before the year: resolutions, codecs, source and audio tags (2160p, x265, WEB-DL, DDP5, HDR, a 1280x720 resolution, …).
| Input filename | → title | → year |
|---|---|---|
| Greyport.2019.2160p.WEB-DL.x265-CREW.mkv | Greyport | 2019 |
| Mossfield 2099 (2024) [2160p].mkv | Mossfield 2099 | 2024 (2099 is implausible → part of the title) |
| 2099.2024.2160p.mkv | 2099 | 2024 (leading year stays in the title) |
| 2299.mkv | 2299 | (none) (no second year → numeric title, no year) |
| Vellichor.mp4 | Vellichor | (none) |
Single-word titles that look like junk. If a film's whole title is a word the parser would normally strip as a release tag (imagine a movie literally titled Web or Cut), dropping junk would leave nothing — so the parser detects the empty result and keeps the title unfiltered. The title survives instead of collapsing into the raw filename.
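The four steps and the empty-title guard can be sketched in Python. This is not FilenameParser itself: the junk and container sets are deliberately tiny and hypothetical, `clean_movie` is an invented name, and the current year is passed in so the "next year" bound is explicit.

```python
import re
from typing import Optional, Tuple

CONTAINERS = {"mkv", "mp4", "avi", "mov", "m4v", "webm", "strm"}   # illustrative
JUNK = {"480p", "720p", "1080p", "2160p", "4k", "uhd", "x264", "x265",
        "hevc", "bluray", "web", "dl", "webrip", "ddp5", "hdr", "cut"}


def strip_extensions(name: str) -> str:
    stem, dot, ext = name.rpartition(".")
    # Generic rule: 1-4 alphanumeric characters with at least one letter.
    if not (dot and stem and len(ext) <= 4 and ext.isalnum()
            and any(c.isalpha() for c in ext)):
        return name
    name = stem
    while True:  # stacked containers such as ".mkv.strm"
        stem, dot, ext = name.rpartition(".")
        if not (dot and stem and ext.lower() in CONTAINERS):
            return name
        name = stem


def is_junk(token: str) -> bool:
    t = token.lower()
    return t in JUNK or re.fullmatch(r"\d{3,4}x\d{3,4}", t) is not None


def clean_movie(filename: str, current_year: int) -> Tuple[str, Optional[int]]:
    stem = strip_extensions(filename.rsplit("/", 1)[-1])             # step 1
    tokens = re.sub(r"[._\-+()\[\]{}]+", " ", stem).split()          # step 2
    year_at = next(                                                   # step 3
        (i for i, t in enumerate(tokens)
         if i > 0 and re.fullmatch(r"\d{4}", t)
         and 1900 <= int(t) <= current_year + 1),
        None,
    )
    head = tokens if year_at is None else tokens[:year_at]
    kept = [t for t in head if not is_junk(t)]                        # step 4
    # If filtering ate the whole title (a film called "Web"), keep it unfiltered.
    title = " ".join(kept or head)
    return title, (None if year_at is None else int(tokens[year_at]))
```

With `current_year=2025` this reproduces the rows of the table above, and `clean_movie("Web.mkv", 2025)` keeps the title `Web`.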
TV: seasons & episodes
An entry is treated as an episode when any of these are found, in this order of authority:
| Signal | Example (filename or folder) | Result |
|---|---|---|
| Explicit marker SxxExx / sNNeNN | The.Tin.Lantern.S02E05.mkv | season 2, episode 5 |
| Marker NxNN | The Tin Lantern - 1x05.mkv | season 1, episode 5 |
| Season folder + loose number | The Tin Lantern/Season 2/Ep 05.mkv | season 2, episode 5 |
| Date-stamped (daily shows) | Nightly Recap/2024-01-15.mkv | season 2024, episode 115 |
| Absolute number (long-running anime) | Drifting Saga - 1071.mkv | season 1, episode 1071 |
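The two explicit marker forms in the table can be approximated with a pair of regular expressions. The Python sketch below is an illustration, not EpisodeParser's actual patterns: the exact digit limits on the NxNN form and the name `episode_marker` are assumptions. It shows how digit-boundary guards keep a resolution such as 1280x720 from matching, and why a multi-episode file yields its first episode.

```python
import re
from typing import Optional, Tuple

# SxxExx: 1-4 digits per field, not glued to a preceding letter or digit.
SXXEXX = re.compile(r"(?<![a-z0-9])s(\d{1,4})e(\d{1,4})(?!\d)", re.IGNORECASE)
# NxNN: the season digits must not be the tail of a longer number (e.g. 1280x720).
NXNN = re.compile(r"(?<!\d)(\d{1,2})x(\d{2,4})(?!\d)", re.IGNORECASE)


def episode_marker(name: str) -> Optional[Tuple[int, int]]:
    """Return (season, episode) from the first explicit marker, or None."""
    for pattern in (SXXEXX, NXNN):
        match = pattern.search(name)
        if match:
            # search() stops at the first marker, so S06E01-02 yields episode 1.
            return int(match.group(1)), int(match.group(2))
    return None
```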
Several details make this robust across messy real-world libraries:
- Large episode numbers are never truncated. A marker allows 1–4 digits per field, so S01E1071 is episode 1071 (not "E10"), and a year-as-season daily marker like S2024E01 parses as season 2024.
- Resolutions are not mistaken for markers. The NxNN form guards its digit boundaries, so a 1280x720 resolution never reads as season 80 / episode 72.
- Season folders come in many spellings: Season 2, Season 02, S2, S02, Series 2, plus Specials (season 0) and multi-season ranges (Seasons 1-3), where the SxxExx in the filename supplies the actual number.
- Loose episode numbers inside a season folder are read from Episode 1, Ep 01, E01, or a leading bare number (01 Title).
- Multi-episode files (S06E01-02, S03E07E08) take the first episode.
- Episode titles are only lifted from the curated " - SxxExx - Title" form (space-dash-space). A dotted release tail (.2160p.WEB-DL) is metadata, not a title, so it is left for the caller's "Episode N" default.
Absolute & date episodes are gated to protect movies. A trailing number behind " - " only becomes an absolute episode with positive TV signal — a season-folder ancestor, a leading [group] fansub tag, or a multi-digit number that isn't a plausible year. This is what keeps a movie that merely ends in a number (think a heist film titled "… 11", or a sci-fi title ending in a future year) classified as a movie. Date episodes map to season = air-year, episode = MM×100 + DD, a deterministic ordinal that keeps same-year airings in calendar order.
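A rough Python sketch of the date ordinal and the absolute-number gate follows. It is a simplification under stated assumptions: the function names are hypothetical, the season-folder pattern covers only a few spellings, and "plausible year" is approximated as any four-digit number starting with 19 or 20, which may differ from the parser's real test.

```python
import re
from typing import Optional, Tuple

DATE = re.compile(r"(?<!\d)(\d{4})[-.](\d{2})[-.](\d{2})(?!\d)")
TRAILING_NUMBER = re.compile(r" - (\d{1,4})$")
SEASON_FOLDER = re.compile(r"(season|series)\s*\d+|s\d+|specials", re.IGNORECASE)


def date_episode(name: str) -> Optional[Tuple[int, int]]:
    """Map YYYY-MM-DD to (season = year, episode = MM*100 + DD)."""
    match = DATE.search(name)
    if not match:
        return None
    year, month, day = (int(g) for g in match.groups())
    return year, month * 100 + day


def absolute_episode(path: str) -> Optional[int]:
    """Return a trailing ' - N' number only when there is positive TV signal."""
    *folders, filename = path.split("/")
    stem = filename.rsplit(".", 1)[0]
    match = TRAILING_NUMBER.search(stem)
    if not match:
        return None
    digits = match.group(1)
    year_like = len(digits) == 4 and digits[:2] in ("19", "20")  # assumption
    tv_signal = (
        any(SEASON_FOLDER.fullmatch(f) for f in folders)  # season-folder ancestor
        or stem.startswith("[")                           # leading [group] tag
        or (len(digits) >= 2 and not year_like)           # multi-digit, not a year
    )
    return int(digits) if tv_signal else None
```

Because the ordinal is MM×100 + DD, comparing two same-year results orders them by calendar date: January 15 gives 115, February 3 gives 203.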
Folder authority & library buckets
In a real library the folder is the clean, curator-maintained name while the file is often a scene release in another language. So for the title, the folder wins:
Greyport (2019)/ ← folder is authoritative: "Greyport", 2019
└─ severnyy-veter.2019.Hybrid.UHD.Remux.2160p.mkv.strm ← ignored for the title
Folder handling has three moving parts:
- Curated folders preserve punctuation. Only a trailing or bracketed year is lifted out, so The Long Watch - A Greyport Story (2016) keeps its internal dash.
- Scene-style dotted folders are cleaned like filenames. A folder such as Greyport.2019.2160p.BluRay.x265-CREW (dot-delimited, carrying junk tags) is routed through the release-filename cleaner instead of being taken verbatim → Greyport, 2019.
- A yearless title folder borrows the year from the file (folder Greyport + file Greyport.2019.mkv ⇒ Greyport, 2019), while a folder that already has a year keeps its own.
Library buckets — the top-level container folders that never carry a title — are recognised and skipped, so the real title (the folder below, or the filename) is used instead. The bucket list is intentionally multilingual (see below), and there is a structural backstop for buckets it doesn't list: a yearless parent folder that is unrelated to a filename which does carry a year is treated as a bucket, and the richer filename wins.
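The curated-folder rules (lift only the year, otherwise borrow it from the file) can be sketched in Python. This handles only a year at the end of the folder name, bare or bracketed; `folder_identity` and the `max_year` bound are hypothetical, and bucket detection and scene-style folders are out of scope here.

```python
import re
from typing import Optional, Tuple

# A title followed by a trailing year, optionally wrapped in () or [].
FOLDER_YEAR = re.compile(r"^(?P<title>.+?)\s*[(\[]?(?P<year>\d{4})[)\]]?$")


def folder_identity(folder: str, file_year: Optional[int],
                    max_year: int) -> Tuple[str, Optional[int]]:
    """Title and year for a curated folder; a yearless folder borrows the file's year."""
    match = FOLDER_YEAR.match(folder)
    if match and 1900 <= int(match.group("year")) <= max_year:
        # The folder's own year wins; the rest of the name is kept verbatim.
        return match.group("title"), int(match.group("year"))
    return folder, file_year
```

Only the year is removed, so internal punctuation such as the dash in `The Long Watch - A Greyport Story (2016)` survives in the title.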
Built to handle any language
The parser is Unicode-aware end to end — titles in Cyrillic, CJK, accented Latin, and beyond pass through untouched. Three things make "any correct filename, in any language, in any directory" the design target rather than an afterthought:
- Library buckets are recognised across languages. A flat file under a localized root resolves to the file's real title, not the root's name. The recognised buckets span English plus common organisational folders and other scripts:
| Language | Bucket folders treated as containers (illustrative) |
|---|---|
| English / organisational | Movies, TV, Shows, Anime, Downloads, Complete, Library, … |
| Russian | Фильмы, Кино, Сериалы, Мультфильмы |
| Japanese | 映画, ドラマ, アニメ, 番組 |
| Chinese | 电影, 电视剧, 剧集, 动漫, 综艺 |
| Korean | 영화, 드라마, 예능 |
| Spanish / Portuguese | Películas, Cine, Filmes, Filme |
- Full-width digits are folded. A full-width year (２０１６) is recognised as 2016, and full-width / CJK separators (the ideographic space, ・, 。, ．) are treated as token boundaries, so 夏の記録．２０１６.mkv resolves to title 夏の記録, year 2016.
- Title comparison is script-agnostic. The "is this folder related to the filename" backstop normalises on Unicode alphanumerics, so it works the same for Latin and non-Latin scripts.
| Input path | → result |
|---|---|
| Фильмы/severnyy-veter.2019.1080p.mkv | movie: title from the filename, 2019 (root is a bucket) |
| 映画/夏の記録.2016.mkv | movie: 夏の記録, 2016 |
| Drifting Saga.S01E01.WEB-DL.2160p.mkv.strm | episode: Cyrillic/CJK series titles pass through unchanged |
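One way to get the folding behaviour in Python is Unicode NFKC normalisation, which maps full-width digits, the full-width full stop, and the ideographic space to their ASCII counterparts. This is a swapped-in technique for illustration, not a statement of how the parser does it; NFKC also rewrites other compatibility characters, so a real implementation might fold digits and separators only. `fold_tokens` is a hypothetical name.

```python
import re
import unicodedata

# ASCII separators plus the CJK ones that NFKC leaves alone (・ and 。).
SEPARATORS = re.compile(r"[\s._\-+()\[\]{}・。]+")


def fold_tokens(stem: str) -> list:
    """Fold full-width forms to ASCII, then split on Latin and CJK separators."""
    folded = unicodedata.normalize("NFKC", stem)
    return [token for token in SEPARATORS.split(folded) if token]
```

A stem such as `夏の記録．２０１６` then tokenises to the title `夏の記録` and the ASCII year `2016`, which the ordinary year rule can pick up.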
Decision order
For each entry, PathParser applies these rules top to bottom and stops at the first that fits. Knowing the order explains every result:
- Explicit episode marker in the filename (SxxExx / NxNN) → episode. The marker is authoritative for season + episode even inside a multi-season folder.
- Season folder + loose number → episode (season from the folder).
- Date stamp (YYYY-MM-DD) → daily episode.
- Absolute number behind " - ", if TV-signal gated → episode.
- Otherwise a movie: title from an informative parent folder (preferring it over a foreign filename), else from the filename; year from the folder, borrowed from the file when the folder has none.
An admin's explicit hints always outrank parsing. When a source's manifest pre-declares type: episode with a season and episode (some HTTP sources do), the Indexer uses those directly and only falls back to the parser for the parts the manifest omits.
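That precedence amounts to a field-by-field merge, sketched here in Python. The dict shapes and the name `merge_identity` are hypothetical; the field names are the ones the manifest example and the paragraph above mention.

```python
IDENTITY_FIELDS = ("type", "title", "year", "season", "episode")


def merge_identity(manifest_entry: dict, parsed: dict) -> dict:
    """Manifest hints win field by field; the parser fills only what is missing."""
    merged = {}
    for field in IDENTITY_FIELDS:
        hinted = manifest_entry.get(field)
        merged[field] = hinted if hinted is not None else parsed.get(field)
    return merged
```

An entry that declares `type`, `season`, and `episode` but no title keeps those three values and takes only the title (and year) from the parser.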
Pattern reference
Episode markers (case-insensitive): S01E02, s1e2, S01E1071, S2024E01, 1x05. Loose forms inside a season folder: Episode 1, Ep 01, E01, leading 01 …. Multi-episode: E01-02, E01E02 (first wins). Date: YYYY-MM-DD / YYYY.MM.DD. Absolute: Title - 1071 (gated).
Season folders: Season N, Season 0N, S N, SN, Series N, Specials (→ 0), Seasons A-B.
Release junk stripped from titles: resolutions (480p…2160p, 4K, UHD, plus W×H like 1920x1080), codecs (x264, x265, H264, HEVC, AVC, XviD), sources (BluRay, BRRip, WEB-DL, WEBRip, HDTV, DVDRip, CAM), audio (AAC, AC3, DTS, DDP5, FLAC, Atmos, TrueHD), and edition/process tags (REMUX, PROPER, REPACK, EXTENDED, UNRATED, REMASTERED, IMAX, HDR, DV, …).
Limits & tips for a clean library
The parser is heuristic, so a few inherently ambiguous shapes are resolved by convention rather than guesswork:
- A clean single-word title folder with no year is indistinguishable from an unlisted bucket. If your top-level container isn't in the recognised list, give per-title subfolders a year (Title (Year)/) and identification is unambiguous.
- Absolute single-digit episodes with no season folder and no group tag are intentionally left as movies, to avoid hijacking a movie that ends in a small number. Put such anime in a Season 1 folder if you rely on absolute numbering.
- Date episodes use a synthetic MMDD episode ordinal; the underlying air date is preserved as the value, but it is not a broadcaster's official episode number.
- Identification feeds a TMDB search whose confidence is scored on title + year agreement, so a correct year materially improves the match, and the cleaner your folders, the better the result.
- Release numbering can run ahead of TMDB. When a release splits a two-part finale across files (S06E21 + S06E22, or a … (2) part file) or numbers a special TMDB lists elsewhere, the file's episode number can exceed what that season actually has on TMDB. Such an episode stays fully indexed and playable but unenriched: it keeps its filename title and gets no art/overview, because there is no matching TMDB episode to copy. Surface these in the admin Items view (type in the search box, or tick Needs metadata to list everything still missing data), then edit and lock the fields by hand, or rename the files to match TMDB's numbering.
The single best habit: name the per-title folder Title (Year) and put episodes under Season N. The parser will then succeed regardless of how messy — or how foreign — the individual filenames are.
Admin (server-specific, not part of the wire protocol)
Catalog setup, indexing, manual entry, and server settings. Auth required, and the admin role unless noted — the item-edit PATCH is gated by the metadata.edit permission instead, so it can be granted to a non-admin. 403 forbidden otherwise.
The current persisted runtime settings. 200 →
{ "serverName":"…", "serverID":"…", "accessTokenTTL":3600, "refreshTokenTTL":2592000, "enrichmentTTL":7776000, "metadataLanguage":"en-US", "markersAccess":"readwrite", "markersStaleAfter":604800, "playstateRetention":31536000, "maintenanceInterval":86400, "avatarMaxBytes":2000000, "signInUserList":false, "webAuthRedirectAllowlist":"", "passkeyRelyingPartyID":"", "passkeyRelyingPartyName":"", "passkeyRelyingPartyOrigin":"" }.
Update runtime settings — any subset of the fields above. Body e.g. { "serverName":"My Library", "markersAccess":"read" } → 200 with the full updated settings. Most changes apply immediately; a few (e.g. token TTLs) take effect on the next restart. metadataLanguage applies live — it updates the enrichment client in place, so a re-enrich immediately fetches titles/overviews/posters in the new language (no restart). 400 if markersAccess isn't none/read/readwrite, if passkeyRelyingPartyID carries a scheme/port, or if passkeyRelyingPartyOrigin omits one. Setting passkeyRelyingPartyID turns on passkeys (capabilities.passkeys).
The TMDB v3 API key (identification + enrichment depend on it). GET → a masked status { "configured":true, "keyHint":"…1b87", "appliesOnRestart":true } — the key itself is never returned. PATCH { "apiKey":"…" } stores it (seeded once from SPHYNX_TMDB_API_KEY, DB-authoritative thereafter); the change takes effect on the next restart. See docs/API.md.
Restart the server process — the way to apply a changed TMDB API key (and other boot-time settings) without shell access. 202 Accepted; the server then shuts down cleanly and re-execs itself in place, so it comes back whether or not a supervisor (Docker restart: policy / systemd) would relaunch it. The web admin's Restart server button (Settings) calls this. The server is briefly unavailable while it comes back up; the library and all settings are preserved.
Body { "title":"Movies", "kind":"movies" } (kind defaults to other). 200 → { "id":"lib_…", "title":"Movies", "kind":"movies", "collectionThreshold":1 }. New libraries start at collectionThreshold:1.
List all libraries. 200 → { "libraries": [ { "id":"lib_…", "title":"Movies", "kind":"movies", "collectionThreshold":1 }, … ] }.
Update a library. Body (any subset) { "title":"…", "kind":"…", "collectionThreshold":2 } → 200 with the updated library. collectionThreshold (Int, default 1) is the box-set grouping knob: the minimum number of present members a TMDB collection needs to surface as a box-set tile at this library's top level (1 groups any non-empty collection; clamped to >= 0). It rides on these admin library objects, not the public wire Library shape. See docs/API.md for detail.
Cascade. Deletes the library and every item it holds, then unbinds it from any source that feeds it — a source that also feeds another library survives (with this library removed from its routing); a source left feeding no library is deleted. 204 on success.
{ "label":"My CDN", "driver":"http", "baseURL":"https://cdn.example",
"headers":{ "Authorization":"…" },
"libraryMap":{ "movie":"lib_movies", "tv":"lib_tv" },
"manifestURL":"https://cdn.example/manifest.json" }
driver defaults to http; manifestURL is where the indexer lists entries. A source feeds a library by content category: libraryMap routes each item to a library by type (movie / tv), so one source + one scan fills a Movies library and a TV library from the same folder — one driver walk, items split by detected type. A single libraryId is still accepted and acts as the fallback for any unmapped category. 200 → { "id":"src_…", "label":"…", "driver":"http", "config":{…}, "libraryId":…, "libraryMap":{…} } — only non-secret fields are returned.
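The routing rule above can be sketched as a single lookup with a fallback. The function name targetLibrary and its parameters are hypothetical, chosen for this example; the reference server's own routing code is not shown in this guide.

```swift
/// Picks the destination library for an item's detected category ("movie" / "tv").
/// libraryMap wins when it has an entry; otherwise the single libraryId acts
/// as the fallback. Returns nil when neither routes the item.
func targetLibrary(for category: String,
                   libraryMap: [String: String],
                   fallbackLibraryId: String?) -> String? {
    return libraryMap[category] ?? fallbackLibraryId
}
```

With libraryMap set to movie → lib_movies and tv → lib_tv, one scan splits a mixed folder across both libraries, while an unmapped category lands in the fallback library if one is set.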
Drivers other than HTTP configure through two open maps: config for non-secret settings, and secrets for credentials. Secrets are stored but never returned or logged (for the HTTP driver, request headers are treated the same way).
{ "label":"NAS", "driver":"webdav", "libraryId":"lib_…",
"config": { "baseURL":"https://nas.example/remote.php/dav" },
"secrets": { "username":"alice", "password":"•••" } }
For a local source, set driver to local and config.rootPath to a directory path; the indexer walks that tree, deriving identity from the folder layout (Title (Year)/file for movies, Show (Year)/Season N/file for TV). A re-scan re-walks the folder, so it doubles as the periodically-updated source. .strm files are followed at resolve time to their contained URL. Note that the local driver does not serve files — a plain media file resolves to a file:// path usable only by a player on the server host, so it is for local testing only. To serve a local folder to other devices, front it with a Samba/SMB share, WebDAV, or an HTTP file server and use the matching smb/webdav/http driver.
Many sources, one library. A library can be fed by any number of sources, of any mix of drivers — each source routes its items to a libraryId (or per content category via libraryMap), and nothing requires those targets to be distinct. Point several sources at the same library to merge them onto one shelf; point one source's movie and TV categories at two libraries to split it.
The manifest is a simple JSON document the indexer reads (metadata, not media). TV is detected from the filename (S01E02, 1x05, …): the indexer builds a series → season → episode tree, deduping shared series/seasons, and (when TMDB is configured) identifies and enriches them. Entries may instead carry explicit seriesTitle / season / episode hints. Browse the tree via parent= with seriesId, seasonIndex, episodeIndex, and childCount on each item.
List all sources (non-secret fields only). 200 → { "sources": [ { "id":"src_…", "label":"…", "driver":"http", "config":{…} }, … ] }.
Update a source. Body (any subset) { "label":"…", "baseURL":"…", "manifestURL":"…", "libraryId":"…", "headers":{…}, "config":{…}, "secrets":{…} } — given maps replace the stored ones. 200 → the updated source (secrets withheld).
Cascade. Deletes the source, the items it produced, and any series/season containers those items leave empty. 204 on success.
Index one source: fetch its manifest, diff against the catalog, apply adds/updates/removes. 200 → { "sourceId":"src_…", "scanned":12, "added":3, "updated":1, "removed":0, "enriched":3 }.
Re-scan every source feeding one library (a per-library refresh). 200 → { "sources":[ <scan summary>, … ] }. Gated by catalog.scan for that library (or globally) — the admin's per-library Refresh button and a delegated scanner both use it.
Scan every source. 200 → { "sources": [ <scan summary>, … ] }. Requires the unscoped catalog.scan.
Permissions
Authorization is a single admin (the bootstrap account, which holds every permission implicitly and is the only admin) plus an open per-user permission set the admin grants. Permissions are string keys, stored uniformly and forward-compatible — unknown keys are tolerated. Well-known keys:
| Key | Grants | Gates |
|---|---|---|
| library.read | Browse libraries + resolve/play their items | browse, items, resolve, home, playstate, item state |
| metadata.markers.write | Contribute intro/credit markers | PUT /v1/items/{id}/markers |
| metadata.images.write | Contribute artwork (reserved; no endpoint yet) | (none) |
| metadata.edit | Read/edit item metadata, lock fields, re-identify/re-enrich, and re-map a title (move library / re-parent / set type & season-episode) | GET/PATCH /v1/admin/items*, …/identity, …/enrich |
| catalog.scan | Trigger a re-scan of a source or library (not its credentials) | POST …/sources/{id}/scan, …/libraries/{id}/scan, …/scan |
| collections.edit | Manage manual collections (box sets): create, add/remove movies or series, rename, or delete them | GET/POST/PATCH/DELETE /v1/admin/collections* |
A key may be scoped to one library or one item with a :<id> suffix, e.g. library.read:lib_abc grants read for that library only, and metadata.edit:it_123 grants editing of a single title. A user may hold the global key and any number of scoped keys; a gated action passes if the caller holds the global key or the key scoped to the relevant library or item. The admin always passes. A full-catalog POST /v1/admin/scan needs the unscoped catalog.scan.
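A minimal sketch of that check, assuming permissions are held as a plain set of strings. The function isAllowed and its parameter names are hypothetical and only illustrate the global-or-scoped rule described above.

```swift
/// Returns true when the caller may perform an action gated by `key`.
/// The admin always passes; otherwise the caller needs the global key or the
/// key scoped (with a ":<id>" suffix) to the relevant library or item.
func isAllowed(_ key: String,
               held: Set<String>,
               isAdmin: Bool = false,
               libraryId: String? = nil,
               itemId: String? = nil) -> Bool {
    if isAdmin || held.contains(key) { return true }
    let scopes = [libraryId, itemId].compactMap { $0 }
    return scopes.contains { held.contains("\(key):\($0)") }
}
```

A full-catalog scan has no library or item to scope to, so calling isAllowed("catalog.scan", held: ...) without ids passes only for the unscoped key, which matches the rule for POST /v1/admin/scan.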
Admin role vs. permission. Most /v1/admin/* endpoints require the admin role (the single bootstrap account, holding source credentials, settings, user management). Three capabilities are delegable to non-admins via permissions: item correction (metadata.edit — including re-identify, re-enrich, and re-mapping placement), scanning (catalog.scan), and collection curation (collections.edit). A user holding metadata.edit gets a Library correction panel on the /user page that mirrors the admin tools (browse/search, "needs metadata" filter, edit + lock, re-identify, re-enrich, and re-map); re-mapping across libraries needs metadata.edit on both sides. A user who holds collections.edit gets a Collections panel on the same page.
Permission denied is a clean 403. A gated action the caller may not perform returns 403 forbidden. Clients must surface this clearly — disable/hide the affordance up front from GET /v1/auth/me, show a short "you don't have permission" message on a 403, and treat it as terminal and non-retryable (distinct from 401 re-auth and 5xx/timeout retries). Never let the action silently do nothing or look like a network error.
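On the client side, the distinction between re-auth, terminal denial, and retry might be modelled as below. The enum and function names are assumptions for this sketch, and timeouts (which carry no status code) would need to be mapped to the retry case separately.

```swift
/// How a client might react to a failed request (hypothetical names).
enum FailureHandling: Equatable {
    case reauthenticate     // 401: refresh the session, then retry
    case permissionDenied   // 403: terminal, show "you don't have permission"
    case retryLater         // 5xx: transient, retry with backoff
    case showError          // anything else: surface the error as-is
}

func failureHandling(forStatus status: Int) -> FailureHandling {
    switch status {
    case 401: return .reauthenticate
    case 403: return .permissionDenied
    case 500...599: return .retryLater
    default: return .showError
    }
}
```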
The permission vocabulary for the admin editor, so the UI is data-driven rather than hardcoding keys. 200 → { "permissions":[{ "key":"library.read", "label":"Browse & play", "description":"…", "scopable":true, "reserved":false }], "libraries":[{ "id":"lib_…", "title":"Movies" }] }. scopable keys may be granted per-library; reserved keys are stored but not yet enforced.
List all accounts. 200 → { "users": [ { "id":"u_…", "username":"bob", "displayName":"Bob", "avatarURL":"/v1/users/u_…/avatar?v=…", "isAdmin":false, "permissions":["library.read"] }, … ] }. avatarURL is omitted when the user has no picture. The admin's permissions reflects the full implicit set.
Create a non-admin user (there is exactly one admin — any isAdmin in the body is ignored). Body { "username":"bob", "password":"…", "displayName":"Bob", "permissions":["library.read"] }. permissions defaults to ["library.read"] when omitted, so a new user can browse and play immediately. 200 → the created user. 409 if the username is taken.
Replace a user's permission set. Body { "permissions":["library.read","metadata.markers.write"] } → 200 with the updated user. This is how the admin controls per-user access. Setting the admin's permissions is rejected (it holds all implicitly).
Delete a user and revoke all their sessions + per-user state. 204 on success. The admin account cannot be deleted (403).
Browse the catalog as a raw file hierarchy for the correction UI: the direct children of parent (a library id → its top level; an item id → that container's children). 200 → { "items":[<Item>,…] } (full projection — type, images, childCount). Unlike the player-facing /v1/items, this applies no collection grouping — a collection is its own openable row and member movies appear individually, a 1-to-1 mirror of the indexed source tree. Reads the catalog only (no driver/CDN traffic). Gated by metadata.edit for the resolved library.
Read one item with its current lock state, for the correction editor. 200 → { "item":<Item>, "lockedFields":["title","overview"] }. Gated by metadata.edit. The wire Item carries no lock info, so this is how a UI knows which fields are pinned.
Edit an item's metadata and lock each edited field against auto-refresh, or re-map its placement. Gated by the metadata.edit permission (honoring per-library scoping), not the admin role — so a non-admin editor can be granted it. Every field is optional; each metadata field present is written and locked, so it survives every scan, TTL refresh, and forced enrich.
{ "title":"…", "overview":"…", "year":1999, "runtime":8160,
"genres":["…"], "communityRating":8.2, "officialRating":"PG-13",
"images":{ "primary":"https://…", "backdrop":"https://…", "thumb":"https://…" },
"placeholder":"https://…",
// Re-map (fix placement): move/re-parent/retype/re-position
"libraryId":"lib_…", "parentId":"it_…",
"seasonIndex":1, "episodeIndex":3, "type":"episode",
"unlock":["overview"], "unlockAll":false }
Re-mapping fixes where an item lives (wrong library, or never linked to its show). The server keeps the tree consistent: parentId nests the item (clears libraryId, derives seriesId/seriesTitle/seasonIndex from the parent); libraryId makes it top-level (clears parent + linkage). Moving across libraries changes who can see the item, so a re-map of libraryId/parentId needs metadata.edit on both the current and destination library (admin bypasses) — 403 otherwise; an unknown type or missing destination is 400.
200 → { "item": <Item>, "lockedFields":["overview","title"] }. To revert a field to automatic TMDB data, unlock it (or unlockAll) and re-enrich.
Admin override: pin an item to a specific TMDB id and re-enrich. Body { "tmdbId":"603", "type":"movie" }. 200 → the enriched item.
Force re-identification + enrichment of one item. 200 → the enriched item.
Enrich every item that needs it (new or stale). 200 → { "enriched":7 }.
The three enrichment endpoints require TMDB to be configured (SPHYNX_TMDB_API_KEY); otherwise they return 400 bad_request.
{ "title":"Big Buck Bunny", "type":"movie", "container":"mp4",
"sourceId":"src_…", "sourceKey":"path/or/absolute-url", "tmdbId":"…",
"libraryId":"lib_…", "parentId":"it_…", "year":2008,
"extra":{ "anything":[1,2,3] } }
- title and sourceKey are the only required fields.
- sourceKey: an absolute URL (self-contained) or a key relative to the source's baseURL.
- sourceId: optional; omit it when sourceKey is an absolute URL.
- type defaults to movie.
- libraryId: optional; the library this item belongs to (top-level browse membership).
- parentId: optional; a parent item id to nest under (e.g. an episode under a season).
- year: optional release year.
- extra: optional open map of server-defined metadata, stored and projected onto the item's extra.
200 → the created item.
Cascade. Deletes the item and its whole subtree (a series takes its seasons + episodes), then prunes any container the deletion leaves empty up the parent chain. 204 on success. (Note: an item still listed by its source reappears on the next scan — the source is the source of truth.)
Collections
Curate manual collections (box sets) — the same collection-typed container browsed via GET /v1/items?parent=<id>, just with no tmdbId, so it groups and obeys its library's collectionThreshold identically. All four endpoints are gated by the collections.edit permission (held globally or scoped to the target library, collections.edit:lib_…); admins always pass.
List a library's collections with their members. 200 → { "collections":[{ "id":"it_…", "title":"…", "libraryId":"lib_…", "memberCount":2, "members":[<Item>,…] }] }.
The library's top-level movies/series available to add (already-nested items are excluded). Optional case-insensitive search. 200 → { "items":[<Item>,…] }.
Create a collection, optionally seeding members. Body { "libraryId":"lib_…", "title":"…", "itemIds":["it_…",…] } (itemIds optional). Only top-level items of that same library are linked. 200 → the created collection (same shape as a collections list entry).
Rename and/or add/remove members in one call. Body (any subset) { "title":"…", "addItems":["it_…"], "removeItems":["it_…"] }. A rename keeps members' denormalized collectionTitle in sync. 200 → the updated collection.
Delete the collection tile, orphaning its members back to the library's top level (the movies/series are kept; only the grouping is removed). 204 on success.
Diagnostics & extensions
These power the web admin's Extensions tab (Diagnostics + optional modules) and are server-specific, not part of the wire protocol. Brief shapes here; the exhaustive reference is in docs/API.md.
- Diagnostics (all GET, admin-only): GET /v1/admin/status (activity snapshot: current parse/enrich activity + counters); GET /v1/admin/logs?after=&limit=&level= (paged diagnostics log lines); GET /v1/admin/db/tables and GET /v1/admin/db/query?table=&limit=&offset=&tmdbId=&name= (read-only, whitelisted database browser with secret columns redacted; optional tmdbId exact-match and name title-substring search when the table has those columns); GET /v1/admin/overview (catalog coverage: items in source vs indexed vs enriched, per library and source).
- Extensions registry: GET /v1/admin/extensions lists each module (kind builtin like diagnostics, or optional; available reflects prerequisites such as ffprobe being installed).
- Media probe (id: media-probe, opt-in): GET/PATCH /v1/admin/extensions/media-probe read/set { enabled, ffprobePath, intervalSeconds?, maxPerMinute? } (applied live, no restart). GET /v1/admin/extensions/media-probe/probe?itemId= runs ffprobe on the item's resolved location and caches the streams / external subtitles / chapters back onto the item, so resolve and item detail then serve them without re-probing. Like BlurHash, it also runs as a background pass (POST …/media-probe/run for "Run now"; or set intervalSeconds > 0 to schedule it; 0 = manual-only, the default) that probes every not-yet-probed title with bounded concurrency. The pass is rate-limited to maxPerMinute source resolves a minute (default 60, 0 = unlimited) so it stays far under a provider's request budget (e.g. TorBox's 300/min) and leaves headroom for live playback. A title that fails to resolve or probe is attempted at most once per run and simply waits for the next pass, and the pass aborts after 12 consecutive failures (a refusing provider means the rest of the pass is doomed too; stopping releases the pressure instead of sustaining it). 400 when disabled or ffprobe is unavailable.
  Why pre-indexing tracks speeds up playback. Caching each title's audio and subtitle tracks ahead of time lets a player render its audio/subtitle picker and begin playback without first probing the file itself, a step that otherwise stalls the first frame, especially for remote sources. Players that rely on the server's advertised tracks (e.g. Ocelot) therefore start dramatically faster on titles that have been probed, so running a background pass over the whole library is recommended when such a client is in use.
- Low-res images (id: placeholders): GET/PATCH /v1/admin/extensions/placeholders read/set { mode, intervalSeconds?, hashing? } where mode is url | blurhash | off (default blurhash). Governs the low-res placeholder form clients receive in a tile; read live, so switching re-shapes serving at once (no restart). An unrecognised mode, or a negative intervalSeconds, is 400. BlurHash generation (reference server) is decoupled from enrichment and runs as a lazy background pass: it hashes every photographic image (item roles primary / backdrop / thumb / banner plus up to 30 cast faces, for every item type; transparent logos keep the URL form) with bounded concurrency (≤4 fetches in flight, so it never hammers the image source), hashing only what's still missing each pass and persisting hashes without bumping updatedAt (no mass client-cache invalidation). Its cadence is read live from intervalSeconds (seconds, fractional allowed; 0 = manual-only; POST …/placeholders/run for "Generate now"). In blurhash mode the response includes hashing: { running, total, done, lastCompletedAt? } (image-granular progress) which drives the module's status indicator. Per-role hashes live in a {role: hash} map; each cast face carries its own hash too.
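The failure handling of the media-probe background pass described above can be sketched as a simple loop. This is a sequential simplification: the bounded concurrency and the maxPerMinute rate limit are left out, and runProbePass, its parameters, and the probe closure are hypothetical names rather than the server's API.

```swift
/// Probes each title once and stops early after too many consecutive failures.
/// Returns how many titles were probed and whether the pass was aborted.
func runProbePass(titles: [String],
                  maxConsecutiveFailures: Int = 12,
                  probe: (String) -> Bool) -> (probed: Int, aborted: Bool) {
    var probed = 0
    var consecutiveFailures = 0
    for title in titles {
        if probe(title) {
            probed += 1
            consecutiveFailures = 0          // any success resets the streak
        } else {
            consecutiveFailures += 1         // failed titles wait for the next pass
            if consecutiveFailures >= maxConsecutiveFailures {
                return (probed, true)        // provider is refusing: stop the pass
            }
        }
    }
    return (probed, false)
}
```

Resetting the streak on every success means scattered failures never abort a pass; only an unbroken run of them does.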
Server-side extensions (reference impl)
Ways to extend this server beyond the wire contract — self-written metadata, source drivers, and resolve-time transforms. These are implementation seams, not part of the protocol a client speaks.
3. Server-side extension (self-write)
A server (or a server extension) can write metadata itself — e.g. an intro detector that analyses media and produces markers. These writes go through the server's internal catalog API, not the public HTTP contribution endpoint, and are marked authoritative so client contributions won't override them.
The reference server does not ship an intro detector — this is the documented seam so one can be added as an extension. ItemRecord already carries markersJSON + provenance (markersSource, markersConfidence, markersAuthoritative, markersContributedBy, markersUpdatedAt), and Catalog.updateItem(_:) persists them.
var item = try await catalog.item(id: itemId)!
item.markersJSON = encode(Markers(intro: .init(start: 75, end: 145)))
item.markersSource = "intro-detector"
item.markersConfidence = 0.9
item.markersAuthoritative = true // server-detected → wins over client
item.markersUpdatedAt = now
try await catalog.updateItem(item)
Because markersAuthoritative == true, a later client PUT is rejected with 409 Conflict (only an admin can override). The precedence policy in one sentence: server-detected / admin-pinned beats best-effort client contributions.
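The precedence policy reduces to a small decision, sketched here with hypothetical names (MarkerWriteOutcome, evaluateMarkerWrite). It illustrates the rule as stated and is not the reference server's code.

```swift
/// Outcome of a marker contribution arriving over the public PUT endpoint.
enum MarkerWriteOutcome: Equatable {
    case accepted
    case conflict   // surfaced to the client as 409 Conflict
}

/// Authoritative (server-detected or admin-pinned) markers can only be
/// replaced by an admin; best-effort client contributions are rejected.
func evaluateMarkerWrite(existingIsAuthoritative: Bool,
                         callerIsAdmin: Bool) -> MarkerWriteOutcome {
    if existingIsAuthoritative && !callerIsAdmin { return .conflict }
    return .accepted
}
```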
Adding a new server-written field, end to end
- Storage: add column(s) to the relevant record + a migration.
- Access: add the field to AccessPolicy (advertise read or readwrite).
- Read: surface it on the item projection, a GET endpoint, or /resolve.
- Write: a contribution endpoint (if client-writable) and/or an internal write path for the extension.
- Provenance: record source + authority so precedence stays sane.
- Docs: note the field's meaning, unit, and access in API.md.
If the field is niche/server-specific, skip the canonical schema and use item.extra instead — clients read it without any protocol change.
Worked example: filling criticRating from a critic source (e.g. OMDb)
This is the one canonical Item field the reference server never populates, because TMDB has no critic data — it exposes only an audience score (vote_average, which the server maps to communityRating). A critic aggregate (Rotten Tomatoes, Metacritic) needs a different source, and it has a ready-made home in the spec: criticRating, a 0–100 review-aggregator score (distinct from the 0–10 communityRating). Add it as an opt-in extension, exactly like the built-in media-probe:
- Pick a source. OMDb is the obvious one: its response carries a Ratings array with Rotten Tomatoes (e.g. "87%") and Metacritic (e.g. "74/100"), plus a Metascore. A free API key gates it, so make the extension opt-in with its key in settings (the media-probe extension's ext.mediaProbe.* keys are the template).
- Key the lookup by IMDb id. The server already stores it: enriched items carry externalIds.imdb (tt…), which is exactly OMDb's i= parameter. No new identity work.
- Normalize to 0–100. Rotten Tomatoes is already a percentage; Metacritic's x/100 is too; pick one (or average) and round to an Int.
- Write the field, honoring the lock. Reuse the enrichment apply pattern so a manual admin edit still wins:
  var item = try await catalog.item(id: itemId)!
  if !item.lockedFields().contains(LockableField.criticRating) {
      item.criticRating = 87 // 0–100, from OMDb
      try await catalog.updateItem(item)
  }
- Advertise it. Add "criticRating" to InfoController.supportedItemFields so capabilities.fields stays honest, and a client knows it can show a critic score.
That's the whole shape — fetch by externalIds.imdb, write a 0–100 criticRating (lock-respecting), advertise it. The reference server stops at documenting the seam; it ships no OMDb integration and stores no third-party key by default.
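The normalization step might look like the sketch below, which turns one OMDb rating value into a 0–100 integer. The function names are hypothetical, and the only input shapes assumed are the percentage and x/y forms mentioned above; anything else yields nil.

```swift
import Foundation

/// Rounds and clamps a raw score to 0...100; rejects NaN and infinities.
func clampScore(_ raw: Double) -> Int? {
    guard raw.isFinite else { return nil }
    return Int(min(100, max(0, raw)).rounded())
}

/// Converts a rating value such as "87%" or "74/100" to a 0–100 integer.
func normalizedCriticScore(_ value: String) -> Int? {
    let trimmed = value.trimmingCharacters(in: .whitespaces)
    if trimmed.hasSuffix("%") {
        // Percentage form: already on a 0–100 scale.
        guard let percent = Double(trimmed.dropLast()) else { return nil }
        return clampScore(percent)
    }
    // Fraction form: rescale "score/scale" to 0–100.
    let parts = trimmed.split(separator: "/")
    guard parts.count == 2,
          let score = Double(parts[0]),
          let scale = Double(parts[1]),
          scale > 0
    else { return nil }
    return clampScore(score / scale * 100)
}
```

Returning nil for unparseable values (OMDb uses "N/A" for missing data) lets the caller leave criticRating unset instead of writing a misleading zero.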
4. Freshness & expiry
Two planes — only one of them ever goes "stale."
- Existence & location — the item list, each item's source key, and the playback URL. Owned by the source. Sphynx stores the source reference, never a resolved URL, and never marks it stale or gives it a TTL. It changes only when you re-scan the source: a re-list that diffs the directory — new keys added, vanished keys removed. The source is the source of truth, so if you delete an item from Sphynx's DB it is re-added on the next scan as long as the source still lists it. (A manually-added item that points at a bare absolute URL has no source to re-add it, so that delete is permanent; and a re-added item is a fresh row, so any manual edits / field-locks on the deleted one are gone.) Playback URLs are resolved fresh on every play.
- Layered metadata — TMDB enrichment and intro/credit markers, fetched from elsewhere. This does carry a freshness window, refreshed by re-fetching TMDB or by a client re-contributing — never by re-scanning the source. That's what the rest of this section covers.
Scan cadence. A re-scan can be triggered manually (POST /v1/admin/scan, or per source), but each source also carries its own auto-refresh. A source's refreshInterval (seconds; 0 = manual-only) and lastScannedAt drive a background SourceRefreshService that re-scans each source when it falls due, so an FTP source can, for example, re-list every 15 minutes while a WebDAV source re-lists every 5, each on its own cadence. Set it via PATCH /v1/admin/sources/{id} with refreshInterval.
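The "falls due" test can be sketched as follows. The function name isScanDue is hypothetical, and treating a never-scanned source as immediately due is an assumption of this example rather than documented behaviour.

```swift
import Foundation

/// Decides whether a source's auto-refresh is due.
/// refreshInterval is in seconds; 0 means manual-only.
func isScanDue(lastScannedAt: Date?,
               refreshInterval: TimeInterval,
               now: Date = Date()) -> Bool {
    guard refreshInterval > 0 else { return false }        // manual-only source
    guard let last = lastScannedAt else { return true }    // assumption: never scanned counts as due
    return now.timeIntervalSince(last) >= refreshInterval
}
```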
The layered-metadata plane goes stale; Sphynx keeps it fresh along the same ownership split as contribution — whoever can fetch the data is responsible for refreshing it:
- Server-owned data: the server refreshes it. Data the server can re-fetch (TMDB enrichment: posters, overview, cast) carries a freshness window (SPHYNX_ENRICH_TTL, default 90 days). A background maintenance pass (SPHYNX_MAINTENANCE_INTERVAL, default daily) re-fetches anything older. The client does nothing.
- Client-owned data: the server flags it, the client refreshes it. Data only a client can fetch (markers from a client-only source) can't be refreshed server-side. GET /v1/items/{id}/markers returns stale:true once markers pass SPHYNX_MARKERS_STALE_AFTER (default 7 days). A client with a data source should, on stale:true or 404, re-fetch and PUT updated markers, closing the loop.
Refresh never clobbers higher-authority data: the maintenance pass only touches server-owned fields, and authoritative markers are never reported stale. Per-user playstate untouched for SPHYNX_PLAYSTATE_RETENTION (default 365 days) is purged by the maintenance pass.
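A sketch of how the stale flag could be derived from a marker's provenance, assuming the two inputs are the marker's last update time and its authoritative flag. The function name markersAreStale is hypothetical; the default of 604800 seconds corresponds to the 7-day window.

```swift
import Foundation

/// Computes the stale flag for markers.
/// Authoritative markers are never reported stale; others go stale once
/// they are older than staleAfter (seconds).
func markersAreStale(updatedAt: Date,
                     authoritative: Bool,
                     staleAfter: TimeInterval = 604_800,
                     now: Date = Date()) -> Bool {
    if authoritative { return false }
    return now.timeIntervalSince(updatedAt) > staleAfter
}
```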
5. Source drivers — adding a backend
A source driver teaches Sphynx to read a new kind of storage backend. The framework is a registry: a driver declares its kind and the config it needs, and the core never changes to accommodate it. The contract is deliberately narrow:
- list(): a metadata-only walk that yields one SourceEntry per media file (a key, plus optional container/size hints). It never reads media bytes.
- resolve(): turns one entry's key into a direct, client-fetchable location (a URL, plus any headers and an optional expiry, set only when the link is time-bounded). The client streams it; bytes never pass through the server. The server stores nothing the driver returns; it resolves on demand.
That split is the whole design: only listing differs between backends. resolve just emits a scheme-appropriate URL — https://… for HTTP/WebDAV, smb://… for SMB, ftp://… for FTP.
Driver status
| Kind | resolve (handoff) | list (enumerate) |
|---|---|---|
| http / https | ✅ direct URL + headers | ✅ via a JSON manifest |
| local (local-test only; see note) | ✅ .strm → its URL; else file:// (same-host players only) | ✅ filesystem walk |
| webdav | ✅ https:// + auth header | ✅ recursive PROPFIND over the built-in HTTP client |
| smb | ✅ smb://host/share/key | ✅ via smbclient (per-directory ls walk) |
| ftp | ✅ ftp://host/key | ✅ via curl (per-directory LIST walk) |
All six drivers now both resolve and list. webdav lists natively over the HTTP client; smb and ftp shell out to smbclient / curl (which must be on the server's PATH — listing fails with a clear message if absent, while resolve/playback still work); torbox lists and resolves over the TorBox debrid API. Every driver honours the core rule: the server hands back a location and moves no bytes. Configure them in the web admin's Libraries → Storage sources (one per driver, with a clean connection form) or via POST /v1/admin/sources.
Authoring a driver
- Implement SourceDriver (list() + resolve()), reading non-secret settings from SourceContext.config and credentials from SourceContext.secrets.
- Declare static let registration = DriverRegistration(kind:, requiredConfigKeys:, make:). requiredConfigKeys are validated before make runs, so a misconfigured source fails fast with a clear message.
- Add the registration to DriverFactory.defaultRegistrations. That one list is the only shared edit; there is no central switch.
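To show why a registry needs no central switch, here is a self-contained sketch of the pattern. Every type in it (Driver, DriverContext, Registration, Registry, EchoDriver) is a simplified stand-in invented for this example; the real SourceDriver, SourceContext, and DriverRegistration signatures are not reproduced in this guide and may differ.

```swift
/// Simplified stand-ins for the server's driver types (hypothetical).
struct DriverContext {
    let config: [String: String]
    let secrets: [String: String]
}

protocol Driver {
    func list() -> [String]
    func resolve(_ key: String) -> String
}

struct Registration {
    let kind: String
    let requiredConfigKeys: [String]
    let make: (DriverContext) -> any Driver
}

enum RegistryError: Error, Equatable {
    case unknownKind(String)
    case missingConfig([String])
}

/// Looks drivers up by kind and validates config before constructing one.
struct Registry {
    private var registrations: [String: Registration] = [:]

    init(_ list: [Registration]) {
        for registration in list { registrations[registration.kind] = registration }
    }

    func makeDriver(kind: String, context: DriverContext) throws -> any Driver {
        guard let registration = registrations[kind] else {
            throw RegistryError.unknownKind(kind)
        }
        // Fail fast, before make runs, when required keys are absent.
        let missing = registration.requiredConfigKeys.filter { context.config[$0] == nil }
        guard missing.isEmpty else { throw RegistryError.missingConfig(missing) }
        return registration.make(context)
    }
}

/// A toy driver that declares its own registration.
struct EchoDriver: Driver {
    let baseURL: String
    func list() -> [String] { [] }
    func resolve(_ key: String) -> String { baseURL + "/" + key }

    static let registration = Registration(kind: "echo", requiredConfigKeys: ["baseURL"]) { context in
        EchoDriver(baseURL: context.config["baseURL"] ?? "")
    }
}
```

Adding another backend means writing one more type with its own registration and appending it to the list passed to Registry; the lookup code never changes.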
Resolving — handing off an authenticated location
Because the server never proxies bytes, resolve() hands the client a location it can fetch and authenticate itself. There are three flavours, all expressible in the descriptor:
| Backend | How resolve() authenticates | Descriptor |
|---|---|---|
| S3 / object store, some CDNs | Server mints a time-bounded signed URL with its stored key | url (signed) + ttl; no credentials reach the client |
| HTTP / WebDAV | Server returns the URL + auth headers | url + headers (e.g. Authorization) |
| SMB / FTP | No signed-URL concept → URL with embedded credentials | smb://user:pass@host/share/key; the client's native stack authenticates |
The server is a credential broker, not a byte proxy: it holds each source's secrets and only materialises them into a descriptor — for an authorised user, at play time. list() walks the backend; resolve() describes where + how to authenticate.
Security of the handoff
- Secrets stay server-side. A source's credentials live in its secrets and are never returned by browse or source-list responses; only the non-secret config is shown. (Encryption-at-rest is the remaining hardening.)
- Descriptors are sensitive. They may carry a signed URL or raw credentials, so they're returned only to an authorised caller, served over TLS, resolved fresh per play, and never logged (only method + path are). A driver that embeds credentials in a url must keep descriptors out of logs.
- Prefer signed URLs. S3/CDN signed URLs leak no credentials and expire via ttl, making them the safest handoff.
- SMB/FTP hand the share password to the client (inherent to those protocols). That suits a trusted, single-admin deployment, gated by the user's play permission. For users who shouldn't hold the credential, serve that content from a signed-URL or HTTP backend instead; the server will never proxy as an alternative.
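If a driver author ever needs to mention a resolved location in diagnostics, one defensive option is to strip everything sensitive first. The helper below is a hypothetical sketch using Foundation's URLComponents; it is not part of the reference server, which avoids the problem by logging only method + path.

```swift
import Foundation

/// Returns a log-safe form of a resolved location: embedded credentials and
/// the query string (where signed-URL tokens usually live) are removed.
func redactedForLog(_ location: String) -> String {
    guard var components = URLComponents(string: location) else {
        return "<unparseable location>"
    }
    components.user = nil
    components.password = nil
    components.query = nil
    return components.string ?? "<unparseable location>"
}
```

Dropping the whole query is deliberately blunt: it loses some debugging detail but avoids having to know which parameter names a given backend uses for its signature.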
6. Resolve-time extensions — transcoding & external subtitles
§1–§3 grow an item's metadata. The other extension plane is the resolve descriptor — the per-play handoff (GET /v1/resolve/{id}) carries tracks, candidates, container, and headers alongside the url. These are the seams for changing how a player gets bytes, while the iron rule holds: no bytes ever transit the Sphynx server.
Worked example: a transcoder — without becoming a byte proxy
A transcoder looks like it violates Sphynx's premise (it touches bytes). It doesn't — as long as Sphynx points at a transcoder rather than being one. Run the packager as a separate service (your own ffmpeg/remux endpoint, a CDN's just-in-time packager, or a sidecar) and write a driver — or wrap an existing one — whose resolve() returns the packager's URL instead of the origin's:
// A transcode driver wrapping a backing driver's location.
struct TranscodeDriver: SourceDriver {
    let id = "transcode"
    let backing: any SourceDriver   // the real storage backend
    let profile: String             // from the source's config

    func list() async throws -> [SourceEntry] { try await backing.list() }

    func resolve(_ request: ResolveRequest) async throws -> ResolvedLocation {
        let origin = try await backing.resolve(request)   // where the real file lives
        let packaged = packagerURL(for: origin.url, profile: profile)
        return ResolvedLocation(url: packaged, headers: [:], container: "m3u8",
                                ttl: 3600, terminal: true, candidates: nil)   // signed, expiring HLS
    }
}
The client streams HLS/DASH straight from the packager: bytes flow client ← packager, never through Sphynx. The descriptor already carries what a player needs — container (the rendition's format), ttl (a signed/expiring packager link), and tracks, where copyableAudio is precisely the "this audio can be copied, not transcoded" hint and preferredAudio/preferredSubtitle select renditions source-relative.
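The driver above calls a packagerURL(for:profile:) helper that the guide does not define. One possible shape is sketched below, assuming the packager accepts the origin location as a query parameter. The host packager.example, the path, and the parameter names src and profile are placeholders, and the signing a real packager link would need is omitted.

```swift
import Foundation

/// Hypothetical helper: wraps an origin URL in a packager request.
/// Host, path, and query parameter names are placeholders, not a real packager API.
/// Signing the link (so it can expire via ttl) is intentionally left out.
func packagerURL(for origin: String, profile: String) -> String? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "packager.example"
    components.path = "/hls/master.m3u8"
    // URLComponents percent-encodes the values where the query syntax requires it.
    components.queryItems = [
        URLQueryItem(name: "src", value: origin),
        URLQueryItem(name: "profile", value: profile),
    ]
    return components.string
}

let packaged = packagerURL(for: "https://cdn.example/movies/a.mkv", profile: "1080p-hls")
print(packaged ?? "could not build packager URL")
```

The helper returns an optional so that a malformed input surfaces as a resolve failure. Falling back to the origin URL would silently skip the transcode.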
Ranked renditions ride candidates. Because the reference server advertises capabilities.candidates and already emits candidates (built from a title's other versions), resolve() can return the direct original first and a transcoded fallback second ([{ url, headers, priority }]) — a client attempts direct play, then drops to the transcoded variant only if its player can't handle the source. A transcode driver slots its packaged URL into that same ranked list.
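On the client, walking that ranked list could look like the sketch below. The Candidate struct mirrors the { url, headers, priority } shape. Treating a lower priority value as "try first" is an assumption of this sketch, and the canPlay closure stands in for whatever capability check a real player exposes.

```swift
/// Mirrors the { url, headers, priority } candidate shape.
struct Candidate {
    let url: String
    let headers: [String: String]
    let priority: Int   // assumption: lower value = try first
}

/// Returns the highest-ranked candidate the player reports it can handle,
/// or nil when none qualifies.
func pickCandidate(_ candidates: [Candidate],
                   canPlay: (Candidate) -> Bool) -> Candidate? {
    candidates
        .sorted { $0.priority < $1.priority }
        .first(where: canPlay)
}

let ranked = [
    Candidate(url: "https://packager.example/hls/master.m3u8", headers: [:], priority: 1),
    Candidate(url: "https://cdn.example/movies/a.mkv", headers: [:], priority: 0),
]

// Stand-in capability check: this pretend player cannot open .mkv files,
// so the direct original is skipped and the transcoded fallback is chosen.
let chosen = pickCandidate(ranked) { !$0.url.hasSuffix(".mkv") }
print(chosen?.url ?? "none")
```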
Worked example: a subtitle fetcher
Subtitles come in two shapes, and Sphynx already has a home for each:
- Embedded (a subtitle stream inside the media file) — no fetch needed; resolve() points the player at it with tracks.preferredSubtitle (a source-relative index).
- External sidecar (fetched from OpenSubtitles, a studio feed, …) — a server-side self-write extension (the §3 pattern). The fetcher runs server-side and stores the sidecar URLs on the item's open extra bag; the client side-loads them. Because only the server can call the subtitle source, the server owns the refresh (the §4 server-owned freshness rule).
// Self-write: a server extension attaches fetched sidecar subtitles.
var item = try await catalog.item(id: itemId)!
item.extra["subtitles"] = [
    ["lang": "en", "format": "vtt", "url": "https://subs.example/abc.en.vtt", "forced": false],
    ["lang": "es", "format": "vtt", "url": "https://subs.example/abc.es.vtt"]
]
try await catalog.updateItem(item)   // shared with every client; client renders extra.subtitles
If sidecar subtitles become common enough to standardise, promote them from extra.subtitles to a canonical optional field via the §3 "new server-written field, end to end" recipe, so every client models them identically; until then extra ships them with no protocol change. And if a subtitle source is client-only (its terms forbid server-side calls — like TheIntroDB for markers), invert it: fetch in the client and contribute, exactly as in §2.
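A client that wants to side-load those sidecars could decode extra.subtitles along these lines. This is a sketch under assumptions: the type names SidecarSubtitle and ItemExtra are invented here, and the JSON simply repeats the shape written by the server-side example above. forced is modeled as optional because the second entry omits it.

```swift
import Foundation

/// One sidecar entry as stored under extra.subtitles (hypothetical client model).
struct SidecarSubtitle: Decodable {
    let lang: String
    let format: String
    let url: String
    let forced: Bool?   // optional: not every entry carries it
}

/// Only the part of the open extra bag this sketch cares about.
struct ItemExtra: Decodable {
    let subtitles: [SidecarSubtitle]?
}

let extraJSON = """
{ "subtitles": [
  {"lang":"en","format":"vtt","url":"https://subs.example/abc.en.vtt","forced":false},
  {"lang":"es","format":"vtt","url":"https://subs.example/abc.es.vtt"}
] }
"""

/// Picks the first non-forced sidecar in the requested language, if any.
func sidecar(for language: String, in extra: ItemExtra) -> SidecarSubtitle? {
    extra.subtitles?.first { $0.lang == language && $0.forced != true }
}

// try? keeps the sketch tolerant: a server without this extension yields nil.
let extra = try? JSONDecoder().decode(ItemExtra.self, from: Data(extraJSON.utf8))
let spanish = extra.flatMap { sidecar(for: "es", in: $0) }
print(spanish?.url ?? "no sidecar")
```

Because extra is an open bag, a client should treat a missing or malformed subtitles key as "no sidecars" instead of failing the whole item.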
7. Client-side extensions — co-watch / SharePlay
Everything above lives on the server. Some features are purely a client concern, and Sphynx is deliberately thin beneath them. A synchronised watch party — Apple's SharePlay, or a cross-platform clone — is the clearest example.
What the protocol already gives you — the shared addressing layer that makes co-watch possible at all:
- A stable item id that means the same title to every participant on a server, so a party agrees on "what are we watching" by passing one id.
- Independent resolve — each participant calls GET /v1/resolve/{id} for itself at play time and gets an equivalent direct location (its own authenticated handoff; no one shares another's credentials). Everyone streams the same content.
- Per-user playstate — each viewer's own resume position, already folded into browse.
The real-time channel it does give you: the additive event stream (GET /v1/events, SSE) is a server→client push channel — playstate, per-user state, markers, and library nudges, access-filtered per subscriber. It's deliberately one-way and not sub-second: ideal for now-playing / presence ("what's everyone in this house watching"), keeping a shared Continue Watching rail live across a user's devices, and reacting to library changes — but not for the tight play / pause / seek lockstep of a watch party. That sync stays out of band, owned by the client:
There are two ways to implement it, and they work very differently. Pick by platform; the only thing they share is Sphynx addressing (same itemId, each participant resolves for itself).
Approach A — Apple platforms: ride SharePlay (Apple syncs the playhead; you write no sync code)
On Apple platforms the playhead lockstep is not yours to build — AVPlaybackCoordinator owns it over the FaceTime real-time channel. Sphynx never sees the session.
- Define a GroupActivity whose payload is the Sphynx { serverId, itemId } (plus title/artwork for the SharePlay UI). Activating it from a FaceTime call or AirDrop starts the session.
- On each participant's device the app authenticates to that server and calls GET /v1/resolve/{itemId} for itself — its own credentials, its own direct location. Nobody shares another's resolve, and bytes flow participant→CDN as usual.
- Bind the player to the session: player.playbackCoordinator.coordinateWithSession(groupSession). Apple now keeps play / pause / seek / rate in lockstep automatically.
- Optional companion UI (a "now playing" badge, a synced Continue Watching rail) can still ride /v1/events — but the playhead rides SharePlay, not the event stream.
Net: you implement no clock and no sync loop; the server only answers each resolve. On Apple platforms, this is the path to take.
Approach B — Cross-platform party: you own the sync (do not stream position)
Off Apple, there's no coordinator, so you build it — but the trick to "instant, no delay" is that you do not relay a position tuple continuously (that's laggy and jittery, and clients fight each other). You sync on sparse events + a shared clock and let each client predict the timeline locally. This is what AVPlaybackCoordinator does internally — you're re-creating it.
- Shared addressing — same as Approach A: everyone agrees on { serverId, itemId } and each client resolves independently. Bytes still go peer→CDN, never through your sync channel.
- Shared clock (this is what kills the delay) — each client estimates its offset to a common reference with an NTP-style ping (a few round-trips over your channel), so everyone agrees on "session time".
- Anchored transport events — when a participant controls playback, broadcast one event, not a stream: { action: play|pause|seek|rate, position, atSessionTime, rate }. Each client schedules it for atSessionTime on the shared clock, so a pause lands on the same wall-clock instant everywhere regardless of who has 20 ms vs 120 ms of latency.
- Local prediction + drift correction — between events each client computes expected = anchorPos + (sessionNow − anchorTime)·rate and nudges itself back onto the line (a micro-seek if far off, or a tiny ±2% rate tweak to glide in).
- State on join — a late joiner pulls the current anchor so it starts already in sync.
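The shared clock and the drift-correction step can be sketched in a few lines. The offset formula is the standard NTP-style estimate from one round-trip, and the prediction is the expected-position formula from the list above. The type names and the tolerance and seek thresholds are illustrative assumptions; the guide only fixes the ±2% rate tweak.

```swift
/// NTP-style offset estimate (reference clock minus local clock) from one ping.
/// t0: local send time, t1: reference receive time,
/// t2: reference reply time, t3: local receive time (all in seconds).
func clockOffset(t0: Double, t1: Double, t2: Double, t3: Double) -> Double {
    ((t1 - t0) + (t2 - t3)) / 2
}

/// The last transport event, as every client remembers it.
struct Anchor {
    let position: Double      // media position at the anchor, in seconds
    let sessionTime: Double   // shared-clock time at which the anchor took effect
    let rate: Double          // 0 while paused, 1 for normal playback
}

/// expected = anchorPos + (sessionNow - anchorTime) * rate
func expectedPosition(_ anchor: Anchor, sessionNow: Double) -> Double {
    anchor.position + (sessionNow - anchor.sessionTime) * anchor.rate
}

enum Correction: Equatable {
    case none
    case rate(Double)   // temporary playback rate to glide back onto the line
    case seek(Double)   // jump straight to the expected position
}

/// Thresholds are illustrative assumptions, not values from the guide.
func correction(actual: Double, expected: Double, baseRate: Double,
                tolerance: Double = 0.04, seekThreshold: Double = 1.0) -> Correction {
    let drift = actual - expected
    if abs(drift) <= tolerance { return .none }
    // Far off, or paused (a rate tweak cannot move a paused player): micro-seek.
    if abs(drift) >= seekThreshold || baseRate == 0 { return .seek(expected) }
    // Behind the line: speed up by 2%. Ahead of it: slow down by 2%.
    return .rate(baseRate * (drift < 0 ? 1.02 : 0.98))
}

let anchor = Anchor(position: 600, sessionTime: 100, rate: 1)
let expected = expectedPosition(anchor, sessionNow: 112.5)
print(expected, correction(actual: 612.2, expected: expected, baseRate: anchor.rate))
```

In practice a client would take the offset from the ping with the smallest round-trip time out of several, since that sample is least distorted by queueing delay.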
Carry those events over a transport you own — a small WebSocket relay or a WebRTC data channel between participants. The events are sparse (one per user action), so the channel is cheap. Realistic result: sub-100 ms perceptual sync; literally zero is impossible over the open internet (SharePlay gets tighter only by piggybacking FaceTime's real-time channel).
Why this split is right. Co-watch playhead sync is latency-sensitive and platform-specific, and participants already stream bytes peer-to-CDN, not through Sphynx — so the tight loop belongs on a peer transport (SharePlay / WebRTC), not a server fan-out. The event stream covers the rest: the liveness a server is well-placed to broadcast (state changed, library changed) without pinning clients to one transport or becoming a byte path.
From stream to full parties (sketch, not on the roadmap). The server→client half now ships as /v1/events. A full server-mediated party (a shared queue + an upstream "broadcast my position to the group" channel for clients with no peer transport) would add the missing client→server direction — naturally an advertised capabilities.sessions exposing a session object (create / join / broadcast), gated by the same per-user permissions. That upstream half is not implemented; it's noted here only as where it would attach without disturbing the rest of the wire.
Roadmap & Coming Soon
Built spine-first. What's working today is documented above; what's flagged here isn't ready yet. For a version-by-version history of what landed in each release, see the changelog (current release: v0.2.5).
Working today (live)
- Discovery & auth — GET /v1/info; password login with short access tokens + rotating refresh tokens; device-scoped sessions.
- Browse — libraries and items (skeleton/full), cursor pagination, single-item detail.
- Identity & enrichment — movies and TV identified against TMDB; series → season → episode tree with posters, season art, episode stills/titles. A broad canonical field set (tagline, studios, directors/writers, externalIds, premiereDate, …) so clients have what they display.
- Admin catalog management — create / list / update / delete libraries, sources, and items, with cascade. One source can map a Movies and a TV library, routing a mixed folder by type from a single scan. A built-in web admin page (/admin) drives settings, libraries, sources, and users from the browser, with a live activity dashboard (items being parsed/enriched), a read-only database browser, and a diagnostics log.
- Manual edits — admin PATCH of item metadata with per-field locks that survive every scan and refresh.
- Resolve — direct playback location + headers, resolved fresh per play and never stored (never proxied; carries an optional expiry only for time-bounded source links).
- Playstate & per-user state — start/progress/stop with failed-stop protection; watched / favorite / play-count / last-played; browse sort (name/added/rating) + genre/unwatched filter.
- Typed home feed — GET /v1/home returns ordered shelves, each tagged with a kind and a tile aspect (portrait/landscape) so the client knows how to lay a row out. Continue Watching is unified: the next unwatched episode of a show you've started is merged into it alongside in-progress items — one row, never a separate "Next Up". Plus Recently Added and Favorites.
- Incremental sync — GET /v1/changes?since= returns changed items + deletion tombstones so a client syncs deltas instead of re-listing; plus a DELETE /v1/playstate/{id} clear-resume action, an advertised refresh-token lifetime (refreshExpiresIn), and a Retry-After / error.retryAfter backoff hint on rate-limited errors.
- Live updates — an additive server→client event stream (GET /v1/events, SSE) for playstate, per-user state, markers, and library changes, access-filtered per subscriber; clients opt in via capabilities.events and otherwise poll.
- Bi-directional metadata — server-configurable per-field read/write access; contributable recap / intro / credits / preview markers (extensible to any segment type); an open extra bag.
- Extras, collections & people — bonus content (trailers, featurettes, deleted scenes, behind-the-scenes) classified as its own item types and nested under its movie or show via a generic parentId; collections / box sets created from TMDB and browsed via items?parent=; a person filmography endpoint (GET /v1/people/{id}/items, newest-first across movies and TV). Plus artwork & metadata fills: logo / banner, trailers, tags, sortTitle.
- Source drivers — local (test-only), http (manifest), webdav (PROPFIND), smb (smbclient), ftp (curl), and torbox (debrid API) all resolve and list. Configured in the web admin's Libraries → Storage sources, one clean connection form per driver.
- Passwordless & TV sign-in — passkeys (WebAuthn) for biometric login, and a QR / device-code flow so a TV pairs by approving on your phone (where a passkey makes it "scan → Face ID → done").
- Multiple versions & editions — one title backed by several files (4K + 1080p, Director's Cut + Theatrical) collapses into a single item with a best-first versions picker, played via resolve?version=; a title's other versions are also offered as ranked fallback candidates.
- Richer per-user state — per-user ratings (userRating), plus completion semantics: marking watched clears resume, a stop in the last 5% auto-completes, and a stop in the first 5% un-watches. A typed browse contract (capabilities.browse sort/filter + totalCount / pageSize).
- Music & audiobooks (protocol) — artist→album→track and audiobook→chapter, with lossless/hi-res stream descriptors, are modeled in the wire format for other servers to implement (the reference server is film/TV only).
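The completion semantics listed above (a stop in the last 5% auto-completes, a stop in the first 5% un-watches) reduce to a small decision function. The sketch below is one reading of that rule, not the reference server's code: whether the 5% boundaries are inclusive, and what happens when the duration is unknown, are assumptions.

```swift
enum StopOutcome: Equatable {
    case unwatched            // stop landed in the first 5%
    case resume(at: Double)   // keep a resume point at this position
    case watched              // stop landed in the last 5%; resume is cleared
}

/// Classifies a stop event by how far into the item it happened.
/// Positions and durations are in seconds.
func outcome(stoppedAt position: Double, duration: Double) -> StopOutcome {
    // Assumption: with no usable duration, just keep the resume point.
    guard duration > 0 else { return .resume(at: position) }
    let fraction = position / duration
    if fraction >= 0.95 { return .watched }     // boundary inclusivity is assumed
    if fraction <= 0.05 { return .unwatched }
    return .resume(at: position)
}

print(outcome(stoppedAt: 5900, duration: 6000))   // near the end of a 100-minute film
print(outcome(stoppedAt: 3000, duration: 6000))   // halfway through
```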
Coming soon (planned)
Everything above is live today — including the published Docker image, passkeys, QR/device sign-in, multiple versions/editions, and the typed browse contract (see the changelog for what landed when). The work below is the remaining roadmap — small wire-contract and client-facing additions, ordered by priority.
Protocol / client
- Marker editing — a DELETE for a wrong marker and per-segment provenance/confidence (today contributions are PUT-only and whole-set).
- Metadata-language negotiation (Accept-Language) for multi-language libraries, on top of the server's single configured language.
Server / breadth
- A music / audiobook server — the protocol already models audio (artist/album/track, audiobook/chapter, lossless streams); an actual identify/enrich backend (e.g. MusicBrainz) would let the reference server serve it, not just describe it.
- More source drivers — additional backends (e.g. S3) beyond the local / http / webdav / smb / ftp / torbox set.
- A built-in critic rating — the criticRating field is ready; populating it just needs a critic source. The guide shows how to wire one (e.g. OMDb) as an opt-in extension.
FAQ & Troubleshooting
The server won't build — code-signing or .build errors on macOS.
You're almost certainly building inside an iCloud-synced folder (~/Desktop, ~/Documents). Move the checkout somewhere unsynced, or symlink .build off the synced tree. See the prerequisites warning.
Docker build fails with "Resource deadlock avoided" (EDEADLK).
Docker Desktop for Mac + bind-mounted source. Use docker build / docker compose (which stream a tar), not a mounted volume. The provided Dockerfile already does this. See Docker / Linux.
swift build can't find sphynx-protocol.
The server resolves it via the local path ../sphynx-protocol. Build from inside sphynx-server/ within a full monorepo checkout — don't move the two package folders apart.
Enrichment isn't happening — posters/overviews are missing.
TMDB isn't configured. Set SPHYNX_TMDB_API_KEY to a TMDB v3 key and re-scan (or call POST /v1/admin/enrich). Without it the server runs fine but every item stays a skeleton, and the enrichment endpoints return 400.
I changed SPHYNX_ADMIN_PASSWORD but login still uses the old one.
Admin bootstrap only applies when creating a fresh database. If data/sphynx.sqlite already exists, the account is already there. Delete the DB (or point SPHYNX_DB_PATH at a new file) to re-bootstrap, or change the password through normal account management.
Resolve returns 404 no_media_source.
The item exists but its source is unavailable (e.g. the manifest URL is unreachable, or a driver's list() isn't implemented yet). Check the source config and that the underlying URL is reachable.
A client got a 409 conflict on PUT …/markers.
Authoritative markers (server-detected or admin-pinned) already exist; a best-effort client contribution can't clobber them. Only an admin can override. This is intended precedence, not an error to retry.
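A client can encode the "don't retry a 409" rule next to the rate-limit backoff hint in one place. This is a sketch under assumptions: the guide does not state which status code rate-limited errors use, so 429 is assumed here, and the one-second default is arbitrary.

```swift
enum RetryDecision: Equatable {
    case retry(afterSeconds: Double)
    case giveUp
}

/// Decides whether a failed request is worth repeating.
/// retryAfter is the server's backoff hint (Retry-After / error.retryAfter), if present.
func retryDecision(status: Int, retryAfter: Double?) -> RetryDecision {
    switch status {
    case 409:
        // Authoritative markers already exist; repeating the PUT cannot succeed.
        return .giveUp
    case 429:
        // Assumed rate-limit status. Honor the hint, else wait a default second.
        return .retry(afterSeconds: retryAfter ?? 1)
    default:
        return .giveUp
    }
}

print(retryDecision(status: 409, retryAfter: nil))
print(retryDecision(status: 429, retryAfter: 30))
```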