For engineers & curious producers
Gloss is designed around typical AI-generator artifacts. It is a cleanup-first finishing tool; it does not guarantee that every musical or technical problem will be corrected in one pass.
The main site keeps things simple. This page matches the technical reference in docs/DSP_ARCHITECTURE.md and docs/USER_GUIDE.md — condensed for the web.
A deeper-control version of Gloss for more technical cleanup and finishing workflows.
Gloss is a spectral processor aimed at AI-generated music. A small neural network estimates how synthetic the signal sounds; that score scales three STFT-domain stages — Polish (Fix), Stereo Fix (Space), and Air — followed by an optional multiband mastering chain (Automaster).
All spectral work runs in mid/side for stereo-aware treatment. Processing uses 4× oversampling around the spectral engine to reduce aliasing from magnitude changes.
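As a rough illustration of the mid/side framing mentioned above, the sketch below converts a left/right sample pair to mid/side and back. The function names and the 0.5 scaling convention are assumptions for this example; the page does not say which convention Gloss uses.

```python
def ms_encode(left, right):
    """Convert one L/R sample pair to mid/side.

    The 0.5 scaling is one common convention, assumed here for illustration.
    """
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: recover the L/R pair from mid/side."""
    return mid + side, mid - side


# A mono signal (identical L and R) carries no side energy.
mono_mid, mono_side = ms_encode(0.5, 0.5)

# Encoding followed by decoding returns the original pair.
left, right = ms_decode(*ms_encode(0.25, -0.5))
```

Working in mid/side means a stage can treat the centre image and the stereo difference separately, which is what the stereo stages described below rely on.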
RTNeural runs on a 64-dimensional feature vector derived from mel bands, spectral flux bands, shape statistics, and L/R coherence bands — roughly every 10 ms at 48 kHz (every 4th STFT frame). The smoothed score modulates thresholds and depth for Polish, Stereo Fix, and Air so AI-heavy material gets firmer correction and natural recordings stay lighter.
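One way to picture how a smoothed score could drive stage depth is sketched below. This is not the Gloss implementation: the one-pole smoother, the 100 ms time constant, the linear mapping, and all function names are assumptions chosen for illustration. Only the roughly 10 ms update interval comes from the text above.

```python
import math


def smoothing_coeff(time_constant_s, update_interval_s):
    """One-pole smoothing coefficient for a given update interval."""
    return math.exp(-update_interval_s / time_constant_s)


def smooth_score(previous, raw, coeff):
    """Move the smoothed score a fraction of the way toward the raw score."""
    return coeff * previous + (1.0 - coeff) * raw


def scaled_depth(score, light_depth, firm_depth):
    """Interpolate depth linearly between a light and a firm setting.

    The score is clamped to [0, 1]; 0 means natural, 1 means strongly synthetic.
    """
    score = min(1.0, max(0.0, score))
    return light_depth + score * (firm_depth - light_depth)


# Hypothetical values: 100 ms time constant, score updated every 10 ms.
coeff = smoothing_coeff(0.100, 0.010)
score = 0.0
for _ in range(1000):  # the raw detector output stays at 1.0
    score = smooth_score(score, 1.0, coeff)
depth = scaled_depth(score, light_depth=0.2, firm_depth=0.8)
```

Smoothing the score before it touches any threshold keeps the processing from changing abruptly between updates.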
Polish uses dual-threshold peak detection (spatial vs temporal neighbours) with a 12-point frequency weighting curve that peaks around 3.5 kHz to target harshness. Soft-knee downward compression tames peaks, and upward “spectral fill” lifts dips. Transient detection reduces cuts in the attack region so drums and plucks stay punchy.
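The soft-knee downward compression mentioned above can be sketched with the standard textbook gain computer. The threshold, ratio, and knee values here are hypothetical, and this is a generic formulation rather than the one Gloss ships.

```python
def soft_knee_gain_db(level_db, threshold_db, ratio, knee_db):
    """Gain change in dB for a downward compressor with a quadratic soft knee.

    Below the knee the gain is 0 dB, above it the full ratio applies,
    and inside the knee the two regions are joined smoothly.
    """
    over = level_db - threshold_db
    slope = 1.0 / ratio - 1.0
    if knee_db > 0.0 and abs(over) <= knee_db / 2.0:
        # Inside the knee: quadratic transition between 0 dB and full ratio.
        return slope * (over + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    if over <= 0.0:
        # Below the knee: leave the bin untouched.
        return 0.0
    # Above the knee: ordinary ratio-based reduction.
    return slope * over


# Hypothetical settings: -20 dB threshold, 4:1 ratio, 6 dB knee.
quiet = soft_knee_gain_db(-30.0, -20.0, 4.0, 6.0)   # no reduction
loud = soft_knee_gain_db(-10.0, -20.0, 4.0, 6.0)    # full-ratio reduction
```

In a spectral processor a function like this would be evaluated per bin against a local threshold, so only bins that stand out get reduced.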
Sub-bass mono: side collapsed below 80 Hz, blended through ~150 Hz. HF phase alignment: where L/R coherence is poor above ~1.5 kHz, the side vector is corrected toward a coherent image. Widening: where coherence is already good, gentle side boost in the 4–13 kHz range.
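The sub-bass mono behaviour can be pictured as a per-frequency gain applied to the side channel. The sketch below assumes a linear blend between 80 Hz and 150 Hz; the actual blend shape is not stated here, and the function name is made up for this example.

```python
def sub_bass_side_gain(freq_hz, mono_below_hz=80.0, full_side_hz=150.0):
    """Side-channel gain for a bin centred at freq_hz.

    0.0 collapses the bin to mono, 1.0 leaves the side signal untouched.
    The linear ramp between the two corner frequencies is an assumption.
    """
    if freq_hz <= mono_below_hz:
        return 0.0
    if freq_hz >= full_side_hz:
        return 1.0
    return (freq_hz - mono_below_hz) / (full_side_hz - mono_below_hz)


# Fully mono at 40 Hz, half side level at 115 Hz, untouched at 200 Hz.
gains = [sub_bass_side_gain(f) for f in (40.0, 115.0, 200.0)]
```

Multiplying each side bin by this gain removes low-frequency stereo content without a hard step at the corner frequency.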
Air applies a Maag-style shelf from 2.5 kHz upward with adaptive gain: more boost when HF energy is lacking, less when the spectrum is already bright. The stage also includes per-bin dip/peak shaping.
Automaster splits the signal into three bands with fourth-order Linkwitz-Riley (LR4) crossovers at 200 Hz and 2.5 kHz, applies RMS compression per band with linked stereo detection, uses feed-forward LUFS targeting (Spotify −14, Apple Music −16, YouTube/SoundCloud −14, CD −9), and finishes with a 2 ms lookahead brickwall limiter with Lagrange true-peak detection.
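Feed-forward loudness targeting comes down to a simple difference: measure the programme loudness, then apply the gain that moves it to the target. The sketch below uses the targets listed above; the dictionary keys and function names are hypothetical, and it assumes a loudness measurement is already available.

```python
# Targets in LUFS, taken from the list above. Key names are illustrative.
TARGET_LUFS = {
    "spotify": -14.0,
    "apple_music": -16.0,
    "youtube": -14.0,
    "soundcloud": -14.0,
    "cd": -9.0,
}


def makeup_gain_db(measured_lufs, platform):
    """Gain in dB that moves the measured loudness onto the platform target."""
    return TARGET_LUFS[platform] - measured_lufs


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


# A mix measured at -18 LUFS needs +4 dB to reach the Spotify target.
gain_db = makeup_gain_db(-18.0, "spotify")
gain_linear = db_to_linear(gain_db)
```

Because the gain is computed ahead of the limiter, any peaks the boost creates are still caught by the brickwall stage that follows.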
Total reported latency is about 12.7 ms at 48 kHz (~610 samples): STFT frame at the oversampled rate, oversampler delay, and limiter lookahead. Buffers are pre-allocated — no heap allocation on the audio thread.
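The latency figures above can be checked with a little arithmetic. The helper names are made up for this example; the numbers are the ones quoted on this page.

```python
def samples_to_ms(samples, sample_rate_hz):
    """Convert a latency in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


def ms_to_samples(milliseconds, sample_rate_hz):
    """Convert a duration in milliseconds to the nearest whole sample count."""
    return round(milliseconds * sample_rate_hz / 1000.0)


SAMPLE_RATE_HZ = 48000

# About 610 samples at 48 kHz is roughly 12.7 ms.
total_ms = samples_to_ms(610, SAMPLE_RATE_HZ)

# The 2 ms limiter lookahead accounts for 96 of those samples.
lookahead_samples = ms_to_samples(2.0, SAMPLE_RATE_HZ)

# The remainder is shared by the STFT frame and the oversampler delay.
remaining_samples = 610 - lookahead_samples
```

On these figures the limiter lookahead is a small share of the total, with most of the delay coming from the spectral engine and oversampling.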