For engineers & curious producers

How Gloss works under the hood

Gloss is designed around typical AI-generator artifacts. It is a cleanup-first finishing tool and does not guarantee to correct every musical or technical problem in one pass.

The main site keeps things simple. This page matches the technical reference in docs/DSP_ARCHITECTURE.md and docs/USER_GUIDE.md — condensed for the web.

Coming soon

Gloss Pro

A deeper-control version of Gloss for more technical cleanup and finishing workflows.

Overview

Gloss is a spectral processor aimed at AI-generated music. A small neural network estimates how synthetic the signal sounds, and that score scales three STFT-domain stages: Polish (Fix), Stereo Fix (Space), and Air. An optional multiband mastering chain (Automaster) follows.

All spectral work runs in mid/side for stereo-aware treatment. Processing uses 4× oversampling around the spectral engine to reduce aliasing from magnitude changes.
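As a point of reference, mid/side is just a sum-and-difference transform of the left and right channels. The sketch below uses the common 0.5 scaling so that decoding is a plain add and subtract. The function names are illustrative, and the scaling Gloss uses internally is not stated here.

```python
def ms_encode(left: float, right: float) -> tuple[float, float]:
    """Convert one L/R sample pair to mid/side (0.5 scaling, an assumption)."""
    mid = 0.5 * (left + right)    # content common to both channels
    side = 0.5 * (left - right)   # content that differs between channels
    return mid, side


def ms_decode(mid: float, side: float) -> tuple[float, float]:
    """Invert ms_encode: recover the L/R sample pair."""
    return mid + side, mid - side


if __name__ == "__main__":
    m, s = ms_encode(0.25, -0.75)
    print(m, s, ms_decode(m, s))
```

A mono signal (left equal to right) encodes to a side value of zero, which is why side-only processing leaves the centre image untouched.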

Signal flow (simplified)

Input (stereo or mono → duplicated to stereo).
4× upsample → PolishEngine (STFT in / STFT out) → 4× downsample.
Dry/wet mix with latency-aligned dry path.
Optional Automaster: HPF / low shelf → 3-band Linkwitz-Riley crossover → per-band compression → feed-forward auto-level (K-weighted LUFS) → true-peak lookahead limiter.
Output metering; optional TPDF dither on 16-bit paths.
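The latency-aligned dry/wet step in the flow above can be sketched as a fixed delay on the dry signal followed by a linear crossfade. This is a minimal per-sample illustration, not the plugin's code: the class name, the linear mix law, and the per-sample interface are all assumptions.

```python
from collections import deque


class DelayedDryMixer:
    """Delay the dry path by a fixed sample count, then crossfade with wet."""

    def __init__(self, latency_samples: int, mix: float) -> None:
        self.mix = mix  # 0.0 = fully dry, 1.0 = fully wet
        # Pre-filled with silence so the first outputs are delayed correctly.
        self._delay = deque([0.0] * latency_samples)

    def process(self, dry: float, wet: float) -> float:
        self._delay.append(dry)
        delayed_dry = self._delay.popleft()
        return (1.0 - self.mix) * delayed_dry + self.mix * wet


if __name__ == "__main__":
    mixer = DelayedDryMixer(latency_samples=2, mix=0.0)
    print([mixer.process(x, 0.0) for x in (1.0, 2.0, 3.0, 4.0)])
```

Without the delay, the dry and processed signals would be offset in time and the blend would comb-filter.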

Artifact detection

The detector network runs through RTNeural on a 64-dimensional feature vector derived from mel bands, spectral flux bands, shape statistics, and L/R coherence bands. It updates roughly every 10 ms at 48 kHz (every 4th STFT frame). The smoothed score modulates thresholds and depth for Polish, Stereo Fix, and Air, so AI-heavy material gets firmer correction and natural recordings get a lighter touch.
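One way to picture score smoothing and modulation is a one-pole smoother updated at the detector rate, with its output interpolating a stage parameter between a light and a firm setting. The 100 ms time constant, the linear mapping, and all names below are assumptions for illustration. Only the roughly 10 ms update interval comes from this page.

```python
import math


class ScoreSmoother:
    """One-pole low-pass over the raw detector score (hypothetical design)."""

    def __init__(self, update_interval_s: float = 0.010,
                 time_constant_s: float = 0.100) -> None:
        # Standard one-pole coefficient for a given time constant.
        self.coeff = math.exp(-update_interval_s / time_constant_s)
        self.value = 0.0

    def update(self, raw_score: float) -> float:
        raw_score = min(1.0, max(0.0, raw_score))  # keep the score in [0, 1]
        self.value = self.coeff * self.value + (1.0 - self.coeff) * raw_score
        return self.value


def scaled_depth(score: float, light_depth: float, firm_depth: float) -> float:
    """Linearly interpolate a stage depth from the smoothed score."""
    return light_depth + (firm_depth - light_depth) * score


if __name__ == "__main__":
    smoother = ScoreSmoother()
    for _ in range(50):
        s = smoother.update(1.0)
    print(s, scaled_depth(s, light_depth=0.2, firm_depth=1.0))
```

Smoothing keeps the stages from pumping when the raw score flickers between frames.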

Polish — Fix knob

Polish uses dual-threshold peak detection (spatial vs temporal neighbours) with a 12-point frequency weighting curve that peaks around 3.5 kHz to target harshness. Detected peaks get soft-knee downward compression, and dips get upward “spectral fill”. Transient detection reduces cuts in the attack region so drums and plucks stay punchy.
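For readers unfamiliar with soft-knee downward compression, the sketch below shows the widely used quadratic-knee gain computer, working in dB. It is the textbook curve, not necessarily the one Gloss applies per bin, and the threshold, ratio, and knee values are placeholders.

```python
def soft_knee_gain_db(level_db: float, threshold_db: float,
                      ratio: float, knee_db: float) -> float:
    """Return the gain change in dB (zero or negative) for a given level."""
    overshoot = level_db - threshold_db
    if 2.0 * overshoot < -knee_db:
        return 0.0  # below the knee: no gain reduction
    if knee_db > 0.0 and 2.0 * abs(overshoot) <= knee_db:
        # Inside the knee: quadratic blend between 1:1 and the full ratio.
        return (1.0 / ratio - 1.0) * (overshoot + knee_db / 2.0) ** 2 / (2.0 * knee_db)
    # Above the knee: straight-line compression at the full ratio.
    return (1.0 / ratio - 1.0) * overshoot


if __name__ == "__main__":
    for level in (-40.0, -20.0, -17.0, 0.0):
        print(level, soft_knee_gain_db(level, threshold_db=-20.0, ratio=4.0, knee_db=6.0))
```

The knee makes the gain curve continuous at both edges, which avoids an audible switch-on as a peak crosses the threshold.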

Stereo Fix — Space knob

Sub-bass mono: side collapsed below 80 Hz, blended through ~150 Hz. HF phase alignment: where L/R coherence is poor above ~1.5 kHz, the side vector is corrected toward a coherent image. Widening: where coherence is already good, gentle side boost in the 4–13 kHz range.
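The sub-bass mono rule can be expressed as a per-bin gain on the side channel: zero below 80 Hz, rising to unity by about 150 Hz. The linear ramp and the function name below are assumptions, since the actual blend shape is not specified here.

```python
def side_gain_for_bin(freq_hz: float, mono_below_hz: float = 80.0,
                      full_side_above_hz: float = 150.0) -> float:
    """Gain applied to the side channel at a given bin frequency."""
    if freq_hz <= mono_below_hz:
        return 0.0  # side removed: this bin is fully mono
    if freq_hz >= full_side_above_hz:
        return 1.0  # side untouched
    # Linear blend between the two corner frequencies (an assumption).
    return (freq_hz - mono_below_hz) / (full_side_above_hz - mono_below_hz)


if __name__ == "__main__":
    for f in (40.0, 80.0, 115.0, 150.0, 1000.0):
        print(f, side_gain_for_bin(f))
```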

Air

A Maag-style shelf from 2.5 kHz upward with adaptive gain: more boost when HF energy is lacking, less when the spectrum is already bright. The stage also applies per-bin dip/peak shaping.
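The adaptive-gain idea can be illustrated with a toy rule that scales the shelf boost by how far the HF energy share falls short of a "bright enough" reference. Everything here is hypothetical: the 6 dB ceiling, the 0.25 reference ratio, and the linear mapping are placeholders, not Gloss's tuning.

```python
def adaptive_air_gain_db(hf_energy: float, total_energy: float,
                         max_boost_db: float = 6.0,
                         bright_ratio: float = 0.25) -> float:
    """Shelf boost in dB: larger when the HF share of energy is small."""
    if total_energy <= 0.0:
        return 0.0  # silence: nothing to boost
    hf_share = hf_energy / total_energy
    # 1.0 when there is no HF energy, 0.0 once the share reaches bright_ratio.
    deficit = 1.0 - min(1.0, hf_share / bright_ratio)
    return max_boost_db * deficit


if __name__ == "__main__":
    for hf in (0.0, 0.125, 0.25, 0.5):
        print(hf, adaptive_air_gain_db(hf, 1.0))
```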

Automaster

LR4 (4th-order Linkwitz-Riley) crossovers at 200 Hz and 2.5 kHz, RMS compression per band with linked stereo detection, feed-forward loudness targeting (Spotify −14 LUFS, Apple Music −16 LUFS, YouTube/SoundCloud −14 LUFS, CD −9 LUFS), and a 2 ms lookahead brickwall limiter with Lagrange true-peak detection.
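Feed-forward loudness targeting comes down to simple dB arithmetic: the make-up gain is the target loudness minus the measured loudness. The sketch below uses the targets listed above. The ±12 dB clamp and the function names are assumptions added for safety in the example.

```python
TARGET_LUFS = {
    "Spotify": -14.0,
    "Apple Music": -16.0,
    "YouTube/SoundCloud": -14.0,
    "CD": -9.0,
}


def auto_level_gain_db(measured_lufs: float, platform: str,
                       max_gain_db: float = 12.0) -> float:
    """Gain in dB that moves the measured loudness to the platform target."""
    gain_db = TARGET_LUFS[platform] - measured_lufs
    # Hypothetical safety clamp so extreme readings cannot cause huge jumps.
    return max(-max_gain_db, min(max_gain_db, gain_db))


def db_to_linear(gain_db: float) -> float:
    """Convert a dB gain to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    g = auto_level_gain_db(-20.0, "Spotify")
    print(g, db_to_linear(g))
```

A track measured at −20 LUFS needs +6 dB to reach the Spotify target, and the limiter that follows catches any peaks that gain pushes over the ceiling.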

Latency & performance

Total reported latency is about 12.7 ms at 48 kHz (~610 samples). It is the sum of the STFT frame at the oversampled rate, the oversampler delay, and the limiter lookahead. Buffers are pre-allocated, so the audio thread performs no heap allocation.
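The figures above can be checked with a few lines of arithmetic. The sketch assumes the 2 ms limiter lookahead is counted at the 48 kHz host rate. How the remaining samples divide between the STFT frame and the oversampler is not stated here, so it is left as a single number.

```python
SAMPLE_RATE_HZ = 48_000
REPORTED_LATENCY_SAMPLES = 610   # approximate figure quoted on this page
LIMITER_LOOKAHEAD_MS = 2.0


def samples_to_ms(samples: int, sample_rate_hz: int) -> float:
    return 1000.0 * samples / sample_rate_hz


def ms_to_samples(ms: float, sample_rate_hz: int) -> int:
    return round(ms * sample_rate_hz / 1000.0)


total_ms = samples_to_ms(REPORTED_LATENCY_SAMPLES, SAMPLE_RATE_HZ)
lookahead_samples = ms_to_samples(LIMITER_LOOKAHEAD_MS, SAMPLE_RATE_HZ)
# Whatever is left is STFT frame plus oversampler delay (split not specified).
spectral_path_samples = REPORTED_LATENCY_SAMPLES - lookahead_samples

if __name__ == "__main__":
    print(f"total: {total_ms:.2f} ms")              # about 12.71 ms
    print(f"lookahead: {lookahead_samples} samples")  # 96 samples
    print(f"spectral path: {spectral_path_samples} samples")
```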

Known limitations (summary)

Integrated LUFS gating (BS.1770 silence gate) is not implemented; impact is minor on typical AI music.
TPDF dither is for 16-bit export; float paths are undithered.
The detector is trained around 48 kHz; other rates are normalised via bin mapping.
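For context on the dither limitation, TPDF dither adds triangular-distributed noise of up to ±1 LSB before rounding to the target bit depth. The sketch below shows the standard construction, the sum of two independent uniform values, for a 16-bit target. The function name and the 32768 scale factor are illustrative choices, not taken from the plugin.

```python
import random


def quantize_16bit_tpdf(sample: float, rng=random) -> int:
    """Quantize a float sample in [-1, 1) to 16-bit PCM with TPDF dither."""
    # Two independent uniforms in [-0.5, 0.5) LSB sum to a triangular
    # distribution spanning [-1, 1) LSB.
    noise_lsb = (rng.random() - 0.5) + (rng.random() - 0.5)
    value = round(sample * 32768.0 + noise_lsb)
    return max(-32768, min(32767, value))  # clip to the int16 range


if __name__ == "__main__":
    rng = random.Random(1234)
    print([quantize_16bit_tpdf(0.5, rng) for _ in range(8)])
```

On a float path there is no rounding to a fixed word length at this point, which is why no dither is applied there.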

Product spec sheet

Formats: AU, VST3, Standalone
Platform: macOS + Windows
Channels: Stereo (mono auto-converted)
Sample rates: 44.1 – 192 kHz
Oversampling: 4× (spectral path)
AI inference: RTNeural, ~10 ms update
Presets: 12 factory + unlimited user
