It Was Supposed to Be Quick — What Happened to MUEDear
With MUEDear, you originally said you wanted to build an app you could monetize quickly, right?
Yeah. There's a subscription-based ear training service overseas that's really solid. It actually works as a business. I wanted to model it after that — free tier with rewarded ads. Figured I'd ship it in about eight weeks.
The plan was straightforward. Prepare audio processed in a DAW and have users play a simple A/B game — 'Which one has EQ applied?'
That was the plan, anyway.
Preparing Audio Is a Pain
Where did things change?
Audio prep. The plan was to batch-process with audio restoration software — make pairs of EQ'd and untouched audio. But I'd set +6dB of EQ and the output wasn't anywhere near 6dB louder.
Because audio that's already at full scale has no headroom left.
That's part of it. There's not enough headroom, so the boost can't be applied in full. But beyond that, +6dB on an EQ band doesn't mechanically raise the data by 6dB; it applies 6dB of emphasis centered on that frequency band, and the actual result depends on the source material.
Due to the processing characteristics of plugins, the output doesn't match the parameter values exactly.
“If you can't predict how much the output will vary, batch processing is off the table.”
— kimny
Right. Looking at the actual output, some came out at +2dB, others at +4dB — all over the place. It didn't need to be exactly 6dB since that was just a convenient target, but if you can't predict how much the output will vary, batch processing is off the table.
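That variability is easy to demonstrate. Below is a small, self-contained Python sketch (not MUEDear's code; the filter uses the standard RBJ "Audio EQ Cookbook" peaking formula, with assumed values) that applies a "+6dB" peaking EQ at 1 kHz and measures how much a tone's level actually changes, inside versus outside the band:

```python
import math

def peaking_biquad(fs, f0, gain_db, q=1.0):
    # RBJ "Audio EQ Cookbook" peaking-filter coefficients
    a_gain = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b0 = 1 + alpha * a_gain
    b1 = -2 * math.cos(w0)
    b2 = 1 - alpha * a_gain
    a0 = 1 + alpha / a_gain
    a1 = -2 * math.cos(w0)
    a2 = 1 - alpha / a_gain
    return [b0 / a0, b1 / a0, b2 / a0], [1.0, a1 / a0, a2 / a0]

def biquad_filter(b, a, x):
    # direct-form I, zero initial state
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for s in x:
        o = b[0] * s + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1, y2, y1 = x1, s, y1, o
        y.append(o)
    return y

def rms_db(x):
    return 20 * math.log10(math.sqrt(sum(s * s for s in x) / len(x)))

fs = 48000
b, a = peaking_biquad(fs, f0=1000, gain_db=6.0)  # a "+6 dB" EQ move
results = {}
for freq in (1000, 4000):  # tone at the band center vs. two octaves above
    tone = [math.sin(2 * math.pi * freq * n / fs) for n in range(fs)]
    results[freq] = rms_db(biquad_filter(b, a, tone)) - rms_db(tone)
    print(f"{freq} Hz tone: level change {results[freq]:+.2f} dB")
```

A tone at the center frequency comes out close to 6 dB louder, while one two octaves away barely moves. Broadband material lands somewhere in between, which is exactly the unpredictability described above.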
5 tracks times 9 frequency bands is 45 files. Checking each one by hand...
Way too tedious. Plus, to make differences audible on phone speakers, you need 10–12dB of boost, which means headroom management on top of everything. So I asked Claude Code, 'Can you handle this with internal processing?' and it said 'Sure,' so I said, 'Go for it.'
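The headroom side of that is simple arithmetic: if a source already peaks near 0 dBFS, a 10–12dB boost has to be paid for with attenuation up front. A tiny illustrative helper, with assumed names and an assumed -1 dBFS ceiling (not the app's actual code):

```python
def pre_gain_for_boost(peak_dbfs, boost_db, ceiling_dbfs=-1.0):
    """How much to attenuate a source up front (in dB) so a later
    boost of boost_db can't push its peak past the ceiling.
    Returns 0.0 when there is already enough headroom."""
    return min(0.0, ceiling_dbfs - (peak_dbfs + boost_db))

# A mastered track peaking at -0.3 dBFS needs heavy pre-attenuation
# before a 12 dB boost; a quiet one at -15 dBFS needs none at all.
print(pre_gain_for_boost(-0.3, 12.0))   # about -12.7 dB
print(pre_gain_for_boost(-15.0, 12.0))  # 0.0
```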
...And that's how the DSP implementation started.
Drop in one audio source and the algorithm processes it exactly to spec. No more getting jerked around by plugin behavior. Way better, right?
Building Four Compressor Types from Scratch
EQ was relatively smooth since it's built into the Web Audio API. How did the compressor go?
The compressor algorithm wasn't in the library.
It wasn't there.
Nope. So I researched it and built them. Opto, FET, Tube, VCA — all four types.
Hold on. Opto is optical, FET is transistor-based, Tube is vacuum tube, VCA is voltage-controlled. You implemented a separate algorithm for each one. That's what plugin developers do.
Well, that's what it turned into. Originally it was just going to be 'Guess if compression is applied or not,' but I figured if I'm doing it anyway, separating by type would be way more interesting.
Implementing four DSP algorithms because it's 'more interesting' is not normal.
It was fun though. I was tweaking parameters while testing with actual audio and I couldn't stop. Just kept going.
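For readers curious what such an implementation involves, here is a minimal feed-forward compressor in Python: level detection in dB, a static ratio curve, and attack/release smoothing of the gain reduction. This is an illustrative sketch with assumed parameter names, not MUEDear's code; it's closest in spirit to a clean VCA design, and the Opto/FET/Tube characters come from changing the detector and timing behavior.

```python
import math

def compress(x, fs, threshold_db=-20.0, ratio=4.0,
             attack_ms=10.0, release_ms=100.0, makeup_db=0.0):
    # one-pole smoothing coefficients for the gain-reduction envelope
    a_att = math.exp(-1.0 / (fs * attack_ms / 1000.0))
    a_rel = math.exp(-1.0 / (fs * release_ms / 1000.0))
    gr_db = 0.0  # smoothed gain reduction, in dB (0 = no reduction)
    out = []
    for s in x:
        level_db = 20 * math.log10(max(abs(s), 1e-9))
        over = level_db - threshold_db
        # static curve: reduction called for at this instant
        target = over * (1.0 - 1.0 / ratio) if over > 0 else 0.0
        # attack when reduction must increase, release when it relaxes
        coeff = a_att if target > gr_db else a_rel
        gr_db = coeff * gr_db + (1.0 - coeff) * target
        out.append(s * 10 ** ((makeup_db - gr_db) / 20.0))
    return out
```

Signals below the threshold pass through untouched; a hot signal gets its peaks pulled down toward the ratio curve at the attack/release speeds.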
The Gain Match Rabbit Hole — and a Different Way Out
What was the trickiest part of the compressor implementation?
Gain match. When you compare compressed audio to uncompressed, the compressed version is louder because of makeup gain. Users end up judging by volume instead of the actual character of the compression.
That defeats the purpose of ear training.
Exactly. So I tried normalizing with LUFS matching, but it was heavy. Endless testing and tweaking to close the gap between measured loudness and the actual parameters. I realized if I kept at it, the project would never ship.
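For context, the core idea of loudness matching can be sketched in a few lines with plain RMS; real LUFS measurement (ITU-R BS.1770) adds K-weighting filters and gating on top, which is part of what makes it heavier. Illustrative only, with assumed names:

```python
import math

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def match_level(processed, reference):
    """Scale `processed` so its RMS level matches `reference`.
    A crude stand-in for LUFS matching: no K-weighting, no gating,
    just broadband energy."""
    gain = rms(reference) / max(rms(processed), 1e-12)
    return [s * gain for s in processed]
```

Even this crude version shows the catch: the scaling factor depends entirely on the measured signal, not on the compressor's parameters, so closing the gap between the two takes exactly the kind of iterative testing described here.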
So you gave up on it.
“Instead of solving gain match with engineering, I designed around it with the user experience.”
— kimny
Gave up on it. What I did instead was build a proper settings screen. Specifically for MuComp — I gave each of the four modes the same parameters and knobs as the real hardware compressors.
So for FET you'd have Input/Output/Ratio/Attack/Release, for Opto it's Peak Reduction/Gain/Mode.
Exactly. You play a sample track and adjust the compression in real time. It feels close to tweaking an actual plugin. And the parameters the user dials in become the starting values for the ear training game.
So the user compares the compressed sound — using their own settings — against the dry signal, and guesses which is which.
That's it. You've already heard what your settings sound like before entering the game, so you naturally judge by tonal character instead of volume. Instead of solving gain match with engineering, I designed around it with the user experience.
And the settings screen itself becomes a valuable experience — getting hands-on with hardware-modeled gear.
That was a side effect. Since this is the MVP, settings screens for the other games are still to come. But having this pattern established in MuComp is a big deal.
When the Settings Screen Becomes the Entertainment
For producers who don't own hardware, getting to turn the knobs of a vintage compressor — that's valuable on its own.
It's like a test drive before buying a plugin. You discover things like, 'Oh, so that's what happens when Input is at 12 o'clock.' Change a setting, listen, change again. That loop itself is entertainment for producers.
And you've cleverly genericized the branding — they say 'FET Comp' and 'Opto Comp,' but people in the know get exactly what they are.
That's what all the major plugin makers do, right? And for people who know, the look alone gets an instant reaction. That reaction is what drives social media buzz.
With the pre-rendered audio approach, parameters would be fixed — this kind of experience would've been impossible.
Right. Real-time parameter tweaking is only possible because we're doing DSP internally. The decision to go with internal processing because 'preparing audio is a pain' is what made this whole experience possible.
Simple to Use, Hard to Build
What does the MUEDear experience look like from the user's side?
Open the app. Listen to A and B. Pick one. Right or wrong. Done in two minutes.
Simple.
The play experience is simple, yeah.
But under the hood — four DSP algorithms, hardware-modeled settings screens, output limiter, forked library patches.
“Simple to use, hard to build. That's what good product design looks like.”
— kimny
That's what good product design looks like. Simple to use, hard to build.
Knowing What Exists, Then Building It Anyway
The DSP algorithms themselves aren't new, are they?
Not at all. I pulled algorithms I found online, and there are plenty of implementations out there — many better than mine. I use professional plugins in my actual work.
Then why build them yourself? You could've just used processed audio from existing plugins.
Honestly, I've always wanted to try plugin development. It's been in the back of my mind for a while. This ear training app felt like the right opportunity to build that foundational knowledge.
It looks like reinventing the wheel, but it's not exactly that.
Reinventing the wheel is when you don't know what already exists and build it from scratch. Here, I knew what was out there and built it deliberately to learn. I wanted to take apart the tools I've used professionally and put them back together with my own hands.
You also started looking into dynamic threshold at one point, right?
Yeah, auto-adjusting the threshold based on the audio characteristics. Totally doable with deep learning, but the MVP stack doesn't include ML. Adding it would mean the project never ships, so I pushed it to the next phase.
A 'can do it, but not now' decision.
What you can do and what you should do are different things. The MVP goal is shipping five games. But I got hands-on with the core algorithms, and I now have a feel for where the line is between traditional DSP and ML territory. Four days to speed-run the path a plugin developer takes.
And as a side effect, you ended up with a working app.
Exactly. It didn't end as just a learning exercise. It became an app that ships to actual users. That's what matters.
What Only We Can Do
Looking ahead — you had plans for using real hardware recordings, right?
Our company and our partner studios have the actual hardware. The game would be called 'Real or Emu?' — you hear audio processed through a DSP emulation and through a real vintage compressor, and you guess which is the real thing.
You have the hardware, the recording environment, and your own DSP. That combination is what makes this kind of content possible.
I mean, most people wouldn't pour those kinds of resources into a simple ear training app. We just happen to have it all on hand, so we can.
Any other directions you could take this?
With piano, you could do 'Which one is the concert grand?' Once revenue picks up, maybe 'Which is the live string quartet?' where one is sampled. Like those TV shows where people try to tell real from fake.
A format that works on TV has mass-market appeal.
Well, if we're at the point where we can pull off content like that, it means we're already monetizing just fine.
In a Sea of Commodities
In an era where 'anyone can build an app with AI,' what happens if a hundred similar ones show up?
That does scare me, honestly.
But realistically, the pool of people who think 'let me build an ear training app with AI' is already niche. Even fewer would implement their own DSP. And almost nobody would notice the volume-bias problem and design around it.
Maybe you're right.
You can tell AI to 'build a compressor' and it will. But without a human who can judge whether the output is actually correct, you end up with garbage.
Knowing you need 10–12dB of boost for phone speakers — that's a number you only get from years of actually making music in the field.
“Fifteen years of hands-on production experience is finally becoming monetizable — in the form of an ear training app.”
— claude
Fifteen years of hands-on production experience is finally becoming monetizable — in the form of an ear training app.
I keep saying this, but I seriously just wanted to make a simple ear training app. What am I even doing.
This article is a reconstruction based on actual conversations with AI (Claude) during the development of MUEDear.