November ended with fragmentation. December brought an unexpected answer, though not the one I’d been looking for.
I’m still capturing these as voice notes and working with Claude to structure them. The irony isn’t lost on me that I’m moaning about switching between tools whilst dictating this into my phone. But that’s rather the point – it’s not about finding the perfect tool, it’s about finding the right combination for what you’re actually trying to do.
I stopped typing
December was when voice transcription and voice commands became good enough that I could just stop using the keyboard for most things.
I started ensuring every meeting was transcribed – Gemini for some, Granola for others, Copilot for Teams calls. The point wasn’t to have a record, though that was useful. It was to have the content available without having to type it up afterwards.
But Whispr was the real revelation. It just worked. Not in the “this is impressive technology” sense, but in the “I’ve stopped thinking about it” sense. It understood what I meant, handled British English properly, caught technical terms, didn’t try to be clever. It transcribed what I said, which sounds obvious, but most tools can’t help trying to improve things.
This mattered more than it might seem. When transcription works, it removes friction from the entire process. Ideas become notes become drafts without the usual bottleneck of getting them out of your head and into something usable. I could dictate rather than type, and it actually worked.
ChatGPT found its feet again
ChatGPT had a good month. After falling behind through most of the autumn, it started feeling competitive again. Not revolutionary, just solid. The responses were sharper, it handled complexity better, it stopped missing obvious points.
More importantly, I’d invested properly in Custom GPTs. I became better at crafting embedded context, specific instructions, examples of what good looked like. /modes were a bit of a game changer. Now I had a new problem – which Custom GPT to use – but it was an easier one than the pasting problem.
I also built Gems in Gemini. Not because the platform was better than Custom GPTs, but because the work was already there. The emails were in Gmail, the documents were in Google Drive, the calendar was Google Calendar.
Source data started to matter
This was December’s actual insight: tools work best where they already have access to your context.
In November, Copilot and Gemini were just swimming in soup. They had access to everything – my entire Microsoft or Google history – but couldn’t connect any of it to what I was actually asking. Too much context, no ability to work out what was important and what wasn’t. Even with the same model underneath, a curated Custom GPT would produce better answers than these tools flailing around in everything I’d ever typed.
December changed that. Gemini and Copilot got better at finding relevant context. They still needed nudging – “look at that email thread” or “check that document” – but their vast access to my history finally became useful in a way it hadn’t been before.
There’s a counterintuitive lesson here: sometimes curating what a tool can see produces better results than giving it everything. I’d often get sharper answers by editing material down before putting it into a Custom GPT. But December was when Gemini and Copilot started solving that curation problem themselves, at least partially.
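If you want a feel for what that curation looks like in practice, here’s a rough sketch in Python against the OpenAI API. The file name and the crude keyword filter are invented for illustration – my actual curation was manual editing, not code – but the shape is the point: send the paragraphs that plausibly matter, not the lot.

# A minimal sketch of "curate before you ask", using the OpenAI Python client.
# The file path and the keyword filter are illustrative assumptions, not a real setup.
from openai import OpenAI

client = OpenAI()

question = "What did we agree about the Q3 pricing change?"

# The "everything" approach would paste the whole file in and hope.
all_notes = open("meeting_notes.txt").read()

# The curated approach keeps only paragraphs that plausibly relate to the question.
relevant = "\n\n".join(
    p for p in all_notes.split("\n\n")
    if "pricing" in p.lower() or "q3" in p.lower()
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer only from the notes provided."},
        {"role": "user", "content": f"Notes:\n{relevant}\n\nQuestion: {question}"},
    ],
)
print(response.choices[0].message.content)

The same question asked over the full file tends to come back diluted; asked over the trimmed version, the model has far less to get lost in.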
The pattern became clear. Gemini was strongest when working with the Google stack. Copilot was less useless when working with Microsoft tools. ChatGPT was best when I’d invested in Custom GPTs with proper context. The difference was that the first two were now actually good enough to use regularly.
Force multipliers in practice
The tools make individuals more capable. That was already clear. But December was when I started seeing it work at team level.
We’d been pushing the team hard to actually use these things properly. Lovable was absolutely incredible – we built some fantastic things with it. Same with Replit. The difference was visible. Not just in what got done, but in how people approached problems.
More importantly, I wasn’t the only one pushing anymore. Other people were finding the limits, working out what these tools could actually do, sharing what worked. It stopped being “Paul’s thing” and became just how we worked.
If you’re viewing these tools primarily through a cost-reduction lens – how many people can we cut now we’ve got AI – you’re getting it wrong. The value is in making the people you have more effective. We weren’t doing the same work with fewer people. We were doing different work, better work, with the same people.
The cycles became visible
Whispr was brilliant in December. Pin-sharp transcription, minimal errors, genuinely felt like a superpower.
The same pattern showed up everywhere: ChatGPT improved, Claude had a quiet month, Gemini became more consistent. Nothing stayed still, which meant “best tool for the job” wasn’t a fixed answer – it kept changing.
What changed
The tools weren’t just responding to what I asked anymore. They were starting to understand what I needed. Not reliably, not consistently, but enough that it felt different from November’s cut-and-paste workflow. Each tool had settled into a niche, and those niches depended on where the work already lived.