The Fundamentals Matter
Or why the future of web search is probably simpler (and cheaper) than you're being told...
I still remember my father teaching me how to use Copernic Agent Basic on our first home computer to search the internet for information - the nuance of using keywords to find what I was looking for instead of trying to explain to the computer what I wanted in full sentences...

Turns out I was just ahead of my time, because apparently "AI" driven search is the future, or so every one of the mainstream search engines seems to think these days. Explaining to the computer that "I need you to find documentation on how to reset my Belkin power bank which is no longer turning on" is supposedly far better than the haiku-like "belkin powerbank reset" I learned to rely on.
Really though, web search has been getting progressively worse since the mid-2010s.
I recall making the argument that "I could never switch away from Google search, because it has gotten so good at knowing what I'm looking for that I rarely need to go beyond the first link". Every attempt I made to try something different (up to and including using Google on someone else's computer) highlighted just how valuable this implicit filtering was to my productivity.
Over the last decade, however, Google Search progressively got worse. I'd search for something I already knew existed, only to be met with a sponsored (and usually inferior) alternative. I'd find myself needing to read through multiple SEO-optimized results before finding the thing I was actually looking for.
Don't even get me started with the whole W3Schools vs MDN mess...
Quality
adj. being suitable for the intended purpose
I find it rather amusing to reflect on the use of "Military Grade" in common usage to mean "exceptionally tough and hard wearing" when in practice it means "cost effective for government procurement". In much the same way, "quality" doesn't mean "superior to alternatives" but simply "suitable".
If you made a search engine which you described as "barely good enough to be usable" you'd be laughed out of the room - and yet it seems that's the goal that all of the major search providers (i.e. Google and Bing) have opted to pursue. Not directly, of course, but rather as a result of their never-ending quest to monetize every aspect of your interaction with their platforms.
Why show you the result we know you're looking for when we can instead show you an advert for the thing you're looking for, which will bring us some revenue? For that matter, why not show an advert for the competing product, knowing that they'll pay a premium to "steal" you as a customer?
The net result is a search experience which is just barely above the "fuck it" threshold for most users - good enough that you keep using it, but optimized for the benefit of the search provider rather than you.
Competition
So of course, when OpenAI announced ChatGPT and folks realized that it was, in fact, quite a bit better at giving them what they asked for than Google's intentionally abusive "search" product - everyone lost their minds.
It's the future of search! It's going to destroy Google and Bing! Every search engine needs to integrate AI in order to remain competitive!
This all sounds perfectly reasonable if you assume that Google Search was operating as well as it possibly could and that LLMs were a stepwise improvement in information retrieval and ranking algorithms.
Of course, if you assume that Google Search (and its primary competitors) had instead optimized for the worst possible search experience which would retain users - then this suddenly takes on a very different look. Indeed, ChatGPT's "we don't yet know how we'll monetize this, so we'll just give the users what they want" starts to sound a whole lot more like the early days of Google Search.
Okay, but that's just a theory - there's no way to prove it.
Fair point - except that's not entirely true...
Getting it Right
At the start of 2024 I signed up for a new search engine called Kagi which takes a rather different approach to everyone else. For starters, I pay for it. Every month I pay roughly what I spend on Spotify - and while Spotify has at various points infuriated me, lost my business, and generally made me consider cancelling every time they increase the cost of their service to subsidize AI features which actively make the platform worse for the primary thing I care about, Kagi consistently gets better.

The single best thing I can say for Kagi is that almost 2 years later I haven't once found myself pondering whether Kagi is worth it, or needing to rely on an alternative to find what I'm looking for.
I type a search query, I get the results I'm looking for. No adverts, no bad results, no probabilistic slop which distracts from the answers I need. It's everything I loved about Google a decade ago, and unlike all of the other "quality" search engines out there - this one actually is superior to the common alternatives.
I'm not going to try to sell you on Kagi itself, you're (probably) an adult who can make your own decisions - but I do want to encourage you to think about whether the common wisdom is "wise" or just "profitable".
In a world in which power is increasingly wielded by the few and inequality continues to worsen (despite the hyper-majority of countries claiming to be free democracies), we should critically consider whether the systems we support further contribute to this pattern of exploitation.
Given the option of paying hundreds of Euro each month to subsidize mega-corporations walking back their CO₂ emissions goals as they race to replace the need for skilled labour with the latest and greatest AI hardware and models; or paying a small fraction of that cost for a system which is orders of magnitude faster, more accurate, and solves the foundational problem I care about - I'll take the latter.
The Right Tool for the Job
All that is not to bash "AI", it's to say that there's a time and a place for each tool. LLMs are an incredible technology and highlight that we've got a lot of room to develop in our modelling and indexing of information - the fact that you can cram enough information into a few hundred GB worth of "weights" to effectively produce believable representations of any information you desire is, frankly, mind-blowing.
The ability to use these systems to bridge the gap between novel problem domains and computing systems without the need for prescribed, executable code (as has been the fundamental requirement for computer operation for decades) opens up a range of options. Effectively, we've taken computers from being systems which were exceptionally good at telling you "no, go away" (ask any developer whether their code compiles without errors the first time around) to being systems which are the sycophantic equivalent of Jim Carrey's "Yes Man".
That's not inherently a bad thing, but it's not inherently a good thing either. Computers are a tool: using an oil rig to drill a hole makes sense for some types of hole, but it is a fairly inefficient way to mount a painting in your living room. Just because the folks on said oil rig are able to solve that and many other problems for you doesn't mean you wouldn't be better served by a more appropriate tool.
When we strip away the marketing bullshit and optimistic delusion, we're left with systems which are capable of:
- Indexing a large amount of information and retrieving it without reliance on specific keywords.
- Transforming content from one form to another with a reasonable degree of accuracy (summarize, rewrite, translate, in-paint, etc).
- Producing statistically likely completions across large context windows, enabling them to reproduce believably-rational interactions (agents, tool calling, etc).
That's a fairly amazing set of capabilities, and when paired with the right set of constraints and integrations it can cut the time required to automate the initial execution of a task from hours/days/weeks to mere minutes. In contexts where you're faced with a wide range of novel problems which are rarely repeated, this can be a serious force-multiplier.
Of course, outside of that problem domain they tend to be slow, unreliable, and costly, and the inherent flexibility that enables them to deal with a wide range of novel problems also makes them probabilistically likely to produce bad results.
Ultimately, as with Google Search - we should strive not to optimize for a solution which is "barely good enough to get the job done" but rather leverage our own critical thinking faculties to build solutions which solve the fundamental problem in a way which delivers a sustainably superior outcome.