Categories
Uncategorized

A (Personal) Search Engine

Personal Search Engine by jjg

Back in Goodbye Google Search! I said:

“I think it’s our duty to find alternatives and try them out and see if we can move away from big tech, either by choosing self-hosted alternatives, more ethical companies, or ways to subvert the existing system.”

I also talked about quitting Google (Web) Search, or should I say Google AI Answers? Anyway, Google Web Search is dead, so it’s time for something else. I covered a bunch of options in the last post.

Meanwhile, through some discussions in a group chat I’m in with a few software developers, the idea of a “Personal Search Engine” came up. I did a little looking around and there are things with that name, but they are often made to search your own locally stored documents or web pages you have bookmarked, or they use LLMs (Yuk!). This idea is more of a Personal Web Search Engine.

Now you need to go read Jason’s post: Personal Search Engines

A Personal Search Engine (PSE) is a search engine that specializes in your interests. It provides personalized search results by indexing only the things you are interested in, not by spying on you. Instead of crawling the entire Web and then looking for what you’ve searched for, a PSE crawls only the parts of the Web you are most interested in and looks for what you’re searching for there. The result is a list of hits that are relevant to your interests that point to websites you are more likely to know and trust.

I love this idea, and want to ramble on about it…

If I were to develop a PSE of my own, I think there are a few things I’d want it to do.

Index Everything I Browse:

As I browse the web I’d like to just index every page I visit. This might seem like a lot of pages, but honestly it’s probably less than feeding just one large domain to the index. If I am searching for some obscure thing, like writing specific Arduino code to do some weird MIDI thing, I may visit a dozen or more pages, and I’d like to see those all added to my index. Then the next time I need to find what I found, I could just search my own index. Ideally I could “PageRank” my results in some way, either automagically using my PSE to do so when I click a link…

Or maybe I could manually set the ranking on a page so it comes up higher in my searches. Should there be a way to manually rank things? Why not? It’s not like you can game your own system for profit or something, right?
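To make the idea a bit more concrete, here is a minimal sketch in Python of an index of visited pages with a manual ranking knob. Everything in it is an assumption for illustration: the `PersonalIndex` class, its method names, the simple word-count scoring, and the multiplier used for manual ranking are all hypothetical, and a real PSE would need persistent storage and smarter scoring.

```python
import re
from collections import defaultdict


class PersonalIndex:
    """Toy in-memory index of visited pages (hypothetical, for illustration)."""

    def __init__(self):
        # term -> {url: number of times the term appears on that page}
        self.postings = defaultdict(dict)
        # url -> manual rank multiplier (1.0 means "no manual ranking")
        self.boost = defaultdict(lambda: 1.0)

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into alphanumeric words.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add_page(self, url, text):
        # Called for every page visited; counts each word on the page.
        for term in self.tokenize(text):
            self.postings[term][url] = self.postings[term].get(url, 0) + 1

    def set_rank(self, url, multiplier):
        # Manual ranking: a multiplier above 1.0 pushes a page up.
        self.boost[url] = multiplier

    def search(self, query):
        # Score = sum of matching word counts, times the manual multiplier.
        scores = defaultdict(float)
        for term in self.tokenize(query):
            for url, count in self.postings.get(term, {}).items():
                scores[url] += count
        return sorted(scores, key=lambda u: (-scores[u] * self.boost[u], u))
```

In use, a browser extension or proxy (not shown, and also an assumption) would call `add_page` on each visit, and `set_rank` would be the "make this come up higher" button.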

Use the RSS Feeds:

I subscribe to a bunch of RSS feeds through my feed readers (FreshRSS and NetNewsWire) and they provide some searching capabilities, but maybe we can feed those indexes into our master index so our PSE can use them. Alternatively, we add RSS feeds to the PSE directly so blogs and any site with a feed can be incrementally indexed over time.
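The "incrementally indexed over time" part could look something like the following sketch, assuming plain RSS 2.0 feeds. The `new_feed_items` function and the `seen_links` set are hypothetical names; Atom feeds use XML namespaces and would need different handling, and fetching the feed itself is left out.

```python
import xml.etree.ElementTree as ET


def new_feed_items(rss_xml, seen_links):
    """Return (title, link) pairs from an RSS 2.0 document that are not
    already in seen_links, and record them as seen (hypothetical helper)."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        link = (item.findtext("link") or "").strip()
        title = (item.findtext("title") or "").strip()
        # Skip items without a link and items indexed on an earlier run.
        if link and link not in seen_links:
            seen_links.add(link)
            fresh.append((title, link))
    return fresh
```

Each run only returns posts that have not been seen before, so the PSE would fetch and index just those pages instead of re-crawling the whole site.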

So those are just two ideas I wanted to get out there… I will probably have more.

I think the magic of this is having your own index. I mentioned SearXNG in the previous post, and even though it’s a container application and was dead simple to install onto my NAS, it still relies on the indexes of other, already existing search engines.

It’s 2026, and while we (still) have a number of options to search the general web, there is absolutely no reason we cannot self-host our own personal web search engines.

Goodbye Google Search!

Google Search

Remember last year when I quit Google Mail and also quit Google Docs? Well, my efforts to DeGoogle didn’t end there…

I quit Google Search.

Yeah, Google Search… which I’ve been using for a quarter of a century. I even had a Google t-shirt back in… 1999 I think? No more.

DuckDuckGo

I switched to DuckDuckGo, which is a bit more focused on privacy and doesn’t use my search history for advertising purposes. I don’t even have an account to log into. You can still change some settings, which should persist across browser sessions thanks to cookies in your browser. (Though you then have to set them in each browser on each device you use. I think that’s a small price to pay.)

DuckDuckGo AI Settings

DuckDuckGo does have some AI features, but allows you to turn them off. Searching the web, not seeing any AI garbage, it feels like… like Google used to be, or like the web used to be. It’s pretty nice. Occasionally if I cannot find what I need I will open a private window and do a Google search, but it’s rare I need to do that.

SearXNG

While I’ve been happy with DuckDuckGo, I also try not to be complacent. I’ve installed SearXNG, which can also search the web, onto my home server.

SearXNG is a free internet metasearch engine which aggregates results from up to 251 search services. Users are neither tracked nor profiled. Additionally, SearXNG can be used over Tor for online anonymity.

So SearXNG goes out and does the searches using all sorts of search engines but protects your privacy by being a middle-agent. (There are some public instances you can try, though they may be hit or miss, and localized results may be off.)
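As a rough illustration of what "aggregating" results can mean, here is a small Python sketch that merges ranked result lists from several engines into one list. This is not SearXNG’s actual algorithm, just one simple way to do it: the `merge_results` function is hypothetical, and the scoring (each engine contributes 1 divided by the position of the result) is an assumption chosen for the example.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    results_by_engine maps an engine name to its URLs, best result first.
    A URL earns 1/position from each engine that returns it, so pages that
    several engines rank highly float to the top (illustrative scoring only).
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for position, url in enumerate(urls, start=1):
            scores[url] += 1.0 / position
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

The point is that the person searching only ever talks to the middle-agent, and the individual engines only ever see the middle-agent.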

Mojeek Search

I’ve also had someone suggest I try Mojeek, which claims to be “The alternative search engine that puts the people who use it first.” Unlike SearXNG, it does not rely on existing search engines but instead uses the Mojeek index of the web, so it is independent. (And, not into AI.)

I’ve left out a few options you could use instead of Google, but for those of us who remember Ask Jeeves, Lycos, Alta Vista, Yahoo, and all the other search engines of 30 years ago, I think it’s our duty to find alternatives and try them out and see if we can move away from big tech, either by choosing self-hosted alternatives, more ethical companies, or ways to subvert the existing system.

Google-Free Fridays… are you insane!?

You will be assimilated.... by Google!

Imagine a future where you rely on one company for your email, your calendar, your maps, your documents, your videos, your phone, your blog, your discussion groups, and, oh yeah, your searches and even your browser.

I was reminded that back in 2007 Danny Sullivan revived my Google-Free Fridays idea that I originally proposed in 2003.

Today, in the second half of 2011, the idea of going even one day without Google may seem insane for some people. The reliance that many have on one single company for so much of their Internet experience is, frankly, a bit frightening to me.

Don’t get me wrong, Google offers a lot of great services, and I use a number of them, but what would happen if they shut you out? It happened to Phil Wilson, and it’s happened to others as well. How much of your world would be affected if Google disabled your account?

Oddly enough, in the time between starting this post and finishing this post, Google launched a new initiative: Email Intervention. Obviously Gmail needs more users, and they want your help assimilating anyone who hasn’t yet become part of the Gmail Family. (I know the Email Intervention thing has a tinge of humor, but as all comedians know, behind every bit of humor is a bit of truth.)

I’m not saying Google is pure evil, because, well, they aren’t. In fact, the Data Liberation initiative should be applauded, and is something every web service should provide. I just hope there are other web services besides Google 5 years from now.

Beautiful

I did an image search for the word “beautiful” and this is what I got…

I’m not sure what it says about us as human beings, but I found it interesting.

I Googled Apple

I swear I won’t use the word “evil” or “conspiracy” in this post, but…

This seems weird to me. I used Google to search for “Apple, Inc.” and I got a search result for “www.apple.com” with the title showing “Apple Computer, Inc.”

Now, as you know, “Apple Computer, Inc.” recently changed its name to “Apple Inc.” so I wasn’t sure why Google showed it as “Apple Computer, Inc.” I then went to www.apple.com and the title of the page is just “Apple” not even “Apple Inc.”

So that begs the question… What’s up? If they indexed the page the same as they index every other site on the web, why does it not match? Is Apple feeding a different result to the Googlebot? (A quick test with Perl’s LWP::UserAgent says probably not.) So really… what’s going on? I’d like to know…
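For anyone who wants to repeat that kind of test, here is a minimal sketch of the idea in Python rather than Perl (it is not the script used for this post). It requests the same URL with different User-Agent headers and compares the page titles. The function names are hypothetical, and the result can only ever be a "probably": a site could recognize the real Googlebot by its IP address rather than by its User-Agent.

```python
import re
import urllib.request


def fetch(url, user_agent):
    """Download a page while sending the given User-Agent header."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def page_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


def titles_by_agent(url, agents, fetcher=fetch):
    """Map each agent name to the title served for its User-Agent string.

    The fetcher parameter exists so the function can be tried offline.
    """
    return {name: page_title(fetcher(url, ua)) for name, ua in agents.items()}


if __name__ == "__main__":
    agents = {
        "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                     "+http://www.google.com/bot.html)",
        "browser": "Mozilla/5.0",
    }
    print(titles_by_agent("https://www.apple.com/", agents))
```

If both titles come back the same, the site is probably not feeding something different to the crawler based on the User-Agent alone.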