
Shemeo Search News #04 - OpenAI takes on search, Google gets into legal trouble, and more


We’re well into August, and the past month has been a bit of a rollercoaster. Google’s been in a spot of legal trouble which could cause a complete shift in the search landscape, Chrome has turned its back on plans to block third-party cookies, and OpenAI have announced a new search engine. And that’s not even scratching the surface.


That's enough intro from me - here is a round up of some of the most recent top stories in search.





1) Google loses landmark legal case brought by the U.S. Government

Possibly the hottest piece of search news this year. Essentially, Google has lost an ongoing antitrust court case that was originally brought against them by the U.S. Department of Justice in 2020.


What’s the issue?

When it comes to search, Google has a huge slice of the pie that other search companies find nearly impossible to eat into. They have numerous exclusivity agreements with companies to make Google search the default across their devices, browsers, etc.


For example, they pay Apple billions of dollars annually ($20 billion in 2022, according to Bloomberg) to be Safari’s default search engine.


The Department of Justice argued that the use of these exclusive agreements was anti-competitive and violated antitrust laws. The vast amounts of money that Google pays for these agreements mean other companies can’t afford to compete, and therefore, Google has a monopoly over search.



What does this mean?

The outcome of this trial could mean Google breaking up parts of its business or altering the way it conducts business forever. This significant disruption opens up the floor for other companies to develop their search offerings and deliver them on a more even playing field.




2) Chrome U-turn on third-party cookie blocking

This one caught me, and I think many others, off guard.


After announcing plans to ditch the use of third-party cookies by blocking them from Chrome in 2020, Google has decided to abandon the idea despite positive privacy implications for users.

The Privacy Sandbox APIs, developed to achieve those privacy goals after the deprecation of third-party cookies, will still be available for people to use. But for the time being, Google has chosen to retain third-party cookies and add a new experience to Chrome that “lets people make an informed choice that applies across their web browsing”.


You can read the full statement on this choice here.



3) Google are improving how they handle explicit “deepfakes” in search

AI is more advanced and accessible than ever before. Unfortunately, people have been taking advantage of generative technology to create sexually explicit images of real people without their consent. Gross.


On July 31st, Google reacted to the rise of explicit deepfakes on the web by sharing important developments that should make it simpler for people to get these images removed from search, and prevent them from showing up in the first place. These updates include:


  • Newly developed systems that filter all explicit searches about a person when a request to remove explicit, non-consensual content is approved. This system should also remove all duplicates of the reported content

  • Reducing exposure to explicit content in search results by tweaking ranking systems to surface legitimate, high-quality content and downrank fake imagery for queries that seek out this type of content

  • Demoting websites that receive removal requests for a high volume of fake explicit imagery


They will continue to develop these systems as AI technology evolves.



4) Reddit has blocked most search engine crawlers… apart from Google


On July 1st, Reddit changed their robots.txt file to the following:

Reddit's robots.txt file

At a glance, it looks like they are blocking every crawler from reaching their website. But if you check this through the Rich Results Test that crawls from Google’s IP ranges, you can see that Google can access and crawl Reddit’s full website.
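If you want to check this kind of thing yourself, Python’s standard library can parse robots.txt rules and tell you whether a given crawler is allowed to fetch a URL. This is a minimal sketch using hypothetical rules modelled on a blanket disallow like the one in Reddit’s public robots.txt; the actual file Reddit serves can differ depending on who is asking, which is exactly why the Rich Results Test (crawling from Google’s IP ranges) sees something different:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modelling a blanket block: every crawler
# is disallowed from the entire site.
public_rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(public_rules)

# An ordinary crawler like Bingbot is blocked from every path.
print(rp.can_fetch("Bingbot", "https://www.reddit.com/r/SEO/"))  # False
```

In production you’d point `RobotFileParser.set_url()` at the live robots.txt and call `read()` instead of parsing a hard-coded list, but note that you’d only ever see the version of the file served to *your* IP and user agent.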



Reddit's robots.txt file in a Rich Results tool live test



Other search engines aren’t so lucky. A spokesperson for Reddit, Tim Rathschmidt, told The Verge that they weren’t able to reach agreements with all search engines (notably Bing) around their use of Reddit content, particularly for training and supplying data for AI models, which is what ultimately led to the block. 


Rathschmidt also noted that this was unrelated to Reddit’s recent partnership with Google, which has seen the search engine driving huge amounts of traffic to the site.



5) OpenAI have announced their own search experience, SearchGPT

The prototype of their flagship search product has been released to a small group of users for testing. As a generative AI sceptic, this had me shaking my fist in the air for a while, but at this stage the interface does at least appear to link fairly to source websites.


A screenshot from the OpenAI website showcasing their SearchGPT prototype

Screenshot from OpenAI


With Google recently losing an antitrust court case against the U.S. government and the potential of a more competitive search landscape on the horizon, this announcement was pretty perfectly timed. But for now, only time will tell if SearchGPT can provide a decent search experience.



I don’t want SearchGPT to use my website's content!

I hear you. You can block OpenAI’s search crawler by disallowing the OAI-SearchBot user agent in your robots.txt file.
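As a sketch, the rules would look something like this (the `OAI-SearchBot` agent name is the one OpenAI has published for SearchGPT; the blanket `Disallow: /` blocks it from your whole site):

```txt
# Block OpenAI's SearchGPT crawler from the entire site
User-agent: OAI-SearchBot
Disallow: /
```

If you only want to keep it out of certain sections, you can narrow the `Disallow` path instead of blocking the whole site.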



That’s all for this roundup, see you next time 👋








BREE SHEMILT

SEO, content writer, and creator of Shemeo.

