oskar on Sun, 13 Jul 2003 18:41:40 +0200 (CEST)



Re: <nettime> googolo-structural digest [sheetz, douwe]


Hi

> > but the point -- that google's technology is a political one -- holds
> > because its algorithm encodes the political structure of popularity...
> 
> An interesting idea. What are the other possible search engine political
> structures? The two most successful approaches right now are:
> 
> The Google model. The distinguishing feature of Google is the role that
> link popularity plays. This is independent of the actual search term (as far
> as we know), ie a page has a certain Google Rank and that helps the page, no
> matter what the user is searching for.
> This is not as much a populistic structure, but more a technocratic. It is
> not the popularity among searchers that determines success, but the 
> popularity among website builders/bloggers/corporations, what have you.

I personally love Google; I almost never use any other search engine.

However, I believe that it's going to have to undergo changes fairly
shortly. Its very strength, pagerank, will become its
greatest weakness.

Google's ranking system is, unfortunately, a feedback system. The problem
is that this feedback system seems to lack adequate controls to stop
positive feedback. There may be some magic factor in the pagerank calculation
to try and damp this effect, but I've not personally seen mention of it.

As you say, the feedback system is flawed because the people building
web sites are the people that are "voting" on the page. But I see that
there's another fairly large problem with this approach.


Let's say that I've recently become interested in the pagerank system.
I do a Google search, and work through the first pages of results, and find
lots of interesting things. I then write up a web page, and put a
discussion of pagerank on my web site. I also include a selection of
the most interesting links I've found on the page.

So: what just happened?

When Google looks at my page, it finds links to the pages that
it already has at the top of the index. This heightens the ranking of
those pages.

Every time someone uses Google to find links they can include on their
page, they increase the ranking of the top ranked pages. And this happens
all the time.

Unless I spend a very long time looking around the net, trying to
find useful links that are not "top rated", then all I ever do is
increase the ranking of the already top-rated pages.
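
To make the loop concrete, here is a rough sketch in Python. It is a toy
model and not how Google works: the raw inbound link count stands in for
pagerank, and the page names, starting counts and the function
simulate_copying are all made up for illustration. Each round, one new
author searches, then links to whatever currently sits in the top three.

```python
def simulate_copying(inbound, rounds, top_k=3):
    """Toy model: each round, one new author links to the current top_k pages.

    `inbound` maps page name -> inbound link count (a stand-in for pagerank).
    Returns a new dict; the input is left untouched.
    """
    counts = dict(inbound)
    for _ in range(rounds):
        # Rank by inbound links, highest first; ties broken by page name.
        ranked = sorted(counts, key=lambda page: (-counts[page], page))
        # The new author copies the top results onto their own page.
        for page in ranked[:top_k]:
            counts[page] += 1
    return counts


# Hypothetical starting point: five pages that are almost level.
start = {"a": 12, "b": 11, "c": 10, "d": 9, "e": 8}
end = simulate_copying(start, rounds=50)
print(end)
```

After 50 rounds the three leaders have gained 50 links each and the other
two have gained none, so the gap between third and fourth place has grown
from 1 link to 51.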

This has huge social ramifications.

If you're top-ranked now, it's going to become harder and harder
to be displaced from your top ranking. The longer Google remains
the search engine of choice, the more limited people's access
to alternative viewpoints will get.

Further, people with a higher search ranking will take their
sites and turn them into commercial entities, in the hope of
making money from their sites. Slowly but surely, the top
ranked sites will become more and more commercial. Since there
are a huge number of "stagnant" pages on the net, which are
never updated, it will be even more difficult to remove
these top-ranked pages from the elite "first page results on Google"
list.

In fact, is this not already happening?

One of the most interesting things that I've seen in the last
few years is that the net seems to have become less and less
free, and pushes the boundaries of "traditional society" less and less.
When it does push them, it's almost never in the traditional web space;
it's in blogs, email, and other places.

Perhaps this is because the top rated sites are now so
firmly entrenched at the top of the lists. Alternative viewpoints
end up appearing on page 8 or 9, and since nobody ever sees
them, nobody ever links to them, and they never progress up the
list.

Perhaps Google's already influencing our world view more than
we expect.


The primary way of getting out of this feedback loop seems
to be separation of the measurement system from the feedback
system. I find it exceptionally interesting that the blogging
system is causing "disturbances in the force" of pagerank.

Blogging, email, and other such things are quite fundamentally
different from the web. They are time based, off the cuff, and there
is a lot of internal cross referencing and meme transfer. Blogging
and email rely less often on the results of the search engines for
their content: they are based on real world experience, and largely on
more human emotions (ie: they often represent an alternative
point of view to the norm). If there's a link to a website it's
often because it's related to something that people have not seen
before, or to an alternative point of view.  I see very few blogs
going "Hey wow, check this cool site: www.amazon.com".
They almost always point to alternative viewpoints or current events.
There's a lot of competition to be "the first" to mention some
cool page. People still link to CNN, of course, but they normally
link to some specific article at CNN that's pertinent to the day.

Perhaps the time aspect of blogging could fundamentally
change things. Blogs log the passage of time, more than the
traditional web does. Perhaps the importance of links could be
reduced as they get older and less relevant.
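
As a sketch of what ageing links might look like: give every link a
weight that halves after a fixed period, and add the weights up instead
of counting links. The 180-day half-life and the function names
link_weight and decayed_score are assumptions of mine for illustration,
not anything a real search engine is known to use.

```python
def link_weight(age_days, half_life_days=180.0):
    """Weight of one link: 1.0 when new, halving every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def decayed_score(link_ages_days, half_life_days=180.0):
    """Sum of age-decayed weights over all inbound links to a page."""
    return sum(link_weight(age, half_life_days) for age in link_ages_days)


# Hypothetical pages: ten links that are two years old vs. two fresh links.
old_page = decayed_score([720] * 10)
new_page = decayed_score([0, 0])
print(old_page, new_page)
```

With these numbers the page with ten stale links ends up scoring lower
than the page with two fresh ones, which is the effect I'm after.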


There are various other ways that you could collect feedback;
if you looked through everyone's email and collected the urls, you'd have
a much more "intimate" ranking of pages... pages that
people wouldn't want to put on their home page with
pictures of their cat. There is still a feedback system, of
course, since they will often search for a page and then refer
it to a friend, but it seems less direct. I normally email
people hard-to-find/interesting links: stuff that the first
hits don't turn up, and which takes interesting search criteria to find.
I am, of course, not your average net user.


There are a variety of other ways that this situation could be
improved; some that spring immediately to mind are:

1) Have some sort of exponential reduction of popularity;
thus pages that are very highly rated by links are penalised for
being so highly ranked, and may be overtaken by other pages that
actually have a lower number of links.

2) Randomly insert some lower ranked pages into the first page.

3) When you put a link on a page, include a tag that indicates where
you got it from. Page indexers then read that tag, say "ah, the
author got this link from my index - reduce the weight of this link
by some appropriate amount". Obviously there are problems with this approach,
since it means updating every single link on the net :)

4) Try and reduce the impact of older pages. If a page hasn't
been updated for a very long time, it's likely that the author
hasn't looked at the links that they are referencing. Thus you
could ignore these in the ranking, or even penalise the destination
page. Similar to the blog thing above.
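
Ideas 1 and 2 are easy enough to sketch in Python. Both functions are
hypothetical illustrations, not a description of any real engine. For
idea 1 I've assumed one possible reading of "exponential reduction":
score = links * exp(-links / scale), with an arbitrary scale of 100. For
idea 2 I've assumed a ten-result first page where the last two slots are
handed to randomly chosen pages from further down the list.

```python
import math
import random


def penalised_score(links, scale=100.0):
    """Idea 1: the raw link count is damped exponentially, so very heavily
    linked pages can fall below moderately linked ones."""
    return links * math.exp(-links / scale)


def first_page(ranked, page_size=10, wildcard_slots=2, rng=None):
    """Idea 2: keep the top results, but give the last few slots on the
    first page to random pages from beyond the first page."""
    rng = rng or random.Random()
    tail = ranked[page_size:]            # everything past page one
    k = min(wildcard_slots, len(tail))   # can't pick more than exist
    head = ranked[:page_size - k]        # top results that keep their slot
    return head + rng.sample(tail, k)


# Hypothetical ranking of 30 pages, best first; seeded so runs repeat.
pages = ["p%d" % i for i in range(30)]
page_one = first_page(pages, rng=random.Random(0))
print(page_one)
print(penalised_score(50), penalised_score(400))
```

With a scale of 100, a page with 50 links scores about 30 while a page
with 400 links scores about 7, so the less-linked page overtakes it. The
obvious catch is that the scale decides who gets punished, and picking it
is a political choice in its own right.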

Oskar Pearson

#  distributed via <nettime>: no commercial use without permission
#  <nettime> is a moderated mailing list for net criticism,
#  collaborative text filtering and cultural politics of the nets
#  more info: majordomo@bbs.thing.net and "info nettime-l" in the msg body
#  archive: http://www.nettime.org contact: nettime@bbs.thing.net