The Mudcat Café

Googling for Mudcat

Subject: Googling for Mudcat
From: greg stephens
Date: 01 Jun 06 - 12:22 PM

Mudcat seems to behave very curiously in relation to Google. I have just been playing around, trying to make sense of it.
For example: there is a thread on Mudcat entitled "William Irwin Lake District Fiddler". But if you google on the exact phrase "William Irwin Lake District Fiddler", it doesn't find that thread.
For example: there is plenty of discussion of the Boat Band's CD "A Trip to the Lakes" on Mudcat. But if you google on ( "Trip to the Lakes" mudcat ) it won't find it.
For example: I have discussed the tunes "Queen of the May" and "Kendal Waltz" on Mudcat. Googling on ( "queen of the may" mudcat "greg stephens" ) will find that discussion. Googling on ( "kendal waltz" mudcat "Greg stephens" ) won't.
I was just playing around with threads I know contain certain words. Presumably these sorts of strange discrepancies will occur with anything people choose to play with.
So how does Mudcat interact with Google? Is there some way of explaining this odd behaviour? By the way, the ( and ) marks above weren't anything I put into Google, just ways of showing what I actually typed into the "include all words" box in the advanced search section.



Subject: RE: Googling for Mudcat
From: George Seto - af221@chebucto.ns.ca
Date: 01 Jun 06 - 12:29 PM

That's the thing you'd have to ask Google itself.

Google has its own priority of finding things on the internet. Sometimes a Google search will come upon a Mudcat thread, but it's relatively rare, considering some items will have a lot of Mudcat stuff.

Remember, Mudcat is stored like a database, and not like static web-pages. I suspect Google finds things which have been recently accessed out of the dynamics of Mudcat.



Subject: RE: Googling for Mudcat
From: Bill D
Date: 01 Jun 06 - 12:30 PM

Google searches by links, as I understand it. It follows whatever patterns it sees when it passes by, which means it doesn't automatically get everything...but it 'might' get more on a later pass. I also wonder if thread titles are indexed, or just content.

(I tried one search of the words to a song, and the only hit on the internet was my own Mudcat post from several years ago...)



Subject: RE: Googling for Mudcat
From: greg stephens
Date: 01 Jun 06 - 12:33 PM

It certainly finds some bits of text within a post, but not others. I just wondered why it found my thoughts on "Queen of the May" interesting, but not "Kendal Waltz". Do you think google has musical prejudices?



Subject: RE: Googling for Mudcat
From: Scrump
Date: 01 Jun 06 - 12:53 PM

It could be someone has links to one of your articles but not the other - as Bill D says, that can affect the Google ranking. You can find out who's linked to the url - maybe that will confirm it?



Subject: RE: Googling for Mudcat
From: greg stephens
Date: 01 Jun 06 - 12:58 PM

It seems to me that if Google can't find a thread with a specific title on Mudcat, it means Mudcat is not being useful as a source of information to outsiders. It's no good having the definitive account of the origins of "Dirty Old Town" on Mudcat, if a Google search on "Dirty Old Town" can't actually find the Mudcat thread called "Dirty Old Town".



Subject: RE: Googling for Mudcat
From: Helen
Date: 01 Jun 06 - 01:18 PM

Hi greg,

Did you mean that you didn't use the double quotes (" ") around your phrases? Google uses double quotes around a group of words to keep them together as a phrase, i.e. same words in same order.
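
To make the difference concrete, here is a small sketch in Python of an "all words" match versus a quoted "exact phrase" match. It only illustrates the idea and says nothing about how Google is actually built; the function names and the sample sentence are made up for the example.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_all_words(query, text):
    """True if every query word appears somewhere in the text, in any order."""
    words = set(tokenize(text))
    return all(word in words for word in tokenize(query))


def matches_phrase(phrase, text):
    """True if the query words appear together, in the same order."""
    wanted = tokenize(phrase)
    found = tokenize(text)
    return any(
        found[i:i + len(wanted)] == wanted
        for i in range(len(found) - len(wanted) + 1)
    )


# Made-up sample text for illustration.
post = "I have discussed the Kendal Waltz and the Queen of the May."

print(matches_all_words("may queen", post))    # words present, order ignored
print(matches_phrase("may queen", post))       # not adjacent in this order
print(matches_phrase("queen of the may", post))
```

Without the quotes, a page only has to contain the words somewhere; with them, the words have to sit next to each other in that order.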

The other thing to consider is that really recent threads may not have been indexed or they may not have as many hits or as much repetition of the phrases you are searching for. Depending on how Google googles, I suppose.

Helen



Subject: RE: Googling for Mudcat
From: greg stephens
Date: 01 Jun 06 - 01:24 PM

It doesn't particularly seem as if new threads are harder to find than old ones. I have tried looking for threads several years old: some it finds, some it doesn't.
Very, very new threads it doesn't find, but that is different. It generally finds most new things in a day or two. But then it sometimes seems to forget them. And I can't help feeling the explanation lies more in Mudcat than in Google, but I don't understand what's going on well enough to come up with a theory.



Subject: RE: Googling for Mudcat
From: MMario
Date: 01 Jun 06 - 01:33 PM

Most Mudcat pages are not static, so when the Google bots look for pages to add to their index, only those being actively read are added.



Subject: RE: Googling for Mudcat
From: MMario
Date: 01 Jun 06 - 01:48 PM

Have you seen this? Google Technology



Subject: RE: Googling for Mudcat
From: treewind
Date: 01 Jun 06 - 05:46 PM

The Googlebot only visits any site every few days or even weeks, so on Mudcat it only indexes threads that happen to be in the list on the day it visited.

That would explain why some whole threads don't get indexed - just luck of timing. Google isn't infallible.

And MMario - great link :-)

Anahata



Subject: RE: Googling for Mudcat
From: CarolC
Date: 01 Jun 06 - 05:59 PM

Do the Google bots go by number of links, or can a page that is only linked to one or two others still get added to Google's index?



Subject: RE: Googling for Mudcat
From: CarolC
Date: 01 Jun 06 - 06:00 PM

Also, do the Google bots only detect external links, or do they detect internal links as well?



Subject: RE: Googling for Mudcat
From: Bill D
Date: 01 Jun 06 - 06:18 PM

If I understand it right, they will follow any link they find within the current page they are indexing, so if you link to some older Mudcat post, it will be collected too...until it finds no more 'live' links...Maybe it is getting more because of the links to similar threads it finds at the top of some long threads.



Subject: RE: Googling for Mudcat
From: Jim Dixon
Date: 01 Jun 06 - 08:47 PM

As I understand it, search engines such as Google use programs called web crawlers to scan web sites and build an index.

The web crawler starts with the home page—that would be http://www.mudcat.org/ in this case—indexes that page, and then follows all the links it can find on that page, and indexes those pages, then follows all the links on those pages, and so on.

Some web crawlers have limits to the number of levels of links they will follow. Some web crawlers don't index the whole page, but limit their search to the first (insert arbitrary number) lines of text on each page. I don't know whether Google has any limits. If it does, its limits are probably higher than any other search engine.

At any given moment, there are lots of old threads on Mudcat that can't be reached this way. To view them, you have to type something into a search box. Web crawlers aren't smart enough to figure out what they should type into a search box in order to view every existing thread.
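
The crawl described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the page names and the link graph are invented, nothing is fetched from the network, and the depth limit is an arbitrary number, not a known Google setting.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
# "old-thread" links out, but nothing links to it, so it can only be
# reached by typing something into a search box.
SITE = {
    "home": ["thread-a", "thread-b"],
    "thread-a": ["thread-c"],
    "thread-b": [],
    "thread-c": ["thread-d"],
    "thread-d": [],
    "old-thread": ["home"],
}


def crawl(links, start, max_depth):
    """Breadth-first crawl: index a page, then follow its links,
    stopping once max_depth levels of links have been followed."""
    seen = {start}
    queue = deque([(start, 0)])
    indexed = []
    while queue:
        page, depth = queue.popleft()
        indexed.append(page)
        if depth == max_depth:
            continue  # index this page but do not follow its links
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return indexed


print(crawl(SITE, "home", max_depth=2))
print(crawl(SITE, "home", max_depth=10))
```

With a limit of 2 the crawl stops before "thread-d", and no limit is high enough to reach "old-thread", because no page links to it.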

Come to think of it, neither am I. The only way I know to view every existing thread is to start with http://www.mudcat.org/thread.cfm?threadid=1 and increment the number from 1 up to whatever we are at now—at least 91914. Alternatively, you could start with http://www.mudcat.org/Detail.CFM?messages__Message_ID=1 and increment the number up to 1751399 or so.

I only know that because I happen to know a bit about how Mudcat works. I don't think a web crawler would figure it out. "Threadid=nnnnn" could mean anything.

Of course a programmer at Google, with a little investigation (or if someone clued him in) could easily write a special program that would search Mudcat this way, but I doubt that searching Mudcat is high enough on Google's priority list to warrant a specially-written program.
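
For what it's worth, the first step of such a program might look something like the Python sketch below. It only builds the thread addresses from the pattern given above and fetches nothing. The function name is made up, and the upper number is simply the "at least 91914" figure mentioned earlier.

```python
def thread_urls(first_id, last_id):
    """Yield one Mudcat thread address per thread id, in order."""
    for thread_id in range(first_id, last_id + 1):
        yield f"http://www.mudcat.org/thread.cfm?threadid={thread_id}"


# Show the first three addresses only.
for url in thread_urls(1, 3):
    print(url)

# The whole range would be thread_urls(1, 91914); some ids in that
# range may not correspond to a thread that still exists.
```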



Subject: RE: Googling for Mudcat
From: GUEST,Jon
Date: 01 Jun 06 - 08:52 PM

Google and other search engines need a way of finding a page or site in the first place, e.g. an external link from another site, or someone submitting the site.

Google and others follow links according to "the rules" the website sets for robots. See http://www.robotstxt.org/wc/robots.html for more details and look at Mudcat's rules. This page, for example, has <META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW">. A "well behaved" robot (not all do as "instructed") would index this page if it found it, but not follow any links contained in it.
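
As a rough illustration of what a well-behaved robot does with that tag, here is a Python sketch using the standard library's html.parser. The class name, function name, and sample page are made up, and it only looks at the index/follow pair, ignoring other directives a real robot may understand.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Made-up page carrying the tag quoted above.
page = (
    '<html><head><META NAME="ROBOTS" CONTENT="INDEX,NOFOLLOW"></head>'
    '<body><a href="/thread.cfm?threadid=1">an older thread</a></body></html>'
)

directives = robots_directives(page)
may_index = "noindex" not in directives
may_follow = "nofollow" not in directives
print(may_index, may_follow)
```

For this page the robot may add the page itself to its index, but it is asked not to follow the link to the older thread.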

Also, see Anahata's comments re frequency of indexing.



Subject: RE: Googling for Mudcat
From: Helen
Date: 02 Jun 06 - 08:52 PM

BTW, when I see this thread name it brings an image to mind of a large number of Mudcatters in a huge virtual computer room, all furiously "Googling for Mudcat", like an Olympic sport, e.g. "swimming for Australia". Which is not far off, because we often seem to Google for information for ourselves and others, and there is an unofficial competition to see who can Google the answers the fastest.

Just a visual bit of thread creep -sorry!

Helen



Subject: RE: Googling for Mudcat
From: Azizi
Date: 02 Jun 06 - 10:09 PM

Helen, I liked your 'googling for mudcat' visual.

And I didn't find your thread creep to be the least bit creepy.

Hmmm. I wonder what the theme song would be for the 'Googling for Mudcat' game show?



Subject: RE: Googling for Mudcat
From: The Fooles Troupe
Date: 03 Jun 06 - 03:13 AM

"Barney Google, with his Goo-goo-googly eyes!"



Subject: RE: Googling for Mudcat
From: greg stephens
Date: 04 Jun 06 - 02:44 PM

Earlier in the thread, I referred to the fact that googling on the exact phrase "William Irwin Lake District Fiddler" produced no hits at all, even though there is a Mudcat thread with that exact name.
   Today, strangely enough, googling on that phrase finds this current thread (Googling for Mudcat), because I used the William Irwin phrase in a posting. Yet Google still doesn't find the original thread on the subject. Which is a great pity, because a Mudcat thread on a subject would be a useful thing to find for anyone doing this kind of research.



Subject: RE: Googling for Mudcat
From: The Fooles Troupe
Date: 04 Jun 06 - 06:15 PM

Mudcat has an 'internal database' structure, not a 'web of html pages' structure.

The latter are 'on-line' all the time and Web crawlers can get at them all.

The former are only served up on demand, and web crawlers can only find them if they are served up while they are looking (Peek-a-boo!)!!!

Because of the ads at the bottom of the page, Google sees each one that IS served up now, but only them.... unless someone refreshes every thread....

:-)

So if you have a favourite thread you want indexed, you now know how to ensure it...



Subject: RE: Googling for Mudcat
From: Azizi
Date: 04 Jun 06 - 07:32 PM

Great song choice, Foolestroupe.

Azizi,

who is better late than never.

:o)



Subject: RE: Googling for Mudcat
From: The Fooles Troupe
Date: 04 Jun 06 - 07:52 PM

I know I'm going to get in trouble for this...

Is "Googling for Mudcat" anything like "F**king For Britain"?



All original material is copyright © 2022 by the Mudcat Café Music Foundation. All photos, music, images, etc. are copyright © by their rightful owners. Every effort is taken to attribute appropriate copyright to images, content, music, etc. We are not a copyright resource.