Sitemaps

Carl S Zimmerman (apparently) - 06:28am Dec 1, 2006 PST
via email

Ref.1: http://emperor.tidbits.com/webx/?13.3c892c32
Ref.2: sub000, "Dates in TidBITS", 14 Nov 2006 12:37 pm (/webx?13@@.3c86e808)

On Nov 27, 2006, "infonauts" posted a message (Ref.1) to start a new
thread under the subject "Digest from TidBITS Talk" [subsequently
rethreaded by me -Adam]. Although it was apparently prompted by the
ongoing thread about "Dates in TidBITS" (Ref.2), I was more intrigued
by the concept of Sitemaps, and thus have re-titled this message.

The motivation for Sitemaps, neatly summarized by infonauts' first
paragraph, is laudable. The basic concept and the protocol
description on the Sitemaps Website are straightforward and clear.
The FAQ on the Sitemaps Website is helpful as far as it goes.
Unfortunately, it does not go far enough, and there is NO point of
contact given for submitting additional questions in hope that they
might eventually find their way into the FAQ. (In fact, the only
place one can find the identity of the Big Three sponsors is in a
disclaimer buried in the Terms of Service.)

What I find missing is a clear statement of why any Webmaster should
expend the resources needed to implement and maintain sitemap files.
In other words, what will a sitemap file enable a Web crawler to do
that it doesn't already do? The news releases don't answer this
question, nor does the Sitemaps Website. Even the "official" blogs
of Google, the developer of the sitemaps protocol, don't answer it.

Given that one part of the sitemap protocol is a frequency-of-update
indicator, it might be reasonable to assume that a sitemap can
encourage Web crawlers to revisit some parts of the site more
frequently in return for revisiting other parts less frequently. But
I can find no confirmation of such an assumption.
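
For readers who have not seen the file format, here is a minimal
Python sketch of a sitemap that carries the frequency-of-update
indicator. The element names (urlset, url, loc, changefreq) and the
namespace are those of the published protocol; the example.com URLs
and the build_sitemap helper are made up for illustration. As far as
I can tell the value is only a hint to crawlers, and nothing here
shows how any search engine acts on it.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Values the protocol allows for <changefreq>.
ALLOWED_FREQS = {"always", "hourly", "daily", "weekly",
                 "monthly", "yearly", "never"}

def build_sitemap(entries):
    """Return sitemap XML for an iterable of (loc, changefreq) pairs.

    build_sitemap is a hypothetical helper, not part of any library.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, changefreq in entries:
        if changefreq not in ALLOWED_FREQS:
            raise ValueError(f"unknown changefreq: {changefreq!r}")
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "changefreq").text = changefreq
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical site: a front page that changes daily and an archive
# page that almost never changes.
xml_text = build_sitemap([
    ("http://www.example.com/", "daily"),
    ("http://www.example.com/archive/2005.html", "yearly"),
])
print(xml_text)
```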

Other questions which I think important but which are not covered in
the FAQ (and which I didn't find answered in Google's blogs):

- Will Web crawlers find a sitemap file if it isn't explicitly
submitted to the various search engines?

- Is it necessary (or at least important, or even just helpful) to
update and/or resubmit an existing sitemap file when one of the files
to which it points has been updated?

- If a Website has a "what's new" page that is regularly updated to
identify changed pages elsewhere on that Website, is it necessary to
mention any page other than this one in the sitemap?

- If a Website is well structured (e.g., according to Google's
Guidelines for Webmasters), is it necessary to list all files on the
Website in sitemap files? If not, how should a Webmaster decide
which files to list and which to omit?

It's unfortunate that the sponsors of the sitemap protocol apparently
didn't put as much effort into making its Website useful as they did
into disclaiming any and all responsibility for the results of using
it. The concept of the protocol seems to be potentially useful in
improving the timeliness and usefulness of search engine results, but
the present inadequacy of documentation may well prevent the full
potential from being reached.

Carl


moe (apparently) - Dec 2, 2006 3:12 pm (#1 Total: 2)  

via email  

Re: Sitemaps

>What I find missing is a clear statement of why any Webmaster should
>expend the resources needed to implement and maintain sitemap files.
>In other words, what will a sitemap file enable a Web crawler to do
>that it doesn't already do?


First, if anyone is wondering what the sitemap is, see:

   http://www.google.com/support/webmasters/bin/answer.py?answer=40318

and in general, anyone with a website should read the documentation
on the Google Webmaster pages in great depth.

I heard Google's Search Evangelist address your very question at a
seminar two nights ago. The above link answers your question but I
will add my impression from what he said the other night.

For the most part, what the spider finds by crawling your site and
what the sitemap lists will be identical. But for large sites with
dynamic content, the spider may miss some nooks and crannies. The
sitemap tells Google where you think valuable content is and the
spider will make sure to hit it all. I think it would also affect
frequency. My guess is that heavily spidered sites will benefit less
from the sitemap.

You also can have otherwise unlinked pages in the sitemap, something
there was previously no way for Google to find.
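
To make the "unlinked pages" point concrete, here is a rough Python
sketch that compares the URLs listed in a sitemap with the URLs a
crawl reached by following links. The sample sitemap, the example.com
URLs, the crawled set, and the sitemap_urls helper are all invented
for the illustration; this is not a description of how Google's
spider actually works.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> values in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)}

# Hypothetical sitemap listing two pages.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/reports/q3.html</loc></url>
</urlset>"""

# Hypothetical result of a link-following crawl: only the home page
# was reachable, because nothing links to the report.
crawled = {"http://www.example.com/"}

# Pages a crawler could learn about only from the sitemap.
sitemap_only = sitemap_urls(SAMPLE) - crawled
print(sorted(sitemap_only))
```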

And Google is comparing their spider's activity to sitemaps to learn
how to improve their algorithms.

Regardless, it is important to know that the sitemap never hurts you;
it can only help.

Note, too, that the sitemap program is in beta and changing.

You might also check the Google webmaster discussion groups. This
question is right up their alley.


mmatty (apparently) - Dec 2, 2006 3:12 pm (#2 Total: 2)  

via email  

Re: Sitemaps



On Dec 1, 2006, at 8:28 AM, Carl S Zimmerman wrote:

>
>
> What I find missing is a clear statement of why any Webmaster should
> expend the resources needed to implement and maintain sitemap files.
> In other words, what will a sitemap file enable a Web crawler to do
> that it doesn't already do? The news releases don't answer this
> question, nor does the Sitemaps Website. Even the "official" blogs
> of Google, the developer of the sitemaps protocol, don't answer it.

There's information on the Google Webmaster Site:

http://www.google.com/support/webmasters/bin/topic.py?topic=8476

And also information on Mobile Sitemaps and why it is important to
organize content specifically for mobile phones:

http://www.google.com/support/webmasters/bin/answer.py?answer=34627&hl=en

>
> Given that one part of the sitemap protocol is a frequency-of-update
> indicator, it might be reasonable to assume that a sitemap can
> encourage Web crawlers to revisit some parts of the site more
> frequently in return for revisiting other parts less frequently. But
> I can find no confirmation of such an assumption.
>
> Other questions which I think important but which are not covered in
> the FAQ (and which I didn't find answered in Google's blogs):
>
> - Will Web crawlers find a sitemap file if it isn't explicitly
> submitted to the various search engines?

The chances are excellent that the crawler will; however, it will
only happen if there are incoming links from other sites that will
lead the spiders to your site. The more incoming links, esp. those
containing relevant copy, the better it is for your rankings.

By constructing a sitemap using keywords and relevant content, you
can improve your rankings. This will also help with Flash, QuickTime,
etc. content. And make sure there are links to the sitemap on your
home page and on internal pages.

A spider can also get lost if there are many levels to click on;
i.e., the spider has crawled 5-6 levels down into a folder structure,
and there's no link back to a site map or the home page. It will
probably then leave the site before crawling any other pages.

>
> - Is it necessary (or at least important, or even just helpful) to
> update and/or resubmit an existing sitemap file when one of the files
> to which it points has been updated?

It's a good idea to think of a site map as a commitment if you'll be
updating content - you'll need to add and remove appropriately, and
adjust the copy accordingly. Revisiting your site map might also help
you find new ways to structure information more effectively - like in
the example I mentioned above, it can serve as a guide to see if your
architecture is getting too complicated for visitors and spiders.
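
If the site map in question is an XML Sitemap file rather than an
HTML page (an assumption; the paragraph above reads more like advice
for an HTML site map), part of that upkeep can be scripted. The
Python sketch below rewrites the protocol's optional lastmod element
for pages that have changed. The sample file, the example.com URLs,
and the set_lastmod helper are invented for the illustration, and
whether a search engine needs the file resubmitted afterward is not
something this code can answer.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default (no prefix).
ET.register_namespace("", NS_URI)

def set_lastmod(xml_text, changed):
    """Set <lastmod> for each URL in `changed` (a dict of URL -> date)."""
    root = ET.fromstring(xml_text)
    for url in root.findall(f"{{{NS_URI}}}url"):
        loc = url.find(f"{{{NS_URI}}}loc").text.strip()
        if loc in changed:
            lastmod = url.find(f"{{{NS_URI}}}lastmod")
            if lastmod is None:
                lastmod = ET.SubElement(url, f"{{{NS_URI}}}lastmod")
            # The protocol uses W3C date format, e.g. YYYY-MM-DD.
            lastmod.text = changed[loc].isoformat()
    return ET.tostring(root, encoding="unicode")

# Hypothetical existing sitemap with two pages.
OLD = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/news.html</loc><lastmod>2006-11-01</lastmod></url>
  <url><loc>http://www.example.com/about.html</loc><lastmod>2006-10-15</lastmod></url>
</urlset>"""

# Only news.html was edited, so only its date changes.
updated = set_lastmod(OLD, {"http://www.example.com/news.html": date(2006, 12, 2)})
print(updated)
```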

>
> - If a Website has a "what's new" page that is regularly updated to
> identify changed pages elsewhere on that Website, is it necessary to
> mention any page other than this one in the sitemap?

It isn't necessary, but it is a good strategy. Keep in mind that the
engines constantly re-index sites, and that although you have great
rankings on Monday, on Tuesday things could look very different.
You'll need to carefully research and monitor keywords and links.

>
> - If a Website is well structured (e.g., according to Google's
> Guidelines for Webmasters), is it necessary to list all files on the
> Website in sitemap files? If not, how should a Webmaster decide
> which files to list and which to omit?

Think of it not just as listing copy, but as including descriptive
information that will help visitors that are looking for particular
information on your site. If the copy in the links is well
constructed and contains words that visitors will be searching for,
it will improve your rankings as well.

It's also a good idea, for visitors' sake, to include a sitemap on a
404 error page, if you've got one.

Marilyn





