Tiger Mail vs Spotlight

edward (apparently) - 06:52am Aug 1, 2005 PST
via email

In the discussion of Spotlight, one thing I haven't seen mentioned is the
negative effect on Mail.

Up through Panther (Mail pre 2.0), Mail stored messages in standard .mbox
format files. This kept the number of files to a reasonable number even for
heavy email users and conformed to a widely used de facto standard.

But as of Tiger (Mail 2.0), Mail stores every message in a separate .emlx
file! Oh the horror. I've lost track of how many messages I have saved,
probably getting near 50,000, and I'm not even a heavy email user compared
with some people around TidBITS and TidBITS-Talk.


[But what's the real world downside? Emailer used to do this long ago, but the filesystem then couldn't come close to handling it at a decent performance level. Is performance still a problem with this approach? -Adam]


My guess is that the Mail developers screamed in agony but lost --
Spotlight was too important, and whatever powers made the decision said
that Mail should suffer rather than waiting for an interface between Mail
and Spotlight that would enable Spotlight to recognize messages within
files. The implication runs farther -- presumably this means that Spotlight
is not capable of indexing any kind of database except on the file level,
which is useless for any database system. Email messages within an mbox
file constitute a very primitive sort of database.
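
To make the "primitive database" point concrete, here is a minimal sketch, in Python, of reading the individual messages out of an mbox file and writing each one to its own file. It uses the standard-library `mailbox` module. The function name `split_mbox` and the numbered `.eml` output names are illustrative assumptions, and the output is plain message text rather than Apple's actual .emlx format.

```python
import mailbox
import os


def split_mbox(mbox_path, out_dir):
    """Write each message of an mbox file to its own numbered .eml file.

    Hypothetical helper for illustration; not how Mail 2.0 converts mailboxes.
    """
    os.makedirs(out_dir, exist_ok=True)
    box = mailbox.mbox(mbox_path)
    written = []
    try:
        # Iterating a mailbox yields one message object per stored message.
        for index, message in enumerate(box):
            target = os.path.join(out_dir, f"{index:06d}.eml")
            with open(target, "wb") as handle:
                handle.write(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

Calling `split_mbox("Inbox.mbox/mbox", "Inbox-split")` on a copy of a mailbox would leave one small file per message, which is roughly the trade-off under discussion.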

I've been setting up a new Mac mini for my wife to use. I thought I'd try
moving her to Mail, since Eudora has become less reliable with each release
and support is nil. I didn't expect Apple support to be any better, but
thought at least Mail might be more reliable. Wasted a couple of days on
that effort before discovering the .emlx kludge.

Oh the horror.

Edward
Art Works by Melynda Reid: http://paleo.org



Chris Pepper (apparently) - Aug 1, 2005 10:56 am (#2 Total: 21)  

Re: Tiger Mail vs Spotlight

At 6:52 AM -0700 2005/08/01, Edward Reid wrote:
>In the discussion of Spotlight, one thing I haven't seen mentioned is the
>negative effect on Mail.
>
>Up through Panther (Mail pre 2.0), Mail stored messages in standard .mbox
>format files. This kept the number of files to a reasonable number even for
>heavy email users and conformed to a widely used de facto standard.
>
>But as of Tiger (Mail 2.0), Mail stores every message in a separate .emlx
>file! Oh the horror. I've lost track of how many messages I have saved,
>probably getting near 50,000, and I'm not even a heavy email user compare
>with some people around TidBITS and TidBITS-Talk.

Edward,

        I don't agree that this is a bad thing. Cyrus (the
high-performance mail server Apple uses for Tiger Server) also uses
message-per-file. There are a few advantages. Backups fly -- you
don't have to pick up old messages in incrementals. Also, messages
are only written once (and possibly deleted once), so message
corruption is almost a non-issue. In contrast, I have lots of Eudora
mailboxes containing corrupt messages. Dealing with this is painful,
and with message-per-file, would not be.

        Basically, this puts a lot more strain on the file system
(which seems quite able to handle it), which is why message-per-file
has been avoided on Mac OS X, but that doesn't seem to be a real
problem today. I'm happy to have the isolation between messages.
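
The incremental-backup argument can be sketched with a little arithmetic in Python. The message count, message size, and timestamps below are hypothetical numbers chosen for illustration, and `incremental_bytes` is an assumed helper, not part of any backup tool.

```python
def incremental_bytes(files, last_backup):
    """Sum the sizes of files modified after the last backup.

    `files` maps a path to a (size_in_bytes, mtime) pair.
    """
    return sum(size for size, mtime in files.values() if mtime > last_backup)


# Hypothetical figures: 50,000 stored messages of 8 KB each,
# plus 3 messages that arrived after the last backup.
LAST_BACKUP = 1000
MESSAGE_SIZE = 8 * 1024
OLD, NEW = 50_000, 3

# Layout 1: a single mbox file; any new message changes the whole file.
single_mbox = {"Inbox.mbox/mbox": ((OLD + NEW) * MESSAGE_SIZE, LAST_BACKUP + 1)}

# Layout 2: one file per message; only the new files have a fresh mtime.
per_message = {f"{i}.emlx": (MESSAGE_SIZE, LAST_BACKUP - 1) for i in range(OLD)}
per_message.update(
    {f"{OLD + i}.emlx": (MESSAGE_SIZE, LAST_BACKUP + 1) for i in range(NEW)}
)

mbox_copy = incremental_bytes(single_mbox, LAST_BACKUP)      # 409624576 bytes
per_file_copy = incremental_bytes(per_message, LAST_BACKUP)  # 24576 bytes
```

Under these assumptions the mbox layout re-copies about 400 MB while the per-message layout copies about 24 KB. The sketch says nothing about the cost of scanning 50,000 files to find the three new ones.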


                                                Chris Pepper
--
Chris Pepper: <http://www.reppep.com/~pepper/>
Rockefeller University: <http://www.rockefeller.edu/>

Matt Neuburg (apparently) - Aug 1, 2005 12:22 pm (#3 Total: 21)  

Re: Tiger Mail vs Spotlight

On or about 8/1/05 10:56 AM, thus spake "John C. Welch" <jwelchbynkii.com>:

> On 8/1/05 08:52, "Edward Reid" <edwardpaleo.org> wrote:
>> Up through Panther (Mail pre 2.0), Mail stored messages in standard .mbox
>> format files. This kept the number of files to a reasonable number even for
>> heavy email users and conformed to a widely used de facto standard.
>
> Actually, it wasn't standard. It was a package, and inside the package was
> the "real" mbox file.

Well, a package isn't a "thing" except to the Finder and some other GUI
elements that voluntarily follow similar rules. To the rest of the system
(i.e. Unix as a whole), a package is just a folder, no more and no less. So
all you're saying is that it wasn't an .mbox file because it was in a
folder. That's ridiculous, because everything is in a folder. So it *was*
standard. m.

--
matt neuburg, phd = matttidbits.com, http://www.tidbits.com/matt/
pantes anthropoi tou eidenai oregontai phusei
AppleScript: the Definitive Guide -
http://www.amazon.com/exec/obidos/ASIN/0596005571/somethingsbymatt
Take Control of Word 2004, Tiger, and more -
http://www.takecontrolbooks.com/tiger-customizing.html
Subscribe to TidBITS! It's free and smart. http://www.tidbits.com/



atlauren (apparently) - Aug 1, 2005 12:22 pm (#4 Total: 21)  

Re: Tiger Mail vs Spotlight

At 10:56 AM -0700 8/1/05, John C. Welch wrote:
>Actually, it wasn't standard. It was a package, and inside the package was
>the "real" mbox file.

I thought it was a clever method of getting metadata about the
mailbox's contents. The <name>.mbox was a package, inside of which
was the actual mailbox file, named "mbox". A smattering of other
files contained certain metadata including, if I recall correctly,
an index of headers and such.

--
Andrew Laurence
atlaurenuci.edu

barefootguru (apparently) - Aug 1, 2005 12:22 pm (#5 Total: 21)  

Re: Tiger Mail vs Spotlight

> But as of Tiger (Mail 2.0), Mail stores every message in a
> separate .emlx
> file! Oh the horror. I've lost track of how many messages I have
> saved,
> probably getting near 50,000, and I'm not even a heavy email user
> compare
> with some people around TidBITS and TidBITS-Talk.

You're correct that Spotlight only works at the file level. iCal and
Address Book have also been adapted to work with this:
<http://arstechnica.com/reviews/os/macosx-10.4.ars/9>.

The biggest drawback of working at the file level is we're unlikely
to see Spotlight indexing of FileMaker databases, Entourage e-mail,
etc. But from a technical perspective, how could Apple have
programmed Spotlight to jump to a particular record or e-mail?


[I was hoping someone who knew more would jump in, but my impression is that Apple is at least working on, if they haven't already shipped, an API that developers can use to let Spotlight look inside database files. -Adam]


I second Adam's question: you haven't said what your objection to
e-mails as individual files is. I love it--as Chris says,
incremental backups fly. No more backing up 400 Megs because a few
new messages have arrived.

One gotcha, though, is that after conversion Tiger still leaves the
old mbox files around, i.e. the size of your Mail folder is
doubled. See this hint about clearing out the old mail:
<http://www.macosxhints.com/article.php?story=200505121139410>.

Cheers

atlauren (apparently) - Aug 1, 2005 1:08 pm (#6 Total: 21)  

Re: Tiger Mail vs Spotlight

At 12:22 PM -0700 8/1/05, Tom Robinson wrote:
>[I was hoping someone who knew more would jump in, but my impression
>is that Apple is at least working on, if they haven't already
>shipped, an API that developers can use to let Spotlight look inside
>database files. -Adam]

I've been told that Spotlight 1.0 (aka Tiger) can only handle one
data container within a file. Supposedly multiple data containers
in a file is a 2.0 feature.

--
Andrew Laurence
atlaurenuci.edu

cwilbur (apparently) - Aug 2, 2005 1:10 pm (#7 Total: 21)  

Re: Tiger Mail vs Spotlight



On Aug 1, 2005, at 3:22 PM, Tom Robinson wrote:

> The biggest drawback of working at the file level is we're unlikely
> to see Spotlight indexing of FileMaker databases, Entourage e-mail,
> etc. But from a technical perspective, how could Apple have
> programmed Spotlight to jump to a particular record or e-mail?

Spotlight works with "importers," and each document type has an
importer associated with it. When you invoke something from the
Spotlight dialog, Spotlight opens the file in the application, and
offers the application information on the search query. Applications
that are Spotlight-aware then know to search their document. So it's
not Spotlight that makes the program open to a particular location,
but the application itself.

Consider a PDF. If I search for a phrase from a PDF in the Spotlight
bar, Spotlight shows me a list of PDFs containing that text. Then if
I open it from there, when Spotlight opens the document, it passes
the phrase I searched for to Preview, and it's Preview that does the
searching. If Preview were not Spotlight-aware, it would just open
the document as if I had double-clicked on the PDF's icon in the Finder.

Charlton


--
Charlton Wilbur
cwilburchromatico.net



barefootguru (apparently) - Aug 2, 2005 1:10 pm (#8 Total: 21)  

Re: Tiger Mail vs Spotlight

> Consider a PDF. If I search for a phrase from a PDF in the
> Spotlight bar, Spotlight shows me a list of PDFs containing that
> text. Then if I open it from there, when Spotlight opens the
> document, it passes the phrase I searched for to Preview, and it's
> Preview that does the searching. If Preview were not Spotlight-
> aware, it would just open the document as if I had double-clicked
> on the PDF's icon in the Finder.

But isn't Spotlight just passing the same 'find' command to all
applications? If I double-click on a TextWrangler document, the
search string appears in the Find dialog, but the document opens at
the top. I figure Preview scrolls to the right spot because it does
live finds...

I guess my reasoning is that applications are sharing a global 'find'
string rather than specifically being Spotlight aware--but the end
result is the same (I was trying to Google up something, but no joy).

Cheers

butchfag (apparently) - Aug 2, 2005 1:10 pm (#9 Total: 21)  

Re: Tiger Mail vs Spotlight

Got a question that is relevant to this topic. When I upgraded to
Tiger on my PB I checked my HD space because I'm aware I keep too much
stuff on the internal and things were fine. I didn't notice that my HD
space went bye bye when I opened mail for the first time (I have a LOT
of mail on the HD) and the system ran out of disk space and mail
crashed.

I got it back up and all was well after moving some large files off
the HD, but my inbox was wiped and my sent mail as well. Pulled the
inbox down off the server when I first connected but no joy on sent
mail. I noticed the mbox file was still there so I asked mail to
import my data, which it was happy to do again (now I'm at 3x the
space required for the actual mail) but no sent mail.

Does anyone have any ideas how I can get my sent mail back ? This is
my business account and having that data on hand is very nice.

Christopher Appell
European Market
FreeRecruiting.com


edward (apparently) - Aug 2, 2005 1:10 pm (#10 Total: 21)  

Re: Tiger Mail vs Spotlight



>[But what's the real world downside? Emailer used to do this long ago, but
>the filesystem then couldn't come close to handling it at a decent
>performance level. Is performance still a problem with this approach? -Adam]

I did a full backup last night, over 100Mb Ethernet to a hard disk on a
fast computer running Retrospect in server mode. I watched casually and saw
the speed vary from 49 MB/min to 465 MB/min. It was obvious from watching
that the high speeds were on large files and the low speeds occurred while
copying large directories of small files. The large files were mostly
non-compressible stuff like .mov, so no compression effects were involved.
The top speed is close to the limits of the LAN, so the actual client
performance difference may have been even greater. OTOH without doing
better controlled experiments, I can't tell whether the difference was due
to the file system itself or to the Retrospect client, nor whether the
Retrospect server could have had a bottleneck.

At any rate, that's a real world case with a ten-fold performance
difference. That's still meaningful to me.
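
As a quick check of the figures quoted above, the arithmetic can be worked through in Python. The only assumption added here is that "100Mb Ethernet" means 100 megabits per second and that protocol overhead is ignored.

```python
slow_mb_per_min = 49    # observed while copying directories of small files
fast_mb_per_min = 465   # observed while copying large files

# Ratio between the best and worst observed speeds.
ratio = fast_mb_per_min / slow_mb_per_min

# Theoretical ceiling of a 100 Mbit/s link: bits -> bytes -> per minute.
lan_limit_mb_per_min = 100 / 8 * 60

# How much of that ceiling the fastest observed speed used.
fast_fraction_of_lan = fast_mb_per_min / lan_limit_mb_per_min

print(f"speed ratio: {ratio:.1f}x")
print(f"LAN ceiling: {lan_limit_mb_per_min:.0f} MB/min")
print(f"fastest speed as a share of the ceiling: {fast_fraction_of_lan:.0%}")
```

The ratio comes out a little under ten, and the fastest speed is a bit under two-thirds of the raw link rate before any protocol overhead is counted.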

At 12:22 PM 08/01/2005 -0700, Tom Robinson wrote:
>The biggest drawback of working at the file level is we're unlikely
>to see Spotlight indexing of FileMaker databases, Entourage e-mail,
>etc. But from a technical perspective, how could Apple have
>programmed Spotlight to jump to a particular record or e-mail?
>
>[I was hoping someone who knew more would jump in, but my impression is
>that Apple is at least working on, if they haven't already shipped, an API
>that developers can use to let Spotlight look inside database files. -Adam]

A file system is just a specialized database. There's no reason a database
-- at least a record-oriented or object-oriented database -- can't be given
an interface that makes it look like a file system, where the records are
files. After all, that's exactly what a mountable disk image is -- a set of
APIs that manipulate references to objects called "files" within some
container. The API is the same whether the container is a Macintosh HD, an
ISO 9660 CD, a network disk, a .dmg file, etc etc. For the container to be
a FileMaker database or a Mail mailbox is not a big deal compared with the
radical variations among file systems under modern OSs.

In the long view, it doesn't matter whether Spotlight is set up to view
file systems as specialized databases or to view databases as specialized
file systems, as long as it takes a broad enough view to encompass both.
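
A minimal sketch of the "database that looks like a file system" idea, in Python: a wrapper that presents the messages inside one mbox file as if they were files in a folder. The class name `MboxDirectory`, its method names, and the `.eml` pseudo-file names are assumptions made for illustration. This is not a real file-system driver and not a Spotlight interface.

```python
import mailbox
import os


class MboxDirectory:
    """Present the messages in an mbox file as if they were files in a folder."""

    def __init__(self, mbox_path):
        self._box = mailbox.mbox(mbox_path)

    def listdir(self):
        # One pseudo-file name per message key in the mailbox.
        return [f"{key}.eml" for key in self._box.keys()]

    def read(self, name):
        # Map the pseudo-file name back to its message key and return the bytes.
        key = int(os.path.splitext(name)[0])
        return self._box.get_bytes(key)

    def close(self):
        self._box.close()
```

An indexer written against `listdir()` and `read()` would not need to know whether it was looking at a real folder of small files or at records inside a single container.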

>I second Adam's question, that you haven't said what your objection
>to e-mails as individual files is? I love it--as Chris says,
>incremental backups fly. No more backing up 400 Megs because a few
>new messages have arrived.

As mentioned above, I see massive performance degradation backing up very
large numbers of small files.

Any search requiring a relation that Spotlight doesn't have an index for
will suffer similar degradation. Even on a modern computer, opening 100,000
or a million files takes a long time, much longer than reading the data.

Backups can be argued either way. Yes, with totally naive storage and
organization and a completely file-based backup (no database audit
techniques), small files mean less duplicated data. OTOH, my daily backups
only duplicate a very small part of my email corpus because it's organized
into monthly mailboxes, most of which don't change. The interesting thing
is that I established this organization purely to make it easier for my
mind to handle it, not for the computer -- and yet it seems to serve the
computer well. However, if you prefer to regard the email corpus as one
bundle and view it entirely via Spotlight -- an eminently reasonable
approach -- then you do have different issues with backups.

As an aside, my objections to storing email in databases are based on the
fact that such databases don't use proper database backup techniques such
as audit files and thus can't be backed up reasonably, and that the
structure isn't open in a way that allows arbitrary programs to read it
without licensing software. Fix those issues and I'd pick a mail program
that uses a database. At present there's nothing on the horizon that meets
either of these criteria, much less both.

My most immediate objection to a huge number of files, though, is mostly
visceral. I've been programming almost 40 years now, and one thing that's
been pretty constant is that stressing the file system too hard is a Bad
Thing. You can argue and argue that file systems have become much more
robust (I agree) and that in theory they can handle the stress. History
makes me leery of trusting these arguments.

The one other point I'd make is in balancing indexing data vs file data.
When the average file size is only a kilobyte or so, you're reaching the
point that the directories consume nearly as much space as the files --
maybe not within a few percent, but within a small factor. This doesn't
seem to be a good use of the file system. A reasonable ratio of data to
index is a simple and imperfect, but useful, criterion in picking the
storage method.

At 12:12 PM 08/01/2005 -0400, Chris Pepper wrote:
>Backups fly -- you don't have to pick up old messages in incrementals.

See above. Incremental backups fly, full backups don't.

>Also, messages are only written once (and possibly deleted once), so
>message corruption is almost a non-issue.

At the cost of a lot more data in, and writing to, the file system
directory. Personally I'd rather do more file I/O and less directory I/O.
I'd rather have a corrupted message than a corrupted disk directory.

>In contrast, I have lots of Eudora mailboxes containing corrupt messages.
>Dealing with this is painful, and with message-per-file, would not be.

I don't know why the difference, but I've never had problems with corrupted
Eudora mailboxes except when I had a hard disk in the process of going
south, and I had problems with a lot of things then. ;-) This is in six
years of using Eudora, and before that eight years using uAccess, which
used the same mailbox format.

>Basically, this puts a lot more strain on the file system (which seems
>quite able to handle it), which is why message-per-file has been avoided
>on Mac OS X, but that doesn't seem to be a real problem today.

Again, see above. My lack of trust is based on history.

=================================================

I'm not going to argue this at length, and I won't post again unless
there's a specific point to be addressed. File systems are intended to be
used, so there's no perfect argument for using more or fewer files. It's
all in the balance.

Also, I'm not defending mbox format -- it is a terrible database format,
yet that's basically what it's used as. It's useful because it's completely
open -- no special software needed to read it -- and has a good data/index
ratio.

Edward
Art Works by Melynda Reid: http://paleo.org


cwilbur (apparently) - Aug 2, 2005 5:30 pm (#11 Total: 21)  

Re: Tiger Mail vs Spotlight



On Aug 2, 2005, at 4:10 PM, Tom Robinson wrote:

> But isn't Spotlight just passing the same 'find' command to all
> applications? If I double-click on a TextWrangler document, the
> search string appears in the Find dialog, but the document opens at
> the top. I figure Preview scrolls to the right spot because it does
> live finds...

Exactly. TextWrangler may not understand the Spotlight information
that it receives, so it opens the document as it would if you just
double-clicked on it. But....

> I guess my reasoning is that applications are sharing a global 'find'
> string rather than specifically being Spotlight aware--but the end
> result is the same (was was trying to Google up something, but no
> joy).

applications *do* share a global "find" string, as well.

Charlton


--
Charlton Wilbur
cwilburchromatico.net



edward (apparently) - Aug 2, 2005 5:30 pm (#12 Total: 21)  

Re: Tiger Mail vs Spotlight

At 01:10 PM 08/02/2005 -0700, Edward Reid wrote:
>See above. Incremental backups fly, full backups don't.

OK, I'll add one thing. The actual copy part of an incremental backup will
fly, and it won't use much space. But scanning the disk still takes a long
time. If you have several hundred thousand email messages in separate
files, it's likely to take half an hour or more just to scan them to decide
what to back up. That's not flying.

Edward
Art Works by Melynda Reid: http://paleo.org


Nigel Stanger (apparently) - Aug 3, 2005 12:05 am (#13 Total: 21)  

Re: Tiger Mail vs Spotlight

On 3/8/2005 12:30 PM, "Charlton Wilbur" <cwilburchromatico.net> spake thus:

> applications *do* share a global "find" string, as well.

Yes, and I wish I could turn it off. I've lost count of the number of times
that I've been searching for occurrences of a specific string in BBEdit,
then switched to some documentation perhaps and searched for something else,
then switched back to BBEdit to continue looking for occurrences of the
original string, only to find that it's now searching for the second
(totally irrelevant in context) string. Argh!

--
Nigel Stanger, Dunedin, NEW ZEALAND.
http://public.xdi.org/=nigel.stanger


barefootguru (apparently) - Aug 3, 2005 12:29 pm (#14 Total: 21)  

> Does anyone have any ideas how I can get my sent mail back ? This is
> my business account and having that data on hand is very nice.

Umm... restore the sent mail folder from a recent backup?

butchfag (apparently) - Aug 3, 2005 3:11 pm (#15 Total: 21)  

On 8/3/05, Tom Robinson <tamatiparadise.net.nz> wrote:
> > Does anyone have any ideas how I can get my sent mail back ? This is
> > my business account and having that data on hand is very nice.
>
> Umm... restore the sent mail folder from a recent backup?

An excellent suggestion; however, due to space limitations and the
gargantuan nature of my mailboxes I decided that only recent mail (two
months' worth) was sacrosanct, and that is kept on my imap
server and backed up there. I don't really regret this decision as
while I would really like to have the sent mail back, not having it
doesn't really impact my work. I'd just like to have it back,
especially considering it's got to still be there in the mbox file.

I am seeing some other weirdness in mail since that fateful day: the
feature of clicking on the return arrow (showing I have replied to a
message already) doesn't open the corresponding sent message anymore.

Christopher Appell
European Market
FreeRecruiting.com

anthony (apparently) - Aug 4, 2005 10:47 am (#16 Total: 21)  

Re: Tiger Mail vs Spotlight

Edward Reid wrote:

> As a side, my objections to storing email in databases are based on the
> fact that such databases don't use proper database backup techniques
> such as audit files and thus can't be backed up reasonably, and that the
> structure isn't open in a way that allows arbitrary programs to read it
> without licensing software. Fix those issues and I'd pick a mail program
> that uses a database. At present there's nothing on the horizon that
> meets either of these criteria, much less both.

You might want to look at http://www.dbmail.org/ which is a mail server
that stores the mail in MySQL, PostgreSQL, or SQLite. [Haven't used it,
just quickly found it by searching]

raykloss - Aug 5, 2005 6:01 pm (#17 Total: 21)  

One aspect of Apple Mail that relates to Spotlight has not been mentioned, but is something I noticed on the first version of Tiger and remains today. Spotlight will index your sent mail, and keep it indexed even if you delete it! No matter what you do to remove traces of an email, it can still be found with Spotlight. Even saved drafts will be seen in the global (Finder) Spotlight. Looking for the same item in Mail will show nothing.

I hope this is fixed in future versions of Spotlight and/or Mail.

Nik (apparently) - Aug 8, 2005 9:54 am (#18 Total: 21)  

Re: Tiger Mail vs Spotlight

On Aug 5, 2005, at 7:01 PM, raykloss wrote:

> One aspect of Apple Mail that relates to Spotlight has not been
> mentioned, but is something I noticed on the first version of Tiger
> and remains today. Spotlight will index your sent Mail - but even
> if you delete it! No matter what you do to remove traces of an
> email, it will be able to be found with Spotlight. Even saved
> drafts will be seen in the global (Finder) Spotlight. Looking for
> the same item in Mail will show nothing.
>
> I hope this is fixed in future versions of Spotlight and/or Mail.

I'm not sure that this is really a problem.

Spotlight is frequently a little behind activity in Mail. If you
trash a message, Spotlight will often believe that it's still in the
Inbox or some other folder.

Also, while Mail omits the Trash from certain searches, Spotlight
does not. Personally, I use the Trash as a rolling 180-day archive of
my email, so I prefer this behavior.

In any case, if you truly delete a message (trash it and then delete
it from the trash), Spotlight will eventually lose track of it. This
can, however, take a little while. (As much as 5-10 minutes.) It
might take longer still if you're using an IMAP account, because it
could still be in the IMAP cache even though it's been "deleted"
locally.

I just tested this by deleting a message. As you say, it sticks
around (and can even be opened from Spotlight) for a few minutes, but
after that, it's vanished entirely. I can no longer find it in Mail
nor in Spotlight.

Strangely enough, an mdfind command in the Terminal still shows a
hit, but it's for a file which doesn't exist anymore.
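
One way to screen out such stale hits is to check each reported path against the disk. The sketch below, in Python, assumes you have already captured the output of an `mdfind` query as a list of paths, one per line. The function name `drop_stale_hits` is hypothetical.

```python
import os


def drop_stale_hits(paths):
    """Keep only the search hits whose files still exist on disk."""
    return [path for path in paths if os.path.exists(path)]
```

Any path that is dropped is an index entry that has outlived its file, which matches the behaviour described above.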

--Nik

Peter N Lewis (apparently) - Aug 9, 2005 9:44 am (#19 Total: 21)  

Re: Tiger Mail vs Spotlight

>OK, I'll add one thing. The actual copy part of an incremental
>backup will fly, and it won't use much space. But scanning the disk
>still takes a long time. If you have several hundred thousand email
>messages in separate files, it's likely to take half an hour or more
>just to scan them to decide what to back up. That's not flying.

While I'm not at all sold on the idea of one message per file (a
database is a perfectly sensible thing to store data in after all!),
I can't see why scanning several hundred thousand files should take
half an hour or more - not that it won't, I would not be surprised if
it did, but it shouldn't. For example, I just did this:

ls -lR ~ >ls-lR.txt

it took around a minute and a half to scan my home folder, generating
a file with 472,000 lines (so roughly 472,000 files). That
information includes the modification date, file size, and
permissions, so ls has done pretty much all the work required to
determine if a file has changed. The file size is 27Meg, and to send
it across the network (100 Mbit Ethernet), encrypted over ssh, took about
9 seconds.

So for an incremental backup of my home directory with 470,000 files,
it should take around two minutes to determine what needs to be
backed up.
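
The same idea can be sketched in Python: record size and modification time for every file, then compare against the previous scan to decide what needs copying. The function names `scan` and `changed_since` are illustrative assumptions, and a real backup program would also have to handle deletions, permissions, and resource forks.

```python
import os


def scan(root):
    """Return {relative_path: (size, mtime)} for every non-directory entry under root."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.lstat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            snapshot[os.path.relpath(full, root)] = (info.st_size, info.st_mtime)
    return snapshot


def changed_since(previous, current):
    """Paths that are new, or whose size or mtime differs from the previous scan."""
    return sorted(
        path for path, meta in current.items() if previous.get(path) != meta
    )
```

Like `ls -lR`, this reads only metadata and never opens a file, so the cost should track the number of directory entries rather than the amount of mail stored.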

Whether any backup program could actually manage that, I don't know.
rsync should, but rsync under Tiger is so broken that it has no
chance of backing up my home directory.

All that said, I'm still dubious about the wisdom of using the file
system as a database, and I probably have over 300,000 email
messages, so in fact the number of files in my home directory will
almost double if Eudora switched to one message per file. But there
is no excuse for an incremental backup to take more than a few
minutes to determine which files need to be copied.
    Peter.
--
<http://www.stairways.com/> <http://download.stairways.com/>

Jeff Porten (apparently) - Aug 10, 2005 7:50 am (#20 Total: 21)  

Re: Tiger Mail vs Spotlight

Regarding "deleted" messages being available in Spotlight, note that
Mail uses a caching mechanism and doesn't immediately commit mailbox
changes to disk. So when you delete a message or move it to another
folder, there is some period of time where the message appears moved
in the UI, but is still in place in the filesystem. I have no idea
what causes this lag or what determines its length; sometimes it's
immediate, sometimes it's a minute or two.

AFAIK, the index should be immediately updated once the files are
actually modified -- that's part of the I/O hook system. However, if
mds has other work to do (other files having been modified, for
instance), I wouldn't be surprised if there was a queuing effect
resulting from that.

As for the number of files -- I got over that issue once I realized
that stock OS X plus Developer tools was around 400,000 files.
Biiiiiig change from OS 9 standards. So the 1,013,641 files I'm
carting around on my laptop don't cause me any agita.

Here's what is, though:

1) why the heck does an indexed search mechanism take so darned
long? Sometimes I get the near-instant results promised by Steve;
other times, it's a 2-minute wait. Still a HUGE improvement on the
20-minute waits I had under Panther, but I'd like to be able to break
the habit of opening a new mail browser every time I do a search (so
I can read something else in the meantime).

2) I'm starting to see long wait times just opening folders -- 90
seconds to open my Sent folder, for example, with under 1,000
messages in it.

Anyone with clues on fixing these, let me know. 1 GHz G4, gig of
RAM. I frequently run under virtual memory situations that would
kill a bull moose, but I can watch my paging with MenuMeters, and
that's not what's causing these problems.

Best,
Jeff

fcchuan - Sep 5, 2005 1:17 pm (#21 Total: 21)  

Re: Tiger Mail vs Spotlight

Just to come back to the point in the original post about the massive number of messages stored as files.

Hasn’t this been an issue since packages were introduced in late OS 9? Applications and documents (e.g. Pages) used in OS X are now frequently packages as well. And if I’m not mistaken, packages are simply folders containing numerous collections of files. Sometimes packages include the tiniest elements, such as .gif files, HTML help files, plist files, etc. In Panther, copying a single package would reveal a progress bar showing that thousands of files were being copied. (The interface has been changed in Tiger.)

This means that there is already a hidden proliferation of large numbers of files that the file system has to handle. Apple’s decision to make Mail store “messages as files” just makes that issue more overt.
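
To see how many files hide inside a package, a short Python sketch can count everything beneath a folder. The function name `count_files` is an assumption for illustration. Point it at any application bundle or package document.

```python
import os


def count_files(root):
    """Count the files beneath root, descending into every subfolder."""
    return sum(len(filenames) for _dirpath, _dirnames, filenames in os.walk(root))
```

For example, `count_files("/Applications/Mail.app")` would report how many individual files the Finder presents as one icon.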


