Concordance Software for Mac?

dave28c (apparently) - 04:04am Feb 19, 2008 PST
via email - Dave Clark

I'm looking for decent and cheap Concordance software for Mac that
will create a full concordance of a large PDF. I have numerous
documents created in a word processing program (probably Word for
Windoze) and then converted to PDF that are quite large -- 30 or so
pages -- and I'd like to create a full word list. I have in mind
something like the word list that is created for deposition
transcripts, although these are not depos.

I've done a Google search and found nothing suitable, except small
programs that will allow the user to insert an identified word into a
search box, then tell you everywhere the word appears. Preview will
do that. What I am looking for is a program that gives a list of
every word, maybe with the exception of the "noise words" like a, an,
and, the, and so on.

Thanks for the help.

Dave





Curtis Wilcox (apparently) - Feb 19, 2008 11:51 am (#1 Total: 7)  

via email  

Posts: 345
Re: Concordance Software for Mac?

On Feb 19, 2008, at 6:04 AM, David Clark wrote:

> I'm looking for decent and cheap Concordance software for Mac that
> will create a full concordance of a large PDF. I have numerous
> documents created in a word processing program (probably Word for
> Windoze) and then converted to PDF that are quite large -- 30 or so
> pages -- and I'd like to create a full word list.

Sounds like a job for... Perl! But first, you need the text as text.
One thing Adobe Reader has over Preview.app (at least as of 10.4) is
a Save As Text option. Sure, in Preview you can
Select All, Copy, then Paste into a text document, but Save As Text
seems easier and I think the output might be cleaner. If you want a
concordance created from multiple PDFs, you could probably merge them
together using a program like Combine PDFs, then use Save As Text.

http://www.monkeybreadsoftware.de/Freeware/CombinePDFs.shtml

Now for the Perl. Since text manipulation is its forte, I figured
this was a problem someone already solved. I searched for
"concordance" on perlmonks.org and got many hits. This post at the
end of a thread from 2001 seems to work nicely.

http://perlmonks.com/?node_id=104687

I literally selected and copied it from the web page, pasted it into
a text editor, saved it (by convention Perl scripts end in .pl),
typed 'perl ' in Terminal, dragged the Perl script to the window, then
dragged a PDF's saved text to the window after it and hit Return. Out
came an alphabetically sorted list of words, each followed by the
numbers of the lines on which the word appears.
Actually, at first the line numbers didn't work correctly because
Adobe saves the file with Mac line endings (CR) and the script only
recognizes Unix line endings (LF). There are many ways to fix that, but
I opened the text file in TextWrangler, changed the line endings from
Mac (CR) to Unix (LF), and saved (Windows CRLF would also work).
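
The line-ending fix can also be scripted rather than done in an editor. What follows is only a minimal sketch, written in Python rather than the Perl used by the script above. The function names and the UTF-8 encoding are assumptions made for illustration; text saved by an older Adobe Reader may well use a different encoding.

```python
def normalize_line_endings(text):
    """Convert Windows (CRLF) and classic Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so that it does not turn into two line feeds.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def convert_file(src_path, dst_path, encoding="utf-8"):
    """Read src_path, normalize its line endings, and write the result to dst_path."""
    # newline="" stops Python from translating line endings on its own.
    # The encoding is an assumption; adjust it to match the saved text file.
    with open(src_path, "r", newline="", encoding=encoding, errors="replace") as src:
        text = src.read()
    with open(dst_path, "w", newline="", encoding=encoding) as dst:
        dst.write(normalize_line_endings(text))
```

Replacing CRLF before bare CR matters: done the other way around, every Windows line ending would become two line feeds and the line numbers would drift.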

If you didn't want the line numbers, you could change this line:
     print ("$key ", join (', ', @{$concord{$key}}), "\n");
to this:
     print ("$key\n");


When I wanted the concordance saved to a different file, I did the
same typing and dragging then added " > output.txt" which redirects
the script's output from standard output (the Terminal window) to a
file. Unless you changed directories in the Terminal window,
"output.txt" will be in your OS X account Home directory.
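
For readers who want to see the idea spelled out, here is a rough sketch of a word-to-line-number concordance that also skips "noise words". It is written in Python and is not the Perl script from the PerlMonks thread. The function names and the short stop list are made up for illustration, and the simple pattern only recognizes unaccented English letters and apostrophes.

```python
import re
from collections import defaultdict

# A tiny, illustrative stop list; extend it to suit your documents.
NOISE_WORDS = {"a", "an", "and", "the", "of", "to", "in"}


def build_concordance(text, noise_words=NOISE_WORDS):
    """Map each lowercased word to the sorted list of line numbers it appears on."""
    concordance = defaultdict(set)
    # splitlines() splits on LF, CR, and CRLF (and a few rarer separators),
    # so Mac-style files do not need converting first.
    for line_number, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[a-z']+", line.lower()):
            word = word.strip("'")
            if word and word not in noise_words:
                concordance[word].add(line_number)
    # Sort the words alphabetically and the line numbers numerically.
    return {word: sorted(lines) for word, lines in sorted(concordance.items())}


def format_concordance(concordance):
    """Render one 'word n1, n2, ...' line per entry."""
    return "\n".join(
        f"{word} {', '.join(str(n) for n in lines)}"
        for word, lines in concordance.items()
    )
```

Printing `format_concordance(build_concordance(text))` for the contents of a saved text file gives one word per line followed by its line numbers, and that output can be redirected to a file in the same way as described above.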


Randy B. Singer (apparently) - Feb 19, 2008 11:51 am (#2 Total: 7)  

via email - Co-Author: The Macintosh Bible (4th, 5th, and 6th editions)  

Posts: 190
Re: Concordance Software for Mac?



On Feb 19, 2008, at 3:04 AM, David Clark wrote:

> I'm looking for decent and cheap Concordance software for Mac that
> will create a full concordance of a large PDF.

Concorder (free)
http://homepage.mac.com/fahrenba/programs/concorder/concorder.html

Concorder Pro (free)
http://homepage.mac.com/fahrenba/programs/concorderPro/concorderPro.html

AnalyzeText ($49)
http://www.wuffwuffware.com/products/AnalyzeText.html


___________________________________________
Randy B. Singer
Co-author of The Macintosh Bible (4th, 5th, and 6th editions)

Macintosh OS X Routine Maintenance
http://www.macattorney.com/ts.html
___________________________________________




Nicholas Barnard - Feb 20, 2008 10:47 pm (#3 Total: 7)  

Guest User  

Posts: 1
Re: Concordance Software for Mac?

At 10:51 AM -0800 2/19/08, Randy B. Singer wrote:
>On Feb 19, 2008, at 3:04 AM, David Clark wrote:
>
>> I'm looking for decent and cheap Concordance software for Mac that
>> will create a full concordance of a large PDF.
>
>Concorder (free)
>http://homepage.mac.com/fahrenba/programs/concorder/concorder.html
>
>Concorder Pro (free)
>http://homepage.mac.com/fahrenba/programs/concorderPro/concorderPro.html
>
>AnalyzeText ($49)
>http://www.wuffwuffware.com/products/AnalyzeText.html

I love the fact that Randy gives us the Mac answer of three programs
whereas Curtis gives us the 40-line Unix answer. (No offense to
either Curtis or Randy.) This is a reminder that there are always more
than three ways to skin a cat. A fact I find very useful to mention
when disciplining felines.

But seriously, this is an example of the beauty of modern Macs!

Cheers
~Nick
http://www.inmff.net

dave28c (apparently) - Feb 20, 2008 10:47 pm (#4 Total: 7)  

via email - Dave Clark  

Posts: 103
Re: Concordance Software for Mac?

On Feb 19, 2008, at 10:51 AM, Randy B. Singer wrote:

> Concorder (free)
> http://homepage.mac.com/fahrenba/programs/concorder/concorder.html
>
> Concorder Pro (free)
> http://homepage.mac.com/fahrenba/programs/concorderPro/concorderPro.html
>
> AnalyzeText ($49)
> http://www.wuffwuffware.com/products/AnalyzeText.html

I tried Concorder Pro on a large PDF. First, it won't take the
full 16-page document but apparently requires page-by-page copy and
paste. Then, the only "concordance" I got was the file name and
the frequency of occurrence of individual words. There does not seem
to be any way to locate them within the document. There is no
page index.

I guess what I'm trying to do is create a detailed index after the
fact when the original author failed to do so. For example, in
several hundred pages of related PDFs, each of which is a separate
document, where do I find "tool" or "Exhibit" or "damage" or "mold"
or whatever single word. I know one can search in Preview for
individual words, but it requires going through separate documents
one by one. I still do not know how to take the result from Preview
and copy>paste it to something else, and searching the Preview help
does not seem to disclose any way to do that.
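
The kind of after-the-fact index described here, covering many separate documents, could be sketched roughly as follows. This is Python, offered only as an illustration under stated assumptions: each PDF has already been saved as text, and pages are separated by a form-feed character. Not every PDF-to-text export marks pages that way, so that needs checking. The function names and file names are hypothetical.

```python
import re
from collections import defaultdict


def index_documents(documents, page_break="\f"):
    """Map each word to a sorted list of (document name, page number) pairs.

    documents: dict of document name -> full text of that document.
    page_break: the character assumed to separate pages (an assumption;
    check what your PDF-to-text converter actually emits).
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for page_number, page in enumerate(text.split(page_break), start=1):
            for word in re.findall(r"[a-z']+", page.lower()):
                index[word.strip("'")].add((name, page_number))
    index.pop("", None)  # drop entries that were only apostrophes
    return {word: sorted(places) for word, places in sorted(index.items())}


def lookup(index, word):
    """Return every (document, page) where word occurs, or an empty list."""
    return index.get(word.lower(), [])


# Usage with made-up file names and contents:
docs = {
    "report.txt": "Mold in the wall\fExhibit A shows damage",
    "letter.txt": "The tool caused damage",
}
index = index_documents(docs)
print(lookup(index, "damage"))
```

If the exported text carries no page markers at all, the same structure still works with line numbers in place of page numbers.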

I'll try wuffwuffware.com next.

Dave Clark
http://home.earthlink.net/~dc1999/
http://web.mac.com/dave28c
http://www.clarklawfirm.com

lbyron - Feb 21, 2008 2:13 am (#5 Total: 7)  


Posts: 1
Re: Concordance Software for Mac?

Another simple option is CopyPasteX, a multi-clipboard utility. You can download a demo at www.copypaste-x.com.

It has the option to create word lists of varying kinds from documents in its clipboard editor.

You would select all the text and copy it to your clipboard, from which it automatically goes to the installed CopyPaste X. Double-click the copied item in the palette and it opens in the editor, then go to a menu item labeled Lists. The dialog that comes up gives a number of options, including searching for URLs, email addresses, word fragments, or words that contain certain letters. But it doesn't give the line number where the word appears.

Then it has some sort options as well.

I use the program for its multi-clipboard function, but this is just an extra. I just tried it and it works, but I have never had a reason to use it before.

BTW, the multi-clipboard associated text editor has a number of text cleaning options as well. Those have been very useful in the past, like cleaning up hard returns to word wrap, etc.

Eric

Dick Furnas - Feb 26, 2008 3:25 pm (#6 Total: 7)  


Posts: 8
Re: Concordance Software for Mac?

I'm sure this is not the concordance style you were looking for, but this is an incredible, interactive visualization tool for doing work for which a concordance might be a more primitive tool:

http://www.textarc.org/

It shows all the words in the entire text on the screen twice: once as a ribbon around the perimeter, and again with each word positioned inside at the "center of gravity" of its occurrences on the perimeter. Click on a word in the interior and arcs are drawn to its locations on the perimeter. It can also "read" the text, showing arcs to successive words wherever they occur.

KeithNealy - Feb 26, 2008 3:25 pm (#7 Total: 7)  


Posts: 3
Re: Concordance Software for Mac?

Let me recommend SpotInside from http://www.oneriver.jp/SpotInside/index.html

It won't build an index but it will search and display in context any word you care to look for. If you used Concorder to reveal a complete list of all words you could scan it for relevant ones and then use SpotInside to show each occurrence in context without opening each document.
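
To make "display in context" concrete, here is a small keyword-in-context sketch in Python. It has nothing to do with how SpotInside itself works; the function name and the `width` parameter are made up for illustration, and it assumes the documents have already been saved as plain text.

```python
import re


def keyword_in_context(text, keyword, width=30):
    """Return one snippet per occurrence of keyword, with up to `width`
    characters of surrounding text on each side."""
    snippets = []
    # \b keeps "tool" from matching inside "toolbox"; re.escape guards
    # against special characters in the keyword.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Collapse line breaks and runs of spaces so each snippet fits on one line.
        snippets.append(" ".join(text[start:end].split()))
    return snippets
```

Running this over each saved text file for the words picked out of a full word list gives a rough, text-only version of the workflow suggested above.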


