ehMac.ca


Slackjaw Jun 2nd, 2020 03:43 PM

Will OCR help with this?
 
Seeking some technical guidance regarding a book my 94 year old Mum-in-law is writing.

The issues are numerous. I’ll do my best to sum up what I’m up against.

The book has been written entirely as a PDF using MS “Wordpad”.

The process of spell checking the PDFs and making other corrections is remarkably painful, to the point that I offered some assistance, only to discover that what I thought might be a somewhat easy fix is anything but…

I’ve used Google Docs to convert the PDFs to LibreOffice .odt, which allows spell checking, BUT none of the images carry over. This means all images would have to be imported and inserted back in the appropriate spots. This would be a more daunting process than I am willing to undertake. Why? You might ask….

Well, the book is now up to approximately 500 pages, with more to come, and on those 500 pages there are more than SEVEN HUNDRED AND THIRTY NINE images!!! With more to come.

I’m aware of the limitations PDFs present, but Wordpad is the only app Mum feels comfortable with, and at this point, 500 pages in… what do you do?

I’m wondering if an OCR program of some description can be employed to convert the complete document with all images/photos intact and in place.

Hopefully I’ve explained my dilemma. Thanks for what comes back.

CliveK
macOS 10.15.4

wonderings Jun 3rd, 2020 10:15 AM

Sounds like you need a page layout program. I would take a look at Affinity Publisher, which is an InDesign-like application for a fraction of the cost and with no subscription.

https://affinity.serif.com/en-gb/publisher/

You should be able to import a Word file with images, as you can in InDesign. I think there is a 90-day full-featured demo available, so you could give it a shot. If it works, the price is around $60, I believe, and it may be on sale for 50% off at the moment. Either way, it is really cheap for software that can hopefully help get this set up more easily.

eMacMan Jun 3rd, 2020 12:49 PM

Quote:

Originally Posted by wonderings (Post 2722014)
Sounds like you need a page layout program. I would take a look at Affinity Publisher, which is an InDesign-like application for a fraction of the cost and with no subscription.

https://affinity.serif.com/en-gb/publisher/

You should be able to import a Word file with images, as you can in InDesign. I think there is a 90-day full-featured demo available, so you could give it a shot. If it works, the price is around $60, I believe, and it may be on sale for 50% off at the moment. Either way, it is really cheap for software that can hopefully help get this set up more easily.

Does it import correctly from the pdf format the OP referred to? I would think it should but nothing is ever certain in the digital world.

I suspect if he was working with a Word file there would have been no problem in the first place.

wonderings Jun 3rd, 2020 02:03 PM

Quote:

Originally Posted by eMacMan (Post 2722018)
Does it import correctly from the pdf format the OP referred to? I would think it should but nothing is ever certain in the digital world.

I suspect if he was working with a Word file there would have been no problem in the first place.

Oh, missed that bit.

This might be the one place where Publisher's PDF handling could be a good thing. Affinity Publisher likes to make a PDF editable. It is a nightmare, as it cannot use embedded fonts, so if you are working with client-supplied PDFs and place them in Publisher they can get seriously messed up. But in this case it might be a good thing, as it "should" make your PDF editable in Publisher with all your images and formatting. The trial is free, so there is no harm in trying, and if I am understanding correctly this might just work.

pm-r Jun 3rd, 2020 02:40 PM

I think I would contact the potential publisher of the book and ask which applications they suggest and how to use them.

No one will want to have to redo a 500+ page book for spelling, formatting, file type, etc.

I would have thought that writing a lengthy book as a PDF would be the last choice, and I can't see any advantage in using OCR. That would at least double the amount of work involved, I would think.



- Patrick
======

WCraig Jun 4th, 2020 07:26 AM

Quote:

Originally Posted by Slackjaw (Post 2721974)
...
The book has been written entirely as a PDF using MS “Wordpad”.

 ...

Wordpad's native file format is NOT pdf. Get access to the original file on Windows. MS Word can read native Wordpad files. Or from Windows, save the file in rtf format and then many word and document processors will be able to read it.
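If an rtf copy does turn up, the conversion step described above could be scripted along these lines. This is only a minimal sketch, assuming LibreOffice is installed and its `soffice` program can be found; the helper names and the file name `book.rtf` are hypothetical and not from this thread. Whether every image survives the conversion is something to check on a copy of the file first.

```python
import shutil
import subprocess


def build_convert_command(src, out_dir=".", target="odt", binary="soffice"):
    """Return the argument list for a headless LibreOffice conversion."""
    return [binary, "--headless", "--convert-to", target, "--outdir", out_dir, src]


def convert(src, out_dir=".", target="odt", binary="soffice"):
    """Run the conversion if the binary can be found; return True if it ran."""
    cmd = build_convert_command(src, out_dir, target, binary)
    if shutil.which(binary) is None:
        # Nothing is run when LibreOffice is not reachable; just show the command.
        print("LibreOffice not found; would have run:", " ".join(cmd))
        return False
    # check=True raises CalledProcessError if LibreOffice reports a failure.
    subprocess.run(cmd, check=True)
    return True
```

Calling `convert("book.rtf")` would write `book.odt` to the current folder. On a Mac, `soffice` is usually not on the PATH, so the `binary` argument would need the full path inside the LibreOffice application bundle, typically `/Applications/LibreOffice.app/Contents/MacOS/soffice`.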

Craig
(Let me guess...the original file is corrupt and there are no backups.)

Slackjaw Jun 4th, 2020 11:01 AM

Thanks for the suggestions. I'll take a look at the programs suggested.

@WCraig. I'm aware that .pdf is not the native file format for Wordpad, but Mum saved everything as a .pdf. There are no rtf files to be found.

CliveK

pm-r Jun 4th, 2020 12:05 PM

Quote:

Craig
(Let me guess...the original file is corrupt and there are no backups.)

Isn't it amazing how often that seems to be so true!!!

So often, multiple years of work and typing disappear into uselessness.




- Patrick
======

WCraig Jun 4th, 2020 12:42 PM

Quote:

Originally Posted by Slackjaw (Post 2722122)
... @WCraig. I'm aware that .pdf is not the native file format for Wordpad, but Mum saved everything as a .pdf. There are no rtf files to be found.

CliveK

This makes no sense. How can your mother continue to write and edit the book? Wordpad cannot read a pdf file; it can only output them. Thus there must be a native file somewhere. I'm not at a Windows machine but I don't think rtf is the native format for Wordpad files, either.
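One way to hunt for that native file is to scan the Windows user folders for document types WordPad can save. The following is a rough sketch only: the extension list, the function names and the folder passed in are assumptions for illustration, not details from this thread. It also checks for the `{\rtf` signature that RTF files begin with, in case a file was saved under a misleading name.

```python
import os

# Document types WordPad can save to (assumed list, for illustration only).
CANDIDATE_EXTS = {".rtf", ".docx", ".odt", ".txt"}


def looks_like_rtf(path):
    """Return True if the file starts with the RTF signature '{\\rtf'."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"{\\rtf"


def find_candidates(root):
    """Walk `root` and return sorted (path, is_rtf) pairs for candidate files."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext in CANDIDATE_EXTS:
                full = os.path.join(dirpath, name)
                hits.append((full, looks_like_rtf(full)))
    return sorted(hits)
```

Running `find_candidates` on the Documents folder of the Windows machine would list every candidate along with a flag showing whether it really is RTF inside. PDFs are skipped on purpose, since the goal is to find the editable originals.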

Craig

pm-r Jun 4th, 2020 12:55 PM

Quote:

Originally Posted by WCraig (Post 2722136)
This makes no sense. How can your mother continue to write and edit the book? Wordpad cannot read a pdf file; it can only output them. Thus there must be a native file somewhere. I'm not at a Windows machine but I don't think rtf is the native format for Wordpad files, either.

Craig


I gather she must be using a Windows computer as well???

Quote:

RTF was created by the Microsoft Word team back in the 1980’s. It was intended as a universal format that could be used by most word processors, making it easier for people to share Word documents with people who don’t use Word. It was also incorporated as the default format used by Windows’ built-in WordPad app—a lightweight word processor.




- Patrick
======


All times are GMT -4.
