Topic: Word text to HTML (Page 1 of 1)

 
jstuartj
Paranoid (IV) Inmate

From: Mpls, MN
Insane since: Dec 2000

posted 06-13-2007 21:07

Ok, I've got a stupid question.

I do weekly updates on several sites, for which I receive news and copy, mostly as Word docs. The problem is that when I copy and paste directly from Word into my editor, or export from Word, I have to manually replace the incorrectly encoded quotes and some other characters, and then manually add paragraph breaks. Word's convert-to-HTML mostly sucks, and I'm in need of a better solution.

Is there a better way to convert this text to very simple, basic HTML, perhaps just inserting proper paragraph tags? I currently don't use Dreamweaver. My current editor of choice is WeBuilder, and so far I haven't found a solution using it. It does include HTML Tidy, which I've played with a little, but I haven't had much luck.

I have found several online converters, but I need a client-side solution, as the copy has to remain confidential until posted.
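
For the simplest case described above (plain text pasted out of Word, curly quotes replaced, paragraph tags added), a small local script may be enough. The following is only a sketch: the function name and the character table are made up for illustration, and it assumes each non-empty line of the pasted text is one paragraph, which is how a plain-text paste from Word usually arrives. Everything runs on the local machine, so the copy never leaves it.

```python
import html

# Word "smart" punctuation mapped to numeric HTML entities.
# This table is an illustrative subset, not a complete list.
SMART_CHARS = {
    "\u2018": "&#8216;",  # left single quote
    "\u2019": "&#8217;",  # right single quote / apostrophe
    "\u201c": "&#8220;",  # left double quote
    "\u201d": "&#8221;",  # right double quote
    "\u2013": "&#8211;",  # en dash
    "\u2014": "&#8212;",  # em dash
    "\u2026": "&#8230;",  # ellipsis
}


def text_to_paragraphs(text):
    """Wrap every non-empty line of pasted text in <p> tags.

    Assumption: one line of input equals one paragraph.
    """
    paragraphs = []
    for raw_line in text.splitlines():
        # Collapse runs of whitespace and skip blank lines.
        line = " ".join(raw_line.split())
        if not line:
            continue
        # Escape &, < and > first, so the entities added next stay intact.
        line = html.escape(line, quote=False)
        for char, entity in SMART_CHARS.items():
            line = line.replace(char, entity)
        paragraphs.append("<p>" + line + "</p>")
    return "\n".join(paragraphs)
```

Calling `text_to_paragraphs()` on the pasted text returns one `<p>` element per line, ready to drop into the editor. Headings, lists, and tables are not handled.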

J. Stuart J.

(Edited by jstuartj on 06-13-2007 21:09)

poi
Paranoid (IV) Inmate

From: Norway
Insane since: Jun 2002

posted 06-13-2007 21:14

HTML Tidy for the win!

jstuartj
Paranoid (IV) Inmate

From: Mpls, MN
Insane since: Dec 2000

posted 06-13-2007 21:22

Thanks,

I figured HTML Tidy was the right track. I will have to play with it some more.

J. Stuart J.

poi
Paranoid (IV) Inmate

From: Norway
Insane since: Jun 2002

posted 06-13-2007 21:38

Last time I tried, i.e. years ago, Dreamweaver did a good job of removing the crap in Word's markup. But HTML Tidy is supposed to do it very well too. It's worth googling for examples of how to do it.

I just found TidyGUI. You'll have to check the license though.

reisio
Paranoid (IV) Inmate

From: Florida
Insane since: Mar 2005

posted 06-14-2007 02:38

You just need some sed(/ish) scripts, or an editor that does macros, etc. If Word can save as HTML and the only problem is that decent HTML is surrounded by tons of crap HTML, it might be best to go that route: save as HTML from Word, then run it through a script or macro that cleans out the crap.
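
A sed-style cleanup of that kind could look roughly like the sketch below, written here in Python rather than sed. The pattern list and the function name are illustrative assumptions, not a complete catalogue of what Word emits, and regular expressions are a blunt tool for HTML, so treat the result as a first pass to be checked by eye.

```python
import re

# Each entry removes one kind of Word-specific clutter.
# The list is illustrative and certainly not exhaustive.
CLEANUP_PATTERNS = [
    # Conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    re.compile(r"<!--\[if.*?<!\[endif\]-->", re.S | re.I),
    # Embedded stylesheets
    re.compile(r"<style\b.*?</style>", re.S | re.I),
    # Namespaced Office tags such as <o:p> or </o:p>
    re.compile(r"</?\w+:\w+[^>]*>"),
    # Spans, which in Word output usually carry styling only
    re.compile(r"</?span\b[^>]*>", re.I),
    # class, style and lang attributes (quoted or unquoted values)
    re.compile(r"\s+(?:class|style|lang)=(?:\"[^\"]*\"|'[^']*'|[^\s>]+)", re.I),
]


def strip_word_markup(markup):
    """Apply every cleanup pattern in order and return the reduced markup."""
    for pattern in CLEANUP_PATTERNS:
        markup = pattern.sub("", markup)
    return markup
```

The attribute pattern is deliberately crude: it would also strip text such as ` class=foo` if that appeared in the body copy, which is one reason a real parser or HTML Tidy is the safer choice for anything beyond simple documents.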

jstuartj
Paranoid (IV) Inmate

From: Mpls, MN
Insane since: Dec 2000

posted 06-14-2007 03:41

I got Tidy working perfectly now. I found that if I export the doc as HTML from Word and then run it through HTML Tidy twice, I get exactly what I was after. WeBuilder 2007 has a very nice front end to Tidy, but TidyGUI would do the trick as well.
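
For anyone without a front end, the two-pass run could be scripted along these lines. This is a sketch under assumptions: it expects the command-line `tidy` program to be on the PATH, and the option set shown (`--word-2000`, `--bare`, `--clean`) is a guess at useful settings, not the configuration actually used in WeBuilder. The function names are hypothetical.

```python
import subprocess


def tidy_command(path):
    """Build the tidy command line for one in-place pass over `path`.

    The options are an assumed starting point:
    -q quiet, -m modify the file in place, plus Tidy's Word cleanup options.
    """
    return [
        "tidy", "-q", "-m",
        "--word-2000", "yes",
        "--bare", "yes",
        "--clean", "yes",
        path,
    ]


def run_tidy_passes(path, passes=2, runner=subprocess.run):
    """Run tidy over the same file several times (twice by default).

    `runner` is injectable so the function can be exercised without tidy
    installed. Tidy exits with 0 for success, 1 for warnings, 2 for errors.
    """
    for pass_number in range(1, passes + 1):
        result = runner(tidy_command(path))
        if result.returncode >= 2:
            raise RuntimeError(
                "tidy reported errors on pass %d for %s" % (pass_number, path)
            )
```

Because `-m` overwrites the file, it is worth running `run_tidy_passes()` on a copy of the exported HTML rather than on the original.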

Thanks for the help,

J. Stuart J.

reisio
Paranoid (IV) Inmate

From: Florida
Insane since: Mar 2005

posted 06-14-2007 22:08

Twice, ha. Never tried that; neat.

W3Daryl
Obsessive-Compulsive (I) Inmate

From:
Insane since: Jun 2007

posted 07-04-2007 20:43

I wouldn't touch a software package with a ten-foot pole for this. Coming from a search engine optimization background, I like to keep the markup around my content as minimal as possible. If I am using a WYSIWYG editor, I copy and paste into TextPad first, then build the page. If it's Dreamweaver, I paste right into the source view and wrap it in tags as I go. I find it really doesn't take that much time, with the exception of tabular data. Cheers.

Daryl, Web Developer
W3-Edge - Boston Website Design


