Mysterious Re-Formatting when Pasting in Word

Post by Jorpho » Thu Aug 07, 2014 3:35 am UTC

The office recently switched from WordPerfect X to Word 2013. (Evidently it was time for it to go.) I've never been quite clear on exactly how Word mucks about with "styles", and the problem rears its ugly head again lately. Frequently, when pasting data from an old WordPerfect document into Word, paragraphs will re-format themselves in strange ways – sometimes, left-justification will become full-justification (even though the WordPerfect documents are left-justified), or the header will change completely.

I can of course get around the problem by using Paste Special to insert the text as unformatted text, or by changing Word's options to preserve the document's existing format when pasting, but often the text I'm pasting will have italicized words spread around, and I don't want to lose that formatting.

Is there some way of telling exactly what formatting data Word is going to pull out of pasted text, and somehow repress it?

Post by KnightExemplar » Wed Aug 20, 2014 3:28 pm UTC

Jorpho wrote:The office recently switched from WordPerfect X to Word 2013. (Evidently it was time for it to go.) I've never been quite clear on exactly how Word mucks about with "styles", and the problem rears its ugly head again lately. Frequently, when pasting data from an old WordPerfect document into Word, paragraphs will re-format themselves in strange ways – sometimes, left-justification will become full-justification (even though the WordPerfect documents are left-justified), or the header will change completely.

I can of course get around the problem by using Paste Special to insert the text as unformatted text, or by changing Word's options to preserve the document's existing format when pasting, but often the text I'm pasting will have italicized words spread around, and I don't want to lose that formatting.

Is there some way of telling exactly what formatting data Word is going to pull out of pasted text, and somehow repress it?


Have you tried turning the WordPerfect document into doc or docx, and then performing the copy/paste into Word? This technique of mine tends to work with OpenOffice...
First Strike +1/+1 and Indestructible.

Post by Jorpho » Wed Aug 20, 2014 5:01 pm UTC

Well, technically the conversion to doc/docx would happen automatically when the document is opened in Word. (I'm not opening the WordPerfect documents in WordPerfect, you see.)

Post by Jorpho » Sat Aug 23, 2014 2:55 am UTC

It occurs to me that when I copy and paste data from one Word document to another, the data is actually being copied in HTML form. (That's the default in the Paste Special dialog, anyway.) That doesn't help too much, though – try saving a Word document as HTML, and you get an unspeakably horrible-looking tag salad.

But that does suggest one solution to the problem: is there a tiny app out there that will look at HTML data stored on the clipboard and strip away everything but a small subset of tags?
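A minimal sketch of what such a tag-stripping filter could look like, using only Python's standard library. It is not an existing app: the whitelist (`p`, `b`, `i`), the `strong`/`em` aliases and the function name `strip_to_whitelist` are all assumptions chosen for illustration, and reading from or writing to the clipboard is left out entirely.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: only paragraphs, bold and italics are worth keeping.
ALLOWED_TAGS = {"p", "b", "i"}
# Hypothetical aliases so that <strong>/<em> survive as <b>/<i>.
ALIASES = {"strong": "b", "em": "i"}
# The text content of these elements is dropped along with the tags.
SKIPPED = {"style", "script"}


class TagWhitelister(HTMLParser):
    """Re-emit the input HTML, keeping only whitelisted tags and the text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
            return
        tag = ALIASES.get(tag, tag)
        if tag in ALLOWED_TAGS and not self.skip_depth:
            # Attributes (class, style, ...) are deliberately discarded.
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        tag = ALIASES.get(tag, tag)
        if tag in ALLOWED_TAGS and not self.skip_depth:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Re-escape text so the output is still valid HTML.
            self.parts.append(escape(data, quote=False))


def strip_to_whitelist(html_text):
    parser = TagWhitelister()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = '<p class="MsoNormal" style="text-align:justify">Some <i>italic</i> text</p>'
    print(strip_to_whitelist(sample))
```

Because every attribute is thrown away, any inline alignment or style information attached to a paragraph tag would be lost along with the rest of the tag salad, while the italics survive.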

Post by Jorpho » Wed Dec 03, 2014 6:05 am UTC

I found myself grappling with this problem again, and this time stumbled across http://shaunakelly.com/word/styles/styl ... tting.html . The solution is simple: paragraph formatting is not copied as long as you don't copy the line break at the end of the paragraph.

This means, of course, that if I want to copy and paste more than one paragraph, I have to join them up in one document before pasting everything into the new document, and then re-insert the line break, but that's the most convenient solution I've seen so far.

I also found http://word2cleanhtml.com/ , which is almost what I'm looking for. Is there any way I can copy HTML code to the clipboard (like <b>whatever</b>) and then have it pasted as formatted text (like whatever) in Word? The only way I can see is to save the code in Notepad as a .html file, open the .html file in a browser, and copy-paste the text, which is a little bit too convoluted and might introduce more arbitrary HTML elements anyway.
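One possible route, sketched here under assumptions: on Windows, applications exchange HTML through a clipboard format registered under the name "HTML Format", which is the HTML fragment wrapped in a small text header of byte offsets. The function below only builds that payload; the name `build_cf_html` is made up for this example, and actually placing the bytes on the clipboard would need something extra (for instance the third-party pywin32 package), which is not shown.

```python
# Header of the Windows "HTML Format" clipboard payload. Every offset is
# zero-padded to 10 digits so the header has a fixed length.
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{start_html:010d}\r\n"
    "EndHTML:{end_html:010d}\r\n"
    "StartFragment:{start_frag:010d}\r\n"
    "EndFragment:{end_frag:010d}\r\n"
)
PREFIX = "<html><body>\r\n<!--StartFragment-->"
SUFFIX = "<!--EndFragment-->\r\n</body></html>"


def build_cf_html(fragment):
    """Wrap an HTML fragment in an "HTML Format" header.

    Returns UTF-8 bytes; all offsets in the header are byte offsets
    counted from the start of the returned payload.
    """
    header_len = len(
        HEADER.format(start_html=0, end_html=0, start_frag=0, end_frag=0)
    )
    prefix = PREFIX.encode("utf-8")
    body = fragment.encode("utf-8")
    suffix = SUFFIX.encode("utf-8")

    start_html = header_len
    start_frag = start_html + len(prefix)
    end_frag = start_frag + len(body)
    end_html = end_frag + len(suffix)

    header = HEADER.format(
        start_html=start_html,
        end_html=end_html,
        start_frag=start_frag,
        end_frag=end_frag,
    ).encode("utf-8")
    return header + prefix + body + suffix


if __name__ == "__main__":
    payload = build_cf_html("<b>whatever</b>")
    print(payload.decode("utf-8"))
```

Whether Word then pastes the fragment without pulling in any other formatting is untested here; the sketch only shows that the payload itself is simple to generate from cleaned-up HTML.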

Post by WanderingLinguist » Fri Dec 12, 2014 9:04 am UTC

I haven't used MS Word since before they introduced the ribbon, but this might still work. The search and replace function used to have a formatting search option. So I wonder if you use a search and replace-all based on the formatting you want to preserve (italics and bold) to insert some text markers wherever that format exists. Then copy & paste without formatting and use regular search and replace to reapply the formatting you want to preserve. I don't have a copy of MS Word installed to look at, and my memory is from a really long time ago, so no guarantees.

Another option might be to save it as HTML and use a text editor with good RE functionality to strip out all the tags except <P>, <B></B> and <I></I>. If I had a copy of MS Word installed, I'd try it out and see what works, but unfortunately (fortunately?) I don't use it any more and haven't for a long time.

Post by Jorpho » Sat Dec 13, 2014 3:18 pm UTC

WanderingLinguist wrote: "So I wonder if you use a search and replace-all based on the formatting you want to preserve (italics and bold) to insert some text markers wherever that format exists. Then copy & paste without formatting and use regular search and replace to reapply the formatting you want to preserve."

I'm pretty sure it doesn't work that way.

WanderingLinguist wrote: "Another option might be to save it as HTML and use a text editor with good RE functionality to strip out all the tags except <P>, <B></B> and <I></I>."

That is precisely the purpose of http://word2cleanhtml.com/ mentioned in my previous post. The problem that remains is what to do with the stripped HTML afterwards.

Post by Jorpho » Sat Dec 17, 2016 4:07 am UTC

Oh my gourd.

Today I learned that you can take absolutely plain unformatted text, paste it into a left-justified paragraph, and – if the pasted text contains a line break – the paragraph will switch to full justification, for no reason at all. The "new" paragraph following the line break will stay left-justified.

What is this nonsense!? Why should pasting unformatted text result in a change of formatting!? Is this a bug? Is there any rationality behind this whatsoever?

Post by ucim » Sat Dec 17, 2016 4:31 am UTC

Jorpho wrote:Oh my gourd.

Today I learned that you can take absolutely plain unformatted text, paste it into a left-justified paragraph, and – if the pasted text contains a line break – the paragraph will switch to full justification, for no reason at all. The "new" paragraph following the line break will stay left-justified.

What is this nonsense!? Why should pasting unformatted text result in a change of formatting!? Is this a bug? Is there any rationality behind this whatsoever?


I learned somewhere that the formatting instructions for a paragraph are contained inside the paragraph mark. The line break would therefore be a new paragraph mark that contains "no" formatting, for the text that is now the previous paragraph, but the text following that line break would belong to the paragraph that had the original paragraph mark that had the original formatting.

I have not verified this technically, but it explains this kind of behavior (which had puzzled me too - I wish this were documented in an easy-to-discover place)
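A toy model of this explanation, in Python. It is not how Word is implemented; the `Paragraph` class, the `paste` function and the "justified" value for the pasted paragraph mark are all assumptions made up to illustrate the idea that formatting travels with the paragraph mark at the end of a paragraph.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    text: str
    mark_format: str  # formatting stored "inside" the closing paragraph mark


def paste(paragraphs, index, offset, pasted, pasted_mark_format):
    """Paste `pasted` into paragraphs[index] at character `offset`.

    Every "\n" in the pasted text is treated as a paragraph mark that
    carries `pasted_mark_format` (a hypothetical default).
    """
    target = paragraphs[index]
    combined = target.text[:offset] + pasted + target.text[offset:]
    pieces = combined.split("\n")

    # Every piece except the last now ends at a pasted paragraph mark.
    result = [Paragraph(piece, pasted_mark_format) for piece in pieces[:-1]]
    # The last piece still ends at the original mark, so it keeps its format.
    result.append(Paragraph(pieces[-1], target.mark_format))
    return paragraphs[:index] + result + paragraphs[index + 1:]


if __name__ == "__main__":
    doc = [Paragraph("Hello world", "left")]
    for para in paste(doc, 0, 6, "big\nwide ", "justified"):
        print(para)
```

In this model, pasting text that contains no line break never touches a paragraph mark, so the target paragraph keeps its formatting, which would also be consistent with the trick of not copying the final line break.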

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

Post by Soupspoon » Sat Dec 17, 2016 7:13 am UTC

I don't have Word available to check the specifics, but could it also have something to do with the thematic autoformatting changes? "Normal" will switch to bulleted if you do something like "- Foo" at the start of a line or a numbered list if "1) Foo", double-enter may set up for the next line being a Header of some kind, etc, according to some kind of configurable but initially developer-decided 'helpful' scheme. If the pasting function passes through the plaintext as if typed, it might easily succumb to such strangeness.

Although I'm not entirely sure this would happen. I'm often unable to deactivate enough of the preconfigured autocorrection behaviour to get a single lowercase "i" into a spreadsheet cell when typed, rather than an uppercase "I". So I either use "=CHAR(105)", with the option of using Copy then Paste Special to make it literal, or type the "i" in a spare Notepad window (or even the Run dialogue) and copypasta it in, unaltered. Pasted text escapes the autocorrection there, so pasting should behave differently from typing in your case too.

Post by Jorpho » Sun Dec 18, 2016 3:08 am UTC

ucim wrote:I learned somewhere that the formatting instructions for a paragraph are contained inside the paragraph mark. The line break would therefore be a new paragraph mark that contains "no" formatting, for the text that is now the previous paragraph, but the text following that line break would belong to the paragraph that had the original paragraph mark that had the original formatting.

I have not verified this technically, but it explains this kind of behavior (which had puzzled me too - I wish this were documented in an easy-to-discover place)
Okay, that kind of makes sense. Thank you for that.

So when a paragraph mark with "no" formatting is added, where is the "new" formatting coming from? Something in the document default style, that somehow does not match the paragraph default style, I suppose? Is there not some way to bring these styles back into coherence?

"Normal" will switch to bulleted if you do something like "- Foo" at the start of a line or a numbered list
In the more recent versions of Word, a little lightning bolt will appear to signify that "autoformatting" has been applied when that happens, and it can easily be undone with Ctrl-Z.

Post by ucim » Sun Dec 18, 2016 3:39 am UTC

Jorpho wrote:So when a paragraph mark with "no" formatting is added, where is the "new" formatting coming from? Something in the document default style, that somehow does not match the paragraph default style, I suppose? Is there not some way to bring these styles back into coherence?
I would assume it is the default style for all documents, overridden by the default style for this document, overridden by any larger-scope styles that had been defined. However, that's just speculation on my part.

(...and btw, your unattributed (following) quote is from Soupspoon, not me).

Jose

