Strange Characters when Pasting Text from the Web

Written by Allen Wyatt (last updated January 21, 2023)
This tip applies to Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021



Keith notes that when he copies text from the web and pastes it into a document, many strange characters appear, such as degree marks. He wonders if there is a way to clear these characters and return the text to an "as copied" form.

Information presented on the web uses what is called a markup language, typically HTML, to control how that information appears. When you copy information from the web, you are copying the markup language as well, and Word then needs to figure out how best to paste that information. You can see this by pressing Ctrl+C to copy the web information and then, in Word, displaying the Paste Special dialog box (press Ctrl+Alt+V). (See Figure 1.)

Figure 1. The Paste Special dialog box.

Notice that the selected option in the dialog box is to paste the information using HTML format. Choose this, and you end up with Word's approximation of what the copied information looked like on the original website. (See Figure 2.)

Figure 2. Web information pasted in HTML format.

When you paste in this format, however, notice that there are what Keith called "degree marks" between some words. These aren't degree symbols; they don't print that way. They are visible only if you have Show/Hide turned on (the control for this is on the Home tab of the ribbon), and they represent non-breaking spaces. They appear where they do because the original HTML code included non-breaking spaces in those places, so Word carried them over into your document.

The number of non-breaking spaces that appear is not a function of anything you do in Word; it depends entirely on how many non-breaking spaces were used within the HTML code on the original website. That, in turn, depends on how the HTML was generated or on whoever designed the original webpage.

Interestingly enough, you can get rid of the non-breaking spaces by using Find and Replace after pasting into your document. Search for a single space and replace it with a single space, and you'll get rid of them all. This works because an ordinary (non-wildcard) search for a space also matches non-breaking spaces, so each one is swapped for a regular space. Plus, if there are any actual degree marks in what you pasted (as opposed to non-breaking spaces), those degree marks will remain intact.
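
If you do this cleanup often, the same replacement can be written as a macro. What follows is a minimal sketch rather than a definitive implementation: the macro name ReplaceNonBreakingSpaces is made up for illustration, and the code assumes you want to process the entire active document. Instead of searching for a plain space, it searches for ^s, which is Word's Find code for a non-breaking space, so only those characters are touched.

```vba
Sub ReplaceNonBreakingSpaces()
    ' Hypothetical macro name; assumes the whole active document
    ' should be processed, not just the current selection.
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "^s"                ' ^s = non-breaking space
        .Replacement.Text = " "     ' regular space
        .Forward = True
        .Wrap = wdFindStop          ' stop at the end of the range
        .Format = False
        .MatchWildcards = False     ' ^s is a non-wildcard search code
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

To limit the change to part of the document, one option would be to work with Selection.Find instead of ActiveDocument.Content.Find after selecting the pasted text.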

Of course, the "degree marks" aren't the only special characters that may end up in your pasted information. HTML can contain lots of other characters that Word understands, such as newline characters, non-breaking hyphens, and a number of others. The presence of these other characters will, again, depend on the content of the original website. You can use Find and Replace to get rid of the special characters, but you'll need to figure out what the characters are so that you know what to search for and what you want them replaced with.
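
One way to figure out what an unfamiliar character is would be to select it and ask Word for its character code. The macro below is only a sketch under stated assumptions: the name ShowCharacterCodes and the 25-character cap are invented for illustration, and it expects you to select the suspicious characters before running it.

```vba
Sub ShowCharacterCodes()
    ' Hypothetical helper: lists the code of each selected character.
    Const MAX_CHARS As Long = 25    ' arbitrary cap to keep the message short
    Dim sample As String
    Dim report As String
    Dim code As Long
    Dim i As Long

    sample = Left$(Selection.Text, MAX_CHARS)
    For i = 1 To Len(sample)
        ' AscW can return a negative Integer for high code points;
        ' masking with &HFFFF& maps the result into the range 0-65535.
        code = AscW(Mid$(sample, i, 1)) And &HFFFF&
        report = report & code & " "
    Next i

    MsgBox "Character codes: " & Trim$(report)
End Sub
```

A non-breaking space should be reported as 160 and a manual line break as 11. Once you know which character you are dealing with, you can search for it in Find and Replace using codes such as ^s (non-breaking space), ^~ (non-breaking hyphen), or ^l (manual line break).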

Because of how special characters can end up in your document, many people will not paste HTML content into their documents. Instead, they choose to paste "Text Only" or even to paste into Notepad and then copy from Notepad into Word. This means, of course, that the pasted information will need to be formatted by you, as a separate step after pasting.

The bottom line is that when you paste information copied from a webpage into your document, you are going to need to spend some time with that information to get it into the condition you want. If you paste as HTML, you'll need to do something to get rid of the special characters. If you paste as text, you'll need to do something to get the formatting the way you want. There is no way around this need for post-pasting work.

It is important to note that all I've been talking about so far is pasting textual information copied from the web. It is entirely possible that what you originally select and copy from the webpage will have other elements, such as tables or images. The way you choose to paste this type of content into your Word document will greatly affect what you see. Because of this, you may want to experiment with the different pasting options available to you; you may be surprised at how much the results differ.

WordTips is your source for cost-effective Microsoft Word training. (Microsoft Word is the most popular word processing software in the world.) This tip (13004) applies to Microsoft Word 2007, 2010, 2013, 2016, 2019, Word in Microsoft 365, and 2021.



Comments


2023-01-23 11:44:44

Kathy R.

I often copy things from websites and emails to Word and always have to deal with the markup language. I didn't know about the unformatted text option, which will be a handy tool in my arsenal. However, that also removes all the formatting, which I usually want to keep. My solution was to make a macro that does three things that always need to be changed:

1) change non-breaking spaces to regular spaces (something I learned in a previous Word Tips, thank you!)
2) change all line breaks (^l) to paragraph breaks (^p)
3) remove all white space before paragraph breaks (^w^p); this removes spaces and tab marks at the end of each paragraph

Once the macro was created, I added it to my Quick Access toolbar. Works like a charm and saves me a ton of time!


2023-01-22 07:20:51

Brian Crane

Hi Ken
I use a little macro - I don't remember whence it came, but I find it very useful when copying and pasting from the web. Place it on your Quick Access Toolbar with an icon of your choice - instructions on how to do this can be found on this website and/or elsewhere on the web.
I use it many times during the day.

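Sub CleanPaste()
    ' Paste the Clipboard contents as unformatted text at the insertion
    ' point, discarding any formatting carried over from the web page.
    Selection.PasteSpecial Link:=False, _
        DataType:=wdPasteText, _
        Placement:=wdInLine, _
        DisplayAsIcon:=False
End Sub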


2023-01-21 10:46:51

Ken Lord

I find two other things that puzzle me when I do this. First, the text often comes off with a color in the background. I've not been able to figure out how to get rid of it. A clear highlight doesn't work. Second, when I copy into a document and send that document off to be printed, the copied text appears gray, unlike the density of the text I entered in Word. Anybody else encountered this and is there a fix? Ken Lord. theavonman@juno.com.

