There are several non printable characters that can give you problems if you want to copy data from Excel files to another system.
In my case I wanted to copy a lot of data from Google Spreadsheet into sharepoint and it was not without problems.
Google Spreadsheet was chosen because my company had to have a quick place to share some tabular data with departments around the globe. This was before SharePoint was really implemented.
The problem was that there was a lot of characters that were messed up, and offcourse people filled in all sorts of data in the wrong cells. Inserting wrong data in cells can’t really be magically fixed using a button in Excel, but nonprinatable characters can.
What I needed was to remove the nonprintable characters, like tab, cariage return, delete etc.
What we need to remove is characters with decimal value 1-31, 127, 129, 141, 143, 144, 157 and I would change non breaking spaces into normal spaces. Then it would at least be acceptable input to sharepoint, and then all I would need to do was edit the columns in the excel file to actually adhere to my stict rules in sharepoint! ![]()
You can see what characters should be stripped below:
| Decimal | Character |
|---|---|
| 0 | null |
| 1 | start of heading |
| 2 | start of text |
| 3 | end of text |
| 4 | end of transmission |
| 5 | inquiry |
| 6 | acknowledge |
| 7 | bell |
| 8 | backspace |
| 9 | horizontal tab |
| 10 | line feed/new line |
| 11 | vertical tab |
| 12 | form feed/new page |
| 13 | carriage return |
| 14 | shift out |
| 15 | shift in |
| 16 | data link escape |
| 17 | device control 1 |
| 18 | device control 2 |
| 19 | device control 3 |
| 20 | device control 4 |
| 21 | negative acknowledge |
| 22 | synchronous idle |
| 23 | end of transmission block |
| 24 | cancel |
| 25 | end of medium |
| 26 | substitute |
| 27 | escape |
| 28 | file separator |
| 29 | group separator |
| 30 | record separator |
| 31 | unit separator |
| Decimal | Character |
|---|---|
| 127 | DEL |
| 129 | ü |
| 141 | ì |
| 143 | Å |
| 144 | É |
| 157 | ¥ |
| Decimal | Character |
|---|---|
| 160 | non breaking space |
There is a function in excel called trim() that removes extra whitespace (char 32). Thats all it’s designed to do. However there is once more type of space that comes from unicode and that is nonbreaking space ( in html). This often sneaks in when it’s text fom the net.
There is a function called Clean() that removed the first 32 characters from the 7 bit ASCII code. Characters from decimal value 0 to 31.
There is no excel function to take care of the rest directly.
Basically what I do is:
1: First replace 160 with a normal space.
2: Replace() 127, 129, 141, 143, 144, 157 with an empty string.
3: Trim() the cell.
4: Finally I Clean() the cell.
I have a little VBA that does this for the selected range. You can make a button that runs this code to easily do it.
I have a guide to creating a button in the Quick Access Toolbar here.
The reason I don’t do it on the whole workbook or sheets is that I had a column or 2 that I could not clean without messing up the data (badly formatted dates, times and all sorts of rubbish all mixed up)
01 Public Sub CleanSelection()02 03 Dim Cell As Range04 05 For Each Cell In Selection06 Cell = Replace(Cell, Chr(160), " ")07 Cell = Replace(Cell, Chr(127), "")08 Cell = Replace(Cell, Chr(129), "")09 Cell = Replace(Cell, Chr(141), "")10 Cell = Replace(Cell, Chr(143), "")11 Cell = Replace(Cell, Chr(144), "")12 Cell = Replace(Cell, Chr(157), "")13 Cell = Application.WorksheetFunction.Trim(Cell)14 Cell = Application.WorksheetFunction.Clean(Cell)15 16 Next Cell
End Sub
Related posts:
NICE!!
We have suffered from clean()’s inability to clean for a long time…. having to do a lot of hands on editing. Thanks!!
Also since i find (to my dissapointment) that there is no matching inserting functions in the worksheet in terms of speed I adapted your metod using functions.. enjoy..
=CLEAN(TRIM(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(R[X]C[X],CHAR(160),”" “”),CHAR(127),”"”"),CHAR(129),”"”"),CHAR(141),”"”"),CHAR(143),”"”"),CHAR(144),”"”"),CHAR(157),”"”")))”
(replace the X’es with the reference to the cell to be CLEANED!!)
Cheers!!
Hi Orvar
Yeah I don’t quite see why on earth it should be so troublesome to clean some cells.
It can take some time too, if you mess up. I used CTRL + A and started my macro yesterday. My oh my. I am sure I heard naughty words come out of our Citrix server.
Lesson for me was: Always select the range by hand. Cleaning all cells, dirty or not, takes a looong time. It was shut down after several hours.
Public Sub CleanSelection()
Dim cell As Range
Dim s As String
For Each cell In Intersect(Selection.Cells, ActiveSheet.UsedRange)
If Not cell.HasFormula Then
s = Replace(cell.Value, Chr(160), ” “)
s = Replace(s, Chr(127), “”)
s = Replace(s, Chr(129), “”)
s = Replace(s, Chr(141), “”)
s = Replace(s, Chr(143), “”)
s = Replace(s, Chr(144), “”)
s = Replace(s, Chr(157), “”)
s = Application.WorksheetFunction.Trim(s)
cell.Value = Application.WorksheetFunction.Clean(s)
End If
Next cell
End Sub
this code do not work for me!