Use this free tool to find and remove hidden Unicode characters in your text. LLMs like ChatGPT can inject characters such as the em dash or zero-width space into their output. The tool helps you identify them and remove the ones you don’t want.
Our Free Tool Identifies and Removes Invisible Unicode Text Characters:
This tool lets you easily visualize all hidden Unicode characters in your text and then remove them either strategically (by specific code) or all at once.
How to use the tool:
Paste in your text
See all hidden Unicode characters in your text and information about what they do
Decide what you want to clean up:
  Fix all
  Remove all invisible characters
  Strategically fix specific ones (for example, remove every em dash and en dash)
Copy your cleaned text
Note that if the text was created by an LLM (ChatGPT, Claude, Gemini, etc.), it will still be identified as AI-generated by Originality.ai’s AI Detector.
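The find-then-remove workflow above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the helper names `find_hidden` and `strip_hidden` are hypothetical, and the choice to flag Unicode categories `Cf` (format characters) and `Zs` (space separators) is an assumption.

```python
import unicodedata

# Assumption: treat format characters (Cf) and non-ASCII spaces (Zs) as "hidden".
SUSPECT_CATEGORIES = {"Cf", "Zs"}

def find_hidden(text):
    """Return (index, code point, name) for each suspicious character."""
    hits = []
    for i, ch in enumerate(text):
        if ch == " ":
            continue  # a regular space is expected, so skip it
        if unicodedata.category(ch) in SUSPECT_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")))
    return hits

def strip_hidden(text):
    """Drop format characters and turn exotic spaces into a plain space."""
    out = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # e.g. zero-width space, joiners, direction marks
        out.append(" " if category == "Zs" else ch)
    return "".join(out)

sample = "clean\u200btext\u00a0here"
print(find_hidden(sample))
print(strip_hidden(sample))
```

Note that this sketch simply deletes the zero-width space, whereas the table further down lists a space as its replacement. Either choice can be reasonable depending on the text.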
Key Takeaways:
One-Click Cleanup - Our free Invisible Text Detector and Remover makes it easy to identify and remove hidden Unicode text characters
Privacy Assured - All processing happens locally in your browser; your text never leaves your device.
Not a Secret Watermark - LLMs like ChatGPT do inject hidden characters, but it is not for watermarking or nefarious reasons
Overuse of Certain Characters - ChatGPT uses several “hidden” Unicode characters very frequently (such as the em dash)
Real Risks Exist - Invisible text characters can cause both security and formatting challenges
Does Not Bypass Detectors - AI Detection remains similarly effective whether the hidden characters are removed or not
All LLMs Do It - All popular LLMs inject hidden characters
What are Unicode Hidden Text Characters?
Hidden Unicode characters, whether subtly visible (em dashes, smart quotes, no-break spaces) or completely invisible (zero-width spaces, joiners, direction marks), are special code points that don’t behave like plain ASCII. They act like a “digital ink” that reshapes how software wraps lines, splits words, parses data, or matches text, even when you don’t realize they’re there.
Does ChatGPT Watermark Text with Hidden Characters?
No - It has been incorrectly reported that ChatGPT’s hidden characters are OpenAI’s attempt at “watermarking” its outputs. However, there are 2 reasons that this is very unlikely:
It is incredibly easy to circumvent a watermarking strategy by just removing the characters. The ease with which users can do this would render a watermarking strategy pointless.
OpenAI has stated it is just “a quirk of large-scale reinforcement learning.”
Does ChatGPT Inject Hidden Characters?
Yes - LLMs like ChatGPT do inject hidden characters. Many of these characters are harmless formatting characters (like the popular em-dash), while some can cause formatting issues (zero-width space).
Some of the most common Unicode characters that ChatGPT uses include:
Em Dash
The long dash character that ChatGPT frequently inserts, especially in recent model versions (o3, 4o, 4.1), to break up sentences or add dramatic pauses.
Unicode: U+2014
Description: A long horizontal dash wider than a hyphen. Useful in typography for parenthetical breaks, but in code, CSV, or plain-text data it can act like a non-standard character, causing copy/paste or parsing issues.
Example: word—word ← visually similar to word-word, yet represents a distinct Unicode code point.
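To make the parsing issue concrete, here is a small Python sketch. The `parse_range` and `normalize_dashes` helpers are hypothetical, written only for this illustration. A parser that splits on the ASCII hyphen fails when the separator is actually U+2014, and works again once dashes are normalized.

```python
def parse_range(text):
    """Split 'start-end' on an ASCII hyphen and return both ends as ints."""
    start, end = text.split("-")
    return int(start), int(end)

def normalize_dashes(text):
    """Replace em dash (U+2014) and en dash (U+2013) with an ASCII hyphen."""
    return text.replace("\u2014", "-").replace("\u2013", "-")

pasted = "10\u201420"  # looks like 10-20, but the separator is an em dash

try:
    parse_range(pasted)
    failed = False
except ValueError:
    failed = True  # split("-") found no hyphen, so unpacking fails

print(failed)
print(parse_range(normalize_dashes(pasted)))
```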
Smart Quotes
The curly “typographer’s” quotation marks that ChatGPT sometimes substitutes for straight quotes.
Unicode:
Left double quote (U+201C)
Right double quote (U+201D)
Left single quote (U+2018)
Right single quote (U+2019)
Description: Curved punctuation intended for print typography. They often appear where a plain straight quote (", ') would be safer, and they can break code snippets, CSV files, or Markdown.
Example: “hello” or ‘world’
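As a rough illustration of how smart quotes break structured data, the Python sketch below feeds a JSON snippet written with curly quotes to the standard `json` parser, then retries after mapping the four code points listed above to straight quotes. The `straighten_quotes` helper is a hypothetical name used only for this example.

```python
import json

# Map the four smart-quote code points to their straight equivalents.
SMART_TO_STRAIGHT = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})

def straighten_quotes(text):
    return text.translate(SMART_TO_STRAIGHT)

raw = "{\u201cname\u201d: \u201cAda\u201d}"  # JSON pasted with curly quotes

try:
    json.loads(raw)
    parsed_raw = True
except json.JSONDecodeError:
    parsed_raw = False  # JSON only accepts the straight double quote

fixed = json.loads(straighten_quotes(raw))
print(parsed_raw, fixed)
```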
Zero-Width Space
A completely invisible spacing character that may slip into text when copying from ChatGPT or other editors.
Unicode: U+200B
Description: Adds no visible gap but still separates characters. Can break string matching, URLs, and word counts; causes “why won’t it paste correctly?” problems.
Example: wordword ← looks like one word, but a zero-width space sits between the two halves.
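The effect on string matching is easy to reproduce. In this short Python sketch (the strings are made up for illustration), two values that look identical on screen differ in length and fail an equality check until the zero-width space is removed.

```python
plain = "password"
sneaky = "pass\u200bword"  # same letters, plus an invisible U+200B in the middle

same_before = plain == sneaky
lengths = (len(plain), len(sneaky))
found = plain in sneaky

cleaned = sneaky.replace("\u200b", "")
same_after = plain == cleaned

print(same_before, lengths, found, same_after)
```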
Why Do LLMs Use These Hidden Characters?
There are 3 reasons that likely contribute to the increased usage of these characters by LLMs:
Training Data Bias: Since LLMs train heavily on professionally edited texts where em dashes and smart quotes are standard, they learn to use them more frequently than people do in everyday writing.
Mimics Formal Tone: LLMs are biased toward sounding formal and authoritative, and these characters help them achieve that tone.
LLMs Don’t Use a Keyboard: People rarely type these characters because they are not on any standard keyboard layout (ANSI, ISO, etc.). LLMs don’t “write” using a keyboard, so producing one of these characters takes them no more effort than producing any other character.
Common Uses and Concerns with Invisible Characters
Why do people use them?
Tidy text layout - A zero-width space (U+200B) or soft hyphen (U+00AD) lets writers nudge where a line breaks, so long words don’t dangle awkwardly at the edge of a column.
“Spaces” in usernames - Some sites forbid real spaces. Sneaking in an invisible Hangul Filler (U+3164) keeps John Doe readable while still passing the “no-space” rule.
Subtle watermarks - Publishers can hide a unique pattern of zero-width marks inside an article. If the text leaks, those invisible dots act like digital fingerprints.
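The watermarking idea in the last item can be sketched as follows. This is a toy illustration under stated assumptions, not a description of any publisher's real scheme: the function names are hypothetical, the ID is encoded as 8 bits, and U+200B and U+200C stand in for 0 and 1.

```python
ZERO, ONE = "\u200b", "\u200c"  # assumption: ZWSP encodes 0, ZWNJ encodes 1

def embed_id(text, reader_id, bits=8):
    """Hide reader_id as a run of zero-width characters after the first word."""
    pattern = "".join(
        ONE if (reader_id >> i) & 1 else ZERO for i in reversed(range(bits))
    )
    head, sep, tail = text.partition(" ")
    return head + pattern + sep + tail

def extract_id(text):
    """Recover the hidden ID, or None if no marker characters are present."""
    bits = "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
    return int(bits, 2) if bits else None

original = "Quarterly results attached"
marked = embed_id(original, 37)

print(marked == original)   # the strings differ even though they look alike
print(extract_id(marked))
```

Stripping the two marker characters restores the original text, which is also why this kind of fingerprint is easy to remove.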
Why do people worry about them?
Looks the same, acts differently - Two snippets that appear identical may hash or sort differently once hidden characters are factored in, breaking exports, searches, or audit trails.
Invisible hiding places for bad code - Attackers can bury malware or secret instructions among zero-width characters; the file compiles or runs, but a human code reviewer sees nothing unusual.
Silent prompt tricks on AI - Hidden Unicode can smuggle extra instructions into a chatbot prompt, making the model reveal data or generate harmful content without the user noticing.
Your Text Looks AI-generated - If text leans on formatting that AI uses heavily (em dashes, etc.), readers may assume it was AI-generated, which can cause reputational harm.
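Two of the concerns above lend themselves to a quick check in Python. The sketch below shows that a single zero-width space changes a SHA-256 digest, and flags text containing bidirectional control characters of the kind a reviewer cannot see. The `digest` and `has_bidi_controls` helpers and the sample strings are assumptions made for this illustration.

```python
import hashlib

# Direction marks, embeddings/overrides (U+202A-U+202E), isolates (U+2066-U+2069).
BIDI_CONTROLS = {
    chr(cp)
    for cp in (0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A))
}

def digest(text):
    """SHA-256 of the UTF-8 bytes, as used for checksums or audit trails."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def has_bidi_controls(text):
    """True if the text contains any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in text)

a = "invoice-2024"
b = "invoice\u200b-2024"  # renders the same as `a` in most viewers

print(digest(a) == digest(b))
print(has_bidi_controls("role = 'user\u202e'"))
```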
Table of All Common “Invisible” Unicode Text Characters
Below is a list of the most common invisible text characters and how our free tool handles them.
# | Unicode | Visibility | Description | Example | Replaced With | ANSI US Keyboard | HTML | Windows Typing
--- | --- | --- | --- | --- | --- | --- | --- | ---
1 | U+0020 | Visible | Regular Space | word word | keep | Spacebar | &#x20; | Alt+32
2 | U+00A0 | Visible | No-Break Space | word word | → space | | &#xA0; | Alt+0160
3 | U+0009 | Visible | Tab | word → word | → 4 spaces | Tab | &#x9; | Alt+9
4 | U+000A | Visible | Line Feed (LF) | line1\nline2 | keep in double | Enter | &#xA; | Alt+10
5 | U+000C | Visible | Form Feed | | | | &#xC; | Alt+12
6 | U+001C | Visible | File Separator | | remove | | &#x1C; | Alt+28
7 | U+000D | Visible | Carriage Return (CR) | | remove | Enter | &#xD; |
8 | U+2000 | Visible | En Quad | | → space | | &#x2000; | Alt+2000
9 | U+2001 | Visible | Em Quad | | → space | | &#x2001; | Alt+2001
10 | U+2002 | Visible | En Space | | → space | | &#x2002; | Alt+2002
11 | U+2003 | Visible | Em Space | | → space | | &#x2003; | Alt+2003
12 | U+2004 | Visible | Three-Per-Em Space | | → space | | &#x2004; | Alt+2004
13 | U+2005 | Visible | Four-Per-Em Space | | → space | | &#x2005; | Alt+2005
14 | U+2006 | Visible | Six-Per-Em Space | | → space | | &#x2006; | Alt+2006
15 | U+2007 | Visible | Figure Space | | → space | | &#x2007; | Alt+2007
16 | U+2008 | Visible | Punctuation Space | | → space | | &#x2008; | Alt+2008
17 | U+2009 | Visible | Thin Space | | → space | | &#x2009; | Alt+2009
18 | U+200A | Visible | Hair Space | | → space | | &#x200A; | Alt+200A
19 | U+202F | Visible | Narrow NBSP | | → space | | &#x202F; | Alt+202F
20 | U+205F | Visible | Math Space | | → space | | &#x205F; | Alt+205F
21 | U+3000 | Visible | Ideographic Space | | → space | | &#x3000; | Alt+3000
22 | U+1680 | Visible | Ogham Space Mark | | → space | | &#x1680; |
23 | U+200B | Invisible | Zero-Width Space | | → space | | &#x200B; | Alt+200B
24 | U+200C | Invisible | Zero-Width Non-Joiner | | remove | | &#x200C; | Alt+200C
25 | U+200D | Invisible | Zero-Width Joiner | | remove | | &#x200D; | Alt+200D
26 | U+200E | Invisible | Left-To-Right Mark | | remove | | &#x200E; | Alt+200E
27 | U+200F | Invisible | Right-To-Left Mark | | remove | | &#x200F; | Alt+200F
28 | U+202A | Invisible | LTR Embedding | | remove | | &#x202A; | Alt+202A
29 | U+202B | Invisible | RTL Embedding | | remove | | &#x202B; | Alt+202B
30 | U+202C | Invisible | Pop Directional Fmt | | remove | | &#x202C; | Alt+202C
31 | U+202D | Invisible | LTR Override | | remove | | &#x202D; | Alt+202D
32 | U+202E | Invisible | RTL Override | | remove | | &#x202E; | Alt+202E
33 | U+2060 | Invisible | Word Joiner | | remove | | &#x2060; | Alt+2060
34 | U+2061 | Invisible | Function Application | | remove | | &#x2061; | Alt+2061
35 | U+2062 | Invisible | Invisible Times | | → "x" | | &#x2062; | Alt+2062
36 | U+2063 | Invisible | Invisible Separator | | → "," | | &#x2063; | Alt+2063
37 | U+2064 | Invisible | Invisible Plus | | → "+" | | &#x2064; | Alt+2064
38 | U+2066 | Invisible | LTR Isolate | | remove | | &#x2066; | Alt+2066
39 | U+2067 | Invisible | RTL Isolate | | remove | | &#x2067; | Alt+2067
40 | U+2068 | Invisible | First Strong Isolate | | remove | | &#x2068; | Alt+2068
41 | U+2069 | Invisible | Pop Directional Isolate | | remove | | &#x2069; | Alt+2069
42 | U+206A | Invisible | Inhibit Symmetric Swap | | remove | | &#x206A; | Alt+206A
43 | U+206B | Invisible | Activate Symmetric Swap | | remove | | &#x206B; | Alt+206B
44 | U+206C | Invisible | Inhibit Arabic Form Shaping | | remove | | &#x206C; | Alt+206C
45 | U+206D | Invisible | Activate Arabic Form Shaping | | remove | | &#x206D; | Alt+206D
46 | U+206E | Invisible | National Digit Shapes | | remove | | &#x206E; | Alt+206E
47 | U+206F | Invisible | Nominal Digit Shapes | | remove | | &#x206F; | Alt+206F
48 | U+2028 | Invisible | Line Separator | | → \n | | &#x2028; | Alt+2028
49 | U+2029 | Invisible | Paragraph Separator | | → \n\n | | &#x2029; | Alt+2029
50 | U+2014 | Visible | Em Dash | word—word | keep | | &#x2014; |
51 | U+2013 | Visible | En Dash | pages 1–10 | keep | | &#x2013; |
52 | U+2019 | Visible | Right Single Quote | don’t | → ' | | &#x2019; |
53 | U+201C | Visible | Left Double Quote | “hello | → " | | &#x201C; |
54 | U+201D | Visible | Right Double Quote | hello” | → " | | &#x201D; |
55 | U+2018 | Visible | Left Single Quote | ‘hello | → ' | | &#x2018; |
56 | U+2026 | Visible | Horizontal Ellipsis | wait… | → ... | | &#x2026; |
57 | U+00AD | Invisible | Soft Hyphen | | remove | | &#xAD; |
58 | U+034F | Invisible | Grapheme Joiner | | remove | | &#x34F; |
59 | U+2800 | Visible | Braille Blank | | → space | | &#x2800; | Alt+2800
60 | U+3164 | Visible | Hangul Filler | | → space | | &#x3164; | Alt+3164
61 | U+115F | Visible | Hangul Choseong Filler | | remove | | &#x115F; | Alt+115F
62 | U+1160 | Visible | Hangul Jungseong Filler | | remove | | &#x1160; | Alt+1160
63 | U+17B4 | Visible | Khmer Vowel Inherent AQ | | remove | | &#x17B4; | Alt+17B4
64 | U+17B5 | Visible | Khmer Vowel Inherent AA | | remove | | &#x17B5; | Alt+17B5
65 | U+180B | Invisible | Mongolian VS-1 | | remove | | &#x180B; | Alt+180B
66 | U+180C | Invisible | Mongolian VS-2 | | remove | | &#x180C; | Alt+180C
67 | U+180D | Invisible | Mongolian VS-3 | | remove | | &#x180D; | Alt+180D
68 | U+180E | Visible | Mongolian Vowel Sep. | | → space | | &#x180E; | Alt+180E
69 | U+FE00 | Invisible | Variation Selector-1 | | remove | | &#xFE00; | Alt+FE00
70 | U+FE01 | Invisible | Variation Selector-2 | | remove | | &#xFE01; | Alt+FE01
71 | U+FE02 | Invisible | Variation Selector-3 | | remove | | &#xFE02; | Alt+FE02
72 | U+FE03 | Invisible | Variation Selector-4 | | remove | | &#xFE03; | Alt+FE03
73 | U+FE04 | Invisible | Variation Selector-5 | | remove | | &#xFE04; | Alt+FE04
74 | U+FE05 | Invisible | Variation Selector-6 | | remove | | &#xFE05; | Alt+FE05
75 | U+FE06 | Invisible | Variation Selector-7 | | remove | | &#xFE06; | Alt+FE06
76 | U+FE07 | Invisible | Variation Selector-8 | | remove | | &#xFE07; | Alt+FE07
77 | U+FE08 | Invisible | Variation Selector-9 | | remove | | &#xFE08; | Alt+FE08
78 | U+FE09 | Invisible | Variation Selector-10 | | remove | | &#xFE09; | Alt+FE09
79 | U+FE0A | Invisible | Variation Selector-11 | | remove | | &#xFE0A; | Alt+FE0A
80 | U+FE0B | Invisible | Variation Selector-12 | | remove | | &#xFE0B; | Alt+FE0B
81 | U+FE0C | Invisible | Variation Selector-13 | | remove | | &#xFE0C; | Alt+FE0C
82 | U+FE0D | Invisible | Variation Selector-14 | | remove | | &#xFE0D; | Alt+FE0D
83 | U+FE0E | Invisible | Variation Selector-15 | | remove | | &#xFE0E; | Alt+FE0E
84 | U+FE0F | Invisible | Variation Selector-16 | | remove | | &#xFE0F; | Alt+FE0F
85 | U+FEFF | Invisible | Zero-Width NBSP / BOM | | remove | | &#xFEFF; | Alt+FEFF
86 | U+FFA0 | Visible | Half-width Hangul Filler | | → space | | &#xFFA0; | Alt+FFA0
87 | U+FFFC | Visible | Object Replacement | | → "[OBJECT]" | | &#xFFFC; | Alt+FFFC
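For readers who want to apply similar rules in their own scripts, the “Replaced With” column can be approximated with a Python translation table. This is a partial sketch based only on the table above, not the tool's actual code. It covers a subset of rows, and the `clean` function name is hypothetical.

```python
# Partial mapping derived from the "Replaced With" column (assumed subset).
REPLACEMENTS = {
    **{chr(cp): " " for cp in range(0x2000, 0x200C)},  # En Quad .. Zero-Width Space
    **{chr(cp): "" for cp in range(0x200C, 0x2010)},   # ZWNJ, ZWJ, LRM, RLM
    **{chr(cp): "" for cp in range(0x202A, 0x202F)},   # embeddings and overrides
    **{chr(cp): "" for cp in range(0x2066, 0x2070)},   # isolates, deprecated controls
    **{chr(cp): "" for cp in range(0xFE00, 0xFE10)},   # variation selectors 1-16
    "\u00a0": " ", "\u202f": " ", "\u3000": " ",        # no-break and wide spaces
    "\u0009": "    ",                                   # tab becomes 4 spaces
    "\u2060": "", "\u00ad": "", "\ufeff": "",           # word joiner, soft hyphen, BOM
    "\u2018": "'", "\u2019": "'",                       # single smart quotes
    "\u201c": '"', "\u201d": '"',                       # double smart quotes
    "\u2026": "...",                                    # horizontal ellipsis
    "\u2028": "\n", "\u2029": "\n\n",                   # line / paragraph separator
    "\ufffc": "[OBJECT]",                               # object replacement character
}
TABLE = str.maketrans(REPLACEMENTS)

def clean(text):
    """Apply the replacements above in a single pass."""
    return text.translate(TABLE)

print(clean("\u201cHi\u201d\u2026\u00a0ok\u200d\ufe0f"))
```

Em and en dashes are left out of the mapping on purpose, since the table marks them as “keep”.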
Does Adding or Removing Invisible Characters Help Bypass AI Detectors?
No - Our short test below shows that the addition or removal of hidden characters did not change the detectability of AI-generated content. We created 2 pieces of AI content and modified them by adding extra invisible characters, as well as by stripping all hidden Unicode characters using our free tool on this page.
The result was that the content was detectable by most tools regardless of the status of Invisible Characters.
Below are the findings…
Text AI Sample 2 - Extra Invisible Characters had a TON of Hidden Characters!
Getting ChatGPT to intentionally add extra characters resulted in this warning from them…
Do All LLMs Inject Hidden Characters?
Yes - We looked at the same prompt in several popular LLMs to see if hidden characters appear in their output.
Prompt: Write a LinkedIn post about the benefits of formatting LinkedIn posts.
No model produced an Invisible Character but all relied heavily on formatting with visible Unicode characters.
Founder / CEO of Originality.ai. I have been involved in the SEO and Content Marketing world for over a decade. My career started with a portfolio of content sites. Recently I sold 2 content marketing agencies, and I am the Co-Founder of MotionInvest.com, the leading place to buy and sell content websites. Through these experiences I understand what web publishers need when it comes to verifying content is original. I am not For or Against AI content; I think it has a place in everyone’s content strategy. However, I believe you as the publisher should be the one making the decision on when to use AI content. Our Originality checking tool has been built with serious web publishers in mind!