Wednesday, March 15, 2006

Google Special Character Search String Bug

Reproducible Google Bug

There are only two special characters (on a standard keyboard) that return any results in Google. The other 30 symbols force Google to return a blank page. Not a page that says "0 Results Returned", but literally a blank page. Only the standard Google header and footer are displayed.

These are the only symbols that return any results.
& (ampersand)
_ (underscore)

These 30 return no result count at all and are a bug (or a feature, as I'm sure Google would claim).
` backtick (grave accent)
~ tilde
! exclamation point
@ at sign
# pound
$ dollar sign
% percent
^ caret
* asterisk
( left parenthesis
) right parenthesis
- dash
= equal sign
+ plus sign
/ forward slash
, comma
< less than symbol
. period
> greater than symbol
? question mark
; semi-colon
: colon
' apostrophe
" double quote
[ left bracket
{ left curly bracket
] right bracket
} right curly bracket
\ backslash
| pipe symbol

What can be learned from this mysterious behavior in Google's search results?
The two that return search results, & (ampersand) and _ (underscore), are not part of any special query syntax used by Google and are found quite often in business names (at least the ampersand is). There must be something more to the fact that the underscore returns results, but I have yet to put my finger on it.

Google allows you to conduct advanced searches using specific symbols and constructs and calls them the Advanced Search Operators as seen here. Further below I show the list of Google's publicly known advanced search operators.

For example: if you want to find all of the pages Google has indexed on marketingshift you'd simply type site:www.marketingshift.com.

The key here is the ":" (colon) because it tells Google's query parser to look for a special search operator to the left of the colon. Thus it makes sense that Google doesn't use the ":" (colon) in determining relevance on a given page. But it makes no sense not to inform the user of this behavior on the search results page. And this only explains a few of the symbols on the list of special characters that cause the blank search results page to be shown.
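To make the colon behavior concrete, here is a minimal sketch of how a query parser might split an operator from its value. This is purely an illustration: the pattern and the function name `split_operator` are my own assumptions, not anything Google has published.

```python
import re

# Hypothetical pattern: lowercase operator name, a colon, then a non-empty value.
OPERATOR_PATTERN = re.compile(r"^(?P<operator>[a-z]+):(?P<value>\S+)$")


def split_operator(term):
    """Return (operator, value) for terms like 'site:example.com', else (None, term)."""
    match = OPERATOR_PATTERN.match(term)
    if match is None:
        return None, term
    return match.group("operator"), match.group("value")


print(split_operator("site:www.marketingshift.com"))  # ('site', 'www.marketingshift.com')
print(split_operator("marketingshift"))               # (None, 'marketingshift')
print(split_operator(":"))                            # (None, ':')
```

A bare ":" has nothing to the left of it, so a parser along these lines would find no operator and be left with a term that carries no searchable words.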

The complete list of advanced search operators in Google is as follows:
~ (when placed in front of a word also finds results with synonyms for that word)
" (used to search for exact phrases when enclosing a phrase in double quotes)
- (used when you want to exclude a term from the search results or to subtract numbers)
+ (used to require a term in addition to the main keyphrase in the search results or to add numbers)
: (string to left of colon denotes special type of search to execute)
* (wildcard character that represents one or more words or to multiply numbers)
/ (used to divide numbers)
% (used to calculate a percentage)
^ (used to raise the power of a number)
... (used to define a range of numbers to search between)

That leaves us with 20 other special characters that return no results.
` (backtick, or grave accent)
! (exclamation point)
@ (at sign)
# (pound)
$ (dollar sign)
( (left parenthesis)
) (right parenthesis)
= (equal sign)
, (comma)
< (less than symbol)
> (greater than symbol)
? (question mark)
; (semi-colon)
' (apostrophe)
[ (left bracket)
{ (left curly bracket)
] (right bracket)
} (right curly bracket)
\ (backslash)
| (pipe symbol)
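The counting above can be double-checked with a few lines of Python. This is only a sketch under two assumptions of mine: that "special characters on a standard keyboard" means the ASCII punctuation set in `string.punctuation`, and that the "..." range operator counts as the period character.

```python
import string

all_symbols = set(string.punctuation)   # ASCII punctuation on a standard US keyboard
returns_results = set("&_")             # the two symbols that return results
operator_symbols = set('~"-+:*/%^.')    # symbols used by the operators listed earlier

blank_page = all_symbols - returns_results    # symbols that give a blank page
unexplained = blank_page - operator_symbols   # blank page AND not a known operator

print(len(all_symbols), len(blank_page), len(unexplained))  # 32 30 20
print("".join(sorted(unexplained)))
```

The last line prints the leftover symbols, which should match the list above.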

My best guess is that it was easier for Google to exclude all special keyboard characters by default and explicitly handle the list of known advanced search operators, since that would require fewer resources. I also think another factor in excluding them is that Google will have uses for them later (or is already testing more advanced query syntax) and will use them in its other niche search efforts. I combed through the Google search features page but didn't find anything that gave me a better clue about upcoming (or existing) query syntax that might use the symbols, but then again, I'm no Google engineer.
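Here is roughly what my "exclude by default, handle operators explicitly" guess would look like in code. To be clear, this is a toy sketch of the guess and not how Google actually works; the name `normalize` and both symbol sets are assumptions for illustration.

```python
# Assumed whitelist: symbols tied to a known operator, plus the two that return results.
OPERATOR_SYMBOLS = set('~"-+:*/%^.')
PASS_THROUGH = set("&_")


def normalize(query):
    """Keep letters, digits, whitespace and whitelisted symbols; drop everything else."""
    kept = [
        ch for ch in query
        if ch.isalnum() or ch.isspace() or ch in OPERATOR_SYMBOLS or ch in PASS_THROUGH
    ]
    return "".join(kept).strip()


print(repr(normalize("#100")))  # '100'
print(repr(normalize("#")))     # ''
print(repr(normalize("AT&T")))  # 'AT&T'
```

Under this guess, a query made of nothing but an excluded symbol collapses to an empty string, which would leave the engine with nothing to search for.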

But no matter what, this is a bug, and it just gets added to the list of other Google bugs.

Steps to Breaking Google
1.) go to google.com (duh)
2.) enter any of the 20 special characters mentioned 2 paragraphs up
3.) laugh, snort and IM your buddies the link to show them what a genius you are

By Jason Dowdell at 02:23 PM | Comments (24)

(24) Thoughts on Google Special Character Search String Bug

I found your post because I was searching for the meaning of an error message that contains a '#' symbol.

Yahoo and A9 block # as well.

Seems that the search engine companies are creating a black hole in the world's search space, i.e. you could hide information in plain sight by exploiting the special character bug of all the major search engines.

Comments by Ed : Wednesday, August 23, 2006 at 07:01 PM

Is there any search engine that provides search results for characters like # etc or words containing such characters?

If you search Google for #100 it returns the same results as for 100, but if you search for $100 they were forced to make an exception and it returns different results than for 100.

Google are nuts. Their so called exact (quoted) search is not exact either. When asked about this they don't reply.

Comments by John Middlemas : Saturday, September 23, 2006 at 02:35 PM

Um...how long have you folks been online? Really?

Char() programming has been in search engines starting with Archie, created in 1990 by Alan Emtage, and later Prodigy, before most people had their first version of Windows.

These "bugs" are intentional, and all search engines are different upon encoding.

If I were you, I would take this post down.

It has been the subject of laughter and ridicule throughout IT professionals.

Comments by [ARB1D3_[00L3R : Monday, September 25, 2006 at 04:57 PM

Hi,

http://www.google.com/apis/reference.html
might help you, although you will find there are some long words in there which may be difficult.

Comments by progman : Saturday, November 04, 2006 at 09:58 AM

no need to be mean.
i've been coding (a little) for about 12 years, i learned a little basic in 88 (so i COULD call it 20 years...), and i've been a linux user for a decade. i've even used char()...

one thing i've never looked into is how search engines are coded, so i never made the connection.
i got curious about the "bug" recently and decided to look it up.
your answer gave me one of those "ah, duh!" moments, but it didn't make me feel like a retard as you seem to think i should have...

remember kids, there is a difference between ignorance and stupidity. help people learn and the world will be less of a stupid place :P

Comments by beta : Wednesday, June 04, 2008 at 07:19 PM

Thanks for posting this. I was looking for way to search for a "#" (pound sign) in conjunction with an IP address. This makes it clear one can't on Google.

Posts like this are very helpful since otherwise people will waste time looking in vain for a way to conduct such a search. Also if enough such posts are placed, maybe one or another search engines will think about a way to allow such searches.

The negative comments are uncalled for but do help explain why such searches don't work.

Comments by Richard F : Tuesday, July 29, 2008 at 07:50 AM

Thoroughly agree. No need to be mean, there's enough meanness in the world already and not enough help.

Stumbled upon this page when searching for a special dial code like *99xxx#. Yeah, I could have searched for "special dial codes" or something, but still point is, there could be legitimate search requirements like this not met by any search engines.

And the point about hiding in plain sight is interesting. How about "@l q@ed@ @$$!n@t!on pl@n$", sounds interesting enough?

Comments by Jonnyace : Thursday, September 11, 2008 at 11:44 PM

This post was useful to me - I wanted to search for mentions of something costing £535 but not all the phone numbers etc with 535 in them, and this post has made it clear that's not possible with Google. So thank you!

Comments by Tom Richards : Monday, November 03, 2008 at 06:33 AM

The way I see it -- and yes, I've been programming since '93 and online since '94 (y'all remember gopher???) -- there is absolutely no technical reason for the exclusion of any (yes, ANY) special characters when doing a search. Google *could* allow searching by any character string (even those that they're using for special purposes) if they really wanted to. The point is that they DON'T want to. They got to the point where they're the "de facto" search engine for a lot of people and have been resting on their laurels when it comes to actually improving searching on the internet. They added a "code search" a while back, but even that has the same irritating limitations on characters used for searching. Try searching for something that uses some actual code sequence (like what I was looking for: '\A', that's a backslash A in case it gets stripped) and you'll find all sorts of results for 'A'. Um. That's a BIG difference. Yes, backslash IS the escape character of choice for a lot of languages, but they are receiving a string from the browser. They could escape the backslash to allow it in searches. They could do a lot of things.

I think the point is that Google needs to be made aware that people are getting FED UP with their lack of allowing people to search on these chars and this post will (hopefully one day) wake them up to that fact.

Comments by Jon Warren : Wednesday, December 17, 2008 at 02:49 PM

same here, searched for <<< and no results were returned. thanks for the post.

Comments by Qatari Boy : Friday, January 02, 2009 at 05:36 AM

Apparently if google were to allow searches for special characters, the time of processing would be enormous:

http://www.google.com/support/websearch/bin/answer.py?answer=430

Comments by Cesaar : Tuesday, January 13, 2009 at 03:55 PM

Amusing to see programmers explaining that they're all incompetent, not just Google.

Comments by Andy : Monday, January 19, 2009 at 04:07 PM

"It has been the subject of laughter and ridicule throughout IT professionals."

Hey John Middlemas, I like your statement. I also like you and your little group of IT professionals. You obviously found this posting somehow, so Google must have given it enough importance to have it distributed across the web. Most people just can't shut up until they make an ass of themselves. If you don't have anything important to contribute to this posting then don't say crap. I would understand if this posting were bad in some way, but it's not. I found it useful because I was searching for a character string.

Comments by The Koi Man : Tuesday, March 17, 2009 at 01:13 PM

Jason, your site isn't taking the apostrophe too well.

Comments by The Koi Man : Tuesday, March 17, 2009 at 01:18 PM

@The Koi Man

Um, just so you know, "John Middlemas" did not write the comment to which you are replying. "[ARB1D3_[00L3R" wrote it. I do agree with you that the original comment was uncalled for, but if you're going to take the time to b*tch someone out over it, you should probably make sure you're b*tching out the correct person, you know?

Comments by fata_morgana_pseudonym : Monday, April 27, 2009 at 03:28 PM

>>remember kids, there is a difference between ignorance and stupidity. help people learn and the world will be less of a stupid place :P

I think you mean ignorant place, stupid.

Comments by bob smith : Friday, June 26, 2009 at 03:43 PM

I found this post when looking for how to search for java / xhtml code.

Special chars work somewhat if you try
http://www.google.com/codesearch

E.g. searching for #{dataModel will actually return some docs containing it, as opposed to normal Google search.

Comments by jens : Tuesday, November 17, 2009 at 07:00 AM

Bug or not, it's annoying as hell. When is someone going to invent a search engine that will let me search for exact phrases, special characters included? Google's code search would be fine if it indexed the rest of the internet. It always baffles me how coders never write software for themselves.

Comments by idiot : Wednesday, November 18, 2009 at 09:41 AM

I don't think this is a stupid post (although I smirked at bob smith's post!). Google could tackle this if they wanted to.

I'm looking for a way to remove the dot on whole numbers going from the Excel format #,##0.## but Google thinks I'm searching for 0 on both normal and code search.

Maybe it's costly to include special characters in search time but how about making it an option, so normally they're ignored but something like [#,##0.##] forces them to be captured?

I've no idea how to put my search into words, so Google has failed me this time around.

Comments by stupid : Friday, November 20, 2009 at 06:47 AM

The technical reason (yes, there is one) is that Google (and Lycos and Altavista and Yahoo) is not a character search engine. It is a word search engine, a database of websites and words. All special characters are treated as delimiters between words (unless they are a part of a word in Google's vocabulary, e.g. C++). A character search through the entire web would take a very long time; a word search/look-up takes a couple of seconds.

Comments by S Arnold : Monday, April 12, 2010 at 08:07 AM

Exactly, thanks, useful post.

I've been trying to find out which template engine uses the '<?=' syntax. Impossible to figure out with Google :-(

Comments by dr B : Saturday, September 18, 2010 at 04:07 PM

Great article, I have many times looked for an exact search including all ascii characters only to be thwarted by the likes of Google.

Comments by John : Sunday, September 19, 2010 at 08:31 AM

Besides glyphs, it would be nice if searches for musical themes could be done and if searches on image content were more accurate. How about finding images that only use blue with yellow?

Comments by Fame Ous Anon : Tuesday, September 21, 2010 at 10:39 AM

It's absolutely not true that it would take more time to search for special characters. Why would you think searching for "a" would be less complex than searching for ":///"? The letter "a" occurs a countless number of times on the internet. The combination of characters does not.

It's already been said, but I'll reiterate: Google INTENTIONALLY does not allow searching for characters. It only takes the simplest understanding of code to know how to implement that feature; the fact that they haven't is literally proof that they do not care about providing a quality product. They have no reason to, because their users do not pay them. Ad revenue has nothing to do with it because advertisers will never request that feature.

Google is the Macbook of search engines. It's simplified and dumbed down so that the entire public can use it. The problem is that nobody has built a quality alternative. Nobody wants to make money I guess.

Comments by Anon : Friday, September 24, 2010 at 07:31 PM
