PDF | Google is the most popular search engine ever created, but Google's search capabilities are so powerful, they sometimes discover. in the “Google Hacking” book. • For much more detail, I encourage you to check out. “Google Hacking for Penetration Testers” by Syngress. Publishing. Google doesn't care if you type your query in lowercase letters (hackers), up- percase (HACKERS Google Hacking Google Maps and Google Earth.
Hacking for Penetration Testers (Syngress, ISBN: ). this query would return every PDF file that Google has crawled, but it. Google Hacking. Making Competitive. Intelligence Work for You. Google Hacking . Making Competitive . Google Hacking for Penetration Testers. Johnny Long. Google Hacking for Penetration Testers Using Google as a Security Testing Tool Johnny Long [email protected] What we're doing • I hate pimpin', but.
Improper command termination can be abused quite easily by an attacker. Socket; calls. We need IO:: Page Scraping with Perl This piece of code drives all the The socket is subroutines. Even if password protected, the client reveals the server name and port. Thanks to lester for this one! Thanks to murfie for this one! Thanks to server1 for this one!
Active WebCam Thanks klouw! Toshiba Network Cameras intitle: Who do you want to disconnect today? Found by m00d! Thanks to darksun for this one! Hrmmm… Thanks JimmyNeutron! Firewalls - Smoothwall Uh oh… this firewall needs updating… Thanks Milkman! Firewalls - IPCop Uh oh… this one needs updating too! Thanks Jimmy Neutron! IDS Data: Cisco Switches Thanks Jimmy Neutron! Too lazy to install PHP Nuke?
Thanks to arrested for this beauty! Thanks stonersavant! Thanks murfie! Oh wait.. This product allows web management of power outlets! Google search locates login page. What does any decent hacker do to a login page?
Hacking Power Systems! Who do you want to power off today? Thanks to JimmyNeutron for this beauty! Sipura SPA B: Or the last number that dialed them? Thanks stonersavant!!! Videoconferencing Who do you want to disconnect today? Thanks yeseins!!! PBX Systems No password required. Usernames, Passwords and Secret Stuff, oh my! Digital camera image dumps…. Thanks xlockex! Old School! Finger… Google Hacking circa !!?!? Thanks to Jimmy Neutron! Open SQL servers Already logged in, no hacking required!
Thanks Quadster! Netscape History Files Oops.. POP email passwords! Thanks to digital. What do you want to delete today??? Thanks JimmyNeutron! More Explorers?!?! Why hack when you can… click? Are sensitive, non-public Government documents on the web? Locked out! Credit card info on the web? Getting shell.. Getting serialz… wha-hay!! Generosity like this could change the world. Students have a right to know what crimes take place on campus. This one is for a county.
Credit Validation Question: What keeps someone from using a pilfered credit card number and expiration date to make an online purchase? That little code on the back of the card. Steal their identity.
The most commonly used Boolean operator is AND. This operator is used to include multiple terms in a query. For example, a simple query like hacker could be expanded with a Boolean operator by querying for hacker AND cracker.
The latter query would include not only pages that talk about hackers but also sites that talk about hackers and the snacks they might eat. Some search engines require the use of this operator, but Google does not.
By default, Google automatically searches for all the terms you include in your query. In fact, Google will warn you when you have included terms that are obviously redundant, as shown in Figure 1. There should be no space following the plus symbol. For example, if you were to search for and, justice, for, and all as separate, distinct words, Google would warn that several of the words are too common and are excluded from the search.
To force Google to search for those common words, preface them with the plus sign. It has no ill effects if it is used excessively. In addition, the words could be enclosed in double quotes. This generally will force Google to include all the common words in the phrase.
This query presented as a phrase would be "and justice for all". Another common Boolean operator is NOT. The best way to use this operator is to preface a search word with the minus sign (-). Be sure to leave no space between the minus sign and the search term. Consider a simple query such as hacker. This query is very generic and will return hits for all sorts of occupations, like golfers, woodchoppers, serial killers, and those with chronic bronchitis. With this type of query, you are most likely not interested in each and every form of the word hacker but rather a more specific rendition of the term.
To narrow the search, you could include more terms, which Google would automatically AND together, or you could start narrowing the search by using NOT to remove certain terms from your search.
To remove some of the more unsavory characters from your search, consider using queries such as hacker -golf or hacker -phlegm. Or just try a Google Video search for lumberjack song. Talk about twisted.
A less common and sometimes more confusing Boolean operator is OR. The OR operator, represented by the pipe symbol (|) or simply the word OR in uppercase letters, instructs Google to locate either one term or another in a query. Forget all that order of operations stuff you learned in high school algebra.
For our purposes, an AND is weighed equally with an OR, which is weighed as equally as an advanced operator. These factors may affect the rank or order in which the search results appear on the page, but have no bearing on how Google handles the search query. From those pages, show me only the pages that contain either the words username, userid, or user in the text of the document.
From those pages, only show me documents that are CSV files. For the purposes of learning how to create queries, all we need to remember is that Google reads our query from left to right.
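The Boolean rules described so far (terms are ANDed automatically, a minus sign directly in front of a term excludes it, and a pipe symbol separates alternatives) can be sketched as a small query builder. This is only an illustration: `build_query` and its argument names are hypothetical helpers, not part of any Google tool.

```python
def build_query(required=(), excluded=(), any_of=()):
    """Assemble a basic query string from three groups of terms."""
    parts = []
    for term in required:
        # Terms are ANDed by default; multi-word terms are quoted as a phrase.
        parts.append(f'"{term}"' if " " in term else term)
    if any_of:
        # The pipe symbol (or the uppercase word OR) separates alternatives.
        parts.append(" | ".join(any_of))
    for term in excluded:
        # No space is allowed between the minus sign and the term.
        parts.append("-" + term)
    return " ".join(parts)


print(build_query(required=["hacker"], excluded=["golf", "phlegm"]))
print(build_query(any_of=["username", "userid", "user"]))
```

Because Google reads the query from left to right, the builder simply emits the groups in a fixed order and adds no parentheses.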
Fortunately, Google is not offended or affected by parentheses. The previous query can also be submitted as intext: Although Google tends to provide very relevant results for most basic searches, we will begin looking at fairly complex searches aimed at locating a very narrow subset of Web sites. GNU Zebra uses a file called zebra. After downloading the latest version of Zebra from the Web, we learn that the included zebra.
Interface's description. Static default route sample. However, Google takes some liberties with this search query, making the results less than adequate, as shown in Figure 1. This makes our next query: For starters, the seattlewireless hit we had in our first search is missing. This was a valid hit, but because the configuration file was not named zebra. This is a great lesson to learn about search reduction: This makes our new query "!
Software installations like this one often ship with a sample configuration file to help guide the process of setting up a custom configuration. Most users will simply edit this file, changing only the settings that need to be changed for their environments, saving the file not as a. In this situation, the user could have a live configuration file with the term zebra.
Reduction based on this term may remove valid configuration files created in this manner. Notice that our zebra. This is less a gamble than reducing based on zebra. If Google safely ignores part of a human-friendly query, leave it alone. The human readers will thank you! Every Google query can be represented with a URL that points to the results page. Submitting a search through the Web interface takes you to a results page that can be represented by a single URL.
For example, consider the query ihackstuff. Once you enter this query, you are whisked away to a URL similar to the following: This URL then becomes not only an active connection to a list of results, it also serves as a nice, compact sort of shorthand for a Google query. Any experienced Google searcher can take a look at this URL and realize the search subject.
This URL can also be modified fairly easily. By changing the word ihackstuff to iwritestuff, the Google query is changed to find the term iwritestuff.
This simple example illustrates the usefulness of the Google URL for advanced searching. A quick modification of the URL can make changes happen fast!
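A search URL of this kind can also be built and edited programmatically. The sketch below uses Python's standard `urllib.parse` module; the base address `http://www.google.com/search` and the parameter name `q` are assumptions made for illustration, and `search_url` and `replace_query` are hypothetical helper names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed base address of the search script, for illustration only.
BASE = "http://www.google.com/search"


def search_url(query, **params):
    """Build a results-page URL for a query plus optional extra parameters."""
    return BASE + "?" + urlencode({"q": query, **params})


def replace_query(url, new_query):
    """Return the same URL with only the search term (q) swapped out."""
    parts = urlsplit(url)
    pairs = [(key, new_query if key == "q" else value)
             for key, value in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


url = search_url("ihackstuff")
print(url)
print(replace_query(url, "iwritestuff"))
```

Swapping the term in the URL is equivalent to typing the new term into the search field and resubmitting.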
The first part of the URL, www. Browsing to this URL presents you with a nice, blank search page. The question mark after the word search indicates that parameters are about to be passed into the search script. Parameters are options that instruct the search script to actually do something.
The basic syntax will look something like this: Special Characters Hex encoding is definitely geek stuff, but sooner or later you may need to include a special character in your search URL.
Most modern browsers will adjust a typed URL, replacing special characters and spaces with hex-encoded equivalents. If your browser supports this behavior, your job of URL construction is that much easier. Try this simple test. If your browser refuses to convert those spaces, the query will not work as expected. There may be a setting in your browser to modify this behavior, but if not, do yourself a favor and use a modern browser.
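If you would rather not rely on the browser, the hex encoding can be produced by hand. The following sketch uses Python's standard `urllib.parse` functions; the sample phrases are arbitrary placeholders, not queries taken from the text.

```python
from urllib.parse import quote, quote_plus, unquote

phrase = "the quick brown fox"

# quote() hex-encodes each space as %20.
print(quote(phrase))

# quote_plus() uses the form-style "+" for spaces instead.
print(quote_plus(phrase))

# Double quotes around a phrase become %22.
encoded = quote('"and justice for all"')
print(encoded)

# unquote() reverses the encoding.
print(unquote(encoded))
```

Either space encoding is generally accepted in a query string; what matters is that raw spaces and quotes are never left in the URL.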
Internet Explorer, Firefox, Safari, and Opera are all excellent choices. You start with a URL and you modify it as needed to achieve varying search results. If you need some added parameters, you can add them directly to the base URL in any order.
If you need to modify parameters in your search, you can change the value of the parameter and resubmit your search. If you need to remove a parameter, you can delete that entire parameter from the URL and resubmit your search.
You simply make changes to the URL and press Enter. The browser will automatically fetch the address and take you to an updated search page. Depending on the options you selected and the search terms you provided, you will see some or all of the variables listed in Table 1. These parameters can be added or modified as needed to change your search criteria.
This should be set to your native tongue. Located Web pages are not translated. Only display pages written in this language. Google suggests UTF This negates the need to surround the phrase with quotes. The lr value instructs Google to only return pages written in a specific language. This is not the same as the lr variable, which restricts our results to pages written in a specific language, nor is it like the translation service, which translates a page from one language to another.
We have not asked Google to restrict or modify our search in any way. Notice that our URL is different: Unlike the hl option Table 1. We have asked Google to return only pages written in Danish.
This means that if you change this value in your URL, it sticks for future searches. The best way to change it back is through Google preferences or by changing the hl code directly inside the URL.
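Adding, changing, and removing URL parameters such as hl and lr can be sketched with the standard library as well. The example URL and the values `en` and `lang_da` are assumptions chosen for illustration, and `edit_params` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def edit_params(url, set_values=None, remove=()):
    """Add or change parameters in set_values, then drop those in remove."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    params.update(set_values or {})
    for name in remove:
        params.pop(name, None)
    return urlunsplit(parts._replace(query=urlencode(params)))


# Assumed starting URL: a search for "people" with an English interface.
url = "http://www.google.com/search?q=people&hl=en"

# Add a language restriction (the value lang_da is an assumed code for Danish).
restricted = edit_params(url, set_values={"lr": "lang_da"})
print(restricted)

# Remove the restriction again.
print(edit_params(restricted, remove=["lr"]))
```

Parameter order does not matter to the search script, so the helper makes no attempt to sort or reposition anything.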
However, restrict has nothing to do with language. This variable gives you the ability to restrict your search results to one or more countries, determined by the top-level domain name. Although inexact, this variable works amazingly well. Consider a search for people in which we restrict our results to JP (Japan), as shown in Figure 1.
Our URL has changed to include the restrict value shown in Table 1. As our sidebar reveals, the host does in fact appear to be located in Japan. Japan Network Information Center address: Chiyoda-ku, Tokyo , Japan country: JP phone: Beginners to Google searching are encouraged to use the Google-provided forms for searching, paying close attention to the messages and warnings Google provides about syntax.
Boolean operators such as NOT and OR are available through the use of the minus sign and the word OR (or the | symbol), respectively, whereas the AND operator is ignored, since Google automatically includes all terms in a search.
Advanced search options are available through the Advanced Search page, which allows users to narrow search results quickly. Advanced Google users narrow their searches through customized queries and a healthy dose of experience and good old common sense.
These basic premises serve as the foundation for a successful search. They can also help clarify a search for fellow humans who might read your queries later on. To have your questions about this chapter answered by the author, browse to www. Some people like using nifty toolbars. Where can I find information about Google toolbars? Ask Google. Google can almost always provide an answer if you can figure out the query.
There are a few ways. From the search results page, modify the query slightly and look at how the URL changes when you submit it. Keep an eye on the search engine hacking forums at http: It boils down to personal preference, and many advanced Google users use each of these techniques in different ways.
Many lengthy Google sessions begin as a simple query typed into the www. Depending on the narrowing process, it may be easier to add or subtract from the query right in the search field. Which technique you decide to use ultimately depends on your tastes and the context in which you perform searches. Chapter 2 Advanced Operators Solutions in this chapter: When advanced operators are not provided in a query, Google will locate your search terms in any area of the Web page, including the title, the text, the Uniform Resource Locator URL , or the like.
We take a look at the following advanced operators in this chapter: Although they're relatively easy to use, they have a fairly rigid syntax that must be followed.
The basic syntax of an advanced operator is operator: When using advanced operators, keep in mind the following: In most cases, Google will treat a syntactically bad advanced operator as just another search term. For example, providing the advanced operator intitle without a following colon and search term will cause Google to return pages that contain the word intitle.
For example, a search term can be a single word or a phrase surrounded by quotes. If you use a phrase, just make sure there are no spaces between the operator, the colon, and the first quote of the phrase. Some advanced operators combine better than others, and some simply cannot be combined.
We will take a look at these limitations later in this chapter. They are generally used once per query and cannot be mixed with other operators.
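The syntax rules above (an operator name, a colon, then a single word or a quoted phrase, with no spaces in between) can be captured in a short formatter. This is a minimal sketch; `advanced` is a hypothetical helper and does not check whether the operator name is one Google actually recognizes.

```python
def advanced(operator, term):
    """Format an advanced-operator search term with no stray spaces."""
    if not operator.isalpha():
        # A space or punctuation here would make Google treat it as plain text.
        raise ValueError("operator must be a single alphabetic word")
    term = term.strip()
    if not term:
        # An operator with nothing after the colon is just another search word.
        raise ValueError("operator needs a search term")
    if " " in term:
        # Phrases are quoted so the operator applies to every word in them.
        term = f'"{term}"'
    return f"{operator}:{term}"


print(advanced("intitle", "Google"))
print(advanced("intitle", "index of"))
```

Quoting the phrase matters because Google takes the first unquoted space as the end of the operator's search term.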
Examples of valid queries that use advanced operators include these: Google This query will return pages that have the word Google in their title. Remember from the previous chapter that this query could also be given as intitle: This technique also makes it easy to supply a phrase without having to type the spaces and the quotation marks around the phrase.
Google interprets that space as the end of your advanced operator search term and continues processing the rest of the query. Again, notice that intitle only applies to the phrase index of. Figure 2. These messages are often the key to unraveling errors in either your query string or your URL, so keep an eye on the top of the results page.
Sometimes, however, Google is less helpful, returning a blank results page with no error text, as shown in Figure 2. In this case, we simply abused the allintitle operator. Most of the operators that begin with all do not mix well with other operators, like the inurl operator we provided.
This search got Google all confused, and it coughed up a blank page. Some operators can only be used in performing a Web search, and others can only be used in a Groups search. Refer to Table 2. If you have trouble remembering these rules, keep an eye on the results line near the top of the page. If Google picks up on your bad syntax, an error message will be displayed, letting you know what you did wrong.
Sometimes, however, Google will not pick up on your bad form and will try to perform the search anyway. These are the words Google interpreted as your search terms. Intitle and Allintitle: The title is displayed at the top of most browsers when viewing a page, as shown in Figure 2.
In the context of Google groups, intitle will find the term in the title of the message post. For example, consider the same page shown in Figure 2. When using intitle, be sure to consider what text is actually from the title and which text might have been inserted by the browser. The thing to remember is that the title is the text that appears at the top of the Web page, and you can use intitle to locate text in that spot. Allintitle breaks this rule.
Allintitle tells Google that every single word or phrase that follows is to be found in the title of the page. For example, we just looked at the intitle: Notice also that the allintitle search is also more restrictive, returning only a fraction of the results as the intitle search. Google highlights your search terms everywhere they appear in the search results. Experiment with modifying a Google cache URL.
Locate your search terms in the URL, and add words around your search terms. If you do it correctly and those words are present, Google will highlight those new words on the page. Locate a String Within the Text of a Page The allintext operator is perhaps the simplest operator to use since it performs the function that search engines are most known for: For this reason, the allintext operator should not be mixed with other advanced operators.
Inurl and Allinurl: Finding Text in a URL Having been exposed to the intitle operators, it might seem like a fairly simple task to start throwing around the inurl operator with reckless abandon.
I encourage such flights of searching fancy, but first realize that a URL is a much more complicated beast than a simple page title, and the workings of the inurl operator can be equally complex.
The beginning of a URL consists of a protocol, followed by: Following the pathname comes an optional filename.
A common basic URL, like http: The protocol, http, indicates that this is basically a Web server. The server is located at www. As we saw in the previous chapter, a Google search can be conveyed as a URL, which can look something like http: Second, there are a ton of special characters sprinkled around the URL, which Google also has trouble weeding through. Attempting to specifically include these special characters in a search could cause unexpected results and might limit your search in undesired ways.
Third, and most important, other advanced operators site and filetype, for example can search more specific places inside the URL even better than inurl can. These factors make inurl much trickier to use effectively than an intitle search, which is very simple by comparison.
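To see why inurl has so much to sift through, it helps to split a URL into the parts just described. The sketch below uses Python's standard `urllib.parse` and `posixpath` modules; the sample address is a made-up placeholder and `describe` is a hypothetical helper.

```python
import posixpath
from urllib.parse import urlsplit


def describe(url):
    """Break a URL into protocol, server, directory, filename and parameters."""
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "server": parts.hostname,
        "directory": directory,
        "filename": filename,
        "parameters": parts.query,
    }


# Placeholder URL for illustration only.
info = describe("http://www.example.com/docs/index.html?lang=en")
for name, value in info.items():
    print(f"{name}: {value}")
```

The server and filename pieces are the ones that site and filetype target more precisely than inurl does.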
As with the intitle operator, inurl has a companion operator, known as allinurl. Consider the inurl search results page shown in Figure 2. Replacing the inurl search with an allinurl search, we receive the results page shown in Figure 2. This time, Google was instructed to find the words admin and index only in the URL of the document, resulting in about a million fewer hits.
Just like the allintitle search, allinurl tells Google that every single word or phrase that follows is to be found only in the URL of the page. And just like allintitle, allinurl does not play very well with other queries. Narrow Search to Specific Sites Although technically a part of a URL, the address or domain name of a server can best be searched for with the site operator. Site allows you to search only for pages that are hosted on a specific server or in a specific domain.
Although fairly straightforward, proper use of the site operator can take a little bit of getting used to, since Google reads Web server names from right to left, as opposed to the human convention of reading site names from left to right. Consider a common Web server name, www. To locate pages that are hosted on blackhat. Both of these servers end in blackhat. Take, for example, a query for site: Truth be told, this result is odd.
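The right-to-left reading of server names can be modeled by comparing dot-separated labels from the end of the name. This is only a sketch of the idea, not a description of how Google implements the site operator; `matches_site` is a hypothetical helper and the host names are illustrative.

```python
def matches_site(hostname, site):
    """Return True if hostname ends with the labels of site, read right to left."""
    host_labels = hostname.lower().split(".")[::-1]
    site_labels = site.lower().split(".")[::-1]
    return host_labels[:len(site_labels)] == site_labels


print(matches_site("www.blackhat.com", "blackhat.com"))    # True
print(matches_site("japan.blackhat.com", "blackhat.com"))  # True
print(matches_site("www.blackhat.com", "com"))             # True
print(matches_site("www.blackhat.com", "blackhat"))        # False
```

The last call fails because, read from the right, the first label of the server name is com, not blackhat.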
Google and the Internet at large read server names (really domain names) from right to left, not from left to right. So a Google query for site: So why does Google return results? What is that thing? I coined the term googleturd to describe what is most likely a typo that was crawled by Google.
Depending on certain undisclosed circumstances, oddball links like these are sometimes retained. Googleturds can be useful, as we will see later on. The filetype operator can help you search for these types of files. More specifically, filetype searches for pages that end in a particular file extension. The file extension is the part of the URL following the last period of the filename but before the question mark that begins the parameter list. Since the file extension can indicate what type of program opens a file, the filetype operator can be used to search for specific types of files by searching for a specific file extension.
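The definition of a file extension given above (the text after the last period of the filename, before the question mark that starts the parameter list) translates directly into code. The sketch below relies on Python's standard library; `url_extension` is a hypothetical helper and the URLs are placeholders.

```python
import posixpath
from urllib.parse import urlsplit


def url_extension(url):
    """Return the lowercase file extension of a URL, or '' if there is none."""
    # urlsplit() separates the path from the ?parameter list.
    path = urlsplit(url).path
    filename = posixpath.basename(path)
    # splitext() splits at the last period of the filename.
    _, ext = posixpath.splitext(filename)
    return ext[1:].lower()


print(url_extension("http://www.example.com/docs/report.final.PDF?download=1"))
print(url_extension("http://www.example.com/docs/"))
```

A URL that ends in a directory has no filename and therefore no extension for filetype to match.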
Extension Approx. Just look at how many more hits Google is reporting! The jump in hits is staggering. TIP The ext operator can be used in place of filetype. A query for filetype: You can see that Google has searched and converted a file by looking at the results page shown in Figure 2.
This indicates that Google recognized the file as a Microsoft Word document. A link to the original file is also provided. This is the cached version of the original page, converted to HTML.
Keep these things in mind: This operator flakes out when ORed. As an example, the query filetype: The query filetype: However, when you start adding to this precocious combination with things like filetype: This operator can be mixed with other operators and search terms.
The real hackers play in the gray areas all the time. The filetype operator opens up another interesting playground for the true Google hacker. Consider the query filetype: At the time of this writing, this query returns over 7, results, all of which are odd in their own right. Search for Links to a Page The link operator allows you to search for pages that link to other pages. Instead of providing a search term, the link operator requires a URL or server name as an argument.
Shown in its most basic form, link is used with a server name, as shown in Figure 2. The link operator can be extended to include not only basic URLs, but complete URLs that include directory names, filenames, parameters, and the like. Keep in mind that long URLs are much more specific and will return fewer results than their shorter counterparts.
In fact, the cached banner does not make any reference to your search query, as shown in Figure 2. To properly use the link operator, you must provide a full URL (including protocol, server, directory, and file), a partial URL (including only the protocol and the host), or simply a server name; otherwise, Google could return unpredictable results.
As an example, consider a search for link: This search is not the proper syntax for a link search, since the domain name is invalid. The correct syntax for a search like this might be link: So what exactly is being returned from Google for a search like link: Figures 2. Google offers another clue as to how it handles invalid link searches through the cache page. As shown in Figure 2. The link operator cannot be used with other operators or search terms. Locate Text Within Link Text This operator can be considered a companion to the link operator, since they both help search links.
The inanchor operator, however, searches the text representation of a link, not the actual URL. For example, in Figure 2. When you click that link, you are taken to the URL http: If you were to look at the actual source of that page, you would see something like this: This is not the same as using inurl to find this page with a query like inurl: Computers inurl: This search will be handy later, especially when we begin to explore ways of searching for relationships between sites.
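The difference between a link's text and its URL is easy to see by pulling both out of a page's source. The following sketch uses Python's standard `html.parser`; the HTML snippet is a made-up placeholder, and `AnchorCollector` and `anchors` are hypothetical names.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (anchor text, href) pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only text that sits inside a link counts as anchor text.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def anchors(html):
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    return collector.links


page = '<p>See the <a href="http://www.example.com/hw.html">current page</a>.</p>'
print(anchors(page))
```

In this model, inanchor would match against the first element of each pair and inurl against the second.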
The inanchor operator can be used with other operators and search terms. If you would like to jump right to the cached version of a page without first performing a Google query to get to the cached link on the results page, you can simply use the cache advanced operator in a Google query such as cache: Just as with the link operator, passing an invalid hostname or URL as a parameter to cache will submit the query as a phrase search.
A search for cache: The cache operator can be used with other operators and terms, although the results are somewhat unpredictable. Search for a Number The numrange operator requires two parameters, a low number and a high number, separated by a dash. This operator is powerful but dangerous when used by malicious Google hackers.
As the name suggests, numrange can be used to find numbers within a range. For example, to locate the number , a query such as numrange: When searching for numbers, Google ignores symbols such as currency markers and commas, making it much easier to search for numbers on a page. A shortened version of this operator exists as well. Instead of supplying the numrange operator, you can simply provide two numbers in a query, separated by two periods.
The shortened version of the query just mentioned would be Notice that the numrange operator was left out of the query entirely. This operator can be used with other operators and search terms. It would be extremely irresponsible of me to share these powerful queries with you. Fortunately, the abuse of this operator has been curbed due to the diligence of the hard-working members of the Search Engine Hacking forums at http: The members of that community have taken the high road time and time again to get the word out about the dangers of Google hackers without spilling the beans and creating even more hackers.
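Both forms of the operator, and the way symbols such as currency markers and commas are ignored, can be sketched as follows. The helpers `numrange_query` and `numbers_in_range` are hypothetical, the sample numbers are arbitrary, and the matching logic is only a rough model of the behavior described, not Google's implementation.

```python
import re


def numrange_query(low, high, short=False):
    """Format a number-range search in the long or the shortened form."""
    if low > high:
        raise ValueError("low must not exceed high")
    # Long form: two numbers separated by a dash.
    # Short form: two numbers separated by two periods.
    return f"{low}..{high}" if short else f"numrange:{low}-{high}"


def numbers_in_range(text, low, high):
    """Find numbers in text that fall in [low, high], ignoring commas."""
    found = []
    for token in re.findall(r"\d[\d,]*", text):
        value = int(token.replace(",", ""))
        if low <= value <= high:
            found.append(value)
    return found


print(numrange_query(12344, 12346))
print(numrange_query(12344, 12346, short=True))
print(numbers_in_range("Total: $12,345 for 99 items", 12344, 12346))
```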
This sidebar is dedicated to them! Search for Pages Published Within a Certain Date Range The daterange operator can tend to be a bit clumsy, but it is certainly helpful and worth the effort to understand. You can use this operator to locate pages indexed by Google within a certain date range.
Every time Google crawls a page, this date changes. If Google locates some very obscure Web page, it might only crawl it once, never returning to index it again. If you find that your searches are clogged with these types of obscure Web pages, you can remove them from your search and subsequently get fresher results through effective use of the daterange operator.
The parameters to this operator must always be expressed as a range, two dates separated by a dash. If you only want to locate pages that were indexed on one specific date, you must provide the same date twice, separated by a dash.
It is too easy to be true. Both dates passed to this operator must be in the form of two Julian dates. The Julian date is the number of days that have passed since January 1, B. For example, the date September 11, , is represented in Julian terms as Google does not officially support the daterange operator, and as such your mileage may vary. Google seems to prefer the date limit used by the advanced search form at www.
As we discussed in the last chapter, this form creates fields in the URL string to perform specific functions.
For example, to find pages that have been updated within the past three months and that contain the word Google, use the query http: This might be a better alternative date restrictor than the clumsy daterange operator. Just understand that these are very different functions.
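Since daterange expects Julian dates rather than calendar dates, a small converter saves some arithmetic. The sketch below computes the integer Julian day number from Python's proleptic Gregorian ordinal; the constant 1721425 is the offset between the two counts, and `julian_day` and `daterange_query` are hypothetical helper names.

```python
from datetime import date

# date.toordinal() counts days from 0001-01-01 (ordinal 1); adding this
# offset gives the integer Julian day number for the same calendar day.
JULIAN_OFFSET = 1721425


def julian_day(day):
    """Return the integer Julian day number for a calendar date."""
    return day.toordinal() + JULIAN_OFFSET


def daterange_query(start, end, terms):
    """Format a daterange search; the operator cannot be used by itself."""
    if not terms.strip():
        raise ValueError("daterange must be combined with other search terms")
    if start > end:
        raise ValueError("start must not be after end")
    return f"daterange:{julian_day(start)}-{julian_day(end)} {terms}"


print(julian_day(date(2000, 1, 1)))
print(daterange_query(date(2001, 9, 11), date(2001, 9, 11), "Google"))
```

To search a single day, pass the same date for both ends of the range, as the text requires.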
The daterange operator must be used with other search terms or advanced operators. It will not return any results when used by itself. The parameter to this operator must be a valid URL or site name. You can achieve this same functionality by supplying a site name or URL as a search query. Just as with the link and cache operators, passing an invalid hostname or URL as a parameter to info will submit the query as a phrase search. A search for info: Show Related Sites The related operator displays sites that Google has determined are related to a site, as shown in Figure 2.
The parameter to this operator is a valid site name or URL. Passing an invalid hostname or URL as a parameter to related will submit the query as a phrase search. A search for related: The related operator cannot be used with other operators or search terms. Search Groups for an Author of a Newsgroup Post The author operator will allow you to search for the author of a newsgroup post. The parameter to this option consists of a name or an e-mail address. Attempting to use this operator outside a Groups search will result in an error.
Johnny, the search results will include posts written by anyone with the first, middle, or last name of Johnny, as shown in Figure 2. In most cases, these are not real names. This is the nature of the newsgroup beast.
Pseudo-anonymity is fairly easy to maintain when anyone can post to newsgroups through Google using nothing more than a free e-mail account as verification. Simple searches such as author: Johnny or author: Johnny ihackstuff. Consider a search like author: This search fails pretty miserably, as shown in Figure 2. Search Group Titles This operator allows you to search the title of Google Groups posts for search terms. This operator only works within Google Groups.
This is one of the operators that is very compatible with wildcards. For example, to search for groups that end in forsale, a search such as group: In some cases, Google finds your search term not in the actual name of the group but in the keywords describing the group. Consider the search group: Not all of the groups returned contain the word windows, but all the returned groups discuss Windows topics. If you get odd results when throwing group into the mix, try using other operators such as intitle to compensate.
Search Google Groups Subject Lines The insubject operator is effectively the same as the intitle search and returns the same results. Searches for intitle: This is most likely because the subject of a group post is also the title of the post. Just like the intitle operator, insubject can be used with other operators and search terms. This operator took only one argument, a group message identi- fier. A message identifier or message ID is a unique string that identifies a newsgroup post. The format is something like xxx yyy.
To view message IDs, you must view the original group post format. When viewing a post see Figure 2. You will be taken to a page that lists the entire content of the group post, as shown in Figure 2. When operational, the msgid operator does not mix with other operators or search terms.
Search for Stock Information The stocks operator allows you to search for stock market information about a particular company. The parameter to this operator must be a valid stock abbreviation. If you provide an invalid stock ticker symbol, you will be taken to a screen that allows further searching for a correct ticker symbol, as shown in Figure 2.
Show the Definition of a Term The define operator returns definitions for a search term. Fairly simple, and very straightforward, arguments to this operator may be a word or phrase. Links to the source of the definition are provided, as shown in Figure 2. Search Phone Listings The phonebook operator searches for business and residential phone listings. Three operators can be used for the phonebook search: The parameters to these operators are all the same and usually consist of a series of words describing the listing and location.
In many ways, this operator functions like an allintitle search, since every word listed after the operator is included in the operator search.
A query such as phonebook: If you supply what looks like an address including a state or a name and a state as a standard query, Google will return a link allowing you to map the location in the case of an address or a phone listing in the case of a name and street match. Simply fill out the form at www. Consider a query for phonebook: In this case, you need to provide more information in your query to get hits, not fewer keywords, as Google suggests.
Consider phonebook: This table also lists operators that can only be used within specific Google search areas and operators that cannot be used alone. The values in this table bear some explanation. Used Alone? For example, a search for allintext: Sum Dum Goy intitle: Dragon gives you that empty feeling inside— like a year without the classic The Last Dragon see Figure 2. This section focuses on pointing out a few of the potential bad collisions that could cause you headaches.
First, consider a query like something -something. By asking for something and taking away something, we end up with nothing. This is an obvious example, but consider intitle: It gets a bit tricky when the advanced operators start overlapping. Consider site and inurl. The URL includes the name of the site. A query like site: Save the rule breaking for your required Google hacking license test!
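The simplest kind of collision, asking for a term and excluding it in the same query, can be caught before the query is ever submitted. The following is only a rough sketch: the function name contradictory_terms is made up for illustration, and it assumes a plain whitespace-separated query with no quoted phrases or advanced operators.

```python
def contradictory_terms(query: str) -> set[str]:
    """Return the terms that a query both asks for and excludes.

    Assumption: terms are separated by whitespace and an excluded term
    is written with a leading minus sign, as in "something -something".
    Quoted phrases and advanced operators are not handled.
    """
    included, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])   # strip the minus sign
        else:
            included.add(token)
    return included & excluded


print(contradictory_terms("something -something"))
print(contradictory_terms("hackers -golf"))
```

A non-empty result means the query cancels itself out and is not worth sending.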
Remember, unless you use an advanced operator, your search term can appear anywhere on the page, including the title, URL, text, and even anchors. Get out of the habit of combining them before you get into the habit of using them! However, this query suffers from an ordering problem, a fairly common problem that can really throw off some narrow searches. By changing the query to allinanchor: URL modification, discussed in Chapter 1, can provide you with lots of options for modifying a previously submitted search, but advanced operators are better used within a query.
As such, they should be the tools used by the good guys when considering the protection of Web-based information.
Most of the operators can be used in combination, the most notable exceptions being the allintitle, allinurl, allinanchor, and allintext operators. Advanced Google searchers tend to steer away from these operators, opting to use the intitle, inurl, and link operators to find strings within the title, URL, or links to pages, respectively.
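As a quick way to catch the mixing problem described above, a query string can be scanned for the "all" operators before it is used. This is a hedged sketch rather than a description of how Google parses queries: the helper name mixing_problems and the simple "word followed by a colon" pattern are assumptions, and the pattern will also treat something like http: at the start of a term as an operator.

```python
import re

# The four operators the text singles out as not mixing well with others.
ALL_OPERATORS = {"allintitle", "allinurl", "allinanchor", "allintext"}

# Assumption: an operator is a run of letters at the start of a term,
# optionally preceded by a minus sign, and followed by a colon.
OPERATOR_RE = re.compile(r"(?<!\S)-?([a-z]+):", re.IGNORECASE)


def mixing_problems(query: str) -> list[str]:
    """Return warnings for queries that combine an all* operator with others."""
    found = [m.group(1).lower() for m in OPERATOR_RE.finditer(query)]
    all_ops = [op for op in found if op in ALL_OPERATORS]
    if all_ops and len(found) > 1:
        return [f"{all_ops[0]}: is combined with other operators"]
    return []


print(mixing_problems("allintitle:index of site:example.com"))
print(mixing_problems("intitle:index.of inurl:admin"))
```

An empty list means the check found nothing to complain about; it does not mean the query is a good one.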
Allintext, used to locate all the supplied search terms within the text of a document, is one of the least used and most redundant of the advanced operators. Filetype and site are very powerful operators that search specific sites or specific file types. When crawling Web pages, Google generates specific information such as a cached copy of a page, an information snippet about the page, and a list of sites that seem related.
This information can be retrieved with the cache, info, and related operators, respectively. To search for the author of a Google Groups document, use the author operator.
The phonebook series of operators return business or residential phone listings as well as maps to specific addresses. The stocks operator returns stock information about a specific ticker symbol, whereas the define operator returns the definition of a word or simple phrase.
Use the advanced search form at groups. Do other search engines provide some form of advanced operator? Yes, most other search engines offer similar operators. Yahoo is the most similar to Google, in my opinion. This might have to do with the fact that Yahoo once relied solely on Google as its search provider. The operators available with Yahoo include site (domain search), hostname (full server name), link, url (show only one document), inurl, and intitle.
The Yahoo advanced search page offers other options and URL modifiers. You can dissect the HTML form at http: AltaVista offers domain, host, link, title, and url operators. The AltaVista advanced search page can be found at www. Where can I get a quick rundown of all the advanced operators? Check out www. This page describes various operators and is a good summary of this chapter. It is assumed that new operators are listed on this page when they are released, but keep in mind that some operators enter a beta stage before they are released to the public.
Sometimes these operators are discovered by unsuspecting Google users throwing around the colon separator too much. How can I keep up with new operators as they come out? What about other Google-related news and tips? There are quite a few Web sites that we frequent for news and information about all things Google.
The first is http: Not endorsed or sponsored by Google, this site is often more pointed, and sometimes more insightful. This is one of the best places to get news about new features and capabilities Google has to offer. Google Alerts sends you e-mail when there are updates to a search term.
You could use this tool to uncover new operators by alerting on a search term such as google advanced operator site: Last but not least, watch Google Trends at www. You might just catch a few Google hackers in the wild. Is the word order in a query significant? If you are interested in the ranking of a site, especially which sites float up to the first few pages, order is very significant. Google will take two adjoining words in a query and try to first find sites that have those words in the order you specified.
Switching the order of the words still returns the same exact sites (unless you put quotes around the words, forcing Google to find the words in that order), regardless of which order you provided the terms in your query. To get an idea of how this works, play around with some basic queries such as food clothes and clothes food. The list could be endless.
Google is so hard to keep up with. How about this one: Throw view: To find out where all the hackers in the world are, try hackers view: Chapter 3 Google Hacking Basics Solutions in this chapter: We present this information to help you become better informed about their motives so that you can protect yourself and perhaps your customers.
I suggest you at least click a few various cached links from the Google search results page before reading further. That anonymity only goes so far, and there are some limitations to the coverage it provides. Google can, however, very nicely veil your crawling activities to the point that the target Web site might not even get a single packet of data from you as you cruise the Web site.
The simple fact is that if Google crawls a page or document, you can almost always count on getting a copy of it, even if the original source has since dried up and blown away.
The banner shown in Figure 3. Figure 3. The cache banner in Figure 3. To capture this data, tcpdump is simply run as tcpdump -n. Your installation or implementation of tcpdump might require you to also set a listening interface with the -i switch. The output of the tcpdump command is shown in Figure 3. This is a port 80 Web conversation between our browser machine This is the type of traffic we should expect from any transaction with Google, but the beginning of the capture reveals another port 80 Web connection to The connection to this server can be explained by rerunning tcpdump with more options specifically designed to show a few hundred bytes of the data inside the packets as well as the headers.
The partial capture shown in Figure 3. Shift-reloading forces most browsers to contact the Web host again, not relying on any caches the browser might be using. [Figure: partial tcpdump capture in hex, showing an HTTP request with its Accept-Language, Accept, and Referer headers.] Lines 0x30 and 0x40 show that we are downloading (via a GET request) an image file, specifically a JPG image, from the server.
Farther along in the network trace, a Host field reveals that we are talking to the www. Because of this Host header and the fact that this packet was sent to IP address This means that when viewing the cached copy of the Phrack Web page, we are pulling images directly from the Phrack server itself.
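The same leak can be spotted without a packet sniffer by looking at the HTML of the cached copy and listing the hosts a browser would contact on its own while rendering it. The sketch below is one rough way to do that with Python's standard library; the class and function names are made up for illustration, the sample markup is invented, and the list of auto-loaded tags is an approximation rather than a complete account of browser behavior.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ResourceHostFinder(HTMLParser):
    """Collect the hosts of resources a browser fetches without a click."""

    # Assumption: these tag/attribute pairs cover the common auto-loaded
    # resources. Ordinary <a href> links are ignored because they are only
    # fetched when the user clicks them.
    AUTO_LOADED = {("img", "src"), ("script", "src"),
                   ("link", "href"), ("iframe", "src")}

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in self.AUTO_LOADED and value:
                host = urlparse(value).hostname  # None for relative URLs
                if host:
                    self.hosts.add(host)


def external_hosts(html: str) -> set[str]:
    """Return every host named in an absolute, auto-loaded resource URL."""
    finder = ResourceHostFinder()
    finder.feed(html)
    return finder.hosts


sample = (
    '<html><body>'
    '<img src="http://www.example.org/logo.jpg">'
    '<a href="http://other.example/">a link</a>'
    '<img src="/local.gif">'
    '</body></html>'
)
print(external_hosts(sample))
```

Any host in the result other than the cache server itself is a server that will see a request from your browser when the cached page loads.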
If we were striving for anonymity by viewing the Google cached page, we just blew our cover! This means that not only were we not anonymous, but our browser informed the Phrack Web server that we were trying to view a cached version of the page! So much for anonymity. Penetration testers use proxy servers to emulate what a real attacker would do during an actual break-in attempt. Locating working, high-quality proxy servers can be an arduous task, unless of course we use a little Google hacking to do the grunt work for us!
To locate proxy servers using Google, try these queries: Nothing like Googling for proxy servers! Remember, though, that there are lots of places to obtain proxy servers, such as the atomintersoft site or the samair. Try Googling for those! The cache banner does, however, provide an option to view only the data that Google has captured, without any external references.
As you can see in Figure 3. This parameter forces a Google cache URL to display only cached text, avoiding any external references. Pulling it all together, we can browse a cached page with a fair amount of anonymity without a proxy server, using a quick cut and paste and a URL modification. As an example, consider a query for site: Instead of clicking the cached link, we will right-click the cached link and copy the URL to the Clipboard, as shown in Figure 3. Browsers handle this action differently, so use whichever technique works for you to capture the URL of this link.
The URL should now look something like http: Press Enter after modifying the URL to load the page, and you should be taken to the stripped version of the cached page, which has a slightly different banner, as shown in Figure 3. Unfortunately, the stripped page does not include graphics, so the page could look quite different from the original, and in some cases a stripped page might not be legible at all.
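The cut-and-paste step can also be scripted. The sketch below adds one query parameter to a copied URL using Python's standard library. Treat the specifics as assumptions: the helper name with_param is invented, the host cache.example and the identifier abc123 are placeholders, and the parameter name strip with the value 1 is a guess at the modifier described above, since the exact string is not shown in this text.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url: str, name: str, value: str) -> str:
    """Return url with the query parameter name set to value.

    Any existing parameter with the same name is replaced, so calling
    the function twice does not duplicate it.
    """
    parts = urlsplit(url)
    pairs = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != name]
    pairs.append((name, value))
    # Keep ":" and "/" readable instead of percent-encoding them.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe=":/")))


# Placeholder URL for illustration only, not a real cache address.
cached = "http://cache.example/search?q=cache:abc123:www.example.com/+peeps&hl=en"
print(with_param(cached, "strip", "1"))
```

Rebuilding the query from parsed pairs, rather than gluing text onto the end, avoids duplicate parameters and stray separators when the copied URL already carries several options.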
Notice the search terms we used listed after the base page URL.
Simply add or subtract words and press Enter, and Google will highlight your terms! For example, to add fear and risk to the list of highlighted words, simply add them into the URL, making it read something like www. Did you ever know that Marshmallow Peeps actually feel fear? Just ask Google. Directory Listings A directory listing is a type of Web page that lists files and directories that exist on a Web server.
Designed to be navigated by clicking directory links, directory listings typically have a title that describes the current directory, a list of files and directories that can be clicked, and often a footer that marks the bottom of the directory listing. Each of these elements is shown in the sample directory listing in Figure 3. Unfortunately, directory listings have many faults, specifically: They do not prevent users from downloading certain files or accessing certain directories.
This task is often left to the protection measures built into the Web server software or third-party scripts, modules, or programs designed specifically for that purpose.
All this adds up to a deadly combination. Locating Directory Listings The most obvious way an attacker can abuse a directory listing is by simply finding one!
Locating directory listings with Google is fairly straightforward. An obvious query to find this type of page might be intitle: Unfortunately, this query will return a large number of false positives, such as pages with the following titles:
Index of Native American Resources on the Internet
LibDex - Worldwide index of library catalogues
Iowa State Entomology Index of Internet Resources
Judging from the titles of these documents, it is obvious that not only are these Web pages intentional, they are also not the type of directory listings we are looking for.
These queries indeed reveal directory listings by not only focusing on index. This is easily accomplished by adding the name of the directory to the search query.
This technique can be extended to just about any kind of file by keying in on the index. This technique will generally find more results than the somewhat restrictive index. Server Versioning One piece of information an attacker can use to determine the best method for attacking a Web server is the exact software version.
An attacker could retrieve that information by connecting directly to the Web port of that server and issuing a request for the Hypertext Transfer Protocol (HTTP) Web headers. It is possible, however, to retrieve similar information from Google without ever connecting to the target server. One method involves using the information provided in a directory listing. Notice that some directory listings provide the name of the server software as well as the version number. An adept Web administrator could fake these server tags, but most often this information is legitimate and exactly the type of information an attacker will use to refine his attack against the server.
The listing shown in Figure 3. This query will locate all directory listings on the Web with index of in the title and server at anywhere in the text of the page. This might not seem like a very specific search, but the results are very clean and do not require further refinement. Notes from the Underground… Server Version?
Who Cares? Although server versioning might seem fairly harmless, realize that there are two ways an attacker might use this type of information. If the attacker has already chosen his target and discovers this information on that target server, he could begin searching for an exploit (which may or may not exist) to use against that specific software version.
Inversely, if the attacker already has a working exploit for a very specific version of Web server software, he could perform a Google search for targets that he can compromise with that exploit. An attacker, armed with an exploit and drawn to a potentially vulnerable server, is especially dangerous. Even small information leaks like this can have big payoffs for a clever attacker. This query would find pages like the one listed in Figure 3. As shown in Table 3.