-
A query is anything you enter in a search box (keywords, numbers, operators).
-
Much can be learned by keeping a search log (it is very easy to forget which queries worked especially well).
-
There are only three methods of searching on the Internet:
-
Subject Directory (like using the table of contents in a book),
-
Browsing (like flipping through the pages),
-
Search Engine (like using the index in a book).
-
How much you know about a subject and how familiar you are with search tools determine which method of searching is the most efficient.
-
Choose Browsing when you know a lot about a subject to start with and have some specific Web sites in mind.
-
Choose a Subject Directory when you know next to nothing about a subject.
-
Choose a Search Engine when you know how to build an effective query.
-
Most often, a search engine is the most efficient means of searching.
-
A one-word query is the weakest query.
-
Adding more than four keywords to a query reduces the probability it will return relevant information.
-
An optimal query usually consists of 2 to 4 carefully chosen keywords or numbers.
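As a rough illustration of the keyword-count guidance above, the sketch below labels a query by its length. The function name `rate_query` and its labels are hypothetical, and it simply counts whitespace-separated words, so operators and quoted phrases are not handled.

```python
def rate_query(query: str) -> str:
    """Label a query by its number of whitespace-separated keywords."""
    count = len(query.split())
    if count == 0:
        return "empty"
    if count == 1:
        return "weak: one-word query"
    if count <= 4:
        return "good: 2 to 4 keywords"
    return "risky: more than 4 keywords"


print(rate_query("Lincoln"))
print(rate_query("Lincoln Gettysburg 1863"))
```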
-
On average, there is only a 1 in 5 probability that any keyword you use in a query was also used in the information you hope to find. Stated another way, 4 out of 5 keywords used are not likely to produce the desired results.
-
When 3 keywords are used, the odds may be as poor as 1 in 125;
-
When 4 keywords are used, the odds may be as poor as 1 in 625;
-
When 5 keywords are used, the odds may be as poor as 1 in 3,125.
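The odds above appear to come from multiplying the 1 in 5 chance once for each keyword, which assumes each keyword matches independently. A minimal sketch under that assumption (the function name `match_odds` is hypothetical):

```python
from fractions import Fraction


def match_odds(num_keywords: int, per_keyword: Fraction = Fraction(1, 5)) -> Fraction:
    """Chance that every keyword matches, assuming independent matches."""
    return per_keyword ** num_keywords


for n in range(1, 6):
    print(f"{n} keyword(s): 1 in {match_odds(n).denominator}")
# The line for 3 keywords reads "3 keyword(s): 1 in 125".
```

By the same arithmetic, a 2-keyword query comes out at 1 in 25.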
-
Untrained searchers (particularly students) tend either to search with only one term or to use too many terms.
-
The best keywords tend to be proper nouns and numbers (even when the number is spelled out).
-
Many times there is a better (more specific) term that can be used (unless the keyword is already a proper noun or number).
-
Verbs make poor keywords--turn them into nouns to improve their performance.
-
Important concepts may make poor keywords for other reasons: multiple meanings (a search engine matches only words, not meanings), frequency of occurrence (too frequent is usually the problem), specificity of meaning (too specific or not specific enough).
-
Search engines ignore stop words--common parts of speech such as pronouns, conjunctions, prepositions.
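To see what is left of a query once stop words are ignored, consider the sketch below. The `STOP_WORDS` set is a small hypothetical sample (a few pronouns, conjunctions and prepositions); real search engines keep their own lists, and this sketch does not treat an uppercase OR as an operator.

```python
# Hypothetical sample list; actual stop-word lists vary by search engine.
STOP_WORDS = {
    "it", "he", "she", "they", "we",       # pronouns
    "and", "but", "or",                    # conjunctions
    "of", "in", "on", "to", "for", "with", "by", "at",  # prepositions
    "a", "an", "the",                      # articles, also commonly ignored
}


def strip_stop_words(query: str) -> list[str]:
    """Return the words of a query that are not in the sample stop list."""
    return [word for word in query.split() if word.lower() not in STOP_WORDS]


print(strip_stop_words("the history of the telescope"))
# prints ['history', 'telescope']
```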
-
Searches are often effective without using operators (other than AND). When in doubt, don't use operators.
-
It is unnecessary to type in AND (+) when a space is included between keywords.
-
Word order is usually not important unless the objective is to return an exact match (e.g. Los Angeles vs. Angeles Los).
-
Use "quotes" around keywords only when there is no doubt that this is exactly the phrase desired.
-
Placing quotes around one term has no effect (except in the case of stop words).
-
Use NOT (-) before a keyword only when there is little likelihood that it will eliminate desired information from the results.
-
Use OR only between equivalent terms (e.g., freedom OR liberty).
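The operator tips above can be sketched as a small query builder. The function `build_query` and its parameter names are hypothetical, and the output follows only the notation used here (a space for AND, quotes for an exact phrase, - for NOT, OR between equivalent terms); individual search engines may treat these differently.

```python
def build_query(keywords, phrase=None, exclude=(), alternatives=()):
    """Assemble a query string from keywords and optional operators."""
    parts = list(keywords)                    # a space between keywords acts as AND
    if phrase:
        parts.append(f'"{phrase}"')           # exact phrase, word order matters
    if alternatives:
        parts.append(" OR ".join(alternatives))  # equivalent terms only
    parts.extend(f"-{term}" for term in exclude)  # NOT
    return " ".join(parts)


print(build_query(["jaguar", "habitat"], exclude=["car"]))
# prints: jaguar habitat -car
print(build_query(["speech"], phrase="Los Angeles", alternatives=["freedom", "liberty"]))
# prints: speech "Los Angeles" freedom OR liberty
```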
-
The only way to search the live Internet is to Browse (click links).
-
All subject directories and search engines search only databases.
-
Databases are compiled 24/7 by software programs known as crawlers or spiders.
-
Database caches may be refreshed anywhere from a matter of hours to a matter of weeks. Sites with blogs tend to be refreshed more quickly.
-
Google's Cache is the copy of the information indexed in its database. (The Title and URL link to the live Internet, which explains why there may be a discrepancy between search results and the live page.)
-
For any search, Google returns up to 85 pages of results with 10 results per page.
-
Information that is indexed in a different database than the one being searched is often referred to as the Deep Web or Invisible Web.
-
Information that is indexed in the same database but buried in the results retrieved is another type of Deep Web search problem.
-
Commercial search engines (Google, Yahoo!, MSN, etc.) are good at finding other databases: combine the subject matter with keywords like database OR information OR repository OR almanac, etc.
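A minimal sketch of the database-finding query described above. The function name `database_query` is hypothetical, and the default hint words are just the four named in the tip.

```python
def database_query(subject, hints=("database", "information", "repository", "almanac")):
    """Combine a subject with OR-joined words that tend to appear on database sites."""
    return f"{subject} " + " OR ".join(hints)


print(database_query("earthquakes"))
# prints: earthquakes database OR information OR repository OR almanac
```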
-
A commercial search engine is usually a good place to do an initial search (unless a specific database or Web page is already known).
-
"Page Not Found" information may be retrieved by searching the Wayback Machine (archive.org) using the URL of the lost page.
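One way to look up a lost page is to build the archive address directly. The sketch below assumes the `https://web.archive.org/web/*/` address pattern, which lists the captured copies of a URL at the time of writing; the function name `wayback_lookup_url` and the example address are hypothetical, and no request is actually sent.

```python
def wayback_lookup_url(lost_url: str) -> str:
    """Build an archive.org address listing saved copies of a page (assumed pattern)."""
    return "https://web.archive.org/web/*/" + lost_url


print(wayback_lookup_url("http://www.example.edu/old/page.html"))
```

Paste the resulting address into a browser to see which dates, if any, were archived.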
-
Snippets (or abstracts) are the results returned by a database search; snippets contain valuable clues for homing in on desired information.
-
The process of 'homing in' on information requires careful scanning, paying attention to keywords, including unfamiliar terms.
-
Use the FIND command (shortcut: Ctrl + F) to locate terms on a page efficiently.
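For text that has already been saved or copied, the same idea can be sketched in a few lines. This is only an analogy to the browser's FIND command, not how any browser implements it; the function name `find_term` is hypothetical, and the match ignores upper and lower case.

```python
def find_term(page_text: str, term: str) -> list[int]:
    """Return the starting position of every case-insensitive match of term."""
    if not term:
        return []
    haystack, needle = page_text.lower(), term.lower()
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


print(find_term("Census data: the census counts people.", "census"))
# prints [0, 17]
```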
-
Efficient searching usually involves replacing keywords with more specific keywords found in snippets and web pages.
-
The most likely reason a search fails is that not enough time was spent paying attention to clues.
-
When a search produces few results, try using a hypernym, a broader and less specific term (e.g., replacing 'musicals' with 'productions').
-
When a search produces a lot of results (the usual case), try using a hyponym, a narrower and more specific term (e.g., replacing 'number' with 'statistic').
-
Each set of results is an opportunity to decide if the information retrieved answers the original question or whether the search needs to be refined (the 'revision decision').
-
Two broad questions help determine the quality of the information retrieved: "is the source of the information credible?" and "is the content of the information credible?"
-
Four criteria are helpful in determining the credibility of a source: author, publisher, objectivity (bias) and links found in the information.
-
Four criteria are helpful in determining the credibility of content: evidence, accuracy, date and external links to the information.
-
Evaluation should include a minimum of three criteria and at least one from each category.
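The minimum for an evaluation can be written as a simple check. The short criterion labels and the function name `meets_minimum` are hypothetical shorthand for the eight criteria listed above ("links out" for links found in the information, "links in" for external links to it).

```python
SOURCE_CRITERIA = {"author", "publisher", "objectivity", "links out"}
CONTENT_CRITERIA = {"evidence", "accuracy", "date", "links in"}


def meets_minimum(checked: set[str]) -> bool:
    """True if at least three criteria were checked, with one or more from each category."""
    source = checked & SOURCE_CRITERIA
    content = checked & CONTENT_CRITERIA
    return len(source) + len(content) >= 3 and bool(source) and bool(content)


print(meets_minimum({"author", "publisher", "date"}))         # prints True
print(meets_minimum({"author", "publisher", "objectivity"}))  # prints False
```

The second call fails because all three criteria come from the source category.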
-
Searching for links to a page (link:) is a powerful way to determine who thinks the information is worth linking to. The results are essentially an unsolicited reference list from the Internet.
-
Information about author, publisher and date may be found using a variety of investigative search techniques, which include truncation (shortening the URL), searching the Web site and searching other databases using information found on the page.
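Truncation can be sketched as removing one path segment at a time from the end of the URL, then visiting each shorter address in turn. The function name `truncate_url` and the example address are hypothetical; the parsing uses Python's standard `urllib.parse` module, and any query string or fragment is dropped.

```python
from urllib.parse import urlsplit, urlunsplit


def truncate_url(url: str) -> list[str]:
    """Return progressively shorter versions of a URL, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    shorter = []
    while segments:
        segments.pop()                       # drop the last path segment
        path = "/" + "/".join(segments)
        if segments:
            path += "/"                      # treat what remains as a folder
        shorter.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return shorter


for candidate in truncate_url("http://www.example.edu/~jsmith/papers/search.html"):
    print(candidate)
# prints:
# http://www.example.edu/~jsmith/papers/
# http://www.example.edu/~jsmith/
# http://www.example.edu/
```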
-
The domain of a Web site may not be an accurate indicator of reliability (anyone can purchase .org for example).
-
The ~ sign in a URL usually indicates that the site is a personal web page.
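A quick check for the ~ sign might look like the sketch below. The function name `has_tilde_segment` is hypothetical; the check also decodes `%7E`, the encoded form of ~ that sometimes appears in addresses. A match is only a hint about the page, not proof.

```python
from urllib.parse import unquote, urlsplit


def has_tilde_segment(url: str) -> bool:
    """True if any path segment of the URL starts with ~ (or its encoded form %7E)."""
    path = unquote(urlsplit(url).path)
    return any(segment.startswith("~") for segment in path.split("/"))


print(has_tilde_segment("http://www.example.edu/~jsmith/index.html"))  # prints True
print(has_tilde_segment("http://www.example.edu/faculty/"))            # prints False
```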
-
Establishing credibility is best achieved through dialogue with others.
-
For students, ethical use is basically a matter of citing sources properly to avoid plagiarism; for educators, it is a matter of following fair use guidelines to avoid violating copyright law.
-
Paraphrasing is not a method of avoiding plagiarism. It must be accompanied by proper citation.
-
Fair use is decided in federal court and involves four criteria: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion copied, and the effect of the copying on the market for the original. A common rule of thumb is that copying up to 10% of a written work is acceptable as long as the 10% does not constitute the 'heart' of the work, but no fixed percentage is guaranteed to be safe.