Text to Moby Dick
Scripts to Shakespeare's plays
Text to Harry Potter
Text to The Lord of the Rings
If you are searching for information that can only be found in a book (or other copyrighted materials) published in the past 50 to 75 years, chances are you won't find it on the Internet. You might find the title and description. You might even find an excerpt. But you're not likely to find the full content.
Many publishers of these materials advertise their products on-line. They make it easy to purchase them on-line. But they seldom make the full content of the materials available on-line. You're much more likely to find what you're looking for in a library or bookstore (either locally or on-line).
If the book, pictures or music is more than 75 years old (and the copyright has not been renewed), there is a better chance that you will find it on the web. The more famous the item, the more likely you are to find it. Classics of literature, for example, are increasingly available on the web.
Online Learning Module: Traditional Sources
If you were searching for traditional print and media resources, you would probably ask, "Am I more likely to find the information in a book, a newspaper, a journal, a magazine, a catalog, a telephone book, or an atlas?" Then you might ask where you would be most likely to find that particular source--in a bookstore? Which bookstore? A library? Which library?
You can save yourself lots of time and frustration if you ask these two basic questions before you begin searching on the Internet.
Which kinds of online sources (eTexts, online magazines, online newspapers etc.) should I look for?
Where would I be most likely to find that particular online source?
Resources: Website Listings of Online Publications
Books: To find non-copyrighted books, you can conduct a general search for the title, author or related information using a Subject Directory or a Search Engine. Or you can visit one of the sites that specialize in providing on-line non-copyrighted books at no cost.
The Gutenberg Project has digitized more than 1500 texts, including works of light literature such as Alice in Wonderland, Through the Looking-Glass, Peter Pan, and Aesop's Fables. Also included are classics like the Bible, the collected works of Shakespeare, Moby Dick, Paradise Lost, and many others. You can also find online versions of popular references such as Roget's Thesaurus, almanacs, encyclopedias, and dictionaries. The Gutenberg Project also includes a rich set of links to other eText sites.
Newspapers and Magazines: Many American as well as international newspapers now have web sites where the current edition of their papers can be read on-line without charge. Many also maintain archives that make it possible for you to search for past articles. Be advised, however, that viewing the full versions of these articles often involves a fee.
Newspapers:
Magazines:
Yahoo's listing of magazines by category
NewsTrawler (a more limited list of magazines)
Government Documents: The U.S. government publishes a wide range of information from governmental agencies, at no cost, on the Internet. This information is generally published by the many individual governmental agencies.
Government Documents:
University of Michigan's Government Documents site.
Federal Resources for Educational Excellence (FREE)
U.S. Government Printing Office Multi-Database search of US government documents
Search engines work by matching patterns of words you choose (your search terms or keywords) with patterns of words in the documents they have indexed. That means that many different patterns of keywords can be used to retrieve the same document. However, most of those patterns will also retrieve other documents that don't have anything to do with what you are looking for.
To avoid sifting through a sea of irrelevant items, focus on keywords that might be unique to the document(s) you really want.
Words that are specific to a discipline's professional vocabulary* are particularly powerful and more likely to yield relevant documents. The more unique the terms you choose, the fewer irrelevant documents you are likely to get.
*If you are looking for a specific Macintosh computer, rather than searching for "Macintosh" or "Apple", search for "imac" or "powerbook".
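The effect of choosing a more specific keyword can be illustrated with a toy matcher. This is only a sketch: the four "documents" and the search function are made up for illustration, and real search engines index and rank pages in far more sophisticated ways.

```python
import re

# Hypothetical mini-collection standing in for a search engine's index.
DOCS = {
    "apple-pie": "how to bake an apple pie with fresh fruit",
    "apple-history": "apple computer company history and the macintosh",
    "mac-raincoat": "a macintosh is a waterproof raincoat",
    "imac-review": "review of the apple imac desktop computer",
}

def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(query, docs=DOCS):
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    return sorted(doc_id for doc_id, text in docs.items()
                  if terms <= tokenize(text))

print(search("apple"))      # broad term: several documents, some irrelevant
print(search("macintosh"))  # ambiguous term: computers and raincoats
print(search("imac"))       # specific term: only the relevant document
```

In this small collection the broad and ambiguous terms pull in the pie recipe and the raincoat, while the specific term retrieves only the page about the computer.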
Online Learning Module: Using Keywords Effectively
Quotation marks " " force a search engine to find the exact phrase within the quotations.
The plus + operator (Boolean AND) ensures that a keyword will be in the documents you find.
The minus - operator (Boolean NOT) ensures that the keyword will not be in the documents returned by your search.
The quotation marks ( " " ) operator will focus your query more precisely by creating a phrase of two or more terms that will be treated like a single term. Only documents containing all the terms in exactly the same order as in your phrase will be returned for your query--so be certain that the phrase you are looking for is likely to be used on the page you need. The most common use of quotation marks is to designate people's names or other proper nouns; however, a unique combination of proper nouns is often specific enough to be used without quotation marks. Quotation marks are important in situations where one or more of the words alone is likely to retrieve many documents not related to your topic (e.g., words with multiple meanings).
Combining operators brings power to your research. When you enter keywords into the query field of most search engines, the system will find documents containing any of your keywords. As a result, you get documents, sometimes highly ranked documents, that do not contain all your search terms. When a search engine returns millions of hits rather than hundreds, it is time to consider using operators.
Currently, most search engines treat a space between terms as the + operator. The real power in a query with more than one word depends on how specific or unique each term is. Too many keywords (resulting in lots of ANDs) actually eliminate many pages that contain relevant information but not all the search terms.
Another way to make your search more precise is to use the minus ( - ) operator. This operator can be used to exclude terms from the results of your query. When you searched earlier you probably got hits for a number of companies seeking to sell you various artists' versions of the song "America the Beautiful". Try this syntax to eliminate company websites: +"America the Beautiful" -com . Be certain that you don't eliminate a term that could be contained on a page with information you need. Use the minus operator sparingly.
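How the three operators combine can be sketched in a few lines of code. The functions below are hypothetical and deliberately simplified: a bare term is treated as required (as if a space meant +), and -com is treated as an ordinary excluded word. Actual search engines may interpret the same syntax differently.

```python
import re

def parse_query(query):
    """Split a query into required and excluded terms or phrases.

    Supports "quoted phrases", +term and -term.
    """
    required, excluded = [], []
    # Each token is an optional +/- sign followed by a quoted phrase or a word.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        text = (phrase or word).lower()
        (excluded if sign == "-" else required).append(text)
    return required, excluded

def normalize(text):
    """Lowercase and drop punctuation so text compares word by word."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query, document):
    """True if the document has every required term and no excluded term."""
    required, excluded = parse_query(query)
    padded = f" {normalize(document)} "

    def contains(term):
        # Padding with spaces makes this a whole-word (or whole-phrase) test.
        return f" {normalize(term)} " in padded

    return (all(contains(t) for t in required)
            and not any(contains(t) for t in excluded))

query = '+"America the Beautiful" -com'
print(matches(query, "Lyrics and history of America the Beautiful"))
print(matches(query, "Buy America the Beautiful at musicstore.com"))
print(matches(query, "The beautiful parks of America"))
```

Only the first document satisfies the query: the second contains the excluded term, and the third contains the words of the phrase but not in the same order.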
Online Learning Module: Operators
Many words have more than one acceptable spelling (both American and British). Trying both spellings will broaden your search. So if you don't find what you are looking for on your first try, consider using spelling variations to broaden your search and explore further.
Common International Spellings: Colour vs. Color, Centre vs. Center, Theatre vs. Theater
It is also true that the documents on the Internet contain misspellings. Sometimes trying a popular misspelling will produce additional useful documents.
Common Misspellings: Shakespear vs. Shakespeare, Hazzard vs. Hazard, Tomarrow vs. Tomorrow
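If you want to try spelling variations systematically, a short script can list them for you. The sketch below uses only the word pairs given above; the GROUPS table and the spelling_variants function are illustrative names, not part of any search tool.

```python
from itertools import product

# Alternate spellings taken from the examples above; extend as needed.
GROUPS = [
    ("color", "colour"),
    ("center", "centre"),
    ("theater", "theatre"),
    ("shakespeare", "shakespear"),
    ("hazard", "hazzard"),
    ("tomorrow", "tomarrow"),
]
# Map every spelling to the full group it belongs to.
LOOKUP = {word: group for group in GROUPS for word in group}

def spelling_variants(query):
    """Return every version of the query with known alternate spellings swapped in."""
    choices = [LOOKUP.get(word, (word,)) for word in query.lower().split()]
    return [" ".join(combo) for combo in product(*choices)]

for variant in spelling_variants("Shakespeare theater"):
    print(variant)
```

Each printed line is a separate query to try. Words that are not in the table are left unchanged.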
Online Learning Module: Spelling
While the enormous size of the Internet makes it possible to gather documents from a wide variety of sources, sites that specialize in particular subjects or topics are often the most efficient and effective way to get what you want. This is particularly true if time is limited.
Searching specialized sites dedicated to your topic in depth is a powerful strategy. Once you find a promising resource, use the site's internal search capabilities or site map to look more deeply into the material.
Sites containing a lot of detailed information on specific subjects:
US Geological Survey (USGS): Good for information on earthquakes, floods, volcanoes, maps, and more.
PBS: The Public Broadcasting Service maintains a deep and rich site of materials that archives information on a diverse group of topics.
The Librarian's Index to the Internet is an in-depth subject directory created and maintained by professional librarians.
Online Learning Module: URLs: How do URLs Work?
Online Learning Module: Sitemaps: What are they and how do they work?
Successful searching is a process. If you are seeking complex information, rather than simple facts, searching will be a multi-step process.
Where and how will you look?
When using a human-edited Subject Index, you select a beginning subject category, and explore one or more sub-categories. Each step involves a decision. If you don't find what you seek, you retrace your steps and try a new approach.
Online Learning Module: Search Engines
Online Learning Module: Subject Indexes
How do I refine my search?
Rethinking your steps while searching is called "refining" your search strategy. Refining a search is like the game called Twenty Questions. Each time you ask a question and get a response, you eliminate possibilities and narrow your search for the correct answer. Each time you take a new step in the search process, you get new responses that may help you find your way to the information you are seeking.
As an introduction to refining your strategy 'on the go', consider the Online Learning Module: Search Box Strategy
There are three main ways of refining your search:
Alter the specific steps you are taking (Tactics)
Try a new step based on the results of the previous step
Systematically search the most highly ranked links.
Search more deeply in a particular site by using the sitemap or truncating the URL.
Limit your search to the most promising domain (.edu, .net, .com).
Online Learning Module: Sitemaps: What are they and how do they work?
Online Learning Module: URLs: How do URLs Work?
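Truncating a URL and checking a domain are mechanical steps, so they are easy to illustrate with Python's standard urllib.parse module. This is a minimal sketch; the function names and the example.edu address are placeholders chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def truncated_urls(url):
    """Return shorter and shorter versions of a URL, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    urls = []
    # Drop one trailing path segment at a time until only the host is left.
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        urls.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return urls

def in_domain(url, suffix):
    """True if the URL's host name ends with a suffix such as '.edu'."""
    host = urlsplit(url).hostname or ""
    return host.endswith(suffix)

page = "http://www.example.edu/science/earth/quakes.html"
for shorter in truncated_urls(page):
    print(shorter)
print(in_domain(page, ".edu"))
```

Each truncated address is a candidate starting point for exploring the rest of the site, although not every site serves a page at every directory level.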
Alter your search query (the terms and operators you use)
Adding, deleting, or substituting keywords or operators to get better results.
Using the plus ( + ) operator to demand terms be included in the search results.
Using the minus ( - ) operator to demand terms be excluded from the search results
Using quotation ( " " ) marks to request the entire phrase, not just one of the words in the phrase
Online Learning Module: What is a Query?
Online Learning Module: Operators
Alter your main approach to the search (Strategy)
Change the focus of your search based on analysis of the initial results (use the Search Box Strategy). Reconsider your choice of search engines.
Rethink or refine your main concept by looking in a human-edited Subject Index.
If you can't find what you are looking for with a Search Engine, consider the Invisible Web.
If you can't find what you're looking for in a newspaper database, try searching a magazine database.
Should you even be using the Internet? Are traditional resources better?
Online Learning Module: Search Box Strategy or Search Process
Online Learning Module: Search Engines
Online Learning Module: Subject Indexes
Online Learning Module: Invisible Web: How Can You Search It?
Online Learning Module: Traditional Sources
Search engines retrieve documents by matching the terms in your search query with terms that appear in the documents they have indexed. This matching is a mechanical process: terms must match exactly. That means that you have to find exactly the same terms used by the author of the document to get a match.
If your first search query doesn't retrieve what you are looking for, consider using new terms (also called keywords) to describe the same concept. One way to do this is to use synonyms for one or more of the terms in your query.
One of the best ways to find such synonyms is to look in the search results returned by the first query. This is sometimes called scanning the snippets. You can also open and scan promising documents. When writing about their subjects, people often use synonyms that can provide clues for search queries. Another way to generate synonyms is to use a thesaurus. Also, some search engines will provide a list of terms related to your search query along with the results of your search.
Quick Tips for finding better search terms:
Look in the documents returned by your first search. Find the documents that apply to your search and look for discipline-specific terms (vocabulary used by experts, such as "stellar cartography" or "seismic hazards").
If your search results are too broad, use hyponyms (more specific terms such as Ford, Chevrolet, or Toyota).
If your search results are too specific, use hypernyms (more general or broad terms, such as car, truck, or automobile).
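The idea of moving up or down a ladder of terms can be pictured as a small lookup table. The hierarchy below is hypothetical and hand-built from the example terms above; in practice you would consult a thesaurus or the related terms a search engine suggests.

```python
# Hypothetical hierarchy: each general term maps to its more specific terms.
HYPONYMS = {
    "automobile": ["car", "truck"],
    "car": ["ford", "chevrolet", "toyota"],
}

def narrower(term):
    """Return more specific terms (hyponyms) for a term, if any are known."""
    return HYPONYMS.get(term.lower(), [])

def broader(term):
    """Return more general terms (hypernyms) for a term, if any are known."""
    term = term.lower()
    return [general for general, specifics in HYPONYMS.items()
            if term in specifics]

print(narrower("car"))     # too many results? try one of these instead
print(broader("Toyota"))   # too few results? try the broader term
```
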
Online Learning Module: Synonyms
Online Learning Module: Hyper- and Hypo- Nyms
Online Learning Module: Nyms: How Lesser Known Nyms Help You Improve a Search
When you query a search engine, the engine attempts to match your keywords against all the text in a web document, no matter where in the document the text appears. This approach will maximize the number of documents you retrieve so you won't miss something relevant. However, it is sometimes more efficient and effective to look for matches with the text found in the most important parts of a web document.
Keywords that appear in the title of a page or in the actual URL (web address) of the site may indicate greater relevance to your search. Documents that contain your query terms in the title are more likely to be about the topic you are researching. After all, that's what titles are for.
Search engines often provide advanced operators that help you search specific parts of a webpage's format. Google offers several powerful format operators. Consider a few examples from Google's Advanced Operators page:
intitle:
If you include [intitle:] in your query, Google will restrict the results to documents containing that word in the title.
For instance, [intitle:google search] will return documents that mention the word "google" in their title, and mention the word "search" anywhere in the document (title or no). Note there can be no space between the "intitle:" and the following word.
inurl:
If you include [inurl:] in your query, Google will restrict the results to documents containing that word in the url (uniform resource locator, also called the web address).
For instance, [inurl:google search] will return documents that mention the word "google" in their url, and mention the word "search" anywhere in the document (url or no). Note there can be no space between the "inurl:" and the following word.
Using format operators is a powerful and specific way to focus your search. Why not try it yourself and see?
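To make the behavior described above concrete, here is a toy matcher that treats a page as a title, a URL, and body text. It is an illustration of the idea only, not a description of how Google implements these operators; the page dictionaries and the matches function are invented for the example.

```python
import re

def words(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches(query, page):
    """Check a page (a dict with 'title', 'url', 'body') against a query."""
    title, body = words(page["title"]), words(page["body"])
    url = page["url"].lower()
    for token in query.lower().split():
        if token.startswith("intitle:"):
            word = token[len("intitle:"):]
            ok = bool(word) and word in title
        elif token.startswith("inurl:"):
            word = token[len("inurl:"):]
            ok = bool(word) and word in url
        else:
            # An ordinary term may appear anywhere in the document.
            ok = token in (title | body)
        if not ok:
            return False
    return True

# Two made-up pages for illustration.
page_a = {"title": "Google Search Tips",
          "url": "http://www.example.org/tips",
          "body": "how to search better"}
page_b = {"title": "Search Tips",
          "url": "http://www.example.org/google-help",
          "body": "using google to search"}

print(matches("intitle:google search", page_a), matches("intitle:google search", page_b))
print(matches("inurl:google search", page_a), matches("inurl:google search", page_b))
```

The first query matches only the page with "google" in its title, and the second matches only the page with "google" in its address, even though both pages mention the word "search".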
Online Learning Module: Formats: How to search documents in non-HTML formats.
There are documents hidden in areas of the Internet that search engines simply can’t see. The search engine’s robotic ‘crawlers’ either miss or are locked out of these areas on the Internet. Behind the barriers lie treasure troves of quality information. Collectively this information is called the Invisible Web. There are many categories of invisible information missed by the popular search engines. The most common are: webpages that have been skipped by search engine crawlers, online database content, and password protected websites.
Searching the Invisible Web for information is a two-step process. First you search for online resources likely to hold the 'invisible' information. Once you find such a site, you go there and use local tools to search the site itself.
Without a search strategy, the size of the Invisible Web creates information overload. To search the Invisible Web you need a plan that helps you quickly find the most likely sources of quality information. As you develop a personal search strategy that is both flexible and focused, you become a more efficient researcher.
Strategies to Consider:
If you already know a site that has the kind of information you are looking for, go directly to the source. If you are looking for the population of a US city, instead of using a traditional search engine, search the database available at http://www.census.gov.
When seeking information on the Invisible Web, use a search engine to find websites with databases about your topic. Go to the website and query the database directly.
Use websites that list Invisible Web collections and provide easy access to search forms.
When visiting a site with Invisible Web resources, use the site's search tools to refine your search to the specific information you are looking for.
Invisible (or Deep) Web Resources
These sites lead you to invisible web content:
Invisible-web.net - The Invisible Web Directory, with over 1000 databases. This site guides you to hidden online database search forms.
Infomine - Scholarly Internet Resource Collections. Your guide to academic databases.
CompletePlanet - BrightPlanet claims to have 103,000 search sites organized into more than 4,000 subject headings.
Online Learning Module: Invisible Web: How Can You Search It?
Online Learning Module: The Opaque Web
Online Learning Module: Vanishing Web
©2008 Illinois Mathematics and Science Academy®
1500 W Sullivan Rd, Aurora IL 60506-1067 USA • +1 630-907-5000