AustLII

AustLII Guide to Legal Research on the Web


9. Using internet search engines to find law

9.1. Finding law using general (‘non-law’) resources

Legal information and sites targeted specifically at lawyers make up only a tiny portion of the web, both in Australia and internationally. This has two consequences for those interested in legal research. First, there may be a lot of valuable legal information ‘out there’ which is not on law-specific sites. Second, it is likely to be hard to find because of the far greater volume of non-law information surrounding it (‘noise’).

The tools for finding information in general (non-law) sources are essentially the same as for legal sources: intellectually constructed indexes/directories that are usually browsed hierarchically, and search engines that search over automatically constructed indexes of every web page. In both cases the scope of these resources may be worldwide, or may be limited to the Australian portion of the internet.

The intellectually constructed directories are rarely of use for legal research unless non-law material is sought, although some, such as Yahoo! Australia & NZ, are large enough to contain significant subsets of specific legal information.

The search facilities can be deceptive, because none of them search over the ‘real’ world wide web (in the sense of every word on every web page), but only over a representation of it that they have constructed for search purposes. There are a few main types of representations that these engines search (often in combination):

1. summaries of web pages submitted by users / owners;
2. summaries / reviews of web pages written by editors or indexers;
3. automatically constructed abstracts of web pages (eg the title and the first 256 characters on a page);
4. every word on every web page indexed, but with significant limits (such as robot exclusion standards) on which pages are indexed (AltaVista is the best known engine to do this);
5. a page ranking system, ie the popularity of websites/webpages based on the number of other sites linking to them (Google produces fast and accurate results using this mechanism);
6. combined search results from a number of search engines (meta search engines).
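As a rough illustration of representation types 3 and 5 in the list above, the following Python sketch builds an automatic abstract (the title plus the first 256 characters of page text) and counts inbound links per page. The function names and the input formats are assumptions made for this example only, and a raw link count is a deliberate simplification: it is not how Google or any other real engine computes its ranking.

```python
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the <title> text and the remaining text of one page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        else:
            self.text_parts.append(data)


def make_abstract(html, limit=256):
    """Type 3 (hypothetical helper): title plus first `limit` characters."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    # Collapse runs of whitespace so the character limit counts real text.
    text = " ".join(" ".join(parser.text_parts).split())
    return parser.title.strip(), text[:limit]


def inbound_link_counts(links):
    """Type 5, greatly simplified: distinct pages linking to each page.

    `links` is an iterable of (source, target) pairs; duplicate pairs
    and self-links are ignored.
    """
    counts = Counter()
    for source, target in set(links):
        if source != target:
            counts[target] += 1
    return counts
```

A search over abstracts of this kind will miss any term that appears only later in the page, which is one reason the results of such engines can be deceptive.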

This Chapter starts by reviewing those indexes and search facilities which concentrate on the ‘Australian internet’, and then moves to those which have a worldwide focus.

9.2. Australian Search Engines

To find a range of Australian search engines/directories go to the WorldLII >> Categories >> Countries >> Australia >> Research >> Australian Search Engines page in the WorldLII Catalog.

The WorldLII Catalog ‘Australian Search Engines’ page

9.3. Searching the ‘Australian internet’

Although the search engines/directories listed above depend to a large extent on users submitting URLs and descriptions for inclusion, some Australian search engines also traverse the web indexing pages that they find.

9.3.1. The Web Wombat

The Web Wombat <http://www.webwombat.com.au/> is a spider or web crawler, traversing the web to find and index every word it finds on pages (ie not just the existence of web sites but their content). Web Wombat is one of the leading locally-developed Australian search engines and directories. Information on its site indicates that it has the largest online database of searchable information on Australia, with references to more than 11.5 million documents. It also includes an Australian directory for Law Resources.

Searches using the Web Wombat, such as the ‘family law’ search below, do produce very relevant results - in this case, hundreds of relevant documents were found.

The Web Wombat results screen

Search tips

There no longer seems to be any manual or help pages available for the Web Wombat, so it is difficult to say with precision what types of searches will work best. Some tips and featured searches (examples of searches) are given if you click on ‘Advanced Search’.

Web Wombat automatically expands search terms, so you should only search for the ‘stems’ of words. For example, to search for cryptography, cryptographic etc, a search for ‘cryptograph’ is best;

There is no need to use search connectors like AND and OR, and they are ignored anyway. Just type in the search terms one after another. For example to search for documents concerning cryptography and privacy, a search for ‘cryptography privacy’ will suffice;

Phrases can be searched for using double quotation marks eg "freedom of information";

Web Wombat does rank the items it retrieves according to a simple rule of relevance. For example, if you search for 4 search terms, it will first display all those items containing all 4 search terms, then those containing 3, and so on. As this is a very simple method of ranking, you will often need to look at all items retrieved, and should not stop when you start seeing irrelevant items;

Searches can be either ‘search all Australia’ or limited to government or education websites;

To obtain an overview of your search results (which is necessary because of the ranking method described above), increase the ‘Number of results per page’ field so that 50 or 100 items are returned.
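The stem matching and ‘count of terms matched’ ranking described in the tips above can be sketched in Python as follows. This is only an illustration of the behaviour as described in this Guide, not Web Wombat's actual code: the function name, the prefix-matching rule and the alphabetical tie-break are all assumptions.

```python
def rank_by_terms_matched(query, documents):
    """Order documents by how many distinct query stems they contain.

    `documents` maps a document name to its text. A stem matches any
    word that starts with it, so 'cryptograph' matches 'cryptography'.
    Documents matching no stem are left out of the results.
    """
    # De-duplicate the stems while keeping the order they were typed in.
    stems = list(dict.fromkeys(s.lower() for s in query.split()))
    scored = []
    for name, text in documents.items():
        words = text.lower().split()
        matched = sum(
            1 for stem in stems if any(w.startswith(stem) for w in words)
        )
        if matched:
            scored.append((matched, name))
    # Most stems matched first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, matched) for matched, name in scored]
```

Because every document matching the same number of terms is treated as equally relevant, a long run of items with the same score gives no guidance about which to read first, which is why viewing 50 or 100 results at once is helpful.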

9.4. Indexes of the whole internet, worldwide

There are two main methods of finding material over the whole internet, worldwide (legal and non-legal materials): intellectually constructed directories/indexes, which attempt to classify research resources, and search engines, which allow users to specify their own search terms.

Search Engine Watch at <http://searchenginewatch.com/> provides extensive lists of all the major search engines on the web, their strengths and weaknesses, specialised search engines and much more information on searching in general.

Links to most of the commonly used directories and search engines are available at the WorldLII >> Categories >> Research >> Search Engines page.

The WorldLII Catalog ‘Search Engines’ page

All of the major worldwide internet indexes are valuable, but this Chapter will concentrate on the two internet search engines which may be the most useful general tools for finding legal resources: AltaVista and Google.

9.5. AltaVista

The AltaVista Search Company <http://www.altavista.com/> uses a web ‘spider’ or ‘robot’ to index every word on every web page that the spider accesses, thus allowing free text searches across the whole internet. It then attempts to rank all of the items that it retrieves according to their likely relevance to the search request (a more complex version of the ranking that AustLII does with ‘freeform’ searches). AltaVista provides a choice of two types of searches: ‘Basic search’ and ‘Advanced search’.

AltaVista - Basic search

AltaVista - Advanced search

Go to the Home > Help > Search page at <http://www.altavista.com/sites/help/search/default> for the search and help menus. The AltaVista help files for both types of searches are at Appendix 11.3.

AltaVista - Search help
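The idea of indexing every word on every page that a spider visits, so that free text searches become possible, can be illustrated with a small inverted index in Python. This is a generic sketch and makes no claim about how AltaVista is actually built; the function names, the word-splitting rule and the sample data are assumptions, and the relevance ranking step is left out entirely.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of URLs of the pages that contain it.

    `pages` maps a URL to the text the spider fetched from it.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search_all_terms(index, query):
    """Return the URLs of pages containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, for illustration only.
pages = {
    "http://example.org/foi": "Freedom of information legislation",
    "http://example.org/privacy": "Information privacy principles",
}
index = build_index(pages)
print(search_all_terms(index, "freedom information"))
```

Pages the spider never reaches, for example because a robot exclusion rule forbids it, never enter the index and so can never be found by any search.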

9.6. Google

Information from the site indicates that Google's index <http://www.google.com/> comprises more than 2 billion URLs and represents the most comprehensive collection of web pages on the internet. Google's search algorithm differs from those of other internet search engines: it ranks results by the popularity of the website or web page and by the proximity of your search terms. It certainly achieves fast and accurate results.

It has both a basic search interface and an advanced search facility.

Google - Basic search interface

Google - Advanced search facility

Google also offers an "I'm Feeling Lucky" button, which takes you directly to the highest ranked result for your search; cached copies of pages, which are useful when a website has moved; and a ‘similar pages’ search facility.

The Google Search Help files for both types of searches are available in the Appendices.