Searching Mechanisms
1. How do search engines such as Alta Vista differ from information directories?
I used to use Alta Vista before the year 2000. It was, and probably still is, a good service. I may go back to using it, because I have concerns about the privacy implications of using some other search engines. Nope, just tried it and didn’t like it.
Doctorow describes AltaVista and Lycos as ranking “documents that matched our keywords” (Doctorow, 08 March, 2002).
He also goes on to say: “AltaVista tried to get computers to do both the repetitive parts (capturing billions of documents) and the creative parts (figuring out what the documents are about). This yielded the largest collection of randomly organized documents in the world, a Web-accessible version of a library where all the books have been re-shelved by axe-grinding illiterates who wanted to make sure that no matter what you were looking for, you’d find porn.” (Doctorow, 08 March, 2002)
“Altavista is a hybrid engine. It independently gathers results from its own index of 550 million pages, and supplements this with data from LookSmart and listings from Overture.” (Metamend, n.d.)
Search engines catalog websites and articles in different ways. The following list summarises many of them:
- By using an automated service that browses the web and indexes the words used within each site.
- Using people to catalog sites. As long as the cataloging is done responsibly, this is probably the most reliable way for search engines to create meaningful and accurate site listings. The downside is that it is labour intensive and therefore time consuming, and it cannot keep pace with the growth of the internet.
- Using a combination of automated services and human cataloging.
- By using data mining algorithms in combination with automated indexing to sort sites by how relevant they are considered to be to a word-based search.
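As a rough illustration of the first approach in the list above (automatically indexing the words used within a site), here is a minimal sketch of an inverted index in Python. The page URLs and text are made up, and `build_index` and `search` are hypothetical names of my own. It is not how AltaVista or any other real search engine is implemented.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up pages, for illustration only.
pages = {
    "http://example.com/a": "Spiders crawl the web",
    "http://example.com/b": "Directories are compiled by people",
    "http://example.com/c": "People and spiders both catalog the web",
}
index = build_index(pages)
print(sorted(search(index, "spiders web")))
```

Looking up a keyword is then just a dictionary lookup, which is why an engine can answer a query without re-reading every page. Ranking the matching pages is a separate problem that this sketch does not attempt.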
2. What is a spider? What does it do?
In the context of search engines, a spider is an automated program that downloads the pages or content within a site so that a search engine can later index the text and media they contain. The indexed text and media is then cataloged, and that information is used to match searches based on the keywords in the query.
Wikipedia describes a spider, or web crawler, as “a computer program that browses the World Wide Web in a methodical, automated manner” (Wikipedia, n.d.).
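To make the idea more concrete, here is a minimal sketch of a spider in Python. It assumes a `fetch` function that returns the HTML for a URL (or `None` if the page cannot be retrieved). In the example, `fetch` is just a lookup in a made-up, in-memory site, so nothing is downloaded. The names `LinkParser`, `crawl` and `fake_site` are my own, for illustration only.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl; returns {url: html} for later indexing."""
    seen = {start_url}
    queue = deque([start_url])
    downloaded = {}
    while queue and len(downloaded) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or unreachable
            continue
        downloaded[url] = html
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return downloaded


# A made-up three-page site held in memory, so no network is needed.
fake_site = {
    "http://example.com/": '<a href="/about">About</a> <a href="/news">News</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/news": '<a href="/missing">Broken link</a>',
}
pages = crawl("http://example.com/", fake_site.get)
print(sorted(pages))
```

The `seen` set is what stops the spider going round in circles when pages link back to each other. A real spider would also need to respect `robots.txt` and limit how quickly it requests pages, which this sketch leaves out.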
3. Describe a search situation where the requirement for recall is high?
In each of the following situations, missing a relevant document could be costly, so the search needs to return as many of the relevant documents as possible.
Health services would need a search service that draws on large amounts of statistical evidence and specific diagnostic information, including correct dosages, in order to diagnose a problem or support insightful analysis.
Law would require a large database of past cases, as well as the relevant laws and regulations.
Engineering would require specific information about the strengths and weaknesses of certain structures.
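Assuming the usual information-retrieval definitions, recall is the fraction of all relevant documents that a search actually returns, while precision is the fraction of returned documents that are relevant. The short Python sketch below works through one made-up example. The document names are invented and the functions are my own illustration.

```python
def recall(retrieved, relevant):
    """Fraction of all relevant documents that the search returned."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Fraction of the returned documents that are relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


# Made-up example: 4 relevant past cases exist, and the search returns 6 documents.
relevant = {"case1", "case2", "case3", "case4"}
retrieved = {"case1", "case2", "case3", "other1", "other2", "other3"}

print(recall(retrieved, relevant))     # 3 of 4 relevant cases found -> 0.75
print(precision(retrieved, relevant))  # 3 of 6 results are relevant -> 0.5
```

In a high-recall situation such as the legal example, the missing `case4` matters more than the three irrelevant results, so a searcher would accept lower precision in order to push recall closer to 1.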
4. What is a meta-search engine? Provide examples.
Wikipedia describes it as: “a search tool that sends user requests to several other search engines and/or databases and aggregates the results into a single list or displays them according to their source”. (Wikipedia, n.d.)
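The "aggregates the results into a single list" part of that definition can be sketched in a few lines of Python. The two engines here are stand-in functions returning made-up URLs, and ranking pages by how many engines returned them is only one possible way of merging, chosen for the sketch. Real meta-search engines may merge results quite differently.

```python
def metasearch(query, engines):
    """Send the query to every engine and merge the results into one list."""
    merged = []
    sources = {}  # url -> names of the engines that returned it
    for name, search in engines.items():
        for url in search(query):
            if url not in sources:
                sources[url] = []
                merged.append(url)
            sources[url].append(name)
    # Pages returned by more engines come first; sorted() is stable,
    # so ties keep the order in which they were first seen.
    merged = sorted(merged, key=lambda url: -len(sources[url]))
    return [(url, sources[url]) for url in merged]


# Stand-ins for real search engines, returning made-up results.
def engine_a(query):
    return ["http://example.com/1", "http://example.com/2"]


def engine_b(query):
    return ["http://example.com/2", "http://example.com/3"]


results = metasearch("spiders", {"EngineA": engine_a, "EngineB": engine_b})
for url, found_by in results:
    print(url, found_by)
```

Keeping the list of source engines alongside each URL also covers the other option in the definition, where results are displayed according to their source.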
Examples: (UC Berkeley, n.d.)
5. What is spamming?
My understanding is that spamming is illegal in many countries, including Australia. It is the practice of collecting email addresses from businesses and individuals and sending them email, or other kinds of messages, without their prior consent. It is an ethical and privacy issue: unless a company or individual has given you their details, obtaining their email address without consent and sending them unsolicited emails can be regarded as spam. The problem is that many people display their email addresses publicly. In the past spammers have argued that they obtained addresses legally, but the law now regards this harvesting of email addresses as illegal unless the information you send can be regarded as related to the recipient’s business in some way.
6. How can you get your site listed at major search sites, and how could you improve your site ranking?
Actually it’s a black art: nobody outside the search engine companies really knows, because they keep their ranking methods closely guarded or closed source. A lot of SEO sites will give you techniques, but those techniques tend to change.
The following ways help, but cannot guarantee success:
- Building clean, standards-compliant web sites (sites that pass the W3C standards), and that’s a really good thing!
- Not adding meta information to the site that isn’t true, such as adjusting the meta tags in your pages so that the keywords and description refer to text or information that doesn’t exist on the site.
- In the past adding meta-tags like keywords, description and author was regarded as the main way to get high rankings, but now this is not necessarily the case. It doesn’t hurt to do it and it may have other benefits.
- By submitting your site to one or more search engines. I’m not sure this helps, because it can’t be measured.
- By adding a sitemap to your site, to help spiders index your site.
- By using third party tools like Google Sitemaps and Google Analytics, which provide services for viewing the traffic visiting the site. The tradeoff is that you are invading the privacy of the users who visit your site by sending their details to Google without their consent.
- Paying to have your site at the top of the pile.
- Making sure you have something relevant to say.
- Not getting blacklisted for some of the things mentioned earlier, i.e. untruthful meta-tags.
- Updating your information on a regular basis so that you get return visitors.
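Of the items above, the sitemap is the easiest to show in code. The sketch below builds a small XML sitemap in the sitemaps.org format using Python's standard library. The URLs and dates are made up, and `build_sitemap` is a hypothetical helper name of my own. It is only a sketch and, like everything else in the list, it can't guarantee a better ranking.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML (as a string) for (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Made-up pages and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("http://example.com/", "2009-07-24"),
    ("http://example.com/about", "2009-07-20"),
])
print(sitemap_xml)
```

The resulting text would normally be saved as a file such as `sitemap.xml` on the site, so that spiders can find every page even if some pages are not linked from anywhere obvious.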
References
- Wikipedia. (n.d.). Web crawler. Retrieved 24th July 2009 from http://en.wikipedia.org/wiki/Web_crawler
- Doctorow, C. (08 March, 2002). How I Learned to Stop Worrying and Love the Panopticon. Retrieved 24th July 2009 from http://www.oreillynet.com/pub/a/network/2002/03/08/cory_google.html
- Metamend. (n.d.). The Search Engine Altavista. Retrieved 24th July 2009 from http://www.metamend.com/altavista.html
- Wikipedia. (n.d.). Metasearch engine. Retrieved 24th July 2009 from http://en.wikipedia.org/wiki/Metasearch_engine
- UC Berkeley. (n.d.). Meta-Search Engines. Retrieved 24th July 2009 from http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/MetaSearch.html
