Discussion Board
Current Forum: Homework 5 General Forum
Date: Wed Nov 14 2001 2:41 pm
Author: Costa, Christopher J. <chriscosta@cmu.edu>
Subject: Website address consistency

If I have a page that links to google.com but spells it http://www.GoOglE.coM in the link, I will still get to google.com. In fact, my ugly spelling gets replaced with the "real" spelling, http://www.google.com, in the address bar. However, since our engine will put a cookie at http://www.GoOglE.coM, when I later see a link for http://www.google.com I will visit the page again. The solution is not as simple as making every address lowercase, since after the domain (where I am actually addressing the file system of the server) the page and directory names are, in most cases, case sensitive, depending on what server is running.
This problem goes deeper as well. If I am computing the indegree and some site references Google as http://www.GoOglE.coM, that link will not be counted toward the indegree of http://www.google.com, even though it is the same page.
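
To make the two cases above concrete, here is a minimal sketch of one possible approach, not necessarily what the assignment expects: lowercase only the scheme and host, leave the path alone, and key both the visited check and the indegree count on the result. The names normalize_url and indegree are made up for illustration, and the sketch assumes absolute http URLs with no user:password part. It does not make http://google.com equal to http://www.google.com, since those are different hostnames.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Lowercase the scheme and host only; keep the path's case as written."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # SplitResult.hostname is already lowercased by the standard library.
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Treat an empty path as "/" so "http://host" and "http://host/" match.
    path = parts.path or "/"
    # Drop the fragment: it never reaches the server.
    return urlunsplit((scheme, host, path, parts.query, ""))


def indegree(links):
    """Count incoming links per normalized target.

    links is an iterable of (source_url, target_url) pairs (hypothetical input shape).
    """
    counts = Counter()
    for _source, target in links:
        counts[normalize_url(target)] += 1
    return counts


if __name__ == "__main__":
    links = [
        ("http://a.example/", "http://www.GoOglE.coM"),
        ("http://b.example/", "http://www.google.com"),
    ]
    print(indegree(links))
```

With this, both spellings of the Google link fall under one key, while something like http://Example.com/Dir/Page.html keeps its mixed-case path.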

My question is whether it is acceptable to assume that everyone uses the same address to access the same HTML documents and servers. (This would allow http://www.GoOglE.com to be "different" from http://www.google.com and from http://google.com.)

Current Thread Detail:
Website address consistency      Costa, Christopher J.      Wed Nov 14 2001 2:41 pm       
Re: Website address consistency      Ghosh, Debmallo S.      Fri Nov 16 2001 9:05 am       
