Yes, I'm working on the Sun machines in WeH 5204.
Even after printing all the exceptions as you suggested, I still get the same output as before when running WebReader: just the line "0 total page elements retrieved." Running WebCrawler directly on the pages throws no exceptions either, which is what one would expect if the PageLexer were returning nothing.

After a little more testing, I think the problem is in the StreamTokenizer. I put a "System.out.println( tokens );" in the HttpTokenizer file, which should print the StreamTokenizer object that the HttpTokenizer uses. The only output I get is "Token[NOTHING], line 1", instead of the expected series of "Token[ blah ], line blah" lines that I get when running the tokenizer on any other page.

I hope this helps.
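
For reference, here is a minimal sketch of the check described above, using a plain java.io.StreamTokenizer on a string rather than the course's HttpTokenizer (whose internals are not shown here). The class name TokenizerDump and its two methods are made up for illustration. As far as I know, a StreamTokenizer prints as "Token[NOTHING]" until nextToken() has been called at least once, so that output may only mean the println ran before any token was read; an empty stream would print "Token[EOF]" after the first read.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the course code.
class TokenizerDump {

    // Printable state of a fresh tokenizer, before nextToken() is ever called.
    static String stateBeforeRead(String input) {
        StreamTokenizer tokens = new StreamTokenizer(new StringReader(input));
        return tokens.toString();
    }

    // Printable state after each nextToken() call, including the final EOF.
    static List<String> dumpTokens(String input) {
        StreamTokenizer tokens = new StreamTokenizer(new StringReader(input));
        List<String> seen = new ArrayList<>();
        try {
            while (tokens.nextToken() != StreamTokenizer.TT_EOF) {
                seen.add(tokens.toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        seen.add(tokens.toString()); // state once end of input is reached
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(stateBeforeRead("hello world"));
        for (String line : dumpTokens("hello world")) {
            System.out.println(line);
        }
    }
}
```

If the same pattern inside HttpTokenizer prints "Token[EOF], line 1" right after the first nextToken() call, the stream itself is arriving empty for that page; if it prints real tokens, the println was simply placed before the first read.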