I like to think of myself as being up to date with the latest technologies and communications infrastructures. I have a high-end Macintosh on my desktop, I have an Ethernet connection to the Internet, I've got modem connections to other places in the world, I've got lots and lots of files occupying lots and lots of disk space. Well, I think you get the idea. I've really been taking this whole information superhighway thing for granted, because I'm so well-connected here in my office. Admittedly, there are times that I forget that not everyone has the type of connectivity that I do. So, sometimes I get lax about investigating new toys that I come across. Well, this one came back to bite me.
About a year ago, I first heard about the Web. The term didn't carry much significance for me, and I was fine with what I had access to, so I ignored it. But time and time again, I kept hearing the term used. I began to hear more and more people talk about it. So I finally decided to investigate for myself. What I found was a communications mechanism far beyond what I thought it could have been, not to mention one that can be rather confusing at first.
The Web is really just another way to access information on the Internet. There is no separate communications network dedicated to the Web; instead, the Web provides a different mechanism for accessing sites that are on the Net. If you're already familiar with Internet services such as Gopher and FTP, you already know about those parts of the Web.
Let's take a moment and compare the Web to Gopherspace. Mark Thacker has written many articles about Gopher in previous issues of Benchmarks and other publications. With Gopher, you run a client on your computer that connects to a Gopher server on another computer, and the information stored on the server is displayed to you through the client software. The Web operates identically: you run a client (usually called a Web browser) on your computer which connects to a Web server to display information.
So what is the difference between Gopher and the Web? Gopher is essentially a text-retrieval system. Sure, it can be used to download files of all types to a local disk, but it can only interactively display text items. These items are presented in a menu-like hierarchy for selection. Because of the textual nature of Gopher, there are clients for just about every type of computer imaginable, including a dumb terminal. No special graphics capabilities are needed to access the information contained within a Gopher system. Gopher systems can generally connect only to other Gopher servers. There are a few gateways, such as Gopher-to-WAIS, but Gopher works best when talking to its own kind.
News from the CWIS/Gopher Hole, a column by Mark Thacker, is a regular feature of Benchmarks. This issue's column is found on page 22. Two recent articles by Mark that might be of particular interest to readers of this issue appeared in the May/June 1994 Benchmarks on pages 19 and 21.
Web browsers are more visually oriented. As such, the most prominent Web clients are heavily graphical, although there are a few character-based clients in use. Not only will a Web client display textual information, it will also display certain types of images inside the client, and some clients will handle some forms of audio directly as well. Information items are presented in a free-form hypertext metaphor instead of a structured list or menu (though a Web site could set its information up in this way). But possibly the most important functional difference between Web and Gopher clients is that Web clients can access many other systems directly, such as FTP and Gopher. So it is possible to use a Web client to access a Gopher site in very much the same way as a Gopher client would access the site.
Which system is better? That's a question that cannot be answered. There are pluses and minuses to both systems, but the system to use is the one that will afford you the best access to the information you wish to retrieve.
Without getting into too much technical detail, let's look into how information is delivered to a Web client. We'll begin by looking at how the information is prepared on a Web server and end with a brief overview of some of the Web clients that are currently available.
The metaphor used by Web clients and servers to disseminate information is that of a page. When your client connects to a server, a Web page is displayed. While each page may look completely different, each one is formatted to follow an exacting specification. This format is called HTML.
HTML is an acronym for HyperText Markup Language, and it is the language that each Web client speaks. When your client connects to a Web server, it downloads an HTML file. The codes in that file specify what the client is to display on the screen. The HTML file includes information about which GIF images should be displayed (if any), the style and appearance of any text, and links to other Web information sites.
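To make this a little more concrete, here is a minimal sketch in Python of what a client has to pick out of an HTML file: the images to display and the links to other places. The sample page, its file names, and the gopher.example.edu address are all invented for illustration; this is not my actual home page, and it is certainly not how any real Web client is written.

```python
from html.parser import HTMLParser

# Hypothetical page, invented for illustration only.
SAMPLE_PAGE = """
<html>
<head><title>Sample Home Page</title></head>
<body>
<h1>Welcome</h1>
<img src="campus.gif">
<p>See the <a href="schedule.html">baseball schedule</a> or
<a href="gopher://gopher.example.edu/">a Gopher site</a>.</p>
</body>
</html>
"""


class PageScanner(HTMLParser):
    """Collects the images and links that an HTML file names."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "src" in attributes:
            # An image the client should display inside the page.
            self.images.append(attributes["src"])
        elif tag == "a" and "href" in attributes:
            # A link the reader can click to jump somewhere else.
            self.links.append(attributes["href"])


def scan_page(html_text):
    """Return (images, links) found in the given HTML text."""
    scanner = PageScanner()
    scanner.feed(html_text)
    return scanner.images, scanner.links


images, links = scan_page(SAMPLE_PAGE)
print("images:", images)
print("links:", links)
```

A real client does far more than this (it also has to lay out the text and fetch each image), but the idea is the same: the codes in the file say what goes on the screen and where each link leads.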
Without going into detail, let me illustrate how a page might be formatted. In this example, I'll borrow liberally from my own home page (see the address in my byline). I've created an HTML file which contains information about myself and where I work (and the kind of work I should be doing), a few images of myself and campus, and many links to other pages. Some of these links are to other local HTML files (such as my Texas Rangers Baseball Schedule or project list), and some are to pages that exist on other servers (such as the Dinosaur Exhibit at the University of Hawaii http://www.hcc.hawaii.edu/dmos/dmos.l.html, the Web guide to NASA http://hypatia.gsfc.nasa.gov/NASA_homepage.html, and Dr. Fun http://sunsite.unc.edu/Dave/drfun.html). The person viewing my page doesn't have to know where all these resources are. He or she will simply click on the indicated link area of the page and be transported to the other site automagically. Really, it's not magic, but it sure seems like it sometimes.
How does the client software know where to get the information indicated in a link? And how does it know what type of information it is accessing? Each link contains a formatted address which indicates not only the type of link, but also specifically where that link exists. If you've been reading a number of mailing lists or newsgroups recently, you've probably seen something like this within a message or signature: http://lipsmac.acs.unt.edu/Rangers/schedule.html
This is a Web address that indicates the location of my Texas Rangers Web page. Let's break the address down and see what's really being said.
The http: tells the client to make a hypertext transfer protocol connection. Other types of connections would be ftp: and gopher:. The client will treat each connection type differently because it's accessing a different set of services.
The //lipsmac.acs.unt.edu tells the client the address of the computer where the server software is running. This is a standard Internet address that would be used in making an FTP connection or a Gopher connection.
The /Rangers/schedule.html tells the client which file on the server to display. With Web clients, the extension on a filename is important. The .html extension tells the client that the file is an HTML-formatted file. Other extensions have different meanings (.gif indicates a GIF file, .au indicates an audio file, etc.).
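For the curious, the same breakdown can be sketched in a few lines of Python using the standard urllib.parse module. The function name break_down and the labels in the result are my own choices for illustration, and the extension table only covers the three extensions mentioned above.

```python
from urllib.parse import urlsplit
import posixpath

# Extension meanings taken from the text above; anything else is "unknown".
EXTENSION_MEANINGS = {
    ".html": "HTML-formatted file",
    ".gif": "GIF file",
    ".au": "audio file",
}


def break_down(address):
    """Split a Web address into the three pieces described above."""
    parts = urlsplit(address)
    extension = posixpath.splitext(parts.path)[1]
    return {
        "connection": parts.scheme,    # http, ftp, gopher, ...
        "computer": parts.hostname,    # where the server software runs
        "file": parts.path,            # which file on the server to display
        "kind": EXTENSION_MEANINGS.get(extension, "unknown"),
    }


result = break_down("http://lipsmac.acs.unt.edu/Rangers/schedule.html")
for label, value in result.items():
    print(label + ":", value)
```

Run against my Rangers address, this reports an http connection to lipsmac.acs.unt.edu for the file /Rangers/schedule.html, which it recognizes as an HTML-formatted file.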
Most Web clients have options where the location of the page being viewed can be displayed on the screen along with the page. Some people like to see this information, others may get confused by it.
Let's see another example. Suppose my client found the link ftp://ftp.unt.edu/pub/antivirus/mac/gatekeeper-127.hqx It would make an anonymous FTP connection to ftp.unt.edu and download the file named gatekeeper-127.hqx from the /pub/antivirus/mac directory. Makes sense now, right? Right. Me neither.
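If it helps, here is a rough Python sketch of how the FTP link above splits into a machine, a directory, and a file name. It only takes the address apart; it does not actually connect anywhere, and the function name describe_ftp_link is made up for this example.

```python
from urllib.parse import urlsplit
import posixpath


def describe_ftp_link(address):
    """Return (host, directory, filename) for an ftp: link."""
    parts = urlsplit(address)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp: link: " + address)
    # Everything before the last slash is the directory; the rest is the file.
    directory, filename = posixpath.split(parts.path)
    return parts.hostname, directory, filename


host, directory, filename = describe_ftp_link(
    "ftp://ftp.unt.edu/pub/antivirus/mac/gatekeeper-127.hqx"
)
print("connect to:", host)
print("directory: ", directory)
print("download:  ", filename)
```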
So is it possible for an Internet site to run more than one server simultaneously? Absolutely. At one point I had an FTP server, a Gopher server, and a Web server running on my Mac at the same time. I quit doing that for a number of reasons, the most important being that I was running out of memory! But it is possible to see several different types of links to the same address. One might see a link ftp://mimas.acs.unt.edu/ or gopher://mimas.acs.unt.edu/ in a page. While these links go to the same machine, there may be reasons to access the information differently.
Web Clients
Being the Mac-head that I've become, I'm most familiar with the two Web clients that I've used on that platform. They are NCSA Mosaic 2.00 and MacWeb 0.98. Both are built around the Mac GUI and function fairly well in that environment. There are still some problems with both clients, but they are very usable for Web browsing. On the PC, there is WinMosaic from NCSA for Windows, and Lynx for DOS. I'm only marginally familiar with WinMosaic, and I've only seen Lynx once. On the UNIX side of things, you have XMosaic from NCSA and Lynx again. The article below lists some sources for these clients.
I've found in the few weeks that I've been experimenting with Web software that it really is an easy way to access information out in the world. While the number of FTP and Gopher sites is still very large, the number of Web sites is increasing rapidly, especially when you have people like me putting up their own pages for whatever ego strokes they may receive as a result. But as non-technoids begin to hook up to the superhighway, Web servers and clients are going to make it an easy transition for them. Plus, commercial services like America Online have plans to add Web functionality to their services. Soon, we'll all be travelling at light speed on this information web being woven daily.