Introduction to the Internet

[Dr. Goolkasian's Home Page]

Internet

The Internet is a network of networks. Communication between computers and networks on the Internet is handled by a specific set of commands or protocols called TCP/IP (Transmission Control Protocol/Internet Protocol). You generally will not be aware of TCP/IP because communications software provides a high-level interface that handles the protocols. Some networks attached to Internet do not use TCP/IP internally, but they can exchange information by means of gateways.
The Internet is often metaphorically called an "information highway" in recognition of the pathways that connect computers on the net. It is either self-regulating or non-regulated, depending on your viewpoint. No specific agency monitors or administers the Internet as a whole. Generally, you will find the Internet to be a tolerant, permissive environment that comes closer to representing a sense of global community than almost any social or cultural paradigm in existence today. Some politicians and bureaucrats contemplate anarchy lurking in the libertarian recesses of Internet, and are calling for monitoring the content of information being transmitted. The twin issues of privacy and security over the Internet are being hotly debated. Among other unresolved issues is whether information on the Internet falls under the aegis of print media or broadcast (radio, television) media.

You are using the Internet to transfer information. Electronic mail (e-mail) transfers messages between two parties asynchronously: You send a message to a friend, and later she sends a reply. Although most e-mail activities would not be considered real-time (immediate) communication, there are chat capabilities that allow messaging in near real-time. The metaphor for chat is "push to talk" as you would do when using a CB radio or walkie-talkie. Recently, "push to talk" audio has become available to cybernauts with audio-capable computers.

Newsgroups are similar to e-mail messaging, but newsgroups are not point-to-point communications in the sense of e-mail. A newsgroup is organized around a central theme, and readers contribute comments to the newsgroup. Comments (and replies to comments) are then disseminated by news servers. It's similar to the editorial page in your local newspaper. Some newsgroups are edited (moderated in net parlance), others are not. Unmoderated newsgroups are often forums for copyright infringement and illicit behavior.

Much information resides as data files stored on various machines around the Internet. The amount of information is so great that you need to use search engines to cruise the net looking for specific topics of interest. Gopher and Archie are two kinds of software used for network searches. Both of these packages are based on client/server models, just like newsgroups. Your local computer runs the client software that allows connections to remote server machines, usually mainframe computers. Both involve protocols for how they operate.

Gopher was originally a menu-based system that allowed connections to remote computers, text-based searching of documents on that computer, and retrieval of documents to the local computer. Archie is a search engine that reaches databases of information archived at many anonymous FTP sites on the Internet. FTP (File Transfer Protocol) is specifically designed for file transfer. Most Archie clients now allow for file retrieval as well as searches. Some client software (Fetch on the Macintosh, for example) is optimized for file transfer and lacks the search capability. One important difference between Archie and Fetch is that Archie does not maintain a continuous link to the remote computer as Fetch does. Archie connects only long enough to search a database or download a file.

The World-Wide-Web (WWW) is part of Internet. It, too, uses the client/server model. WWW servers provide information in hypertext. Hypertext contains words or other objects that are links to other words, pictures, or even other servers. Client computers use browser software that properly accesses a WWW site, and then displays a hypertext document in the proper form. Most WWW browsers now have graphical interfaces. Also, other Internet capabilities such as Archie, Gopher, FTP, and e-mail may be run directly from a single WWW browser.

Networks

Access to Internet requires a physical connection between your desktop computer and another computer, usually a mainframe computer, that has either a direct connection to Internet or a gateway to Internet. When computers are connected by physical devices, such as wires (or, in some places, by infrared relays), a network is formed. Not all networks are in the Internet domain.
Local area networks (LANs) provide communication between a cluster or a small group of clusters of computers and other devices on the network. For example, a single academic department on the UNCC campus may have 20 Macintosh computers, three printers, and a shared hard disk array all connected to a LocalTalk network. LocalTalk comprises a set of protocols and hardware that enable all of the attached computers to share data and to print to any of the three printers. It is even possible to install software on this network that will provide e-mail within the LAN. Without a hardware router, however, none of the computers on the LAN will be able to access the Internet or even other services provided by the University's mainframe computers.

It is possible for any of the 20 computers to get on Internet via a telephone connection to a commercial Internet provider. This procedure, which requires a modem and software using either SLIP (Serial Line Internet Protocol) or PPP (Point to Point Protocol), allows Internet access but at communication rates limited by the speed of the modem. Home and many business users currently can get Internet services only in this way. Soon, it is likely that the company that provides cable TV to your home will also have Internet available.

Fortunately, most campus users do not have to worry about serial lines to Internet. Most UNCC offices and dorms have fast Ethernet connections that have ready Internet access. All that is required is the appropriate client software.

Accounts, Access, and Addresses

You must have a user account to access most networks. LAN accounts are usually assigned by a systems manager in the department. Computing Services assigns accounts on campus mainframe computers. Faculty, staff, and students may request accounts individually, and the accounts remain active as long as you are associated with the university. Faculty may also request temporary accounts for classes. The appropriate forms are available from Computing Services.

At this point, it is useful to describe how each computer on the campus Ethernet and, indeed, on Internet, is uniquely identified. Desktop computers attached to Ethernet are each identified by an IP (Internet Protocol) number. The IP number is an address; it is expressed as a group of four numbers separated by periods. (Each period is pronounced "dot" when you read an address aloud.) None of the numbers can be greater than 255. Each computer is also identified by a domain name. The domain name has at least two words or numbers separated by periods. A typical domain name has three words. The first word is the computer name, the second is the domain name, and the third is the top-level domain name; you will also come across domain names with more than three words.
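
The address format described above can be checked mechanically. The following Python sketch is only an illustration: it confirms that an IP number has four numeric parts, each between 0 and 255, and splits a domain name into its words. The function names and the sample address 192.0.2.10 are made up for this example and do not refer to any real campus machine.

```python
def parse_ip_number(address):
    """Return the four parts of a dotted IP number as integers.

    Raises ValueError if the text is not four numbers in the range 0-255.
    """
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("an IP number has exactly four parts: " + address)
    numbers = []
    for part in parts:
        if not part.isdigit():
            raise ValueError("not a number: " + repr(part))
        value = int(part)
        if value > 255:
            raise ValueError("part out of range (0-255): " + part)
        numbers.append(value)
    return numbers


def split_domain_name(name):
    """Split a domain name into its dot-separated words, left to right."""
    return name.lower().split(".")


# Hypothetical address, used only to show the format.
print(parse_ip_number("192.0.2.10"))
# Computer name, domain name, top-level domain name.
print(split_domain_name("email.uncc.edu"))
```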

IP numbers and domain names are registered with a domain name server so that each address is unique. When a new address is registered, it may take a few hours for the information to propagate to all name servers on Internet. Individuals (people, that is) are assigned user names (or login names) when their accounts are set up on a particular network. When you send e-mail to someone, you must use their complete address in this way:

bighead@email.uncc.edu

The name "email" in this address is actually a mail server in the uncc domain. The "@" is a required separator between the user name and domain name. User names can be fairly literal or a mixture of letters and numbers depending on the naming convention used by the system administrator. Top-level domain names tell you something about the organization whose computer you have accessed. The following table lists the common names and their significance.

Top-level Domain Name    Meaning                     Example

com                      commercial organizations    microsoft.com
edu                      educational institutions    uncc.edu
gov                      government agencies         edcftp.cr.usgs.gov
mil                      military agencies           pentagon.army.mil
org                      nonprofit organizations     eff.org
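
As a small illustration of how an e-mail address breaks down, the Python sketch below splits an address at the "@" separator and looks up the top-level domain in a dictionary copied from the table above. The function name describe_address is invented for this example, and the lookup knows only the five names listed in the table.

```python
# Meanings copied from the table of top-level domain names.
TOP_LEVEL_DOMAINS = {
    "com": "commercial organizations",
    "edu": "educational institutions",
    "gov": "government agencies",
    "mil": "military agencies",
    "org": "nonprofit organizations",
}


def describe_address(address):
    """Split user@host and report what the top-level domain means."""
    user, separator, host = address.partition("@")
    if not separator or not user or not host:
        raise ValueError("expected username@domain.top-domain: " + address)
    # The top-level domain is the last word of the host name.
    top_level = host.rsplit(".", 1)[-1].lower()
    meaning = TOP_LEVEL_DOMAINS.get(top_level, "not in the table")
    return user, host, meaning


print(describe_address("bighead@email.uncc.edu"))
```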

TELNET - Remote Login Procedures

Although software to access Internet is available for most desktop computer operating systems, there are some instances when you will want to connect directly to a remote computer. If you don't have Eudora, you can still conduct e-mail by logging onto a mainframe and running one of the mailer applications on that computer. After all, Eudora (or any other Internet mail application) just provides a convenient interface that replaces many command line functions with point-and-click functions.

A remote computer may be any computer on Internet with an IP number. One way to connect to a remote computer is to use a "dumb" terminal like those on which you use Aladdin in Adkins Library. A dumb terminal is essentially a monitor with enough hardware to maintain a connection with a mainframe computer; all applications that appear to run at the terminal are actually being processed on the remote. In contrast, a desktop computer is a stand-alone device that can run programs on its local processor. You can connect to remote computers from your local computer, but the two computers must both be running software that permits the connection. Furthermore, you often will need an account (or a password) to connect to the remote computer.

TELNET is a terminal emulation program that is used for remote logins to UNIX computers. Actually, if you are already connected to a mainframe by a terminal, you can issue TELNET commands from the terminal to connect to any other computer with an IP number. Otherwise, run a local TELNET session from your computer.

On the Macintosh, the most popular TELNET application is NCSA TELNET. (NCSA is the National Center for Supercomputing Applications in Illinois.) NCSA TELNET may require configuring unless you are using a lab computer or Project Desktop (DOS) has been installed on your computer by Computing Services. Most of the configuration options in NCSA TELNET are self-explanatory. Remember that TELNET emulates a terminal, so most of what you will see in your TELNET session is a series of command lines.

Start the NCSA TELNET application and select "Open Connection" from the File menu. You must indicate the domain name of the host computer that you wish to connect to in the resulting dialog box. When you click OK, TELNET will look for the host computer using a domain name server. If no connection is made, check that you have entered the correct name. You may also use the IP number.
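
The name lookup that happens when you click OK can be sketched in a few lines of Python. This is not how NCSA TELNET itself is written; it only shows the idea of asking a name server for the IP number and reporting failure when the name is unknown. So that the example runs without a network connection, it uses a stand-in resolver with a made-up IP number; the names find_host and fake_resolver are assumptions for this illustration.

```python
import socket


def find_host(name_or_number, resolver=socket.gethostbyname):
    """Return the IP number for a host, or None if the lookup fails."""
    try:
        return resolver(name_or_number)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def fake_resolver(name):
    """Stand-in for a domain name server; the IP number is invented."""
    table = {"unccvm.uncc.edu": "192.0.2.1"}
    if name not in table:
        raise socket.gaierror("unknown host: " + name)
    return table[name]


print(find_host("unccvm.uncc.edu", resolver=fake_resolver))
print(find_host("mistyped-name.uncc.edu", resolver=fake_resolver))
```

Leaving out the resolver argument makes find_host use the real name server configured on your computer.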

Electronic Mail

Electronic mail, or "e-mail", is a means of electronically sending messages and other information from one computer to another. It is not necessary that the two computers have a direct connection (although they may in some LANs). Consider how you send a letter to someone by way of the U.S. Postal Service (also known as "snail mail"). Your letter is sent, bearing the proper address, to a local postal station which then sorts mail by ZIP code. Before reaching its final destination, your letter may have passed through several other stations. Finally, upon delivery into the recipient's mailbox, your letter's passage is complete. In some cases, you may have requested a return receipt to verify that the letter got to the proper person.

Electronic mail works in a similar, but digital, manner. You compose a message on your local computer either within an e-mail application or in a word processor. If you used a word processor, you probably will have to save the file in an ASCII format. ASCII or "text" files use a special binary code for representing standard alphanumeric characters and some other special information like control codes. ASCII codes are identical across most operating systems, so they are the standard for e-mail. Some e-mail applications, however, are designed to work from within an active word processor and will capture the active document for you.
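
To make the idea of ASCII codes concrete, this short Python sketch converts text to its numeric ASCII codes and checks whether a piece of text can be represented in plain ASCII at all. The function names are invented for the example.

```python
def to_ascii_codes(text):
    """Return the ASCII code of each character in the text."""
    return list(text.encode("ascii"))


def is_plain_ascii(text):
    """Report whether every character has an ASCII code."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


print(to_ascii_codes("Hi!"))
print(is_plain_ascii("plain text"))
print(is_plain_ascii("caf\u00e9"))  # the accented letter is outside ASCII
```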

Before you send a message, you must enter a valid Internet e-mail address into the program. Make sure that the address includes a mail server. When you send the message, your own e-mail server will pass the message to another, nearby server, and so on until the message reaches its destination. You can read the path taken by a received message in the header of your messages.
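
The path recorded in a message header can be read with a short program as well as by eye. The Python sketch below uses the standard email module to pull the "Received" lines out of a message. The sample message, its relay names, and its dates are invented for illustration; real headers are usually longer.

```python
from email import message_from_string

# A made-up message; only the uncc.edu address comes from this tutorial.
SAMPLE = """\
Received: from relay2.example.net by email.uncc.edu; Mon, 6 Mar 1995 10:02:11 -0500
Received: from mail.example.com by relay2.example.net; Mon, 6 Mar 1995 10:01:58 -0500
From: friend@mail.example.com
To: bighead@email.uncc.edu
Subject: Hello

See you on Friday.
"""


def delivery_path(raw_message):
    """Return the Received lines in the order the message traveled."""
    message = message_from_string(raw_message)
    hops = message.get_all("Received") or []
    # Each server adds its line at the top, so reverse for travel order.
    return list(reversed(hops))


for hop in delivery_path(SAMPLE):
    print(hop)
```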

Privacy and Security

E-mail may not be as private or secure as snail mail. It is possible for lurkers or hackers anywhere along an e-mail message path to intercept the message and read it. Some corporate site e-mail systems may be considered corporate property, including any messages sent within the system, and employees in such systems should remember this before using the system for private conversation.

In any event, it is not a good idea to send information that you wish to remain private over e-mail. This certainly would include credit card numbers, passwords, and personal identification numbers for your bank accounts. You gain some security by sending encrypted messages, where you have encoded the message with appropriate software before sending it. For this system to work, the recipient must have the key to decode your message. Also, some encryption programs may not use particularly secure or tough algorithms. Furthermore, you usually will have to send the encrypted file as an attachment rather than a message created within your e-mail program.


Sending and Receiving Messages

E-mail addresses are not case-sensitive, but you must use the proper syntax of username@domain.top-domain. For example, my e-mail address at UNCC is fgg00arb@email.uncc.edu. Most non-commercial Internet domains use standard format usernames. Commercial information services that provide subscribers with Internet e-mail access may have specific syntax requirements for sending messages onto the net.

Finding Addresses with "Ph"

Acting like an electronic phonebook, a Ph server, if it is available in your domain, allows you to look up mail and phone information at a remote location, usually at the level of an institution or organization. E-mail addresses are often included. To get the e-mail address of a user at another location, you must know the correct remote domain address. In the previous TELNET session, we were able to get the mailing address and office phone number of the Chancellor.

Finding Addresses with "finger"

When a Ph Server is not available, it may be possible to find the e-mail address of a remote contact by using the UNIX "Finger" command. Finger accesses the user login file on a remote computer. The command may be executed from several points. First, if you are logged onto a remote UNIX computer with TELNET, you can issue the Finger command within the TELNET session window. Eudora has a Finger/Ph window that allows you to type in information for the Finger command. Finally, there are versions of the utility available for desktop computer operating systems. The Macintosh version is called, appropriately, "Finger"; it is available for FTP downloading on most Macintosh software archives.

Mailing Lists (List Servers)

Another form of e-mail is the electronic mailing list. Mailing lists are available by subscription. They allow for either public discussion of specific topics or for mass mailings of information.

Subscribing to Mailing Lists

In general, you subscribe to a mailing list by sending an e-mail message to the list administrator on the list server. The subscription process is usually automated, most commonly by a program named ListServ. You join a list by sending an e-mail message to the administrative address for the list with a single statement - "subscribe (your name)" or "subscribe (listname) (your name)". You can determine that a mailing list is automated if the administrative Internet address for the list is in the form listserv@domain.top-domain. Some lists are administered by an actual person, in which case you should send a polite one-sentence request in the body of your message to be added to the list.
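
A subscription request of the kind just described can be assembled with Python's standard email package. This is a sketch only: the list name "sample-l", the subscriber name, and the server address listserv@lists.example.edu are placeholders, and the message is built but not sent.

```python
from email.message import EmailMessage


def build_subscribe_request(list_name, your_name, your_address, server_address):
    """Build (but do not send) a one-line subscription request."""
    message = EmailMessage()
    message["From"] = your_address
    message["To"] = server_address
    # The whole request is the single statement in the message body.
    message.set_content("subscribe {} {}".format(list_name, your_name))
    return message


request = build_subscribe_request(
    "sample-l",                      # placeholder list name
    "Pat Reader",                    # placeholder subscriber name
    "bighead@email.uncc.edu",
    "listserv@lists.example.edu",    # placeholder administrative address
)
print(request)
```
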
Reading mailing list activities is done with an e-mail application like Eudora. The messages appear in your mailbox like any other e-mail. For lists that have a lot of activity, you may want to set the server to send information in digest mode. A digest is created by the server every few days. The digest is usually sent in one or two parts; the leading text in the first part is a serial list of messages in that particular digest described by subject line. To read a given digest message within a Eudora session you can issue a "Find" command on the subject text.

Lists can really accumulate if you subscribe to several lists. If you go on vacation, make certain that your mail server does not automatically respond back to the list that either your "mailbox is full" or that you are "on vacation." The automated response may bounce to everyone subscribed to that mailing list!



Network Newsgroups

Network newsgroups are part of the USENET, a collection of mainly UNIX-based computers among which information is shuttled by a communications standard called UUCP (UNIX to UNIX Copy Protocol). Internet sites access newsgroups on USENET by means of a network news server. The desktop application that you use to read newsgroups must be compatible with the server software. At UNCC, the news server is news.uncc.edu.

In the spirit of Internet, network newsgroups have sequential "dot" addressing. Unlike the Internet domain system, however, newsgroup names start with the most general designation. For example, the newsgroup sci.geo.earthquake is in the top-level "sci" (science) category and the "geo" (geology, geophysics, geoscience, geography, etc.) subcategory, and is specifically concerned with earthquakes. The newsgroup comp.sys.mac.graphics is about graphics applications for the Macintosh computer. Most of the newsgroups periodically post FAQs (Frequently Asked Questions) and answers describing the group.
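
The difference in reading direction between newsgroup names and domain names is easy to show in code. In this Python sketch (function names invented for the example), a newsgroup name is split as written, while a domain name has to be reversed to run from general to specific.

```python
def newsgroup_levels(name):
    """Newsgroup names already run from most general to most specific."""
    return name.split(".")


def domain_levels(name):
    """Domain names run the other way, so reverse them to compare."""
    return list(reversed(name.split(".")))


def in_hierarchy(newsgroup, hierarchy):
    """Report whether a newsgroup falls under a more general category."""
    wanted = hierarchy.split(".")
    return newsgroup.split(".")[:len(wanted)] == wanted


print(newsgroup_levels("comp.sys.mac.graphics"))
print(domain_levels("email.uncc.edu"))
print(in_hierarchy("sci.geo.earthquake", "sci.geo"))
```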

There are several good, free newsreaders for the Macintosh. Remember that you can read news with a command-line interface by TELNETing to a UNIX machine, but why do that when the graphical interface is available? The three most popular programs are Nuntius, NewsWatcher, and InterNews.

InterNews lets you create subscription files with newsgroups sorted however you want. In the following example, to add the sci.virtual-worlds newsgroup to the Fractals etc. subscription file, you just click and drag the newsgroup onto the Fractals etc. icon. This screen image also shows how individual articles in a newsgroup are listed in a separate, scrolling window. Unread newsgroups and articles have solid bullets next to them; you can mark a newsgroup read by selecting its name in the Newsgroup window and typing command-R.

Comparison to E-mail

You must use caution if you decide to reply to a newsgroup posting; unlike e-mail, any newsgroup response gets posted to the publicly readable group. Thousands of readers may read your article, so make sure that you don't write something that you will regret! When you choose "Compose:New" from the InterNews menubar, NewsWatcher warns you the first time you reply to a posting in a session.

FTP - File Transfer

One part of the example TELNET session was a file transfer from the ftp.merit.edu site to my local space on the unccvm.uncc.edu machine. This was made possible by an ftp server on the remote machine. In order to get the file onto my desktop computer, I must do a remote logon onto the campus computer, then put the file onto the desktop. FTP clients are available for desktop computers. These programs bypass multiple transfers and the clumsy command line interface to make getting and sending files over the Internet easy.

Perhaps the most popular (and free!) FTP client for the Macintosh OS is Fetch. Here are a series of screenshots using Fetch to get the same world map we earlier transferred via TELNET.
Choose "File:New Connection" (or type "command-N") to open a new connection dialog. The ftp.merit.edu server is an anonymous server. This means that anyone on Internet may download files from the remote computer; uploading files to anonymous servers is usually restricted or prohibited. Enter a User ID of "anonymous" for such servers. Anonymous servers don't really require passwords, but it is proper net manners to enter your e-mail address as a password. Some anonymous ftp servers keep a log of users.

We know from the TELNET session that the map file is in the "maps" directory. When you know the directory that you want, you can enter the path using "/" to separate sub directories. If you don't know the directory, leave this field blank and you will be logged into the root directory of the server. Fetch will display a list of directories when the connection is made. Look for a directory named "pub" and get its listing by double-clicking the directory name. Many anonymous ftp servers put publicly accessible files in the "pub" directory.
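
For readers curious about what an FTP client does behind its dialog boxes, here is a rough Python sketch of the same anonymous login and download using the standard ftplib module. It is not a description of how Fetch works internally. The function name fetch_anonymous and the ftp_class parameter (which lets you substitute a stand-in for testing) are assumptions made for this example, and nothing is contacted until you call the function.

```python
from ftplib import FTP


def fetch_anonymous(host, directory, filename, email_address, ftp_class=FTP):
    """Download one file from an anonymous FTP server and return its bytes."""
    chunks = []
    ftp = ftp_class(host)  # opens the connection to the server
    try:
        # User ID "anonymous", with your e-mail address as a courtesy password.
        ftp.login("anonymous", email_address)
        if directory:
            # Subdirectories are separated with "/", for example "pub/maps".
            ftp.cwd(directory)
        # Transfer in binary mode, collecting the pieces as they arrive.
        ftp.retrbinary("RETR " + filename, chunks.append)
    finally:
        ftp.quit()
    return b"".join(chunks)
```

Calling fetch_anonymous with a host such as ftp.merit.edu, the directory "maps", a file name, and your own e-mail address would attempt a real network transfer, so it is left out of the listing.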

Once the connection is made, Fetch displays another window showing the contents of the directory "maps" with size and date information about the contents. To download the file to your local computer, select the file with a single click, then press the "Get File" button. Fetch will display information about the file during and after the download procedure. Note that Fetch can be configured to save transferred files into a folder that you designate. Also, remote files reside on remote servers in either binary or Text (ASCII) format. Fetch will attempt to recognize the file type. Many files are also saved in "binhex" format, which is used for network transfers, and most are compressed in one of several schemes to save bandwidth (and so reduce the time needed to transfer the file). Binhexed files have a ".hqx" suffix, as in "file.hqx". Macintosh binary files may be encoded with one of several compression formats. The most common are "sit" (Stuffit archive) and "cpt" (Compactor archive). You might see a file named "file.sit.hqx." Fetch can be configured to automatically unbinhex and unstuff a transferred file if you have the proper helper applications available to your local computer.
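
File suffixes are read from right to left: the last suffix names the outermost layer, which must be undone first. The following Python sketch lists the layers for a file name using only the three suffixes mentioned above. The function name decoding_steps is invented for this example, and the sketch only names the steps; it does not decode anything.

```python
# Only the suffixes discussed in the text are listed here.
SUFFIX_MEANINGS = {
    "hqx": "BinHex encoding",
    "sit": "StuffIt archive",
    "cpt": "Compactor archive",
}


def decoding_steps(filename):
    """List the layers to undo, starting with the last suffix."""
    parts = filename.lower().split(".")
    steps = []
    for suffix in reversed(parts[1:]):
        if suffix not in SUFFIX_MEANINGS:
            break  # an unknown suffix ends the chain of known layers
        steps.append(SUFFIX_MEANINGS[suffix])
    return steps


print(decoding_steps("file.sit.hqx"))
print(decoding_steps("file.hqx"))
```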

By the way, the file we downloaded from merit was a plain ASCII file with no compression. It has a ".ps" suffix that indicates that it is a PostScript file. PostScript is a page description language used by the popular graphics program Adobe Illustrator among others. PostScript readers and interpreters are found on most operating systems.

Searching the Internet for Information

Gopher and Archie are two information search-and-retrieval tools that grow more useful as the volume of information on the Internet expands. As WWW browsers become more sophisticated, however, some of the capabilities in these tools are being integrated either into the browsers or are appearing as linked engines.

TurboGopher
TurboGopher is a graphical Gopher client for the Macintosh computer. It was created by workers at the University of Minnesota, where Gopher originated. The screenshots below are from a test version called TurboGopher VR. The test version has all of the features of the current standard release of TurboGopher, but the test includes a three-dimensional navigation aid. The illustrations only show standard features of the software.

You can define a "home gopher" within TurboGopher's configuration settings. The home gopher should be the Internet address of a Gopher server. The default is the University of Minnesota. When you start the application, a Home Gopher window opens. Within the window are a series of folder or file icons. The folder icons represent directories on the server.

Veronica is a search tool for GopherSpace. Archie is a search tool for FTPSpace. Archie queries FTP servers for files by name. Archie is not available on the unccvm.uncc.edu site. There are a number of remote Archie servers that can be reached by TELNET, however.

World Wide Web (WWW)

The World Wide Web is probably the hottest part of Internet today. When someone is "surfing the 'net", he or she is usually on the WWW. Graphical interfaces in Web client software reduce UNIX command lines to point-and-click actions. With minor inconvenience, anyone can browse Web documents and even set up a dedicated Web page.

A fundamental difference between the WWW and other Internet services is that Web documents are based on the concept of hypertext. Gopher servers, for example, are based on hierarchically aligned menus. Hypertext documents in the Web are non-linear. Each Web page may contain several hot links. A link usually contains a URL (Uniform Resource Locator), which is a pointer to other Internet resources.

The URL may simply point to another part of the current Web page, or it may point to another Web site (also known as a home page). In that case, the definition of the URL will be of the form:

http://sunsite.unc.edu/Dave/drfun.html

where "http" stands for Hypertext Transfer Protocol and "html" is HyperText Markup Language. A link to a Gopher server begins with gopher:// and a link to an FTP server begins with ftp://. Many URLs start with "www." because the document that the URL points to is really just an ASCII file, perhaps among many others on the server. The www prefix helps anyone browsing the directory to understand that this document is for reading by WWW browsers.
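
The parts of a URL can be separated with Python's standard urllib.parse module. This small sketch takes the example URL apart into its protocol, host, and path; the function name describe_url and the Gopher address gopher.example.edu are made up for the illustration.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the protocol, the host, and the path on that host."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
    }


print(describe_url("http://sunsite.unc.edu/Dave/drfun.html"))
print(describe_url("gopher://gopher.example.edu/"))  # hypothetical server
```
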
If you used early word-processing programs on microcomputers, you will understand the html programming syntax. Content within a Web page document is marked with html tags so that a Web browser can correctly display the text as a header, body text, emphasized text, a URL, etc. Web pages are easily created in word processors, and you can see them in all their graphical beauty by opening the file from within a Web browser. Letting other Web browsers see your creation requires that you set up a WWW server, however.
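
To show what tagged content looks like and how a program can read it, the Python sketch below feeds a tiny, made-up Web page to the standard html.parser module and collects the URLs found in its link tags. The page text and the class name LinkCollector are invented for this example; a real browser does far more than this.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the URL of every link (an "a" tag with an href) in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A small sample page with a header, body text, emphasis, and one link.
PAGE = """<html><head><title>Sample Page</title></head>
<body>
<h1>A Header</h1>
<p>Body text with <em>emphasized</em> words and a
<a href="http://sunsite.unc.edu/Dave/drfun.html">link</a>.</p>
</body></html>"""

collector = LinkCollector()
collector.feed(PAGE)
print(collector.links)
```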

Web Servers and Web Clients

The WWW is a subset of Internet determined by the installation of Web servers. So, in the same sense that not all Internet sites have Gopher servers, not all sites have Web servers. Any computer on Ethernet at UNCC can be set up as a WWW server with the appropriate software. In fact, other services such as FTP and Gopher can now be placed on individual desktop computers alongside the Web. The software to accomplish these tasks is available for free or for nominal cost (compared to other kinds of software).

WWW browser development occurred on several operating system platforms simultaneously. This is an advantage for users, because it means that the appearance and function of a particular browser is nearly identical regardless of the client OS. The first graphical Web browser, NCSA Mosaic, is available for all common platforms (for free!). So is the most widely-used browser, Netscape Navigator (it's free, too!). A recent trend is browsers with built-in Web page editors so that you can make and see a document as it is created.

Both NCSA Mosaic and Netscape Navigator are full-featured graphical browsers. Each allows you to save URLs (in a "hotlist" or a "bookmark" respectively) for future reference. Many users prefer Netscape Navigator because it displays graphics faster. The developers of Netscape Navigator have implemented several html features ("enhancements") that are not yet part of the accepted html standards. For example, Navigator is capable of displaying interlaced graphics. When an image is displayed by the browser, it is painted in a series of sweeps, rather than in a sequence of adjacent lines as done by Mosaic. The effect is that the graphic appears to arrive on your computer monitor faster (but it doesn't in actuality).
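
The "series of sweeps" can be made concrete with a short Python sketch. It assumes the four-pass row order used by interlaced GIF images (the text above does not name an image format, so treat this as one common example rather than a description of Navigator's internals): first every eighth row, then the rows halfway between those, then every fourth row, and finally all the remaining odd rows.

```python
def interlaced_row_order(height):
    """Return row numbers in the order a four-pass interlaced image arrives."""
    order = []
    # (first row, spacing) for each of the four sweeps.
    for start, step in ((0, 8), (4, 8), (2, 4), (1, 2)):
        order.extend(range(start, height, step))
    return order


def sequential_row_order(height):
    """Return row numbers for an ordinary top-to-bottom image."""
    return list(range(height))


print(interlaced_row_order(8))
print(sequential_row_order(8))
```

Both orders deliver exactly the same rows, which is why the interlaced image only appears to arrive sooner: a coarse version of the whole picture is visible after the first sweep.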

Searches

Searching for information on the Internet is really centralized from within the World Wide Web. Because browsers can access anonymous FTP, Gopher, and other non-Web information sources, you may be able to abandon (or at least put in the closet) specialized clients for these services.

Engines specifically designed to search Web pages are also available by clicking on the Netscape Navigator "Net Search" button. That click will bring up a page of WWW search engines recommended by Netscape Communications, the creator of Navigator. The Lycos engine from Carnegie-Mellon University is the most comprehensive (but often inaccessible) of those available. You can limit the number of hits and do Boolean searches from Lycos. There are also many indices for various kinds of information on the Web (and other parts of Internet). Yahoo is one of the more comprehensive indices. Web pages that are indices to indices are beginning to appear on the WWW.
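
Boolean searching and hit limits can be illustrated with a toy index in Python. This sketch is not how Lycos or any real engine is built; the three "pages" and all function names are invented, and the index simply records which pages contain each word.

```python
# Made-up pages standing in for Web documents.
PAGES = {
    "page1": "earthquake maps and data",
    "page2": "macintosh graphics software",
    "page3": "earthquake graphics for the macintosh",
}


def build_index(pages):
    """Map each word to the set of pages that contain it."""
    index = {}
    for name, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index


def search_and(index, *words):
    """Pages that contain every one of the words."""
    sets = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*sets) if sets else set()


def search_or(index, *words):
    """Pages that contain at least one of the words."""
    return set().union(*(index.get(word.lower(), set()) for word in words))


def limit_hits(hits, maximum):
    """Return at most `maximum` hits, in alphabetical order."""
    return sorted(hits)[:maximum]


index = build_index(PAGES)
print(search_and(index, "earthquake", "graphics"))
print(limit_hits(search_or(index, "earthquake", "macintosh"), 2))
```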