Introduction to the Internet
[Dr. Goolkasian's Home Page]
Internet
The Internet is a network of networks. Communication between computers and
networks on the Internet is handled by a specific set of commands or protocols
called TCP/IP (Transmission Control Protocol/Internet Protocol). You generally
will not be aware of TCP/IP because communications software provides a high-level
interface that handles the protocols. Some networks attached to Internet
do not use TCP/IP internally, but they can exchange information by means
of gateways.
The Internet is often metaphorically called an "information highway"
in recognition of the pathways that connect computers on the net. It is
either self-regulating or non-regulated, depending on your viewpoint. No
specific agency monitors or administers the Internet as a whole. Generally,
you will find the Internet to be a tolerant, permissive environment that
comes closer to representing a sense of global community than almost any
social or cultural paradigm in existence today. Some politicians and bureaucrats
contemplate anarchy lurking in the libertarian recesses of Internet, and
are calling for monitoring the content of information being transmitted.
The twin issues of privacy and security over the Internet are being hotly
debated. Among other issues unresolved is whether information on the Internet
falls under the aegis of print media or broadcast (radio, television) media.
You are using the Internet to transfer information. Electronic mail (e-mail)
transfers messages between two parties asynchronously: You send a message
to a friend, and later she sends a reply. Although most e-mail activities
would not be considered real-time (immediate) communication, there are chat
capabilities that allow messaging in near real-time. The metaphor for chat
is "push to talk" as you would do when using a CB radio or walkie-talkie.
Recently, "push to talk" audio has become available to cybernauts
with audio-capable computers.
Newsgroups are similar to e-mail messaging, but newsgroups are not
point-to-point communications in the sense of e-mail. A newsgroup is organized
around a central theme, and readers contribute comments to the newsgroup.
Comments (and replies to comments) are then disseminated by news servers.
It's similar to the editorial page in your local newspaper. Some newsgroups
are edited (moderated in net parlance); others are not. Unmoderated
newsgroups are often forums for copyright infringement and illicit behavior.
Much information resides as data files stored on various machines around
the Internet. The amount of information is so great that you need to use
search engines to cruise the net looking for specific topics of interest.
Gopher and Archie are two kinds of software used for network
searches. Both of these packages are based on client/server models,
just like newsgroups. Your local computer runs the client software that
allows connections to remote server machines, usually mainframe computers.
Both involve protocols for how they operate.
Gopher was originally a menu-based system that allowed connections to remote
computers, text-based searching of documents on that computer, and retrieval
of documents to the local computer. Archie is a search engine that reaches
databases of information archived at many anonymous FTP sites on the Internet.
FTP (File Transfer Protocol) is specifically designed for file transfer.
Most Archie clients now allow for file retrieval as well as searches. Some
client software (Fetch on the Macintosh, for example) is optimized
for file transfer and lacks the search capability. One important difference
between Archie and Fetch is that Archie does not maintain a continuous link
to the remote computer as Fetch does. Archie connects only long enough to
search a database or download a file.
The World-Wide-Web (WWW) is part of Internet. It, too, uses the client/server
model. WWW servers provide information in hypertext. Hypertext contains
words or other objects that are links to other words, pictures, or even
other servers. Client computers use browser software that properly accesses
a WWW site, and then displays a hypertext document in the proper form. Most
WWW browsers now have graphical interfaces. Also, other Internet capabilities
such as Archie, Gopher, FTP, and e-mail may be run directly from a single
WWW browser.
Networks
Access to Internet requires a physical connection between your desktop computer
and another computer, usually a mainframe computer, that has either a direct
connection to Internet or a gateway to Internet. When computers are connected
by physical devices, such as wires (or, in some places, by infrared relays),
a network is formed. Not all networks are in the Internet domain.
Local area networks (LANs) provide communication between a cluster or a
small group of clusters of computers and other devices on the network. For
example, a single academic department on the UNCC campus may have 20 Macintosh
computers, three printers, and a shared hard disk array all connected to
a LocalTalk network. LocalTalk comprises a set of protocols and hardware
that enable all of the attached computers to share data and to print to
any of the three printers. It is even possible to install software on this
network that will provide e-mail within the LAN. Without a hardware router,
however, none of the computers on the LAN will be able to access the Internet
or even other services provided by the University's mainframe computers.
It is possible for any of the 20 computers to get on Internet via a telephone
connection to a commercial Internet provider. This procedure, which requires
a modem and software using either SLIP (Serial Line Internet Protocol) or
PPP (Point to Point Protocol), allows Internet access but at communication
rates limited by the speed of the modem. Home and many business users currently
can get Internet services only in this way. Soon, it is likely that the
company that provides cable TV to your home will also have Internet available.
Fortunately, most campus users do not have to worry about serial lines to
Internet. Most UNCC offices and dorms have fast Ethernet connections
that have ready Internet access. All that is required is the appropriate
client software.
Accounts, Access, and Addresses
You must have a user account to access most networks. LAN accounts are usually
assigned by a systems manager in the department. Computing Services assigns
accounts on campus mainframe computers. Faculty, staff, and students may
request accounts individually, and the accounts remain active as long as
you are associated with the university. Faculty may also request temporary
accounts for classes. The appropriate forms are available from Computing
Services.
At this point, it is useful to describe how each computer on the campus
Ethernet and, indeed, on Internet, is uniquely identified. Desktop computers
attached to Ethernet are each identified by an IP (Internet Protocol)
number. The IP number is an address; it is expressed as a group of four
numbers separated by periods. (Each period is referred to as a "dot"
when you are pronouncing an address.) None of the numbers can be greater
than 255. Each computer is also identified by a domain name. The
domain name has at least two words or numbers separated by periods. A typical
domain name has three words. The first word is the computer name, the second
is the domain name, and the third is the top-level domain name; you will
come across domain names with more than three words.
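The addressing rules above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and 192.0.2.10 is a placeholder address, not a real campus machine. Only email.uncc.edu comes from this document.

```python
import ipaddress

def is_valid_ip_number(text):
    """Return True if text is four dot-separated numbers, each 0 to 255."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

def split_domain_name(name):
    """Split a domain name into its dot-separated words, left to right."""
    return name.split(".")

print(is_valid_ip_number("192.0.2.10"))    # placeholder address
print(is_valid_ip_number("192.0.2.256"))   # 256 is out of range
print(split_domain_name("email.uncc.edu"))
```

For a three-word name, the list holds the computer name, the domain name, and the top-level domain name, in that order.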
IP numbers and domain names are registered with a domain name server
so that each address is unique. When a new address is registered, it
may take a few hours for the information to propagate to all name servers
on Internet. Individuals (people, that is) are assigned user names (or login
names) when their accounts are set up on a particular network. When you
send e-mail to someone, you must use their complete address in this way:
bighead@email.uncc.edu.
The name "email" in this address is actually a mail server in
the uncc domain. The "@" is a required separator between the user
name and domain name. User names can be fairly literal or a mixture of letters
and numbers depending on the naming convention used by the system administrator.
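As a small sketch of the address syntax just described, the following Python function separates the user name from the domain name at the "@". The function name split_address is an invention for this example; real mail software applies more rules than this.

```python
def split_address(address):
    """Split user@domain into (user name, domain name)."""
    user, separator, domain = address.partition("@")
    if not separator or not user or not domain:
        raise ValueError("expected an address of the form user@domain")
    return user, domain

user, domain = split_address("bighead@email.uncc.edu")
print(user)    # bighead
print(domain)  # email.uncc.edu
```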
Top-level domain names tell you something about the organization whose computer
you have accessed. The following table lists the common names and their
significance.
Top-level Domain Name   Meaning                    Example
com                     commercial organizations   microsoft.com
edu                     educational institutions   uncc.edu
gov                     government agencies        edcftp.cr.usgs.gov
mil                     military agencies          pentagon.army.mil
org                     nonprofit organizations    eff.org
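If you want to look up these meanings in a program, a minimal sketch might look like the following. The names TOP_LEVEL_DOMAINS and describe_host are assumptions made for this example, and the table covers only the five names listed here.

```python
TOP_LEVEL_DOMAINS = {
    "com": "commercial organizations",
    "edu": "educational institutions",
    "gov": "government agencies",
    "mil": "military agencies",
    "org": "nonprofit organizations",
}

def describe_host(domain_name):
    """Return the meaning of a host's top-level domain, or 'unknown'."""
    # The top-level domain is the last word of the name.
    top_level = domain_name.lower().rsplit(".", 1)[-1]
    return TOP_LEVEL_DOMAINS.get(top_level, "unknown")

for host in ("microsoft.com", "uncc.edu", "eff.org"):
    print(host, "->", describe_host(host))
```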
TELNET - Remote Login Procedures
Although software to access Internet is available for most desktop computer
operating systems, there are some instances when you will want to connect
directly to a remote computer. If you don't have Eudora, you can still conduct
e-mail by logging onto a mainframe and running one of the mailer applications
on that computer. After all, Eudora (or any other Internet mail application)
just provides a convenient interface that replaces many command line functions
with point-and-click functions.
A remote computer may be any computer on Internet with an IP number. One
way to connect to a remote computer is to use a "dumb" terminal
like those on which you use Aladdin in Adkins Library. A dumb terminal is
essentially a monitor with enough hardware to maintain a connection with
a mainframe computer; all applications that appear to run at the terminal
are actually being processed on the remote. In contrast, a desktop computer
is a stand-alone device that can run programs on its local processor. You
can connect to remote computers from your local computer, but the two computers
must both be running software that permits the connection. Furthermore,
you often will need an account (or a password) to connect to the remote
computer.
TELNET is a terminal emulation program that is used for remote logins to
UNIX computers. Actually, if you are already connected to a mainframe by
a terminal, you can issue TELNET commands from the terminal to connect to
any other computer with an IP number. Otherwise, run a local TELNET session
from your computer.
On the Macintosh, the most popular TELNET application is NCSA TELNET. (NCSA
is the National Center for Supercomputing Applications in Illinois). NCSA
TELNET may require configuring unless you are using a lab computer or Project
Desktop (DOS) has been installed on your computer by Computing Services.
Most of the configuration options in NCSA TELNET are self-explanatory. Remember
that TELNET emulates a terminal, so most of what you will see in your TELNET
session is a series of command lines.
Start the NCSA TELNET application and select "Open Connection"
from the File menu. You must indicate the domain name of the host computer
that you wish to connect to in the resulting dialog box. When you click
OK, TELNET will look for the host computer using a domain name server. If
no connection is made, check that you have entered the correct name. You
may also use the IP number.
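The lookup step can be imitated with Python's standard socket module. This is a sketch of the general idea, not a description of how NCSA TELNET works internally, and the function name look_up_ip_number is made up for this example.

```python
import socket

def look_up_ip_number(host):
    """Return the IP number for a domain name (or for an IP number as given)."""
    return socket.gethostbyname(host)

# An IP number passes through unchanged; a domain name needs a name server.
print(look_up_ip_number("127.0.0.1"))
```

Passing a domain name instead of an IP number will fail with an error if no name server can be reached or if the name is misspelled, which is the same situation as a TELNET connection that cannot be made.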
Electronic Mail
Electronic mail, or "e-mail", is a means of electronically sending
messages and other information from one computer to another. It is not necessary
that the two computers have a direct connection (although they may in some
LANs). Consider how you send a letter to someone by way of the U.S. Postal
Service (also known as "snail mail"). Your letter is sent, bearing
the proper address, to a local postal station which then sorts mail by ZIP
code. Before reaching its final destination, your letter may have passed
through several other stations. Finally, upon delivery into the recipient's
mailbox, your letter's passage is complete. In some cases, you may have
requested a return receipt to verify that the letter got to the proper person.
Electronic mail works in a similar, but digital, manner. You compose a message
on your local computer either within an e-mail application or in a word
processor. If you used a word processor, you probably will have to save
the file in an ASCII format. ASCII or "text" files use a special
binary code for representing standard alphanumeric characters and some other
special information like control codes. ASCII codes are identical across
most operating systems, so they are the standard for e-mail. Some e-mail
applications, however, are designed to work from within an active word processor
and will capture the active document for you.
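To see what "plain ASCII" means in practice, here is a short Python sketch. The function name is_plain_ascii is an assumption for this example; it simply checks that every character has a code in the standard ASCII range.

```python
def is_plain_ascii(text):
    """Return True if every character has an ASCII code (0 to 127)."""
    return all(ord(character) < 128 for character in text)

print(is_plain_ascii("Meet me at 3 pm."))
print(is_plain_ascii("Meet me at the caf\u00e9."))  # accented e is not ASCII
print([ord(character) for character in "Hi!"])      # [72, 105, 33]
```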
Before you send a message, you must enter a valid Internet e-mail address
into the program. Make sure that the address includes a mail server.
When you send the message, your own e-mail server will pass the message
to another, nearby server, and so on until the message reaches its destination.
You can read the path taken by a received message in the header of your
messages.
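The path is recorded in "Received" lines that each server adds to the top of the header. The following Python sketch reads those lines from a made-up message; the server names under example.net and example.org are hypothetical, and real headers are longer and vary from server to server.

```python
from email import message_from_string

# A made-up message; real headers are longer and vary between servers.
RAW_MESSAGE = """\
Received: from relay.example.net by email.uncc.edu; Mon, 1 Jan 1996 10:02:00 -0500
Received: from mail.example.org by relay.example.net; Mon, 1 Jan 1996 10:01:00 -0500
From: friend@mail.example.org
To: bighead@email.uncc.edu
Subject: Hello

See you next week.
"""

def message_path(raw_text):
    """Return the Received header lines, oldest hop first."""
    message = message_from_string(raw_text)
    hops = message.get_all("Received", [])
    # Each server adds its line at the top, so reverse to get travel order.
    return list(reversed(hops))

for hop in message_path(RAW_MESSAGE):
    print(hop)
```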
Privacy and Security
E-mail may not be as private or secure as snail mail. It is possible for
lurkers or hackers anywhere along an e-mail message path to intercept the
message and read it. Some corporate site e-mail systems may be considered
corporate property, including any messages sent within the system, and employees
in such systems should remember this before using the system for private
conversation.
In any event, it is not a good idea to send information that you wish to
remain private over e-mail. This certainly would include credit card numbers,
passwords, and personal identification numbers for your bank accounts. You
gain some security by sending encrypted messages, where you have encoded
the message with appropriate software before sending it. For this system
to work, the recipient must have the key to decode your message. Also, some
encryption programs may not use particularly secure or tough algorithms.
Furthermore, you usually will have to send the encrypted file as an attachment
rather than a message created within your e-mail program.
Sending and Receiving Messages
E-mail addresses are generally not case-sensitive, but you must use the proper syntax
of username@domain.top-domain. For example, my e-mail address at
UNCC is fgg00arb@email.uncc.edu. Most non-commercial Internet domains
use standard format usernames. Commercial information services that provide
subscribers with Internet e-mail access may have specific syntax requirements
for sending messages onto the net.
Finding Addresses with "Ph"
Acting like an electronic phonebook, a Ph server, if it is available in
your domain, allows you to look up mail and phone information at a remote
location, usually at the level of an institution or organization. E-mail
addresses are often included. To get the e-mail address of a user at another
location, you must know the correct remote domain address. In the previous
TELNET session, we were able to get the mailing address and office phone
number of the Chancellor.
Finding Addresses with "finger"
When a Ph Server is not available, it may be possible to find the e-mail
address of a remote contact by using the UNIX "Finger" command.
Finger accesses the user login file on a remote computer. The command may
be executed from several points. First, if you are logged onto a remote
UNIX computer with TELNET, you can issue the Finger command within the TELNET
session window. Eudora has a Finger/Ph window that allows you to type in
information for the Finger command. Finally, there are versions of the utility
available for desktop computer operating systems. The Macintosh version
is called, appropriately, "Finger"; it is available for FTP downloading
on most Macintosh software archives.
Mailing Lists (List Servers)
Another form of e-mail is the electronic mailing list. Mailing lists are
available by subscription. They allow for either public discussion of specific
topics or for mass mailings of information.
Subscribing to Mailing Lists
In general, you subscribe to a mailing list by sending an e-mail message
to the list administrator on the list server. The subscription process
is usually automated, most commonly by a program named ListServ. You
join a list by sending an e-mail message to the administrative address for
the list with a single statement - "subscribe (your name)" or
"subscribe (listname) (your name)". You can determine that a mailing
list is automated if the administrative Internet address for the list is
in the form listserv@domain.top-domain. Some lists are administrated
by an actual person, in which case you should send a polite one-sentence
request in the body of your message to be added to the list.
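A subscription request to an automated list can be sketched in Python as follows. Everything specific here is hypothetical: the list name, the subscriber, and both addresses are placeholders, and the exact command a given list server expects may differ, so check the list's own instructions.

```python
from email.message import EmailMessage

def build_subscribe_request(list_name, your_name, server_address, your_address):
    """Build a one-line subscription request for an automated list server."""
    request = EmailMessage()
    request["From"] = your_address
    request["To"] = server_address
    # Automated servers read the body, so it holds only the command.
    request.set_content(f"subscribe {list_name} {your_name}")
    return request

request = build_subscribe_request(
    "example-l",                   # hypothetical list name
    "Pat Smith",                   # hypothetical subscriber
    "listserv@lists.example.edu",  # hypothetical server address
    "psmith@email.uncc.edu",       # hypothetical user name
)
print(request)
```

The sketch only builds the message; sending it is left to your usual e-mail application.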
Reading mailing list activities is done with an e-mail application like
Eudora. The messages appear in your mailbox like any other e-mail. For lists
that have a lot of activity, you may want to set the server to send information
in digest mode. A digest is created by the server every few days. The digest
is usually sent in one or two parts; the leading text in the first part
is a serial list of messages in that particular digest described by subject
line. To read a given digest message within a Eudora session you can issue
a "Find" command on the subject text.
Lists can really accumulate if you subscribe to several lists. If you go
on vacation, make certain that your mail server does not automatically respond
back to the list that either your "mailbox is full" or that you
are "on vacation." The automated response may bounce to everyone
subscribed to that mailing list!
Network Newsgroups
Network newsgroups are part of the USENET, a collection of mainly UNIX-based
computers among which information is shuttled by a communications standard
called UUCP (UNIX to UNIX Copy Protocol). Internet sites access newsgroups
on USENET by means of a network news server. The desktop application that
you use to read newsgroups must be compatible with the server software.
At UNCC, the news server is news.uncc.edu.
In the spirit of Internet, network newsgroups have sequential "dot"
addressing. Unlike the Internet domain system, however, newsgroup names
start with the most general designation. For example, the newsgroup sci.geo.earthquake
is in the top-level "sci" (science) category and the "geo" (geology,
geophysics, geoscience, geography, etc.) subcategory, and is specifically concerned
with earthquakes. The newsgroup comp.sys.mac.graphics is about graphics
applications for the Macintosh computer. Most newsgroups periodically
post FAQs (Frequently Asked Questions) and answers describing the group.
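The difference in ordering between newsgroup names and domain names can be shown with a brief Python sketch. The two function names are made up for this example.

```python
def newsgroup_levels(name):
    """Split a newsgroup name into levels, most general first."""
    return name.split(".")

def domain_levels(name):
    """Split a domain name into levels, most general first."""
    # Domain names put the most general word last, so reverse the order.
    return list(reversed(name.split(".")))

print(newsgroup_levels("sci.geo.earthquake"))  # ['sci', 'geo', 'earthquake']
print(domain_levels("email.uncc.edu"))         # ['edu', 'uncc', 'email']
```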
There are several good, free newsreaders for the Macintosh. Remember that
you can read news with a command-line interface by TELNETing to a UNIX machine,
but why do that when the graphical interface is available? The three most
popular programs are Nuntius, NewsWatcher, and InterNews.
InterNews lets you create subscription files with newsgroups sorted however
you want. In the following example, to add the sci.virtual-worlds newsgroup
to the Fractals etc. subscription file, you just click and drag the
newsgroup onto the Fractals etc. icon. This screen image also shows
how individual articles in a newsgroup are listed in a separate, scrolling
window. Unread newsgroups and articles have solid bullets next to them;
you can mark a newsgroup read by selecting its name in the Newsgroup window
and typing command-R.
Comparison to E-mail
You must use caution if you decide to reply to a newsgroup posting; unlike
e-mail, any newsgroup response gets posted to the publicly readable group.
Thousands of readers may read your article, so make sure that you don't
write something that you will regret! When you choose "Compose:New"
from the Internews menubar, NewsWatcher warns you the first time you reply
to a posting in a session.
FTP - File Transfer
One part of the example TELNET session was a file transfer from the ftp.merit.edu
site to my local space on the unccvm.uncc.edu machine. This was
made possible by an ftp server on the remote machine. In order to get the
file onto my desktop computer, I must do a remote logon onto the campus
computer, then put the file onto the desktop. FTP clients are available
for desktop computers. These programs bypass multiple transfers and the
clumsy command line interface to make getting and sending files over the
Internet easy.
Perhaps the most popular (and free!) FTP client for the Macintosh OS is
Fetch. Here is a series of screenshots using Fetch to get the same
world map we earlier transferred via TELNET.
Choose "File:New Connection" (or type "command-N") to
open a new connection dialog. The ftp.merit.edu server is an anonymous
server. This means that anyone on Internet may download files from the remote
computer; uploading files to anonymous servers is usually restricted or
prohibited. Enter a User ID of "anonymous" for such servers. Anonymous
servers don't really require passwords, but it is proper net manners to
enter your e-mail address as a password. Some anonymous ftp servers keep
a log of users.
We know from the TELNET session that the map file is in the "maps"
directory. When you know the directory that you want, you can enter the
path using "/" to separate sub directories. If you don't know
the directory, leave this field blank and you will be logged into the root
directory of the server. Fetch will display a list of directories when the
connection is made. Look for a directory named "pub" and get its
listing by double-clicking the directory name. Many anonymous ftp servers
put publicly accessible files in the "pub" directory.
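The same steps (anonymous login, change to a directory, get a file) can be sketched with Python's standard ftplib module. This is an illustration under assumptions, not a description of how Fetch works: the function names are made up, and the file name world.ps is hypothetical because this document does not give the real one.

```python
import ftplib
import posixpath

def split_remote_path(path):
    """Split 'dir/subdir/name' into (directory, file name)."""
    return posixpath.split(path)

def get_anonymous_file(host, remote_path, email_address, local_name):
    """Download one file from an anonymous FTP server."""
    directory, file_name = split_remote_path(remote_path)
    with ftplib.FTP(host) as ftp:
        # User ID "anonymous", with your e-mail address as the password.
        ftp.login(user="anonymous", passwd=email_address)
        if directory:
            ftp.cwd(directory)
        with open(local_name, "wb") as local_file:
            ftp.retrbinary("RETR " + file_name, local_file.write)

# Path handling only; no network connection is made here.
print(split_remote_path("maps/world.ps"))  # hypothetical file name
```

A call such as get_anonymous_file("ftp.merit.edu", "maps/world.ps", "bighead@email.uncc.edu", "world.ps") would attempt the transfer, provided the server is still reachable and the file exists under that name.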
Once the connection is made, Fetch displays another window showing the contents
of the directory "maps" with size and date information about the
contents. To download the file to your local computer, select the file with
a single click, then press the "Get File" button. Fetch will display
information about the file during and after the download procedure. Note
that Fetch can be configured to save transferred files into a folder that
you designate. Also, remote files reside on remote servers in either binary
or Text (ASCII) format. Fetch will attempt to recognize the file type. Many
files are also saved in "binhex" format, which is used for network
transfers, and most are compressed in one of several schemes to save bandwidth
(and so reduce the time needed to transfer the file). Binhexed files have a
".hqx" suffix, as in "file.hqx". Macintosh binary files
may be encoded with one of several compression formats. The most common
are "sit" (Stuffit archive) and "cpt" (Compactor archive).
You might see a file named "file.sit.hqx." Fetch can be configured
to automatically unbinhex and unstuff a transferred file if you have the
proper helper applications available to your local computer.
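The way suffixes stack up on a file name can be illustrated with a short Python sketch. It only reads the name; it does not decode or expand anything, and the names SUFFIXES and decoding_steps are inventions for this example.

```python
# Suffixes named in the text, with what each one means.
SUFFIXES = {
    "hqx": "binhex encoding",
    "sit": "Stuffit archive",
    "cpt": "Compactor archive",
}

def decoding_steps(file_name):
    """List the decoding steps for a file name, outermost suffix first."""
    parts = file_name.lower().split(".")
    steps = []
    # Peel known suffixes off the right-hand end of the name.
    while len(parts) > 1 and parts[-1] in SUFFIXES:
        steps.append(SUFFIXES[parts.pop()])
    return steps

print(decoding_steps("file.sit.hqx"))  # ['binhex encoding', 'Stuffit archive']
print(decoding_steps("file.ps"))       # []
```

The order of the list is the order in which the helper applications would have to work: remove the binhex encoding first, then expand the archive.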
By the way, the file we downloaded from merit was a plain ASCII file
with no compression. It has a ".ps" suffix that indicates that
it is a PostScript file. PostScript is a page description language used by the
popular graphics program Adobe Illustrator among others. PostScript readers
and interpreters are found on most operating systems.
Searching the Internet for Information
Gopher and Archie are two information search-and-retrieval tools that grow
more useful as the volume of information on the Internet expands. As WWW
browsers become more sophisticated, however, some of the capabilities in
these tools are being integrated either into the browsers or are appearing
as linked engines.
TurboGopher is a graphical Gopher client for the Macintosh computer.
It was created by workers at the University of Minnesota, where Gopher originated.
The screenshots below are from a test version called TurboGopher VR. The
test version has all of the features of the current standard release of
TurboGopher, but the test includes a three-dimensional navigation aid. The
illustrations only show standard features of the software.
You can define a "home gopher" within TurboGopher's configuration
settings. The home gopher should be the Internet address of a Gopher server.
The default is the University of Minnesota. When you start the application,
a Home Gopher window opens. Within the window are a series of folder or
file icons. The folder icons represent directories on the server.
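Behind the icons, a Gopher server sends its menu as lines of tab-separated text, with the first character of each line giving the item type ("1" for a directory, "0" for a text file in the Gopher protocol). The following Python sketch parses such lines. It is not taken from TurboGopher, and the sample lines and the server name gopher.example.edu are hypothetical.

```python
from collections import namedtuple

MenuItem = namedtuple("MenuItem", "kind title selector host port")

# Two of the item types defined by the Gopher protocol.
ITEM_KINDS = {"0": "file", "1": "directory"}

def parse_menu_line(line):
    """Parse one tab-separated line of a Gopher menu."""
    item_type, rest = line[0], line[1:]
    title, selector, host, port = rest.rstrip("\r\n").split("\t")[:4]
    kind = ITEM_KINDS.get(item_type, "other")
    return MenuItem(kind, title, selector, host, int(port))

# Hypothetical menu lines, not captured from a real server.
sample = [
    "1Campus Information\t/campus\tgopher.example.edu\t70\r\n",
    "0About This Server\t/about.txt\tgopher.example.edu\t70\r\n",
]
for line in sample:
    item = parse_menu_line(line)
    print(item.kind, "-", item.title)
```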
Veronica is a search tool for GopherSpace. Archie is a search tool for FTPSpace.
Archie queries FTP servers for files by name. Archie is not available on
the unccvm.uncc.edu site. There are a number of remote Archie servers
that can be reached by TELNET, however.
World Wide Web (WWW)
The World Wide Web is probably the hottest part of Internet today. When
someone is "surfing the 'net", he or she is usually on the WWW.
Graphical interfaces in Web client software reduce UNIX command lines to
point-and-click actions. With minor inconvenience, anyone can browse Web
documents and even set up a dedicated Web page.
A fundamental difference between the WWW and other Internet services is
that Web documents are based on the concept of hypertext. Gopher
servers, for example, are based on hierarchically aligned menus. Hypertext
documents in the Web are non-linear. Each Web page may contain several hot
links. A link usually contains a URL (Uniform Resource Locator), which
is a pointer to other Internet resources.
The URL may simply point to another part of the current Web page, or it
may point to another Web site (also known as a home page). In that case,
the definition of the URL will be of the form:
http://sunsite.unc.edu/Dave/drfun.html
where "http" stands for Hypertext Transfer Protocol and "html"
is HyperText Markup Language. A link to a Gopher server begins with gopher://
and a link to an FTP server begins with ftp://. Many URLs start
with "www." because the document that the URL points to is really
just an ASCII file perhaps among many others on the server. The www prefix
helps anyone browsing the directory to understand that this document is
for reading by WWW browsers.
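The parts of a URL can be pulled apart with Python's standard urllib.parse module, as in this brief sketch. The function name describe_url is made up for this example.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Return the protocol, server, and document path named by a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path

print(describe_url("http://sunsite.unc.edu/Dave/drfun.html"))
# ('http', 'sunsite.unc.edu', '/Dave/drfun.html')
```

The first item tells a browser which protocol to use, so a URL that begins with gopher:// or ftp:// is split the same way.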
If you used early word-processing programs on microcomputers, you will understand
the html programming syntax. Content within a Web page document is marked
with html tags so that a Web browser can correctly display the text as a
header, body text, emphasized text, a URL, etc. Web pages are easily created
in word processors, and you can see them in all their graphical beauty by
opening the file from within a Web browser. Letting other Web browsers see
your creation requires that you set up a WWW server, however.
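To give a feel for html tags, the following Python sketch reads a tiny, hypothetical Web page and lists the tags and link targets it finds. The page content and the class name TagLister are made up for this example; a real browser does far more than this.

```python
from html.parser import HTMLParser

# A hypothetical page with a header, body text, emphasis, and one link.
PAGE = """<html><body>
<h1>My Home Page</h1>
<p>Welcome to my <em>first</em> Web page.</p>
<p>Here is <a href="http://sunsite.unc.edu/Dave/drfun.html">a link</a>.</p>
</body></html>"""

class TagLister(HTMLParser):
    """Collect the opening tags and link targets found in a page."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            # The href attribute of an "a" tag holds the URL of the link.
            self.links.extend(value for name, value in attrs if name == "href")

lister = TagLister()
lister.feed(PAGE)
print(lister.tags)
print(lister.links)
```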
Web Servers and Web Clients
The WWW is a subset of Internet determined by the installation of Web servers.
So, in the same sense that not all Internet sites have Gopher servers, not
all sites have Web servers. Any computer on Ethernet at UNCC can be set
up as a WWW server with the appropriate software. In fact, other services
such as FTP and Gopher can now be placed on individual desktop computers
alongside the Web. The software to accomplish these tasks is available for
free or for nominal cost (compared to other kinds of software).
WWW browser development occurred on several operating system platforms simultaneously.
This is an advantage for users, because it means that the appearance and
function of a particular browser is nearly identical regardless of the client
OS. The first widely used graphical Web browser, NCSA Mosaic, is available for all common
platforms (for free!). So is the most widely-used browser, Netscape Navigator
(it's free, too!). A recent trend is browsers with built-in Web page editors
so that you can make and see a document as it is created.
Both NCSA Mosaic and Netscape Navigator are full-featured graphical browsers.
Each allows you to save URLs (in a "hotlist" or a "bookmark"
respectively) for future reference. Many users prefer Netscape Navigator
because it displays graphics faster. The developers of Netscape Navigator
have implemented several html features ("enhancements") that are
not yet part of the accepted html standards. For example, Navigator is capable
of displaying interlaced graphics. When an image is displayed by the browser,
it is painted in a series of sweeps, rather than in a sequence of adjacent
lines as done by Mosaic. The effect is that the graphic appears to arrive
on your computer monitor faster (but it doesn't in actuality).
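The "series of sweeps" can be made concrete with a short Python sketch. The text does not say which image format is involved, so this example assumes the four-pass scheme used by interlaced GIF files; the function name interlaced_row_order is made up.

```python
def interlaced_row_order(height):
    """Return image row numbers in the order a four-pass interlace sends them."""
    # (first row, step) for each pass of GIF-style interlacing (an assumption).
    passes = [(0, 8), (4, 8), (2, 4), (1, 2)]
    order = []
    for start, step in passes:
        order.extend(range(start, height, step))
    return order

print(interlaced_row_order(8))  # [0, 4, 2, 6, 1, 3, 5, 7]
```

Every row is still sent exactly once, which is why the image only appears to arrive faster: a coarse version of the whole picture shows up after the first pass, and later passes fill in the gaps.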
Searches
Searching for information on the Internet can now be done largely from within
the World Wide Web. Because browsers can access anonymous FTP, Gopher, and
other non-Web information sources, you may be able to abandon (or at least
put in the closet) specialized clients for these services.
Engines specifically designed to search Web pages are also available by
clicking on the Netscape Navigator "Net Search" button. That click
will bring up a page of WWW search engines recommended by Netscape Communications,
the creator of Navigator. The Lycos engine from Carnegie Mellon University is
the most comprehensive (but often inaccessible) of those available. You
can limit the number of hits and do Boolean searches from Lycos. There
are also many indices for various kinds of information on the Web (and other
parts of Internet). Yahoo is one of the more comprehensive indices.
Web pages that are indices to indices are beginning to appear on the WWW.