This page last updated on: 05/18/98
(Minor typo updated on 12/21/99. Reading over this article, a lot has changed in the almost four years since I first wrote it, which is not surprising at all. If I get a little time I'll add some notes on DSL, cable modems, firewalls, and some of the protocols that have become common in the last few years.)
(Diagram added on 3/25/00, which was created and provided to me by Michael Dawson - thanks Mike!)
HTTP, TCP/IP, PPP, HTML, URLs ... Are you new to this emerging frontier of the Internet but starting to suffer from acronym overload? While new acronyms and abbreviations seem to be invented every day, there's a common set of underlying mechanisms being used on the Internet, and I'm going to explain these in layperson's terms. This article is aimed at anyone wanting a better understanding of how Internet applications work and communicate with each other. It always helps to know the foundation of a particular technology, no matter how experienced or how new you are.
In this article I do assume a beginning familiarity with using Internet applications such as Web browsers and e-mail programs. I also assume a small amount of computer and technical knowledge, but even if you're new to using computers as well as being new to the Internet, you should be able to follow along with maybe an occasional look in a technical dictionary.
To say it in a slightly different way, a network protocol (including all of the Internet protocols) is a set of rules that describes how computer systems communicate with each other at the bit and byte level. Network protocols are layered on top of each other, with each layer providing additional capabilities while using the facilities provided by the layer below it.
The common term for a network location is 'address', and each system on the Internet has an address. This address is called an IP address, and there are two formats for an IP address. Internally, each computer system uses an IP address that is composed of four numbers (each from 0 to 255), usually written for humans with dots between the numbers. An example IP numeric address is '198.137.231.1' (which happens to be the main IP address for NW Nexus in Bellevue, WA). However, since it's easier for humans to remember names instead of numbers, most IP addresses have corresponding English-like names, also separated with dots. The previous address written as a name is 'halcyon.com'. Scattered throughout the Internet are systems with the responsibility of translating Internet name addresses into the IP address numeric form. These systems are called 'name servers'.
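To make the 'four numbers' form concrete: each of the four numbers fits in one byte, so the whole address is really a single 32-bit value. The sketch below is only an illustration; the helper name `dotted_to_int` is my own, and the cross-check uses Python's standard `ipaddress` module.

```python
import ipaddress

def dotted_to_int(dotted: str) -> int:
    """Pack a dotted address such as '198.137.231.1' into one 32-bit integer."""
    parts = [int(p) for p in dotted.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a four-part dotted address: {dotted!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part   # each number fills the next 8 bits
    return value

address = "198.137.231.1"             # the example address from the text
as_int = dotted_to_int(address)

# The standard library agrees with the hand-rolled version ...
assert as_int == int(ipaddress.IPv4Address(address))
# ... and the conversion is reversible.
assert str(ipaddress.IPv4Address(as_int)) == address
print(address, "->", as_int)
```

The dots are purely for human readers; on the wire, only the four bytes are sent.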
In general, it is better to use an Internet name address rather than the IP numeric address. This is because IP numeric addresses can sometimes change for a given location, and the change will be transparent if you are using the Internet name address rather than the IP numeric address. (The name servers have to be updated, of course.) Occasionally you do need to use the numeric form of an Internet address, and most Internet applications allow you to enter either format.
Another term used in conjunction with Internet name addresses is 'host name', because every Internet address must correspond to a computer system (a 'host') somewhere on the Internet. The name servers that provide IP name to number translation are part of the 'Domain Name System' (DNS), and are often called 'DNS servers' or 'domain name servers'.
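Here is a minimal sketch of how an application might accept either address format, assuming Python's standard `socket` module. The function name `resolve` is my own. A numeric address needs no name server at all, while a name is handed to the system's resolver.

```python
import socket

def resolve(host: str) -> str:
    """Return the numeric IPv4 address for a host name or a dotted address."""
    # gethostbyname() asks the system resolver (and, through it, a name
    # server). A string that is already a dotted IPv4 address is returned
    # unchanged, with no lookup needed.
    return socket.gethostbyname(host)

print(resolve("198.137.231.1"))   # already numeric: comes back as-is

try:
    print(resolve("localhost"))   # normally answered by the local machine itself
except socket.gaierror as err:
    print("lookup failed:", err)
```

Looking up a real Internet name works the same way, but the answer depends on what the name servers currently say, which is exactly why the name form survives an address change.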
TCP (which is an abbreviation for 'Transmission Control Protocol') is very common on the Internet, and is almost always mentioned together with IP, making the acronym TCP/IP (TCP running on top of IP).
Some applications use a different protocol running on top of IP called UDP ('User Datagram Protocol'). UDP sends data one chunk at a time (called a 'datagram') to the other system and doesn't provide a virtual connection like TCP does. UDP also doesn't provide the same guarantees that TCP does, which means that datagrams may be lost or arrive out of sequence. Each received datagram is checked for internal integrity (as in TCP), but a corrupted datagram is simply dropped, rather than re-transmitted as TCP would do.
You might be wondering why UDP is used instead of TCP since UDP is not as reliable. To provide the extra guarantees, TCP has a lot of overhead compared to UDP, which makes TCP slower than UDP. For applications where performance is more important than reliability, UDP makes more sense. Some examples include audio and video streaming over the Internet, and Internet phone applications.
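The datagram idea can be sketched with Python's standard `socket` module. Everything here runs on the local machine (the loopback address 127.0.0.1), and the three message contents are made up for illustration. Note that the receiver uses a timeout, because nothing in UDP promises that a datagram will ever arrive.

```python
import socket

LOOPBACK = "127.0.0.1"

# Receiver: bind to a free port chosen by the operating system (port 0).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind((LOOPBACK, 0))
receiver.settimeout(2.0)            # don't wait forever: UDP may lose data
port = receiver.getsockname()[1]

# Sender: no connection is set up; each sendto() is one datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for chunk in (b"first", b"second", b"third"):
    sender.sendto(chunk, (LOOPBACK, port))
sender.close()

received = []
try:
    while len(received) < 3:
        data, _addr = receiver.recvfrom(1024)   # one call, one datagram
        received.append(data)
except socket.timeout:
    pass                            # whatever was lost stays lost
finally:
    receiver.close()

print(received)
```

On a single machine the datagrams will normally all arrive, and in order. Across the Internet the application itself has to cope with gaps and reordering, which is acceptable for audio or video where a late chunk is useless anyway.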
More typical, however, is that one application will initiate a request, and the other application will respond. The initiating application is called a client, and the responding application is called a server. Usually a server application handles multiple client connections at the same time, and runs on a more powerful system hooked up to the Internet. Client / server communication is at the heart of most Internet applications.
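The request / response pattern can be sketched in a few lines, again assuming Python's standard `socket` module and the local loopback address. The function names `serve_once` and `ask`, and the "You said:" reply format, are my own inventions for illustration; a real server would loop and handle many clients.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Server side: accept one client, read its request, send a reply."""
    conn, _addr = listener.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"You said: " + request)

def ask(port: int, message: bytes) -> bytes:
    """Client side: connect, send a request, and collect the whole reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)   # tell the server we are done sending
        reply = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:                 # empty read: the server closed
                break
            reply += chunk
        return reply

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))           # port 0: let the OS pick a free port
listener.listen()
port = listener.getsockname()[1]

server_thread = threading.Thread(target=serve_once, args=(listener,), daemon=True)
server_thread.start()
reply = ask(port, b"hello")               # the client initiates, the server responds
server_thread.join(timeout=2.0)
listener.close()
print(reply)
```

The server sits and waits; nothing happens until a client starts the conversation. That asymmetry is the essence of the client / server model.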
Without looking at the protocol details yet, here's a brief look at typical client / server relationships on the Internet:
SLIP and PPP both allow IP data to be sent over dial-up lines. SLIP is an abbreviation for 'Serial Line IP' and PPP is short for 'Point-to-Point Protocol'. Both take IP data and package it up so that it can be sent over modem dial-up lines. PPP is newer and is generally considered better than SLIP, although many Internet providers continue to support SLIP dial-up access.
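To show what 'packaging up' means, here is a sketch of SLIP-style framing as I understand it from RFC 1055: a special END byte marks packet boundaries, and any END or ESC byte inside the data is replaced with a two-byte escape sequence. The function names are my own, the sample bytes are arbitrary rather than a real IP packet, and PPP framing (which is more elaborate) is not shown.

```python
from typing import List

# Special byte values used by SLIP framing (per RFC 1055).
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD

def slip_encode(packet: bytes) -> bytes:
    """Wrap one packet so its boundaries survive a raw serial line."""
    out = bytearray([END])                 # leading END flushes any line noise
    for byte in packet:
        if byte == END:
            out += bytes([ESC, ESC_END])   # data byte that looks like END
        elif byte == ESC:
            out += bytes([ESC, ESC_ESC])   # data byte that looks like ESC
        else:
            out.append(byte)
    out.append(END)                        # trailing END closes the packet
    return bytes(out)

def slip_decode(stream: bytes) -> List[bytes]:
    """Split a received byte stream back into the original packets."""
    packets, current, escaped = [], bytearray(), False
    for byte in stream:
        if escaped:
            current.append({ESC_END: END, ESC_ESC: ESC}.get(byte, byte))
            escaped = False
        elif byte == ESC:
            escaped = True
        elif byte == END:
            if current:                    # ignore empty frames between ENDs
                packets.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    return packets

packet = bytes([0x45, END, ESC, 0x00])     # arbitrary data containing both special bytes
line_bytes = slip_encode(packet)
assert slip_decode(line_bytes) == [packet]
print(line_bytes.hex())
```

The modem line just carries a stream of bytes; the framing is what lets the receiver tell where one IP packet stops and the next begins.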
While connected to an ISP using SLIP or PPP, your system becomes another location on the Internet, with its own IP address. Your account with the ISP may assign you a permanent, fixed IP address and name, or it may provide what is called a 'dynamic' IP address. Since only a subset of an ISP's dial-up lines are in use at any given time, the provider may assign an IP number (and typically also an IP name) from a pool of available addresses each time you connect.
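A toy model may help show why a dynamic address can differ from one call to the next. The class `AddressPool`, the caller names, and the addresses (taken from the 192.0.2.x range reserved for documentation) are all hypothetical; real ISPs use their own assignment schemes.

```python
class AddressPool:
    """Toy model of an ISP handing out addresses from a shared pool."""

    def __init__(self, addresses):
        self.free = list(addresses)   # addresses nobody is using right now
        self.in_use = {}              # caller name -> address currently held

    def dial_in(self, caller):
        if not self.free:
            raise RuntimeError("all addresses are in use (busy signal)")
        address = self.free.pop(0)
        self.in_use[caller] = address
        return address

    def hang_up(self, caller):
        # The address goes back into the pool for the next caller.
        self.free.append(self.in_use.pop(caller))

pool = AddressPool(["192.0.2.10", "192.0.2.11"])

first = pool.dial_in("alice")
pool.dial_in("bob")
pool.hang_up("alice")
carol = pool.dial_in("carol")    # carol gets the address alice just gave up
pool.hang_up("bob")
second = pool.dial_in("alice")   # alice is back, but on a different address

print(first, carol, second)
```

With only two addresses the pool serves three callers, as long as they are not all connected at once.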
Most Winsock implementations provide PPP capabilities (and SLIP) as well as LAN connectivity. When using a Winsock connection (whether SLIP, PPP, Ethernet, or some other access type), the PC is a true Internet system (sometimes called a 'node'), with all the potential of other Internet systems. This is in contrast to dial-up type connections and accounts that provide only terminal or character-mode access (sometimes called 'shell accounts'). These types of connections don't use the Winsock interface, and the PC is then not a true Internet node. (Plus you also don't get the Windows graphical interface while using 'shell' access.)
Here's a link to an excellent collection of Winsock applications: The Ultimate Collection of Winsock Software
To the IP protocol, ISDN is simply a different transport for the IP messages. Many Winsock implementations provide PPP over ISDN lines, and it will be an integral part of future releases of Win95.
Here's a link to an excellent ISDN resource: Dan Kegel's ISDN Page
When an Internet site makes files available to the general public, this is called 'anonymous' FTP. The user logs in with the name 'anonymous' and no real password is needed, although the user's e-mail address is typically requested in its place. Some sites have confidential files or directories, and an FTP login and password are needed to download or upload.
Telnet is a way to remotely log in to another system on the Internet. A telnet server must be running on the remote system, and a telnet client application is run on the local system. When you are logged in to a system using telnet, it is as if you were logged in locally and using the operating system command line interface on the telnet server system. Typical operating systems for telnet servers are Unix, Windows NT, and VMS.
HyperText Markup Language (HTML) is not an Internet protocol - it is the internal format of Web pages. HTML consists of a set of tags that are embedded inside a Web page to control its appearance and layout, and to define links to other Web pages.
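To show what 'tags embedded inside a Web page' look like, here is a small sketch using Python's standard `html.parser` module. The sample page and the class name `LinkCollector` are made up for illustration; the first link points at a placeholder host.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Walk through a page's tags and remember where its links point."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # A link is an <a> tag; its target is stored in the 'href' attribute.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A tiny, hypothetical Web page: text wrapped in layout tags plus two links.
page = """<html><body>
<h1>My Page</h1>
<p>Some <b>bold</b> text and a <a href="http://www.example.com/page2.html">link</a>.</p>
<p>Files: <a href="ftp://butler.hpl.hp.com/stl/">download area</a></p>
</body></html>"""

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

A Web browser does the same kind of walk through the tags, except that it draws the result on the screen instead of printing a list.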
Support for FTP, telnet, SMTP, and many other Internet protocols is built into Web browsers. FTP, for example, is used to download application executables as well as other files (when you are asked for a 'save file' location, FTP is often being used to transfer the file, although files can also be transferred with HTTP).
The protocol is first, followed by a colon and two slashes. In this case it is using the HTTP protocol, which means a Web page is at that location. The next portion is the Internet host name www.unitedmedia.com. Somewhere on the Internet is a system with that name, with a corresponding IP numeric address provided by the Internet DNS service. The last portion is the directory location, in this case /comics/dilbert. Since a Web server will typically have many different Web pages in multiple directories, the URL provides a way of specifying where to look.
Another example URL: ftp://butler.hpl.hp.com/stl/
This specifies using the FTP protocol to go to a system named butler.hpl.hp.com, then to the stl directory on that system. A listing of the files in that directory will be displayed, and the appropriate files can be downloaded with the FTP protocol.
E-mail is handled with the mailto prefix: mailto:cliffg@halcyon.com
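The parts of a URL described above can be pulled apart with Python's standard `urllib.parse` module. This is only a sketch: the first URL is reassembled from the three parts named in the text, so its exact original form is an assumption, while the other two are copied from the examples.

```python
from urllib.parse import urlsplit

urls = [
    "http://www.unitedmedia.com/comics/dilbert",  # reassembled from the parts described above
    "ftp://butler.hpl.hp.com/stl/",
    "mailto:cliffg@halcyon.com",
]

for url in urls:
    parts = urlsplit(url)
    # scheme   = the protocol before the colon
    # hostname = the Internet host name (None when the URL has no '//' part)
    # path     = the directory location, or the address for mailto
    print(f"{parts.scheme:7} host={parts.hostname}  path={parts.path}")
```

Notice that the mailto form has no two slashes and no host portion of its own; everything after the colon is the e-mail address.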
An Internet e-mail address is composed of two parts - the user name, and the server location. For example, my Internet e-mail address is: cliffg@halcyon.com
I have an account and user name cliffg with NW Nexus, and their e-mail system name is halcyon.com.
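As a tiny sketch of the two-part structure, an e-mail address can be split at the '@' sign. The function name `split_address` is my own, and real mail software checks addresses far more carefully than this.

```python
def split_address(address: str):
    """Split 'user@server' into its user name and server location."""
    user, at_sign, server = address.rpartition("@")
    if not at_sign or not user or not server:
        raise ValueError(f"not a user@server address: {address!r}")
    return user, server

user, server = split_address("cliffg@halcyon.com")
print("user name:", user)
print("server location:", server)
```

The server part is an ordinary Internet name, so it is translated to a numeric IP address by the name servers just like any other host name.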
SMTP relies on having servers running at both sites (source and destination). If you're using PPP or SLIP to connect to the Internet, your system is typically not connected all the time, and for many users it doesn't have a fixed name (the name is dynamically assigned from a pool of names and addresses). In this case the e-mail is stored on the ISP e-mail server, and after logging in, the e-mail client makes a special connection to get the waiting e-mail messages.
E-mail systems on a LAN sometimes don't use the SMTP protocol (e.g. cc:Mail). In this case a translation is made between the two e-mail protocols so that e-mail can be exchanged. This is commonly called an e-mail gateway.
Internet Relay Chat (IRC) is a text-based chat mechanism that runs over the Internet. IRC clients provide the user interface for typing, while IRC servers pass the information back and forth, as well as organize the channels that are used for chatting.
There are other Internet application protocols in use, with the same underlying client / server model of communication.
Some client applications use only one high-level protocol, such as an FTP client, while others provide multi-protocol access (for example, most Web browsers support many of the high-level Internet protocols). Each high-level Internet protocol has a server that handles requests from the client application.
The most dramatic example is the global set of servers providing WWW access to Web browsers everywhere. Using the HTTP protocol, formatted text and graphics are delivered from a Web (HTTP) server to Web clients (browsers), and hypertext links allow quick and easy access to other Web servers. The Java and JavaScript languages allow interactive applications to be written which reside on a Web server and run within a Web browser as needed.
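To tie the layers together, here is a sketch of one HTTP exchange carried over a plain TCP connection, using Python's standard `http.server` and `socket` modules. It all runs on the local machine. The handler class `HelloHandler` and the page it returns are made up for illustration, and the requested path simply reuses the directory from the earlier URL example.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """A toy Web server: answers every GET with a one-line HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello from " + self.path.encode("ascii") + b"</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                                   # keep the example quiet

server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: OS picks a free port
port = server.server_address[1]
worker = threading.Thread(target=server.handle_request, daemon=True)
worker.start()                                 # serve exactly one request

# The client side: HTTP is just lines of text sent over a TCP connection.
request = (
    "GET /comics/dilbert HTTP/1.0\r\n"
    "Host: 127.0.0.1\r\n"
    "\r\n"
).encode("ascii")

with socket.create_connection(("127.0.0.1", port), timeout=2.0) as client:
    client.sendall(request)
    response = b""
    while True:
        chunk = client.recv(4096)
        if not chunk:                          # server closed: response complete
            break
        response += chunk

worker.join(timeout=2.0)
server.server_close()

headers, _, body = response.partition(b"\r\n\r\n")
status_line = headers.split(b"\r\n")[0]
print(status_line.decode("ascii"))
print(body.decode("ascii"))
```

The browser's job is the client half of this exchange, repeated for every page and image, followed by interpreting the HTML that comes back.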
Commercial services such as CompuServe and America Online have internal networks that are different from the Internet. Increasingly, however, they are offering network gateways between their internal networks and the Internet, and providing software that allows users to access either one.
This is an exciting time in the world of global communications, and my hope is that this article has helped explain and open up some of the mysteries of Internet network protocols.