C# TCP/IP Socket Programming: 2012

Wednesday, October 31, 2012

Network Protocols

We shall look into the functionality and purpose of the protocols of the TCP/IP suite in the following order:

Basic Protocols
Internet Protocols
E-mail Protocols
Other Protocols

Basic Protocols

As we can see, the TCP/IP protocol suite has a much simpler layered structure than the seven layers of the OSI model. The Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) protocols are transport protocols corresponding to OSI layer 4. Both protocols make use of the Internet Protocol (IP), an OSI layer 3 protocol (the network layer). As well as these three protocols, there are two more basic protocols in the TCP/IP suite that extend the IP protocol: ICMP and IGMP. The functionality of these protocols must be implemented in the layer housing the IP protocol, hence they are shown in that layer in the preceding figure.

IP-Internet Protocol

The Internet Protocol connects two nodes. In IPv4, each node is identified by a 32-bit address, called its IP address. When sending a message, IP receives the message from an upper-level protocol such as TCP or UDP and adds the IP header, which contains information about the destination host.
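
To make the 32-bit address concrete, here is a minimal sketch in Python using the standard `ipaddress` module. The address `192.168.1.10` is only an example value, not one taken from this article.

```python
import ipaddress

# Hypothetical example address from a private range
addr = ipaddress.IPv4Address("192.168.1.10")

as_int = int(addr)        # the same address as a single 32-bit number
as_bytes = addr.packed    # the 4 bytes as they appear in the IP header

print(as_int)                          # 3232235786
print(as_bytes.hex())                  # c0a8010a
print(ipaddress.IPv4Address(as_int))   # 192.168.1.10
```

The dotted notation is just a readable way of writing the four bytes of that number.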

Subnets

Connecting two nodes on different networks requires a router. The host number occupies 24 bits of a Class A IP address, while a Class C address leaves just 8 bits for it. Subnetting splits the host number into a subnet number and a shorter host number. Adding routers reduces broadcasts in the network, which can reduce network load. The main reason for adding routers, however, is to improve connectivity between sites in different buildings, cities, and so on.
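
The split into subnet number and host number can be sketched with Python's `ipaddress` module. The networks below are example values, and borrowing two host bits is an arbitrary choice for illustration.

```python
import ipaddress

# Example networks: a Class A range and a Class C range
class_a = ipaddress.ip_network("10.0.0.0/8")
class_c = ipaddress.ip_network("192.168.1.0/24")

host_bits_a = 32 - class_a.prefixlen   # 24 bits for the host number
host_bits_c = 32 - class_c.prefixlen   # 8 bits for the host number

# Borrow 2 of the 8 host bits as a subnet number: 4 subnets, 6 host bits each
subnets = list(class_c.subnets(prefixlen_diff=2))
for net in subnets:
    # Subtract the network and broadcast addresses of each subnet
    print(net, "usable hosts:", net.num_addresses - 2)
```

Each of the four resulting subnets has 62 usable host addresses.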

Transport Layer-Port Numbers

The IP protocol uses IP addresses to identify nodes on the network, while the transport layer (layer 4) uses endpoints to identify applications. TCP and UDP protocols use a port number together with an IP address to specify an application endpoint.

The server must supply a known endpoint for a client to connect to, although the port number can be created dynamically for the client.

TCP and UDP port numbers are 16 bits, and can be divided into three categories:
- System (Well-Known) Port Numbers
- User (Registered) Port Numbers
- Dynamic or Private Ports
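
The three categories can be expressed as a small lookup function. This is a sketch in Python; the numeric boundaries are not stated above and are taken from the IANA port registry (0-1023, 1024-49151, 49152-65535). The function name `port_category` is made up for this example.

```python
def port_category(port: int) -> str:
    """Classify a TCP/UDP port number using the IANA ranges."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("a port number must fit in 16 bits")
    if port <= 1023:
        return "system"    # well-known ports such as 80 (HTTP)
    if port <= 49151:
        return "user"      # registered ports
    return "dynamic"       # dynamic or private ports

for port in (80, 8080, 50000):
    print(port, port_category(port))
```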
  
  TCP-Transmission Control Protocol
  
  Connection-oriented communication can use reliable communication where the layer 4 protocol sends acknowledgements of data receipts, and requests retransmission if data is not received or is corrupted. The TCP protocol uses such reliable communication.
  
  Some of the application protocols that use TCP are HTTP, FTP, SMTP, and Telnet.
  
  TCP requires a connection to be opened before data can be sent. The server application performs a so-called passive open on a known port number: rather than making a call to the network, the server listens and waits for incoming requests. The client application performs an active open by sending its synchronize sequence number (SYN) to the server application to identify the connection. The client application can use a dynamic port number as its local port. The server replies with an acknowledgement (ACK) together with its own sequence number (SYN). The client in turn answers with an ACK, and the connection is established.
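
The passive and active open can be tried on a single machine. The following is a minimal sketch in Python, not the code used later in this series: the server listens on the loopback address, and port 0 lets the operating system pick a free port (a real server would bind to a known port). The SYN and ACK segments themselves are exchanged by the operating system inside `connect()` and `accept()`.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Accept one connection and echo back the first chunk of data."""
    conn, _peer = listener.accept()   # the handshake is complete at this point
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)

# Passive open: bind to a local endpoint and wait for incoming requests.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
server_port = listener.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# Active open: connect() triggers the SYN, SYN+ACK, ACK exchange.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.settimeout(5)
client.connect(("127.0.0.1", server_port))
client_port = client.getsockname()[1]   # dynamic local port chosen by the OS
client.sendall(b"hello")
reply = client.recv(1024)

client.close()
worker.join()
listener.close()

print("server port:", server_port, "client port:", client_port, "reply:", reply)
```

Note that the client never chooses its own port; only the server endpoint has to be known in advance.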
  
  UDP-User Datagram Protocol
  
  Contrary to TCP, UDP is a very fast protocol as it specifies just the minimum mechanism required for data transfer. Of course this has some disadvantages. Messages can be received in any order, and a message that was sent first could be received last. The delivery of UDP messages is not guaranteed at all, and messages can be lost, or even two copies of the same message might be received. This latter scenario can happen when two different routes are used to send the message to the same destination.
  
  UDP does not require a connection to be opened, and data can be sent as soon as it is ready. UDP doesn't send acknowledgement messages, so the data can be received, or it can be lost. If reliable data transfer is needed over UDP, it must be implemented in a higher-level protocol.
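
For comparison, here is a minimal UDP sketch in Python on the loopback interface. There is no listen, accept, or connect step: the datagram is simply sent to an endpoint. The timeout is an assumption of this example, added because nothing tells the receiver whether a datagram was lost.

```python
import socket

# Receiver: bind to an endpoint; port 0 lets the OS pick a free port
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2)
address = receiver.getsockname()

# Sender: no connection is opened, the data is sent as soon as it is ready
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", address)

try:
    data, source = receiver.recvfrom(1024)
except socket.timeout:
    # No acknowledgement exists, so a lost datagram only shows up as silence
    data, source = None, None
finally:
    sender.close()
    receiver.close()

print("received:", data, "from:", source)
```

On the loopback interface the datagram normally arrives, but across a real network the `except` branch is the only hint that something went missing.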
  
  So what are the advantages of UDP? Why would we want to use an unreliable protocol such as this? To understand the most important reason for using UDP, we have to differentiate between unicast, broadcast, and multicast communications.
  
  ICMP-Internet Control Message Protocol
  
  ICMP is a control protocol used by IP devices to inform other IP devices of activity and errors in the network. Without TCP, IP is not a reliable protocol: there are no acknowledgements, no error control for data (only a header checksum), and no retransmissions.
  
  Errors detected may be reported with ICMP messages. The ICMP messages are used to send feedback about the status of the network. For example, a router sends an ICMP 'destination unreachable' message if a suitable entry for a network cannot be found in a routing table. A router can also send an ICMP 'redirect' message if a better path was found.
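
The header checksum mentioned above is the 16-bit ones' complement checksum described in RFC 1071, and ICMP messages carry a checksum computed the same way. As a sketch, the Python code below builds the 8-byte header of an ICMP echo request (type 8, code 0) and fills in its checksum. The identifier and sequence values are arbitrary, and nothing is sent on the network, since raw sockets usually require administrator rights.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                  # pad to a whole number of words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# ICMP echo request header: type, code, checksum, identifier, sequence number
ident, seq = 0x1234, 1                   # arbitrary example values
header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
checksum = internet_checksum(header)
packet = struct.pack("!BBHHH", 8, 0, checksum, ident, seq)

print(hex(checksum))                     # 0xe5ca
print(internet_checksum(packet))         # 0 means the packet verifies
```

A receiver runs the same function over the received bytes; any result other than 0 indicates corruption.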
  
  IGMP-Internet Group Management Protocol
  
  Similarly to ICMP, IGMP is an extension to the IP protocol and must be implemented by the IP module. IGMP is used by multicasting applications. When sending a broadcast message to a complete LAN, every node in the LAN analyzes the message up to the transport layer to verify if some application wants to receive messages from the port of the broadcast. If no application is listening, the message is destroyed and does not progress beyond the transport layer. This does mean that some CPU cycles are needed by every host no matter if the broadcast message is of interest or not.
  
  Multicasts address this concern by sending messages only to a group of nodes rather than to every node in the LAN. The network interface card can detect whether the system is interested in a particular message by analyzing the multicast MAC address, without needing the assistance of the CPU.
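
The network card can do this filtering because, on Ethernet, every IPv4 multicast group maps to a fixed MAC address: the prefix `01:00:5e` followed by the low 23 bits of the group address (RFC 1112). The following Python sketch shows that mapping; the function name `multicast_mac` is made up for this example.

```python
import ipaddress

def multicast_mac(group: str) -> str:
    """Return the Ethernet MAC address used for an IPv4 multicast group."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF           # keep the low 23 bits of the group
    mac = (0x01005E << 24) | low23         # fixed 01:00:5e prefix
    return ":".join(f"{b:02x}" for b in mac.to_bytes(6, "big"))

print(multicast_mac("224.0.0.1"))          # 01:00:5e:00:00:01
print(multicast_mac("239.255.255.250"))    # 01:00:5e:7f:ff:fa
```

Because only 23 bits are copied, several groups can share one MAC address, so the IP layer still has to check the group after the card has accepted the frame.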
  
  Internet Protocols
  
  After discussing base protocols, we can now step up to a higher level. The HTTP and FTP protocols cover layers 5-7 of the OSI model.
  
  FTP-File Transfer Protocol
  
  FTP is used to copy files from and to a server, and to list files and directories on a server. It is an application level protocol based on TCP, where FTP commands are encapsulated within the TCP data block of a TCP message.
  
  An application model with an FTP server and client is illustrated in the picture below. The client application presents a user interface and creates an FTP request according to the user's request and the FTP specification. The FTP command is sent to the server application over TCP/IP, and the FTP interpreter on the server interprets the FTP command accordingly. Depending on the FTP command, a list of files or a file from the server's file system is returned to the client in an FTP reply.
  
  HTTP-Hypertext Transfer Protocol
  
  HTTP is the main protocol used by web applications. Like FTP, HTTP gets its reliability from running over TCP, and it is also used to transfer files across the network. Unlike FTP, it has features such as caching, identification of the client application, support for different attachments with a MIME format, and so on. These features are enabled within the HTTP header.
  
  To demonstrate what an Internet browser is doing when it requests files from a web server, we can use the telnet application to simulate a browser. Start the telnet application by entering telnet in the Run dialog of the Start menu, and we see the Microsoft Telnet> prompt. Enter set local_echo (set localecho with Windows XP) to display the entered commands locally for demonstration purposes. If we don't set this option, commands we send to the server would not be displayed by the telnet application. Now we can connect to the web server with the open command. The command open msdn.microsoft.com 80 creates a TCP connection to port 80 of the server at msdn.microsoft.com. The telnet application uses port 23 by default, hence we have to specify a port for the HTTP request. The default port of a web server offering HTTP services is port 80.
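
The same experiment can be scripted. The Python sketch below does what we would type into telnet: it opens a TCP connection and sends a plain-text GET request. So that it runs without Internet access, it talks to a small throwaway server started on the loopback address; the handler class, the response body, and the use of port 0 are assumptions of this example, not part of the telnet session described above.

```python
import http.server
import socket
import threading

class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Tiny stand-in for a real web server."""
    def do_GET(self):
        body = b"hello from the test server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

server = http.server.HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.handle_request, daemon=True).start()

# The request is plain text: request line, headers, then an empty line
request = (
    "GET / HTTP/1.0\r\n"
    f"Host: 127.0.0.1:{port}\r\n"
    "\r\n"
).encode("ascii")

with socket.create_connection(("127.0.0.1", port), timeout=5) as sock:
    sock.sendall(request)
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:          # the server closes the connection when done
            break
        chunks.append(chunk)
response = b"".join(chunks)
server.server_close()

status_line = response.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```

Replacing the loopback address and port with a real host name and port 80 sends the same kind of request to a real web server.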
  
  HTTPS-HTTP over SSL (Secure Socket Layer)
  
  If there is a requirement to exchange confidential data with a web server, HTTPS can be used. HTTPS is an extension to the HTTP protocol, and the principles discussed in the last section still apply. However, the underlying mechanism is different, as HTTPS uses SSL (Secure Socket Layer), originally developed by Netscape. SSL sits on top of TCP and secures network communication using a public/private key principle to exchange secret symmetric keys, and a symmetric key to encrypt the messages.
  
  To support HTTPS, the web server must install a certificate so that it can be identified. The default port for HTTPS requests is 443.
  
  E-Mail Protocols
  
  There are quite a few protocols for use with e-mail. In this section, I'll try to provide an overview of the most important mail-related protocols. In Chapter 9, we will look into these more, and see how to create applications that use them.
  
  SMTP-Simple Mail Transfer Protocol
  
  SMTP is a protocol for sending and receiving e-mail messages. It can be used to send e-mail between a client and server that both use the same transport protocol, or to send e-mail between servers that use different transport protocols. SMTP has the capability to relay messages across transport service environments. SMTP does not allow us to read messages from a mail server, however, and for this activity POP3 or IMAP protocols should be used.
  
  An SMTP service forms part of the Internet Information Server installation of Windows 2000 and XP.
  
  The SMTP standard is defined with RFC 821; the SMTP message format is defined with RFC 822.
  
  POP3-Post Office Protocol
  
  The Post Office Protocol was designed for disconnected environments. In small environments it is not practical to maintain a persistent connection with the mail server, for instance, in environments where the connection time must be paid for. With POP3 the client can access the server and retrieve the messages that the server is holding for it. When messages are retrieved by the client, they are typically deleted on the server, although this is not necessarily the case.
  
  Windows .NET Server includes a POP3 server.
  
  POP3 was originally defined in RFC 1081; the current specification is RFC 1939.
  
  IMAP-Internet Message Access Protocol
  
  Like POP3, IMAP is designed to access messages on a mail server. Similar to POP3 clients, an IMAP client can have an offline mode where messages can be manipulated on the local machine. Unlike POP3 clients, IMAP clients have greater capabilities when in online mode, such as retrieving just the headers or bodies of specified messages, searching for particular messages on the server, and setting flags such as a replied flag. Essentially, IMAP allows the client to manipulate a remote mailbox as if it were local.
  
  IMAP is defined with RFC 1730.
  
  NNTP-Network News Transfer Protocol
  
  Network News Transfer Protocol is an application layer protocol for submitting, relaying, and retrieving messages that form part of newsgroup discussions. This protocol provides client applications with access to a news server to retrieve selected messages, and also supports server to server transfer of messages.
  
  NNTP is defined by the RFCs 850, 977, and 1036.

With OSI (Open Systems Interconnection), the International Organization for Standardization (ISO) defined a model for a standardized network that would replace TCP/IP, DECnet, and other protocols as the primary network protocol used in the Internet. However, because of the complexity of the OSI protocol, not many implementations were built and put to use. TCP/IP was much simpler, and thus can now be found everywhere. But many new ideas from the OSI protocol can be found in the next version of IP, IPv6.

While the OSI protocol didn't catch on, the OSI seven layer model was very successful, and it is now used as a reference model to describe different network protocols and their functionality.

The layers of the OSI model separate out the basic tasks that network protocols must accomplish, and describe how network applications can communicate. Each layer has a specific purpose and is connected to the layers immediately above and below it. The seven layers defined by OSI are shown here:

The application layer defines a programming interface to the network for user applications.
The presentation layer is responsible for encoding data from the application layer ready for transmission over the network, and vice versa.
The session layer creates a virtual connection between applications.
The transport layer allows reliable communication of data.
The network layer makes it possible to access nodes in a LAN using logical addressing.
The data link layer accesses the physical network with physical addresses.
Finally, the physical layer includes the connectors, cables, and so on.

Monday, October 29, 2012

Physical Components

An important aspect of understanding the network is knowing the hardware components. We are going to have a look at the major components of a LAN:

Network Interface Card (NIC)
Hub
Switch
Router

Network Interface Card

The NIC is the adapter card used to connect a device to the LAN. It allows us to send and receive messages to and from the network. A NIC has a unique MAC (media access control) address that provides a unique identification of each device.

The MAC address is a 6-byte (48-bit) number, usually written as 12 hexadecimal digits, uniquely assigned to an Ethernet network card. This address can be changed by a network driver dynamically (as is the case with DECnet systems, a network developed by Digital Equipment), but usually the MAC address is not changed.

Hub

Multiple devices can easily be connected with the help of a hub. A hub is a connectivity device that attaches multiple devices to a LAN. Each device typically connects via a UTP (Unshielded Twisted Pair) cable to a port on the hub. You may have already heard about the RJ-45 (Registered Jack-45) connector. This is one of the possible port types on a hub, but a hub can also support other cable types. A hub can have anything from four ports to 24. In a large network, multiple hubs are mounted in a cabinet and support hundreds of connections.

Switch

Switches separate networks into segments. Compared to a hub, a switch is a more intelligent device. The switch stores the MAC addresses of devices that are connected to its ports in lookup tables. These lookup tables allow the switch to filter network messages, and, unlike the hub, avoid forwarding messages to every port. This eliminates possible collisions, and a better performing network can be achieved.
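
The lookup table can be modeled in a few lines. This Python sketch is a simplified model rather than a description of any particular device: it assumes the switch learns the sender's address from every incoming frame and falls back to hub-like flooding when the destination is still unknown. The class name and MAC addresses are invented for the example.

```python
class LearningSwitch:
    """Toy model of a switch with a MAC address lookup table."""

    def __init__(self, port_count: int):
        self.ports = range(port_count)
        self.table = {}                       # MAC address -> port number

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> list:
        """Return the ports a frame arriving on in_port is forwarded to."""
        self.table[src_mac] = in_port         # remember where the sender is
        out_port = self.table.get(dst_mac)
        if out_port is None:
            # Unknown destination: send to every other port, as a hub would
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                         # destination is on the same port
        return [out_port]                     # forward to exactly one port

switch = LearningSwitch(4)
first = switch.receive(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")
second = switch.receive(1, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")
third = switch.receive(0, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02")
print(first, second, third)                   # [1, 2, 3] [0] [1]
```

Only the first frame reaches every port; once both addresses are in the table, traffic between the two devices stays on their two ports.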

Router

A router is an intermediary network device that connects multiple physical networks. With many hosts it can be useful to split a LAN into separate portions, or subnets. The advantages of subnets are:

Performance is improved by reducing broadcasts, which is when a message is sent to all nodes in a network. With subnets, a message is only sent to the nodes in the appropriate subnet.
The capability of restricting users to particular subnets offers security benefits.
Smaller subnets are easier to manage than one large network.
Subnets allow a single network to span several locations.

A router holds a routing table that lists the ways that particular networks can be reached. There will often be several different routes from one network to another, but one of these will be the best, and it is that one that is described in the routing table. Routers communicate using routing protocols that discover other routers on the network, and support the exchange of information about networks attached to each router.
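
A routing table lookup can be sketched as follows in Python. The table entries and next-hop addresses are invented for the example, and the selection rule shown (the matching entry with the longest prefix wins) is the rule commonly used for IP forwarding, stated here as an assumption since the text above does not name one.

```python
import ipaddress

# Hypothetical routing table: (destination network, next-hop router)
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.168.1.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "192.168.1.2"),
    (ipaddress.ip_network("10.1.0.0/16"), "192.168.1.3"),
]

def next_hop(destination: str) -> str:
    """Pick the next hop for a destination by longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route matches every IPv4 address, so matches is never empty
    _net, hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return hop

print(next_hop("10.1.2.3"))     # 192.168.1.3
print(next_hop("10.200.0.1"))   # 192.168.1.2
print(next_hop("8.8.8.8"))      # 192.168.1.1
```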

The Physical Network

In essence, a network is a group of computers or devices connected together by communication links. In networking terms, every computer or device (printers, routers, switches, and so on) connected to the network is called a node. Nodes are connected by links, which could be cables or wireless links (such as infrared or radio signals), and they can interact with any other node by transmitting messages over the network.

We can differentiate networks according to their size:

A LAN, or Local Area Network, connects nodes over a limited area. This area can be as large as the site of a big company, or as small as connected computers in someone's home. The most commonly used LAN technology is the Ethernet network (see next section).
WAN is the acronym for Wide Area Network. Multiple LAN sites are connected together by a WAN. WAN technologies that you might know of include Frame Relay, T1 lines, ISDN (Integrated Services Digital Network), X.25, and ATM (Asynchronous Transfer Mode). In the next section, we'll further discuss the means of connecting to a WAN.
A MAN, or Metropolitan Area Network, is very similar to a WAN in that it connects multiple LANs. However, a MAN restricts the area of the network to a city or suburb. MANs use high-speed networks to connect the LANs of schools, governments, companies, and so on, by using fast connections to each site, such as fiber optics.

When talking about networks, the term backbone is often used. A backbone is a high-speed network that connects slower networks. A company can use a backbone to connect slower LAN segments. The Internet backbone is built up of high-speed networks that carry WAN traffic. Your Internet provider either connects directly to the Internet backbone, or to a larger provider that connects directly to the Internet backbone.

Networking Concepts and Protocols

In this article we will introduce some basic networking concepts and protocols. The article serves as a foundation to networking that will allow us to tackle programming in the upcoming articles.

Whether you plan to develop server applications running as Windows Services that offer data to clients using a custom protocol, write client applications that request data from web servers, or create multicasting applications or applications using mailing functionality, you should start by reading this article.

If you don't already know what a router or a network switch is, if you are not sure about the functionality of the seven layers in the OSI model, or if you just want a refresher or an overview of the different network protocols and their uses, this article is for you.

We will start with an introduction to the hardware used in local area networks, such as routers, hubs, and bridges. Then we will have a look at the seven layers of the OSI model and their functionality, and how the TCP/IP protocol suite fits into the OSI layers. After that, we will learn about the functionality of various network protocols.

In particular, we will look at:

The Physical Network
The OSI Seven-Layer Model
Basic Network Protocols
Internet Protocols
E-mail Protocols
Sockets
Name Lookups
The Internet
Remoting
Messaging

(Details will follow in upcoming articles.)

Layered OSI Model