Search This Blog

Tuesday, July 29, 2008

Ch02 : Introduction to Networking

Familiarity with the concepts explained in the following sections will help answer many of the daily questions often posed by coworkers, friends, and even yourself. It will help make the road to Linux mastery less perilous, a road that begins with an understanding of the OSI networking model and TCP/IP. The OSI Networking Model The Open System Interconnection (OSI) model, developed by the International Organization for Standardization, defines how the various hardware and software components involved in data communication should interact with each other. A good analogy would be a traveler who prepares herself to return home through many dangerous kingdoms by obtaining permits to enter each country at the very beginning of the trip. At each frontier our friend has to hand over a permit to enter the country. Once inside, she asks the border guards for directions to reach the next frontier and displays the permit for that new kingdom as proof that she has a legitimate reason for wanting to go there.

In the OSI model each component along the data communications path is assigned a layer of responsibility, in other words, a kingdom over which it rules. Each layer extracts the permit, or header information, it needs from the data and uses this information to correctly forward what's left to the next layer. This layer also strips away its permit and forwards the data to the next layer, and so the cycle continues for seven layers.

The very first layer of the OSI model describes the transmission attributes of the cabling or wireless frequencies used at each "link" or step along the way. Layer 2 describes the error correction methodologies to be used on the link; layer 3 ensures that the data can hop from link to link on the way to the final destination described in its header. When the data finally arrives, the layer 4 header is used to determine which locally installed software application should receive it. The application uses the guidelines of layer 5 to keep track of the various communications sessions it has with remote computers and uses layer 6 to verify that the communication or file format is correct. Finally, layer 7 defines what the end user will see in the form of an interface, be it graphical on a screen or otherwise. A description of the functions of each layer in the model can be seen in Table 2-1.

Table 2-1: The Seven OSI Layers

Layer Name Description Application
7 Application
  • The user interface to the application

telnet

FTP

sendmail

6 Presentation
  • Converts data from one presentation format to another. For example, e-mail text entered into Outlook Express being converted into SMTP mail formatted data.
5 Session
  • Manages continuing requests and responses between the applications at both ends over the various established connections.
4 Transport
  • Manages the establishment and tearing down of a connection. Ensures that unacknowledged data is retransmitted. Correctly re-sequences data packets that arrive in the wrong order.
  • After the packet's overhead bytes have been stripped away, the resulting data is said to be a segment.

TCP UDP

3 Network
  • Handles the routing of data between links that are not physically connected together.
  • After the link's overhead bytes have been stripped away, the resulting data is said to be a packet.

IP ARP

2 Link
  • Error control and timing of bits speeding down the wire between two directly connected devices.
  • Data sent on a link is said to be structured in frames.

Ethernet

ARP

1 Physical
  • Defines the electrical and physical characteristics of the network cabling and interfacing hardware
Ethernet

An Introduction to TCP/IP

TCP/IP is a universal standard suite of protocols used to provide connectivity between networked devices. It is part of the larger OSI model upon which most data communications is based.

One component of TCP/IP is the Internet Protocol (IP) which is responsible for ensuring that data is transferred between two addresses without being corrupted.

For manageability, the data is usually split into multiple pieces or packets each with its own error detection bytes in the control section or header of the packet. The remote computer then receives the packets and reassembles the data and checks for errors. It then passes the data to the program that expects to receive it.

How does the computer know what program needs the data? Each IP packet also contains a piece of information in its header called the type field. This informs the computer receiving the data about the type of layer 4 transportation mechanism being used.

The two most popular transportation mechanisms used on the Internet are Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).

When the type of transport protocol has been determined, the TCP/UDP header is then inspected for the "port" value, which is used to determine which network application on the computer should process the data. This is explained in more detail later.

TCP Is a Connection-Oriented Protocol

TCP opens up a virtual connection between the client and server programs running on separate computers so that multiple and/or sporadic streams of data can be sent over an indefinite period of time between them. TCP keeps track of the packets sent by giving each one a sequence number with the remote server sending back acknowledgment packets confirming correct delivery. Programs that use TCP therefore have a means of detecting connection failures and requesting the retransmission of missing packets. TCP is a good example of a connection-oriented protocol.

How TCP Establishes A Connection

The server initiating the connection sends a segment with the SYN bit set in TCP header. The target replies with a segment with the SYN and ACK bits set, to which the originating server replies with a segment with the ACK bit set. This SYN, SYN-ACK, ACK mechanism is often called the "three-way handshake". The communication then continues with a series of segment exchanges, each with the ACK bit set. When one of the servers needs to end the communication, it sends a segment to the other with the FIN and ACK bits set, to which the other server also replies with a FIN-ACK segment also. The communication terminates with a final ACK from the server that wanted to end the session.

You can clearly see the three way handshake to connect and disconnect the session.

hostA -> hostB TCP 1443 > http [SYN] Seq=9766 Ack=0 Win=5840 Len=0
hostB -> hostA TCP http > 1443 [SYN, ACK] Seq=8404 Ack=9767 Win=5792 Len=0
hostA -> hostB TCP 1443 > http [ACK] Seq=9767 Ack=8405 Win=5840 Len=0
hostA -> hostB HTTP HEAD/HTTP/1.1
hostB -> hostA TCP http > 1443 [ACK] Seq=8405 Ack=9985 Win=54 Len=0
hostB -> hostA HTTP HTTP/1.1 200 OK
hostA -> hostB TCP 1443 > http [ACK] Seq=9985 Ack=8672 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [FIN, ACK] Seq=8672 Ack=9985 Win=54 Len=0
hostA -> hostB TCP 1443 > http [FIN, ACK] Seq=9985 Ack=8673 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [ACK] Seq=8673 Ack=9986 Win=54


In this trace, the sequence number represents the serial number of the first byte of data in the segment. So in the first line, a random value of 9766 was assigned to the first byte and all subsequent bytes for the connection from this host will be sequentially tracked. This makes the second byte in the segment number 9767, the third number 9768 etc. The acknowledgment number or Ack, not to be confused with the ACK bit, is the byte serial number of the next segment it expects to receive from the other end, and the total number of bytes cannot exceed the Win or window value that follows it. If data isn't received correctly, the receiver will re-send the requesting segment asking for the information to be sent again. The TCP code keeps track of all this along with the source and destination ports and IP addresses to ensure that each unique connection is serviced correctly.

UDP, TCP's "Connectionless" Cousin

UDP is a connectionless protocol. Data is sent on a "best effort" basis with the machine that sends the data having no means of verifying whether the data was correctly received by the remote machine. UDP is usually used for applications in which the data sent is not mission-critical. It is also used when data needs to be broadcast to all available servers on a locally attached network where the creation of dozens of TCP connections for a short burst of data is considered resource-hungry.

TCP and UDP Ports

The data portion of the IP packet contains a TCP or UDP segment sandwiched inside. Only the TCP segment header contains sequence information, but both the UDP and the TCP segment headers track the port being used. The source/destination port and the source/destination IP addresses of the client & server computers are then combined to uniquely identify each data flow.

Certain programs are assigned specific ports that are internationally recognized. For example, port 80 is reserved for HTTP Web traffic, and port 25 is reserved for SMTP e-mail. Ports below 1024 are reserved for privileged system functions, and those above 1024 are generally reserved for non-system third-party applications.

Usually when a connection is made from a client computer requesting data to the server that contains the data:

  • The client selects a random previously unused "source" port greater than 1024 and queries the server on the "destination" port specific to the application. If it is an HTTP request, the client will use a source port of, say, 2049 and query the server on port 80 (HTTP)
  • The server recognizes the port 80 request as an HTTP request and passes on the data to be handled by the Web server software. When the Web server software replies to the client, it tells the TCP application to respond back to port 2049 of the client using a source port of port 80.
  • The client keeps track of all its requests to the server's IP address and will recognize that the reply on port 2049 isn't a request initiation for "NFS", but a response to the initial port 80 HTTP query.

The TCP/IP "Time To Live" Feature

Each IP packet has a Time to Live (TTL) section that keeps track of the number of network devices the packet has passed through to reach its destination. The server sending the packet sets the initial TTL value, and each network device that the packet passes through then reduces this value by 1. If the TTL value reaches 0, the network device will discard the packet.

This mechanism helps to ensure that bad routing on the Internet won't cause packets to aimlessly loop around the network without being removed. TTLs therefore help to reduce the clogging of data circuits with unnecessary traffic.

Remember this concept as it will be helpful in understanding the traceroute troubleshooting technique.

The ICMP Protocol and Its Relationship to TCP/IP

There is another commonly used protocol called the Internet Control Message Protocol (ICMP). It is not strictly a TCP/IP protocol, but TCP/IP-based applications use it frequently.

ICMP provides a suite of error, control, and informational messages for use by the operating system. For example, IP packets will occasionally arrive at a server with corrupted data due to any number of reasons including a bad connection; electrical interference, or even misconfiguration. The server will usually detect this by examining the packet and correlating the contents to what it finds in the IP header's error control section. It will then issue an ICMP reject message to the original sending machine saying that the data should be re-sent because the original transmission was corrupted.

ICMP also includes echo and echo reply messages used by the Linux ping command to confirm network connectivity. ICMP TTL expired messages are also sent by network devices back to the originating server whenever the TTL in a packet is decremented to zero. ICMP Codes

You'll also encounter ICMP codes in your troubleshooting exercises, especially when viewing your iptables log files. Table I.6 lists the most commonly used codes.

Table I-6 ICMP Codes

Type

Name

Description

3

Destination Unreachable Codes

Net Unreachable

The sending device knows about the network but believes it is not available at this time. Perhaps the network is too far away through the known route.

Host Unreachable

The sending devices knows about host but doesn't get ARP reply, indicating the host is not available at this time

Protocol Unreachable

The protocol defined in IP header cannot be forwarded.

Port Unreachable

The sending device does not support the port number you are trying to reach

Fragmentation Needed and Don't Fragment was Set

The router needs to fragment the packet to forward it across a link that supports a smaller maximum transmission unit (MTU ) size. However, application set the Don't Fragment bit.

Source Route Failed

ICMP sender can't use the strict or loose source routing path specified in the original packet.

Destination Network Unknown

ICMP sender does not have a route entry for the destination network, indicating this network may never have been an available.

Destination Host Unknown

ICMP sender does not have a host entry, indicating the host may never have been available on connected network.

Source Host Isolated

ICMP sender (router) has been configured to not forward packets from source (the old electronic pink slip).

Communication with Destination Network is Administratively Prohibited

ICMP sender (router) has been configured to block access to the desired destination network.

Communication with Destination Host is Administratively Prohibited

ICMP sender (router) has been configured to block access to the desired destination host.

Destination Network Unreachable for Type of Service

The sender is using a Type of Service (TOS) that is not available through this router for that specific network.

Destination Host Unreachable for Type of Service

The sender is using a Type of Service (TOS) that is not available through this router for that specific host.

Communication Administratively Prohibited

ICMP sender is not available for communications at this time.

Host Precedence Violation

Precedence value defined in sender's original IP header is not allowed (for example, using Flash Override precedence).

5

Redirect Codes

Redirect Datagram for the Network (or subnet)

ICMP sender (router) is not the best way to get to the desired network. Reply contains IP address of best router to destination. Dynamically adds a network entry in original sender's routing tables.

Redirect Datagram for the Host

ICMP sender (router) is not the best way to get to the desired host. Reply contains IP address of best router to destination. Dynamically adds a host entry in original sender's route tables.

Redirect Datagram for Type of the Service and Network

ICMP sender (router) does not offer a path to the destination network using the TOS requested. Dynamically adds a network entry in original sender's route tables.

Redirect Datagram for the Type of Service and Host

ICMP sender (router) does not offer a path to the destination host using the TOS requested. Dynamically adds a host entry in original sender's route tables.

6

Alternate Host Address Codes

Alternate Address for Host

Reply that indicates another host address should be used for the desired service. Should redirect application to another host.

11

Time Exceeded Codes

Time to Live exceeded in Transit

ICMP sender (router) indicates that originator's packet arrived with a Time To Live (TTL) of 1. Routers cannot decrement the TTL value to 0 and forward the packet.

Fragment Reassembly Time Exceeded

ICMP sender (destination host) did not receive all fragment parts before the expiration (in seconds of holding time) of the TTL value of the first fragment received.

12

Parameter Problem Codes

Pointer indicates the error

Error is defined in greater detail within the ICMP packet.

Missing a Required Option

ICMP sender expected some additional information in the Option field of the original packet.

Bad Length

Original packet structure had an invalid length.

How IP Addresses Are Used To Access Network Devices

All TCP/IP enabled devices connected to the Internet have an Internet Protocol (IP) address. Just like a telephone number, it helps to uniquely identify a user of the system. The Internet Assigned Numbers Authority (IANA) is the organization responsible for assigning IP addresses to Internet Service Providers (ISPs) and deciding which ones should be used for the public Internet and which ones should be used on private networks.

IP addresses are in reality a string of 32 binary digits or bits. For ease of use, network engineers often divide these 32 bits into four sets of 8 bits (or octets), each representing a number from 0 to 255. Each number is then separated by a period (.) to create the familiar dotted decimal notation. An example of an IP address that follows these rules is 97.65.25.12.

The localhost IP Address

Whether or not your computer has a network interface card it will have a built-in IP address with which network-aware applications can communicate with one another. This IP address is defined as 127.0.0.1 and is frequently referred to as localhost.

Network Address Translation (NAT) Makes Private IPs Public

Your router/firewall will frequently be configured to give the impression to other devices on the Internet that all the servers on your home/office network have a valid public IP address, and not a "private" IP address. This is called network address translation (NAT) and is often also called IP masquerading in the Linux world. There are many good reasons for this, the two most commonly stated are:

  • No one on the Internet knows your true IP address. NAT protects your home PCs by assigning them IP addresses from "private" IP address space that cannot be routed over the Internet. This prevents hackers from directly attacking your home systems because packets sent to the "private" IP will never pass over the Internet.
  • Hundreds of PCs and servers behind a NAT device can masquerade as a single public IP address. This greatly increases the number of devices that can access the Internet without running out of "public" IP addresses.

You can configure NAT to be one to one in which you request your ISP to assign you a number of public IP addresses to be used by the Internet-facing interface of your firewall and then you pair each of these addresses to a corresponding server on your protected private IP network. You can also use many to one NAT, in which the firewall maps a single IP address to multiple servers on the network.

As a general rule, you won't be able to access the public NAT IP addresses from servers on your home network. Basic NAT testing requires you to ask a friend to try to connect to your home network from the Internet.

Port Forwarding with NAT Facilitates Home-Based Web sites

In a simple home network, all servers accessing the Internet will appear to have the single public IP address of the router/firewall because of many to one NAT. Because the router/firewall is located at the border crossing to the Internet, it can easily keep track of all the various outbound connections to the Internet by monitoring:

  • The IP addresses and TCP ports used by each home based server and mapping it to
  • The TCP ports and IP addresses of the Internet servers with which they want to communicate.

This arrangement works well with a single NAT IP trying to initiate connections to many Internet addresses. The reverse isn't true.

New connections initiated from the Internet to the public IP address of the router/firewall face a problem. The router/firewall has no way of telling which of the many home PCs behind it should receive the relayed data because the mapping mentioned earlier doesn't exist beforehand. In this case the data is usually discarded.

Port forwarding is a method of counteracting this. For example, you can configure your router/firewall to forward TCP port 80 (Web/HTTP) traffic destined to the outside NAT IP to be automatically relayed to a specific server on the inside home network

As you may have guessed, port forwarding is one of the most common methods used to host Web sites at home with DHCP DSL.

DHCP

The Dynamic Host Configuration Protocol (DHCP) is a protocol that automates the assignment of IP addresses, subnet masks default routers, and other IP parameters.

The assignment usually occurs when the DHCP configured machine boots up, or regains connectivity to the network. The DHCP client sends out a query requesting a response from a DHCP server on the locally attached network. The DHCP server then replies to the client PC with its assigned IP address, subnet mask, DNS server and default gateway information.

The assignment of the IP address usually expires after a predetermined period of time, at which point the DHCP client and server renegotiate a new IP address from the server's predefined pool of addresses. Configuring firewall rules to accommodate access from machines who receive their IP addresses via DHCP is therefore more difficult because the remote IP address will vary from time to time. You'll probably have to allow access for the entire remote DHCP subnet for a particular TCP/UDP port.

Most home router/firewalls are configured in the factory to be DHCP servers for your home network. You can also make your Linux box into a DHCP server, once it has a fixed IP address.

The most commonly used form of DSL will also assign the outside interface of your router/firewall with a single DHCP provided IP address.

How DNS Links Your IP Address To Your Web Domain

The domain name system (DNS) is a worldwide server network used to help translate easy to remember domain names like www.linuxhomenetworking.com into an IP address that can be used behind the scenes by your computer. Here step by step description of what happens with a DNS lookup.

  1. Most home computers will get the IP address of their DNS server via DHCP from their router/firewall.
  2. Home router/firewall providing DHCP services often provides its own IP address as the DNS name server address for home computers.
  3. The router/firewall then redirects the DNS queries from your computer to the DNS name server of your Internet service provider (ISP).
  4. Your ISP's DNS server then probably redirects your query to one of the 13 "root" name servers.
  5. The root server then redirects your query to one of the Internet's ".com" DNS name servers which will then redirect the query to the "linuxhomenetworking.com" domain's name server.
  6. The linuxhomenetworking.com domain name server then responds with the IP address for www.linuxhomenetworking.com

As you can imagine, this process can cause a noticeable delay when you are browsing the Web. Each server in the chain will store the most frequent DNS name to IP address lookups in a memory cache which helps to speed up the response. Chapter 18, "Configuring DNS", explains how to you can make your Linux box into a caching or regular DNS server for your network or Web site if your ISP provides you with fixed IP addresses. Chapter 19, "Dynamic DNS", explains how to configure DNS for a Web site housed on a DHCP DSL circuit where the IP address constantly changes. It explains the auxiliary DNS standard called dynamic DNS (DDNS) that was created for this type of scenario.

IP Version 6 (IPv6)

Most Internet-capable networking devices use version 4 of the Internet Protocol (IPv4) which I have described here. You should also be aware that there is now a version 6 (IPv6) that has recently been developed as a replacement.

With only 32 bits, the allocation of version 4 addresses will soon be exhausted between all the world's ISPs. Version 6, which uses a much larger 128-bit address offers eighty billion, billion, billion times more IP addresses which it is hoped should last for most of the 21st century.

IPv6 packets are also labeled to provide quality-of-service information that can be used in prioritizing real-time applications, such as video and voice, over less time-sensitive ones such as regular Web surfing and chat. IPv6 also inherently supports the IPSec protocol suite used in many forms of secured networks, such as virtual private networks (VPNs).

Most current operating systems support IPv6 even though it isn't currently being used extensively within corporate or home environments. Expect it to become an increasingly bigger part of your network planning in years to come.

No comments: