NetWorking "H" Definitions and Concepts


"H" Networking Definitions & Concepts...

Hacker .. to .. Hz (Hertz)


Hacker:

An avid computer user who enjoys exploring and testing the limits of computers, and who enjoys "hacking together" solutions to programming or other computing problems. Hackers often extend their zealous explorative tendencies to others' computers -- breaking into networks, corporate or university computers, etc. Generally, however, these explorations don't have any malicious or destructive goals.

In contrast, the term "cracker" is used to describe users who do have destructive plans when they break into other computer systems. Unfortunately, in general parlance, hacker has come to be used for both of these sometimes intrusive types.

Hardware Abstraction Layer (HAL):

In Windows NT and NT Advanced Server, the HAL mediates between the operating system kernel and specific hardware. By implementing functions for interfaces, caches, interrupts, and so on, the HAL can make every piece of hardware look the same to the higher layers. This helps make NT more transportable to other CPUs and machines.

Hardware Address:

The low-level addresses used by physical networks. Each computer/device attached to an Ethernet is assigned a 48-bit integer known as its Ethernet address. These addresses are assigned by vendors of Ethernet hardware. Because Ethernet addresses belong to hardware devices, they are sometimes called hardware addresses or physical addresses. Therefore:

Physical addresses are associated with the interface hardware; moving the hardware interface to a new machine or replacing a hardware interface that has failed changes the physical addresses.

Knowing that Ethernet physical addresses can change will make it clear why higher levels of the network software are designed to accommodate such changes.

The 48-bit Ethernet address does more than specify a single hardware interface. It can be one of three types:

The physical address of one network interface,
The network broadcast address, or
A multicast address.

Vendors purchase blocks of physical addresses and assign them in sequence as they manufacture Ethernet interface hardware. Thus, no two hardware interfaces have the same physical address.
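
The three address types can be told apart by inspecting the bits of the address. The following is a minimal sketch, assuming the usual Ethernet convention that an all-ones address is the broadcast address and that the low-order bit of the first octet marks a group (multicast) address. The function name and the sample addresses are illustrative only.

```python
def classify_ethernet_address(mac: str) -> str:
    """Classify a 48-bit Ethernet address as unicast, broadcast, or multicast."""
    # Accept either "aa:bb:cc:dd:ee:ff" or "aa-bb-cc-dd-ee-ff".
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("an Ethernet address has exactly 6 octets (48 bits)")
    if octets == b"\xff" * 6:
        return "broadcast"   # all 48 bits set
    if octets[0] & 0x01:
        return "multicast"   # group bit: low-order bit of the first octet
    return "unicast"         # physical address of a single interface


print(classify_ethernet_address("00:1A:2B:3C:4D:5E"))
print(classify_ethernet_address("FF:FF:FF:FF:FF:FF"))
print(classify_ethernet_address("01:00:5E:00:00:01"))
```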

Hamming Code:

In telecommunication, a Hamming code is an error-detecting and error-correcting code, used in data transmission, that can (a) detect all single- and double-bit errors and (b) correct all single-bit errors. It was named after its inventor: Richard Hamming.

Note: A Hamming code satisfies the relation 2^m >= n + 1, where n is the total number of bits in the block, k is the number of information bits in the block, and m is the number of check bits in the block, where m = n - k.
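
Because n = k + m, the relation can be read as 2^m >= k + m + 1, which gives the smallest number of check bits for a given number of information bits. A small sketch of that arithmetic follows; the function name is hypothetical.

```python
def check_bits_needed(k: int) -> int:
    """Smallest m such that 2**m >= k + m + 1 for k information bits."""
    m = 0
    while 2 ** m < k + m + 1:
        m += 1
    return m


for k in (1, 4, 11, 26):
    m = check_bits_needed(k)
    print(f"k={k:2d} information bits -> m={m} check bits, n={k + m} total")
```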

In telecommunication, the term block has the following meanings:

A group of bits or digits that is transmitted as a unit and that may be encoded for error-control purposes.

A string of records, words, or characters, that for technical or logical purposes are treated as a unit.

Note 1: Blocks (a) are separated by interblock gaps, (b) are delimited by an end-of-block signal, and (c) may contain one or more records.

Note 2: A block is usually subjected to some type of block processing, such as multidimensional parity checking, associated with it.

The Hamming code is computed as follows:

Write the data bits in positions whose binary representation has at least two 1 bits.
Set the bits whose positions are powers of 2 so that the parity of odd-numbered bits is even, the parity of bits whose position is 2 or 3 mod 4 is even, and so on.

To decode it:

Compute the parity of all odd-numbered bits, the parity of bits whose position is 2 or 3 mod 4, etc.
Interpret these parities as a number which is a bit position. Flip the bit in that position.
Read out the data bits.

That is, to encode '01111101000' where the MSB is written first:

The MSB -- in a byte, every bit has a value based upon the bit's position in the byte. The bit which has the largest value is called the most significant bit. This can also refer to the MSB of any field of bits, such as a word, or double word.

0111110x100x0xxx
011111011000001x (the x is bit 0 which is ignored)

Transmit it with an error:

011101011000001

Compute parities:

   0111010110000010
   01110101          1
   0111    1000      0
   01  01  10  00    1
   0 1 0 0 1 0 0 1   1

Bit 11 (1011 in binary) is in error. Flip it:

0111110110000010

Extract the data:

0111110 100 0

(Bit 0 was appended to the received data so that if there were no errors, we could flip it.)

Other variants of Hamming code are in use; they are rearrangements of the bits so that the parity bits are at the end.
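
The encoding and decoding steps above can be sketched in Python. This is an illustrative implementation, not a reference one: it follows the same conventions as the worked example (bit strings written highest position first, parity bits at the power-of-two positions, even parity) but leaves out the appended bit 0. The function names are hypothetical.

```python
def hamming_encode(data: str) -> str:
    """Encode a bit string (MSB first); returns positions n..1, highest first."""
    k = len(data)
    m = 0
    while 2 ** m < k + m + 1:      # number of check bits needed
        m += 1
    n = k + m
    code = [0] * (n + 1)           # code[pos] is the bit at position pos; index 0 unused
    bits = iter(int(b) for b in data)
    for pos in range(n, 0, -1):
        if pos & (pos - 1):        # not a power of two, so a data position
            code[pos] = next(bits)
    for i in range(m):
        p = 1 << i
        parity = 0
        for pos in range(1, n + 1):
            if pos & p:
                parity ^= code[pos]
        code[p] = parity           # makes the parity of the covered group even
    return "".join(str(code[pos]) for pos in range(n, 0, -1))


def hamming_decode(received: str):
    """Correct at most one flipped bit; returns (data bits, error position or 0)."""
    n = len(received)
    code = [0] + [int(b) for b in reversed(received)]
    syndrome = 0
    for pos in range(1, n + 1):
        if code[pos]:
            syndrome ^= pos        # same result as computing each parity group
    if 0 < syndrome <= n:
        code[syndrome] ^= 1        # flip the bit the syndrome points at
    data = "".join(str(code[pos]) for pos in range(n, 0, -1) if pos & (pos - 1))
    return data, syndrome


print(hamming_encode("01111101000"))
print(hamming_decode("011101011000001"))
```

Run on the example above, the encoder should reproduce the 15-bit code word, and the decoder should report position 11 as the bit in error.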

Hamming Distance:

In telecommunication, the Hamming distance or signal distance is the number of digit positions in which the corresponding digits of two binary words of the same length are different.

It corresponds to the weight (number of ones) in the XOR of the words, or to the Manhattan distance between two vertices in an n-dimensional hypercube (where n is the length of the words).

For instance, the Hamming distance between 1011101 and 1001001 is two.

The concept can be generalized to other notation systems. For example, the Hamming distance between 2143896 and 2233796 is three, and between "toned" and "roses" it is also three.

What is the Hamming distance of the following code?

    000000  111111  000111  111000

Hamming distance:

(Definition) In comparing two bit patterns, the Hamming distance is the count of bits different in the two patterns. More generally, if two ordered lists of items are compared, the Hamming distance is the number of items that do not identically agree. This distance is applicable to encoded information, and is a particularly simple metric of comparison.

Between 000000 and 111111

000000
111111 XOR
------
111111

the distance is 6,

Between 111111 and 000111

111111
000111 XOR
------
111000

the distance is 3,

Between 000111 and 111000

000111
111000 XOR
------
111111

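the distance is 6.

The remaining pairs (000000 and 000111, 000000 and 111000, 111111 and 111000) each differ in 3 positions. The Hamming distance of a code is the minimum distance between any two of its code words, so the Hamming distance of this code is 3.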
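
The pairwise comparisons can be automated. The sketch below counts differing positions between two equal-length sequences and takes the minimum over all pairs of code words as the distance of the code. The function names are illustrative.

```python
from itertools import combinations


def hamming_distance(a, b) -> int:
    """Number of positions in which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def code_distance(words) -> int:
    """Minimum Hamming distance over every pair of code words."""
    return min(hamming_distance(a, b) for a, b in combinations(words, 2))


print(hamming_distance("1011101", "1001001"))
print(hamming_distance("toned", "roses"))
print(code_distance(["000000", "111111", "000111", "111000"]))
```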

Hierarchical Address Space:

A hierarchical address space is associated with IP (Internet Protocol): location information is embedded in the structure of the address itself. IPv4 addresses consist of four bytes, usually expressed in dotted-decimal notation, for example 64.32.153.19. Each address has two parts: a network ID and a host ID. Machines on the same network (and often in the same geographical location) share the leading portion of the address, which allows routers to handle addresses with the same prefix in the same manner.
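
As a rough illustration of the network ID / host ID split, the sketch below uses Python's standard ipaddress module. The prefix lengths (/24 and /16) are assumptions made for the example, since the text does not say where the boundary falls for 64.32.153.19. The function name is hypothetical.

```python
import ipaddress


def split_address(cidr: str):
    """Return (network ID as dotted decimal, host ID as an integer)."""
    iface = ipaddress.ip_interface(cidr)
    net = iface.network
    host_id = int(iface.ip) & int(net.hostmask)   # keep only the host bits
    return str(net.network_address), host_id


print(split_address("64.32.153.19/24"))   # assumed 24-bit network prefix
print(split_address("64.32.153.19/16"))   # assumed 16-bit network prefix

# Addresses that share a prefix belong to the same network.
same_net = ipaddress.ip_address("64.32.153.200") in ipaddress.ip_network("64.32.153.0/24")
print(same_net)
```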

Hierarchical Name Structure:

A naming strategy that relies on the hierarchical relationship between two entities. This strategy is used, for example, for files or network entities. In a network context, a node's name is based on the name of the parent node, which sits immediately above the node in a hierarchy. Compare this with a flat name structure.

Hierarchical Routing:

In an internetwork, hierarchical routing is routing in which multiple levels of networks (or of routers) are distinguished.

For example, in the Internet, three routing levels may be used:

backbone,
mid-level, and
stub

At the backbone level, routing among mid-level networks is supported; at the mid-level networks, routing between sites (stub networks) is supported. At a particular site, internal routing among the network's nodes is supported.

High Performance Routing (HPR), IBM:

HPR is an internetworking protocol designed by IBM as an upgrade to its Advanced Peer-to-Peer Networking (APPN) protocol. It was originally referred to as APPN+, but is now officially referred to as APPN HPR or simply HPR. IBM designed the protocol as a replacement for Transmission Control Protocol/Internet Protocol (TCP/IP). HPR routes around failed nodes and reduces the per-packet processing performed by intermediate network nodes to improve performance. IBM shipped HPR in 1994 with its 6611 router and provided further enhancements to support Asynchronous Transfer Mode (ATM) in 1995.

Host:

Any (end-user) computer system that connects to a network. Hosts range in size from personal computers (e.g., a Macintosh Plus) to supercomputers such as the Cray.

A host computer is typically defined in the centralized computer model as a large timesharing computer system that terminals communicate with and rely on for processing. It contrasts with the client-server model, in which users work at computers that perform some processing of their own and access servers that provide services such as file management, security, and printer management. However, the term host has become broader in its meaning over the years.

IBM Environment:

In the IBM environment, a host system is a mainframe computer called the "host processor" such as the IBM model 3090, IBM model 4381, or IBM model 9370. These mainframes usually run the Multiple Virtual Storage (MVS) operating system, running as either XA (eXtended Architecture) or Enterprise Systems Architecture (ESA). MVS is part of IBM's Systems Application Architecture (SAA).

Devices attach to IBM host computers via horizontal wiring to intermediate hubs, and then from the intermediate hubs via vertical wiring to the main hub. A number of terminals, or intelligent computers acting as terminals, connect to a cluster controller. Terminals do not connect directly to the host. The cluster controller multiplexes the data stream from its attached terminals into a channel linked to the host. Printers may also be attached to the cluster controller. A communication controller is another device that attaches directly to a host. It connects with a remote cluster controller over a telecommunication link. The remote cluster controller multiplexes the data stream from a number of devices at that location.

ISO Terminology:

In a network environment where multiple local area networks (LANs) are connected together with a series of routers, a host is often referred to as the end system or ES. For example, if the accounting department is connected to the sales department with a router, then workstations in each department are referred to as the host or end system, and the router is referred to as an intermediate system. There may be a number of intermediate systems that a communication message has to cross between one end system and another.

Host Bus Adapter (HBA):

A special-purpose board designed to take over data storage and retrieval tasks, thereby saving the CPU (central processing unit) some work. A disk channel consists of an HBA and the hard disk(s) associated with it. Novell's Disk Coprocessor board is a SCSI HBA.

Host Computer:

Usually a multi-user computer, such as a minicomputer or mainframe, that serves as a central processing unit for a number of terminals.

Hostname:

In the Internet environment, the name for a machine, such as thelma or henry. The hostname is part of the more complete fully qualified domain name (FQDN).

How A DOS-Based PC Uses Memory:

Today's DOS-based PC typically includes 8 to 32 MB of memory, well over ten times the original 640K. However, because of the limitations of the 8086 architecture -- still present in today's Intel processors -- memory must be accessed in several ways. The types of memory you can access in a PC are listed below:

Conventional Memory: This is the same 640K that the original IBM PC used. This area stores programs; many programs can use only this area of memory.

Upper Memory: This is the memory just above the 640KB area, which includes a total of 384K. Your video card, NIC, and other hardware components use a sizable amount of this space. Conventional 640K and the upper 384K add up to the first 1024K -- the first megabyte of RAM.

Expanded Memory (EMS): This memory uses a 64K block of upper memory as a window through which pages are swapped to and from the higher areas of memory. This system is rarely used in today's PCs but remains as a standard.

Extended Memory (XMS): This is memory above 1MB; a computer with 16MB of RAM has only 15MB of extended memory. Today's PCs can access this memory without swapping, but use of extended memory still requires a different method of access.
Although the use of extended memory has overcome the 640KB barrier, the limit still applies to many programs, which use only conventional memory. Thus, you may even get out-of-memory errors when you have 8 or 16MB of RAM. Most likely, it's the 640KB conventional memory that is running out.

High Memory: This is a 64K area that begins at the 1MB boundary (an area that otherwise would be considered extended memory). It is addressed by HIMEM.SYS, which is shipped with recent versions of DOS and Windows. The most common use of high memory is for part of the COMMAND.COM file in DOS 5 and 6.x products from Microsoft. This allocation scheme allows DOS to take a much smaller portion of the precious 640KB area.

The IBM XT's 8088 (an 8086-family processor) could address a maximum of 1MB of RAM. The 286 addressed 16MB of RAM. The 80386 and later processors can address as much as 4GB. However, a computer with an 80386 processor will not always address more than 16MB. The computer's motherboard must also support the addressing. For example, the IBM PS/2 Model 80 is an 80386 machine that was commonly used as a file server. System administrators who installed more RAM were disappointed to find that the IBM PS/2 Model 80 supported a maximum of 16MB, even though the Intel 80386 processor in the IBM is capable of addressing up to 4GB.
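
The memory areas and processor limits described above reduce to simple arithmetic. The sketch below classifies a physical address into the areas listed earlier and derives the 1MB, 16MB, and 4GB limits from the number of address lines. The address-line counts (20, 24, and 32) are not given in the text and are stated here as an assumption. The names are illustrative.

```python
KB = 1024
MB = 1024 * KB


def classify_address(addr: int) -> str:
    """Name the DOS memory area that a physical address falls in."""
    if addr < 640 * KB:
        return "conventional"
    if addr < 1 * MB:
        return "upper"
    if addr < 1 * MB + 64 * KB:
        return "high"          # the 64K area starting at the 1MB boundary
    return "extended"


# Assumed address-line counts per processor (not stated in the text).
ADDRESS_LINES = {"8086/8088": 20, "80286": 24, "80386": 32}

for cpu, lines in ADDRESS_LINES.items():
    print(f"{cpu}: {2 ** lines // MB} MB addressable")

print(classify_address(0x9FFFF), classify_address(0xA0000),
      classify_address(0x100000), classify_address(0x110000))
```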

A PC operates in two modes when accessing memory: real mode and protected mode. Real mode is a backward-compatible technology developed for x86 processors. When a 286, 386, 486, 586, or 686 is running in real mode, it is running like an 8086 processor. This feature means that programs created for the original IBM-XT will still run on your Pentium Pro chip (and just a bit more quickly).

Protected mode was developed for 286 chips and higher. Many programs written for the original 8086 chip walked over each other's memory space, causing conflicts and instability. Protected mode helps solve this conflict by making programs request memory from the operating system.

The theory is that memory a program is using is protected by the operating system. Another program running on the PC is required to request memory from the operating system. The operating system will give access only to memory that is not being used.

This explanation of DOS memory does not apply to the Windows NT and Windows 95 operating systems, which do not suffer the limitations of the DOS memory model.

How Ethernet Works:

Ethernet arbitrates access to the network with the CSMA/CD (carrier sense multiple access with collision detection) media access method. This means that only one workstation can use the network at a time. CSMA/CD functions much like the old party-line telephone systems used in rural areas. If you wanted to use the telephone, you picked up the line and listened to see whether someone was already using it. If you heard someone on the line, you did not try to dial or speak; you simply hung up and waited a while before you picked up the phone to listen again.

If you picked up the phone and heard a dial tone, you knew that the line was free to use. You and your phone system operated by carrier sense. You sensed the dial tone or carrier and, if it was present, you used it. Multiple access means that more than one party shared the line. Collision detection means that if two people picked up the phone at the same time and dialed, they would "collide", and both would need to hang up the phone and try again at a later time. The first one back on the free line would gain control and be able to make a call.

In the case of Ethernet, workstations or nodes send signals (packets) across the network. When a collision takes place, the workstations or nodes transmitting the packets stop transmitting and wait a random period of time before retransmitting. Using the rules of this model, the workstations or nodes must contend for the opportunity to transmit across the network. For this reason, Ethernet is referred to as a "contention-based system". Most Ethernet networks currently run at 10 Mbps (millions of bits per second).
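
The "random period of time" is not specified in the text. The sketch below assumes the truncated binary exponential backoff used by IEEE 802.3: after the nth collision a station waits a random number of slot times between 0 and 2^min(n, 10) - 1, and abandons the frame after 16 attempts. The constants and the function name are part of that assumption.

```python
import random

SLOT_TIME_US = 51.2    # 512 bit times at 10 Mbps (assumed, per IEEE 802.3)
MAX_ATTEMPTS = 16      # the frame is abandoned after this many attempts
BACKOFF_LIMIT = 10     # the exponent stops growing after 10 collisions


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick how many slot times to wait after the given number of collisions."""
    if collisions < 1:
        raise ValueError("backoff applies only after at least one collision")
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("excessive collisions: frame is abandoned")
    exponent = min(collisions, BACKOFF_LIMIT)
    return rng.randrange(2 ** exponent)   # uniform in 0 .. 2**exponent - 1


rng = random.Random()
for n in (1, 2, 3, 10, 15):
    slots = backoff_slots(n, rng)
    print(f"after collision {n}: wait {slots} slots ({slots * SLOT_TIME_US:.1f} microseconds)")
```

Doubling the range after each collision spreads contending stations further apart in time as the network gets busier.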

10 Mbps Ethernet:

Ethernet is available for many types of cable (or physical media). The different types of Ethernet use different signaling characteristics, but they share the Ethernet framing specification, the 10 Mbps speed, and the use of CSMA/CD to arbitrate access. The four commonly used 10 Mbps Ethernet cabling systems are:

10Base5, or thicknet, which uses thick coaxial cable (RG-8 or RG-11),

10Base2, or thinnet, which uses thin coaxial cable RG-58,

10BaseT, which uses unshielded twisted-pair cable,

10BaseFL, which uses single- or multimode optical fiber.

10Base5 (Thicknet) Ethernet

The original wiring used for Ethernet is called thicknet or 10Base5. The 5 stands for its maximum length: 500 meters (1650 feet). It is named for the size and coating of the wire used, which is about as big around as your thumb. The coaxial (coax) cable is marked every 2.5 meters (8.25 feet) for connection points to workstations or nodes. This is done so you do not try to connect devices closer than 2.5 meters, since a shorter distance degrades the signal.

WARNING: Most coax is made using PVC coating. If burned, it releases hydrogen chloride gas, which forms hydrochloric acid on contact with moisture in the lungs. This can do great damage to lung tissue. Teflon-coated cable is much more expensive but is safer to use in ceilings where ventilation systems are located. Some fire codes require the use of plenum-rated (Teflon-coated type) cable if the wiring is run through ceilings.

10Base5 (thicknet) Ethernet has the following specifications:

Maximum segment length	500 meters (1650 feet)
Maximum taps	100
Maximum segments	5
Maximum segments with nodes	3
Minimum distance between taps	2.5 meters (8.25 feet)
Maximum repeaters	4
Maximum overall length with repeaters	2.5 kilometers (1.5 miles)
Maximum AUI drop cable length	50 meters (165 feet)

You normally use a device called a vampire tap to connect new connections to the thicknet. To do so, you use a tool that drills a small hole into the coaxial cable. Then you attach the tap and tighten it down, with its connector, into the hole. Although in some cases you can tap coaxial cable with users up and running, you should try to do it after working hours. A mistake can short the center conductor with the shielding and take down the entire segment.

The tap is also a transceiver, a device that handles the generation and reception of data signals. It receives its electrical power through the DIX (Digital, Intel, and Xerox) connector. An AUI (Attachment Unit Interface) cable connects the transceiver to the DIX female connector on the LAN card. At both ends of the coaxial cable, you must install a terminator to complete the electrical circuit and to cut down on signal reflections.

Thicknet cable has the following disadvantages:

Large size

High cost

Connection method (drilling into the cable structure to get to the wire)

The advantages of thicknet cabling are few for today's networks, but many thicknet networks are still in use and are reliable.

The 10Base5 wiring specification allows you to increase the length of the overall network by using repeaters, which are devices that pick up signals, amplify them, and repeat them to another segment of the cable. You may use a maximum of four repeaters on one network, with only three of the segments populated with nodes. Thus, the maximum overall length of a network that uses repeaters is five 500-meter segments, or 2.5 kilometers (1.5 miles).

10Base2 (Thinnet Coax) Ethernet

When thinnet coax (10Base2) cable was introduced, it quickly became a popular choice of network cabling, since it looks and handles much like its inexpensive and familiar cousin, the 75-ohm coaxial cable used for cable television. Because of its low cost, it is sometimes referred to as cheapernet.

10Base2 (thinnet) Ethernet has the following specifications:

Maximum segment length	185 meters (610.5 feet)
Maximum segments	5
Maximum segments with nodes	3
Maximum repeaters	4
Maximum devices per segment	30
Maximum overall length with repeaters	925 meters (3052.5 feet)

WARNING: The term 10Base2 is a little misleading since the maximum length is not actually 200 meters (660 feet) but only 185 meters (610.5 feet). Someone took the liberty of rounding up to make it fit in with the other specifications. Some vendors advertise that by using their hardware you can extend the 185 meters to 300 meters (990 feet). However, if you later mix LAN cards or repeaters from different vendors into your network, you may have problems, since most manufacturers adhere strictly to the IEEE specifications.

Thinnet also requires 50-ohm RG-58A/U or RG-58C/U coaxial cable (commonly referred to as coax). RG-58A/U is the most widely used type. Avoid RG-59 cable, which is intended for television signals. Another type of cable you may see is RG-58/U. Installing this type of wiring is a mistake because it does not meet the IEEE specification for 10Base2.

You use BNC (Bayonet Neill-Concelman, often expanded as Bayonet Nut or Bayonet Navy Connector) connectors for thinnet, along with the "T" connectors required to connect to the BNC female connectors on the LAN card. As with 10Base5 (thicknet), each end of the cable must have a terminator. Ground the cable at one end only.

The 10Base2 wiring specification differs significantly from 10Base5 in that the transceiver is built into the LAN card itself and is not a device you must attach to the cable. A cable connecting the "T" connector to the workstation, called a pigtail, cannot be used with this standard. The "T" connector must connect directly to the back of the card in a daisy-chain fashion. If it doesn't connect this way, the network connections will fail.

As with 10Base5, you can use up to four repeaters on a network, with only three of the segments populated with nodes. You can mix 10Base2 and fiber optic cabling by using a fiber/thinnet repeater. If you have repeaters on your thinnet network, be sure that all devices have SQE (Signal Quality Error) or Heart Beat turned off. If SQE is on, the SQE signal will appear as excessive collisions on the network, with the end result of slowing down the network.

To remember what you can put between any two nodes on a coaxial Ethernet network, keep in mind the "5-4-3 rule". As you may have noticed from the specifications, Ethernet topologies have a five-segment, four-repeater theme. The 3 part of the rule states that only three segments can be populated with nodes. The 5-4-3 rule does not apply to UTP (Unshielded Twisted Pair) or fiber optic cable segments. With UTP, hubs act as repeaters. You cannot have two devices separated by more than four hubs.
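
The 5-4-3 rule and the overall-length figures in the specification tables can be checked with a few lines of code. This is only a sketch of the rule as stated above; the function names and the wording of the messages are illustrative.

```python
def check_5_4_3(segments: int, repeaters: int, populated: int) -> list:
    """Return a list of 5-4-3 rule violations for a path between two nodes."""
    problems = []
    if segments > 5:
        problems.append(f"{segments} segments (maximum is 5)")
    if repeaters > 4:
        problems.append(f"{repeaters} repeaters (maximum is 4)")
    if populated > 3:
        problems.append(f"{populated} populated segments (maximum is 3)")
    return problems


def max_span_m(segment_length_m: float, segments: int = 5) -> float:
    """Longest end-to-end cable run: five segments of the maximum length."""
    return segment_length_m * segments


print(check_5_4_3(segments=5, repeaters=4, populated=3))   # empty list: compliant
print(check_5_4_3(segments=6, repeaters=5, populated=4))
print(max_span_m(500))   # 10Base5
print(max_span_m(185))   # 10Base2
```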

The disadvantages of thinnet include the high cost compared to UTP cable and the fact that the bus configuration makes the network unreliable. If any node's cable is broken, the entire segment, and probably the entire network, will be affected. Nevertheless, because it was the most economical solution for a long time, thinnet is used in many existing installations.

Thicknet and thinnet are often used together, thicknet for covering large distances between thinnet segments and thinnet for connecting a number of computers to the thicknet backbone. This combines the advantages of both types of Ethernet in one network.

10BaseT (Twisted-Pair) Ethernet

The use of unshielded twisted-pair (UTP) cable is now a well-established trend in Ethernet network wiring schemes. UTP costs less and is more flexible than 10Base5 or 10Base2 cabling. The specification for UTP was created by the IEEE 802.3 subcommittee in the 1980s. Do not substitute shielded twisted-pair (STP) cable for UTP; the IEEE 10BaseT specification is for UTP only.

10BaseT (twisted-pair) Ethernet has the following specifications:

Maximum segment length	100 meters (330 feet)
Maximum segments	1024
Maximum segments with nodes	1024
Maximum nodes per segment	2
Maximum nodes per network	1024
Maximum hubs in a chain	4

10BaseT is wired as a star, which means that each device has its own set of wires connected directly to a hub. Although the physical topology of 10BaseT is a star, its logical topology is a bus. This gives you the advantages of a star wiring scheme and a bus in one specification. 10BaseT is easy to troubleshoot because problems on one segment of wiring usually do not affect the other segments. (Each node uses its own separate segment.)

You can also isolate a device that is causing problems by just disconnecting its cable from the hub. Some hubs have built-in management capabilities that will report errors or problems, as well as allow you to remotely disconnect devices from the hub. These types of hubs are known as intelligent hubs.

The connection to the hub and the LAN cards is made with an RJ-45 connector. You can also connect 10BaseT to a DIX (Digital Intel Xerox) connector or an AUI (Attachment (also Auxiliary) Unit Interface) connector by using a transceiver or twisted-pair access unit (TPAU). Thinnet connections on LAN cards can also be used with special transceiver devices.

UTP cable is classified in categories defined by the Electronic Industries Association (EIA). Categories 1 and 2 are voice-grade cable. Categories 3, 4, and 5 are data-grade. Be sure to ask the vendor for a performance specification sheet when you purchase Category 5 cable to be sure it meets the specifications for your network.

Some buildings that are wired for telephone service with twisted-pair wires have extra installed pairs available for your use on your network. If the wiring is Category 3 or better, you can use the existing wiring to add a 10BaseT network inexpensively. You can purchase wall jacks that have an RJ-45 (Registered Jack) connector for 10BaseT and an RJ-11 connector for traditional phone lines.

There are also Teflon-coated versions of UTP cable for areas that require plenum-rated wire. The cable is light and flexible, which makes it easy to pull through during construction. The cable should be 22-, 24-, or 26-gauge AWG (American Wire Gauge), with an impedance (resistance based on signal frequency) of 85 to 110 ohms at 10 MHz.

10BaseFL (Fiber-Optic) Ethernet

10BaseFL (10 Mbps data rate, baseband signaling over a fiber-optic cable) uses light rather than electricity to transmit Ethernet frames. 10BaseFL is a star-wired network because it requires a network hub (also called a concentrator) to receive the light signal from each network station and send the same signal to all stations. The hub can be either active, with electronics to detect and retransmit the signal, or passive, with optics to split the light and reflect or guide it out to all the network stations.

10BaseFL Ethernet has the following specifications:

Maximum segment length	2000 meters (6600 feet)
Maximum segments	1024
Maximum segments with nodes	1024
Maximum nodes per segment	2
Maximum nodes per network	1024
Maximum hubs in a chain	4

A passive fiber-optic hub requires no power to operate, but the intensity of the signal is divided among all the ports on the hub; therefore, the number of ports cannot be large and the signal from the network stations must be strong. Also, since there are no electronics in a passive fiber-optic hub, the passive hub cannot be managed or have error detection circuitry. This makes a network with a passive hub harder to troubleshoot.

A 10BaseFL network segment can be up to 2000 meters (6600 feet) long, four times the length of a 10Base5 segment. The great distances a 10BaseFL segment can cover make it a common choice for network backbones.

100 Mbps Ethernet:

For some networking applications, a 10 Mbps data rate is not enough. Two competing standards extend traditional Ethernet to 100 Mbps:

100VG-AnyLAN

100BaseT Ethernet, also known as Fast Ethernet

100VG-AnyLAN:

100VG-AnyLAN (100 Mbps data rate, voice grade) combines elements of traditional Ethernet and Token Ring. It is referred to by any of the following designations:

100VG-AnyLAN

100BaseVG

AnyLAN

100VG-AnyLAN has the following advantages over regular Ethernet:

It is faster.

It supports both Ethernet and Token Ring packets.

It uses a demand priority access method (as opposed to CSMA/CD) that allows for two priority levels.

Hubs can filter individually addressed frames for enhanced privacy.

You can use 100VG-AnyLAN over categories 3, 4, and 5 twisted-pair and fiber-optic cable. It uses a star topology and defines how child hubs can be connected to a parent hub to extend the network. Several hubs can be cascaded to form larger networks. The length of any two 100VG-AnyLAN cable segments combined must not exceed 250 meters.

100BaseT Ethernet:

100BaseT, also called Fast Ethernet, is simply regular Ethernet run at a faster data rate over category 5 twisted-pair cable. 100BaseT uses the same CSMA/CD protocol in a star wired bus as 10BaseT.

100BaseT has been specified for three media types:

100BaseT4 (four pairs of category 3, 4, or 5 UTP or STP),

100BaseTX (two pairs of category 5 UTP or STP),

100BaseFX (two-strand fiber-optic cable).

In addition to the faster data rate and the higher quality cable required, 100BaseT has the same advantages and drawbacks as 10BaseT.

Hub:

A hub is a component that serves as a common termination point for multiple nodes and that can relay signals along the appropriate paths. Generally, a hub is a box with a number of connectors to which nodes (workstations) are attached. Hubs usually accommodate four or eight nodes, and many hubs include connectors for linking to other hubs.

A hub usually connects nodes that have a common architecture, such as Ethernet, ARCnet, FDDI, or Token Ring. This is in contrast to a concentrator, which can generally support multiple architectures. Although the boundary between concentrators and hubs is not always clear, hubs are generally simpler and cheaper than concentrators. Token Ring hubs are known as multistation access units (MAUs or MSAUs).

Hub-node connections for a particular network all use the same type of cable, which may be coaxial, twisted-pair, or fiber-optic. Regardless of the type of cabling used for hub-node connections, it is often advisable to use fiber-optic cable for hub-hub connections.

Hubs may be located in a wiring closet, and they may be connected to a higher-level wiring center, known as an intermediate distribution frame (IDF) or main distribution frame (MDF).

In light of its central role, you should seriously consider connecting a hub to a UPS (uninterruptible power supply).

Hub Operation:

All hubs provide connectivity; they pass on signals that come through them. The simplest hub broadcasts incoming signals to all connected nodes; more intelligent hubs will selectively transmit signals. Any other services a hub provides will depend on the capabilities that have been built into the hub. MAUs (Token Ring hubs) and active hubs (used in the ARCnet architecture) also boost a signal before passing it on. MAUs also do some internal routing of the node connections in order to create a ring arrangement for the nodes.

There are constraints on the distances that can separate a hub from a node or from another hub. These constraints depend on the type of hub (active or passive) and on the network architecture. In general, allowable node-hub distances are shorter than hub-hub distances.

Hub Features:

In addition to connectivity, some hubs also provide management capabilities. Some hubs include an on-board processor which can monitor network activity and can store monitoring data in a MIB (management information base). A network management program -- running on the hub or on a server -- can use these data to fine-tune the network in order to improve the network's performance.

Just about all hubs have LEDs (light-emitting diodes) to indicate the status of each port (node). Many hubs can also do partitioning, which is a way to isolate a nonfunctioning node.

Other capabilities can be built into hubs or can be provided through software. For example, hubs can be provided with non-volatile memory, which can retain states and configuration values in case of a power outage.

Hubs can also be built or imbued with security capabilities. For example, with the help of software, certain high-end hubs can be made to send data packets to a destination node and garbage packets to all other nodes. This makes it much more difficult for a node to read packets not intended for that node.

Various types of special-purpose or enhanced hubs have been developed to incorporate some subset of the above features. In some cases, hub devices may be considered hubs or concentrators.

Hub and Spoke:

A term for an arrangement with a central component and multiple peripheral, or outlying components. For example, a central office with connections to smaller branch offices would have a hub-and-spoke arrangement.

Hub Card:

In 10BaseT networks, a multiport card that can be used in place of a hub.

Hub Topology:

See Star Topology.

Huffman Coding:

In computer science, Huffman coding is an entropy encoding algorithm used for data compression. It was developed by David A. Huffman and published in 1952 as "A Method for the Construction of Minimum-Redundancy Codes", Proceedings of the IRE, Vol. 40, pages 1098-1101.

Huffman coding is a method of encoding symbols that varies the length of each code according to the symbol's information content. Symbols with a low probability of appearance are encoded with a code using many bits. Symbols with a high probability of appearance are represented with a code using fewer bits. Huffman codes can be properly decoded because they obey the prefix property, which means no code can be a prefix of another code.

A symbol, in data compression terminology, is an atomic unit of information. General purpose compression programs frequently compress streams of bytes, where the byte is the same thing as the symbol. However, a symbol could just as easily be a floating point number, a spoken word, etc.

The basic idea of Huffman coding is borrowed from an older and slightly less efficient method called Shannon-Fano coding. That technique, developed in the late 1940s, attempts to minimize the number of bits used in a message when the probabilities of the symbols in the message are known. Shannon-Fano coding has generally been superseded by Huffman coding, which produces provably optimum code sets and therefore marginally better performance than Shannon-Fano codes.

This is also known as statistical compression, a form of compression that restructures the elements of a file or data stream such that those elements which are used most frequently get short symbols and those used the least often get long ones.

Standard Huffman coding suffers from a significant problem when used for high performance data compression. The compression program has to pass a complete copy of the Huffman coding statistics to the expansion application. As the compression program collects more statistics and tries to increase its compression ratio, the statistics take up more space and work against the increased compression.

The "text" to be compressed is considered as a string of symbols. Symbols that are likely to be frequent are represented by a short sequence of bits, and symbols that are likely to be rare are represented by longer sequences of bits.

Huffman coding uses a specific method for choosing the representations for each symbol, resulting in a prefix-free code (i.e. no bit string of any symbol is a prefix of the bit string of any other symbol). It has been proven that Huffman coding is the most effective compression method of this type. That is, no other mapping of source symbols to strings of bits will produce a smaller output when the actual symbol frequencies agree with those used to create the code. For a set of symbols whose cardinality is a power of two and a uniform probability distribution, Huffman coding is equivalent to simple binary block encoding.

Huffman coding reaches the entropy limit exactly when the probabilities of the input characters are negative powers of two (1/2, 1/4, 1/8, and so on). In other cases, arithmetic coding produces slight gains over Huffman coding, but in practice these gains have not been large enough to offset arithmetic coding's higher computational complexity and patent royalties (as of November 2001, IBM owns patents on the core concepts of arithmetic coding in several jurisdictions).
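The power-of-two case can be checked with a few lines of Python. This is a minimal sketch: the four-symbol source below is hypothetical and chosen only because each probability is a negative power of two, so each symbol can be given exactly -log₂(p) bits.

```python
from math import log2

# Hypothetical source: every probability is a negative power of two.
probs = {"a": 1 / 2, "b": 1 / 4, "c": 1 / 8, "d": 1 / 8}

# Shannon entropy in bits per symbol.
entropy = -sum(p * log2(p) for p in probs.values())

# For such a source each symbol can be given exactly -log2(p) bits.
lengths = {s: int(-log2(p)) for s, p in probs.items()}
average_length = sum(probs[s] * lengths[s] for s in probs)

print(lengths)
print(entropy, average_length)
```

Here the average code length and the entropy coincide, so no other symbol-by-symbol code can do better for this source.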

Huffman coding shares most characteristics of Shannon-Fano coding as stated above. It creates variable-length codes that are an integral number of bits. Symbols with higher probabilities get shorter bit codes. Huffman codes have the unique prefix attribute, which means they can be correctly decoded despite being variable length. Decoding a stream of Huffman codes is generally done by following a binary decoder tree.
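The role of the prefix property in decoding can be sketched in Python. The three-symbol code table below is hypothetical and used only for illustration, and the dictionary lookup stands in for walking a binary decoder tree one branch per bit.

```python
def decode(bits, codes):
    """Decode a bit string using a prefix-free table of symbol -> code."""
    by_code = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:  # a complete code word has been read
            out.append(by_code[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return "".join(out)


# Hypothetical code table, chosen only for illustration.
codes = {"a": "0", "b": "10", "c": "11"}
encoded = "".join(codes[s] for s in "abacab")
decoded = decode(encoded, codes)
print(encoded, decoded)
```

Because no code word is a prefix of another, the decoder never has to look ahead: the first match it finds is the only possible one.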

Huffman codes are built from the bottom up, starting with the leaves of the tree and working progressively closer to the root.

The procedure for building the tree is simple and elegant. The individual symbols are laid out as a string of leaf nodes that are going to be connected by a binary tree. Each node (tree) has a weight, which is simply the frequency or probability of the symbol's appearance. The tree is then built with the following steps:

Start with as many trees as there are symbols.
While there is more than one tree:
    Find the two trees with the smallest total probability.
    Combine the trees into one, setting one as the left child and the other as the right.

Now the tree contains all the symbols. A '0' represents following the left child; a '1' represents following the right child.

This algorithm can be applied to the symbols used in the example below. The six symbols (A through F) in our message are laid out, along with their frequencies and/or probabilities. Here is an example (when the probabilities are known):

    A: 0.30
    B: 0.30
    C: 0.13
    D: 0.12
    E: 0.10
    F: 0.05

From the probabilities above, build the first layer of the tree:

Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

The table above prepares for Huffman code construction by listing all events in descending order of probability. These six nodes are going to end up as the leaves of the decoding tree.

Step one in Huffman code construction: pair the two events with the lowest probabilities. See the table below.

					EF (0.15)
					0	1
Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

Step two: repeat for the pair with the next lowest probabilities. See the table below.

			CD (0.25)		EF (0.15)
			0	1	0	1
Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

Repeat again. Note that for this example, previous pairs are paired. See the table below.

			CDEF (0.40)
			0		1
			CD (0.25)		EF (0.15)
			0	1	0	1
Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

Continue pairing the lowest-probability events. See the table below.

	AB (0.6)		CDEF (0.40)
	0	1	0		1
			CD (0.25)		EF (0.15)
			0	1	0	1
Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

Continue pairing:

	ABCDEF (1) [ROOT]
	0		1
	AB (0.6)		CDEF (0.40)
	0	1	0		1
			CD (0.25)		EF (0.15)
			0	1	0	1
Probability of Event or Symbol	0.30	0.30	0.13	0.12	0.10	0.05
Event or Symbol Name	Event A	Event B	Event C	Event D	Event E	Event F

The construction of the code proceeds by successively pairing events with the lowest probabilities, until all events have been paired. Elements of the code (zeros and ones) are assigned as indicated, and the resulting codes may be read from the diagram. For example, the code for event A (the most likely) is 00 while the code for event F (the least likely) is 111. The table below shows how the rest of the code is developed for the events or symbols from the probability tree above.

Table showing Code for the Events or Symbols from the above tree:

Events	Probability	Code
Event A	0.30	00
Event B	0.30	01
Event C	0.13	100
Event D	0.12	101
Event E	0.10	110
Event F	0.05	111
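The construction walked through above can be sketched in Python with a priority queue. This is an illustrative sketch, not a reference implementation: the function name `huffman_codes` and the tie-breaking rule are assumptions. Ties are broken so that the later-listed symbol is taken first, which reproduces the 0/1 assignment in the table; any other tie-break gives an equally short code.

```python
import heapq
from itertools import count


def huffman_codes(probabilities):
    """Return {symbol: bit string} for a dict of symbol -> probability."""
    tiebreak = count(0, -1)  # decreasing, so later entries win ties
    # Heap entries are (weight, tiebreak, tree); a tree is either a
    # symbol or a (left, right) pair. Symbols must not be tuples.
    heap = [(p, next(tiebreak), sym) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w_low, _, low = heapq.heappop(heap)    # smallest weight: branch '1'
        w_high, _, high = heapq.heappop(heap)  # next smallest: branch '0'
        heapq.heappush(heap, (w_low + w_high, next(tiebreak), (high, low)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # single-symbol edge case

    walk(heap[0][2], "")
    return codes


probs = {"A": 0.30, "B": 0.30, "C": 0.13, "D": 0.12, "E": 0.10, "F": 0.05}
codes = huffman_codes(probs)
for symbol in probs:
    print(symbol, codes[symbol])
```

The heap keeps the two lightest trees at hand, so each pairing step costs only a logarithmic number of comparisons.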

To evaluate the entropy associated with the events above, you would use Shannon's entropy model, which states:

H(X) = -∑_i p_i log₂ p_i

In the Shannon entropy equation, p_i is the probability of a given symbol or event i, and the sum runs over all symbols or events.

The Shannon entropy equation or model provides a way to estimate the average minimum number of bits needed to encode a string of symbols or events, based on the frequency of those symbols. The minimum average number of bits per symbol or event is:

numBits = H(X)

H(X) is a lower bound on the average. A code that spends a whole number of bits on each symbol, such as a Huffman code, has an average length of at least H(X) and less than H(X) + 1 bits per symbol.

To calculate log₂ from another log base b (e.g., log₁₀ or log_e), we would use the relationship below:

log₂(n) = log_b(n)/log_b(2)

With natural logarithms (b = e), this becomes log₂(n) = ln(n)/0.69314718.

Thus, to calculate the entropy and obtain the minimum average number of bits, we would do the following calculations:

H(E_A) = -0.30(log₂0.30) = .521

H(E_B) = -0.30(log₂0.30) = .521

H(E_C) = -0.13(log₂0.13) = .383

H(E_D) = -0.12(log₂0.12) = .367

H(E_E) = -0.10(log₂0.10) = .332

H(E_F) = -0.05(log₂0.05) = .216

H(X) = (.521 + .521 + .383 + .367 + .332 + .216) = 2.340 or,

The numBits = H(X) = 2.340 bits/symbol or event

This figure is an average lower bound and cannot be rounded down to 2 bits per symbol. A fifteen-symbol message drawn from this source needs at least about 15 × 2.340 ≈ 35.1 bits on average. The Huffman code in the table above averages 2 × 0.60 + 3 × 0.40 = 2.40 bits per symbol, and it encodes the fifteen-character string AAAAABBCCDDEFFF in 38 bits. An optimal encoding allocates fewer bits to the frequently occurring symbols (e.g., A and B) and longer bit sequences to the more infrequent symbols (C, D, E, and F).
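The arithmetic for this example can be reproduced with a short Python sketch. It takes the probabilities and the code table for events A through F as given and computes the entropy without intermediate rounding, the average code length, and the encoded size of the fifteen-character string; the variable names are illustrative only.

```python
from math import log2

probs = {"A": 0.30, "B": 0.30, "C": 0.13, "D": 0.12, "E": 0.10, "F": 0.05}
codes = {"A": "00", "B": "01", "C": "100", "D": "101", "E": "110", "F": "111"}

# Shannon entropy: the lower bound on average bits per symbol.
entropy = -sum(p * log2(p) for p in probs.values())

# Average length of the code table, weighted by symbol probability.
average_length = sum(probs[s] * len(codes[s]) for s in probs)

# Size of one particular message under the same code table.
message = "AAAAABBCCDDEFFF"
encoded_bits = sum(len(codes[s]) for s in message)

print(f"entropy        = {entropy:.3f} bits/symbol")
print(f"average length = {average_length:.2f} bits/symbol")
print(f"{len(message)}-symbol message = {encoded_bits} bits")
```

The average code length sits slightly above the entropy, as expected for a code that must spend a whole number of bits on every symbol.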

In information theory, entropy is conceptually the actual amount of (information theoretic) information in a piece of data. An entirely random byte of data has an entropy of 8 bits, the maximum for a byte, since all 256 values are equally likely and you never know what the next one will be. A long string of A's has an entropy of 0, since you know that the next character will always be an 'A'. The entropy of English text is often estimated at about 1.5 bits per character/symbol. Try compressing some English text with the PPM compression algorithm! The entropy rate of a data source means the average number of bits per symbol needed to encode it, as shown above.

Another example:

If the discrete output symbols of a source are statistically independent, the source is known as a discrete memoryless source (DMS). Consider a DMS with seven possible symbols X1, X2,..., X7 having probabilities of occurrence as follows:

    X1: 0.350
    X2: 0.300
    X3: 0.200
    X4: 0.100
    X5: 0.030
    X6: 0.015
    X7: 0.005

From the probabilities above you can build the following tree:

	X1X2X3X4X5X6X7 (1)
	0			1
	X1X2X3 (0.850)			X4X5X6X7 (0.150)
	0	1		0		1
		X2X3 (0.500)		X4X5 (0.130)		X6X7 (0.020)
		0	1	0	1	0	1
Probability of Symbol	0.350	0.300	0.200	0.100	0.030	0.015	0.005
Symbol Name	X1	X2	X3	X4	X5	X6	X7

Table showing Code for the Events or Symbols from the above tree:

Symbol	Probability	Code
X1	0.350	00
X2	0.300	010
X3	0.200	011
X4	0.100	100
X5	0.030	101
X6	0.015	110
X7	0.005	111
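The prefix property of a code table such as this one can be checked mechanically. The sketch below is illustrative and the function name `is_prefix_free` is an assumption. It relies on the fact that, once the code words are sorted, any word that is a prefix of another is immediately followed by a word that starts with it.

```python
def is_prefix_free(codes):
    """Return True if no code word is a prefix of another code word."""
    words = sorted(codes.values())
    # After sorting, a prefix relation always shows up between neighbours.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))


table = {"X1": "00", "X2": "010", "X3": "011", "X4": "100",
         "X5": "101", "X6": "110", "X7": "111"}
table_ok = is_prefix_free(table)

# Hypothetical broken table: "0" is a prefix of "01".
broken_ok = is_prefix_free({"a": "0", "b": "01", "c": "11"})

print(table_ok, broken_ok)
```

A table that fails this check cannot be decoded unambiguously one bit at a time.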

Then evaluate the entropy associated with the discrete memory-less source (DMS).

H(X1) = -0.350(log₂0.350) = .530

H(X2) = -0.300(log₂0.300) = .521

H(X3) = -0.200(log₂0.200) = .464

H(X4) = -0.100(log₂0.100) = .332

H(X5) = -0.030(log₂0.030) = .152

H(X6) = -0.015(log₂0.015) = .091

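H(X7) = -0.005(log₂0.005) = .038

H(X) = (.530 + .521 + .464 + .332 + .152 + .091 + .038) = 2.128 or,

numBits = H(X) = 2.128 bits/symbol

This figure is an average lower bound and cannot be rounded down to 2 bits per symbol. The code in the table above averages 2 × 0.350 + 3 × 0.650 = 2.65 bits per symbol, and it encodes the fifteen-symbol string X1X1X1X1X1X2X2X3X3X4X4X5X6X6X7 in 40 bits. Note that the tree above does not always merge the two lowest-probability subtrees, so its code is a valid prefix code but not the shortest one; strictly merging the two smallest weights at each step gives an average of 2.22 bits per symbol. In either case, the encoding allocates fewer bits to the frequently occurring symbols (e.g., X1, X2, and X3) and longer bit sequences to the more infrequent symbols (X4, X5, X6, and X7).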
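The figures for the seven-symbol source can be checked with a short Python sketch. It computes the entropy, the average length of the code table for X1 through X7, and the average length that a strict Huffman construction would reach. The last value uses the fact that the average code length equals the sum of the weights of all merged (internal) nodes; the variable names are illustrative only.

```python
import heapq
from math import log2

probs = {"X1": 0.350, "X2": 0.300, "X3": 0.200, "X4": 0.100,
         "X5": 0.030, "X6": 0.015, "X7": 0.005}
table_codes = {"X1": "00", "X2": "010", "X3": "011", "X4": "100",
               "X5": "101", "X6": "110", "X7": "111"}

entropy = -sum(p * log2(p) for p in probs.values())
table_average = sum(probs[s] * len(table_codes[s]) for s in probs)

# Strict Huffman: repeatedly merge the two smallest weights. Each merge
# adds one bit to every symbol beneath it, so the merged weights sum to
# the average code length.
heap = list(probs.values())
heapq.heapify(heap)
huffman_average = 0.0
while len(heap) > 1:
    merged = heapq.heappop(heap) + heapq.heappop(heap)
    huffman_average += merged
    heapq.heappush(heap, merged)

print(f"entropy         = {entropy:.3f} bits/symbol")
print(f"table average   = {table_average:.2f} bits/symbol")
print(f"Huffman average = {huffman_average:.2f} bits/symbol")
```

The gap between the two averages shows how much a balanced tree costs when the probabilities are strongly skewed.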

There are variations. The frequencies used can be generic ones for the application domain that are based on average experience, or they can be the actual frequencies found in the text being compressed. (This variation requires that a frequency table or other hint as to the encoding must be stored with the compressed text; implementations employ various tricks to store these tables efficiently.) A variation called "adaptive Huffman coding" calculates the frequencies dynamically based on recent actual frequencies in the source string. This is somewhat related to the LZ family of algorithms.

Extreme cases of Huffman codes are connected with Fibonacci numbers. For examples, see http://mathforum.org/discuss/sci.math/t/207334

Huffman coding today is often used as a "back-end" to some other compression method. DEFLATE (PKZIP's algorithm) and multimedia codecs such as JPEG and MP3 have a front-end model and quantization followed by Huffman coding.

The n-ary Huffman algorithm uses the alphabet {0, 1, ..., n-1} to encode the message, and the tree it builds is an n-ary tree.

The Huffman Template algorithm makes it possible to use non-numerical weights (costs, frequencies). For example, see http://alexvn.freeservers.com/s1/huffman_template_algorithm.html

Hybrid Topology/Network Hardware:

Before any attempt is made to interconnect a mixture of network configurations, some basic network characteristics need to be understood. The topology is the way a network is configured. Another network characteristic is the protocol. Therefore, when interconnecting networks, the protocol, as well as the topology, must be considered. Two networks that use the same topology but different protocols cannot effectively communicate without help.

Heterogeneous networks can be thought of as building blocks connected by "black boxes". The building blocks are self-contained local area networks with their own workstations, servers, and peripherals. Each consists of a single topology and a single protocol. The one exception to the single-protocol rule is the Ethernet network, which has a bus topology but can carry several protocols.

To connect two of these building blocks, a boundary must be crossed. A connection must be made with both "building blocks" via a black box either by a physical cabling scheme or by radio waves. The device that makes the connection, the "black box", does not change either interconnecting network. It simply transfers packets of data between the networks. It not only satisfies all the physical requirements of both networks, but also transfers the data safely and securely from one network to the other.

The ability to connect two heterogeneous networks depends on two requirements. First, the topologies must be able to be interconnected. Second, there must be a way to transfer information between dissimilar systems of communication (protocols). This means that at some point a common protocol must be employed. There are several ways to accomplish this task. Most use high-level protocols for moving data and employ tools for internetworking such as:

Bridges (see bridges),
Routers (see routers),
B-Routers (see routers), and
Gateways (see gateways).

Each of these devices has its characteristics and specific applications. The type of device used in connecting dissimilar networks will depend on the amount of transparency desired and the cost that a company is willing to pay for such devices. A general rule of thumb is that the more sophistication a device has, the higher the transparency will be to the users and networks and the more expensive the equipment will be.

Hybrid Topology:

A network which contains elements of more than one of the network configurations of bus, ring, and star. For example, a bus network may have a ring network as one of its links. Another type of hybrid topology is a star network that has a bus network as one of its links where a workstation is normally found.

HyperText Markup Language (HTML):

The Web is a Hypertext Information System. Hypertext enables you to read and navigate text and visual information in a nonlinear fashion based on what you want to know next.

HTML is based on SGML (Standard Generalized Markup Language), a much bigger document-processing system. To write HTML pages, you won't need to know a whole lot about SGML, but it does help to know that one of the main features of SGML is that it describes the general structure of the content inside documents, not that content's actual appearance on the page or on the screen.

HTML, by virtue of its SGML heritage, is a language for describing the structure of a document, not its actual presentation. The idea here is that most documents have common elements -- for example, titles, paragraphs, or lists.

If you've worked with word processing programs that use style sheets (such as Microsoft Word) or paragraph catalogs (such as FrameMaker), then you've done something similar; each section of text conforms to one of a set of styles that are pre-defined before you start working.

HTML defines a set of common styles for Web pages: headings, paragraphs, lists, image insertion, and tables. It also defines character styles such as boldface and code examples. Each element has a name and is contained in what's called a tag, for example <TABLE BORDER="0" WIDTH="450"> or <P>. When you write a Web page in HTML, you label the different elements of your page with these tags that say "this is a heading" or "this is a list item". It's as if you were working for a newspaper or a magazine where you do the writing but someone else does the layout; you might explain to the layout person that this line is the title, this line is a figure caption, this line is a heading, or this line is the beginning of a table. It's the same way with HTML.

When you're working with a word processor or page layout program, styles are not just named elements of a page -- they also include formatting information such as the font size and style, indentation, underlining, and so on. So when you write some text that's supposed to be a heading, you can apply the Heading style to it, and the program automatically formats that paragraph for you in the correct style.

HTML doesn't go this far. For the most part, HTML doesn't say anything about how a page looks when it's viewed. All HTML tags indicate is that an element is a heading, list, or table -- they say nothing about how that heading or list is to be formatted. So, as with the magazine example and the layout person who formats your article, it's the layout person's job to decide how big the heading should be and what font it should be in -- the only thing you have to worry about is marking which section is supposed to be a heading.

Web browsers, in addition to providing the networking functions to retrieve pages from the Web, double as HTML formatters. When you read an HTML page into a browser such as Netscape or Lynx, the browser reads, or parses, the HTML tags and formats the text and images on the screen. The browser has mappings between the names of page elements and actual styles on the screen; for example, headings might be in a larger font than the text on the rest of the page. The browser also wraps all the text so that it fits into the current width of the window.
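The idea of parsing tags and then mapping element names to styles can be sketched with Python's standard html.parser module. This is a toy illustration, not how any particular browser works: the `TagLister` class, the `STYLES` mapping, and the sample page are all assumptions made for the example.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collect (element name, text) pairs from a simple page."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # name of the element currently open

    def handle_starttag(self, tag, attrs):
        self._open = tag  # the parser reports tag names in lowercase

    def handle_data(self, data):
        if self._open and data.strip():
            self.elements.append((self._open, data.strip()))

    def handle_endtag(self, tag):
        self._open = None


# Hypothetical mapping from element names to an on-screen style:
# headings are shown in capitals, paragraphs are shown unchanged.
STYLES = {"h1": str.upper, "p": str}

page = "<H1>Hubs</H1><P>A hub connects nodes.</P>"
parser = TagLister()
parser.feed(page)
parser.close()

rendered = [STYLES[tag](text) for tag, text in parser.elements]
print(parser.elements)
print(rendered)
```

Swapping in a different `STYLES` mapping changes how the page looks without touching the markup, which is the separation of structure and presentation described above.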

Different browsers, running on different platforms, may have different mappings for each page element. Some browsers may use different font styles than others. So, for example, one browser might display italics as italics, whereas another might use reverse text on systems that don't have italic fonts, or it might show a heading in all capital letters instead of a larger font. What this means to you as a Web page designer is that the pages you create using HTML may look radically different from system to system and from browser to browser. The actual information and links inside those pages will still be there, but the appearance on the screen will change. You can design a Web page so that it looks perfect on your computer system, but when someone else reads it on a different system, it may look entirely different (and it may very well be entirely unreadable -- however, this was a bigger problem in older browsers).

Thus, HTML is a markup language. Writing in a markup language means that you start with the text of your page and add special tags around words and paragraphs. If you've ever worked with other markup languages such as troff or LaTeX, or even older DOS-based word processors where you put in special codes for things such as "turn on bold", this won't seem all that unusual. The fact that the content remains unconfused over large variations of window sizes is due to the intelligent interpretation of the markup information in HTML.

The tags indicate the different parts of the page and produce different effects in the browser. HTML has a defined set of tags you can use. Unlike XML (eXtensible Markup Language), HTML doesn't let you make up your own tags to create new appearances or features. Also, just to make sure things are really confusing, different browsers support different sets of tags.

HyperText Transfer Protocol (HTTP):

The TCP/IP-based protocol used on the World Wide Web for the exchange of data between Web browsers (clients) and Web servers. HTTP is the workhorse of the Web. Like most Internet service protocols, HTTP runs over a standard TCP (Transmission Control Protocol) connection, by default on port 80. An increasing number of Web servers, however, now use different ports. For example, some URLs (Uniform Resource Locators) require support for ports 81, 8000, 8002, and 8080. Although there is no need to support these ports automatically, one should watch out for such deviations from the HTTP default. If a user cannot connect to a Web site, check the syntax of the URL and look for a port number in it. For instance, you might find a URL that reads:

http://www.beer-example.net:8080/.
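Picking the port number out of a URL can be sketched with Python's standard urllib.parse module. The helper name `effective_port` is an assumption made for this example; it falls back to the HTTP default of 80 when the URL names no port.

```python
from urllib.parse import urlsplit


def effective_port(url, default=80):
    """Return the port named in the URL, or the HTTP default if none."""
    port = urlsplit(url).port  # None when the URL has no explicit port
    return port if port is not None else default


explicit = effective_port("http://www.beer-example.net:8080/")
implicit = effective_port("http://www.beer-example.net/")
print(explicit, implicit)
```

A check like this makes it easy to spot a site that expects a non-default port before blaming the network.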

HyperText Transfer Protocol-Next Generation (HTTP-NG):

A proposed replacement to the current HTTP standard that improves performance of Web browser-server transactions and adds features.

Hz (Hertz):

A unit of frequency. Hertz is used, for example, to describe the periodic properties of acoustic, electrical, and optical signals. One hertz is equal to one cycle per second.


robert.d.betterton@rdbprime.com


This site is brought to you by
Bob Betterton; 2001 - 2011.

This page was last updated on 09/18/2005
Copyright, RDB Prime Engineering
