Why I Built KMAX – The World’s Best Network Emulator

28 May, 2024 · by Karl Auerbach · Read in about 31 min · (6558 Words)
blog technology internet testing network emulator

I am a human, I am not an AI. I wrote this piece to reflect on the testing of Internet protocols, in particular through the use of devices that tickle the flow of packets going back and forth between devices that speak those protocols.

The term “network emulator” is ambiguous.

Does it refer to a device that affects actual packet traffic or is it a mathematical model about what would happen to real traffic? If real packet traffic, where do those packets come from, the user’s own network or a synthetic traffic generator? Can traffic be classified into related streams and subjected to different kinds of effects? Is the purpose of the emulator to test protocol implementations for robust and correct operation, or is the purpose to push devices under extreme loads?

I tend to work with actual, running network protocol code. My intent is to test whether devices that contain that code perform acceptably when presented with the kinds of network conditions that can occur on real networks but which are rarely present on the pristine networks used by code developers and QA teams. As a consequence, I am more attuned to network emulation tools that allow the controlled manipulation of actual traffic streams emitted by real devices than with mathematical simulations of hypothetical or synthetic, perfect protocol interactions.

This note is structured as follows:

What is a “network emulator”, what kinds of features distinguish different emulators, and how do emulators meld with the world of network modeling and simulation tools?
How does a user/operator know what values to plug into a network emulator?
What are some important, but often overlooked, characteristics and features (such as accuracy, stability, and the ability to create transient and burst conditions)?
What kind of network emulator do you need?
How does KMAX compare with other products?
Who am I and why should you believe me?

A note on terminology: Technically a “packet” is data contained within an Ethernet “frame”. That packet may be IPv4, IPv6, or something else. It can sometimes be useful for a user to understand whether the emulator device operates on a frame or a packet; however, for most purposes the difference is not important to the user of the emulator.

What in the world is a “Network Emulator”?
How does a user obtain configuration values?
Accuracy, precision, and stability
Transient bursts
Direction matters
Cloud based? Huh?
Background/synthetic traffic
Know your goals
Scenarios
Fallacies
What’s so great about KMAX?
KMAX versus other products
About the author: Who am I to say these things? Why should you believe me?

What in the world is a “Network Emulator”?

There are many test devices that call themselves “network emulators”. Some actually emulate, some mathematically simulate; some work on actual traffic, some create synthetic traffic, some model hypothetical traffic. Some test actual code in live operation, some presume generic device models.

If you are looking to do network testing I would suggest punching through the “network emulator” label and asking “what do I really want to accomplish?”

The kind of devices I have built – such as KMAX – inflict controlled impairments (drop, delay, reordering, rate limits, etc.) on real traffic. KMAX tends to be called a “network emulator” because of the lack of a better descriptive noun. However, KMAX creates real network conditions and subjects the devices under test (“DUT”s) to real network traffic.

I would divide the world of things called “network emulators” into several categories:

Flexible, dual-direction, man-in-the-middle devices

This category of tools (including KMAX) makes real-time changes (or “impairments”) to existing flows of packets. These tools can create conditions such as delay/latency, jitter (packet delay variation), drop, re-sequencing, dam bursting (accumulation of packets that are then suddenly released), duplication, and rate limitation. Some can alter packet content.
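To make one of these impairments concrete, here is a minimal sketch of dam bursting. It is only an illustration: the function name, the use of plain arrival times in seconds, and the sample numbers are assumptions made for this example, not how KMAX or any other emulator is implemented.

```python
def dam_burst(arrivals, dam_start, dam_end):
    """Return departure times for packets under a 'dam burst' impairment.

    arrivals: packet arrival times in seconds (hypothetical input format).
    Packets arriving while the dam is closed (dam_start <= t < dam_end)
    are held and all released together at dam_end; others pass untouched.
    """
    return [dam_end if dam_start <= t < dam_end else t for t in arrivals]


# Six packets spaced 20 ms apart; the dam is closed from 30 ms to 90 ms.
arrivals = [0.00, 0.02, 0.04, 0.06, 0.08, 0.10]
departures = dam_burst(arrivals, dam_start=0.03, dam_end=0.09)
print(departures)
```

The three packets that arrive while the dam is closed all leave at the same instant, which is exactly the kind of sudden clump that a receiving protocol stack has to absorb.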

This category of test devices is intended to exercise the code inside devices under test (DUTs). The purpose is to evaluate how well those DUTs handle the kinds of network conditions that can (and often do) exist in real life deployments but are often not found in the near-perfection of developer and QA test networks.

This category of test device was envisioned long ago – 1987: Jon Postel in RFC 1025 called such a device a “flakeway”.

Because these are “man-in-the-middle” systems, this kind of tool can be used in a wide variety of situations with only the devices under test having to contain network protocol code. The downside is that one needs those devices under test, and perhaps some ancillary devices (typically things like DHCP servers or DNS) to establish the traffic flows that the test tool will be afflicting with impairments.

There are several products in this category: some are relatively inflexible, some are quite expensive, some are open source, and some are difficult to use. I will try to enumerate these later in this paper. Of course, among these, I consider KMAX the best, by far.

It should be noted that for man-in-the-middle tools there are several significant sub-attributes:

Layer 2 inline

Such devices (including KMAX) appear as a “bump on an Ethernet”. Test devices of this nature tend to act as layer 2 bridges (without engaging in spanning-tree protocols) and require minimal configuration.

Layer 3 inline

This kind of device acts as an IPv4 or IPv6 router. This kind of tester is not transparent; the effects of routing are visible to the devices under test. In addition, the testing user will have to configure the tester with IP addresses and routing information (which may be implied) and will probably have to assign the devices under test to separate IP subnets.

Built-in

Testers of this nature are incorporated into the device under test itself. This kind of tool is most often found in smart phones or built into web browsers and can deal only with traffic to/from that phone or that web browser.

A more generalized form of built-in tester may be found on Linux or *BSD operating systems in which the user may bind a built-in emulator to a network interface or may set up two such emulators and use the underlying operating system’s bridging machinery, if present, to create a full layer 2 inline tester. This often appears, at first glance, to be easy – there are usually many “how to do it” websites. However, there are complexities (such as dealing with the variable length headers created by IP options) that can make setting this up quite difficult (and there are usually no websites that explain how.)

With or without packet classification

Many test tools lack the ability to differentiate between traffic streams. Thus web traffic and voice traffic will be afflicted in the same way. That can create artificial or misleading results. (For example, if one is emulating a wide-area network path it is usually a good idea to add latency only to those packets that would cross that wide-area path and not add latency to what would typically be locally handled packets such as ARPs or DNS queries.)

Packet classification may be simple, such as allowing packets to be treated differently based on IP addresses, or may flexibly allow packet header fields at many protocol levels to be evaluated. For example, it can be quite useful to be able to apply different emulations to traffic on different IEEE 802.1Q VLANs.

Programmable

Networks are dynamic, they change from moment to moment. However, many emulators are inflexible: once configured and started, they run the same impairments without change. A better emulator allows impairments to change either on short time scales (to emulate network burst behaviors) or over longer times (to emulate things like route flapping, errors induced by the cycling of nearby machinery, or low earth orbit satellite transits and blanking intervals.)

Easy on/off

One of the more difficult aspects of network testing is understanding the effects being caused by the testing. Sometimes this is done by capturing statistics or traffic and doing post-test, off-line analysis. Other times, particularly with human-oriented network applications such as music, voice, or video, the best form of evaluation is done by a human listening to, or watching application behavior. To do this it is useful if the test tool has an easy on/off switch so that the test user can do quick A versus B comparisons. (KMAX calls this “bypass” and gives the user one-click means to disable test impairments in either direction, individually, or both directions simultaneously.)

Internal latency

All man-in-the-middle emulators add real delay. It takes time to receive a packet, process it, and then transmit it. That time ought to be small and stable. Some emulators use specialized (often expensive) hardware, often at the cost of reducing the flexibility of impairments that the emulator can perform. Other emulators (such as netem or WANem) don’t address these concerns and depend on the added delay and its variation to be too small to be noticed by the user. Emulators such as KMAX use over-provisioned hardware foundations, multi-core processors, and tuned operating systems to avoid the resource starvation that causes emulator internal delays and, especially, variation of those delays.

Observation and capture tools

This category includes tools such as Wireshark and tcpreplay. Tools like Wireshark are invaluable, but they are a long way from “network emulators”. Closer are tools (like tcpreplay) that can replay previously captured traffic, often with a limited ability to alter packet fields (such as the TCP initial sequence number) that tend to vary with each instance of a network transaction.

Hardware indicators, such as the Link State and traffic send/receive LEDs on many network interfaces, are quite useful observation tools.

Reachability Tools and Lightweight Generators

This category is composed of software tools that generate light to moderate loads of traffic or network transactions. These are tools such as iPerf, ping, Scapy, telnet, curl, or any number of web testing tools. These tools create lightweight patterns (or even singular events) that can be useful for testing and debugging. I tend to refer to this class of tools as your basic peer-level toolkit because they tend to act as a peer to the device under test.

These tools are quite useful, and people who test network code almost certainly already have them.

Heavyweight Generators

This category is composed of devices that generate heavy traffic or network transactions to test the capacity of DUTs to absorb such loads. I tend to refer to this category of test devices as “packet smashers”. These are often hardware based systems with limited flexibility and are often quite expensive.

Simulation and modeling systems

These are packages in which one defines a network topology – switches, routers, links, hosts – and then uses a mathematical modeling system to predict flow rates, queuing delays, and overall performance.

These tools are usually more for network design and are less useful as a means to test actual protocol stacks.

Sometimes a dividing line is drawn to divide simulation from pure mathematical modeling. Simulation may be made more concrete by having the simulator synthesize events to trigger code or devices under test – for example by generating synthetic input events to real code or devices.

Simulation and modeling tools usually require quantitative data about the nature of the traffic that will be loaded onto the simulated network; often users do not have a good handle on these numbers.

These tools tend to produce results that are expressed in statistical forms. These can be very useful, but to many of us they may also be difficult to fully comprehend.

When using simulation and modeling tools one should keep in mind the rather apt aphorism: In theory, theory and practice are the same; in practice they are not. In other words, models and simulations are approximations; reality often exhibits behavior unexpected and unpredicted by models and simulations.

Fuzzing tools

Fuzzers generate large numbers of transactions, varying each transaction slightly from its predecessor.

Fuzzing tools are often based on a concept that can be described as “throw everything on the wall and see what sticks”.

These tools can be useful. Given enough time they can reveal problems in a device under test, but unless you have adequate backtracking you may end up knowing that there is a problem but not knowing precisely what caused that problem.

An unfocused fuzzing session can take a very long time if that session has to enumerate through a large number of possible values. For example, a fuzzing test that simply tries all possible TCP initial sequence numbers would have to run 4,294,967,296 test patterns – that could take a very long time.
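A quick back-of-the-envelope calculation shows how long “a very long time” is. The test rates below are illustrative assumptions, not measurements of any particular fuzzing tool.

```python
TOTAL_ISNS = 2 ** 32  # every possible 32-bit TCP initial sequence number
SECONDS_PER_DAY = 86_400


def exhaustive_days(tests_per_second):
    """Days needed to try every initial sequence number at a given rate."""
    return TOTAL_ISNS / tests_per_second / SECONDS_PER_DAY


# Assumed rates: each test pattern needs at least a connection attempt.
for rate in (10, 100, 1000):
    print(f"{rate:>5} tests/s -> {exhaustive_days(rate):,.1f} days")
```

Even at an assumed thousand test patterns per second, the run takes about seven weeks, and that is for a single 32-bit field.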

A good fuzzing tool must be able to focus its activities. However, the more the user is involved in telling a fuzzing tool where to focus its attention the less that tool is actually doing fuzz testing and the more it is in the realm of a traditional test suite constructed using knowledge of where code under test may be most sensitive.

(At InterWorking Labs – iwl.com we have avoided “try everything” fuzzing. Rather, we have recognized that there are certain points around which programmers tend to write flawed code. For example, a lot of code works well until the high order [sign bit] of a packet header field flips from 0 to 1 or 1 to 0. [A rather common error is that code written in C has not adequately specified whether integer packet fields are signed or unsigned.] So it is useful to do a small bit of fuzz-style testing around those values. The same sort of focused fuzzing is useful when checking for buffer sizing errors – most problems will occur when the buffer is near being full, exactly full, or is being overrun. The point to be taken from this is that a tool that can do highly focused fuzzing can be rather more useful than a brute force fuzzing tool.)
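The idea of focused fuzzing can be sketched in a few lines. This is a hypothetical illustration of the principle, not IWL’s test suite; the function names and the choice of exactly which neighboring values to include are assumptions.

```python
def sign_bit_edges(bits):
    """Values around the points where a field's high-order bit flips."""
    flip = 1 << (bits - 1)   # e.g. 0x8000 for a 16-bit field
    top = (1 << bits) - 1    # largest unsigned value for the field
    candidates = {0, 1, flip - 1, flip, flip + 1, top - 1, top}
    return sorted(v for v in candidates if 0 <= v <= top)


def buffer_edges(size):
    """Lengths near a buffer limit: nearly full, exactly full, overrun."""
    return [n for n in (size - 1, size, size + 1) if n >= 0]


print([hex(v) for v in sign_bit_edges(16)])
print(buffer_edges(1500))
```

For a 16-bit field this yields seven test values instead of 65,536, concentrated where signed versus unsigned confusion is most likely to show up.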

How does a user obtain configuration values?

Those of us who are using network emulation tools to test applications and protocol stacks typically ask the question “what network values – drop, latency/jitter (packet delay variation), bursts, etc – should I be expecting as traffic passes to and from my devices under test?”

There seem to be two general ways to obtain this:

From a user-constructed model of a hypothetical (or a real, existing) network, with the tool using mathematical techniques to generate the settings. Internally this kind of tool usually merges all of its calculations into a single set of end-to-end values (or into an internal composition of each of the values for each part of the packet paths in that network model.) This can be quite useful for those who are laying out complex network topologies (real or hypothetical).

Emulators that derive impairments through calculations based on user drawn topologies tend to be emulators for stable network conditions. The values computed by this class of emulator may be such that the user is unable to make detailed changes (such as to increase or decrease packet latency) or even to view what those values are.

Such an emulator can be cumbersome should a user wish to explore what happens when one or more links degrade or fail. For example, such an emulator may require the user to intervene to change a topology in order to emulate common situations such as rainstorms afflicting satellite up/down links, or the impact of competing traffic bursts (e.g. the impact on an ongoing Zoom conference of someone tuning into a highly variable, high packet rate, high-motion sports video.)

NE-ONE is an example of the kind of tool in which the user lays out a graphical representation of the emulated network and lets the emulator calculate things like delays or packet drops.

From user knowledge or user provided measurements (often obtained from actual tests of actual networks or simply from “what if” conjectures). This is useful for those who know the end-to-end characteristics they want to test.

Tools of this nature put a burden on the user to derive the precise end-to-end network characteristics to be emulated. Often, for user convenience, there are vendor (or user) provided scenarios that have captured these end-to-end characteristics.

Because the user of this kind of tool is closer to the actual settings, the test user may more easily explore the effects of end-to-end network behavior in a more complete way, particularly with regard to exploring for sensitivity of a protocol stack implementation to the kind of unusual or sub-optimal network conditions that happen in real-world deployments.

KMAX is an example of this kind of tool.

As a general rule, emulators that use user provided values are less expensive than emulators in which the user draws a network topology and lets the emulator calculate impairment values.

Accuracy, precision, and stability

There is a lot of variation between network emulators. The most easily seen differences are with regard to functionality (ease of use, richness of the impairments, or flexibility of packet classification.) However there is a deeper difference, one that is harder to see: Accuracy, precision, and stability.

We have noticed that many software based emulators (including IWL’s first emulator two decades ago) had a fair amount of wobble, particularly with regard to time. For example, a requested delay of 40 milliseconds might wobble around that value, often with significant deviations and intermittent distant outliers. This is often the fault of the underlying operating system. Sometimes that flaw can be tuned away, sometimes it is intrinsic to the operating system. Another cause is an under-provisioned hardware foundation.

IWL has been careful in the selection of its operating systems and hardware platforms. Our goal with KMAX is to have much better than millisecond accuracy, with divergence confined within a very narrow standard deviation, and effectively no distant outliers. (IWL has validated KMAX precision, accuracy, and stability using tools from Spirent, Ixia, and IWL.)
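One way to put numbers on accuracy, precision, and stability is to compare measured delays against the requested delay. The sketch below is an assumption about how one might do that: the sample values are made up, the 1 ms outlier threshold is arbitrary, and treating mean error as “accuracy” and standard deviation as “precision” is a common convention rather than IWL’s stated method.

```python
import statistics


def delay_report(requested_ms, measured_ms, outlier_ms=1.0):
    """Summarize how closely measured delays track a requested delay."""
    errors = [m - requested_ms for m in measured_ms]
    return {
        "mean_error_ms": statistics.fmean(errors),        # accuracy (bias)
        "stdev_ms": statistics.pstdev(measured_ms),       # precision (spread)
        "max_abs_error_ms": max(abs(e) for e in errors),  # worst case
        "outliers": sum(1 for e in errors if abs(e) > outlier_ms),
    }


# Made-up measurements for a requested 40 ms delay, with one distant outlier.
samples = [40.1, 39.9, 40.0, 40.2, 39.8, 47.5]
report = delay_report(40.0, samples)
print(report)
```

Note how a single distant outlier drags the mean error and the standard deviation upward, which is why outliers deserve to be counted separately.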

Transient bursts

Anyone who has watched automobiles on a busy highway has seen how the vehicles sometimes gather into high density clumps and sometimes are widely spaced. This is burst behavior. Packets on networks do the same thing. (The causes are complex and beyond the scope of this note.) Network bursts can be quite difficult for protocol stacks and applications to handle: for example video or audio applications may suffer burst transients in which they are starved for data (thus causing visual or audible “artifacts”).

Many network emulators focus on steady-state network behavior and do little in the way of creating the kind of random or periodic burst transients that are so common on real-world networks.

In KMAX we have worked hard to give the test user various means to create controlled bursts ranging from a few milliseconds to more than an hour. Thus, for example, with KMAX a user can emulate the changes in noise (represented as packet loss) as a low/medium orbit satellite rises from the horizon (high noise), ascends (decreasing noise/packet loss), and then descends (increasing packet loss). Or one may emulate a vehicle approaching (decreasing packet loss), passing, and then moving away (increasing packet loss) from an IP radio access point or 5G tower. Nearly every numeric control in KMAX can be made time-variant through the use of a “Waveform” facility that allows changes to occur either as a function of time, expressed graphically or as a Python function.
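As an illustration of what a time-varying control might look like, here is a sketch of the satellite pass as a plain Python function of time. The function name, its parameters, and the loss percentages are hypothetical; this is not the actual KMAX Waveform interface, whose signature is not described here.

```python
import math


def satellite_loss_percent(t, pass_seconds=600.0,
                           horizon_loss=20.0, zenith_loss=0.5):
    """Packet loss (%) at time t seconds into a repeating satellite pass.

    All default values are illustrative assumptions.
    """
    phase = (t % pass_seconds) / pass_seconds  # 0.0 at rise, approaching 1.0 at set
    elevation = math.sin(math.pi * phase)      # 0 at the horizon, 1 overhead
    return zenith_loss + (horizon_loss - zenith_loss) * (1.0 - elevation)


# Sample the loss once a minute across a ten minute pass.
for t in range(0, 601, 60):
    print(f"t={t:>3}s  loss={satellite_loss_percent(t):5.2f}%")
```

Loss starts high at the horizon, falls to its minimum when the satellite is overhead, and climbs again as it sets; the modulo makes the pattern repeat for the next pass.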

Direction matters

Modern networks are often asymmetrical. In other words, traffic flowing from A-to-B often takes a different path, and is thus impacted differently by network conditions, than traffic flowing in the opposite direction from B-to-A.

Some emulators do not make this distinction and treat A-to-B the same way they treat B-to-A. Such emulators cannot accurately reproduce conditions on the asymmetry-filled Internet of today.

Cloud based? Huh?

These days a lot of things are “in the cloud”. Sometimes it is handy to have a tool available somewhere on the Internet. However, man-in-the-middle network emulators do not work well when they are “in the cloud”.

What is the purpose of an emulator (as opposed to a modeling or simulation tool)? The purpose is to create packet flows that replicate real-life traffic but with some controlled aspects.

The network distances to and from a “cloud based” emulator will distort that emulation, often to the point where the distortion greatly dominates the desired characteristics being emulated. Moreover, the Internet paths to and from those “cloud based” emulators are often shared, with sharp swings in quality from second to second. In short, cloud based emulation may often prove to be an oxymoron – impossible with any degree of acceptable precision, accuracy, or repeatability.

In addition, a user may have to construct VLANs or tunnels between the devices under test and the cloud-based emulator. This can be a difficult and troublesome task.

Background/synthetic traffic

Some man-in-the-middle and several simulator tools have means to create synthetic background traffic. This can be useful in situations where one does not want to use external devices (typically laptops, servers, or small computers such as Raspberry Pis) as traffic sources or sinks.

Know your goals

Why do you think you need a “network emulator”? What is it that you are trying to accomplish? Let’s look at a few possibilities:

Basic reachability or basic operation

Usually the first step when testing a device is to assure that it is operating at a basic level. Can you send packets to it? Does it respond? This is where your basic peer-level toolkit and observation tools shine.

Simple load testing

If you are trying to get a sense of the capacity of a network link, device, or server, start with your basic peer-level toolkit. Tools like iPerf (iPerf is a family of tools) can be quite useful, but one has to take care to understand the meaning of the results. See, for example, Does IPERF Tell White Lies?

Packet smashing

Are you doing performance testing or evaluating how well a device handles heavy loads? In that case you want a packet smasher, a device that can generate large numbers of Ethernet frames (or IP packets) or can generate large amounts of higher level transactions (such as HTTPS web GETs.)

Protocol correctness or robustness testing

Are you exercising a protocol implementation to validate that it handles mainline and corner cases properly either on a single event basis or over a long term (such as when trying to discover memory leaks)? In this case you do not need massive speed any more than a person cutting a precious diamond needs a 16lb sledge hammer.

Scenarios

Some emulators come with a suite of pre-defined scenarios. These are bundles of configuration settings appropriate for common situations, such as a link over a low (e.g. Starlink) or high (geosynchronous) satellite, or over a network deployed in a high electrical noise site, such as a machine shop.

Scenarios can save a lot of user time (and frustration), particularly when they can be tailored according to the particular situation a user is trying to emulate.

Fallacies

The fallacy that you need a high packet rate test device

For some reason, many people are misled into believing that they must have tools that can generate packet-smashing loads in order to do protocol correctness or robustness testing. That is an incorrect belief: Protocols are conversations, protocols are exchanges of packets. The rate of those conversations is based on the network latency, the round-trip time, between those conversing end-points. Most protocol interactions occur at rates of less than thousands of packets a second, and more often at rates of hundreds or even merely tens of packets per second. There are exceptions, such as when TCP has negotiated a large window scaling option and has passed any slow start or congestion recovery phase. However, those exceptions are rare and rarely justify the cost of devices that can move packets at wire-melting rates.
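The arithmetic behind this claim is simple enough to sketch. The round-trip time, window sizes, and the 1500-byte packet size below are illustrative assumptions, and the formulas are rough upper bounds rather than a model of any particular TCP implementation.

```python
def lockstep_packets_per_second(rtt_ms):
    """Request/response exchange: two packets per round trip."""
    return 2 * 1000.0 / rtt_ms


def windowed_packets_per_second(window_bytes, rtt_ms, packet_bytes=1500):
    """Rough bound when one full window may be in flight per round trip."""
    return (window_bytes / packet_bytes) * (1000.0 / rtt_ms)


rtt = 50  # milliseconds, an assumed cross-internet round-trip time
print(f"lock-step handshake: {lockstep_packets_per_second(rtt):8.0f} packets/s")
print(f"64 KB window:        {windowed_packets_per_second(65535, rtt):8.0f} packets/s")
print(f"4 MiB scaled window: {windowed_packets_per_second(4 * 1024 * 1024, rtt):8.0f} packets/s")
```

With these assumed numbers a handshake-style exchange runs at tens of packets per second and an unscaled TCP window at under a thousand; only the large scaled window reaches tens of thousands, which is the exception described above.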

Many man-in-the-middle network emulators have a variety of network interface options, sometimes ranging beyond a hundred gigabits per second. This can be important in limited cases, such as testing failover protocols such as HSRP, but it is less important when testing cross-internet, or even cross-enterprise situations. Unless you are testing brute force, maximum packet load conditions your testing is most likely paced by the round-trip-time of the packets that make up the handshakes of a network protocol. This typically means that the packets of interest are spaced hundreds of microseconds, or more often several milliseconds apart. As such, the raw speed of the network interfaces on your emulator does not really affect your results. Take, for example, an analogy based on a typical commute from home to work: The time and variation of that commute is dominated by the aggregate characteristics of the highways, buses, or trains of the commute. For nearly all protocol testing purposes, an emulator with very high speed network interfaces has about as much impact on your test results as making your home’s driveway wider would affect your daily commute.

The fallacy that you need several (more than two) network interfaces

Some man-in-the-middle emulators have two traffic-bearing physical network interfaces (in addition to either a wired or Wi-Fi interface for control and management purposes.) Some have more than two network interfaces.

(KMAX has two traffic-bearing network interfaces.)

The reason that some emulators have more than two is that they divide those traffic-bearing network interfaces into pairs, with each pair carrying packets that the user has already divided into flows of interest. How the user divides or diverts that traffic onto those physical interfaces is a matter left to the user’s imagination and invention.

Emulators with two network interfaces can have (but do not always have) an internal classification function that looks at every incoming packet, examining that packet against user specified expressions, and internally dividing the traffic into flows, each of which is subjected to its own set of emulated impairments. KMAX has this kind of internal classifier. Thus, for example, KMAX can divide traffic by IEEE 802.1Q VLAN tags, or IPv6 addresses (or many other parameters) into individual flows.

(One thing to consider is that often there is traffic flowing that in real life would not pass through the emulated network. These are things like ARP, IPv6 router discovery, or OSPF packets. There ought to be some means within the emulator to have those packets pass through the emulator without being subjected to the conditions being emulated. KMAX can do this with its “default” band. An emulator that uses physical interfaces as a means of dividing traffic may have difficulty doing this.)
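The idea of an internal classifier with a pass-through default can be sketched as follows. This is a hypothetical illustration only: representing packets as dictionaries, the band names, the rule format, and the impairment values are all assumptions made for the example and are not KMAX’s classifier syntax.

```python
def classify(packet, rules, default_band="default"):
    """Return the name of the first band whose predicate matches the packet."""
    for band, predicate in rules:
        if predicate(packet):
            return band
    return default_band


# Ordered rules: first match wins. Packets are plain dicts for illustration.
rules = [
    ("voice", lambda p: p.get("vlan") == 20),
    ("wan", lambda p: p.get("ip_dst", "").startswith("2001:db8:")),
]

# Each band gets its own impairments; the default band passes traffic untouched.
impairments = {
    "voice": {"delay_ms": 30, "loss_pct": 0.5},
    "wan": {"delay_ms": 120, "loss_pct": 1.0},
    "default": {},  # ARP, router discovery, OSPF and the like
}

packets = [
    {"ethertype": "arp"},
    {"ethertype": "ipv6", "vlan": 20, "ip_dst": "2001:db8::5"},
    {"ethertype": "ipv6", "ip_dst": "2001:db8::9"},
]
bands = [classify(p, rules) for p in packets]
print(bands)
```

The ARP packet matches no rule and so lands in the default band with no impairments, while the two IPv6 packets are steered to differently impaired flows even though they arrived on the same physical interface.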

Focus, focus, focus

Protocol isolation is quite useful when doing protocol testing. Emulators that have good packet classification systems allow a test user to focus on exactly the protocol being tested without the confusion that can arise when every packet on the network is subjected to impairments. This is where tools like KMAX (and some of the harder-to-use basic peer-level tools such as Scapy) are particularly useful.

Pre deployment design of a new network or changes to an existing network

Are you laying out a new network or changing an existing one? In this case you are usually interested in whether that network will do its intended job and, if so, will it do so economically and efficiently? Simulation and modeling tools are what you probably want.

What’s so great about KMAX?

KMAX is the third generation of network emulator products from IWL. Each generation brought increased performance, easier user interfaces, and expanded capabilities.

These products arose out of decades of practical experience building, testing, and repairing real networks and protocol stacks. Although the original idea for a “flakeway” came from Jon Postel in RFC 1025 the impetus for IWL’s emulators arose from work done in the mid 1990’s to build an entertainment grade audio/video distribution system in which a tool was needed to drive the protocol stacks through all of the possible code pathways to assure that data was handled properly and without ill effects such as memory leaks or crashes.

KMAX consists of core software on a commercial off-the-shelf (COTS) operating system (FreeBSD) running on vetted commodity (X86/64) hardware. This allows several performance tiers while keeping costs down. Users may interact with KMAX either through a web based graphical user interface (via a web browser) or through user-written scripts (using various languages, such as Python, C, Perl, BASH, etc.) KMAX has a rich packet classification system that can deal with packet headers from the Ethernet level (including VLANs), through MPLS, through IPv4 or IPv6, through UDP and TCP, and even into user data carried by UDP or TCP.

KMAX has a catalog of impairment types – latency, jitter (packet delay variation), drop, duplication, re-sequencing, rate limitation, alteration, etc – each with a large number of options that may be changed while a test session is running. Those options allow the user to express many of the changing dynamics and bursts that happen on real-life networks. In addition, a “waveform” mechanism allows most numeric controls to be run through repeating patterns over periods of time ranging from tens of milliseconds to an hour.

KMAX has ancillary mechanisms that can be quite useful. For example there is a single-click means to disable and re-enable the impairments in order to make A-B comparisons easy. Traffic indicators and traffic graphing help visualize what is flowing. Another facility allows KMAX to be pre-configured in a lab and then run “headless” in the field. The most recent versions of KMAX support a local graphical desktop and permit familiar tools (such as Wireshark) to be run alongside KMAX.

KMAX versus other products

The standard toolkit

The standard toolkit consists of operating system provided or open source tools that perform basic tests. These are invaluable. Most notable are reachability tools such as ping, arping, and traceroute (including variants such as tcptraceroute.) There are also transactional tools such as curl, dig, or Scapy. (Scapy is often looked upon with fear as a “hackers’ tool”.) And there are performance tools such as the iPerf family (as long as one remembers that iPerf tends to report statistics about transport level data carriage rather than about the actual throughput at the packet level.) Traffic observation, capture and replay are available through tools such as Wireshark, tcpdump, and tcpreplay. There are other less known (and often not well maintained) tools such as pchar.

In our present world of IPv4 and IPv6 it can be quite revealing to run these tools twice, once with IPv4 and once with IPv6. There may be (and often are) differences that could indicate differences in the underlying packet routing, the presence of IPv4 NATs (Network Address Translators), or other forms of IPv4 proxies and relays.

As a general matter, you will find a larger and more fully featured collection of this standard suite of tools on Linux than on proprietary platforms such as Microsoft Windows or Apple macOS.

KMAX

KMAX is a man-in-the-middle network impairment system. KMAX acts much as a layer two bridge, forwarding Ethernet frames (and the packets those frames contain) while subjecting those frames/packets to a configurable sequence of ill effects such as drop, duplication, re-sequencing, latency/jitter (packet delay variation), rate limitation, and so forth. Those frames/packets may be classified for different kinds of treatment based on the direction they are moving and on frame/packet header fields (ranging from Ethernet headers [including IEEE 802.1Q priority and VLAN headers], through MPLS, IPv4/IPv6, UDP/TCP, and even into the data being transported). (Variable length headers, such as from IP options, are handled.)

Being at layer two, KMAX essentially “clicks into” the middle of an Ethernet cable and avoids setup tasks (for both the emulator and the devices under test) required for man-in-the-middle devices that act at layer three.

The latest versions of KMAX support a graphical desktop and ancillary tools, such as Wireshark, and can use familiar tools to control and drive ancillary traffic generators (such as iPerf on other devices.)

More information may be found via https://www.iwl.com

Spirent, Keysight/Ixia, and other heavy traffic generators

Spirent, Keysight/Ixia, and other heavy traffic generators are exactly that, heavy traffic generators – packet smashers – often with limited means to vary the traffic from one packet to the next. These are often quite expensive devices. These devices can be invaluable when stress testing a device under test to maximum loads or to test issues at the MAC and PHY layers of an underlying physical medium. However, they are less useful when trying to assure that network stack implementations are robust and stable in the face of network activities other than massive traffic loads.

More information may be found via the vendors’ websites:

Spirent: https://www.spirent.com/
Keysight/Ixia: https://www.keysight.com/us/en/home.html

Netem

Netem is a Linux kernel provided collection of fairly simple impairment modules that may be attached to a network interface. Netem is managed by a command line tool, tc, that is one of those things that seems easy and simple when explained on a web page but that can drive users to distraction when actually used.

Netem can suffer from accuracy and stability issues depending on the generation and configuration of the underlying Linux kernel.
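For readers who have not seen it, a basic Netem setup looks roughly like the commands this Python sketch assembles. The helper names are hypothetical and the interface name eth0 is only an example. The script prints the commands by default, because actually running tc requires root privileges and changes live traffic on the interface.

```python
import shlex
import subprocess


def netem_command(device, delay_ms=None, jitter_ms=None, loss_percent=None,
                  action="add"):
    """Build the argv for a simple root netem qdisc on one interface."""
    argv = ["tc", "qdisc", action, "dev", device, "root", "netem"]
    if delay_ms is not None:
        argv += ["delay", f"{delay_ms}ms"]
        if jitter_ms is not None:
            argv.append(f"{jitter_ms}ms")   # variation around the delay
    if loss_percent is not None:
        argv += ["loss", f"{loss_percent}%"]
    return argv


def apply(argv, dry_run=True):
    """Print the command; run it only when dry_run is False (needs root)."""
    print(shlex.join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # 100 ms of delay with 10 ms of variation and 1% loss, then removal.
    apply(netem_command("eth0", delay_ms=100, jitter_ms=10, loss_percent=1))
    apply(netem_command("eth0", action="del"))
```

Note that a qdisc attached this way acts on traffic leaving the interface, so impairing both directions of a conversation takes additional setup.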

More information may be found via https://man7.org/linux/man-pages/man8/tc-netem.8.html

WANem

WANem is an overlay over Netem and tc that is intended to impose characteristics of wide area networks (hence the name “WAN” for “wide area network”.)

More information may be found via https://wanem.sourceforge.net/

CORE

CORE describes itself as “a tool for building virtual networks.” It allows one to create a network of applications and protocols within a single Linux computer. Virtualized hosts and network paths are created inside that single Linux computer. Traffic flows are created by using virtualized hosts to run Linux commands and applications. CORE appears useful as a tool for modeling a network at the kind of high level useful when preparing to deploy a set of network services. However, CORE appears to have little use with regard to evaluating the sensitivity of protocol implementations to the kinds of network conditions that can occur on a real network. (Since each virtual host/node in CORE is a Linux container it may be possible to add some level of traffic impairment by using the Linux netem and the like via the tc command.)

More information may be found via the following links:

GNS3

GNS3 allows a user to construct a hypothetical network of routers and switches (from Cisco and other vendors). It appears that GNS3 is oriented more as a tool for learning how to configure Cisco (and other) devices than as a tool to evaluate either network performance or protocol stacks.

More information may be found via https://docs.gns3.com/

NE-ONE

NE-ONE is a network simulator much like CORE (and somewhat like GNS3) with the addition of physical network interfaces so that the user’s own traffic may be directed through that simulated network.

More information may be found via https://itrinegy.com/

Packetstorm

Packetstorm is a family of man-in-the-middle emulators covering a range of network interface speeds.

More information may be found via https://packetstorm.com/

Shunra

Shunra is a modeling tool. It calculates expectations about how packet traffic will flow across a user-defined network. Internally, Shunra products can be coupled to various kinds of synthetic traffic generators to create predictions of how well various network services ought to perform on that user-defined network.

More information may be found via http://media.shunra.com/datasheets/ShunraNVforHP.pdf

Netropy

Netropy is a family of hardware (or cloud-based) man-in-the-middle emulators. Netropy emulators appear to be aimed more at high packet rates than at a rich suite of flexible, focused, highly configurable impairments (such as loss, latency, burst emulations, etc.) Indeed the Netropy impairments appear rather simplistic and of quite limited flexibility. Netropy devices appear to use physically distinct network interfaces as a primary means to differentiate between traffic flows, thus inconveniencing the user, who must arrange for traffic to be vectored to the desired physical interface.

More information may be found via https://www.apposite-tech.com/wan-simulation/

About the author: Who am I to say these things? Why should you believe me?

I’ve been building network code and products since the early 1970s. You can read about me online on my website, cavebear.com.

I have built many network systems and products – and, of course, I have created many bugs that had to be detected, tracked down, and fixed.

I come from a line of repair people: My grandfather repaired radios, my father repaired televisions. They had me diagnosing and fixing electronics from a very early age.

They taught me the value of having the right tools. For example, some TV repair folks would waste large amounts of time tweaking a capacitor or coil on the back of the TV, run around to the front to view the picture quality, and then run to the back to make more adjustments. I learned that a small mirror can be used as a tool to avoid those run-around cycles – so that I could work behind the TV while still seeing the picture.

Over the years I have noted that we do not have vast suites of tools to help. We do have tools ranging from ping to traceroute to Wireshark to tcpreplay. These are useful and necessary, but these tools are not sufficient. We do not have many tools to help us do fine-grained or repeatable adjustments to traffic flows to tease out difficult problems or even to recreate customer experiences so that we can isolate and debug the problems behind them.

I spent more than a decade designing, deploying, and running the Interop show networks. These events were hotbeds of test issues: Each one had large numbers of new products, with new code, interacting with other new products from other vendors. At these events Mr. Murphy was busy enforcing his law: If something can go wrong, it will, and at the worst possible time. Our show network team used or built pretty much every tool we could find. I even designed the first Internet “butt set” – Dr. Watson, The Network Detective’s Assistant – so that we (and others) could quickly do testing even in remote or difficult locations.

In the mid 1990s I helped start a company to do entertainment grade video distribution over the Internet. Our code had to deal with many network issues – most particularly packet delay variation/jitter and packet loss issues. (For example, users do not like audio stuttering or video blotches; so we had to take care that we properly fed incoming audio and video packets to codecs no matter what delays or drops [or other impairments] the network may be exhibiting at any given moment.) But we had no good tools to exercise our code under these conditions, or to perform regression tests to check that we had actually fixed a bug.

So I set out to build tools to help, which, over a series of product generations, has resulted in KMAX.
