Network Security

Malicious hackers and IT defenses are in an arms race. Constant improvement in the speed and sophistication of deep-packet inspection is required as we move to 40 Gbps and beyond. Signature-based detection systems need fast, reprogrammable hardware for complex signature analysis. Spotting “zero-day” attacks and polymorphic malware drives the need for novel and flexible systems.

The avalanche of attacks on computers and networks worldwide by hackers is widely recognized as a serious problem, and it is increasing at an ever faster rate. Browser hijackers, ransomware, keyloggers, backdoors, rootkits, Trojan horses, worms, spyware, denial of service attacks, and other variations are inflicted on us in an arms race increasingly fueled by sophisticated criminal groups (and some governments), with motivations ranging from financial theft to industrial espionage and sabotage.

Automata Processors scale linearly for malware pattern search and analysis. For a complex Snort example, estimated throughput is:

  • 1 Gbps per processor
  • 2 Gbps per ‘rank’ of 8 processors
  • 8 Gbps per 4 ranks (one PCIe card)… and beyond

Defensive measures are deployed at many points in the network. Firewalls, intrusion-detection and prevention systems, anti-virus scanners, and other solutions need constant upgrades to cope with the malware onslaught.

Top 10 malicious programs distributed via email, Kaspersky Spam Report, December 2013

This top-10 virus chart illustrates how frequently “new” malware is simply a variant of a familiar one. The 2nd, 4th, and 6th positions in the ranking are all variants of Trojan-PSW.Win32.Tepfer, which steals cookies, passwords, and email details.

Signature-based detection has long been a mainstay defensive strategy. These methods look for known patterns of data in files attached to emails, in files already on disk, or inside inbound network packet payloads (deep-packet inspection). Although the signature-based approach can effectively contain known virus outbreaks, malware authors can stay a step ahead of simple signature searches by writing “oligomorphic,” “polymorphic,” or “metamorphic” viruses, which modify and/or encrypt parts of their code over time to hide from such searches.
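To make the limitation concrete, here is a minimal Python sketch of an exact-match signature scan and of how a trivially re-encoded variant slips past it. The signature names and byte patterns are invented for illustration, and the one-byte XOR is only a toy stand-in for the self-encryption a real polymorphic virus would use.

```python
# Hypothetical signature database: name -> byte pattern (made-up values).
SIGNATURES = {
    "demo-dropper": b"\xde\xad\xbe\xef\x13\x37",
    "demo-keylogger": b"GetAsyncKeyState\x00hook",
}

def scan(payload):
    """Return the names of all signatures found anywhere in the payload."""
    return sorted(name for name, sig in SIGNATURES.items() if sig in payload)

def xor_encode(data, key):
    """Toy stand-in for malware re-encrypting its body with a one-byte key."""
    return bytes(b ^ key for b in data)

original = b"HEADER" + SIGNATURES["demo-dropper"] + b"TRAILER"
variant = b"HEADER" + xor_encode(SIGNATURES["demo-dropper"], 0x5A) + b"TRAILER"

print(scan(original))  # ['demo-dropper']
print(scan(variant))   # []  (same payload logic, different bytes)
```

The scan finds the original sample but reports nothing for the variant, even though only the encoding changed.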

Oligomorphic: The decryptor takes one of a few predefined forms. Polymorphic: The decryptor takes many different forms, like encryptions. Metamorphic: The whole body mutates, generally without a decryptor, so that it does not even keep the same shape and size across generations. Each is progressively harder to detect.

To counter these variants, heuristic searches or “generic signatures” are used to identify variants by looking for many variations of known malicious code patterns. Using more complex, wildcarded regular expressions, the search can better detect new variant malware even if it is padded with extra, meaningless code, rearranged, and/or partially encrypted. The compute load for examining so many variations is much larger than for simple pattern searches, especially as wired and wireless speeds keep increasing.
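A generic signature of this kind can be sketched with Python's standard `re` module. The byte fragments, the 64-byte padding limit, and the sample payloads below are all assumptions chosen for illustration; they are not taken from any real rule set.

```python
import re

# Hypothetical generic signature: three made-up code fragments that must
# appear in order, with up to 64 arbitrary bytes of padding between them.
# The middle fragment uses a character class to cover three variants.
GENERIC_SIG = re.compile(
    rb"\x60\xe8.{0,64}?\x8b[\x45\x4d\x55].{0,64}?\xcd\x80",
    re.DOTALL,  # let "." match every byte value, including newline
)

# The unpadded pattern, as an exact-match signature would store it.
EXACT_SIG = b"\x60\xe8\x8b\x45\xcd\x80"

base = EXACT_SIG
padded = b"\x60\xe8" + b"\x90" * 10 + b"\x8b\x4d" + b"\x90\x90" + b"\xcd\x80"
benign = b"\x60\xe8" + b"\x00" * 200 + b"\xcd\x80"

for name, sample in [("base", base), ("padded", padded), ("benign", benign)]:
    exact_hit = EXACT_SIG in sample
    generic_hit = GENERIC_SIG.search(sample) is not None
    print(name, exact_hit, generic_hit)
# base True True
# padded False True
# benign False False
```

The wildcards and character class let one expression cover the padded variant that the exact signature misses, at the cost of a more expensive search per input byte on a conventional CPU.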

Following is an example of a complex security rule, taken from Snort, converted to an automaton that runs directly on the Automata Processor. This particular rule is designed to capture a buffer overflow attack on an Apache web server.

Regular expression converted to an automata network
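The idea of turning a regular expression into an automaton whose states are all evaluated against each input symbol can be sketched in software. The Python model below is a simplified illustration, not Micron's SDK or compiler output: the `STE` class, the `run` function, and the toy pattern `ab.*cd` (hand-translated, one element per pattern position) are all assumptions, and the inner loop stands in for work that the hardware would perform in parallel.

```python
from dataclasses import dataclass, field

@dataclass
class STE:
    """Simplified state transition element: a symbol class plus successors."""
    symbols: frozenset                               # byte values this element matches
    successors: list = field(default_factory=list)   # elements enabled after a match
    start: bool = False                              # enabled on every input symbol
    report: bool = False                             # a match here signals a rule hit

def run(stes, data):
    """Return the input offsets at which a reporting element matched."""
    hits = []
    enabled = set()
    for offset, byte in enumerate(data):
        # Start elements are always enabled, so a match can begin anywhere.
        candidates = enabled | {i for i, s in enumerate(stes) if s.start}
        enabled = set()
        for i in candidates:  # sequential here; conceptually simultaneous
            ste = stes[i]
            if byte in ste.symbols:
                if ste.report:
                    hits.append(offset)
                enabled.update(ste.successors)
    return hits

ANY = frozenset(range(256))

# Hand translation of the toy pattern ab.*cd
stes = [
    STE(frozenset(b"a"), [1], start=True),  # 0: 'a'
    STE(frozenset(b"b"), [2, 3]),           # 1: 'b'
    STE(ANY, [2, 3]),                       # 2: '.*' loops on itself
    STE(frozenset(b"c"), [4]),              # 3: 'c'
    STE(frozenset(b"d"), report=True),      # 4: 'd' reports the match
]

print(run(stes, b"xxabzzcdyycd"))  # [7, 11]
print(run(stes, b"cdab"))          # []
```

Each input byte is consumed exactly once regardless of how many elements are enabled, which is why, in this model, the time per byte does not depend on the number of patterns loaded.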

The rule sets behind the test results in the table below were extracted from Snort and modified to increase the number of patterns that include character classes, and to increase the percentage of patterns that use unbounded repetitions of wildcards.

The Snort rules were implemented in the Automata Processor SDK tool chain to quantify the following: the number of regular expressions in each dataset, the number of NFA states needed to implement the dataset, the number of state transition elements (STEs) used after configuration of the chip by Micron's AP compiler, and the percentage of AP chip capacity consumed.

Estimated utilization of AP hardware
Ruleset        Description                          Num of RegEx   NFA States   STEs Used   % of chip used
Backdoor       real                                 226            4.3k         4.5k        13
Spyware        real                                 462            7.7k         7.8k        23
EM             only exact match patterns            1k             28.7k        29.7k       78
Range 5        50% of patterns have char-classes    1k             28.5k        29.3k       78
Range 1        100% of patterns have char-classes   1k             29.6k        30.4k       80
Dot-star 0.5   5% of patterns have ".*"             1k             29.1k        30.0k       77
Dot-star 0.1   10% of patterns have ".*"            1k             29.2k        30.0k       77
Dot-star 0.2   20% of patterns have ".*"            1k             28.7k        30.0k       77
Dot-star 0.3   30% of patterns have ".*"            1k             28.7k        4.5k        76

The following table shows the estimated throughput on AP hardware, which will be deterministic regardless of the number of Snort rules that are evaluated simultaneously.

Ruleset        Throughput: 1 AP Device   Throughput: 1 Rank of 8 AP Devices   Throughput: 1 PCIe card (4 ranks)
Backdoor       1 Gbps                    2 Gbps                               8 Gbps
Spyware        1 Gbps                    2 Gbps                               8 Gbps
EM             1 Gbps                    2 Gbps                               8 Gbps
Range 5        1 Gbps                    2 Gbps                               8 Gbps
Range 1        1 Gbps                    2 Gbps                               8 Gbps
Dot-star 0.5   1 Gbps                    2 Gbps                               8 Gbps
Dot-star 0.1   1 Gbps                    2 Gbps                               8 Gbps
Dot-star 0.2   1 Gbps                    2 Gbps                               8 Gbps
Dot-star 0.3   1 Gbps                    2 Gbps                               8 Gbps

These results show that the usage of state transition elements corresponds nearly 1-to-1 with the number of NFA states, and that resource utilization does not grow with expression complexity. A rule set of roughly a thousand regular expressions fits in a single Automata Processor chip and is evaluated at a deterministic 1 Gbps per chip. A rank of 8 chips configured with 2 groups, each holding the entire Snort rule set, would run at 2 Gbps. A PCIe card with 4 ranks would run at 8 Gbps, and further scaling can be obtained by adding more cards to the system.
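The scaling arithmetic in this paragraph can be written out as a short Python sketch. The per-group rate, groups per rank, and ranks per card come from the text; the function names are hypothetical, and the assumption that throughput keeps adding up linearly across cards (ignoring host and PCIe limits) is an idealization.

```python
import math

GBPS_PER_GROUP = 1.0   # each group scans its own input stream at 1 Gbps
GROUPS_PER_RANK = 2    # two full copies of the rule set per rank of 8 chips
RANKS_PER_CARD = 4     # one PCIe card

def estimated_throughput_gbps(cards=1, ranks_per_card=RANKS_PER_CARD):
    """Aggregate throughput, assuming every group handles a separate stream."""
    groups = cards * ranks_per_card * GROUPS_PER_RANK
    return groups * GBPS_PER_GROUP

def cards_needed(target_gbps):
    """Smallest number of cards whose combined estimate meets the target."""
    return math.ceil(target_gbps / estimated_throughput_gbps(cards=1))

print(estimated_throughput_gbps(cards=1, ranks_per_card=1))  # 2.0 (one rank)
print(estimated_throughput_gbps(cards=1))                    # 8.0 (one card)
print(cards_needed(40))                                      # 5
```

Under these assumptions, inspecting a 40 Gbps link would take five cards, with traffic split into independent streams across the groups.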

Some of malware’s tricks can even be used against it; code obfuscation is one example. Some attacks are delivered as JavaScript. Hackers write script code that is obfuscated, or insert meaningless code to mask the malicious code, using techniques such as meaningless assignments of string, integer, and float variables, encryption, and reversing or breaking up string variables. However, these typical obfuscations can themselves be searched for heuristically, provided they leave detectable patterns inside the obfuscated portions.
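As a rough illustration of searching for the obfuscation itself, the Python sketch below flags a few of the idioms mentioned above. The heuristic names, the regular expressions, and their thresholds (for example, fragments of at most four characters) are invented for this example and would need tuning against real samples.

```python
import re

# Hypothetical heuristics: each maps a label to a pattern typical of obfuscation.
HEURISTICS = {
    # Three or more very short string literals joined with "+": "ev" + "a" + "l"
    "split_strings": re.compile(
        r"""(?:["'][^"']{1,4}["']\s*\+\s*){2,}["'][^"']{1,4}["']"""
    ),
    # The common string-reversal idiom: .split("").reverse().join("")
    "reversed_string": re.compile(
        r"""\.split\(\s*["']{2}\s*\)\s*\.reverse\(\)\s*\.join\(\s*["']{2}\s*\)"""
    ),
    # Text rebuilt from a long list of numeric character codes
    "char_codes": re.compile(r"String\.fromCharCode\s*\((?:\s*\d+\s*,){4,}"),
}

def flags(script):
    """Return the sorted labels of all heuristics that match the script text."""
    return sorted(name for name, rx in HEURISTICS.items() if rx.search(script))

suspicious = 'var f = "ev" + "a" + "l"; window[f]("lave".split("").reverse().join(""));'
benign = 'var greeting = "hello" + name; console.log(greeting);'

print(flags(suspicious))  # ['reversed_string', 'split_strings']
print(flags(benign))      # []
```

A match is only a hint that a script deserves closer inspection, since legitimate minified code can use the same idioms.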

The massively parallel Automata Processor (AP) technology can handle extremely complex, heavily wildcarded regular expression searches at unprecedented speeds. Extremely complex heuristic searches can be done to catch morphed malware patterns much more cost-effectively than with unaided traditional CPUs. AP-based accelerator cards are also much more feasible to deploy in high-density server datacenters because of their ultra-low power consumption, just a few watts per device. (See bottom note for more detail.)