The intro
Artificial Intelligence (AI)! Oh my, that sounds exciting and disturbing at the same time. Exciting because man has always fantasized about playing God, and now we are closer to it than we have ever been. Disturbing because, some say [1] [2] [3], a superior form of intelligence such as AI might put an end to humanity.
But let’s have a thorough look at what AI really is at the moment. The term AI, as we use it today, was coined in 1955 by John McCarthy in the proposal for the famous Dartmouth workshop, which took place in the summer of 1956. This was the moment when AI gained its name, its mission, its first success and its major players. In his project proposal, McCarthy carefully introduced the term AI, mainly to avoid confusion with other terms that were in wide use at that time. The aim of the workshop was “to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves”. [4] The concept of AI drew on developments from multiple disciplines and theories that emerged during the 1940s and 1950s, such as:
- Neurology research had shown that the brain is an electrical network of neurons.
- Norbert Wiener’s cybernetics described control and communications in closed systems (machines).
- Claude Shannon’s information theory described digital signals.
- Alan Turing’s theory of computation showed that any form of computation could be described digitally. In 1950 Alan Turing published a revolutionary paper in which he speculated about the possibility of creating thinking machines and proposed the famous Turing test.
Modern AI
What we nowadays call AI is a broad discipline that pursues the goal of creating an autonomous form of intelligence within machines. Machine learning (ML), a subset of AI, is the ability of machines to learn from data how to perform different tasks or solve problems. When most people say AI, they actually mean ML. Deep learning (DL) is a set of ML techniques, based on multi-layer neural networks, that are particularly good at recognizing patterns (e.g. in image recognition). Artificial general intelligence (AGI) refers to the intelligence of a machine that can actually act like a human, i.e. solve complex and varied problems and experience emotions. To my knowledge, AGI remains a utopia at this point.
The author Pedro Domingos, in his inclusive and straightforward book The Master Algorithm, describes five approaches that have inspired the science of ML up to now, usually called the five tribes of AI. Fundamentally, we have tried to create intelligence by observing ourselves and the world around us.
- SYMBOLISTS have developed learning algorithms based on the use of symbols and on inverse deduction, which works out the missing knowledge needed to make a deduction.
- CONNECTIONISTS follow the way the brain works: decisions depend on the strength of the connections between artificial neurons (as with the strength of a synapse), and those strengths are adjusted by an algorithm called backpropagation. Deep learning belongs here.
- EVOLUTIONARIES use genetic programming, which evolves computer programs by copying nature’s models of mating and evolution.
- BAYESIANS use probabilistic inference and the famous Bayes’ theorem and its derivatives.
- ANALOGIZERS work by analogy, recognizing similarities between decisions (e.g. patients having the same symptoms).
The approaches above led to a multitude of classes of algorithms, each one of them being appropriate for solving a particular problem type. Figure 3 depicts the current development.
ML is everywhere nowadays: autonomous cars, personal assistants, online advertising and profiling, search engines, etc. To keep up with the fast development of the online environment, you need AI, which gives you very good solutions for many problems out there, but not for all.
Cyber security, a case for AI
There is a bit of hype around AI/ML. According to some experts, the world is already being disrupted as many industries are reshaped by AI, and we humans are under threat of losing our jobs to intelligent machines.
Before jumping to such conclusions, please have a quick read of this article, written by one of the most reputable AI experts worldwide. His overall conclusion is that “if a typical person can do a mental task with less than one second of thought, we can probably automate it using AI either now or in the near future”. So, by using AI/ML, we can automate the little annoying things that keep us from focusing on the big picture.
But in some cases, those little things matter a lot. In cyber security, the ability to automatically investigate terabytes of logs is surely an advantage. A Ponemon Institute study entitled Artificial Intelligence and Cyber Security concluded that “the biggest benefit is the increase in speed of analyzing threats […] followed by an acceleration in the containment of infected endpoints/devices and hosts”.
Among the reasons why you need to adopt AI/ML in cyber security, one of the most prevalent is that the number of threats and the attack surface have both grown sharply. The speed of the attacks has also increased. Exposing a vulnerable device online can end badly in just a couple of hours. Combine this with the lack of qualified personnel and the huge amounts of logs that we need to analyze, and you end up with a single option: automate as much as you can. Automation is not something new, but automation reinforced by AI might be the real deal in cyber security.
By implementing AI/ML you get smarter and more capable autonomous security systems (AI algorithms are very good at identifying outliers from normal patterns) and you can also reduce the workload of Security Operations Centers (SOCs).
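To make the "outliers from normal patterns" idea concrete, here is a minimal sketch in Python. It is not how any particular security product works. It simply flags values that sit far from the median, using the modified z-score (median and median absolute deviation). The function name, the 3.5 threshold and the failed-login counts are all assumptions made up for illustration.

```python
from statistics import median


def find_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds `threshold`."""
    center = median(values)
    # Median absolute deviation (MAD): a spread estimate that one extreme
    # value cannot inflate the way it inflates a standard deviation.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # Degenerate case: at least half of the values equal the median.
        # This sketch simply reports nothing in that situation.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Hypothetical failed-login counts per hour for a single host.
failed_logins = [3, 5, 4, 6, 2, 5, 4, 3, 250, 4, 5, 3]
print(find_outliers(failed_logins))  # [8]
```

Real systems look at many features at once and learn what "normal" means from historical data, but the principle is the same: model the usual pattern, then surface whatever does not fit it.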
Nevertheless, using AI/ML involves certain risks. One of them is that ML algorithms could create a false sense of security. They might give you the impression that you are fully safe, although ML and automation cover only the majority of threats, not all of them. You still need human intervention to weed out false positives, correct bias and adjust results to the business context. ML algorithms are prone to bias unless they are trained on large sets of accurate data. This leads to the question of whether ML can help in dealing with sophisticated persistent adversaries that do not follow any rules.
Another risk is overreliance on a single algorithm to drive a security system. The danger is that if that algorithm is compromised, there’s no other signal that would flag a problem with it. A good insight into the issue is given in this article. But the concept of a single point of failure is not something new in security, as we have been dealing with it for quite a while now.
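A bit of arithmetic shows why false positives still need a human in the loop. The sketch below uses invented numbers (they are assumptions, not measurements from any product): 1 event in 1,000 is malicious, the detector catches 99% of malicious events, and it wrongly flags 1% of benign ones.

```python
def alert_precision(prevalence, true_positive_rate, false_positive_rate):
    """Share of raised alerts that are real threats (Bayes' theorem)."""
    true_alerts = prevalence * true_positive_rate
    false_alerts = (1 - prevalence) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical figures, chosen only to illustrate the base-rate effect.
precision = alert_precision(
    prevalence=0.001,          # 1 in 1,000 events is malicious
    true_positive_rate=0.99,   # 99% of malicious events are flagged
    false_positive_rate=0.01,  # 1% of benign events are flagged too
)
print(f"{precision:.1%}")  # 9.0%
```

Under these assumptions, roughly nine alerts out of ten are false alarms, even though the detector looks excellent on paper. Somebody still has to triage them.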
Probably the most terrifying risk of all is the case when humans might not understand why a certain decision has been taken. Recent ML algorithms have become so complex that decisions cannot be traced and justified easily. A great article on this topic, also called “the dark side of AI”, can be found here.
Yet, using AI/ML in cyber security has become a must if you want to keep up with the overwhelming number of threats. The benefits clearly exceed the risks. And this is exactly what’s happening at this point. AI/ML technologies have been massively adopted by the players in the industry. The article entitled “The current state of machine intelligence 3.0”, published in 2016 by O’Reilly, provides a good overview of current players in the area of ML. In the Enterprise Functions section of the graphical representation, you can also spot some of the pure AI vendors in the cyber security area. But these are not the only ones. Almost every major player in the field has incorporated ML functions into their products. I personally work for Secureworks, a Gartner Magic Quadrant Leader for Managed Security Services, where we use machine learning to give our CTU Research Team and Counter Threat Platform (CTP) unparalleled insight into the threat landscape. Basically, this is how we manage to analyze 270 billion security events on a daily basis and provide cutting-edge security for our clients.
Another good resource that provides more details on how ML and cyber security work together is the article Machine Learning for Cybersecurity 101. It gives good insights into the actual types of algorithms and methods that can be used for some cyber security tasks. I have tried to summarize below the useful algorithms and their main focus.
Dimension | Algorithm |
Network traffic analysis | Regression to predict network packet parameters and compare them with the baseline; classification to identify different classes of network attacks such as scanning and spoofing; clustering for forensic analysis. |
Endpoint | Classification to divide programs into categories such as malware, spyware and ransomware; clustering for malware protection on secure email gateways (e.g., to separate legitimate file attachments from outliers). |
Application | Regression to detect anomalies in HTTP requests; classification to detect known types of attacks like injections; clustering of user activity to detect DDoS attacks and mass exploitation. |
User (UBA) | Regression to detect anomalies in user actions; clustering to separate groups of users and detect outliers. |
Process (anti-fraud) | Regression to predict the next user action and detect outliers (credit card fraud); classification to detect known types of fraud; clustering to compare business processes and detect outliers. |
Table 1 – ML algorithms and their use in cyber security
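As an illustration of the first row of the table (regression compared against a baseline), here is a small hedged sketch in Python. It fits a straight line to hypothetical "normal" traffic, predicting bytes transferred from the number of packets, and then flags new observations that stray too far from the prediction. The function names, the data and the tolerance are assumptions chosen for the example, not values taken from the article cited above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_deviations(xs, ys, slope, intercept, tolerance):
    """Indices of observations further than `tolerance` from the prediction."""
    return [
        i for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - (slope * x + intercept)) > tolerance
    ]


# Hypothetical baseline: packets per flow and bytes per flow on a quiet day.
baseline_packets = [100, 200, 300, 400, 500]
baseline_bytes = [51_000, 99_000, 152_000, 198_000, 250_000]
slope, intercept = fit_line(baseline_packets, baseline_bytes)

# New (made-up) flows: the third one carries far more data than predicted.
new_packets = [150, 250, 350]
new_bytes = [75_000, 124_000, 900_000]
print(flag_deviations(new_packets, new_bytes, slope, intercept, tolerance=10_000))  # [2]
```

The classification and clustering rows follow the same pattern: learn a model from historical data, then compare new events against it.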
But if you really want to understand the level that AI/ML has reached in security, have a look at the results of the U.S. Defense Advanced Research Projects Agency (DARPA) 2016 Cyber Grand Challenge, the world’s first all-machine cyber hacking tournament. AI/ML systems are now proven to be able to automatically identify software flaws and affected hosts in a network.
Therefore, AI/ML and cyber security might go hand in hand pretty well. A comprehensive report published by the Institute of Electrical and Electronics Engineers (IEEE.org) entitled “Artificial Intelligence and Machine Learning Applied to Cybersecurity” examines the impact of using AI/ML in cyber security. In their view, relying only on conventional security measures will render human personnel incapable of defending the entire system. They consider AI/ML a necessity, and they explore six dimensions where the new technology will impact current operations, among them:
- Legal and Policy areas: who will be legally responsible for failures?!
- Technical and Human Trust: humans will have to work together with ML to satisfy business needs.
- Data: AI needs huge amounts of accurate data, so as to remove the bias.
- Software and Algorithms for AI/ML and Cybersecurity: huge amounts of data need new software.
- Hardware for AI/ML and Cybersecurity: new algorithms need new hardware architecture.
The conclusion
AI/ML is one of the technologies that has made great progress in the last 10 years. Unfortunately, machine learning IS NOT a silver bullet for cyber security (at least not to the extent it has been for image recognition or natural language processing). BUT it’s surely a necessity!
AI/ML cannot work alone, as humans need to provide the common sense that computers cannot, to ensure that the results from AI are also meaningful in the business context.
If you are working in the cyber security industry and are concerned about losing your job, don’t be, it’s not the time yet. AI/ML still needs more improvement before it can totally replace humans in this area.