Information Security in Artificial Intelligence: A Study of the possible intersection
- Publisher: Hamad bin Khalifa University Press (HBKU Press)
- Source: Qatar Foundation Annual Research Conference Proceedings, Volume 2016, Issue 1, March 2016, ICTPP1679
Abstract
1. Introduction
Artificial Intelligence (A.I) attempts to understand intelligent entities and strives to build them. Computers with human-level intelligence (or better) would clearly have a huge impact on our everyday lives and on the future. A student of physics might reasonably feel that all the good ideas have already been taken by Galileo, Newton, Einstein, and the rest, and that it takes many years of study before one can contribute new ideas. Artificial Intelligence, on the other hand, still has openings for a full-time Einstein. [1] With the ever-increasing amount of information security data being generated, smart data analysis will become even more pervasive as a necessary ingredient for more effective and efficient detection and categorization. It will provide intelligent solutions to security problems that perform beyond the typical automatic approaches. Moreover, the combination of Artificial Intelligence and Information Security focuses on the analytics of security data rather than on simple statistical reporting.
In our research, we are conducting a survey of the different A.I methodologies that researchers and industry professionals use to tackle security problems. We added our own analysis and observations on the findings and compared the different methods. We are working on providing more detail about which approaches suit which problems, and on why some A.I methodologies are not always a good choice for specific information security problems. With this work, we aim to introduce the intersection of the two fields of Information Security and Artificial Intelligence, and hopefully to promote wider use of intelligent methods in solving cyber security issues in the Middle East.
The background is divided into two parts: the first describes the different forms of information security data sets, and the second briefly gives examples of major corporations that use A.I to address security issues. The background is followed by the results and discussion, in which we express our own opinions, observations and analysis. Our work is still in progress, so we conclude the paper by stating the future goals of this research.
2. Background
Artificial Intelligence has proven itself through its successful application to various industrial problems in medicine, finance, marketing, law and technology. We are living in the cognitive era and, according to IBM, augmented-intelligence systems such as IBM Watson process information themselves and can also teach. This will lead to more cognitive learning platforms that eradicate the need for manual work on industrial problems. [2] On the other hand, the overwhelming volume of data generated by networking entities and information security elements is a rich and valuable resource for more promising security insights.
2.1 Information Security Data Sets
Information security data can take different forms. These include logs, such as Windows security logs and server logs, and the output generated by networking tools such as Snort, TCPDump, and NMAP [3]. In addition, sandbox output [4] produced when malware is executed, sniffed network traffic stored in .pcap [Packet CAPture] files, and features of installed Android mobile applications [5] are all examples of information security data that can be treated as input to Artificial Intelligence techniques such as Machine Learning, Data Mining, Artificial Neural Networks, Fuzzy Logic systems and Artificial Immune Systems. For experimental and academic purposes, various online repositories provide information security data sets, such as the DARPA data sets [6].
2.2 Examples from major corporations
2.2.1 Kaspersky Cyber Helper
Cyber Helper is a successful attempt at getting closer to employing truly autonomous Artificial Intelligence in the battle against malware. The majority of Cyber Helper's autonomous sub-systems synchronize, exchange data and work together as if they were a single unit. For the most part they operate using fuzzy logic and independently define their own behavior as they go about solving different security tasks. [7]
2.2.2 IBM – IBM Watson
IBM Watson is a technology platform that uses Natural Language Processing (NLP) and Machine Learning to reveal insights from large amounts of unstructured data. [8] IBM Watson can be trained on massive amounts of security data, from the Common Vulnerabilities and Exposures (CVE) threat database to articles on network security, deployment guides, how-to manuals, and all sorts of content that make IBM Watson very knowledgeable about security. Because IBM Watson uses NLP technologies, users can pose security questions to Watson, and Watson will respond with all pertinent applications. [9]
3. Results & Discussion
Through the survey that we conducted of a number of research works and projects at the intersection of A.I and Information Security, we came across the following:
1- We noticed that the most commonly followed approach is Machine Learning. However, not every Machine Learning algorithm is right for every information security problem. Some algorithms produce high error rates, both false positives (false alarms) and false negatives, for certain kinds of information security issues. Deciding on the most suitable A.I approach therefore depends on the nature of the information security problem we are trying to solve, on the kind of data we have in hand, on whether that data has classes or labels, and on many other factors.
2- Unstructured data goes through a preprocessing stage before it is ready to be fed into the selected Artificial Intelligence model or method. When the information security data was Android malware, it first had to be executed inside a sandbox, which then generated a report on the execution. Each type of sandbox usually generates reports in a different format. The common formats are text, XML and MIST [sequence of instructions]. Text is more convenient for humans, but XML and MIST are more suitable for machines. [3] Another example is when the features of installed Android mobile applications were the input to the artificial intelligence processes. Those features had to be extracted from the dex code [Android programs are compiled into .dex Dalvik Executable files, which are in turn zipped into a single .apk file] [5].
3- When Machine Learning is the approach selected for an information security problem, the data set determines which actions need to be taken: clustering, classification, feature selection, or a combination of these. When the security data set has labels [supervised learning], classification algorithms are applied. When the rows of the security data set have no labels or classes [unsupervised learning], similar entities need to be grouped together, so clustering algorithms are used. Some researchers used both clustering and classification in security applications that detect malicious behavior not previously assigned to a known malware class, and they see the two techniques as complementary. [4] Feature selection was also conducted on security data sets to identify the features that contribute the most to prediction and to building the predictive models.
4- One of the researchers suggested combining Linear Algebra techniques, such as vector spaces, with static analysis in order to obtain a better representation of the selected features of installed Android mobile applications that are suspected of being malicious.
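Finding 1 refers to false positive and false negative rates. As a minimal sketch of how these two rates could be compared across candidate models, the following Python computes them from true and predicted labels. The label encoding (1 for malicious, 0 for benign), the function name `error_rates`, and the toy data are assumptions made for illustration and do not come from the surveyed works.

```python
def error_rates(y_true, y_pred):
    """Return (false_positive_rate, false_negative_rate).

    Assumed encoding: 1 = malicious, 0 = benign.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # attack correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1          # false alarm on benign data
        elif truth == 0 and pred == 0:
            tn += 1          # benign data correctly ignored
        else:
            fn += 1          # attack that was missed
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 1]   # misses one attack, one false alarm
model_b = [1, 1, 1, 1, 1, 0, 0, 0]   # catches all attacks, two false alarms

rates_a = error_rates(y_true, model_a)
rates_b = error_rates(y_true, model_b)
```

On this toy data the first model misses one attack but raises fewer false alarms, while the second catches every attack at the cost of more false alarms. Which trade-off is acceptable depends on the security problem at hand.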
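Finding 2 describes turning a sandbox report into machine-readable input. The sketch below illustrates one possible preprocessing step in Python: reading an XML report, extracting the ordered sequence of calls, and counting adjacent pairs (2-grams) as features. The report layout, the `call` element and its `api` attribute are hypothetical, since each sandbox uses its own schema, and the use of n-gram counts is an illustrative choice, not necessarily the method of the surveyed works.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical report layout; real sandbox schemas differ.
REPORT = """
<report sample="example.apk">
  <call api="open" target="/data/contacts.db"/>
  <call api="read" target="/data/contacts.db"/>
  <call api="connect" target="203.0.113.5:80"/>
  <call api="send" target="203.0.113.5:80"/>
  <call api="open" target="/data/sms.db"/>
  <call api="read" target="/data/sms.db"/>
</report>
"""


def call_sequence(xml_text):
    """Extract the ordered list of API names from the report."""
    root = ET.fromstring(xml_text.strip())
    return [call.get("api") for call in root.iter("call")]


def ngram_counts(tokens, n=2):
    """Count every run of n consecutive tokens."""
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


sequence = call_sequence(REPORT)
features = ngram_counts(sequence, n=2)
```

The resulting counts form a fixed, order-aware description of the sample's behavior that a learning algorithm can consume, which the raw text report does not provide directly.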
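Finding 3 distinguishes the supervised case (labels available, so classification) from the unsupervised case (no labels, so clustering). The following Python sketch contrasts the two on the same rows, using a nearest-centroid classifier and a plain k-means as simple stand-ins. Both algorithms, the two-dimensional feature rows, and the label names `family_a` and `family_b` are assumptions chosen for brevity, not the methods used in the surveyed works.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a list of equal-length numeric tuples."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))


def train_classifier(rows, labels):
    """Supervised case: learn one centroid per known label."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    names = sorted(groups)
    centers = [centroid(groups[name]) for name in names]
    return lambda row: names[nearest(row, centers)]


def cluster(rows, k, rounds=10):
    """Unsupervised case: plain k-means seeded with the first k rows."""
    centers = [tuple(row) for row in rows[:k]]
    assignment = []
    for _ in range(rounds):
        assignment = [nearest(row, centers) for row in rows]
        for i in range(k):
            members = [r for r, a in zip(rows, assignment) if a == i]
            if members:  # keep the old center if a cluster is empty
                centers[i] = centroid(members)
    return assignment


# Toy feature rows, invented for illustration only.
rows = [(1.0, 0.0), (0.9, 0.2), (0.1, 1.0), (0.0, 0.8)]
labels = ["family_a", "family_a", "family_b", "family_b"]

classify = train_classifier(rows, labels)   # labels available
predicted = classify((0.8, 0.1))
groups_found = cluster(rows, k=2)           # labels ignored
```

With labels, a new sample is assigned to a known class; without them, the rows are only grouped by similarity and the groups still need to be interpreted by an analyst. This is one way to see why the two techniques are described as complementary.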
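Finding 4 mentions representing statically extracted application features in a vector space. A minimal Python sketch of that idea follows: each distinct feature string gets its own dimension, and an application becomes a binary vector with a 1 wherever it has that feature. The feature strings and helper names are hypothetical and only illustrate the representation.

```python
def build_vocabulary(apps):
    """Assign each distinct feature string a fixed vector index."""
    all_features = sorted(set().union(*apps))
    return {feature: index for index, feature in enumerate(all_features)}


def embed(features, vocabulary):
    """Map a set of feature strings to a binary vector."""
    vector = [0] * len(vocabulary)
    for feature in features:
        if feature in vocabulary:   # unseen features are ignored
            vector[vocabulary[feature]] = 1
    return vector


def dot(u, v):
    """Inner product; for binary vectors it counts shared features."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical feature sets from static analysis of two applications.
apps = [
    {"permission::SEND_SMS", "api_call::sendTextMessage", "permission::INTERNET"},
    {"permission::INTERNET", "permission::CAMERA"},
]

vocabulary = build_vocabulary(apps)
vectors = [embed(app, vocabulary) for app in apps]
shared = dot(vectors[0], vectors[1])
```

Once applications are vectors, geometric operations such as inner products and distances become available, so standard learning algorithms can be applied to what started as sets of strings.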
4. Conclusion
The application of Artificial Intelligence methodologies to information security problems will play a major role in producing better insights that move security forward, by formulating more successful approaches that lead to a security that “thinks”. The overwhelming amount of data generated by networking devices and security appliances will be of great use when combined with the intelligence of machine learning, going beyond the traditional and limited automatic techniques of information security. Our research on the intersection between Artificial Intelligence and Information Security is still ongoing. By the end of this work, we hope to help design a matrix with more accurate criteria that helps information security practitioners decide which A.I approach to follow. Security professionals would then no longer need to delve into the deep mathematical formulas of A.I and machine learning when they start considering more intelligent alternative solutions.
5. References
[1] Russell, Stuart J. and Norvig, Peter. Artificial Intelligence: A Modern Approach. A Simon & Schuster Company 1995.
[2] Powered by IBM Watson: Rethink! Publication - Our future in augmented intelligence. August 2015.
[3] Buczak, Anna L. and Guven, Erhan. A survey of data mining and machine learning methods for cyber security intrusion detection. 2015.
[4] Rieck, Konrad, Trinius, Philipp, Willems, Carsten and Holz, Thorsten. Automatic Analysis of Malware Behavior using Machine Learning. 2011.
[5] Arp, Daniel, Spreitzenbarth, Michael, Hübner, Malte, Gascon, Hugo and Rieck, Konrad. DREBIN: Effective and Explainable Detection of Android Malware in Your Pocket.
[6] DARPA Intrusion Detection Data Sets – MIT Lincoln Laboratory. Link: http://ll.mit.edu/ideval/data.
[7] Zaitsev, Oleg. Cyber Expert. Artificial Intelligence in the realms of I.T security. October 25, 2010. Link: http://securelist.com/analysis/publications/36325/cyber-expert-artificial-intelligence-in-the-realms-of-it-security
[8] IBM Watson official website. Link: http://www.ibm.com/smarterplanet/us/en/ibmwatson/what-is-watson.html
[9] An interview with Amir Husain on how IBM Watson is helping to fight cyber-crime. May 29, 2015. Link: http://www.forbes.com/sites/ibm/2015/05/29/how-ibm-watson-is-helping-to-fight-cyber-crime/