📧 nivk99 - Niv Kotek
📧 oriazadok - Oria Zadok
Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation.
The name of the paper: Malware Detection With NLP.
The feature: API calls.
The type of classifier: static.
The type of attack: problem-space attack.
The weak point: the number of features.
A. First, we opened the classifier and checked which files it operates on.
B. We printed the names of the applications that the classifier used for testing and those it used for training. (From this step on, we work only with the applications selected for testing.)
C. We found that the features are XML files that describe the API calls of the applications.
D. We looked for weak spots in the features and changed feature values in the XML files.
E. After step D proved unsuccessful, we took several features from an application classified as "benign" and inserted them into an application classified as "malicious".
F. By rerunning the classifier after each change, many times over, we were able to narrow the search range.
G. We found that adding a large number of features (more than 5000 lines) taken from the "benign" application to a "malicious" application lowers the accuracy of the classifier.
H. Next, we added more than 5000 lines of empty XML tags to the beginning of the features of the "malicious" application and found that this also lowers the accuracy of the classifier.
I. Next, we added more than 5000 lines of empty XML tags to the end of the features of the "malicious" application and found that the classifier still classified it as malicious.
J. Next, we placed more than 5000 lines of empty XML tags inside a comment block at the beginning of the features of the "malicious" application and found that this too lowers the accuracy of the classifier.
K. Next, we added the same empty (meaningless) XML tags to other applications classified as "malicious", but this time the classifier still found them to be malicious.
L. Finally, we added various empty (meaningless) XML tags to other applications classified as "malicious" and found that the accuracy of the classifier decreased.
In conclusion: we discovered that the weak point of the classifier is the number of features. That is, adding more than 5000 lines to the beginning of an application's features lowers the accuracy of the classifier, whether those lines are empty, meaningless tags or features taken from an application classified as "benign".
A. <stam name="android.nlp">
B. <package name="android.support.v4.app">
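The padding described above can be illustrated with a minimal sketch. It assumes the feature file is plain text with one tag per line; the actual layout the classifier reads may differ. The function name `prepend_filler` and its parameters are hypothetical and are not taken from the project code.

```python
def prepend_filler(feature_text: str, filler_tag: str, count: int = 5001) -> str:
    """Return feature_text with `count` copies of filler_tag placed before it.

    Assumption: one tag per line. The default of 5001 reflects the
    "more than 5000 lines" threshold reported in the write-up.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    filler_lines = [filler_tag] * count
    # Filler goes first because only additions at the beginning were
    # reported to lower the classifier's accuracy.
    return "\n".join(filler_lines + feature_text.splitlines()) + "\n"


if __name__ == "__main__":
    original = '<package name="com.example.app">\n'  # hypothetical feature line
    padded = prepend_filler(original, '<stam name="android.nlp">')
    print(len(padded.splitlines()))
```

Swapping `filler_lines + feature_text.splitlines()` for the reverse order would correspond to the variant in which the tags were appended at the end, which was reported as not affecting the classification.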
A. We will add code that checks which malicious apps the classifier selects for testing.
B. We will add code that unpacks the "malicious" application with Apktool.
C. We will add to the smali directory a file whose name starts with the letter "A" (so that the features we add come first in the XML).
D. We will add to that file any features from an application classified as "benign" (at least 5000 lines of code).
E. We will add code that rebuilds the "malicious" app with Apktool after the changes.
F. We will run the classifier on the application after the changes.
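The planned steps above could be automated roughly as follows. This is only a sketch under stated assumptions: `apktool` is on the PATH, the decoded app has a top-level `smali` directory, and the payload text is supplied by the caller. The helper names and the file name `A_added.smali` are hypothetical.

```python
import subprocess
from pathlib import Path


def decode_cmd(apk, out_dir):
    """Command that unpacks an APK with Apktool (-f overwrites out_dir)."""
    return ["apktool", "d", str(apk), "-o", str(out_dir), "-f"]


def build_cmd(src_dir, out_apk):
    """Command that rebuilds an APK from a decoded directory."""
    return ["apktool", "b", str(src_dir), "-o", str(out_apk)]


def add_leading_smali_file(decoded_dir, payload, name="A_added.smali"):
    """Write the payload into the smali directory under a name starting with 'A'.

    Assumption: a name starting with 'A' makes the added features come
    first, as described in the plan above.
    """
    smali_dir = Path(decoded_dir) / "smali"
    smali_dir.mkdir(parents=True, exist_ok=True)
    target = smali_dir / name
    target.write_text(payload, encoding="utf-8")
    return target


def repack_with_payload(apk, work_dir, out_apk, payload):
    """Unpack, add the payload file, and rebuild. Requires apktool to be installed."""
    subprocess.run(decode_cmd(apk, work_dir), check=True)
    add_leading_smali_file(work_dir, payload)
    subprocess.run(build_cmd(work_dir, out_apk), check=True)
```

A rebuilt APK generally has to be signed again before it can be installed on a device; that step is outside this sketch.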
A. We ran a test with 18 benign apps and 18 malicious apps, of which about 4 benign and 4 malicious apps were selected for testing. We ran the classifier and the accuracy was 1.0. We then opened the 4 malicious test applications with Apktool and added several randomly chosen features from a benign application. Finally, we rebuilt the applications, ran the classifier again, and got an accuracy of 0.5. (The addition was done in the same way for all 4 apps.)
B. Any tags, or any features from a benign application, can be added as features.
C. More than 5000 lines of added features are required.
D. The features must be added at the beginning of the XML.
E. The apps used for testing and for adding features were chosen at random (that is, the weak point of the classifier is the number of features).
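The reported accuracies can be checked with a short calculation. The sketch below assumes exactly 4 benign and 4 malicious test apps, and assumes the drop to 0.5 comes from all 4 modified malicious apps being classified as benign; the write-up reports only the accuracy values, so this breakdown is one reading that is consistent with them.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Assumed test split: 4 benign and 4 malicious apps.
y_true = ["benign"] * 4 + ["malicious"] * 4

# Before the attack: every app is classified correctly.
pred_before = list(y_true)

# After the attack (assumption): all 4 malicious apps are labelled benign.
pred_after = ["benign"] * 8

print(accuracy(y_true, pred_before))  # 1.0
print(accuracy(y_true, pred_after))   # 0.5
```

Under this reading, 0.5 means the classifier misses every modified malicious app while still labelling the benign apps correctly.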
DroidBot is a lightweight test input generator for Android. It can send random or scripted input events to an Android app, achieve higher test coverage more quickly, and generate a UI transition graph (UTG) after testing.
We use DroidBot to verify that the attack did not damage the functionality of the application.
The test was run on 100 apps before and after the change. The results show that the attack did not damage the functionality of the applications.
Droidbot results can be seen here
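A before/after run of this kind might be scripted as in the sketch below. It assumes the `droidbot` command-line tool is installed and a device or emulator is connected, and it uses only the `-a` (APK path) and `-o` (output directory) options. The helper names and the output directory layout are hypothetical, and how the two runs were actually compared is not described here.

```python
import subprocess


def droidbot_cmd(apk_path, output_dir):
    """Command that runs DroidBot on one APK and stores its output."""
    return ["droidbot", "-a", str(apk_path), "-o", str(output_dir)]


def run_before_and_after(original_apk, modified_apk, out_root="droidbot_out"):
    """Run DroidBot on the original and the modified APK, one after the other."""
    for label, apk in (("before", original_apk), ("after", modified_apk)):
        subprocess.run(droidbot_cmd(apk, f"{out_root}/{label}"), check=True)
```

The two output directories can then be inspected side by side, for example by comparing the generated UI transition graphs.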
An Android malware detection system implemented in Python using the NLP technique of document vectors, based on the work of Tomas Mikolov and Quoc Le. Their paper can be found at: https://arxiv.org/abs/1405.4053
Our article can be found at the link here
https://github.com/nivk99/androidMalwareDetectionWithNLP.git






