* Sale Price for only Code / simulation – For Hardware / more Details contact : 8925533488
Recently, the huge amounts of data and its incremental increase have changed the importance of information security and data analysis systems for Big Data. Intrusion detection system (IDS) is a system that monitors and analyzes data to detect any intru? sion in the system or network. High volume, variety and high speed of data generated in the network have made the data analysis process to detect attacks by traditional techniques very difficult. Big Data techniques are used in IDS to deal with Big Data for accurate and efficient data analysis process. This paper introduced Spark?Chi?SVM model for intrusion detection. In this model, we have used ChiSqSelector for feature selection, and built an intrusion detection model by using support vector machine (SVM) classifier on Apache Spark Big Data platform. We used KDD99 to train and test the model. In the experiment, we introduced a comparison between Chi?SVM classifier and Chi?Logistic Regression classifier. The results of the experiment showed that Spark? Chi?SVM model has high performance, reduces the training time and is efficient for Big Data.
There are many types of researches introduced for intrusion detection system. With emerge of Big Data, the traditional techniques become more complex to deal with Big Data. Therefore, many researchers intend to use Big Data techniques to produce high speed and accurate intrusion detection system. In this section, we show some researchers that used machine learning Big Data techniques for intrusion detection to deal with Big Data. Used cluster machine learning technique. The authors used k-Means method in the machine learning libraries on Spark to determine whether the network traffic is an attack or a normal one. In the proposed method, the KDD Cup 1999 is used for training and testing. In this proposed method the authors didn?t use feature selection technique to select the related features.
Spark Chi SVM
proposed model In this section, the researchers describe the proposed model and the tools and techniques used in the proposed method. Figure?1 shows Spark-Chi-SVM model. The steps of the proposed model can be summarized as follows:
1 Load dataset and export it into Resilient Distributed Datasets (RDD) and Data Frame in Apache Spark.
2 Data preprocessing.
3 Feature selection.
4 Train Spark-Chi-SVM with the training data set.
5 Test and evaluate the model with the KDD data set.
The KDD99 data set is used to evaluate the proposed model. The number of instances that are used are equal to 494,021. The KDD99 data set has 41 attributes and the ?class? attributes which indicates whether a given instance is a normal instance or an attack. Table provides a description of KDD99 data set attributes with class labels.
- Operating System 😕 ? Windows 7, 8 and 10 (32 and 64 bit)
- Front End :???????? Python
- Packages :??? ???? numpy, Pandas, itertools, matplotlib,???? sklearn, Spark
- Back End :???????? DataSet
2.3.2 HARDWARE REQUIREMENTS:
- Processor?? -??????? Dual Core
- Speed -??????? 1 GHz
- RAM -??????? 4 GB
- Hard Disk -??????? 200 GB
- Key Board -??????? Standard Windows Keyboard
- Mouse -??????? Two or Three Button Mouse
- Monitor -??????? SVG