Data Mining Lab
Implementation of Data Mining Algorithms by Attribute Relation File
Introduction to Weka (Data Mining Tool)
• Weka is a collection of machine learning algorithms for data mining tasks. The
algorithms can either be applied directly to a dataset (using the GUI) or called
from your own Java code (using the Weka Java library).
• Tools (or functions) in Weka include:
  • Data preprocessing (e.g., Data Filters),
  • Classification (e.g., BayesNet, KNN, C4.5 Decision Tree, Neural Networks),
  • Regression (e.g., Linear Regression, Isotonic Regression, SVM for Regression),
  • Clustering (e.g., Simple K-means, Expectation Maximization (EM)),
  • Association rules (e.g., Apriori Algorithm, Predictive Accuracy),
  • Feature Selection (e.g., Cfs Subset Evaluation, Information Gain, Chi-squared Statistic), and
  • Visualization (e.g., viewing different two-dimensional plots of the data).
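
All of the tools listed above operate on a dataset, and Weka's native dataset format is ARFF (Attribute-Relation File Format), the file type named in this lab's title. As a minimal sketch of what such a file looks like, the following Java program writes a tiny ARFF file using only the standard library. The class name ArffSketch, the method buildArff, the file name weather.arff, and the three data rows are all assumptions made up for illustration; they do not come from this lab manual.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch only: the class, method, file name, and data rows
// are hypothetical and chosen just to show the shape of an ARFF file.
class ArffSketch {

    // Builds the text of a small ARFF file: a header that names the
    // relation and declares each attribute, followed by the data rows.
    static String buildArff() {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation weather\n\n");
        // Nominal attribute: the allowed values are listed in braces.
        sb.append("@attribute outlook {sunny, overcast, rainy}\n");
        // Numeric attribute.
        sb.append("@attribute temperature numeric\n");
        // Nominal attribute that could serve as the class label.
        sb.append("@attribute play {yes, no}\n\n");
        // One comma-separated row per instance, in attribute order.
        sb.append("@data\n");
        sb.append("sunny,85,no\n");
        sb.append("overcast,83,yes\n");
        sb.append("rainy,70,yes\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("weather.arff");
        Files.write(out, buildArff().getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out.toAbsolutePath());
    }
}
```

A file produced this way should be loadable as a dataset in the Weka GUI, after which the preprocessing, classification, and clustering tools can be tried on it.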
The Weka GUI Chooser (class weka.gui.GUIChooser) provides a starting point for
launching Weka's main GUI applications and supporting tools. If one prefers an MDI
("multiple document interface") appearance, this is provided by an alternative
launcher called "Main" (class weka.gui.Main). The GUI Chooser consists of four
buttons, one for each of the four major Weka applications, and four menus. The
buttons can be used to start the following applications:
• Explorer: An environment for exploring data with Weka (the rest of this
documentation deals with this application in more detail).
• Experimenter: An environment for performing experiments and conducting
statistical tests between learning schemes.
• Knowledge Flow: This environment supports essentially the same functions as the
Explorer but with a drag-and-drop interface. One advantage is that it supports
incremental learning.
• Simple CLI: Provides a simple command-line interface that allows direct execution
of Weka commands for operating systems that do not provide their own command
line interface.
S.K.T.R.M College of Engineering