Discretization
1. Discretization is the process of transforming quantitative data into qualitative data. (It is also defined as a process that transforms data containing a quantitative attribute so that the attribute in question is replaced by a qualitative attribute.) In data mining, many algorithms work on qualitative data, hence the need for discretization. Example: a quantitative attribute such as age, recorded as numeric values, is represented in discrete descriptive terms such as young and old.
2. Discretization divides the value range of the quantitative attribute into a finite number of intervals. The mapping function associates all of the quantitative values in a single interval with a single qualitative value. A cut point is a value of the quantitative attribute where the mapping function locates an interval boundary. For example, a quantitative attribute recording age might be mapped onto a new qualitative age attribute with three values: pre-teen, teen and post-teen. The cut points for such a discretization may be 13 and 18. Values of the original quantitative age attribute that are below 13 might get mapped onto the pre-teen value of the new attribute, values from 13 to 18 onto teen, and values above 18 onto post-teen.
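The age example above can be sketched in a few lines of Python. This is only an illustration: the function name discretize_age is hypothetical, and the treatment of the boundaries (13 and 18 both counted as teen) is an assumption read off the wording "from 13 to 18" and "above 18".

```python
def discretize_age(age: float) -> str:
    """Map a numeric age onto one of three qualitative values.

    Assumed cut points: 13 and 18, with both boundary values
    falling into the "teen" interval.
    """
    if age < 13:
        return "pre-teen"
    if age <= 18:
        return "teen"
    return "post-teen"


# Every quantitative value in one interval maps to the same qualitative value.
ages = [5, 12, 13, 16, 18, 19, 42]
labels = [discretize_age(a) for a in ages]
print(labels)
```

A different convention (for example, 18 counted as post-teen) would only change the comparison operators, not the idea of a cut point.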
3. Diverse taxonomies exist in the literature to categorize discretization methods. These taxonomies are complementary, each relating to a different dimension along which discretization methods may differ. Typically, a primary method accomplishes discretization without reference to any other discretization method.
4. Popular primary methods:
Supervised vs Unsupervised.
Supervised methods refer to the class information when selecting discretization cut points. Unsupervised methods do not use the class information. For example, when trying to predict whether a customer will be profitable, the data might be divided into two classes, profitable and unprofitable.
A supervised discretization...
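To make the supervised/unsupervised distinction concrete, here is a minimal Python sketch. It is not taken from the essay: the unsupervised side uses equal-width intervals, the supervised side picks a single cut point that minimizes the weighted class entropy (one common heuristic among several), and the spend values and class labels are made-up data for the profitable/unprofitable example.

```python
import math
from collections import Counter


def equal_width_cut_points(values, n_bins):
    """Unsupervised: cut points depend only on the range of the values."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_supervised_cut_point(values, labels):
    """Supervised: choose the one cut point that best separates the classes,
    measured by the weighted class entropy of the two resulting intervals."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary can be placed between equal values
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut


# Hypothetical customer data: a spend attribute and the class of each customer.
spend = [10, 20, 30, 40, 50, 200]
klass = ["unprofitable", "unprofitable",
         "profitable", "profitable", "profitable", "profitable"]

print(equal_width_cut_points(spend, 2))        # [105.0]
print(best_supervised_cut_point(spend, klass))  # 25.0
```

On this data the equal-width cut point (105.0) ignores the classes and lumps most customers together, while the class-aware cut point (25.0) separates profitable from unprofitable customers exactly.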