Acta Metallurgica Sinica (English Letters), 2009, Vol. 16, Issue (6): 113-120. doi: 10.1016/S1005-8885(08)60296-4

• Others •

New heuristic method for data discretization based on rough set theory

ZHAO Jun, ZHOU Ying-hua   

  1. Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2008-10-22 Online: 2009-12-30
  • Contact: ZHAO Jun


Data discretization contributes substantially to the induction of classification rules or trees by machine learning methods, and rough set theory is an effective tool for discretizing continuous information systems. This paper proposes a new method that improves on typical rough-set-based heuristic algorithms for data discretization in two ways: it uses decision information to reduce the number of candidate cuts, and it measures cut significance more reasonably through a new concept, the cut selection probability. Simulations show that, compared with other typical rough-set-based discretization algorithms, the proposed method discretizes continuous information systems more effectively. It improves the predictive accuracy of information systems while still, in principle, preserving their consistency.

Key words: data discretization; rough set theory; cut; cut significance; selection probability
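
The abstract does not spell out how decision information shrinks the candidate cut set, so the following is only a minimal sketch of one common interpretation, not the authors' algorithm. It assumes candidate cuts are midpoints between adjacent distinct values of an attribute, and that a cut can be dropped when both neighbouring values carry the same single decision label. The function name `candidate_cuts` and the toy data are hypothetical.

```python
from collections import defaultdict


def candidate_cuts(values, decisions):
    """Return (all_cuts, kept_cuts) for one continuous attribute.

    Assumption (not from the paper): a midpoint cut is kept only when the
    two adjacent attribute values do not share one identical decision label.
    """
    # Collect the set of decision labels observed at each distinct value.
    labels_at = defaultdict(set)
    for v, d in zip(values, decisions):
        labels_at[v].add(d)

    distinct = sorted(labels_at)
    all_cuts, kept_cuts = [], []
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2  # midpoint between adjacent distinct values
        all_cuts.append(cut)
        # A cut between two values with the same single decision label
        # cannot separate objects of different classes, so skip it.
        if len(labels_at[lo]) == 1 and labels_at[lo] == labels_at[hi]:
            continue
        kept_cuts.append(cut)
    return all_cuts, kept_cuts


# Toy example: five objects, one attribute, two decision classes.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
decisions = ["a", "a", "b", "b", "a"]
all_cuts, kept_cuts = candidate_cuts(values, decisions)
print(len(all_cuts), "candidate cuts reduced to", len(kept_cuts))
```

In this toy case only the cuts at class boundaries survive. Ranking the surviving cuts by significance (the paper's cut selection probability) is a separate step that this sketch does not attempt.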