This paper introduces the concept of data quality and the problem that data quality has not received enough attention in data mining (DM). It then analyzes, with real examples, why data quality is crucial for many DM applications. Finally, an example from a medical diagnosis application shows how to improve data quality.
Classic sequential frequent pattern mining algorithms are based on a single uniform support threshold, so they either miss interesting patterns of low support or suffer from the bottleneck of pattern generation. In this thesis, we extend FP-growth to attack the problem of multi-level, multi-dimensional sequential frequent pattern mining. The experimental results show that our E-FP is more flexible at capturing desired knowledge than the approaches of previous studies.