Data mining is the non-trivial process of extracting valid, novel, potentially useful, and ultimately understandable patterns from large amounts of data stored in databases, data warehouses, or other repositories.
What is data mining
In the field of artificial intelligence, data mining is customarily also called knowledge discovery in databases (KDD). Others regard data mining as one essential step within the knowledge discovery process. That process consists of three phases: (1) data preparation, (2) data mining, and (3) expression and interpretation of results. Data mining can interact with the user or with a knowledge base.
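The three phases can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names `prepare`, `mine`, and `interpret`, the toy shopping records, and the choice of counting co-occurring item pairs as the mining step. A real project would likely use a far richer technique in the middle phase.

```python
from collections import Counter
from itertools import combinations


def prepare(raw_records):
    """Phase 1, data preparation: normalise item names and drop empty records."""
    cleaned = []
    for record in raw_records:
        items = {item.strip().lower() for item in record if item and item.strip()}
        if items:
            cleaned.append(items)
    return cleaned


def mine(transactions, min_support=0.5):
    """Phase 2, data mining: keep item pairs that co-occur in at least
    min_support (a fraction) of all transactions."""
    total = len(transactions)
    if total == 0:
        return {}
    pair_counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            pair_counts[pair] += 1
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


def interpret(patterns):
    """Phase 3, expression and interpretation: turn patterns into readable text."""
    ordered = sorted(patterns.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        f"{a} and {b} appear together in {support:.0%} of transactions"
        for (a, b), support in ordered
    ]


# Hypothetical raw data, including messy spacing, mixed case, and an empty record.
raw = [
    ["Bread", " milk "],
    ["bread", "Milk", "eggs"],
    ["milk", "eggs"],
    ["", None],
    ["bread", "milk"],
]

transactions = prepare(raw)
patterns = mine(transactions, min_support=0.5)
report = interpret(patterns)
for line in report:
    print(line)
```

Keeping the phases as separate functions mirrors the description above: the mining step sees only cleaned data, and the interpretation step sees only the discovered patterns.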
Not every information-discovery task counts as data mining. For example, looking up individual records with a database management system, or finding a specific Web page with an Internet search engine, is a task in the field of information retrieval. These tasks are important and may involve sophisticated algorithms and data structures, but they rely mainly on traditional computer science techniques and on features of the data to build index structures that organize and retrieve information efficiently. Even so, data mining techniques have also been used to enhance information retrieval systems.
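A rough way to see the difference is to compare a lookup with a summary computed across all records. The sketch below uses invented order data and hypothetical field names. The per-weekday average is only a very simple stand-in for pattern discovery, not a real mining algorithm.

```python
from collections import defaultdict

# Hypothetical order records, invented for illustration.
orders = [
    {"id": 1, "customer": "ann", "weekday": "sat", "amount": 40.0},
    {"id": 2, "customer": "bob", "weekday": "mon", "amount": 10.0},
    {"id": 3, "customer": "ann", "weekday": "sat", "amount": 60.0},
    {"id": 4, "customer": "cy", "weekday": "tue", "amount": 20.0},
    {"id": 5, "customer": "bob", "weekday": "sat", "amount": 50.0},
]

# Information retrieval: build an index on a known key and fetch one record.
# The answer already exists in the data; the index just finds it quickly.
index = {order["id"]: order for order in orders}
record = index[3]

# Pattern discovery (simplified): the regularity "orders are largest on
# Saturdays" is stored in no single record and only emerges from the whole set.
amounts_by_day = defaultdict(list)
for order in orders:
    amounts_by_day[order["weekday"]].append(order["amount"])

average_by_day = {
    day: sum(amounts) / len(amounts) for day, amounts in amounts_by_day.items()
}
busiest = max(average_by_day, key=average_by_day.get)

print(record)
print(average_by_day, busiest)
```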
The origin of data mining
Necessity is the mother of invention. In recent years data mining has attracted great attention in the information industry, mainly because large amounts of data are widely available and there is an urgent need to turn these data into useful information and knowledge. The information and knowledge gained can be used in a wide range of applications, including business management, production control, market analysis, engineering design, and scientific exploration.
Data mining draws on ideas from the following areas: (1) sampling, estimation, and hypothesis testing from statistics; and (2) algorithms, modeling techniques, and learning theory from artificial intelligence, pattern recognition, and machine learning. Data mining has also been quick to adopt ideas from other fields, including optimization, evolutionary computation, information theory, signal processing, visualization, and information retrieval. A number of other areas also play an important supporting role. In particular, database systems are needed to provide efficient storage, indexing, and query processing. High-performance (parallel) computing techniques are often important for handling massive data sets. Distributed techniques can also help with massive data, and they are essential when the data cannot be gathered in one place.