1 Concept
With the continuous development of database technology and the wide adoption of database management systems, the volume of data stored in databases has grown dramatically. Yet few tools are currently available for analyzing and processing such massive data. The limitations of existing tools make it impossible to extract much of the important information hidden in these data, information that could effectively support decision-making. To solve these practical problems, the technique of knowledge discovery in databases (KDD) has gradually developed. KDD is also called data mining (DM). Strictly speaking the two terms differ, but in general it is not necessary to distinguish between them.
Data mining integrates several technologies, including mature database management systems, data warehousing, statistics, and machine learning. It can be applied in many fields, such as medicine, finance, intelligence, law, defense, logic, education, and process control, and it can also be used for anomaly detection and diagnosis. Many areas of scientific research and engineering practice call for rapid diagnosis and decision making. In a large, complex system such as a power system, such situations arise frequently, for example when a fault occurs.
Data mining is well suited to discovering the hidden rules behind such situations and can therefore be used for rapid diagnosis and decision-making.
Data mining is usually defined as a method of exploring large volumes of enterprise data according to established business objectives, revealing hidden regularities and modeling them further.
A widely accepted definition of KDD is that it is an advanced process of extracting credible, novel, effective, and understandable patterns from large amounts of data.
Several terms in this definition need explanation. "Data" refers to a set of facts F that describes information about things; these data are generally assumed to be accurate. "Pattern" means a characterization of the data in F expressed in some language L. "Process" refers to the multi-step procedure in KDD, including data preprocessing, pattern extraction, knowledge evaluation, and process optimization. "Credible" means that the patterns KDD discovers from the current data must have a certain degree of correctness; otherwise KDD serves no purpose.
"Novel" means that the extracted pattern should be new to the system: a pattern that only reveals a generally known law is considered useless. "Effective" (potentially useful) means that the extracted pattern should be meaningful: a pattern with no practical significance is considered useless.
"Understandable" means that the extracted patterns should help people better understand the information contained in the database.
2 The knowledge discovery process in databases
The knowledge discovery process model is multi-stage. The KDD process can usually be divided into the following stages:
(1) Analysis, understanding, and definition of the domain problem. Data miners work with domain experts to analyze the problem in depth, determine possible solutions, and agree on how the learning results will be evaluated.
(2) Collection, extraction, and cleaning of relevant data. Relevant data are collected according to the problem definition. Database query functions can be used to speed up extraction. At the same time, the meaning of each database field and its relationship to other fields must be understood; the extracted data are then checked for validity, and erroneous data are cleaned up.
(3) Data engineering. The data are reworked, mainly by removing redundant attributes, selecting representative samples from the large volume of data to reduce the learning workload, and converting the data into a representation suitable for the learning algorithm.
(4) Selecting and running the data mining algorithm. An appropriate algorithm is chosen according to the problem to be solved and the data, and a decision is made on how to apply it. The chosen knowledge discovery algorithm is then used to extract patterns from the processed data; this step is data mining proper.
(5) Evaluation of the resulting model. How the mining results are assessed depends on the problem to be solved; domain experts evaluate the novelty and effectiveness of the discovered patterns.
(6) Expression and use of the results. The resulting model is expressed in a form that people can understand, and the mining results are applied in practical work to support decision making.
This model emphasizes the involvement of both data miners and domain experts throughout the KDD process. Domain experts know exactly which problems in their field need solving, and they explain these to the data miners during the analysis, understanding, and definition stage. The data miners, in turn, introduce the domain experts to the techniques used in data mining and the types of problems those techniques can solve. The two sides come to understand each other and reach a consensus on the issues to be resolved, including the definition of the problem and the way the data will be processed. Once the data miners have an accurate problem definition and analysis, they collect the data they need and preprocess it to make it more suitable for the mining algorithms used later. A mining algorithm is then chosen according to the needs of the problem. Finally, the extracted knowledge is explained to the domain experts, who evaluate both the knowledge and the process as a whole.
The model given above is driven mainly by the needs of practical applications. It emphasizes the participation of domain experts, whose professional knowledge guides each stage of KDD and is used to evaluate the discovered knowledge. It is also the model most commonly used in engineering practice. Data mining is only one of the stages in KDD, but it is the most important one because it is the stage that discovers hidden patterns. In practice, however, the two terms are often used interchangeably.
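The six stages described in this section can be sketched as a small pipeline. The following is a minimal illustration only, not the procedure of any particular KDD system: the records, the field layout, the plausibility limits, and the "midpoint between class means" learner are all assumptions made for the example. Stage (1), problem definition, is a discussion between people and appears only as a comment.

```python
from statistics import mean

# Stage 1 (problem definition, agreed with domain experts; not code):
# "find a readable rule that separates normal from emergency operation".

# Hypothetical raw records: (bus voltage in per-unit, line loading in %, label).
# None marks a missing measurement. All values are invented for illustration.
RAW = [
    (1.01, 55.0, "normal"),
    (0.99, 60.0, "normal"),
    (None, 58.0, "normal"),      # incomplete record
    (0.91, 97.0, "emergency"),
    (0.90, 99.0, "emergency"),
    (9.99, 50.0, "normal"),      # implausible voltage (bad reading)
]

def clean(records):
    """Stage 2: drop incomplete records and implausible voltages (assumed limits)."""
    return [r for r in records if r[0] is not None and 0.5 <= r[0] <= 1.5]

def engineer(records):
    """Stage 3: keep only the voltage attribute (assumed to be representative)."""
    return [(v, label) for v, _, label in records]

def mine(samples):
    """Stage 4: learn a voltage threshold midway between the two class means."""
    normal = mean(v for v, lab in samples if lab == "normal")
    emergency = mean(v for v, lab in samples if lab == "emergency")
    return (normal + emergency) / 2

def evaluate(threshold, samples):
    """Stage 5: fraction of samples that the threshold rule labels correctly."""
    hits = sum(("emergency" if v < threshold else "normal") == lab
               for v, lab in samples)
    return hits / len(samples)

def express(threshold):
    """Stage 6: express the pattern as a rule that people can read."""
    return f"IF voltage < {threshold:.3f} p.u. THEN state = emergency"

samples = engineer(clean(RAW))
threshold = mine(samples)
accuracy = evaluate(threshold, samples)
rule = express(threshold)
print(rule, f"(training accuracy {accuracy:.0%})")
```

In a real project each stage would be far richer, and the evaluation in stage (5) would be carried out by domain experts on data not used for learning rather than by a training-set accuracy figure.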
3 Typical data mining system architecture
A typical data mining system is mainly composed of the following components:
(1) Database, data warehouse, or other information repository. This consists of one or more databases, data warehouses, spreadsheets, or other kinds of information repositories.
(2) Database or data warehouse server. It is responsible for fetching the relevant data according to the user's data mining request.
(3) Knowledge base. This holds domain knowledge used to guide the search or to evaluate how interesting the resulting patterns are to the user.
(4) Data mining engine. It usually consists of functional modules for specific tasks, such as characterization, association analysis, classification, and evolution and deviation analysis.
(5) Pattern evaluation module. It usually applies interestingness measures and interacts with the data mining modules so that the search moves in the direction the user cares about. For more efficient data mining, pattern evaluation should be integrated as tightly as possible with the mining process so that the search is confined to interesting patterns.
(6) User interface. It handles the interaction between the user and the data mining system and provides information that helps the user focus the search.
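To make the division of labor between these components concrete, here is a minimal sketch in which each component is a small function. It is an illustration under assumptions, not a reference design: the alarm names, the threshold values in the knowledge base, and the choice of support and confidence as the evaluation measures are all hypothetical, and the mining engine performs only a simple co-occurrence (association-style) count.

```python
from collections import Counter
from itertools import combinations

# (1) Information repository: hypothetical event log, one set of alarms per incident.
REPOSITORY = [
    {"line_trip", "low_voltage", "overload"},
    {"line_trip", "low_voltage"},
    {"overload"},
    {"line_trip", "low_voltage", "relay_action"},
    {"relay_action"},
]

# (3) Knowledge base: domain-supplied thresholds (assumed values).
KNOWLEDGE_BASE = {"min_support": 0.4, "min_confidence": 0.8}

def server_fetch(repository, must_contain=None):
    """(2) Server: return only the records relevant to the user's request."""
    if must_contain is None:
        return list(repository)
    return [r for r in repository if must_contain in r]

def mining_engine(records):
    """(4) Mining engine: count single items and item pairs across records."""
    items, pairs = Counter(), Counter()
    for record in records:
        items.update(record)
        pairs.update(combinations(sorted(record), 2))
    return items, pairs

def pattern_evaluation(items, pairs, n_records, kb):
    """(5) Pattern evaluation: keep rules lhs -> rhs that pass both thresholds."""
    rules = []
    for (a, b), count in pairs.items():
        support = count / n_records
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / items[lhs]
            if support >= kb["min_support"] and confidence >= kb["min_confidence"]:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)

def user_interface(rules):
    """(6) User interface: present the surviving rules in readable form."""
    return [f"{lhs} -> {rhs} (support {s:.2f}, confidence {c:.2f})"
            for lhs, rhs, s, c in rules]

records = server_fetch(REPOSITORY)
items, pairs = mining_engine(records)
rules = pattern_evaluation(items, pairs, len(records), KNOWLEDGE_BASE)
for line in user_interface(rules):
    print(line)
```

Keeping the thresholds in the knowledge base rather than inside the engine mirrors the point made in component (5): the evaluation criteria steer the search, and they can be changed without touching the mining code.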
4 Application
4.1 Data utilization status in power systems
In a power system, apart from some special applications, the main data sources are real-time data, archived data, and simulation data. Each source contains many different kinds of data, and together they form an extremely large information store. At present, however, the information that people obtain from these data in the actual operation, planning, and management of the power system, such as the results of power flow calculation and state estimation, is only a fraction of what the data contain. The more important information hidden behind these data is a characterization of the data as a whole and a prediction of its development trend. Conventional methods cannot provide this, yet it is of great value in decision making. In other words, a large amount of useful data has not been fully exploited. The inevitable result is that, although the application technology is adequate, the information actually obtained is relatively scarce: much valuable data lies "dead" as far as information extraction is concerned, and a large amount of available resources is wasted. All of this stems from the lack of techniques for deep analysis of data.
4.2 Data mining applications in power systems
Because data in power systems are currently underused and the information obtained from them is limited and one-dimensional, a deep data analysis technique suited to power system applications is needed to change this situation. The gradual maturing of data mining technology (i.e., knowledge discovery in databases) provides this opportunity. Applying data mining techniques to these data in a manner suited to power systems will make this potentially important information usable.
Power system data to which data mining techniques can be applied have several characteristics: (1) a broad range (in both time and space) and a statistical nature, often involving thousands of state variables; (2) a mixture of discrete information (such as network topology changes or protection actions) and continuous information (such as continuously varying state variables); (3) the need to capture and handle certain uncertainties (such as noise and incomplete information).
When classical power system analysis methods are used to process these data, they usually yield only general results for conventional objectives. Data mining, by contrast, can address some problems that traditional methods cannot solve or cannot solve well. For certain conventional problems, it can also be more efficient or give better results.
The following are some applications of data mining technology in power systems:
(1) Classification of the operating state of the power system. The operating state is classified as normal, alert, emergency, test, or recovery. This classification is important because, once the state of the power system is determined, an appropriate instruction for that state can be sent to the operator to carry out. Data mining algorithms assist this classification process.
(2) Description of the operating state of the power system. A machine learning algorithm is used to learn a rule, satisfied by the data in the database, that describes a particular operating state. For example, the emergency state of a power system may be described by voltage drops on multiple buses together with other features. Data mining helps to find better descriptive rules.
(3) Using numerical rules to analyze the relationships between power system faults. This type of data mining exploits the numerical form of the rule: it learns a function from the given data that can predict the output for a new input value. Data mining can identify relationships that arise when different incidents occur, providing a reliable description of power system faults.
(4) Stability analysis and security assessment of the power system. The discovered knowledge often takes the form of decision trees or dependency tables. For example, a decision tree can be used to classify the power system as stable or unstable, and other machine learning techniques can be used to assess the security of the power system. This, of course, also requires a reasonable description of certain rules.
(5) Detection and prediction of changes and deviations in power system operation. Data mining can be used to discover many important potential changes in large volumes of stored historical data; power system domain knowledge is then used to systematize these findings for further use. This type of data mining is very valuable for power system load forecasting, electricity pricing strategies in the electricity market, and similar tasks.
(6) Construction of an expert system from inductive rules obtained by analyzing accident cases. Data mining can be used to analyze a database of power system fault reports and derive inductive rules that can be applied in diagnostic expert systems for different types of faults. Building an expert system from inductive rules in this way is relatively easy.
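As an illustration of item (4), the sketch below learns a one-level decision tree (a decision stump, the simplest possible decision tree) that separates stable from unstable operating points by minimizing Gini impurity. It is a toy example under assumptions: the two features (pre-fault loading and fault clearing time), all numerical values, and the helper names are hypothetical, and a practical stability assessment would use a deeper tree, many more attributes, and validation on separate data.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def majority(labels):
    """Most frequent label; ties are broken alphabetically for determinism."""
    return max(sorted(set(labels)), key=labels.count)

def fit_stump(samples):
    """Return (feature index, threshold, left label, right label) for the
    single split with the lowest weighted Gini impurity."""
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        values = sorted({x[f] for x, _ in samples})
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in samples if x[f] <= t]
            right = [y for x, y in samples if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(samples)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    return best[1:]

def predict(stump, x):
    """Classify one operating point with the learned stump."""
    f, t, left_label, right_label = stump
    return left_label if x[f] <= t else right_label

# Hypothetical cases: (pre-fault loading in p.u., fault clearing time in s).
TRAIN = [
    ((0.70, 0.10), "stable"),
    ((0.90, 0.12), "stable"),
    ((0.80, 0.15), "stable"),
    ((0.75, 0.25), "unstable"),
    ((0.95, 0.30), "unstable"),
    ((0.85, 0.28), "unstable"),
]

stump = fit_stump(TRAIN)
feature, threshold, left_label, right_label = stump
print(f"IF feature[{feature}] <= {threshold:.2f} THEN {left_label} ELSE {right_label}")
```

The learned split can be read directly as a rule, which is one reason decision trees are attractive when the result has to be explained to operators.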
4.3 Main advantages
Compared with power system analysis methods based on classical theory, data mining shows clear advantages in three main respects: predictability of potential problems and patterns, computational efficiency, and the detection and management of uncertainty.
(1) Better predictability of potential problems and patterns. In current engineering practice, engineers often tackle a new problem only after unsatisfactory results have already appeared in the system; that is, the ability to anticipate potential problems and patterns is generally lacking. Describing the overall characteristics of the data and predicting its development trend is precisely what data mining is designed for, so it can overcome this difficulty.
(2) Higher computational efficiency. Extracting synthesized information with data mining, rather than raw numerical results, speeds up real-time decision making. In addition, data mining may require only the meaningful or usable input parameters rather than a complete description of the model, i.e., it masks redundant information. These characteristics improve efficiency.
(3) Management of uncertainty. Some events in a power system are always unpredictable to a degree, such as relay protection misoperation, operator error, or an incorrectly described load model. Data mining accommodates such uncertainty by relaxing the assumptions of the dynamic model and then manages it effectively with the corresponding domain knowledge.
In short, the structure of a power system is highly complex, and the problems it poses are numerous and difficult. For some of them an accurate mathematical model cannot be established, some cannot be described by mathematical models alone, and some cannot be modeled mathematically at all. For such problems data mining shows clear advantages and is a powerful tool.
4.4 Key points and difficulties
(1) The guiding role of background knowledge. Background knowledge or theory from the relevant area of power system research must be used to guide the data mining process so that the mining algorithm is closely integrated with the field.
(2) Mining different kinds of knowledge. Different applications in the power system require different types of information, so data mining should cover a wide range of applications.
(3) Interactivity of the mining process. Background knowledge from the relevant area of the power system should guide data mining through interaction, which helps users focus the search on the patterns of interest and improves efficiency.
(4) Understandability and usability of the mining results. Discovered knowledge should be expressed in a way that is easy to understand and use; the underlying laws that are found must be understandable to be of practical value.
(5) Handling exceptions and incomplete data. The massive information in power systems inevitably contains noise, anomalies, and incomplete data. These can confuse the analysis and reduce the accuracy of the discovered patterns. Data mining can in fact manage such data effectively by relaxing the assumptions of the dynamic model; the key is how this is achieved.
(6) Evaluation of the patterns produced by data mining. Power system domain knowledge is needed to evaluate the mining results, because a result is meaningful only when it can be applied in a specific field.
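Item (5) above leaves open how noisy and incomplete data are to be handled. One simple and widely used preprocessing option, shown here purely as an illustration and not as the approach the text has in mind, is to replace missing readings and robustly detected outliers with the median of the valid readings. Outliers are flagged with the modified z-score based on the median absolute deviation (MAD); the readings, the function name, and the cut-off of 3.5 are assumptions chosen for the example.

```python
from statistics import median

def clean_series(values, k=3.5):
    """Replace missing (None) and outlying readings with the median of the
    readings that are present.

    A reading is treated as an outlier when its modified z-score,
    0.6745 * |x - median| / MAD, exceeds k.
    Returns the cleaned list and the indices that were replaced.
    """
    present = [v for v in values if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present)
    cleaned, replaced = [], []
    for i, v in enumerate(values):
        is_outlier = (v is not None and mad > 0
                      and 0.6745 * abs(v - med) / mad > k)
        if v is None or is_outlier:
            cleaned.append(med)
            replaced.append(i)
        else:
            cleaned.append(v)
    return cleaned, replaced

# Hypothetical bus-voltage readings in per-unit; None is a dropped sample.
readings = [1.00, 1.01, None, 0.99, 1.02, 5.00, 1.00, 0.98]
cleaned, replaced = clean_series(readings)
print("replaced indices:", replaced)
print("cleaned series:  ", cleaned)
```

Median-based statistics are used because a single gross error, such as the 5.00 p.u. reading, barely moves them, whereas it would distort a mean and standard deviation. Whether a flagged value is a bad measurement or a genuine system event is a question for domain knowledge, in line with item (6).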
5 Conclusion
Data mining is an emerging data analysis tool. Some commercial data mining products and research prototypes are already in use, but the application of data mining tailored to the characteristics of power systems has only just begun. As the power industry develops further, the scope of data analysis will continue to expand across power system applications. Conventional methods are already stretched to their limits, so the timely introduction of data mining into power system analysis will certainly play an active role in solving existing problems. Researchers working on power system problems should be familiar with data mining so that they can combine different techniques to reach comprehensive and practical solutions.