Notes on Mathematical Modeling —— Entropy Weight Method of Evaluation Model

Ok, this time I will talk about the entropy weight method, a way to determine the weights of evaluation indicators from sample data.

We covered the TOPSIS method before; it handles evaluation models where data are available, and it is quite simple, about three steps.

To compute the distances $D_i^+$ and $D_i^-$ above, we usually take the Euclidean distance between the standardized scheme to be evaluated and the ideal best and worst schemes, that is, $D_i^+ = \sqrt{\sum_{j=1}^{m}(z_j^+ - z_{ij})^2}$ and $D_i^- = \sqrt{\sum_{j=1}^{m}(z_j^- - z_{ij})^2}$. This calculation actually hides a premise: by default we assume every indicator matters equally to the final score, that is, all indicators share the same weight.

Giving different weights to the evaluation indicators fits the actual modeling situation better and is easier to justify. We have mentioned ways to determine weights several times before: searching other research reports online, sending out questionnaires, asking experts to assign weights, and so on. The Analytic Hierarchy Process (AHP) is a persuasive method for determining weights, but its shortcoming is also obvious: it is too subjective, since the judgment matrix is essentially filled in by one person, so it is best suited to situations where no data are available.

When we have the data, can we determine the weights directly from the data?

For example, common sense can hardly tell us which factor matters most for water quality, nor how to weigh the importance of the remaining indicators. If you cannot find relevant references, you really can only assign weights entirely subjectively. And that is with only four indicators; with ten or twenty, subjective weighting becomes far more troublesome.

That said, we can derive a method that determines weights purely from the data: the entropy weight method. In fact, that one sentence already reveals its shortcoming: because it starts only from the data and ignores the actual background of the problem, it may violate common sense when assigning weights, and even cause problems when scoring. Of course, we can stay flexible; the entropy weight method still has its advantages, and it sounds rather impressive... I don't know whether the contest judges like this method. This is just an introduction; whether to adopt it is up to you ~

Entropy: the degree of disorder within a system. Sounds impressive, doesn't it? There is also the famous "law of entropy increase", which I believe everyone has heard of at some point. Although it is a law of thermodynamics, it carries a philosophical flavor: everything moves from order to disorder. So why is this weighting method called the entropy weight method? After all, the data are completely given; there is no so-called transition toward disorder.

I don't know the details, so let me briefly give my own take. Modern science uses not only entropy but also "information" to express how ordered a system is. If a system contains a certain structure, it carries certain information, called "structural information". The more structural information, the more ordered the system. That may sound metaphysical, so here is a simple example.

Look at the sand on the seashore. If it is just scattered in its natural state, it carries essentially no information, and the system is completely disordered.

If you build a sandcastle, things change. The sand now has a certain structure, the system formed by that sand becomes relatively ordered, and we can read some information from it. The more such information, the more elaborate the sandcastle and the more ordered the system. That should be easy to accept ~

Of course, it doesn't matter if that didn't land; it was just an aside. The principle of the entropy weight method is this: the smaller an indicator's degree of variation, the less information it reflects, and the lower the weight it should receive. In other words, the entropy weight method uses the information contained in each indicator to determine its standing among all the indicators. Since entropy measures a system's degree of disorder, it can also measure the amount of information, so naming the method after entropy is understandable. (But this is all my guess...)

Ok, so how do we measure the amount of information? We can measure it through the probability of events. For example, Xiao Ming's grades have always been first in the whole school, while Xiao Zhang's have always been last. Both are admitted to Tsinghua at the same time. Which event carries more information, "Xiao Ming was admitted to Tsinghua" or "Xiao Zhang was admitted to Tsinghua"? Obviously the latter. Since Xiao Ming has always ranked first, his admission feels only natural; everyone expects it. But Xiao Zhang has always ranked last, and suddenly he gets into Tsinghua: a seemingly impossible event happened, and it carries a great deal of information.

However, there is a small problem here: is the information in the example above the same as the existing information mentioned in the principle of the entropy weight method?

In any case, we can draw a simple conclusion: the more likely an event is, the less information it carries; the less likely, the more information. Since we use probability to measure how likely an event is, we can also use probability to measure the amount of information the event contains.

If we denote the amount of information by $I$ and the probability by $p$, we can sketch the general shape of the function $I(p)$.

It can be seen that the amount of information decreases as the probability increases; the probability lies between 0 and 1, and the amount of information between 0 and positive infinity. A logarithmic function can therefore express the relationship between probability and information.

Suppose $x$ is one possible outcome of an event $X$, and $p(x)$ is the probability that it occurs. We define $I(x) = -\ln p(x)$ to measure the amount of information $x$ contains. The domain of the logarithmic function is $(0, +\infty)$ while probabilities lie in $[0, 1]$, but we generally do not consider events with probability 0, so restricting the logarithm to $(0, 1]$ causes no trouble.

If the event $X$ has $n$ possible outcomes $x_1, \dots, x_n$ with probabilities $p(x_1), \dots, p(x_n)$, we can define the information entropy of $X$ as $H(X) = -\sum_{i=1}^{n} p(x_i) \ln p(x_i)$. We can see that information entropy is the expectation of the amount of information. When $p(x_1) = \dots = p(x_n) = \frac{1}{n}$, $H(X)$ attains its maximum value $\ln n$.
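To make the definition concrete, here is a minimal Python sketch (the function name `information_entropy` is my own, not from any library) that computes $H(X)$ and checks that the uniform distribution attains the maximum $\ln n$:

```python
import numpy as np

def information_entropy(p):
    """H(X) = -sum(p_i * ln p_i) for a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # skip zero-probability outcomes, as discussed above
    return -np.sum(p * np.log(p))

n = 4
print(information_entropy([0.25] * n))                # 1.386... == ln(4), the maximum
print(np.log(n))                                      # 1.386...
print(information_entropy([0.97, 0.01, 0.01, 0.01]))  # 0.168..., far lower
```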

Does greater information entropy mean more existing information, or less? As we said above, information entropy is the expected amount of information, which would seem to mean that greater entropy implies more existing information. Actually it does not, because the expectation here is over potential future information. We say a small-probability event carries a lot of information because, when an almost impossible event happens, there must be a lot of undiscovered information behind it that led to its occurrence. When we say a high-probability event carries little information, we really mean there is little left to mine after it happens.

The "unmined information" above is potential information before the event, not existing information. When we already hold enough information, certain events happen naturally, so we regard them as high-probability events. When we hold little information, it is hard to believe something will happen, so we regard it as a small-probability event. I find it normal that the school's top student gets into Tsinghua, because we already know enough about his exam ability. As for "the last-ranked student got into Tsinghua", perhaps we were missing an important piece of information, such as "he was ranking last on purpose"...

Well, this is my take, and it matches the conclusion: the greater the information entropy, the less the existing information. There may be logical gaps in the examples above, which are for reference only, but the point should be clear: the greater the information entropy of a random variable, the less information we currently have. The entropy weight method determines weights precisely from this existing information.

Well, with the groundwork laid, here are the calculation steps of the entropy weight method.

1. Forward-orient the input matrix and then standardize it (if you have forgotten how, see the second article on evaluation models).

Suppose the forward-oriented matrix $X$ has $n$ samples (rows) and $m$ indicators (columns). If all of its entries are positive, we standardize it to a matrix $Z$ with $z_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{n} x_{ij}^2}}$.

If the forward-oriented matrix contains negative numbers, we can instead standardize with min-max rescaling, $\tilde{z}_{ij} = \frac{x_{ij} - \min_i x_{ij}}{\max_i x_{ij} - \min_i x_{ij}}$. In short, we must ensure the standardized data are non-negative.
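A minimal sketch of this step, assuming a small made-up decision matrix `X` whose indicators are already forward-oriented:

```python
import numpy as np

# Hypothetical forward-oriented matrix: 4 samples (rows), 3 indicators (columns)
X = np.array([[9.0, 1.0, 0.5],
              [8.0, 3.0, 0.9],
              [6.0, 7.0, 0.2],
              [7.0, 5.0, 1.0]])

if (X > 0).all():
    # All entries positive: divide each column by its Euclidean norm
    Z = X / np.sqrt((X ** 2).sum(axis=0))
else:
    # Negative entries present: min-max rescale each column to [0, 1]
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```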

2. Calculate the proportion of the $i$-th sample under the $j$-th indicator and treat it as the probability used in the information entropy calculation.

With $Z$ the standardized non-negative matrix above, we compute the probability matrix $P$, whose entries are $p_{ij} = \frac{z_{ij}}{\sum_{i=1}^{n} z_{ij}}$, so that each column of $P$ sums to 1. Well, don't ask me why the probability is defined this way; I really don't know. If you are interested, look it up yourself, and leave me a message when you find out?
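Continuing the sketch above, the probability matrix is just a column-wise normalization of `Z`:

```python
# p_ij = z_ij / sum_i(z_ij): each column of P is a probability
# distribution over the n samples (each column sums to 1)
P = Z / Z.sum(axis=0)
```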

3. Calculate the information entropy of each indicator, compute the information utility value, and normalize to obtain each indicator's entropy weight.

For the $j$-th indicator, the information entropy is calculated as $e_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$. As we mentioned above, the maximum of $-\sum_{i=1}^{n} p_{ij} \ln p_{ij}$ is $\ln n$, so dividing by the constant $\ln n$ makes $e_j$ fall in the range $[0, 1]$.

As mentioned above, the greater the information entropy, the less the existing information. If the information entropy $e_j$ reaches its maximum 1, all the $p_{ij}$ must be equal to $\frac{1}{n}$, which means the indicator takes the same value for every sample. An indicator that is identical across all schemes can hardly play any role in evaluation. For example, if all the evaluated candidates are boys, there is no need to consider gender at all. This again confirms that, within the framework of the entropy weight method, the greater the information entropy, the less the existing information.

Therefore we define the information utility value $d_j = 1 - e_j$: the greater the utility value, the more existing information the indicator carries. Normalizing the utility values then gives the entropy weight of each indicator, $w_j = \frac{d_j}{\sum_{j=1}^{m} d_j}$.
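Putting the formulas of this step into the same sketch (the `np.where` guard implements the convention that terms with $p_{ij} = 0$ contribute nothing):

```python
n = Z.shape[0]

# e_j = -(1/ln n) * sum_i p_ij * ln(p_ij); zero entries contribute 0
plogp = np.where(P > 0, P * np.log(np.where(P > 0, P, 1.0)), 0.0)
e = -plogp.sum(axis=0) / np.log(n)   # information entropy of each indicator
d = 1.0 - e                          # information utility values
w = d / d.sum()                      # entropy weights, one per indicator
print(w)
```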

The above is the whole process of calculating indicator weights by the entropy weight method, and it is really not very difficult. In essence, it "gives higher weight to indicators that contain more existing information". Afterwards, we can plug these weights into the distance calculations of TOPSIS, or even use them directly for weighted scoring.
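As a usage sketch (reusing `Z` and `w` from above), the weights can feed a direct weighted score, or the weighted distances of TOPSIS:

```python
# Direct weighted scoring of the standardized samples
scores = Z @ w

# Or weighted TOPSIS: distances to the ideal best / worst schemes
z_best, z_worst = Z.max(axis=0), Z.min(axis=0)
d_plus  = np.sqrt((w * (Z - z_best) ** 2).sum(axis=1))
d_minus = np.sqrt((w * (Z - z_worst) ** 2).sum(axis=1))
topsis_scores = d_minus / (d_plus + d_minus)
print(scores, topsis_scores)
```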

In fact, the so-called existing information can also be viewed through the standard deviation of an indicator's data. When an indicator's data are all identical, its standard deviation is 0 and its information entropy is maximal. A Monte Carlo simulation shows that information entropy is basically negatively correlated with standard deviation, which means standard deviation is basically positively correlated with existing information: the larger the standard deviation, the larger the fluctuation of the data, the more existing information, and the larger the weight we assign. In a sense, that's all there is to it.
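Here is a quick Monte Carlo check of that claim. The data-generating choices (log-normal columns with varying spread, the step-1 norm standardization) are my own assumptions; other setups should show the same tendency:

```python
import numpy as np

rng = np.random.default_rng(0)
entropies, stds = [], []
for _ in range(1000):
    # A positive indicator column whose spread varies from run to run
    x = rng.lognormal(mean=0.0, sigma=rng.uniform(0.05, 2.0), size=50)
    z = x / np.sqrt((x ** 2).sum())          # the step-1 standardization
    p = z / z.sum()
    entropies.append(-np.sum(p * np.log(p)) / np.log(len(p)))
    stds.append(z.std())

print(np.corrcoef(entropies, stds)[0, 1])    # comes out clearly negative
```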

Teacher Qingfeng raised an interesting question. In selecting "Three-Good Students" (a merit award), let X be the number of serious disciplinary violations and Y the number of verbal criticisms; which indicator should influence the selection more? Obviously, in real life, once a serious violation is recorded in your file, you basically cannot be named a Three-Good Student. But this indicator is 0 for most people and only 1 or 2 for a few, so its fluctuation is tiny, and the entropy weight method assigns it a tiny weight. If we followed that, someone with a serious violation might still be rated a Three-Good Student, which is unrealistic.
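A small made-up demonstration of this pitfall, reusing the steps above: one student out of fifty has serious violations, so after forward-orienting that cost indicator its column barely varies, and its entropy weight comes out small even though in reality it should be decisive:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
viol = np.zeros(n)                         # serious violations: almost all zeros
viol[0] = 2.0
score = rng.uniform(60, 100, size=n)       # some everyday performance score

X = np.column_stack([viol.max() - viol, score])  # forward the cost indicator
Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
P = Z / Z.sum(axis=0)
plogp = np.where(P > 0, P * np.log(np.where(P > 0, P, 1.0)), 0.0)
e = -plogp.sum(axis=0) / np.log(n)
w = (1 - e) / (1 - e).sum()
print(w)  # the violation indicator gets the smaller weight despite mattering most
```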

This example shows the limitation of the entropy weight method: it depends only on the degree of fluctuation in the data, the so-called amount of information, without considering what the data actually mean, which can easily produce results contrary to common sense.

Teacher Qingfeng used to regard this method as a novice's signature move, because it is unreasonable to make a weight large merely because the variance is large; it may not even beat a subjective weight from AHP or from references found online. In addition, different standardization choices in the first step can lead to different final results, which is another problem.

But in fact some of these problems can be solved. For the serious-violation case above, we can simply eliminate the samples with serious violations and rank only the remaining ones. Moreover, for indicators known to matter greatly in real life, we can assign their weights in advance and distribute the remaining weight among the other indicators by the entropy weight method.

If we have a realistic understanding of the evaluation indicators, we can check whether the result of the entropy weight method accords with reality and then decide whether to adopt it. If you don't know much about the indicators, AHP feels too arbitrary, and no reference conclusions can be found online, then using the entropy weight method is understandable.

As for whether it is meaningful to measure an indicator's importance by the fluctuation of its data, opinions differ. Personally, I think it makes some sense. After standardization removes the influence of dimensions, an indicator whose data fluctuate more spans a wider range of values, so in a certain sense it affects the final result more. In TOPSIS, the ideal best and worst schemes consist of the best and worst values of each indicator; when computing a scheme's distance to them, an indicator with large fluctuation obviously contributes more, so giving it a higher weight is not entirely unreasonable. Of course, special cases like the one above still need to be excluded; under normal circumstances I don't think it is a big problem.

(Just my rambling, don't take it too seriously.)

In my opinion, as long as the final result of the entropy weight method does not violate common sense, there is no big problem in using it. Teacher Qingfeng also said that if it is only for competitions, the entropy weight method is fine to use; it beats a weight you make up on the spot (generally speaking).

Well, that's all I want to say about entropy weight method. If you want to know more, please look it up yourself.

Bye bye ~