How to write high-quality ROC analysis articles

First, the basics of ROC analysis

Papers in laboratory medicine fall into two categories. One concerns methodology, mainly quality control and the evaluation and comparison of methods. The other concerns the application of test indicators, which are used to diagnose diseases and to examine how indicators relate to prognosis and pathological staging. ROC analysis belongs to the latter, and its main purpose is to evaluate the diagnostic value of a given index for a given disease, as in "the role of AFP in the diagnosis of liver cancer" or "combined CEA and NSE in the diagnosis of lung cancer".

As we all know, when a single laboratory index is used to diagnose a disease, raising sensitivity inevitably lowers specificity, and vice versa. In other words, reducing misdiagnosis inevitably increases missed diagnosis. Suppose a doctor wants to use AFP to diagnose liver cancer and sets the diagnostic threshold at AFP greater than 10 ng/mL. The sensitivity of this threshold is clearly high, and liver cancer will rarely be missed, because almost all patients with liver cancer have AFP above 10 ng/mL. On the other hand, misdiagnosis is easy under this threshold, because many patients with AFP above 10 ng/mL do not have liver cancer. Realizing this, the doctor raises the threshold from 10 ng/mL to 1000 ng/mL. Under the new threshold patients will rarely be misdiagnosed, because an AFP above 1000 ng/mL almost certainly indicates liver cancer. But missed diagnosis increases, because not all patients with liver cancer have AFP above 1000 ng/mL. Missed diagnosis and misdiagnosis thus pull against each other: reducing missed diagnosis comes at the expense of more misdiagnosis, and reducing misdiagnosis comes at the expense of more missed diagnosis. Sensitivity and specificity are like the fish and the bear's paw: you cannot have both.

So how can sensitivity and specificity be reconciled? The best way is ROC analysis (note: ROC analysis applies only to quantitative indicators). Pool the case group with patients whose diseases are hard to distinguish from it, classify everyone with the index under study, and plot sensitivity on the vertical axis against 1 - specificity on the horizontal axis. Observing how sensitivity and specificity change as the cut-off of the index changes yields the ROC curve (the computation itself can be done in SPSS). The index is then judged from the resulting curve.
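To make the construction of the curve concrete, here is a minimal sketch in Python rather than SPSS. The function name `roc_points` and the AFP-like values are hypothetical and serve only as an illustration. It assumes that higher values of the index point toward disease.

```python
def roc_points(values, diseased):
    """Return (1 - specificity, sensitivity) pairs, one per candidate cut-off.

    values   : measured index for each subject (higher = more suspicious)
    diseased : True for gold-standard positives, False for controls
    """
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos
    # A cut-off above every observed value calls nobody positive.
    points = [(0.0, 0.0)]
    # Lower the cut-off one observed value at a time, from highest to lowest.
    for cutoff in sorted(set(values), reverse=True):
        tp = sum(1 for v, d in zip(values, diseased) if d and v >= cutoff)
        fp = sum(1 for v, d in zip(values, diseased) if not d and v >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical data: 4 liver cancer cases followed by 4 cirrhosis controls.
values = [1200, 450, 80, 15, 300, 40, 12, 8]
diseased = [True, True, True, True, False, False, False, False]

points = roc_points(values, diseased)
for one_minus_spec, sens in points:
    print(f"1-specificity={one_minus_spec:.2f}  sensitivity={sens:.2f}")
```

Each printed pair is one point of the curve. Moving down the list corresponds to lowering the cut-off, which raises sensitivity and lowers specificity.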

What is ROC analysis used for? There are two main purposes: 1. to judge the diagnostic effectiveness of a single index, where a larger area under the curve means greater diagnostic effectiveness; 2. to compare the diagnostic effectiveness of different indexes.
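As a rough illustration of the area under the curve, the sketch below computes it as the proportion of case-control pairs in which the case has the higher value (ties count half), which equals the area under the empirical ROC curve. The function name `auc` and both data sets are hypothetical, and higher values are assumed to indicate disease.

```python
def auc(cases, controls):
    """Probability that a random case scores higher than a random control.

    Ties count as half a win. This equals the area under the empirical ROC curve.
    """
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical values of two indexes measured in 4 cases and 4 controls.
index_a = {"cases": [1200, 450, 80, 15], "controls": [300, 40, 12, 8]}
index_b = {"cases": [30, 22, 9, 7], "controls": [25, 10, 8, 5]}

auc_a = auc(index_a["cases"], index_a["controls"])
auc_b = auc(index_b["cases"], index_b["controls"])
print(f"AUC of index A: {auc_a:.4f}")
print(f"AUC of index B: {auc_b:.4f}")
```

The two numbers only describe this sample. Deciding whether one index is truly better than the other requires a statistical test, not a comparison of the raw values.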

If you don't understand, read the relevant books again.

Second, how to design a high-quality ROC analysis

2.1 Estimation of sample size

The sample size should be estimated before most studies begin. If the sample is too small, statistical power is low and type II errors become likely. If it is too large, money is wasted, more uncontrollable factors creep in, and the quality of the experiment drops. Some domestic medical personnel simply assume that the larger the sample, the more reliable the results, which is one-sided. In my view, for general research it is enough to cap the sample at the minimum estimated sample size plus 20% (mainly to allow for dropout in prospective studies).

I will not write out the sample size formula, because formulas cannot be typeset here. Refer to the relevant books for details.
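The article does not give a formula, so the sketch below is only one common textbook approach, not necessarily the one the author has in mind. It estimates how many confirmed cases are needed to estimate sensitivity to within a chosen half-width, using the normal approximation for a proportion, and then adds the 20% margin suggested above. The function name `cases_needed` and the inputs (expected sensitivity 0.90, half-width 0.05) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def cases_needed(expected_sensitivity, half_width, confidence=0.95, margin=0.20):
    """Minimum number of confirmed cases, and the same number plus a margin.

    Assumed formula: n = z^2 * p * (1 - p) / d^2, where p is the expected
    sensitivity and d is the desired half-width of the confidence interval.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_sensitivity
    n_min = math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)
    n_with_margin = math.ceil(n_min * (1 + margin))
    return n_min, n_with_margin


n_min, n_target = cases_needed(expected_sensitivity=0.90, half_width=0.05)
print(f"minimum cases: {n_min}, with 20% margin: {n_target}")
```

The same calculation with the expected specificity in place of the expected sensitivity would give a figure for the control group. Other study goals, such as comparing two AUCs, need different formulas.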

2.2 Setting up the control group and the experimental group

Laboratory indicators should be judged not only by sensitivity but also by specificity. The control group mainly reflects the specificity of the index, that is, its ability to support differential diagnosis. The control group should therefore consist of patients whose symptoms resemble those of the disease group and whose disease is hard to tell apart without laboratory indicators, such as liver cirrhosis versus liver cancer. Some domestic articles add healthy people to the control group, which is unscientific: healthy and sick people can generally be distinguished by symptoms and signs, so no laboratory indicator is needed. Of course, if the indicator is novel, or the disease is asymptomatic (such diseases are rare), it is understandable to add a healthy control group in a first or preliminary study to see whether the disease group differs from healthy people at all. For diseases with obvious symptoms that are easy to distinguish from health, a healthy control group is unnecessary. The experimental group and the control group should be homogeneous, that is, together they should form a population in which differential diagnosis is difficult from symptoms and history alone, without laboratory indicators. The best way to achieve this is to apply uniform inclusion criteria. For example, to study AFP in the diagnosis of liver cancer, a good inclusion criterion would be all newly presenting patients over 40 years old with jaundice or other symptoms of liver disease in whom liver cancer is suspected. Whether each included patient turns out to have liver cancer or a benign liver disease does not matter at this stage; the indicator is there to make that differential diagnosis. Note that the specific criteria depend on professional knowledge, and the criteria above are for reference only.
The quality of an ROC study depends to a great extent on how scientific the inclusion criteria are, which is also a test of the researcher's ability.

Some domestic journals, and even some foreign ones, describe the study subjects only by stating how many cases were in the experimental group and that the control group contained ** cases of ** disease and ** cases of ** disease. This is unscientific, or at least not to be encouraged. Take a study of rheumatoid factor (RF) in the diagnosis of rheumatoid arthritis (RA) as an example. The researcher simply stated that the control group consisted of 20 cases of ankylosing spondylitis and 30 cases of systemic lupus erythematosus. From this it is hard to tell whether these patients actually needed to be differentiated from RA, because some cases of ankylosing spondylitis and lupus erythematosus have typical symptoms and need no such differentiation. It also raises another problem: the pathological stage of the disease is not reported.

Many indexes are related to the pathological stage of the disease, tumor markers for example. Others are closely related to the course of the disease, such as markers of myocardial injury. When studying such indicators, stratification of the disease must be considered. For example, to study the diagnostic value of CK for AMI, the inclusion criteria must restrict the time of presentation: the study might include only people who see a doctor within 2 hours of the onset of acute chest pain, with all of them tested within a fixed window (for example, 3-4 hours after the onset of chest pain). Of course, the narrower the time window, the higher the quality of the research, but the fewer the cases and the harder the study, so researchers need to set the inclusion criteria according to their own research capacity and professional knowledge. For diseases related to pathological stage, if each stratum has enough cases, the strata can be studied separately. If not, the case composition must be described and kept close to the real clinical situation (judged by professional knowledge), and other baseline characteristics relevant to the specialty can also be compared. For example, in a study of AFP in the diagnosis of liver cancer, liver function (grade) can be compared between the case group and the control group with a chi-square test. In fact, before ROC analysis it is best to check whether the other indicators of the two groups are balanced. An imbalance means either that the inclusion criteria were not strict enough, or that the unbalanced indicator could itself serve as a diagnostic indicator and be put through ROC analysis.
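The baseline comparison mentioned above can be sketched as follows. The counts of liver function grades are invented for illustration, the function name `chi_square` is hypothetical, and the result is compared with the standard 5% critical value for 2 degrees of freedom (5.991) instead of computing an exact p-value.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts of liver function grades A / B / C.
table = [
    [20, 18, 12],  # case group (liver cancer)
    [22, 17, 11],  # control group (benign liver disease)
]
stat, df = chi_square(table)

CRITICAL_5_PERCENT_DF2 = 5.991  # chi-square critical value, alpha = 0.05, df = 2
balanced = stat < CRITICAL_5_PERCENT_DF2
print(f"chi-square = {stat:.3f}, df = {df}, no significant imbalance: {balanced}")
```

With these made-up counts the statistic is far below the critical value, so liver function grade would be reported as balanced between the two groups.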

Most papers in China neither describe the case composition nor compare other characteristics of the two groups, not even basic age and sex. Readers therefore cannot judge whether the two groups are homogeneous and whether the research is meaningful.

Note that the stricter the inclusion criteria and the finer the stratification, the higher the quality of the study, but also the harder the study is to carry out. What the research is meant to achieve, and whether stratification and strict restrictions are needed, depend on the actual situation.

2.3 The experimental group must have a gold standard

The so-called gold standard is the method or scheme by which a disease is definitively diagnosed. Every case included in the experimental group must be diagnosed by the gold standard. In other words, a case enters the experimental group only once it has been confirmed. By contrast, the requirements for the control group are sometimes less strict.

Examples of common gold standards: RA should be diagnosed according to the 1987 criteria of the American Rheumatism Association, tumors according to pathology, and gallstones according to intraoperative findings. For the specific gold standard, refer to the knowledge of the relevant specialty. Most ROC analysis articles in China do not state the gold standard clearly, which makes it impossible for readers to judge the research quality. This is a lack of rigor.

2.4 Interpretation of ROC-related parameters

An ROC analysis yields many parameters, such as the cut-off value, sensitivity, specificity, area under the curve (AUC), positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, Youden index and so on. When writing up, it is generally not necessary to list them all, but the cut-off value, sensitivity, specificity and AUC must be reported.
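For reference, the parameters listed above (other than the AUC) can all be derived from the 2x2 table obtained at one cut-off. The sketch below uses the standard definitions. The last entry is the index of sensitivity + specificity - 1, usually called the Youden index. The function name `diagnostic_parameters` and the counts are hypothetical.

```python
def diagnostic_parameters(tp, fn, fp, tn):
    """Standard diagnostic parameters from a 2x2 table at a single cut-off.

    tp / fn : diseased subjects called positive / negative
    fp / tn : non-diseased subjects called positive / negative
    Assumes specificity is below 1 and above 0, so the ratios are defined.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                         # positive predictive value
        "npv": tn / (tn + fn),                         # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "youden": sensitivity + specificity - 1,
    }


# Hypothetical counts: 100 gold-standard cases and 100 controls.
params = diagnostic_parameters(tp=80, fn=20, fp=10, tn=90)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the proportion of diseased subjects in the sample, so they only carry over to populations with a similar prevalence.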

When domestic papers discuss these parameters, the common mistakes are: no confidence interval is given for any parameter, sample estimates are treated as if they were population values, and a larger AUC is simply taken to mean higher diagnostic effectiveness. Some people even think the Youden index is the key to judging diagnostic validity. In fact, the Youden index refers to a single point on the curve and cannot reflect how sensitivity and specificity change as the cut-off changes. The Youden index can only roughly determine the cut-off value.
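As one possible way to attach a confidence interval to a sensitivity or specificity, the sketch below uses the Wilson score interval for a proportion. The choice of the Wilson method is an assumption made here, since the article does not name a method, and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a proportion successes / n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical result: 80 of 100 gold-standard cases are positive at the cut-off.
low, high = wilson_interval(80, 100)
print(f"sensitivity = 0.80, 95% CI {low:.3f} to {high:.3f}")
```

The width of the interval shows how much the sample estimate could differ from the population value, which is why a bare point estimate is not enough.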

When comparing two indicators in the same study, the AUCs must be compared to decide which indicator has the higher diagnostic efficiency. This is not a simple comparison of two numbers but requires an appropriate statistical test. The method is complicated, so it is advisable to consult a professional. Comparisons of sensitivity and specificity likewise require appropriate tests. Because we study samples, there is sampling error, so sample values cannot stand in for the population in the discussion.

Also, the AUC lies between 0.5 and 1 and cannot be lower than 0.5. Some domestic articles, stuck in fixed thinking, report an AUC below 0.5. This is complete nonsense.

Here is an analogy. Some people like to predict whether others will have a boy or a girl. If 90% of someone's predictions are correct, we can say the predictions are fairly accurate. If 90% of the predictions are wrong, is the prediction useless? A smarter person will simply reverse it: when a boy is predicted, expect a girl, and the accuracy is again 90%.
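The same reversal can be shown numerically. In the sketch below the hypothetical index is lower in cases than in controls, so treating high values as positive gives an AUC below 0.5. Treating low values as positive (here done by negating the values) gives one minus that figure. The function name `auc` and the data are made up for illustration.

```python
def auc(cases, controls):
    """Proportion of case-control pairs in which the case has the higher value.

    Ties count as half. This equals the area under the empirical ROC curve
    when high values are treated as positive.
    """
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical index that is LOWER in diseased subjects.
cases = [2, 3, 5, 4]
controls = [6, 7, 4, 9]

auc_high_is_positive = auc(cases, controls)
# Reverse the direction of the test: low values now count as positive.
auc_low_is_positive = auc([-v for v in cases], [-v for v in controls])

print(f"high values positive: {auc_high_is_positive:.5f}")
print(f"low values positive:  {auc_low_is_positive:.5f}")
```

An AUC far below 0.5 therefore signals that the direction of the test was specified the wrong way round, not that the index is worse than useless.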

2.5 Multi-index joint diagnosis

Joint diagnosis with several indicators can be done as a series test or a parallel test, and which mode to use depends on the specialty. For some diseases, such as acute myocardial infarction, early diagnosis and prompt treatment have a great influence on prognosis, so sensitivity is emphasized and the indicators can be combined in parallel. Other diseases, such as SLE, emphasize specificity, and the indicators can be combined in series to improve specificity. Put simply: in series, the result is positive only when all indicators are positive; in parallel, the result is positive as long as any one indicator is positive.

Of course, for almost every disease we want the diagnosis to be as early and as accurate as possible, and apart from a few diseases it is hard to say which matters more. To be cautious, researchers can therefore report both the series mode and the parallel mode and let readers choose.

When some domestic papers use joint diagnosis, they either do not state the joint mode or apply series and parallel combinations indiscriminately, leaving readers unable to follow.

In fact, common sense tells us that whether two indicators are combined in series or in parallel, raising sensitivity lowers specificity. In series, the diagnosis is made only when all indicators are positive, which improves specificity and reduces sensitivity. In parallel, the diagnosis is made when any one indicator is positive, which improves sensitivity and reduces specificity. Combining several indexes cannot improve sensitivity and specificity at the same time. In other words, if the sensitivity of the joint diagnosis is higher than (or at least equal to) that of every single index, the author used the parallel mode, and the specificity should be lower than that of every single index involved. Conversely, if the specificity of the joint diagnosis is higher than (or at least equal to) that of every single index, the author used the series mode, and the sensitivity should be lower than that of every single index involved. In some papers in China, sensitivity and specificity both rise after joint diagnosis, which is illogical. To put it mildly, this is a calculation error; to put it seriously, it is DDD.
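The trade-off described above can be checked on a small made-up example. The sketch below takes the positive or negative calls of two hypothetical indexes, each at a fixed cut-off, and combines them in series (both positive) and in parallel (either positive). All names and data are illustrative.

```python
def sens_spec(calls, diseased):
    """Sensitivity and specificity of a list of positive/negative calls."""
    tp = sum(1 for c, d in zip(calls, diseased) if c and d)
    tn = sum(1 for c, d in zip(calls, diseased) if not c and not d)
    n_pos = sum(diseased)
    n_neg = len(diseased) - n_pos
    return tp / n_pos, tn / n_neg


# Hypothetical data: 5 gold-standard cases followed by 5 controls.
diseased = [True] * 5 + [False] * 5
index_a = [True, True, True, False, False, True, False, False, False, False]
index_b = [True, True, False, True, False, False, True, False, False, False]

series = [a and b for a, b in zip(index_a, index_b)]    # positive only if both are
parallel = [a or b for a, b in zip(index_a, index_b)]   # positive if either is

results = {
    "index A": sens_spec(index_a, diseased),
    "index B": sens_spec(index_b, diseased),
    "series": sens_spec(series, diseased),
    "parallel": sens_spec(parallel, diseased),
}
for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.1f}, specificity={spec:.1f}")
```

In this example each single index has sensitivity 0.6 and specificity 0.8. The series rule moves to 0.4 and 1.0, and the parallel rule to 0.8 and 0.6, so neither rule raises both figures at once.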

Also, combining indicators is a point-to-point combination, that is, a combination of two cut-off points. Such a combination has no AUC. The AUC is the area under the curve traced out as the cut-off of an indicator varies; once each indicator has been fixed at a particular cut-off, there is no curve and hence no AUC. The joint AUC that some papers in China report after merging indicators is therefore also nonsense.

Some articles even abandon the ROC curve and use the upper limit of the reference range as the diagnostic cut-off, which simply shows that the authors do not understand ROC analysis.

2.6 From the clinical point of view, objectively evaluate ROC analysis.

In theory, combining several indicators usually helps improve diagnostic accuracy, but it also increases the economic burden on patients. When diagnostic efficacy is similar, a small number of cheap indicators should be preferred. Moreover, for an indicator to be used successfully in the clinic, all of its advantages and disadvantages must be weighed, such as whether it is easy to measure, stable, cheap and timely. ROC analysis papers on multi-index joint diagnosis in China mostly ignore clinical needs and blindly recommend joint diagnosis, which is divorced from clinical practice and weakens the practical value of the articles. Therefore, diagnostic indicators or combinations of indicators should be recommended with care, after the above factors have been fully considered.

2.7 Rational treatment of ROC analysis

Although ROC analysis combines sensitivity and specificity to evaluate the diagnostic efficacy of indicators objectively, it has its own shortcomings. Sometimes what matters most is the specificity of the diagnosis, as with AIDS. If a quantitative index is found to help diagnose AIDS, ROC analysis should not be used in this situation; instead, follow clinical practice, focus on the cut-off at which specificity is 100%, and do not worry too much about sensitivity. In addition, the diagnosis of some diseases does not depend strongly on laboratory indicators, and many indicators change only as a consequence of the disease. ROC analysis is not suitable there, because laboratory indicators compete with imaging, pathology, medical history, physical examination and so on. Do not use ROC analysis blindly when laboratory indicators have no advantage. In theory, white blood cell count could be used to diagnose trauma, but the diagnosis of trauma does not depend on white blood cells at all, because it rests mainly on the medical history. Using ROC analysis to evaluate the role of WBC in trauma diagnosis would be a joke.