"JOURNAL OF RADIO ELECTRONICS" (Zhurnal Radioelektroniki), ISSN 1684-1719, N 12, 2016


Using Cluster Analysis and Logistic Regression for the Differentiation of Lung Diseases

D. Yu. Kozlov
Altai State University in Barnaul

The paper was received on November 19, 2016

 

Abstract. This paper considers the possibility of using cluster analysis and logistic regression to differentiate two pathologies (cancer and tuberculosis) that lead to the appearance of spherical formations in the lungs. The diagnostic features were parameters determined from X-ray computed tomography images: the mean and the standard deviation of the Hounsfield densitometric index over a region of interest selected by a radiologist, together with the fractal dimension and the "slope". K-means clustering and hierarchical clustering with the complete linkage method were applied to this sample in turn. Comparison of the clustering results with the verified diagnoses showed that hierarchical clustering identifies the correct diagnosis more reliably than the k-means method. A logistic regression model was then fitted on half of the original sample. ROC analysis was used to assess the quality of the resulting model and to determine the cut-off point that converts the probabilities produced by the logistic regression into predicted diagnoses. The selected logistic regression model was used to predict the diagnosis in the second half of the sample, and a correct prediction rate of about 70% was reached.

Keywords: spherical formation in the lungs, fractal dimension, cluster analysis, logistic regression.
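
The final steps described in the abstract (fitting a logistic regression on one half of the sample, choosing a ROC cut-off, and predicting diagnoses for the other half) can be illustrated with a minimal sketch. It is not the author's implementation. The data are synthetic, with two generic features standing in for the diagnostic features. The cut-off criterion (maximising Youden's index, sensitivity + specificity - 1) is an assumption, since the abstract does not state which criterion was used. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, x):
    """Probability of class 1 for feature vector x; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights by batch gradient descent on log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def youden_cutoff(probs, labels):
    """Scan the ROC operating points and return (threshold, J) with the
    largest Youden index J = sensitivity + specificity - 1.
    A case is called positive when its probability is >= threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = 0.5, -1.0
    for t in sorted(set(probs)):
        tp = sum(1 for p, l in zip(probs, labels) if p >= t and l == 1)
        tn = sum(1 for p, l in zip(probs, labels) if p < t and l == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def demo(seed=0):
    """Synthetic two-class data (NOT the paper's data): fit on the first
    half, pick the cut-off there, report accuracy on the second half."""
    rng = random.Random(seed)
    data = [([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)], 0) for _ in range(100)]
    data += [([rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)], 1) for _ in range(100)]
    rng.shuffle(data)
    train, test = data[:100], data[100:]
    X_tr = [x for x, _ in train]
    y_tr = [y for _, y in train]
    w = fit_logistic(X_tr, y_tr)
    cutoff, _ = youden_cutoff([predict_proba(w, x) for x in X_tr], y_tr)
    correct = sum(1 for x, y in test if (predict_proba(w, x) >= cutoff) == (y == 1))
    return cutoff, correct / len(test)


if __name__ == "__main__":
    cutoff, accuracy = demo()
    print(f"cut-off = {cutoff:.3f}, hold-out accuracy = {accuracy:.2f}")
```

Choosing the cut-off on the training half and scoring only the held-out half mirrors the split described in the abstract. The accuracy printed here reflects the synthetic data and has no bearing on the reported 70%.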
