Please use this identifier to cite or link to this item:
https://www.arca.fiocruz.br/handle/icict/67806
Type
Article
Copyright
Restricted access
Embargo date
3100-12-31
LATENT ARCHETYPES OF THE SPATIAL PATTERNS OF CANCER
Author
Affiliation
School of Mathematics and Statistics. UCD, Dublin, Ireland.
Universidade Federal de Minas Gerais. Departamento de Estatística. Belo Horizonte, MG, Brazil.
ESRI Inc. Redlands, California, USA / Universidade Federal de Minas Gerais. Departamento de Ciência da Computação. Belo Horizonte, MG, Brazil.
Fundação Oswaldo Cruz. Instituto René Rachou. Belo Horizonte, MG, Brazil.
Abstract
Cancer atlases published by several countries are the main resource for analyzing the geographic variation of cancer risk. Correlating the observed spatial patterns with known or hypothesized risk factors is time‐consuming work for epidemiologists, who need to deal with each cancer separately and break down the patterns by sex and race. The recent literature has proposed studying more than one cancer simultaneously, looking for common spatial risk factors. However, this previous work has two constraints: it considers only a very small number of cancers (2–4), and these cancers are previously known to share risk factors. In this article, we propose an exploratory method to search for latent spatial risk factors of a large number of supposedly unrelated cancers. The method is based on the singular value decomposition and nonnegative matrix factorization; it is computationally efficient and scales easily with the number of regions and cancers. We carried out a simulation study to evaluate the method's performance and applied it to cancer atlases from the USA, England, France, Australia, Spain, and Brazil. We conclude that very few latent maps, which can represent a reduction of up to 90% in the number of atlas maps, conserve most of the spatial variability. By concentrating the epidemiological analysis on these few latent maps, a substantial amount of work is saved and, at the same time, high‐level explanations affecting many cancers simultaneously can be reached.
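As a rough illustration of the low‐rank idea in the abstract, the sketch below factorizes a synthetic regions × cancers matrix with a truncated singular value decomposition and with a basic nonnegative matrix factorization. It is not the authors' implementation: the data are simulated, the sizes (200 regions, 20 cancers, 2 latent maps) are arbitrary, the matrix is neither centered nor standardized, and the Lee–Seung multiplicative updates are just one common way to fit an NMF. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for an atlas (assumption, not real data):
# 200 regions x 20 cancers built from 2 hidden nonnegative spatial patterns.
n_regions, n_cancers, k = 200, 20, 2
true_maps = rng.gamma(shape=2.0, scale=1.0, size=(n_regions, k))
true_loadings = rng.gamma(shape=2.0, scale=1.0, size=(k, n_cancers))
noise = 0.05 * rng.random((n_regions, n_cancers))
atlas = true_maps @ true_loadings + noise

# Truncated SVD: keep the first k singular triplets.
U, s, Vt = np.linalg.svd(atlas, full_matrices=False)
atlas_svd = (U[:, :k] * s[:k]) @ Vt[:k]
# Share of the total squared variation retained by k components.
svd_share = float(np.sum(s[:k] ** 2) / np.sum(s ** 2))

# NMF via multiplicative updates: atlas ~ W @ H with W >= 0 and H >= 0.
# Columns of W play the role of latent maps, rows of H the cancer loadings.
W = rng.random((n_regions, k)) + 0.1
H = rng.random((k, n_cancers)) + 0.1
eps = 1e-12  # guards against division by zero
for _ in range(500):
    H *= (W.T @ atlas) / (W.T @ W @ H + eps)
    W *= (atlas @ H.T) / (W @ H @ H.T + eps)


def rel_error(approx):
    """Relative Frobenius-norm reconstruction error against the atlas."""
    return float(np.linalg.norm(atlas - approx) / np.linalg.norm(atlas))


svd_err = rel_error(atlas_svd)
nmf_err = rel_error(W @ H)
print(f"share of variation kept by {k} SVD components: {svd_share:.4f}")
print(f"relative error, SVD: {svd_err:.4f}  NMF: {nmf_err:.4f}")
```

In this toy setting the 20 simulated maps are summarized by 2 latent ones. The SVD gives the best rank‐2 approximation in the least‐squares sense, while the NMF trades some accuracy for nonnegative factors, which are easier to read as maps and loadings.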
Keywords
cancer maps
low‐rank approximation
nonnegative matrix factorization
singular value decomposition
spatial statistics