Link to GitHub: marisalong.github.io

Analysis of Alzheimer's Disease Features

Marisa Long and Anna Schoeny


Notebook Contents

Introduction\
Discussion of Data Sources\
Dataset Vocabulary and Abbreviations\
Data Limitations\
Extract, Transform, Load (ETL)\
Exploratory Data Analysis (EDA)\
Modeling and Analysis\
Insights and Final Thoughts\
References

Introduction

Our team was interested in working with datasets related to Alzheimer's disease, particularly those taken from studies measuring features of dementia patients such as years of education, brain volumetrics, age of diagnosis and death, and more. We are particularly interested in working with datasets relating to Alzheimer's and dementia because both of our families have been personally impacted by the disease.

Alzheimer's disease impacts families across the globe and is the leading cause of dementia in aging populations. Alzheimer's is a neurological disease that is most widely recognized as progressive memory loss. However, Alzheimer's can have a host of other impacts on patients, including paranoia, delusions, self-injurious behaviors and depression, and difficulty speaking, swallowing, and walking. Given the detrimental outcomes of Alzheimer's disease, families are often left with the burden of personally caring for patients or finding palliative care options. Families might suffer the physical and emotional burdens of caregiving, as well as the guilt and pain associated with watching a loved one progressively lose themselves. To put it simply, hundreds of thousands of people are diagnosed with Alzheimer's each year, and research on the disease is still riddled with unanswered questions and a lack of clarity in terms of predictive features and treatment options. There is a push in the field to start implementing machine learning models and other computer/data science methods as a means of predicting Alzheimer's diagnoses. We wanted to engage in our own data analysis with this in mind.

For the purpose of this project, we will be seeking to evaluate the relationships that exist between these patient features and the onset of symptoms and diagnosis of Alzheimer's disease. We were most interested in looking at features such as education and gender at the outset of our research, but we also wanted to keep our minds open to other predictive features based on our exploratory analysis. We are interested in getting a sense of both where research dead-ends in terms of predictive modeling as well as potentially discovering more promising areas where research might be headed. Though we do recognize that our lack of expertise in the field may be a barrier to our analyses, we hope to use our data science skills to explore some potential areas to research further rather than claiming to understand the intricacies of the neuroscientific data.


Discussion of Data Sources

The first dataset that we determined was fit for our project is MRI and Alzheimer's, taken from the Open Access Series of Imaging Studies (OASIS). OASIS is a project that has sought to make brain imaging datasets more widely available to the public. This dataset includes MRI comparisons of adults with Alzheimer's and healthy adults. This dataset was initially interesting to our team because it has a high data usability score as assigned by Kaggle, which represents user ratings on the documentation of the data. This high score indicates that the data is in a state ready for our analysis. This dataset also includes both cross-sectional and longitudinal MRI data. The cross-sectional data highlights 416 subjects ranging from age 18 to 96 and details 3 to 4 MRI scans for each subject. Approximately $\frac{1}{4}$ of the subjects had been diagnosed with Alzheimer's disease. The longitudinal data follows a sample of 150 subjects aged 60 to 96. These same 150 subjects were scanned two or more times with at least a year between imaging sessions. 64 of the subjects were diagnosed with Alzheimer's disease by the time of the first scan, with an additional 14 diagnosed at one of the later scan visits. Using this data, we are hoping to be able to answer questions regarding the ability to predict dementia in patients based on features such as socioeconomic status or education. For example:

For our second dataset, we wanted to dive further into brain differences between females and males that could contribute to differences in the prevalence of dementia. We also continued to explore correlations between years of education and dementia onset. For this milestone, we performed exploratory analysis on the Seattle Alzheimer's Disease Brain Cell Atlas Donor Metadata dataset. This dataset was taken from studies conducted by the Allen Institute for Brain Science, the University of Washington, and the Kaiser Permanente Washington Health Research Institute. The data highlights demographic, clinical, cognitive, and neuropathological data for a set of 84 patients. The patient features include cognitive status, level and years of education, sex, race, brain pH, measures from numerous brain scans, and ages at different stages in each patient's cognitive journey. We found this dataset particularly interesting since there is also a separate volumetric dataset that can be linked using the donor ID. For the purpose of our exploratory analysis, we did not use the volumetric dataset, as neither of us has the neuroscience expertise to understand the complexity of these measures, but we did think this was an interesting dataset worthy of merging with our first dataset. Using the metadata, we hope to answer questions such as the following:

This second question is especially interesting to us because we want to see if our findings are consistent with our analysis from the first dataset regarding education as a potential predictive feature.

Dataset Vocabulary and Abbreviations

There are several features in our datasets that use specific terminology that we want to explicitly define. (Table adapted from OASIS)

| Abbreviation | Patient Feature | Feature Definition |
|---|---|---|
| EDUC_Y | Years of Education | Simple count of the years in formal education. |
| EDUC_L | Level of Education | Education codes correspond to the following levels of education: 1: less than high school grad., 2: high school grad., 3: some college, 4: college grad., 5: beyond college. |
| SES | Socioeconomic Status | Socioeconomic status as assessed by the Hollingshead Index of Social Position and classified into categories from 1 (highest status) to 5 (lowest status) (Hollingshead, 1957) |
| MMSE | Mini Mental State Examination | Mini-Mental State Examination score (range is from 0 = worst to 30 = best) (Folstein, Folstein, & McHugh, 1975) |
| CDR | Clinical Dementia Rating | Clinical Dementia Rating (0 = no dementia, 0.5 = very mild AD, 1 = mild AD, 2 = moderate AD) (Morris, 1993) |
| ASF | Atlas Scaling Factor | Atlas scaling factor (unitless). Computed scaling factor that transforms native-space brain and skull to the atlas target (i.e., the determinant of the transform matrix) (Buckner et al., 2004) |
| eTIV | Estimated Total Intracranial Volume | Estimated total intracranial volume (cm³) (Buckner et al., 2004) |
| NWBV | Normalized Whole Brain Volume | Normalized whole-brain volume, expressed as a percent of all voxels in the atlas-masked image that are labeled as gray or white matter by the automated tissue segmentation process (Fotenos et al., 2005) |

Data Limitations

Challenges with MRI Dataset:

  1. Different Columns (Variables): The longitudinal data and cross-sectional data contained slightly different variables from one another. For example, 'Group' only existed in one dataframe, so we had to map the Clinical Dementia Rating for each patient found in the cross-sectional data to their corresponding group.
  2. 'EDUC' Ambiguity: In the longitudinal data, the paper that was published alongside the dataset clearly states that Education Level means the number of years in formal education for that individual. However, in the cross-sectional data, the 'Educ' numbers are suspiciously low for being total years in formal education. Upon further research, we discovered that Educ in the cross-sectional data refers to the level of education as defined in the table above.
  3. Data Non-Exclusivity: Since the longitudinal and cross-sectional data from the OASIS study contain some of the same patients, when we combined the data we had to ensure that we were not using the same patient’s data multiple times in our analysis and model.

Challenges with Metadata Dataset:

  1. Relatively Small Sample Size: One of the biggest things we have taken out of our analysis of this dataset is that we have to be cautious about how much weight we put on any conclusions from it. Only 84 participants are included in this data. We really liked this source because there aren't a lot of publicly available sources that have information on education and gender as they relate to dementia. However, we do recognize that the small sample is one drawback of this dataset. We hope that in our future work we will be able to mitigate this by working with both of our datasets, and if we decide it is necessary at that point, we will find another dataset to draw from.
  2. Lack of Knowledge in the Field: This dataset included a lot of test scores and other information that has the potential to be useful in our analyses. We were able to determine which scores we were interested in examining, such as the MMSE score, but we recognize that some of the collected data we did not include may have potential for future study.
  3. Missing Data: One example of missing data is that for several of the age columns, the ages above 90 are grouped together, whereas all the other ages represent a single year. As a result, those columns were not stored as integers, which complicated our task of graphing and getting summary statistics. To deal with this, we changed those "90+" categories to all just be equal to 90, which we recognize may impact some of the conclusions we draw.
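The "90+" recoding described in the last item could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column name `age_at_death` and the sample values are made up, and the approach (strip the `+`, then convert to numbers) is one assumed way to do it.

```python
import pandas as pd

# Hypothetical age column where ages above 90 are recorded as the string "90+".
ages = pd.Series(["78", "90+", "85", None, "90+"], name="age_at_death")

# Strip the trailing "+" so "90+" becomes "90", then convert to numbers.
# Missing entries stay missing (NaN) rather than being filled in.
ages_numeric = pd.to_numeric(ages.str.replace("+", "", regex=False))
```

Because every "90+" entry collapses to exactly 90, summary statistics for the oldest donors will be biased downward, which is the caveat noted above.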

There were several other data limitations relating to our analyses:

  1. Data Imputation: When combining our datasets before construction of our model, we noticed a lot of missing data that would limit our ability to build a solid model. Each dataset measures slightly different variables, which meant that the collected data only partially overlapped. We ultimately determined that we needed to impute values for the missing entries because our datasets were small to begin with and we didn't want to have to drop rows. This may limit some of the certainty in our models and suggests the need for further data collection in future analyses.
  2. We want to acknowledge that patients who exhibit signs of cognitive impairment in a clinical setting may be more likely to receive an MMSE or other cognitive tests. However, these studies were specifically looking at dementia causes and correlations, so seemingly healthy individuals were also given these exams.

Extract, Transform, and Load Data (ETL)

First, we needed to import the necessary libraries and ensure we are working in the correct directory to access our data.

Next, we needed to read in each of our datasets and conduct any necessary transformations and reformatting.

Dataset 1: MRI and Alzheimer's

Data from OASIS project

Longitudinal Data:

Since this data represents the same individual in multiple rows, we decided to break the different visits out into separate tables. We also created another table that holds the keys to each individual so that we can still access each visit associated with a specific individual.
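The visit split described above could look something like the sketch below. It runs on a small made-up frame; the column names (`Subject ID`, `Visit`, `Group`, `Age`) and the variable names are assumptions for illustration and may not match the notebook.

```python
import pandas as pd

# Toy stand-in for the longitudinal data (values are made up).
longitudinal = pd.DataFrame({
    "Subject ID": ["S1", "S1", "S2", "S2", "S2"],
    "Visit": [1, 2, 1, 2, 3],
    "Group": ["Nondemented", "Nondemented", "Demented", "Demented", "Demented"],
    "Age": [87, 88, 75, 76, 80],
})

# Key table: one row per individual.
subjects = (
    longitudinal[["Subject ID", "Group"]]
    .drop_duplicates()
    .reset_index(drop=True)
)

# One table per visit number; each keeps the subject key so visits can be re-linked.
visits = {
    int(visit): frame.reset_index(drop=True)
    for visit, frame in longitudinal.groupby("Visit")
}

# Number of patients seen at each visit.
patients_per_visit = {visit: len(frame) for visit, frame in visits.items()}
```

Counting rows per visit table is a quick way to see how many patients remain at each consecutive visit.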

We want to check out the datatypes in each of our dataframes to ensure they are formatted in the way that makes the most sense. Here are the datatypes for this dataframe:

We also want to get a sense of what patients are being recorded in this data.

It is clearly evident that some individuals were not studied as long as other individuals given the dropoff in the number of patients during each consecutive visit.

"Converted" refers to individuals that were not initially diagnosed with Alzheimer's at the time of the first scan but were diagnosed with Alzheimer's at one of their later scans.

Cross-Sectional Data:

The MRI and Alzheimer's study also includes cross-sectional data from the same research lab, which is included below:

Here are the datatypes of the dataframe:

Note: The cross-sectional data does not have a group column that classifies the patients as demented, nondemented, or converted like the longitudinal dataset does. However, it still includes the Clinical Dementia Rating (CDR) that describes the level of dementia. According to the CDR, these are the classifications: 0 = no dementia, 0.5 = very mild Alzheimer's Disease, 1 = mild Alzheimer's Disease, 2 = moderate Alzheimer's Disease.

To make the data more easily comparable to the longitudinal data, we used this quantitative variable to create a categorical variable with the classification.
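One way to derive such a categorical column is sketched below. The labels "Nondemented" and "Demented" are assumed here to mirror the longitudinal `Group` column; the exact labels used in the notebook are not stated, and the `ID` values are made up.

```python
import numpy as np
import pandas as pd

# Toy cross-sectional rows; CDR is missing for some subjects.
cross_sectional = pd.DataFrame({
    "ID": ["C1", "C2", "C3", "C4", "C5"],
    "CDR": [0.0, 0.5, 1.0, np.nan, 2.0],
})

# Assumed mapping: CDR 0 means no dementia, any positive rating means dementia.
cdr_to_group = {
    0.0: "Nondemented",
    0.5: "Demented",
    1.0: "Demented",
    2.0: "Demented",
}

# Ratings not in the mapping (including missing CDR) stay missing.
cross_sectional["Group"] = cross_sectional["CDR"].map(cdr_to_group)
```

Note that a cross-sectional snapshot cannot produce a "Converted" label, since that requires at least two visits.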

Dataset 2: Metadata

Before completing our analysis, we wanted to clean up this dataframe to make it more easily readable.


Exploratory Data Analysis (EDA)

Questions for Dataset 1 (Longitudinal):

What is the average level of education (variable 'EDUC') of demented vs non-demented vs converted individuals?

One hypothesis we have based on this brief summary of education level is that education slows cognitive decline. This is consistent with the data above, where demented individuals have an average of 13.7 years of education, nondemented individuals have an average of 15.1 years, and converted individuals have an average of 15.5 years. Looking at education differences among clinically demented vs nondemented individuals will help us determine how strongly education level may predict Alzheimer's.

At first glance, a strong negative correlation between education level and socioeconomic status was surprising. However, when we read the publication that came out alongside this dataset, the authors explained that 'SES' is ranked using the Hollingshead Index of Social Position, with 1 being the highest status and 5 the lowest. Therefore, this statistic shows that individuals from a higher socioeconomic status were likely to receive more education than those of lower socioeconomic statuses. Through our further exploration, we hope to uncover the relationship between socioeconomic status, education, and dementia.
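The two summaries discussed above, mean education per group and the EDUC/SES correlation, can be computed along these lines. The frame below is made-up toy data, and Pearson correlation (the pandas default) is an assumption about which coefficient was used.

```python
import pandas as pd

# Toy data: SES is coded so that 1 is the highest status and 5 the lowest.
visits = pd.DataFrame({
    "Group": ["Demented", "Demented", "Demented",
              "Nondemented", "Nondemented", "Nondemented"],
    "EDUC": [12, 14, 12, 16, 18, 16],
    "SES": [4, 3, 4, 2, 1, 2],
})

# Average years of education within each diagnostic group.
mean_educ = visits.groupby("Group")["EDUC"].mean()

# Pearson correlation between education and SES; a negative value means
# more education goes with a numerically lower (i.e., higher-status) SES code.
educ_ses_corr = visits["EDUC"].corr(visits["SES"])
```

The sign of the coefficient only makes sense once the direction of the SES coding is known, which is why the reversed scale initially looked surprising.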

Over how many years did the study track each individual?

To check for skewness in the age data from the longitudinal study, we decided to look at the difference from first to last visit for each of the individuals.

From this histogram, we can see that the longitudinal study did not cover the same number of years for each patient. The study ranged from one visit (which would have a span of 1 year here) to up to three visits spanning up to seven years. Because we are looking to build a model that predicts risk of Alzheimer's, not age of onset, we plan to just use the data from the first visit (since the first visit had the greatest number of participants).
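A rough sketch of the span calculation and the first-visit selection follows. It uses made-up rows and assumed column names, and it measures the span as last age minus first age; the notebook's exact convention may differ (for example, in how a single visit is counted).

```python
import pandas as pd

# Toy longitudinal rows (values are made up).
longitudinal = pd.DataFrame({
    "Subject ID": ["S1", "S1", "S2", "S2", "S2", "S3", "S3"],
    "Visit": [1, 2, 1, 2, 3, 1, 2],
    "Age": [87, 88, 75, 76, 80, 68, 71],
})

# Youngest and oldest recorded age per subject.
age_range = longitudinal.groupby("Subject ID")["Age"].agg(["min", "max"])

# Years between first and last visit for each subject.
years_tracked = age_range["max"] - age_range["min"]

# Keep only the first visit, one row per subject, for modeling.
first_visit = longitudinal[longitudinal["Visit"] == 1].reset_index(drop=True)
```

Plotting `years_tracked` as a histogram would show how unevenly the follow-up period is distributed across patients.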

How old were people in this dataset when they had their first MRI scan for dementia?

The mean age of patients on their first visit was 74.4 years old, with a standard deviation of 7.5 years. This shows that this dataset focuses heavily on Alzheimer's and dementia in older individuals, rather than also considering early onset Alzheimer's. Looking at age allows us to understand the timeframe for dementia onset.

How does the number of females vs males with dementia compare in this dataset?

Compared to the male data, the female data appears to be more right-skewed. This suggests that women on average are older than men when they are affected by Alzheimer's. We do want to account for the fact that women in general live longer than men, so this could explain why diagnoses happen later on and women are more often affected by Alzheimer's. We want to look at brain differences such as pH level that differ between men and women to understand the contribution these factors may have to Alzheimer's. This is relevant to our main goal, as one of the variables we are focusing on is gendered differences in Alzheimer's.

Questions for Dataset 1 (Cross Sectional):

What is the average level of education (variable 'EDUC') of demented vs non-demented vs converted individuals?

This is consistent with the dataframe above, showing that on average nondemented individuals had a higher education level than the demented individuals. This does appear to be on a different scale, which we address in our commentary below.

Similar to the longitudinal data, this statistic shows that individuals from a higher socioeconomic status were likely to receive more education than those of lower socioeconomic statuses. The correlation is also very similar, with it being -0.74 in the cross-sectional data compared to -0.72 in the longitudinal data.

How old were people in this dataset when they had their MRI to scan for dementia?

The mean age of patients when they got their MRI scan was 51.3 years old, which is much younger than the previous dataset. This dataset also had a much bigger spread in regards to age, with a standard deviation of 25.3 years. This tells us that the cross-sectional dataset could contain cases of early onset Alzheimer's. The cross-sectional study contained over 400 individuals and studied subjects ranging between the ages of 18 and 96, whereas the longitudinal study included patients only between the ages of 60 and 96. This explains why the mean age was so different between the two.

How does the number of females vs males with dementia compare in this dataset?

From this graphic, we see that females have a wider range of ages. This differs from the longitudinal data and indicates that we have to dig deeper before drawing conclusions on our gender hypothesis.

Dataset 2 EDA

For our milestone 2 dataset, we were interested in introducing more data related to patients with and without dementia to gain a further picture of what features may be related to cognitive decline. This dataset includes a variety of features ranging from the age of the onset of symptoms, the age of death, education levels, sex, brain weight and pH, and several cognitive test scores. Several of these same features were present in our MRI dataset, so we hope this will allow us to be able to examine these two datasets together.

Dataset 2: Metadata

We see that this dataframe is slightly skewed in terms of sex. There are more females than males.

The skew in race is more drastic than the skew in sex: this dataset is almost entirely white patients, which may pose some constraints on our ability to generalize. We will keep this in mind when making any statements regarding race using this dataset, and note that this is one piece of potential missing data.

Distribution of Dementia by Age and Sex

We see that although there are more females than males in the dataset, there is a difference of 3 between the number of donors with dementia and without dementia for both sexes. For males, there are more donors without dementia while for females there are more donors with dementia. This is information we will have to keep in mind for the rest of our EDA while comparing between sexes.

Brain pH for Dementia v. No dementia

Based on this exploratory graph, there are several observations that can be made about brain pH as it relates to dementia diagnosis. For individuals with dementia, the mode falls at a pH of around 6.2-6.3, whereas the distribution for individuals without dementia peaks at a brain pH of around 6.9. Ignoring the distinct outlier in the dementia patients, the spread of pH values in both groups is around 1.5. However, the maximum pH for individuals without dementia is higher at around 7.6, whereas the max pH for individuals with dementia falls at around 7.2. We wanted to consider brain pH as it relates to dementia because brain pH may be a potential indicating factor for dementia, and we know that females tend to have a lower brain pH.

To dive further, we wanted to look at the differences between individuals with and without dementia and their pH levels by sex.

From these two plots, it appears that brain pH is lower in the individuals with dementia. For instance, although males tend to lean towards a higher pH in both the demented and non-demented group, in the non-demented group the highest male brain pH is around 7.6, whereas it is around 7.3 in the demented group. Females tended to have a slightly lower brain pH in both groups, with the non-demented group having a low of about 6.0 and the demented group having a low of around 4.4. We found it particularly interesting that females appear to have a slightly lower baseline pH compared to men and that brain pH appears to fall with dementia. At this point, we have to keep digging further to see if brain pH is a good predictor for dementia. If so, we will do more research on pH between females and males to see if baseline brain pH could contribute to the likelihood of developing dementia.

Examining the Relationship between Education and Dementia

One of our focuses in the previous dataset was on education and dementia, so we wanted to continue that analysis here.

All of the individuals in this dataset have had a fair amount of education (at least through high school), so we are unable to assess how extremely low levels of education may correlate with dementia. However, there is a large range in the type of education received as well as the number of years of education with a range of 9.

Average length of education for individuals with/without dementia:

This differs from our previous dataset, where individuals without dementia had more years of education on average.

The years of education does not vary much between females and males, which will be helpful in our analysis. Therefore, when we look at education we don’t have to worry as much about the distribution of sexes.

Once again, we see that this data may contradict our dataset from above. There appears to be a positive correlation in our new dataset between more education and the development of dementia, where it was the opposite in both datasets above. However, these correlations are still all relatively low, so we would need more data to truly determine the significance of education. We also recognize there may be some confounding factors at play here, which we want to consider when making any conclusion using our data.

From this correlation matrix, we see that none of the correlations are strong, except for each variable with itself and male/female, which is -1 (because each donor must be recorded as one or the other). This is to be expected since we are examining a complex disease whose causes even neuroscientists are unsure of. We also need to remember that all of our data comes from humans, who are extremely variable in their genetic makeup, environment, and lifestyle, all of which contribute to confounding variables. However, the highest correlation appears to be between “last MMSE score” and “cog_status_num”, which is a binary column with 1 representing dementia and 0 representing no dementia. This informs us that moving forward in our prediction models, we should look at including the last MMSE score since it has a negative correlation with the cognitive status of -0.498.

Looking at MMSE (Mini Mental State Examination) Scores for Males/Females With/Without Dementia

The MMSE test is a means of measuring mental status and is one of the measures that might indicate dementia that is contained in both of our datasets. We wanted to attempt to compare the two datasets to see how findings differ.

Using the Metadata dataset:

Using our previous MRI dataset:

MMSE describes a patient’s mental state with a high score of 30 and a low of 0. Our quick dive into MMSE allowed us to see that there is a noticeable difference between the MMSE of individuals with and without dementia. We also were able to see that our new dataset contains individuals who had a lower score on average, since each respective mean was lower than the means of MMSE per group in the three combined datasets. We will have to do a bit more research into exactly what a point in the MMSE score represents to better understand these differences.

Age of Diagnosis and Years of Education

This exploratory plot suggests that, in this dataset, there does not appear to be a correlation between years of education and the age of dementia diagnosis. We have to take this with a grain of salt because a large amount of the data had to be dropped because there was no age of diagnosis recorded. However, this is definitely something worth exploring more because it plays into our question regarding the correlations between education and dementia diagnosis.

Data Analysis and Modeling

Our initial hope when beginning this project was to see if there were correlations between gender, socioeconomic status, level of education, and the onset of Alzheimer’s or other dementias. Through our EDA, we have seen that without massive amounts of data, it may be difficult to build a predictive model since each human is so variable. However, by exploring the strongest correlations and combining them, we still hope to be able to build a working model.

Here are our ideas:

We do want to look at additional features beyond gender and education, even if just for exploratory purposes, to get a sense of whether there are any other potential correlations to note.

Preparation for model-building

The first step in building our model was combining the three datasets that we have used to perform our exploratory data analysis. This involved limiting each dataframe to only the features we were interested in (and that had adequate data), ensuring that the column titles matched, and ensuring that the data was structured in the same format.

One complication that we ran into while working on combining these datasets was that the two OASIS datasets (cross-sectional and longitudinal) contained some of the same patients. To avoid skewing our results with duplicate data, we decided to drop any repeated patients from the cross-sectional dataset. Because participants were given different IDs in the two studies, we could not match on ID; instead, we identified repeats by looking at which patients matched on the variables that were recorded in both datasets.
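Matching on shared variables instead of IDs can be expressed as an anti-join. The sketch below is illustrative only: the frames, the column names, and the choice of shared columns are assumptions, and exact matching on measured values is just one possible matching rule.

```python
import pandas as pd

# Toy first-visit longitudinal rows and cross-sectional rows (made-up values).
longitudinal_first = pd.DataFrame({
    "Subject ID": ["L1", "L2"],
    "Sex": ["M", "F"],
    "Age": [87, 75],
    "MMSE": [27.0, 23.0],
    "eTIV": [1987, 1678],
    "nWBV": [0.696, 0.736],
})
cross_sectional = pd.DataFrame({
    "ID": ["C1", "C2", "C3"],
    "Sex": ["F", "M", "F"],
    "Age": [74, 87, 55],
    "MMSE": [29.0, 27.0, 30.0],
    "eTIV": [1344, 1987, 1147],
    "nWBV": [0.743, 0.696, 0.810],
})

# Columns recorded in both studies; IDs are excluded because they differ.
shared = ["Sex", "Age", "MMSE", "eTIV", "nWBV"]

# Left merge with an indicator column marks cross-sectional rows that also
# appear in the longitudinal data ("both") versus those that do not ("left_only").
flagged = cross_sectional.merge(
    longitudinal_first[shared].drop_duplicates(),
    on=shared,
    how="left",
    indicator=True,
)

# Keep only the cross-sectional patients with no longitudinal match.
cross_unique = (
    flagged.loc[flagged["_merge"] == "left_only"]
    .drop(columns="_merge")
    .reset_index(drop=True)
)
```

In this toy example `C2` matches `L1` on every shared column and is dropped. Matching on more columns lowers the chance of discarding two genuinely different patients who happen to share a few values.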

Just to take a look at what we're working with when thinking about a predictive model, we wanted to create a simple visualization to show the relationship between MMSE score and years of education. The green dots represent nondemented patients whereas the red represents patients with dementia. A visualization like this could be used to predict the classification of other patients based on these two features. This is not a complex model, but used more just for us to explore and play around with some graphing and visualizations.

To start the process of building a model, we wanted to take an initial look at the correlations of our combined dataframe. We used these correlation coefficients to inform us on potential variables to be using as features in our model.

At this point, some of the stronger correlations we noted were between nWBV and Age and between SES and EDUC_Y.

nWBV refers to the normalized whole brain volume. In this correlation matrix, it appears that there is a strong negative correlation between the volume and patient age, indicating that as patients age, brain volume declines. This is interesting to note because an accelerated decline of nWBV could be a potential sign of dementia that we could include in our model. In other words, nondemented individuals may have a slower rate of brain atrophy, whereas patients with Alzheimer's may have accelerated atrophy worth considering as a predictive feature.

In terms of SES and EDUC_Y, we were initially surprised to see a relatively strong negative correlation. However, upon closer investigation, we realized SES is measured on a 1-5 scale, where a measure of 1 indicates higher socioeconomic status. Therefore, as the SES value increases, EDUC_Y would be expected to decrease, as individuals of a lower socioeconomic status are, not surprisingly, less likely to have high levels of education.

This is not to say that none of the other correlations are worth noting, but just at this simple level we noticed a few potential areas worth including in our models, which we wanted to further investigate.

To begin, we wanted to build a simple model with the features we've been working with while also imputing values for our missing data.

Our goal with this first model and performance metrics was to test our method of imputation. We tried a few different ways, but found that we consistently got the best results using the KNN imputer. While we recognize that imputing data may throw off some of our performance metrics, we ultimately decided that was our only option due to limited open-source data. This first model does not use cross validation or the optimal number of neighbors, which we select later on.
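A first model of this kind might be sketched as below, chaining scikit-learn's `KNNImputer` with a `KNeighborsClassifier`. The data is synthetic, the neighbor counts are placeholders, and the `StandardScaler` step is an addition of this sketch (KNN is distance-based); the notebook does not say whether features were scaled.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the combined dataset, with about 10% of values removed.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan

# Impute missing values from the nearest rows, scale, then classify.
model = make_pipeline(
    KNNImputer(n_neighbors=5),
    StandardScaler(),  # assumption of this sketch, not stated in the source
    KNeighborsClassifier(n_neighbors=5),
)
model.fit(X, y)

# Accuracy on the same data the model was fit on (training performance only).
train_accuracy = model.score(X, y)
```

Keeping the imputer inside the pipeline means that, once a train/test split or cross validation is introduced, the imputation is fit only on the training portion.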

K Value Selection:

We wanted to get a sense of what value for k is best for an improved KNN-prediction model.

This indicates that the best selection for K is around 13.
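One common way to choose K is to compare cross-validated scores over a range of candidate values, as sketched here on synthetic data. The candidate range, the 5-fold split, and F1 as the selection metric are all assumptions; the notebook's own procedure (which arrived at roughly 13) is not shown in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import KNNImputer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the combined dataset, with about 10% of values removed.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, random_state=0
)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan

# Odd values of K avoid ties in a two-class vote.
candidate_ks = list(range(1, 26, 2))
mean_scores = []
for k in candidate_ks:
    model = make_pipeline(
        KNNImputer(n_neighbors=5),
        KNeighborsClassifier(n_neighbors=k),
    )
    # Mean F1 across 5 folds for this K.
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    mean_scores.append(scores.mean())

# K with the highest mean cross-validated F1.
best_k = candidate_ks[int(np.argmax(mean_scores))]
```

With small datasets the score curve is often flat and noisy near its maximum, so a value "around" the best K is usually as precise as the data supports.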

Feature Selection:

When building a better version of our model, we wanted to be sure that we were building a model around factors that had the best predictive potential. We wanted to test out modeling with a different selection of features and investigate how our different values of accuracy, precision, recall, and F1 changed based on the different features we utilized. This function allowed us to put in a different selection of features and output the different measurement values.
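A helper of the kind described might look like the sketch below. The function name `evaluate_features`, the target column `Demented`, the held-out split, and the synthetic data are all hypothetical; only the idea of passing in a feature list and getting back accuracy, precision, recall, and F1 comes from the text.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline


def evaluate_features(data, features, target="Demented", k=13, seed=0):
    """Fit a KNN model on the given feature columns and return four metrics."""
    X_train, X_test, y_train, y_test = train_test_split(
        data[features], data[target],
        test_size=0.25, random_state=seed, stratify=data[target],
    )
    model = make_pipeline(KNNImputer(n_neighbors=5),
                          KNeighborsClassifier(n_neighbors=k))
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted, zero_division=0),
        "recall": recall_score(y_test, predicted, zero_division=0),
        "f1": f1_score(y_test, predicted, zero_division=0),
    }


# Synthetic data with a few of the feature names used in this project.
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "MMSE": rng.normal(27, 3, n),
    "nWBV": rng.normal(0.73, 0.04, n),
    "EDUC_Y": rng.normal(14, 3, n),
    "Sex_binary": rng.integers(0, 2, n),
})
# Made-up label loosely tied to MMSE, plus some missing nWBV values.
data["Demented"] = (data["MMSE"] + rng.normal(0, 1, n) < 26).astype(int)
data.loc[rng.random(n) < 0.1, "nWBV"] = np.nan

# Compare two feature selections side by side.
results = {
    "all features": evaluate_features(data, ["MMSE", "nWBV", "EDUC_Y", "Sex_binary"]),
    "excluding Sex_binary": evaluate_features(data, ["MMSE", "nWBV", "EDUC_Y"]),
}
```

Using the same `seed` for every call keeps the train/test split fixed, so differences between rows of `results` reflect the feature choice rather than a different split.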

Looking at using all features:

Excluding SES:

Excluding Sex_binary:

Excluding Sex_binary and SES:

Excluding Sex_binary and nWBV:

Excluding Sex_binary and MMSE:

Excluding Sex_binary and EDUC_Y:

Excluding Sex_binary, EDUC_Y, and SES:

Looking at a number of different combinations of features for a predictive model, we did see differences in our measurements of accuracy, precision, recall, and F1. For our evaluation, we looked at all 4 metrics, but put the most weight on the F1 score when selecting features. The highest score came from the eighth feature combination (excluding Sex_binary, EDUC_Y, and SES), which included specific brain examination metrics and excluded any demographic data. However, since one of our goals was to use demographic data to aid in the prediction of Alzheimer's, we chose the next best model, which excluded Sex_binary but included our remaining 7 features. We wanted to keep this in mind to determine whether we wanted to include all of the features in our final model.

While we were initially really interested in looking at the impacts of sex and education as primary predictive features, through a lot of our exploratory analysis and feature selection work, we recognize that several other values regarding brain volumetrics and other data may have stronger potential to act as predictive features. This is an interesting finding despite the uncertainty in some of our findings and our lack of expertise, because it suggests that some of the demographic factors we started with may be a dead end for research, and that scientists may want to focus more on brain-related predictive features such as pH, nWBV, and eTIV, as well as on the potential for scores such as MMSE to come in handy in early prediction of dementia.

In an improved version of our model, we want to build our model using the features we have selected as well as our selection for K. We did play around with trying to use mean imputation as well as trying to use IterativeImputer from Sci-kit learn (which is still experimental), however both of these approaches resulted in a decrease in our accuracy, recall, precision, and F1, so we decided to stick with KNNImputer. We also played around with n_neighbors to see the impact, but also decided to stick with our original imputation method. We recognize that there may be other ways to improve our imputation for the missing data, but we decided that for the purposes of this project we will stick with this.

By adjusting our K-value and selected features, we were able to slightly improve our model's accuracy, precision, recall, and F1 score. This model does not use cross-validation, so these numbers represent training performance, not test performance. To see the test performance, we can run our selection from above:
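As a hedged illustration of the training-versus-test comparison, the sketch below reports both sets of scores side by side using cross_validate with return_train_score=True. It assumes a k-nearest-neighbors classifier with K = 5 and uses synthetic data, so the numbers it prints are not our results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 300 rows, 7 placeholder features.
X, y = make_classification(n_samples=300, n_features=7, n_informative=4, random_state=0)

SCORING = ["accuracy", "precision", "recall", "f1"]
model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))

# return_train_score=True adds "train_<metric>" keys next to "test_<metric>".
cv = cross_validate(model, X, y, cv=5, scoring=SCORING, return_train_score=True)

summary = {
    metric: {
        "train": float(cv[f"train_{metric}"].mean()),
        "test": float(cv[f"test_{metric}"].mean()),
    }
    for metric in SCORING
}

for metric, scores in summary.items():
    print(f"{metric:>9}: train = {scores['train']:.3f}, test = {scores['test']:.3f}")
```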

As expected, our test scores are lower than our training scores. Despite the large margin of error, we chose this as our final model, given the many confounding variables involved in Alzheimer's and our limited access to data.

Insights and Final Thoughts

Going into our project, we were both most interested in how social and demographic patient features correlate with the onset of dementia symptoms and diagnosis. Through our exploratory research and modeling, we noticed that there is probably much more potential in research focused on specific brain and cognitive measurements, in order to understand how brain changes might be used for early diagnosis and, in the future, potentially for prevention and therapy. This project was difficult from the outset because even the most informed scientists are still struggling to understand this disease and how to predict its diagnosis. However, our analysis led us to some interesting conclusions and prompted us to look into features that we were not initially as interested in exploring. The real value of our conclusions is in suggesting where future research may be headed and in identifying where more complex modeling may be used in the study of Alzheimer's going forward.

To improve upon our current model, we would recommend:


References

Alzheimer’s disease and healthy aging indicators: Cognitive decline | Chronic disease and health promotion data & indicators. (n.d.). https://chronicdata.cdc.gov/Healthy-Aging/Alzheimer-s-Disease-and-Healthy-Aging-Indicators-C/jhd5-u276

Buckner, R. L., Head, D., Parker, J., Fotenos, A. F., Marcus, D. S., Morris, J. C., et al. (2004). A unified approach for morphometric and functional data analysis in young, old, and demented adults using automated atlas-based head size normalization: Reliability and validation against manual measurement of total intracranial volume. Neuroimage, 23, 724–738.

Marcus, D. S., Fotenos, A. F., Csernansky, J. G., Morris, J. C., & Buckner, R. L. (2010). Open Access Series of Imaging Studies: Longitudinal MRI data in nondemented and demented older adults. Journal of Cognitive Neuroscience, 22(12), 2677–2684. https://direct.mit.edu/jocn/article/22/12/2677/4983/Open-Access-Series-of-Imaging-Studies-Longitudinal

Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198.

Fotenos, A. F., Snyder, A. Z., Girton, L. E., Morris, J. C., & Buckner, R. L. (2005). Normative estimates of cross-sectional and longitudinal brain volume decline in aging and AD. Neurology, 64, 1032–1039.

Hollingshead, A. (1957). Two factor index of social position. New Haven, CT: Yale University Press.

Kavitha, C., Mani, V., Srividhya, S. R., Khalaf, O. I., & Tavera Romero, C. A. (2022). Early-stage Alzheimer’s disease prediction using machine learning models. Frontiers in Public Health, 10. https://www.frontiersin.org/articles/10.3389/fpubh.2022.853294

Marcus, D. S., Wang, T. H., Parker, J., Csernansky, J. G., Morris, J. C., & Buckner, R. L. (2007). Open Access Series of Imaging Studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. Journal of Cognitive Neuroscience, 19(9), 1498–1507. https://doi.org/10.1162/jocn.2007.19.9.1498

Morris, J. C. (1993). The Clinical Dementia Rating (CDR): Current version and scoring rules. Neurology, 43, 2412–2414.

MRI and Alzheimers. (n.d.). https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers

Seattle Alzheimer's disease brain cell atlas: Donor metadata. (n.d.). https://portal.brain-map.org/explore/seattle-alzheimers-disease/seattle-alzheimers-disease-brain-cell-atlas-download?edit&language=en

