fMRI Age Prediction Pipeline

This project predicts participant age from resting-state fMRI connectivity data. The pipeline includes preprocessing, PCA-based dimensionality reduction, and Ridge regression modeling.

Model Performance

Visual Results

Below are plots from the trained model, showing predicted vs actual ages and residual analysis.

[Figures: True vs Predicted Age; Residual Distribution; Residuals vs Predicted]

Pipeline Overview

1. Load & Flatten fMRI Connectivity

Each subject has a 200×200 functional connectivity matrix (Schaefer2018 atlas). Because the matrix is symmetric, I keep only the upper triangle, excluding the diagonal, which gives a 19,900-dimensional feature vector (200 × 199 / 2).
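
A minimal sketch of the flattening step, assuming each subject's matrix is already loaded as a square NumPy array. The function name flatten_connectivity and the synthetic matrix are illustrative only and are not taken from the project notebook.

```python
import numpy as np


def flatten_connectivity(matrix: np.ndarray) -> np.ndarray:
    """Return the strictly upper-triangular entries of a square matrix as a 1-D vector."""
    n = matrix.shape[0]
    # k=1 skips the main diagonal (each region's connectivity with itself)
    rows, cols = np.triu_indices(n, k=1)
    return matrix[rows, cols]


# Synthetic symmetric matrix standing in for one subject's connectivity (not real data)
rng = np.random.default_rng(0)
a = rng.standard_normal((200, 200))
conn = (a + a.T) / 2

features = flatten_connectivity(conn)
print(features.shape)  # (19900,)
```

Stacking one such vector per subject would give the feature matrix used in the later steps.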

2. Merge with Metadata

I join these brain features with participant metadata and use age as the regression target.
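
The join could look roughly like the sketch below. The column names participant_id and age, the edge_* feature names, and all values are hypothetical placeholders; the real metadata file may be organized differently.

```python
import numpy as np
import pandas as pd

# Hypothetical feature table: one row per subject (tiny, synthetic)
rng = np.random.default_rng(0)
features = pd.DataFrame(
    rng.standard_normal((3, 4)),
    columns=[f"edge_{i}" for i in range(4)],
)
features.insert(0, "participant_id", ["sub-01", "sub-02", "sub-03"])

# Hypothetical metadata: sub-03 has no recorded age, sub-04 has no scan
metadata = pd.DataFrame(
    {
        "participant_id": ["sub-01", "sub-02", "sub-03", "sub-04"],
        "age": [8.5, 12.0, np.nan, 10.0],
    }
)

# Inner join keeps only subjects present in both tables;
# validate raises an error if any participant ID is duplicated
merged = features.merge(
    metadata, on="participant_id", how="inner", validate="one_to_one"
)
# Subjects without an age cannot be used as regression targets
merged = merged.dropna(subset=["age"])

X = merged.drop(columns=["participant_id", "age"]).to_numpy()
y = merged["age"].to_numpy()
print(X.shape, y.shape)
```

Dropping rows with a missing age before modeling keeps X and y aligned row by row.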

3. Scale + PCA

I standardize the features and apply PCA to reduce the 19,900 dimensions to 100 principal components. Both the scaler and the PCA are fit on the training split only. The goal is to retain most of the variance while reducing noise and the risk of overfitting.
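
How much variance 100 components actually retain can be checked from the fitted PCA object. The sketch below uses synthetic data in place of the connectivity features, so the numbers it prints say nothing about the real dataset. Note that PCA cannot return more components than there are training subjects.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in: 300 "subjects" x 500 features driven by 20 latent factors plus noise
latent = rng.standard_normal((300, 20))
loadings = rng.standard_normal((20, 500))
X_demo = latent @ loadings + 0.5 * rng.standard_normal((300, 500))

X_scaled = StandardScaler().fit_transform(X_demo)
pca = PCA(n_components=100, random_state=42).fit(X_scaled)

# Running total of the fraction of variance explained by the first k components
cumulative = np.cumsum(pca.explained_variance_ratio_)
print(f"Variance retained by 100 components: {cumulative[-1]:.3f}")

# Smallest number of components reaching 90% of the variance, if any do
reached = np.flatnonzero(cumulative >= 0.90)
print("Components for 90%:", reached[0] + 1 if reached.size else "more than 100")
```

On real data, the same cumulative curve is one way to judge whether 100 components is too few or more than needed.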

4. Ridge Regression + Evaluation

A Ridge regression model is trained on the PCA components, with 20% of subjects held out as a validation split. Performance is measured on this unseen data using R², RMSE, and MAE.

Code Excerpt (Python)

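Below is a simplified version of the core training code: splitting the data, scaling, reducing dimensionality, and fitting and evaluating a Ridge regression model. It assumes X (the flattened connectivity features) and y (participant ages) are already loaded.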

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# X: flattened connectivity features, y: participant age
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Scale features
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)

# PCA for dimensionality reduction
pca = PCA(n_components=100, random_state=42)
X_train_pca = pca.fit_transform(X_train_scaled)
X_val_pca = pca.transform(X_val_scaled)

# Ridge regression model
model = Ridge(alpha=1.0, random_state=42)
model.fit(X_train_pca, y_train)

# Evaluation
y_pred = model.predict(X_val_pca)
r2 = r2_score(y_val, y_pred)
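# Take the square root of MSE directly; the squared=False argument
# was deprecated in scikit-learn 1.4 and removed in later releases
rmse = np.sqrt(mean_squared_error(y_val, y_pred))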
mae = mean_absolute_error(y_val, y_pred)

print("R²:", r2, "RMSE:", rmse, "MAE:", mae)
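
As an optional variation (not necessarily what the full notebook does), the same steps can be wrapped in a scikit-learn Pipeline and scored with k-fold cross-validation, which refits the scaler and PCA inside each training fold and gives a less split-dependent estimate than a single hold-out. The data, the number of components, and the fold count below are placeholders chosen for a small synthetic example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data (not the project's data): 120 subjects x 300 features
X_demo = rng.standard_normal((120, 300))
y_demo = 3.0 * X_demo[:, 0] + rng.standard_normal(120) + 12.0

# Scaling and PCA are refit on each training fold, so no validation data leaks in
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=20, random_state=42),
    Ridge(alpha=1.0),
)

cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(
    pipeline, X_demo, y_demo, cv=cv, scoring="neg_mean_absolute_error"
)
# scikit-learn reports errors as negative scores; flip the sign to get MAE
mae_per_fold = -scores
print("MAE per fold:", np.round(mae_per_fold, 2))
```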

View full notebook on GitHub