1. Exploratory Data Analysis (EDA)
a) Check Data Quality
Ensure no duplicate samples.
Identify missing values and decide on imputation strategies.
Examine distributions of features across omics types.
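A minimal sketch of the duplicate and missing-value checks above, using only the standard library. The data layout (a list of sample IDs plus one dict of feature values per sample, with None marking a missing value) and the names quality_report, ids and rows are assumptions made for illustration, not part of the original plan.

```python
from collections import Counter

def quality_report(sample_ids, rows):
    """Return duplicated sample IDs and the missing-value fraction per feature.

    rows is a list of dicts (feature name -> value); None marks a missing value.
    """
    duplicates = sorted(sid for sid, n in Counter(sample_ids).items() if n > 1)
    features = sorted({name for row in rows for name in row})
    missing = {}
    for name in features:
        n_missing = sum(1 for row in rows if row.get(name) is None)
        missing[name] = n_missing / len(rows)
    return duplicates, missing

# Hypothetical toy data: three samples, one duplicated ID, one missing value.
ids = ["S1", "S2", "S2"]
rows = [
    {"TP53": 1.2, "EGFR": 0.4},
    {"TP53": None, "EGFR": 0.9},
    {"TP53": 0.7, "EGFR": 1.1},
]
duplicates, missing = quality_report(ids, rows)
```

Features with a high missing fraction are candidates for removal rather than imputation; the cutoff is a project decision that this sketch does not make.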
b) Dimensionality Reduction
Use PCA or t-SNE to visualize high-dimensional omics data.
Compare feature variance across different omics types.
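The variance comparison can be sketched as below. It summarizes each omics layer by the median of its per-feature variances; the choice of the median as the summary and the nested-dict layout are assumptions. PCA is sensitive to feature scale, so a large gap between layers is a hint to scale before projecting.

```python
from statistics import median, pvariance

def variance_summary(omics):
    """Median per-feature variance for each omics layer.

    omics maps layer name -> {feature name -> list of values across samples}.
    """
    summary = {}
    for layer, features in omics.items():
        variances = [pvariance(values) for values in features.values()]
        summary[layer] = median(variances)
    return summary

# Hypothetical toy layers with very different scales.
omics = {
    "transcriptomics": {"g1": [1.0, 3.0, 5.0], "g2": [2.0, 2.0, 2.0]},
    "proteomics": {"p1": [0.0, 10.0, 20.0]},
}
summary = variance_summary(omics)
```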
c) Survival Data Inspection
Plot Kaplan-Meier survival curves stratified by key variables.
Check censoring distribution (percentage of censored vs. uncensored data).
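As a reference for what these two checks compute, here is a small standard-library sketch of the Kaplan-Meier product-limit estimate and the censoring fraction. It assumes events are coded 1 for an observed event and 0 for a censored sample; in practice a survival library would normally be used for plotting and confidence intervals.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    Only times with at least one observed event appear in the output.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve

def censoring_fraction(events):
    """Share of samples whose event was not observed."""
    return sum(1 for e in events if e == 0) / len(events)

# Hypothetical follow-up times (arbitrary units) and event indicators.
times = [5, 8, 8, 12, 15]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
censored = censoring_fraction(events)
```

To stratify by a variable, the same function can be called once per subgroup and the resulting curves compared.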
2. Feature Engineering
a) Standardization
Scale features within each omics type (Z-score standardization or Min-Max scaling).
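Both scaling options can be sketched per feature as follows. The function names and the dict-of-lists layout are illustrative assumptions. Constant features are mapped to zeros here to avoid division by zero, which is one possible convention.

```python
from statistics import mean, pstdev

def z_score(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature
    return [(v - lo) / (hi - lo) for v in values]

def scale_layer(features, method=z_score):
    """Apply one scaling method to every feature of a single omics layer."""
    return {name: method(values) for name, values in features.items()}

# Hypothetical single-feature layer.
scaled = scale_layer({"g1": [2.0, 4.0, 6.0]}, method=min_max)
```

When the scaled features feed a model that is evaluated by cross-validation, the scaling parameters should be estimated on the training folds only so that no information leaks from the held-out samples.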
b) Biological Feature Construction
Aggregate data by biological pathways (Reactome, KEGG).
Compute pathway activity scores.
Generate gene interaction network features.
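One simple way to compute a pathway activity score is the mean of the scaled values of a pathway's member genes in each sample. The sketch below assumes that definition; other scoring schemes exist and the plan does not say which one is intended. The gene and pathway names are placeholders, not real Reactome or KEGG identifiers.

```python
from statistics import mean

def pathway_activity(expression, pathways):
    """Mean expression of member genes per pathway and sample.

    expression: {gene -> list of (already scaled) values, one per sample}
    pathways:   {pathway name -> list of member genes}
    Pathways with no measured member genes are skipped.
    """
    n_samples = len(next(iter(expression.values())))
    scores = {}
    for name, genes in pathways.items():
        present = [g for g in genes if g in expression]
        if not present:
            continue
        scores[name] = [
            mean(expression[g][i] for g in present) for i in range(n_samples)
        ]
    return scores

# Placeholder genes A-E and pathways; two samples.
expression = {"A": [1.0, -1.0], "B": [3.0, 1.0], "C": [0.0, 2.0]}
pathways = {"pathway_1": ["A", "B"], "pathway_2": ["C", "D"], "pathway_3": ["E"]}
scores = pathway_activity(expression, pathways)
```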
c) Feature Selection
Recursive Feature Elimination (RFE) with Random Forest or Elastic Net.
Use SHAP values or permutation importance for interpretability.
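Permutation importance can be sketched without any modeling library: shuffle one feature column at a time and record how much the score drops. The predict function below is a hypothetical stand-in for a fitted model, and negative mean squared error is used as the score only to keep the example small; a survival model would use a survival metric instead.

```python
import random

def permutation_importance(predict, score, X, y, n_repeats=10, seed=0):
    """Mean drop in score when each feature column is shuffled.

    X is a list of rows (lists); predict maps rows -> predictions;
    score(y_true, y_pred) must be higher-is-better.
    """
    rng = random.Random(seed)
    baseline = score(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances

def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that higher is better."""
    return -sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def predict(rows):
    """Hypothetical stand-in for a fitted model: it only uses feature 0."""
    return [2.0 * row[0] for row in rows]

X = [[float(i), float((i * 7) % 5)] for i in range(8)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(predict, neg_mse, X, y)
```

Feature 1 is ignored by the stand-in model, so shuffling it leaves the score unchanged and its importance is zero, while feature 0 receives a positive importance.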
3. Baseline Model Training
a) Train individual models for each omics type to benchmark:
Genomics: Elastic Net Cox Model, XGBoost.
Transcriptomics: DeepSurv, Random Forest.
Proteomics: Support Vector Machines (SVM), XGBoost.
b) Evaluate models separately before combining multiomics data.
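For the Elastic Net Cox baseline, the objective being minimized can be written out directly. The sketch below evaluates the negative log partial likelihood (Breslow form, which ignores special handling of tied event times) plus an elastic net penalty in one common parameterization. The parameter names alpha and l1_ratio and the division by the number of samples are assumptions; libraries differ in these conventions. It only evaluates the loss and does not fit the model.

```python
from math import exp, log

def cox_elastic_net_loss(beta, X, times, events, alpha=0.1, l1_ratio=0.5):
    """Penalized negative log partial likelihood of a Cox model.

    beta: coefficient list; X: list of feature rows;
    events[i] is 1 for an observed event, 0 for censoring.
    """
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored samples only contribute through risk sets
        risk_set = sum(exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= eta[i] - log(risk_set)
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    penalty = alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)
    return nll / len(times) + penalty

# Hypothetical toy data: one feature that is higher for earlier events.
X = [[1.0], [0.0], [-1.0]]
times = [1, 2, 3]
events = [1, 1, 0]
loss_null = cox_elastic_net_loss([0.0], X, times, events)
loss_fit = cox_elastic_net_loss([1.0], X, times, events, alpha=0.0)
```

With all coefficients at zero the loss reduces to the sum of the log risk-set sizes divided by the number of samples, which gives a quick sanity check for any fitted baseline.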
4. Multiomics Model Development
a) Early Fusion: Concatenate all omics features into a single model.
b) Late Fusion: Train separate models and combine outputs via ensemble methods.
c) Multitask Learning: Jointly predict survival and tumor characteristics.
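Options a) and b) can be sketched as follows; multitask learning is not covered here. Early fusion is shown as concatenation over the samples shared by all layers. Late fusion is shown as an average of rank-scaled risk scores, which is just one simple ensemble rule chosen for illustration. All function names and the data layout are assumptions.

```python
def early_fusion(layers):
    """Concatenate per-sample feature vectors across omics layers.

    layers maps layer name -> {sample ID -> feature list}; only samples
    present in every layer are kept. Layers are joined in name order.
    """
    shared = set.intersection(*(set(samples) for samples in layers.values()))
    return {
        sid: [x for name in sorted(layers) for x in layers[name][sid]]
        for sid in sorted(shared)
    }

def rank_scale(scores):
    """Map scores to ranks in [0, 1] so models on different scales are comparable.

    Assumes at least two samples; ties are broken by input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    for rank, i in enumerate(order):
        ranks[i] = rank / (len(scores) - 1)
    return ranks

def late_fusion(model_scores):
    """Average rank-scaled risk scores from separately trained models."""
    scaled = [rank_scale(s) for s in model_scores]
    return [sum(col) / len(col) for col in zip(*scaled)]

# Hypothetical inputs: two layers with partially overlapping samples,
# and risk scores from two models on three samples.
layers = {
    "genomics": {"S1": [1, 0], "S2": [0, 1]},
    "proteomics": {"S2": [0.5], "S3": [0.9]},
}
fused = early_fusion(layers)
combined = late_fusion([[0.1, 0.9, 0.5], [10.0, 30.0, 20.0]])
```

Early fusion drops every sample that is missing from any layer, so the overlap between layers is worth checking before choosing between the two strategies.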
5. Model Evaluation
a) Survival Metrics: Concordance Index (C-index), NDCG, Kaplan-Meier plots.
b) Feature Importance: Identify key transcription factors (TFs), pathways, and omics contributions.
c) Cross-Validation: Stratified k-fold validation.
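The C-index and the stratified split can be sketched with the standard library. The C-index below is Harrell's version and assumes that a higher risk score means shorter expected survival. The plan does not say what the folds are stratified on; stratifying on event status, so that each fold has a similar censoring rate, is an assumption. The fold assignment is deterministic here, and a real pipeline would usually shuffle first.

```python
def concordance_index(times, events, risk):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when i has an observed event and
    times[i] < times[j]. Tied risk scores count as 0.5.
    Assumes at least one comparable pair exists.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

def stratified_folds(events, k):
    """Assign a fold index to each sample, balancing events and censored cases."""
    folds = [0] * len(events)
    for label in (0, 1):
        members = [i for i, e in enumerate(events) if e == label]
        for position, i in enumerate(members):
            folds[i] = position % k
    return folds

# Hypothetical toy data.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
folds = stratified_folds([1, 1, 0, 1, 0, 0], k=2)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking of the comparable pairs.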
6. Biological Interpretation
a) Use enrichment analysis to understand top predictive features.
b) Compare feature importance across omics types.
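Over-representation analysis is one common form of enrichment analysis, and its core test can be sketched with a one-sided hypergeometric p-value. The plan does not specify which enrichment method is intended, so this choice is an assumption, as are the placeholder gene names.

```python
from math import comb

def hypergeom_enrichment(top_features, pathway, background):
    """One-sided hypergeometric p-value for over-representation.

    Returns P(X >= k), where k is the overlap between top_features and
    pathway, with both sets restricted to the background.
    """
    N = len(background)
    K = len(pathway & background)
    n = len(top_features & background)
    k = len(top_features & pathway & background)
    total = comb(N, n)
    tail = sum(
        comb(K, x) * comb(N - K, n - x) for x in range(k, min(K, n) + 1)
    )
    return tail / total

# Placeholder gene sets: 10 background genes, a 3-gene pathway,
# and 3 top predictive features, 2 of which fall in the pathway.
background = {f"g{i}" for i in range(10)}
pathway = {"g0", "g1", "g2"}
top_features = {"g0", "g1", "g5"}
p_value = hypergeom_enrichment(top_features, pathway, background)
```

When many pathways are tested, the resulting p-values need a multiple-testing correction before any pathway is reported as enriched.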