dataset_info:
  dataset_name: african-cp-synthetic
  version: 1.0.0
  description: |
    Synthetic cerebral palsy dataset for African populations, generated using
    literature-informed probabilistic modeling. Includes 9 datasets with 20,325
    total samples across various configurations (training, balanced, high-risk,
    test sets).
  homepage: https://huggingface.co/datasets/[your-username]/african-cp-synthetic
  license: cc-by-nc-4.0
  citation: |
    African Cerebral Palsy Synthetic Dataset (2025)
    Literature-informed probabilistic generation for CP detection
    Version 1.0, November 2025
  features:
    - name: id
      dtype: int64
      description: Unique sample identifier
    - name: gestational_age
      dtype: float64
      description: Gestational age in weeks (bimodal distribution)
    - name: birth_weight
      dtype: float64
      description: Birth weight in kilograms (conditional on gestational age)
    - name: is_sga
      dtype: bool
      description: Small for gestational age (OR 2.43)
    - name: birth_asphyxia
      dtype: bool
      description: Birth asphyxia (47.6% in African CP cases)
    - name: neonatal_seizures
      dtype: bool
      description: Neonatal seizures
    - name: hyperbilirubinemia
      dtype: bool
      description: Hyperbilirubinemia/kernicterus (23.8% in African CP)
    - name: neonatal_infection
      dtype: bool
      description: CNS infections
    - name: maternal_infection
      dtype: bool
      description: Maternal infections during pregnancy
    - name: preclampsia
      dtype: bool
      description: Pre-eclampsia (protective effect)
    - name: malaria_with_seizures
      dtype: bool
      description: Malaria with seizures (African context only)
    - name: tuberculous_meningitis
      dtype: bool
      description: Tuberculous meningitis (African context only)
    - name: head_control_age
      dtype: float64
      description: Age at head control in months
    - name: sitting_age
      dtype: float64
      description: Age at independent sitting in months
    - name: crawling_age
      dtype: float64
      description: Age at crawling in months (null if not achieved)
    - name: walking_age
      dtype: float64
      description: Age at walking in months (null if not achieved)
    - name: epilepsy
      dtype: bool
      description: Epilepsy diagnosis (40% in African CP cases)
    - name: feeding_difficulties
      dtype: bool
      description: Feeding difficulties (55% in CP cases)
    - name: visual_impairment
      dtype: bool
      description: Visual impairment (40% in African CP)
    - name: hearing_impairment
      dtype: bool
      description: Hearing impairment (22% in CP cases)
    - name: speech_impairment
      dtype: bool
      description: Speech impairment (45% in CP cases)
    - name: intellectual_disability
      dtype: bool
      description: Intellectual disability (50% in CP cases)
    - name: tone_abnormality
      dtype: string
      description: Muscle tone abnormality type (hypertonia/hypotonia/variable/mixed)
    - name: has_cp
      dtype: bool
      description: TARGET - Cerebral palsy diagnosis
    - name: cp_type
      dtype: string
      description: CP type (spastic/ataxic/dystonic/choreoathetoid/mixed)
    - name: cp_subtype
      dtype: string
      description: CP subtype (bilateral/unilateral/generalized/variable)
    - name: gmfcs_level
      dtype: int64
      description: GMFCS severity level (1-5)
    - name: cp_probability_score
      dtype: float64
      description: Calculated CP probability score (0-1)
  splits:
    - name: train_1k
      num_samples: 1000
      file: africa_cp_train_1000.csv
    - name: train_5k
      num_samples: 5000
      file: africa_cp_train_5000.csv
    - name: train_10k
      num_samples: 10000
      file: africa_cp_train_10000.csv
    - name: balanced
      num_samples: 1000
      file: africa_cp_balanced_1000.csv
    - name: preterm
      num_samples: 2000
      file: africa_cp_preterm_2000.csv
    - name: cp_only
      num_samples: 500
      file: africa_cp_cases_only_500.csv
    - name: test
      num_samples: 2000
      file: africa_cp_test_2000.csv
  task_categories:
    - tabular-classification
    - medical
  task_ids:
    - binary-classification
    - medical-diagnosis
    - health-classification
  tags:
    - cerebral-palsy
    - medical
    - healthcare
    - africa
    - synthetic-data
    - pediatrics
    - neurology
    - low-resource
    - early-detection
  size_categories:
    - 10K