MAUREEN MUTHONI
Open To Work

Nairobi

Work Preference

Job Search Status

Open to work
Desired start date: Flexible

Desired Job Title

Junior Data Scientist

Work Type

Full Time, Part Time, Gig Work

Location Preference

Remote, Hybrid
Location: Nairobi, KE
Open to relocation: Yes

Salary Range

51000/yr - 200000/yr

Important To Me

Career advancement, Work-life balance, Flexible work hours, Healthcare benefits, Work from home option, Paid time off, Paid sick leave, Personal development programs

Summary

Recent Data Science & AI graduate with a multidisciplinary background spanning microbiology, business analytics, and machine learning engineering. Experienced in building end-to-end ML pipelines, retrieval augmented generation (RAG) systems, and data-driven tools for healthcare and business intelligence. Adept at translating complex data into clear, actionable insights for both technical and non-technical stakeholders.

Overview

1 Certification
1 year of professional experience

Work History

Research Intern

Kenya Medical Research Institute
09.2024 - 09.2025
  • Supported development of analytical tools and interactive visualisations for healthcare decision-making
  • Performed exploratory data analysis (EDA) and insight generation for business and operational initiatives
  • Collaborated across technical and business stakeholders to translate complex datasets into clear, actionable recommendations
  • Contributed to dashboard development using Excel and Power BI to track operational KPIs
  • Assisted in data collection and analysis using statistical software to ensure accuracy.
  • Collaborated with team members to develop presentations summarizing research findings.
  • Managed project timelines and coordinated tasks to meet deadlines effectively.
  • Participated in weekly meetings, providing updates on progress and challenges faced.
  • Maintained organized records of research activities, enhancing project workflow efficiency.
  • Completed research, compiled data, updated spreadsheets, and produced timely reports.
  • Streamlined data entry processes for increased efficiency and accuracy in results reporting.

Education

Data Science & Artificial Intelligence

Luxdev Hq, Nairobi, Kenya
06-2026
  • Intensive program covering Excel, Power BI, SQL, Python, Machine Learning, Deep Learning, NLP, RAG systems, and production ML deployment
  • Built multiple end-to-end projects integrating Python, vector databases, and LLM-based inference pipelines

Bachelor of Science - Microbiology And Biotechnology

University of Nairobi, Kenya
09-2021

  • Developed strong foundations in quantitative research, scientific data analysis, and experimental design

Skills

  • Languages: Python, SQL
  • ML / AI: scikit-learn, TensorFlow, PyTorch, RAG pipelines, LangChain, vector embeddings, hyperparameter tuning, NLP, neural networks
  • Databases: ChromaDB, PostgreSQL, MySQL, SQLite
  • Visualisation: Power BI, Excel (PivotTables, Power Query), Matplotlib, Plotly, Seaborn
  • Frameworks & Tools: FastAPI, Jupyter Notebook, Git/GitHub, VS Code
  • Domain: Healthcare analytics, business intelligence, scientific data analysis, document intelligence
  • Soft Skills: Stakeholder communication, cross-functional collaboration, problem-solving, attention to detail

Additional Information

Healthcare Data Pipeline & Machine Learning System (Data Engineering / ML)

  • Python, PostgreSQL, XGBoost, FastAPI, Airflow, Render
  • Built an end-to-end healthcare analytics pipeline on a synthetic dataset of 10,000 patient records (Kaggle), covering data cleaning, storage, modelling, and deployment
  • Designed and populated a PostgreSQL database (Aiven Cloud) with both raw and cleaned patient data, ensuring data integrity across tables
  • Trained XGBoost and Logistic Regression classifiers to predict patient test results (Normal / Abnormal / Inconclusive), evaluated with accuracy, precision, recall, F1-score, and confusion matrix
  • Automated weekly model retraining via an Apache Airflow DAG scheduled every Saturday, pulling the latest data from PostgreSQL
  • Deployed a FastAPI prediction endpoint (POST /predict) to Render, making the model publicly accessible via a live URL

AI-Powered Adaptive Learning System (ML / AI Engineering)

  • Python, RAG, LLMs, FastAPI
  • Designed and developed an intelligent learning platform combining a frontend interface with backend ML systems
  • Integrated machine learning models with RAG pipelines to personalise content delivery based on learner behaviour and progress
  • Developed scalable backend architecture integrating APIs, vector embeddings, and LLM inference for real-time adaptive responses

Customer Lifetime Value (CLV) Prediction System (ML / Predictive Analytics)

  • Architected an end-to-end predictive modelling pipeline for revenue forecasting with advanced feature engineering
  • Optimised model performance via cross-validation and hyperparameter tuning; delivered interpretable outputs to support customer retention strategy

Retrieval-Augmented Generation (RAG) Systems (AI / NLP Engineering)

  • Designed scalable semantic retrieval systems converting multi-document corpora into embedding-powered search engines using ChromaDB and LangChain
  • Built modular ingestion pipelines (PDF parsing, chunking, embedding, indexing) for 50+ page document datasets; improved contextual accuracy over keyword search via dense vector similarity

Production-Oriented ML Application: Travel AI Assistant (ML Engineering / Backend)

  • FastAPI, Python, Vector DB
  • Built a FastAPI backend integrating retrieval and inference layers for a deployment-ready ML application
  • Structured ingestion, retrieval, and query logic into modular, maintainable components following production best practices

Accomplishments

  • Successfully transitioned from a Microbiology and Biotechnology background into Data Science and AI Engineering, combining scientific research expertise with advanced analytics.
  • Built multiple production-oriented machine learning applications across healthcare, education, sports analytics, and document intelligence domains.
  • Developed automated ML pipelines featuring cloud databases, FastAPI deployment, Apache Airflow orchestration, and scheduled model retraining.
  • Applied machine learning and statistical analysis techniques to solve real-world problems using healthcare and scientific datasets.

Certification

  • Data Science & Artificial Intelligence — Luxdev Hq
  • Data Analysis with Python — Coursera / IBM
  • International Certificate of Digital Literacy — Mahanaim College

LANGUAGES

English (Fluent)
Swahili (Native)

Timeline

Research Intern - Kenya Medical Research Institute
09.2024 - 09.2025
Luxdev Hq - Data Science & Artificial Intelligence
University of Nairobi - Bachelor of Science, Microbiology And Biotechnology