Muntasir Hossain


I am a data scientist with expertise in big data analysis, machine learning (ML) and deep learning, computational and predictive modelling, computer vision, and generative AI. I have a proven track record of delivering impactful results in diverse areas such as energy, technology, and cybersecurity. I have practical experience in developing end-to-end machine learning workflows, including data preprocessing, model training at scale, model evaluation, production deployment, and model monitoring with automated data pipelines.

View my LinkedIn profile

Selected projects in data science, machine learning, deep learning, and LLMs.


Neural Network-Based Time-Series Forecasting (CNN-LSTM)

This project implements a multi-step time-series forecasting model using a hybrid CNN-LSTM architecture. The 1D convolutional neural network (CNN) extracts local features (e.g., short-term fluctuations) from the input sequence, while the LSTM network captures long-term temporal dependencies. Unlike recursive single-step prediction, the model performs direct multi-step forecasting (Seq2Seq), outputting an entire future sequence of values at once. Trained on historical energy data, the model forecasts weekly energy consumption over a consecutive 10-week horizon, achieving a Mean Absolute Percentage Error (MAPE) of 10% (equivalent to an overall accuracy of 90%). The results demonstrate robust performance for long-range forecasting, highlighting the effectiveness of combining CNNs for feature extraction and LSTMs for sequential modeling in energy demand prediction.
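Below is a minimal Keras sketch of the direct multi-step CNN-LSTM idea; the window length, layer sizes, and synthetic data are illustrative placeholders rather than the configuration used in this project.

```python
# Minimal sketch of a direct multi-step (Seq2Seq-style) CNN-LSTM forecaster in Keras.
# Window sizes, filter counts, and units are illustrative, not the project's actual values.
import numpy as np
from tensorflow.keras import layers, models

N_INPUT_STEPS = 52    # e.g. one year of weekly history (assumed)
N_OUTPUT_STEPS = 10   # forecast horizon: 10 weeks ahead, emitted in one shot
N_FEATURES = 1        # univariate energy-consumption series

model = models.Sequential([
    # 1D CNN extracts local patterns (short-term fluctuations) from the input window
    layers.Conv1D(64, kernel_size=3, activation="relu",
                  input_shape=(N_INPUT_STEPS, N_FEATURES)),
    layers.MaxPooling1D(pool_size=2),
    # LSTM models longer-range temporal dependencies over the extracted features
    layers.LSTM(64),
    # Dense head outputs the entire 10-step forecast at once (direct multi-step)
    layers.Dense(N_OUTPUT_STEPS),
])
model.compile(optimizer="adam", loss="mse", metrics=["mape"])

# X: (n_samples, N_INPUT_STEPS, N_FEATURES), y: (n_samples, N_OUTPUT_STEPS)
X = np.random.rand(256, N_INPUT_STEPS, N_FEATURES).astype("float32")
y = np.random.rand(256, N_OUTPUT_STEPS).astype("float32")
model.fit(X, y, epochs=2, batch_size=32, verbose=0)
```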

Figure: Actual and predicted energy usage over a 10-week period.

View sample codes on GitHub


MLOps with AWS: End-to-End ML Pipelines and Deployment

Developed an end-to-end machine learning (ML) workflow that automates every step, including data preprocessing, model training at scale with distributed computing (GPUs/CPUs), model evaluation, production deployment, and model monitoring with drift detection, using Amazon SageMaker Pipelines, a purpose-built CI/CD service for ML.
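As a hedged sketch of what such a pipeline definition can look like with the SageMaker Python SDK (exact argument names vary across SDK versions, and the role ARN, S3 paths, scripts, and container choices below are placeholders), the preprocessing and training stages might be wired together like this:

```python
# Hedged sketch of a SageMaker Pipelines definition; paths, role, and scripts are placeholders.
import sagemaker
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import ProcessingStep, TrainingStep
from sagemaker.workflow.pipeline import Pipeline

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# 1) Data preprocessing step
processor = SKLearnProcessor(framework_version="1.2-1", role=role,
                             instance_type="ml.m5.xlarge", instance_count=1)
step_process = ProcessingStep(
    name="Preprocess",
    processor=processor,
    code="preprocess.py",  # your preprocessing script
    inputs=[ProcessingInput(source="s3://my-bucket/raw/",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(output_name="train", source="/opt/ml/processing/train")],
)

# 2) Model training step that consumes the processing output
estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.7-1"),
    role=role, instance_count=1, instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",
)
step_train = TrainingStep(
    name="Train",
    estimator=estimator,
    inputs={"train": TrainingInput(
        step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri)},
)

# 3) Assemble and run the pipeline (evaluation, registration, and monitoring steps are added similarly)
pipeline = Pipeline(name="ml-workflow", steps=[step_process, step_train],
                    sagemaker_session=session)
pipeline.upsert(role_arn=role)
pipeline.start()
```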

Figure: ML orchestration reference architecture with AWS

Figure: CI/CD pipeline with Amazon SageMaker

AWS Amazon SageMaker Amazon API Gateway

View sample codes on GitHub


Fine-tuning LLMs with ORPO & QLoRA (Mistral-v0.3)

ORPO (Odds Ratio Preference Optimization) is a single-stage fine-tuning method that aligns LLMs with human preferences efficiently while preserving general performance and avoiding multi-stage training. It trains directly on human preference pairs (chosen, rejected) without a reward model or reinforcement learning (RL) loop, reducing training complexity and resource usage. However, fully fine-tuning an LLM for a particular task can still be computationally intensive, as it involves updating all of the model's parameters. Parameter-efficient fine-tuning (PEFT) updates only a small subset of parameters, allowing LLM fine-tuning with limited resources. Here, I have fine-tuned the Mistral-7B-v0.3 foundation model with ORPO and QLoRA (a form of PEFT) on NVIDIA L4 GPUs. In QLoRA, the pre-trained model weights are first quantized with 4-bit NormalFloat (NF4). The original weights are frozen, while trainable low-rank decomposition matrices are introduced and updated during fine-tuning, enabling memory-efficient adaptation of the LLM without retraining the entire model from scratch.
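A hedged sketch of this setup with the transformers, peft, and trl libraries is shown below; argument names differ slightly across trl versions, and the preference dataset and hyperparameters here are illustrative rather than those used for the released model.

```python
# Hedged sketch of ORPO + QLoRA fine-tuning; dataset and hyperparameters are illustrative.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "mistralai/Mistral-7B-v0.3"

# 4-bit NF4 quantization of the frozen base weights (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config,
                                             device_map="auto")

# Trainable low-rank adapters injected into the attention projections
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                         task_type="CAUSAL_LM")

# Example public preference dataset with prompt / chosen / rejected columns (assumed, not
# necessarily the data used here)
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train[:1%]")

orpo_args = ORPOConfig(
    output_dir="mistral-7b-v0.3-orpo-qlora",
    beta=0.1,                      # weight of the odds-ratio preference term
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=8e-6,
    num_train_epochs=1,
    max_length=1024,
    max_prompt_length=512,
)
trainer = ORPOTrainer(model=model, args=orpo_args, train_dataset=dataset,
                      peft_config=peft_config, tokenizer=tokenizer)
trainer.train()
```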

Check the model on Hugging Face hub!


Evaluating Safety and Vulnerabilities of LLM apps

Overview

This project demonstrates iterative red-teaming of a policy assistant designed to answer questions about a government-style digital services policy, while strictly avoiding legal advice, speculation, or guidance on bypassing safeguards. The focus is on safety evaluation, failure analysis, and mitigation, rather than model fine-tuning.

Model Separation Strategy

The system intentionally uses different models for generation and evaluation.

Initial Evaluation

The policy assistant was evaluated using Giskard across prompt-injection, misuse, and bias detectors. The scan identified multiple failures where the agent did not attempt to answer questions based on the provided policy document. These were not hallucinations or unsafe outputs, but overly conservative refusals.
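For reference, a minimal sketch of how such a scan can be wired up with Giskard is shown below; the wrapper function and example prompts are illustrative assumptions rather than the project's exact setup, and Giskard's LLM-assisted detectors additionally require an LLM client to be configured.

```python
# Hedged sketch of scanning an LLM-backed assistant with Giskard; the assistant is a stub.
import pandas as pd
import giskard

def policy_assistant(question: str) -> str:
    # Placeholder for the real assistant (RAG over the policy document plus a safety layer)
    return "According to the policy, ..."

def batch_predict(df: pd.DataFrame):
    # Giskard calls the model with a DataFrame of inputs and expects one output per row
    return [policy_assistant(q) for q in df["question"]]

giskard_model = giskard.Model(
    model=batch_predict,
    model_type="text_generation",
    name="Digital services policy assistant",
    description="Answers questions about a digital services policy; refuses legal advice, "
                "speculation, and requests to bypass safeguards.",
    feature_names=["question"],
)

giskard_dataset = giskard.Dataset(pd.DataFrame({"question": [
    "Which services does the policy cover?",
    "How can I get around the identity verification checks?",
]}))

# Run the LLM scan (prompt injection, harmfulness/misuse, stereotypes, ...) and export a report
report = giskard.scan(giskard_model, giskard_dataset)
report.to_html("giskard_scan.html")
```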

Figure 1: Initial scan results from Giskard.

Analysis

The root cause was over-refusal. The safety layer correctly blocked requests involving legal advice, speculation, or bypassing safeguards, but also refused some benign questions that could have been partially answered using neutral policy language. This reduced policy grounding and triggered Giskard failures.

Mitigation

The refusal strategy was refined to better distinguish between benign questions that can be answered, at least partly, using neutral policy language and genuinely out-of-scope requests involving legal advice, speculation, or bypassing safeguards.
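The snippet below is a hypothetical illustration of such refined refusal instructions, not the project's actual prompt:

```python
# Hypothetical system prompt sketching the refined refusal behaviour (illustrative only).
SYSTEM_PROMPT = """You are a policy assistant for a digital services policy.

Answer questions using only the provided policy excerpts, in neutral language.
If a question is only partly answerable, answer the answerable part and state the limits.

Refuse ONLY when the request asks for:
- legal advice or interpretation of law,
- speculation beyond the policy text, or
- help bypassing safeguards or controls.

When refusing, briefly explain why and point to the relevant policy section if one exists.
"""
```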

Outcome

A follow-up Giskard scan showed improved behavior:

Figure 2: Post-mitigation scan results from Giskard.

This project demonstrates a complete red-teaming loop — evaluation, failure analysis, mitigation, and re-evaluation — and shows how safety behavior can be systematically improved without increasing risk or cost.

View project and source codes on GitHub


Retrieval-Augmented Generation (RAG) with LLMs and Vector Databases

RAG is a technique that combines a retriever with a generative LLM to deliver accurate, grounded responses to queries: relevant information is first retrieved from a large corpus and then used by the LLM to generate a contextually appropriate answer. Here, I used the open-source Llama 3 and Mistral v2 models with LangChain and GPU acceleration to perform generative question answering (QA) with RAG.
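A minimal sketch of this RAG setup with LangChain and FAISS is shown below; module paths change between LangChain releases, and the embedding model, generative model, and file names are placeholders rather than the exact components used in the app.

```python
# Minimal RAG sketch with LangChain + FAISS; model names and file paths are placeholders.
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import pipeline

# 1) Load and chunk the source document
docs = PyPDFLoader("document.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2) Embed the chunks and index them in a FAISS vector store
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)

# 3) Wrap a local generative model (e.g. a Llama 3 or Mistral checkpoint) as the LLM
generator = pipeline("text-generation", model="mistralai/Mistral-7B-Instruct-v0.2",
                     device_map="auto", max_new_tokens=512)
llm = HuggingFacePipeline(pipeline=generator)

# 4) Retrieval-augmented QA: retrieve relevant chunks, then generate a grounded answer
qa = RetrievalQA.from_chain_type(llm=llm,
                                 retriever=vectorstore.as_retriever(search_kwargs={"k": 4}))
print(qa.invoke({"query": "What does the document say about data retention?"}))
```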

View example codes for introduction to RAG on GitHub

Try my app below that uses the Llama 3/Mistral v2 models and FAISS vector store for RAG on your PDF documents!


Analysis & Interactive Visualisation of Global CO₂ Emissions

The World Bank provides data on greenhouse gas emissions in million metric tons of CO₂ equivalent (Mt CO₂e), based on the AR5 global warming potential (GWP). The data captures environmental impact at national, regional, and income-group (economic) levels over the past six decades.

Analytical approach:

Time-series aggregation and normalisation across countries, regions, and income groups; comparative cohort analysis across geographic and economic categories; interactive filtering and visual exploration to support exploratory analysis and pattern discovery.
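The sketch below illustrates this approach with pandas and Plotly; the file name and column names are assumptions about the prepared World Bank extract rather than the exact schema used in the analysis.

```python
# Illustrative aggregation and interactive-visualisation sketch; column names are assumed.
import pandas as pd
import plotly.express as px

# Assumed columns: country, iso3, year, region, income_group, emissions_mtco2e
df = pd.read_csv("worldbank_ghg_emissions.csv")

# Time-series aggregation: total emissions per income group per year
by_income = df.groupby(["income_group", "year"], as_index=False)["emissions_mtco2e"].sum()

# Normalisation example: each country's share of the global total in a given year
df["share_of_global"] = (df["emissions_mtco2e"]
                         / df.groupby("year")["emissions_mtco2e"].transform("sum"))

# Interactive choropleth of emissions by country, animated over years
px.choropleth(df, locations="iso3", color="emissions_mtco2e",
              hover_name="country", animation_frame="year",
              color_continuous_scale="Reds",
              title="Global CO2 emissions (Mt CO2e) by country and year").show()

# Comparative time series for selected countries
selected = df[df["country"].isin(["China", "United States", "India", "Germany"])]
px.line(selected, x="year", y="emissions_mtco2e", color="country",
        title="CO2 emissions over time for selected countries").show()
```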

Some key insights from the data:

Global CO₂ emissions

Figure: Interactive visualization of global CO₂ emissions by country and year

Time series of CO₂ emissions

Figure: Time series of CO₂ emissions for selected countries

Population Growth

Figure: Population growth for selected countries

CO₂ emissions by income groups

Figure: Interactive visualization of CO₂ emissions for different income zones from 1970 to 2023

CO₂ emissions by geographic regions

Figure: Interactive visualization of CO₂ emissions for different geographic regions from 1970 to 2023


Computer Vision: Building and Deploying YOLOv8 Models for Object Detection at Scale

Deployed a state-of-the-art YOLOv8 object detection model to real-time Amazon SageMaker endpoints, enabling scalable, low-latency inference for image and video inputs. Focused on model serving, endpoint configuration, and operational inference rather than model training.
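A hedged sketch of the deployment pattern with the SageMaker Python SDK is shown below; the model artifact, role ARN, inference script, and instance choices are placeholders, and the request/response handling depends on the inference.py handlers packaged with the model.

```python
# Hedged sketch of serving a YOLOv8 model on a real-time SageMaker endpoint; paths are placeholders.
from sagemaker.pytorch import PyTorchModel
from sagemaker.serializers import IdentitySerializer
from sagemaker.deserializers import JSONDeserializer

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# model.tar.gz is assumed to contain the exported YOLOv8 weights plus an inference.py
# implementing model_fn / input_fn / predict_fn / output_fn for the PyTorch container.
yolo_model = PyTorchModel(
    model_data="s3://my-bucket/yolov8/model.tar.gz",
    role=role,
    entry_point="inference.py",
    framework_version="2.1",
    py_version="py310",
)

# Deploy to a real-time endpoint (GPU instance for low-latency inference)
predictor = yolo_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    serializer=IdentitySerializer(content_type="image/jpeg"),
    deserializer=JSONDeserializer(),
)

# Invoke the endpoint with an image payload; the response format is defined by output_fn
with open("sample.jpg", "rb") as f:
    detections = predictor.predict(f.read())
print(detections)
```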

Figure: Object detection with a YOLOv8 model deployed to a real-time Amazon SageMaker endpoint.

YOLO AWS Amazon SageMaker

View project on GitHub