Muhammad Fikri Kafilli Indonesia

About Me

Hi, I am Muhammad Fikri Kafilli from Indonesia. I work in artificial intelligence and hold a Bachelor's degree in Computer Science from Universitas Pendidikan Indonesia.

My passion lies in using data to solve real-world problems, and I find great satisfaction in applying statistical insights to drive meaningful outcomes. I keep up with new technologies and methodologies in AI, and I look for opportunities to learn from others in the AI and data science community.

Feel free to connect with me on LinkedIn!

Achievements & Certifications

Dev Certified for Machine Learning with TensorFlow

A comprehensive certification validating expertise in Machine Learning. Covers TensorFlow skills, Neural Network Fundamentals, Image Classification, Natural Language Processing, and Time Series prediction.

View Credential

Bangkit Academy 2024
Machine Learning Cohort

Bangkit is a career readiness program led by Google and delivered with support from industry experts from GoTo and Traveloka. It provides intensive training in Machine Learning technology, essential soft skills, and professional English.

View Credential

IDCamp 2024
Data Scientist

Completed the Expert Data Scientist learning path by Indosat Ooredoo Hutchison & Dicoding. Mastered advanced data analysis and predictive modeling.

View Credential

Hology 7.0
Data Mining Finalist

Secured a Finalist position in the Data Mining category at Hology 7.0. Developed a multi-label classification system capable of simultaneously detecting clothing types and their colors.

View Credential

Projects

Real-time Two-stream Violence Detection in Videos

TensorFlow ConvLSTM RTMO (Pose) OpenCV

A robust dual-stream deep learning system designed for real-time surveillance. By combining Skeleton data with Frame Differences, it effectively detects violence while preserving privacy and significantly reducing computational load through smart motion filtering.
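The motion filtering idea can be illustrated with a minimal sketch in plain Python. This is not the project's actual code: the function names, the toy frame format (nested lists of grayscale values), and the threshold of 5.0 are all assumptions made for illustration. The idea is that frames showing little change from the previous frame are skipped before any heavier model runs.

```python
from typing import List, Sequence

# Assumption: a frame is a grayscale image given as rows of 0-255 pixel values.
Frame = Sequence[Sequence[int]]


def mean_abs_diff(prev: Frame, curr: Frame) -> float:
    """Average absolute pixel difference between two same-sized frames."""
    total = 0
    count = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def select_motion_frames(frames: List[Frame], threshold: float = 5.0) -> List[int]:
    """Return indices of frames that differ enough from their predecessor.

    The threshold is a hypothetical value; a real system would tune it.
    """
    kept = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) >= threshold:
            kept.append(i)
    return kept
```

Only the frames returned by `select_motion_frames` would then be passed on to the pose and frame-difference streams, which is where the reduction in computational load would come from.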

View Project

Job-CV Matching System with Vector Search

Sentence-BERT (SBERT) ReactJS (Vite) Flask Zilliz Milvus Docker GCP

This project is an AI-powered job matching web application aimed at helping job seekers, especially in the IT industry, find suitable job opportunities based on their skills. The system leverages vector search technology to match resumes with job descriptions effectively.
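As a rough illustration of the matching step, the sketch below ranks job vectors by cosine similarity to a resume vector. It is a simplified stand-in, not the project's implementation: the three-dimensional vectors are toy values rather than real SBERT embeddings, the function names are hypothetical, and a vector database such as Milvus would perform this search at scale instead of a Python loop.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_jobs(
    resume_vec: Sequence[float],
    job_vecs: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k job ids most similar to the resume, best match first."""
    scored = [
        (job_id, cosine_similarity(resume_vec, vec))
        for job_id, vec in job_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Calling `top_k_jobs(resume_vec, job_vecs, k=5)` would return the five closest job ids with their similarity scores.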

View Project

AI-Clipper: Automated Video Highlighting & Rendering Pipeline

Multi-Agent LLM GCP (Firestore & GCS) FastAPI React (TypeScript) Whisper (Local STT)

An end-to-end, AI-powered video processing platform built for high-volume content curation. It features a Dual-Processing Architecture: an automated pipeline utilizing a Multi-Agent Quality Control system to extract and rank viral-worthy clips, and a highly optimized manual mode powered by Local Whisper AI for precision cuts. The system handles complex tasks asynchronously, including dynamic Person Tracking, multi-language translation, and automated video rendering.

View Project

LendGuard: Automated Credit Scoring & Fraud Detection

FastAPI Supabase (Cloud DB) DeepFace (VGG-Face) & CatBoost Celery & Redis Docker GitHub Actions (CI/CD)

An end-to-end fintech simulation engineered with a Hybrid Cloud Architecture. It integrates DeepFace for biometric KYC verification and CatBoost for credit risk analysis. The system offloads high-latency AI tasks to an asynchronous Celery & Redis worker pipeline and uses Supabase for scalable cloud database and storage management. It is fully containerized via Docker for consistent deployment, and it includes comprehensive unit tests (Pytest) and a fully automated CI/CD pipeline to ensure code quality and deployment stability.

View Project

Multilabel Clothing Classification

Vision Transformer (ViT)

🏆 Hology 7.0 Finalist Project
This project presents the implementation of a multi-label classification model designed to classify clothing attributes. The project was developed as part of the Hology Data Mining 2024 Preliminary Competition, organized by Universitas Brawijaya. It focuses on categorizing images of T-shirts and hoodies from various online stores in Indonesia based on their type (e.g., T-shirt or hoodie) and color (e.g., black, blue, white, etc.).
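To make the two-attribute output concrete, here is a minimal decoding sketch. It assumes the model emits one score per clothing type and one score per color, which is an assumption about the output layout and not a description of the actual ViT head. The label lists are illustrative subsets taken from the examples above, and `decode_prediction` is a hypothetical helper.

```python
from typing import Dict, Sequence

# Illustrative label sets; the real competition label lists may differ.
TYPE_LABELS = ["t-shirt", "hoodie"]
COLOR_LABELS = ["black", "blue", "white"]


def argmax(scores: Sequence[float]) -> int:
    """Index of the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def decode_prediction(
    type_scores: Sequence[float], color_scores: Sequence[float]
) -> Dict[str, str]:
    """Pick the best type and the best color from two score groups."""
    return {
        "type": TYPE_LABELS[argmax(type_scores)],
        "color": COLOR_LABELS[argmax(color_scores)],
    }
```

With this layout, a single forward pass yields both attributes, for example a hoodie that is also blue.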

View Project

End-to-End e-KYC Identity Verification System

PyTorch FaceNet GenAI (Qwen-VL) Quantization Streamlit

A production-prototype e-KYC solution engineered for high-efficiency identity verification. It integrates a hybrid quantization pipeline (FP32/INT8) for low-latency face matching and utilizes Multimodal LLMs for robust, structure-aware ID card data extraction.
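For readers unfamiliar with INT8 quantization, the sketch below shows generic symmetric quantization of a float vector in plain Python. It illustrates the general technique only; the project's hybrid FP32/INT8 pipeline relies on PyTorch tooling and may work differently. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_int8(values: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when the input is empty or all zeros.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is the precision traded away for smaller and faster integer arithmetic.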

View Project

Student Dropout Analysis

Python Streamlit Scikit-Learn Docker Metabase

Developed a comprehensive system to reduce student dropout rates. Includes a Metabase dashboard for real-time monitoring and a deployed Streamlit Machine Learning app that identifies at-risk students based on academic performance and financial factors.

View Project

Steam Review Intelligence: Business Insights & Sentiment Analysis

NLP (NLTK) Scikit-Learn TensorFlow (Keras) TF-IDF Pandas

An end-to-end NLP pipeline analyzing over 6.4 million Steam reviews to extract actionable business intelligence. By leveraging advanced N-Gram analysis, the system identifies core drivers of player churn (e.g., predatory monetization) and retention. It features a highly optimized predictive modeling phase, benchmarking multiple architectures (Logistic Regression vs. LightGBM vs. Deep Learning MLP) on a perfectly balanced dataset to automate sentiment classification with 83%+ accuracy.

View Project

Employee Attrition Prediction

Scikit-Learn Docker Metabase Business Intelligence

Built an end-to-end system to address a high attrition rate (>10%). Consists of a Dockerized Metabase dashboard for real-time HR monitoring and a Random Forest model that identifies key risk factors like Overtime and Stock Options.

View Project

Chatbot PC Builder Agent

Gemini 2.0 LangChain Streamlit Pandas

A smart agent that eliminates LLM hallucinations by combining Gemini's reasoning with a strict Python logic engine for 100% hardware compatibility. Features real-time price fetching from Indonesian marketplaces (Tokopedia/Shopee) and visual component analysis.
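A strict logic engine of this kind can be pictured as a set of deterministic rules that run on whatever build the LLM proposes. The sketch below is an assumption about how such checks might look and is not the project's code: the dictionary fields (`socket`, `ram_type`, `type`) and the function name are hypothetical, and only two rules are shown.

```python
from typing import Dict, List


def find_compatibility_issues(build: Dict[str, Dict[str, str]]) -> List[str]:
    """Return human-readable problems; an empty list means the rules pass."""
    issues = []
    cpu = build["cpu"]
    board = build["motherboard"]
    ram = build["ram"]

    # Rule 1: the CPU must use the same socket as the motherboard.
    if cpu["socket"] != board["socket"]:
        issues.append(
            f"CPU socket {cpu['socket']} does not match "
            f"motherboard socket {board['socket']}"
        )

    # Rule 2: the RAM generation must be the one the motherboard supports.
    if ram["type"] != board["ram_type"]:
        issues.append(
            f"RAM type {ram['type']} is not supported by the motherboard "
            f"({board['ram_type']})"
        )
    return issues
```

Because the rules are plain code and not model output, a build is only presented to the user once the returned list is empty.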

View Project

PDF Chatbot using RAG and Vector Database

Streamlit Groq API (Llama 3) Qdrant Cloud SQLite LangChain PyMuPDF

A RAG application engineered for scalability and data persistence. The system supports isolated multi-chat sessions, using SQLite for history management and Qdrant Cloud for vector storage. Key features include metadata filtering for context separation, strict anti-hallucination guardrails, and a transparent debug UI to visualize retrieved context chunks.
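The idea behind metadata filtering for context separation can be shown with a small plain-Python sketch. In the real system a vector database such as Qdrant would apply the filter server-side during search; here the chunk structure, the `session_id` field name, and the function name are assumptions for illustration.

```python
from typing import Any, Dict, List


def filter_by_session(
    chunks: List[Dict[str, Any]], session_id: str
) -> List[Dict[str, Any]]:
    """Keep only chunks whose metadata belongs to the given chat session."""
    return [
        chunk
        for chunk in chunks
        if chunk.get("metadata", {}).get("session_id") == session_id
    ]
```

Restricting retrieval to one session's chunks keeps a document uploaded in one chat from leaking into answers in another.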

View Project

Cyclist Data Analysis

Pandas Data Visualization Matplotlib

In this project, I analyzed Cyclistic bike-share data using Python and Pandas to compare usage patterns between annual members and casual riders. I cleaned and processed a year's worth of ride data, revealing that members predominantly use bikes for weekday commutes, while casual riders prefer weekend recreational trips. Based on these insights, I developed data-driven recommendations for targeted marketing and membership strategies, demonstrating my ability to extract actionable insights from large datasets.
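The member-versus-casual comparison boils down to counting rides by rider type and by weekday or weekend. The sketch below shows that grouping with the Python standard library instead of Pandas, so it is a simplified stand-in for the actual analysis. The input format (rider type plus a start timestamp string) and the function name are assumptions.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, Tuple


def rides_by_day_type(rides: Iterable[Tuple[str, str]]) -> Counter:
    """Count rides per (rider_type, 'weekday' | 'weekend') pair.

    Each ride is (rider_type, started_at) with started_at formatted as
    'YYYY-MM-DD HH:MM:SS' (an assumed format).
    """
    counts = Counter()
    for rider_type, started_at in rides:
        start = datetime.strptime(started_at, "%Y-%m-%d %H:%M:%S")
        # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
        day_type = "weekend" if start.weekday() >= 5 else "weekday"
        counts[(rider_type, day_type)] += 1
    return counts
```

Comparing the weekday and weekend counts for each rider type gives the kind of usage split described above.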

View Project