Traditional text-classification workflow
This project builds a spam-detection workflow for SMS messages using classical NLP representations and machine-learning models. It emphasizes model comparison, class-level metrics, and evaluation beyond headline accuracy.
Challenge
- Spam detection is an imbalanced text-classification task where overall accuracy can hide weak spam recall.
- Different text representations can change how classifiers separate ham and spam messages.
- Practical evaluation requires confusion matrices, per-class recall and F1-score, and ranking behavior as shown by ROC curves.
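To make the first point concrete, the sketch below scores a trivial classifier that labels every message as ham. It uses only the class counts reported for this dataset (4,827 ham, 747 spam); the function name `majority_baseline` is illustrative and not part of the project code.

```python
# Class counts as reported for the dataset in this write-up.
HAM, SPAM = 4827, 747


def majority_baseline(n_ham: int, n_spam: int) -> dict:
    """Score a classifier that labels every message as ham."""
    total = n_ham + n_spam
    true_positives = 0          # no message is ever flagged as spam
    true_negatives = n_ham      # every ham message is labeled ham
    false_negatives = n_spam    # every spam message is missed
    accuracy = (true_positives + true_negatives) / total
    spam_recall = true_positives / (true_positives + false_negatives)
    return {"accuracy": accuracy, "spam_recall": spam_recall}


baseline = majority_baseline(HAM, SPAM)
print(f"accuracy:    {baseline['accuracy']:.3f}")     # 0.866
print(f"spam recall: {baseline['spam_recall']:.3f}")  # 0.000
```

A model that never detects spam still scores about 86.6% accuracy, so any reported accuracy has to be read against that floor.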
System architecture
Data and inputs
- 5,574 SMS messages: 4,827 ham (86.6%) and 747 spam (13.4%).
- Binary text-classification task using Bag of Words and TF-IDF representations.
- The reported vocabulary size is 6,879 features with high sparsity.
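The two representations and the notion of sparsity can be illustrated on a few toy messages. This is a minimal sketch using only the standard library: the messages are invented, the tokenizer is a plain whitespace split, and the IDF is the textbook form log(N / df), which may differ from the smoothed variant a library would apply. None of it is taken from the project's code.

```python
import math
from collections import Counter

# Hypothetical toy messages, not taken from the project dataset.
messages = [
    "free prize call now",
    "call me when you are free",
    "see you at lunch",
]


def tokenize(text: str) -> list:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


docs = [tokenize(m) for m in messages]
vocab = sorted({tok for doc in docs for tok in doc})

# Bag of Words: one row per message, one column per vocabulary term.
counts = [Counter(doc) for doc in docs]
bow = [[c[term] for term in vocab] for c in counts]

# TF-IDF: raw term count weighted by log(N / document frequency).
n_docs = len(docs)
doc_freq = {term: sum(term in doc for doc in docs) for term in vocab}
idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}
tfidf = [[row[j] * idf[term] for j, term in enumerate(vocab)] for row in bow]

# Sparsity: the share of matrix cells that are zero.
cells = n_docs * len(vocab)
zeros = sum(value == 0 for row in bow for value in row)
sparsity = zeros / cells

print(f"vocabulary size: {len(vocab)}")
print(f"sparsity: {sparsity:.2f}")
```

Even with three short messages, more than half of the cells are zero. With thousands of messages and a vocabulary in the thousands, almost every cell is zero, which is why such matrices are stored in sparse form.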
Technical approach
- Preprocess SMS text and build sparse vector representations.
- Train Naive Bayes, Logistic Regression, and Support Vector Machine models.
- Compare Bag of Words and TF-IDF across multiple classifiers.
- Review confusion matrices, ROC curves, spam recall, and spam F1-score.
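As one instance of the model/representation pairs listed above, the following is a from-scratch sketch of multinomial Naive Bayes over Bag of Words counts. It is not the project's implementation: the training messages, the `fit` and `predict` helpers, and the Laplace smoothing value `alpha=1.0` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy training set, not the project's data.
train = [
    ("win a free prize now", "spam"),
    ("free entry call now", "spam"),
    ("are you free for lunch", "ham"),
    ("call me after the meeting", "ham"),
    ("see you at home", "ham"),
]


def tokenize(text):
    return text.lower().split()


def fit(examples, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods per class."""
    class_docs = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_docs}
    for text, label in examples:
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {"log_prior": {}, "log_like": {}}
    for label, n in class_docs.items():
        model["log_prior"][label] = math.log(n / len(examples))
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_like"][label] = {
            w: math.log((word_counts[label][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, text):
    """Return the label with the highest posterior log score."""
    scores = {}
    for label, prior in model["log_prior"].items():
        likes = model["log_like"][label]
        # Words outside the training vocabulary are ignored.
        scores[label] = prior + sum(likes[w] for w in tokenize(text) if w in likes)
    return max(scores, key=scores.get)


model = fit(train)
print(predict(model, "free prize call now"))  # spam
print(predict(model, "see you after lunch"))  # ham
```

Swapping the count features for TF-IDF weights, or the classifier for Logistic Regression or an SVM, yields the other combinations in the comparison.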
Evaluation and results
Key figures: 5,574 SMS messages; 6 model/representation combinations; SVM + TF-IDF accuracy 97.67%; spam F1-score 0.90.
- SVM with TF-IDF achieved the best reported overall accuracy at 97.67%.
- The same configuration reached a 0.90 spam F1-score.
- The comparison showed why spam recall and F1-score should be reviewed alongside accuracy.
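The relationship between accuracy, spam recall, and spam F1-score can be read directly from a confusion matrix. The sketch below uses hypothetical counts for two imaginary models on a 1,000-message test set; the numbers are not the project's results and were chosen so that both models have the same accuracy.

```python
def spam_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy and spam-class metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts: 134 spam and 866 ham messages in each case.
model_a = spam_metrics(tp=100, fp=6, fn=34, tn=860)   # cautious: misses spam
model_b = spam_metrics(tp=124, fp=30, fn=10, tn=836)  # aggressive: flags more ham

for name, m in (("A", model_a), ("B", model_b)):
    print(
        f"model {name}: accuracy={m['accuracy']:.3f} "
        f"recall={m['recall']:.3f} f1={m['f1']:.3f}"
    )
```

Both imaginary models reach 96% accuracy, yet one misses roughly a quarter of the spam while the other misses under a tenth. Accuracy alone cannot distinguish them; recall and F1-score can.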
Implementation and code
Implementation focus
The implementation is organized as one workflow covering data preparation, modeling, evaluation, and interpretation, so that each technical decision can be traced to the stage where it was made.
Source code
The code is available for exploring the implementation details and extending the experiment when needed.
Scope and responsible use
The project focuses on language-data modeling and evaluation. Broader use would require domain-specific validation, edge-case assessment, monitoring, and testing on fresh data.
Future development
- Evaluate transformer-based spam models against classical baselines.
- Add calibration and threshold tuning for recall-sensitive use cases.
- Test robustness on newer SMS, messaging-app, and multilingual datasets.
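Threshold tuning for a recall-sensitive use case could look like the sketch below. It assumes the classifier outputs a spam score per message (a probability or a decision value, higher meaning more spam-like) and picks the highest threshold that still meets a target recall. The scores, labels, and the `pick_threshold` helper are hypothetical.

```python
def pick_threshold(scores, labels, target_recall):
    """Return (threshold, recall, precision) for the highest threshold
    whose spam recall meets the target, or None if no threshold does."""
    n_spam = sum(labels)
    if n_spam == 0:
        raise ValueError("labels must contain at least one spam example")
    # Try candidate thresholds from strictest to most permissive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        recall = tp / n_spam
        if recall >= target_recall:
            # tp + fp >= 1 here because the threshold is itself one of the scores.
            return threshold, recall, tp / (tp + fp)
    return None


# Hypothetical validation scores (1 = spam, 0 = ham).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

print(pick_threshold(scores, labels, target_recall=0.75))
print(pick_threshold(scores, labels, target_recall=1.0))
```

Raising the recall target lowers the chosen threshold and usually costs precision, so the threshold should be selected on held-out validation data rather than on the test set used for the final report.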
Technical contribution
The project demonstrates careful evaluation for imbalanced text classification: comparing representations, reading class-level tradeoffs, and identifying a strong baseline workflow for spam detection.