Open to ML / NLP roles · New York City

Machine Learning Engineer building language systems that understand messy text.

I work in NLP, LLMs, and transformers, turning unstructured real-world text into structured data through entity extraction, retrieval-augmented generation, and document understanding.

About

A bit about me

I'm an ML engineer who likes the unglamorous part of NLP: the messy, inconsistent, real-world text that most models choke on. My work blends production engineering with research-driven modeling, from hierarchical transformers to LLM-assisted dataset generation and large-scale text processing.

Right now I'm designing a context-aware transformer that fuses sentence, section, and document-level reasoning for robust employer and entity extraction. I care about systems that ship, reproduce, and hold up on data they have never seen.

Work

Featured projects

LLM · Agents

AI Agent Document Analyzer

A local retrieval agent over PDFs using LangChain, LlamaIndex, and Ollama for question answering and contextual summarization. Runs fully offline.

RAG · AWS Bedrock

PDF Question-Answering Chatbot

A retrieval-augmented chatbot that answers questions over uploaded PDFs, built on AWS Bedrock foundation models and served through a Streamlit interface.

LLM · Extraction

Document Extractor LLM

A Streamlit app that parses documents and pulls out structured fields using large language models, built for fast and accurate data extraction.

RecSys

Hybrid Recommendation System

A skincare recommender combining item-based collaborative filtering and content-based filtering, served through a Flask API and an interactive Streamlit app.

Computer Vision · Thesis

Real-Time Anomaly Detection

An optical-flow plus LSTM system that detects heavy-object anomalies in real-time waste-sorting conveyor footage, improving waste-to-energy processing.

ML · Fraud

E-Commerce Fraud Detection

XGBoost on 590K Vesta transactions, reaching ROC-AUC 0.89 and F1 0.76, tuned for the precision-recall tradeoff so real fraud is caught without a flood of false positives.

Toolkit

What I work with

Languages

Python · SQL

ML & NLP

PyTorch · TensorFlow · Transformers · scikit-learn · pandas · NumPy

LLMs & RAG

LangChain · LlamaIndex · RAG · Vector search · Ollama · AWS Bedrock

Data & deployment

Docker · AWS · Google Cloud · PostgreSQL · PySpark · Streamlit · Flask

Contact

Let's build something

I'm open to ML and NLP engineering roles and to collaborating on applied LLM systems and transformer research.