Valentina Sanchez · ML / NLP Engineer

About

A bit about me

I'm an ML engineer who likes the unglamorous part of NLP: the messy, inconsistent, real-world text that most models choke on. My work blends production engineering with research-driven modeling, from hierarchical transformers to LLM-assisted dataset generation and large-scale text processing.

Right now I'm designing a context-aware transformer that fuses sentence, section, and document-level reasoning for robust employer and entity extraction. I care about systems that ship, give reproducible results, and hold up on data they have never seen.

Work

Featured projects

LLM · Agents

AI Agent Document Analyzer

A local retrieval agent over PDFs using LangChain, LlamaIndex, and Ollama for question answering and contextual summarization. Runs fully offline.

Code

RAG · AWS Bedrock

PDF Question-Answering Chatbot

A retrieval-augmented chatbot that answers questions over uploaded PDFs, built on AWS Bedrock foundation models and served through a Streamlit interface.

Code

LLM · Extraction

Document Extractor LLM

A Streamlit app that parses documents and pulls out structured fields using large language models, built for fast and accurate data extraction.

Code

RecSys

Hybrid Recommendation System

A skincare recommender combining item-based collaborative filtering and content-based filtering, served through a Flask API and an interactive Streamlit app.

Code

Computer Vision · Thesis

Real-Time Anomaly Detection

An optical-flow plus LSTM system that detects heavy-object anomalies in waste-sorting conveyor footage in real time, improving waste-to-energy processing.

Code

ML · Fraud

E-Commerce Fraud Detection

An XGBoost model trained on 590K Vesta transactions, reaching ROC-AUC 0.89 and F1 0.76, and tuned for the precision-recall tradeoff so that real fraud is caught without a flood of false positives.

Code

Machine Learning Engineer building language systems that understand messy text.

What I work with

Languages

ML & NLP

LLMs & RAG

Data & deployment

Let's build something