Hi, I'm Siddhant Rajhans.

I'm a Machine Learning Engineer specializing in LLM evaluation and AI security. I'm a contributor to EleutherAI's lm-evaluation-harness, a published researcher (IEEE, Springer), and I've built production AI at scale (400M+ records/month) with monitoring, guardrails, and responsible-AI practice.

Merged PR to EleutherAI's lm-evaluation-harness
3 peer-reviewed papers (IEEE · Springer)
400M+ records / month in production
Open-source toolkit published on PyPI and Hugging Face

01 / Selected Work

Explore My Latest Projects

CortexLab: Brain Encoding Toolkit

Multimodal fMRI brain encoding toolkit built on Meta's TRIBE v2. Includes GPU voxelwise ridge regression (Triton), causal modality lesion analysis, brain-alignment benchmarking, a 3D brain viewer, and live inference from webcam, screen, or video. 143 tests, 4 contributors; published on PyPI and Hugging Face.

Bloomberg Sentiment × FinBERT

Empirical test of whether Bloomberg's NEWS_SENTIMENT_DAILY_AVG can be replicated with open-source FinBERT on a 30-stock S&P 100 universe (2018-2026, 62.8K stock-day observations). Covers multi-horizon IC, a long-short quintile backtest, and Fama-French three-factor regressions with Newey-West HAC standard errors. Replication fails (ρ = −0.26), and that's the point.
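
For readers unfamiliar with the term, an information coefficient (IC) is commonly computed as a rank correlation between a signal and subsequent returns. The sketch below is illustrative only and is not the project's code: it assumes a Spearman-style IC (the write-up does not say which correlation is used), and the function names and sample values are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_ic(signal, forward_returns):
    """Rank IC: Pearson correlation computed on the ranks of both series."""
    return pearson(average_ranks(signal), average_ranks(forward_returns))


# Hypothetical one-day cross-section: sentiment scores and next-day returns.
sentiment = [0.8, -0.2, 0.1, 0.5]
next_day_return = [0.012, -0.004, 0.001, 0.006]
ic = spearman_ic(sentiment, next_day_return)
```

In a multi-horizon setup, the same calculation would presumably be repeated with returns measured over different forward windows.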

DreamStudio: AI Cinematic Story Director

AI-powered cinematic story director. Speak naturally, point your camera, and watch scenes materialize as images, video, and music in real time. Full-stack app with web, mobile, and Python backend.

Messaging-Based Data Pipeline

End-to-end streaming pipeline processing 400M+ records/month at under 65s p95 latency. Kappa architecture with event-time processing, watermarking, and fault tolerance, built for bursty event ingestion.
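
To make "event-time processing" and "watermarking" concrete, here is a minimal, framework-free sketch. It is not the production pipeline: the class name, the tumbling-window choice, and the lateness policy (watermark = largest event time seen minus an allowed lateness) are all assumptions made for illustration.

```python
from collections import defaultdict


class TumblingWindowCounter:
    """Counts events per event-time window and closes windows behind the watermark."""

    def __init__(self, window_size_s, allowed_lateness_s):
        self.window_size_s = window_size_s
        self.allowed_lateness_s = allowed_lateness_s
        self.max_event_time = float("-inf")
        self.open_windows = defaultdict(int)  # window start -> event count
        self.dropped_late = 0

    @property
    def watermark(self):
        # Assumed policy: trail the largest event time seen by a fixed lateness.
        return self.max_event_time - self.allowed_lateness_s

    def process(self, event_time_s):
        """Ingest one event; return (window_start, count) pairs closed by it."""
        window_start = (event_time_s // self.window_size_s) * self.window_size_s
        if window_start + self.window_size_s <= self.watermark:
            # The event's window was already finalized, so count it as late.
            self.dropped_late += 1
            return []
        self.open_windows[window_start] += 1
        self.max_event_time = max(self.max_event_time, event_time_s)
        closed = sorted(
            start for start in self.open_windows
            if start + self.window_size_s <= self.watermark
        )
        return [(start, self.open_windows.pop(start)) for start in closed]


counter = TumblingWindowCounter(window_size_s=60, allowed_lateness_s=10)
results = [counter.process(t) for t in (5, 65, 50, 75, 10)]
```

Here the out-of-order event at t=50 is still counted in the first window because the watermark has not passed it yet, while the event at t=10 arrives after that window has closed and is dropped.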

Multimodal RAG for Scientific Literature

Textual + multimodal RAG over 500K+ papers with a LangGraph agent that runs a grounding-validation step, flagging weakly-supported answers instead of returning them (a hallucination guardrail). FAISS retrieval with cross-encoder reranking; 94% recall on an internal evaluation set.
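
The idea of a grounding-validation step can be sketched in a few lines. This is a toy stand-in, not the project's implementation: it swaps in a simple lexical-overlap score where a real system would more likely use an entailment model or an LLM judge, and the function names and the 0.6 threshold are hypothetical.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(sentence, passages):
    """Fraction of the sentence's tokens found in its best-matching passage."""
    sent = tokens(sentence)
    if not sent:
        return 0.0
    return max((len(sent & tokens(p)) / len(sent) for p in passages), default=0.0)


def validate_answer(answer, passages, threshold=0.6):
    """Flag sentences whose support in the retrieved passages falls below threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    weak = [s for s in sentences if support_score(s, passages) < threshold]
    # An empty answer is treated as ungrounded rather than vacuously grounded.
    return {"grounded": bool(sentences) and not weak, "weak_sentences": weak}


passages = ["FAISS performs approximate nearest neighbor search over dense vectors."]
answer = "FAISS performs nearest neighbor search. It was invented on Mars."
report = validate_answer(answer, passages)
```

A caller could return the answer only when `report["grounded"]` is true and otherwise surface the flagged sentences for review.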

Bike Lane Sentinel

Computer vision system that monitors illegal vehicle encroachment into NYC bike lanes and automatically alerts NYC DOT. Real-time detection with interactive dashboard and evidence capture.

02 / Open Source

Open Source & Contributions

03 / Experience

Built Production Systems & Shipped Real Products.

Oct 2023 – Dec 2024

Machine Learning Engineer

Seed-Stage Health-Tech Startup (NDA)

  • Designed and shipped streaming ML pipelines on AWS (Kafka, Airflow, Docker) with the data engineering team, processing 400M+ healthcare records/month at <65s p95 latency.
  • Developed transformer-based NLP and multimodal retrieval pipelines for clinical notes, lab reports, and diagnostic documents, improving retrieval accuracy by 30%.
  • Led the retrieval and prompt-orchestration workstream of an agentic AI system powering adaptive, personalized clinical-facing interactions.
  • Built model monitoring and data validation checks tracking data drift, prediction quality, and data integrity across clinical ML workflows.
AWS · Kafka · Airflow · Docker · NLP · Agentic AI
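
As one example of what a data-drift check in this kind of monitoring can look like, here is a minimal sketch of the Population Stability Index (PSI) for a single numeric feature. It is an illustration under assumptions, not the checks used in this role: the equal-width binning, the epsilon floor, and the sample data are hypothetical, and both inputs are assumed non-empty.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if it is constant.
    width = (hi - lo) / n_bins or 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


reference_sample = list(range(100))
shifted_sample = [v + 50 for v in reference_sample]
no_drift = psi(reference_sample, reference_sample)
drift = psi(reference_sample, shifted_sample)
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, though a sensible alert threshold depends on the feature and the bin count.
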
Apr – Sep 2023

Software Development Engineer Intern

Seed-Stage Health-Tech Startup (NDA)

  • Contributed to developing backend services and APIs supporting ML-driven features, helping reduce average request latency by 40% by moving synchronous flows to async processing.
  • Helped improve ML model serving pipelines with caching and request batching, making clinical-facing features more responsive.
  • Assisted senior engineers in strengthening API security: input validation, authentication middleware, and audit logging for sensitive healthcare data.
ML Serving · APIs · Backend Optimization · Security

04 / Toolkit

Technologies and Tools I Work With

LLM Evaluation & AI Security

LLM Evaluation · lm-evaluation-harness · Red-teaming · Guardrails · Grounding / Faithfulness · Model Monitoring & Drift · Responsible AI (NIST AI RMF)

LLM, Agents & RAG

LangGraph · RAG · Agentic Orchestration · Tool Calling · Prompt Engineering · Benchmark Design

Core ML & Deep Learning

PyTorch · TensorFlow · Scikit-learn · Transformers · Hugging Face · CNNs

Data, Infra & MLOps

Python · SQL · Apache Spark · Kafka · FAISS · Postgres/pgvector · AWS (SageMaker, S3) · Docker · Airflow · MLflow · W&B

05 / Research

Research Publications

Peer-reviewed work in AI security and trust: deepfake detection, ML for cybersecurity, and threats in AI-integrated cloud systems.

06 / Writing

From the Blog

Notes on building ML systems, evaluating frontier models, and lessons from production.

07 / Education

Education & Credentials

2025–2026

MS in Machine Learning

Stevens Institute of Technology

Hoboken, NJ

2019–2023

B.Tech in Computer Science & Engineering

Swami Rama Himalayan University

Dehradun, India

Certifications

Weights & Biases: Training & Fine-tuning LLMs
Stanford / Coursera: Machine Learning
NVIDIA: Generative AI with Diffusion Models
IBM: Cloud Computing Diploma

08 / Contact

Let's build something.

Graduating December 2026. Open to Applied AI, ML, and research-engineering roles, especially in LLM evaluation and AI security. Based in Hoboken, NJ; open to New York and remote.