Full Stack Engineer

 

Description:

🚀 We’re Hiring: Full Stack Engineer (LLM / AI Systems)
📍 Colabs, Gulberg II, Lahore (Onsite)
🕘 9:30 AM – 6:00 PM
📅 Monday to Friday

We are seeking a highly skilled Full Stack Engineer to work onsite with our product, data, and platform teams to deploy, optimize, and scale Large Language Models (LLMs) in production environments.
This role combines LLM engineering, backend infrastructure, and product integration, requiring strong ownership and hands-on production experience.

🔹 Key Responsibilities
LLM & AI Systems
• Deploy and optimize open-source and commercial LLMs (GPT-4/4.1, Claude, LLaMA, Mistral, Mixtral)
• Implement inference optimization (quantization, batching, caching, distillation)
• Design and maintain RAG pipelines (embeddings, vector DBs, retrieval strategies)
• Improve latency, accuracy, and cost efficiency, and reduce hallucinations
• Implement prompt versioning and A/B testing
Backend & Infrastructure
• Design scalable APIs for AI-driven features
• Manage model-serving infrastructure (Docker, Kubernetes, GPUs)
• Optimize inference performance and hardware utilization
• Implement monitoring, logging & observability (Prometheus, Grafana, OpenTelemetry)
• Ensure security and data compliance across AI pipelines
Full Stack & Product Integration
• Build internal AI workflow tools
• Integrate LLM services into web & mobile applications
• Collaborate closely with product and design teams
• Take features from rapid prototyping through testing to production deployment

🔹 Required Skills
• 4–8+ years of experience (Full Stack / Backend)
• Strong Python & JavaScript/TypeScript
• FastAPI / Django / Node.js
• React / Next.js
• PyTorch / TensorFlow
• Experience with vLLM, Triton, and TGI
• Cloud: AWS / GCP / Azure
• Docker & Kubernetes expertise
• Strong distributed systems fundamentals

🔹 Nice to Have
• Multimodal AI experience
• LLM cost optimization at scale
• Startup / high-growth environment exposure
• Experience building AI-native products

🎯 What Success Looks Like
• Stable, low-latency production LLM services
• Measurable cost-performance optimization
• Fast idea-to-production cycles
• Well-documented AI infrastructure

Organization: Xccelerated
Industry: Web Development / Design Jobs
Occupational Category: Full Stack Engineer
Job Location: Lahore, Pakistan
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 4 Years
Posted at: 2026-04-17 3:37 pm
Expires on: 2026-06-01