Full Stack Engineer

Description:

🚀 We’re Hiring: Full Stack Engineer (LLM / AI Systems)
📍 Colabs, Gulberg II, Lahore (Onsite)
🕘 9:30 AM – 6:00 PM
📅 Monday to Friday

We are seeking a highly skilled Full Stack Engineer to work onsite with our product, data, and platform teams to deploy, optimize, and scale Large Language Models (LLMs) in production environments.
This role combines LLM engineering, backend infrastructure, and product integration, requiring strong ownership and hands-on production experience.

🔹 Key Responsibilities
LLM & AI Systems
• Deploy and optimize open-source and commercial LLMs (GPT-4/4.1, Claude, LLaMA, Mistral, Mixtral)
• Implement inference optimization (quantization, batching, caching, distillation)
• Design and maintain RAG pipelines (embeddings, vector DBs, retrieval strategies)
• Improve latency, accuracy, and cost efficiency, and reduce hallucinations
• Implement prompt versioning and A/B testing
Backend & Infrastructure
• Design scalable APIs for AI-driven features
• Manage model-serving infrastructure (Docker, Kubernetes, GPUs)
• Optimize inference performance and hardware utilization
• Implement monitoring, logging & observability (Prometheus, Grafana, OpenTelemetry)
• Ensure security and data compliance across AI pipelines
Full Stack & Product Integration
• Build internal AI workflow tools
• Integrate LLM services into web & mobile applications
• Collaborate closely with product and design teams
• Move quickly from prototype to testing to production deployment

🔹 Required Skills
• 4–8+ years of experience (Full Stack / Backend)
• Strong Python & JavaScript/TypeScript
• FastAPI / Django / Node.js
• React / Next.js
• PyTorch / TensorFlow
• Experience with vLLM, Triton, TGI
• Cloud: AWS / GCP / Azure
• Docker & Kubernetes expertise
• Strong distributed systems fundamentals

🔹 Nice to Have
• Multimodal AI experience
• LLM cost optimization at scale
• Startup / high-growth environment exposure
• Experience building AI-native products

🎯 What Success Looks Like
• Stable, low-latency production LLM services
• Measurable cost-performance optimization
• Fast idea-to-production cycles
• Well-documented AI infrastructure

Organization	Xccelerated
Industry	Web Development / Design Jobs
Occupational Category	Full Stack Engineer
Job Location	Lahore, Pakistan
Shift Type	Morning
Job Type	Full Time
Gender	No Preference
Career Level	Experienced Professional
Experience	4 Years
Posted at	2026-04-17 3:37 pm
Expires on	2026-08-12