Description:
🚀 We’re Hiring: Full Stack Engineer (LLM / AI Systems)
📍 Colabs, Gulberg II, Lahore (Onsite)
🕘 9:30 AM – 6:00 PM
📅 Monday to Friday
We are seeking a highly skilled Full Stack Engineer to work onsite with our product, data, and platform teams to deploy, optimize, and scale Large Language Models (LLMs) in production environments.
This role combines LLM engineering, backend infrastructure, and product integration, requiring strong ownership and hands-on production experience.
🔹 Key Responsibilities
LLM & AI Systems
• Deploy and optimize open-source and commercial LLMs (GPT-4/4.1, Claude, LLaMA, Mistral, Mixtral)
• Implement inference optimization (quantization, batching, caching, distillation)
• Design and maintain RAG pipelines (embeddings, vector DBs, retrieval strategies)
• Improve latency, accuracy, and cost efficiency, and reduce hallucinations
• Implement prompt versioning and A/B testing
Backend & Infrastructure
• Design scalable APIs for AI-driven features
• Manage model-serving infrastructure (Docker, Kubernetes, GPUs)
• Optimize inference performance and hardware utilization
• Implement monitoring, logging & observability (Prometheus, Grafana, OpenTelemetry)
• Ensure security and data compliance across AI pipelines
Full Stack & Product Integration
• Build internal AI workflow tools
• Integrate LLM services into web & mobile applications
• Collaborate closely with product and design teams
• Move quickly from prototyping to testing to production deployment
🔹 Required Skills
• 4–8+ years of experience (Full Stack / Backend)
• Strong Python & JavaScript/TypeScript
• FastAPI / Django / Node.js
• React / Next.js
• PyTorch / TensorFlow
• Experience with vLLM, Triton, TGI
• Cloud: AWS / GCP / Azure
• Docker & Kubernetes expertise
• Strong distributed systems fundamentals
🔹 Nice to Have
• Multimodal AI experience
• LLM cost optimization at scale
• Startup / high-growth environment exposure
• Experience building AI-native products
🎯 What Success Looks Like
• Stable, low-latency production LLM services
• Measurable cost-performance optimization
• Fast idea-to-production cycles
• Well-documented AI infrastructure
| Organization | Xccelerated |
| Industry | Web Development / Design Jobs |
| Occupational Category | Full Stack Engineer |
| Job Location | Lahore, Pakistan |
| Shift Type | Morning |
| Job Type | Full Time |
| Gender | No Preference |
| Career Level | Experienced Professional |
| Experience | 4 Years |
| Posted at | 2026-04-17 3:37 pm |
| Expires on | 2026-06-01 |