LockedIn AI – AI Cloud Engineer
Job Description
About LockedIn AI
LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted by over 1 million users worldwide. We are building the most advanced AI-powered career preparation platform, helping users succeed in interviews, assessments, and professional communication in real time.
Our system powers real-time AI assistance during live interviews, coding rounds, and meetings—helping users communicate with clarity, confidence, and precision when it matters most.
Role Overview
We are looking for a cloud-native AI Cloud Engineer to design, build, and optimize the infrastructure that powers LockedIn AI’s machine learning systems and real-time inference pipelines.
This role sits at the intersection of cloud engineering and AI infrastructure. You will be responsible for the environments where models are trained, fine-tuned, deployed, and served at scale to over 1 million users.
You will build and operate high-performance, GPU-powered cloud systems that ensure low-latency, cost-efficient, and highly reliable AI experiences in production.
Key Responsibilities
AI Cloud Architecture
- Design scalable cloud infrastructure for AI/ML workloads (training, inference, evaluation)
- Build GPU-based compute systems optimized for model performance and cost efficiency
- Architect secure and scalable environments across AWS, GCP, or Azure
- Implement multi-stage environments for training, staging, and production AI systems
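For a concrete sense of this work, a minimal sketch of multi-stage environment definitions is below; the environment names, regions, instance types, and replica counts are illustrative assumptions, not our actual topology.

```python
from dataclasses import dataclass

# Illustrative sketch only: names, GPU instance types, and regions are
# hypothetical examples, not LockedIn AI's real environments.
@dataclass(frozen=True)
class AIEnvironment:
    name: str              # e.g. "training", "staging", "production"
    region: str            # primary cloud region
    gpu_instance_type: str
    min_replicas: int
    max_replicas: int

ENVIRONMENTS = {
    "training":   AIEnvironment("training",   "us-east-1", "p4d.24xlarge", 0, 8),
    "staging":    AIEnvironment("staging",    "us-east-1", "g5.xlarge",    1, 2),
    "production": AIEnvironment("production", "us-east-1", "g5.12xlarge",  2, 32),
}

def scale_limits(env_name: str) -> tuple[int, int]:
    """Return the autoscaling bounds for a named environment."""
    env = ENVIRONMENTS[env_name]
    return env.min_replicas, env.max_replicas
```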
AI Model Serving Infrastructure
- Build and maintain real-time inference systems serving LLM-based applications
- Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, etc.)
- Improve latency, throughput, and reliability of AI inference pipelines
- Design load balancing and failover systems for high-availability AI services
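As an illustration of the serving side, here is a minimal sketch using vLLM's offline Python API, assuming a GPU host with vLLM installed; the model name and sampling settings are placeholders, and a production path would typically run vLLM's OpenAI-compatible server behind load balancing and health checks instead.

```python
# Minimal vLLM inference sketch (assumes `pip install vllm` and a GPU host).
# Model name and sampling settings are placeholders, not production values.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # hypothetical model choice
params = SamplingParams(temperature=0.2, max_tokens=256)

prompts = ["Summarize the candidate's last answer in one sentence."]
outputs = llm.generate(prompts, params)

for out in outputs:
    # Each output carries the prompt plus one or more generated completions.
    print(out.outputs[0].text)
```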
GPU Compute & Training Systems
- Manage distributed GPU clusters for model training and fine-tuning
- Optimize GPU utilization using scheduling, spot instances, and auto-scaling
- Support large-scale distributed training workflows
- Work with managed AI platforms like SageMaker, Vertex AI, or Azure ML
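Distributed training support often reduces to plumbing like the PyTorch DistributedDataParallel sketch below, launched with torchrun; the toy model and training loop are assumptions for illustration only.

```python
# Minimal PyTorch DDP sketch, intended to be launched with
# `torchrun --nproc_per_node=<gpus> train.py`. Model and data are toy
# placeholders; real jobs add checkpointing, logging, and data sharding.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")        # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)   # toy model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):                          # toy training loop
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```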
Cost Optimization & FinOps
- Optimize cloud spend across GPU, storage, and inference workloads
- Reduce LLM and compute costs through intelligent infrastructure design
- Build dashboards to monitor cost-per-inference and GPU efficiency
- Identify and eliminate underutilized or over-provisioned resources
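The dashboards above ultimately track simple ratios over billing and traffic data; a back-of-the-envelope sketch follows, with made-up numbers rather than real pricing or request volumes.

```python
# Back-of-the-envelope FinOps arithmetic. All figures are illustrative
# assumptions, not actual LockedIn AI pricing or traffic.
def cost_per_1k_requests(gpu_hourly_usd: float, requests_per_hour: float) -> float:
    """Raw serving cost per 1,000 requests on a single GPU."""
    return 1000 * gpu_hourly_usd / requests_per_hour

def idle_spend_per_hour(gpu_hourly_usd: float, avg_gpu_utilization: float) -> float:
    """Dollars per hour paid for idle GPU capacity (an over-provisioning signal)."""
    return gpu_hourly_usd * (1.0 - avg_gpu_utilization)

# Example: a $4.10/hr GPU serving 18,000 requests per hour at 55% utilization.
print(f"${cost_per_1k_requests(4.10, 18_000):.3f} per 1k requests")       # ~$0.228
print(f"${idle_spend_per_hour(4.10, 0.55):.2f}/hr paid for idle capacity")  # ~$1.85
```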
Security & Networking
- Implement secure cloud networking (VPCs, private endpoints, IAM policies)
- Protect AI assets including models, embeddings, and training data
- Ensure encryption, logging, and compliance for AI systems
- Design secure APIs for real-time AI inference
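As a simplified illustration, a secured real-time inference endpoint might look like the FastAPI sketch below; the header name, route, and key handling are assumptions, and a production deployment would add TLS termination, IAM or OIDC-based auth, rate limiting, and audit logging.

```python
# Sketch of an authenticated inference endpoint using FastAPI.
# Header name, route, and key source are illustrative assumptions.
import os
from fastapi import FastAPI, Header, HTTPException

app = FastAPI()
API_KEY = os.environ.get("INFERENCE_API_KEY", "")  # injected via a secret manager in practice

@app.post("/v1/infer")
async def infer(payload: dict, x_api_key: str = Header(default="")):
    # Reject requests without a valid key before any model work happens.
    if not API_KEY or x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="invalid or missing API key")
    # Placeholder response; a real handler would call the model-serving layer.
    return {"ok": True, "received_bytes": len(str(payload))}
```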
Infrastructure Automation & Observability
- Build Infrastructure as Code using Terraform or similar tools
- Automate deployment of AI environments and inference endpoints
- Implement monitoring for GPU health, latency, and system performance
- Set up alerting systems for failures, bottlenecks, and anomalies
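GPU health monitoring often starts with NVML; the sketch below uses the pynvml bindings to surface temperature, memory, and utilization, with arbitrary example thresholds and print statements standing in for a real alerting pipeline.

```python
# GPU health-check sketch using NVIDIA's management library bindings (pynvml).
# Thresholds are arbitrary examples; real alerting would feed a system such as
# Prometheus/Alertmanager rather than printing.
import pynvml

def check_gpus(max_temp_c: int = 85, min_free_mem_frac: float = 0.05):
    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)

            if temp > max_temp_c:
                print(f"ALERT gpu{i}: temperature {temp}C exceeds {max_temp_c}C")
            if mem.free / mem.total < min_free_mem_frac:
                print(f"ALERT gpu{i}: less than {min_free_mem_frac:.0%} memory free")
            print(f"gpu{i}: util={util.gpu}% mem_used={mem.used / 2**30:.1f}GiB temp={temp}C")
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    check_gpus()
```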
Required Qualifications
- 3+ years of experience in cloud engineering, DevOps, or infrastructure roles
- Hands-on experience with AI/ML infrastructure or GPU-based workloads
- Strong knowledge of AWS, GCP, or Azure cloud platforms
- Experience with Kubernetes and containerized deployments
- Proficiency in Python, Go, or Bash
- Experience with Infrastructure as Code (Terraform, Pulumi, or CloudFormation)
- Understanding of model serving systems and inference optimization
- Experience working with cross-functional engineering and AI teams
Preferred Qualifications
- Experience with LLM inference at scale (multi-GPU, low-latency systems)
- Background in distributed training systems and large model deployment
- Familiarity with real-time AI or streaming systems
- Knowledge of RDMA, InfiniBand, or high-performance networking
- Experience with cost optimization for large-scale cloud AI systems
- Contributions to open-source AI or cloud infrastructure projects
- Startup experience (Seed to Series A stage)
What We Offer
- Equity: Early-stage ownership in a fast-growing AI company
- Impact: Your work powers systems used by 1M+ users worldwide
- Flexibility: Remote-first with optional NYC collaboration
- Growth: Fast-paced environment with rapid learning opportunities
- Team: Small, high-ownership, AI-native engineering team
Why Join LockedIn AI?
- You’ll build the infrastructure behind a category-defining AI product
- Your systems directly power real-time AI used in high-stakes interviews
- You’ll work at the frontier of cloud + AI systems at massive scale
- You’ll solve real-world problems where performance, latency, and cost matter every day
How to Apply
Please submit:
- Resume or CV
- Short note covering:
  - Why you want to join LockedIn AI
  - Whether you’ve used the product
  - What improvements you would suggest
- Optional: GitHub, portfolio, or technical work
Equal Opportunity
LockedIn AI is committed to building a diverse and inclusive team. We welcome applicants from all backgrounds and experiences. Hiring decisions are based on merit, skills, and business needs.