LockedIn AI – AI Cloud Engineer


Job Overview

Employment type: National
Experience: 0 to 3 Years
Salary: Not specified
Deadline: 29-May-2026
Location: Manhattan, New York, United States
Required skill set: Engineering - Hardware & Networks

Job Description

About LockedIn AI

LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted by over 1 million users worldwide. We are building the most advanced AI-powered career preparation platform, helping users succeed in interviews, assessments, and professional communication in real time.

Our system powers real-time AI assistance during live interviews, coding rounds, and meetings—helping users communicate with clarity, confidence, and precision when it matters most.


Role Overview

We are looking for a cloud-native AI Cloud Engineer to design, build, and optimize the infrastructure that powers LockedIn AI’s machine learning systems and real-time inference pipelines.

This role sits at the intersection of cloud engineering and AI infrastructure. You will be responsible for the environments where models are trained, fine-tuned, deployed, and served at scale to over 1 million users.

You will build and operate high-performance, GPU-powered cloud systems that ensure low-latency, cost-efficient, and highly reliable AI experiences in production.


Key Responsibilities

AI Cloud Architecture

  • Design scalable cloud infrastructure for AI/ML workloads (training, inference, evaluation)
  • Build GPU-based compute systems optimized for model performance and cost efficiency
  • Architect secure and scalable environments across AWS, GCP, or Azure
  • Implement multi-stage environments for training, staging, and production AI systems (see the Pulumi sketch after this list)
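
To give candidates a feel for the multi-stage environment work, here is a minimal sketch using Pulumi's Python SDK, one of the IaC tools listed under qualifications. The instance types, AMI ID, and tags are illustrative placeholders, not LockedIn AI's actual configuration:

    # Minimal Pulumi sketch (Python): one GPU node sized per stack.
    # Assumes pulumi and pulumi_aws are installed and AWS credentials
    # are configured; the AMI ID is a placeholder, not a real image.
    import pulumi
    import pulumi_aws as aws

    stack = pulumi.get_stack()  # e.g. "dev", "staging", or "prod"

    instance_types = {
        "dev": "g5.xlarge",        # small GPU for experiments
        "staging": "g5.2xlarge",
        "prod": "g5.12xlarge",     # multi-GPU node for live inference
    }

    gpu_node = aws.ec2.Instance(
        f"inference-node-{stack}",
        ami="ami-0123456789abcdef0",  # placeholder deep learning AMI
        instance_type=instance_types.get(stack, "g5.xlarge"),
        tags={"env": stack, "workload": "llm-inference"},
    )

    pulumi.export("instance_id", gpu_node.id)

Each Pulumi stack gets its own sizing, so training experiments never contend with production inference capacity.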

AI Model Serving Infrastructure

  • Build and maintain real-time inference systems serving LLM-based applications
  • Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, etc.; see the vLLM sketch after this list)
  • Improve latency, throughput, and reliability of AI inference pipelines
  • Design load balancing and failover systems for high-availability AI services
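
As a flavor of the serving work, the sketch below loads a model with vLLM's offline Python API and times a single generation. The model name and prompt are examples only; production serving would typically run vLLM's OpenAI-compatible server behind a load balancer instead:

    # Minimal vLLM sketch: load a model and measure end-to-end latency
    # for one generation. Model name and prompt are placeholders.
    import time
    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # example model
    params = SamplingParams(temperature=0.7, max_tokens=128)

    start = time.perf_counter()
    outputs = llm.generate(["Summarize the candidate's last answer."], params)
    latency_s = time.perf_counter() - start

    print(f"end-to-end latency: {latency_s:.2f}s")
    print(outputs[0].outputs[0].text)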

GPU Compute & Training Systems

  • Manage distributed GPU clusters for model training and fine-tuning
  • Optimize GPU utilization using scheduling, spot instances, and auto-scaling
  • Support large-scale distributed training workflows (see the DDP sketch after this list)
  • Work with managed AI platforms like SageMaker, Vertex AI, or Azure ML
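
To make "distributed training workflows" concrete, here is a minimal PyTorch DistributedDataParallel skeleton of the kind this role would launch and operate on GPU clusters. The model, data, and hyperparameters are stand-ins:

    # Minimal PyTorch DDP skeleton, launched with:
    #   torchrun --nproc_per_node=<num_gpus> train.py
    # Model, data, and hyperparameters are stand-ins.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")  # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = torch.nn.Linear(1024, 1024).cuda(local_rank)  # stand-in model
        model = DDP(model, device_ids=[local_rank])
        opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

        for _ in range(100):  # stand-in training loop
            x = torch.randn(32, 1024, device=local_rank)
            loss = model(x).pow(2).mean()
            opt.zero_grad()
            loss.backward()  # DDP all-reduces gradients across ranks here
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()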

Cost Optimization & FinOps

  • Optimize cloud spend across GPU, storage, and inference workloads
  • Reduce LLM and compute costs through intelligent infrastructure design
  • Build dashboards to monitor cost-per-inference and GPU efficiency (a back-of-envelope calculation follows this list)
  • Identify and eliminate underutilized or over-provisioned resources
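
The core metric behind that dashboard work is simple arithmetic. The numbers below are illustrative assumptions, not LockedIn AI's actual pricing or traffic:

    # Back-of-envelope cost-per-inference. All numbers are assumptions.
    GPU_HOURLY_USD = 1.50          # assumed on-demand price per GPU-hour
    REQUESTS_PER_GPU_HOUR = 3600   # assumed sustained throughput (1 req/s)
    UTILIZATION = 0.60             # fraction of provisioned hours doing work

    effective_requests = REQUESTS_PER_GPU_HOUR * UTILIZATION
    cost_per_inference = GPU_HOURLY_USD / effective_requests
    print(f"~${cost_per_inference:.5f} per request")

Raising utilization from 0.60 to 0.90 cuts cost per request by a third, which is why GPU efficiency shows up repeatedly in this role.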

Security & Networking

  • Implement secure cloud networking (VPCs, private endpoints, IAM policies)
  • Protect AI assets including models, embeddings, and training data
  • Ensure encryption, logging, and compliance for AI systems
  • Design secure APIs for real-time AI inference (see the sketch after this list)
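
As one narrow illustration of the last bullet, this FastAPI sketch gates an inference endpoint behind an API key using a constant-time comparison. The header name and key handling are illustrative; a production design would layer IAM or OIDC, rate limiting, and TLS termination in front:

    # Minimal authenticated inference endpoint (FastAPI). Illustrative only.
    import hmac
    import os

    from fastapi import FastAPI, Header, HTTPException

    app = FastAPI()
    API_KEY = os.environ.get("INFERENCE_API_KEY", "")

    @app.post("/v1/infer")
    async def infer(payload: dict, x_api_key: str = Header(default="")):
        # hmac.compare_digest avoids leaking key bytes via timing.
        if not API_KEY or not hmac.compare_digest(x_api_key, API_KEY):
            raise HTTPException(status_code=401, detail="invalid API key")
        # ... forward `payload` to the model server here ...
        return {"status": "ok"}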

Infrastructure Automation & Observability

  • Build Infrastructure as Code using Terraform or similar tools
  • Automate deployment of AI environments and inference endpoints
  • Implement monitoring for GPU health, latency, and system performance (see the NVML sketch after this list)
  • Set up alerting systems for failures, bottlenecks, and anomalies
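
GPU health monitoring often starts with NVIDIA's NVML bindings. A minimal probe, with an arbitrary example alert threshold:

    # Minimal GPU health probe via pynvml (NVIDIA NVML bindings).
    # The 90% utilization threshold is an arbitrary example value.
    import pynvml

    ALERT_UTIL_PCT = 90

    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            temp = pynvml.nvmlDeviceGetTemperature(
                handle, pynvml.NVML_TEMPERATURE_GPU
            )
            print(f"gpu{i}: util={util.gpu}% "
                  f"mem={mem.used / mem.total:.0%} temp={temp}C")
            if util.gpu > ALERT_UTIL_PCT:
                print(f"ALERT: gpu{i} sustained high utilization")  # page here
    finally:
        pynvml.nvmlShutdown()

In practice these readings would be exported to Prometheus or CloudWatch rather than printed.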

Required Qualifications

  • 3+ years of experience in cloud engineering, DevOps, or infrastructure roles
  • Hands-on experience with AI/ML infrastructure or GPU-based workloads
  • Strong knowledge of AWS, GCP, or Azure cloud platforms
  • Experience with Kubernetes and containerized deployments
  • Proficiency in Python, Go, or Bash
  • Experience with Infrastructure as Code (Terraform, Pulumi, or CloudFormation)
  • Understanding of model serving systems and inference optimization
  • Experience working with cross-functional engineering and AI teams

Preferred Qualifications

  • Experience with LLM inference at scale (multi-GPU, low-latency systems)
  • Background in distributed training systems and large model deployment
  • Familiarity with real-time AI or streaming systems
  • Knowledge of RDMA, InfiniBand, or high-performance networking
  • Experience with cost optimization for large-scale cloud AI systems
  • Contributions to open-source AI or cloud infrastructure projects
  • Startup experience (Seed to Series A stage)

What We Offer

  • Equity: Early-stage ownership in a fast-growing AI company
  • Impact: Your work powers systems used by 1M+ users worldwide
  • Flexibility: Remote-first with optional NYC collaboration
  • Growth: Fast-paced environment with rapid learning opportunities
  • Team: Small, high-ownership, AI-native engineering team

Why Join LockedIn AI?

  • You’ll build the infrastructure behind a category-defining AI product
  • Your systems directly power real-time AI used in high-stakes interviews
  • You’ll work at the frontier of cloud + AI systems at massive scale
  • You’ll solve real-world problems where performance, latency, and cost matter every day

How to Apply

Please submit:

  • Resume or CV
  • Short note covering:
    • Why you want to join LockedIn AI
    • Whether you’ve used the product
    • What improvements you would suggest
  • Optional: GitHub, portfolio, or technical work

Equal Opportunity

LockedIn AI is committed to building a diverse and inclusive team. We welcome applicants from all backgrounds and experiences. Hiring decisions are based on merit, skills, and business needs.