LockedIn AI – AI Cloud Engineer


Job Overview

Employment type: National
Experience: 0 to 3 Years
Salary: Not specified
Deadline: 29-May-2026
Location: Manhattan, New York, United States
Required skill set: Engineering - Hardware & Networks

Job Description

About LockedIn AI

LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted by over 1 million users worldwide. We are building the most advanced AI-powered career preparation platform, helping users succeed in interviews, assessments, and professional communication in real time.

Our system powers real-time AI assistance during live interviews, coding rounds, and meetings—helping users communicate with clarity, confidence, and precision when it matters most.


Role Overview

We are looking for a cloud-native AI Cloud Engineer to design, build, and optimize the infrastructure that powers LockedIn AI’s machine learning systems and real-time inference pipelines.

This role sits at the intersection of cloud engineering and AI infrastructure. You will be responsible for the environments where models are trained, fine-tuned, deployed, and served at scale to over 1 million users.

You will build and operate high-performance, GPU-powered cloud systems that ensure low-latency, cost-efficient, and highly reliable AI experiences in production.


Key Responsibilities

AI Cloud Architecture

  • Design scalable cloud infrastructure for AI/ML workloads (training, inference, evaluation)
  • Build GPU-based compute systems optimized for model performance and cost efficiency
  • Architect secure and scalable environments across AWS, GCP, or Azure
  • Implement multi-stage environments for training, staging, and production AI systems (see the Pulumi sketch after this list)
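
To give candidates a feel for the multi-stage environment work, here is a minimal sketch using Pulumi's Python SDK, one of the IaC tools listed under qualifications. The instance types, AMI ID, and tags are illustrative placeholders, not LockedIn AI's actual configuration:

    # Minimal Pulumi sketch (Python): one GPU node sized per stack.
    # Assumes pulumi and pulumi_aws are installed and AWS credentials
    # are configured; the AMI ID is a placeholder, not a real image.
    import pulumi
    import pulumi_aws as aws

    stack = pulumi.get_stack()  # e.g. "dev", "staging", or "prod"

    instance_types = {
        "dev": "g5.xlarge",        # small GPU for experiments
        "staging": "g5.2xlarge",
        "prod": "g5.12xlarge",     # multi-GPU node for live inference
    }

    gpu_node = aws.ec2.Instance(
        f"inference-node-{stack}",
        ami="ami-0123456789abcdef0",  # placeholder deep learning AMI
        instance_type=instance_types.get(stack, "g5.xlarge"),
        tags={"env": stack, "workload": "llm-inference"},
    )

    pulumi.export("instance_id", gpu_node.id)

Each Pulumi stack gets its own sizing, so training experiments never contend with production inference capacity.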

AI Model Serving Infrastructure

  • Build and maintain real-time inference systems serving LLM-based applications
  • Deploy and optimize model serving frameworks (vLLM, Triton, TensorRT, etc.; see the vLLM sketch after this list)
  • Improve latency, throughput, and reliability of AI inference pipelines
  • Design load balancing and failover systems for high-availability AI services
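
As a flavor of the serving work, the sketch below loads a model with vLLM's offline Python API and times a single generation. The model name and prompt are examples only; production serving would typically run vLLM's OpenAI-compatible server behind a load balancer instead:

    # Minimal vLLM sketch: load a model and measure end-to-end latency
    # for one generation. Model name and prompt are placeholders.
    import time
    from vllm import LLM, SamplingParams

    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.2")  # example model
    params = SamplingParams(temperature=0.7, max_tokens=128)

    start = time.perf_counter()
    outputs = llm.generate(["Summarize the candidate's last answer."], params)
    latency_s = time.perf_counter() - start

    print(f"end-to-end latency: {latency_s:.2f}s")
    print(outputs[0].outputs[0].text)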

GPU Compute & Training Systems

  • Manage distributed GPU clusters for model training and fine-tuning
  • Optimize GPU utilization using scheduling, spot instances, and auto-scaling
  • Support large-scale distributed training workflows (see the DDP sketch after this list)
  • Work with managed AI platforms like SageMaker, Vertex AI, or Azure ML
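
To make "distributed training workflows" concrete, here is a minimal PyTorch DistributedDataParallel skeleton of the kind this role would launch and operate on GPU clusters. The model, data, and hyperparameters are stand-ins:

    # Minimal PyTorch DDP skeleton, launched with:
    #   torchrun --nproc_per_node=<num_gpus> train.py
    # Model, data, and hyperparameters are stand-ins.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")  # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = torch.nn.Linear(1024, 1024).cuda(local_rank)  # stand-in model
        model = DDP(model, device_ids=[local_rank])
        opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

        for _ in range(100):  # stand-in training loop
            x = torch.randn(32, 1024, device=local_rank)
            loss = model(x).pow(2).mean()
            opt.zero_grad()
            loss.backward()  # DDP all-reduces gradients across ranks here
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()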

Cost Optimization & FinOps

  • Optimize cloud spend across GPU, storage, and inference workloads
  • Reduce LLM and compute costs through intelligent infrastructure design
  • Build dashboards to monitor cost-per-inference and GPU efficiency (a back-of-envelope calculation follows this list)
  • Identify and eliminate underutilized or over-provisioned resources
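
The core metric behind that dashboard work is simple arithmetic. The numbers below are illustrative assumptions, not LockedIn AI's actual pricing or traffic:

    # Back-of-envelope cost-per-inference. All numbers are assumptions.
    GPU_HOURLY_USD = 1.50          # assumed on-demand price per GPU-hour
    REQUESTS_PER_GPU_HOUR = 3600   # assumed sustained throughput (1 req/s)
    UTILIZATION = 0.60             # fraction of provisioned hours doing work

    effective_requests = REQUESTS_PER_GPU_HOUR * UTILIZATION
    cost_per_inference = GPU_HOURLY_USD / effective_requests
    print(f"~${cost_per_inference:.5f} per request")

Raising utilization from 0.60 to 0.90 cuts cost per request by a third, which is why GPU efficiency shows up repeatedly in this role.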

Security & Networking

  • Implement secure cloud networking (VPCs, private endpoints, IAM policies)
  • Protect AI assets including models, embeddings, and training data
  • Ensure encryption, logging, and compliance for AI systems
  • Design secure APIs for real-time AI inference (see the sketch after this list)
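
As one narrow illustration of the last bullet, this FastAPI sketch gates an inference endpoint behind an API key using a constant-time comparison. The header name and key handling are illustrative; a production design would layer IAM or OIDC, rate limiting, and TLS termination in front:

    # Minimal authenticated inference endpoint (FastAPI). Illustrative only.
    import hmac
    import os

    from fastapi import FastAPI, Header, HTTPException

    app = FastAPI()
    API_KEY = os.environ.get("INFERENCE_API_KEY", "")

    @app.post("/v1/infer")
    async def infer(payload: dict, x_api_key: str = Header(default="")):
        # hmac.compare_digest avoids leaking key bytes via timing.
        if not API_KEY or not hmac.compare_digest(x_api_key, API_KEY):
            raise HTTPException(status_code=401, detail="invalid API key")
        # ... forward `payload` to the model server here ...
        return {"status": "ok"}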

Infrastructure Automation & Observability

  • Build Infrastructure as Code using Terraform or similar tools
  • Automate deployment of AI environments and inference endpoints
  • Implement monitoring for GPU health, latency, and system performance (see the NVML sketch after this list)
  • Set up alerting systems for failures, bottlenecks, and anomalies
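
GPU health monitoring often starts with NVIDIA's NVML bindings. A minimal probe, with an arbitrary example alert threshold:

    # Minimal GPU health probe via pynvml (NVIDIA NVML bindings).
    # The 90% utilization threshold is an arbitrary example value.
    import pynvml

    ALERT_UTIL_PCT = 90

    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            temp = pynvml.nvmlDeviceGetTemperature(
                handle, pynvml.NVML_TEMPERATURE_GPU
            )
            print(f"gpu{i}: util={util.gpu}% "
                  f"mem={mem.used / mem.total:.0%} temp={temp}C")
            if util.gpu > ALERT_UTIL_PCT:
                print(f"ALERT: gpu{i} sustained high utilization")  # page here
    finally:
        pynvml.nvmlShutdown()

In practice these readings would be exported to Prometheus or CloudWatch rather than printed.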

Required Qualifications

  • 3+ years of experience in cloud engineering, DevOps, or infrastructure roles
  • Hands-on experience with AI/ML infrastructure or GPU-based workloads
  • Strong knowledge of AWS, GCP, or Azure cloud platforms
  • Experience with Kubernetes and containerized deployments
  • Proficiency in Python, Go, or Bash
  • Experience with Infrastructure as Code (Terraform, Pulumi, or CloudFormation)
  • Understanding of model serving systems and inference optimization
  • Experience working with cross-functional engineering and AI teams

Preferred Qualifications

  • Experience with LLM inference at scale (multi-GPU, low-latency systems)
  • Background in distributed training systems and large model deployment
  • Familiarity with real-time AI or streaming systems
  • Knowledge of RDMA, InfiniBand, or high-performance networking
  • Experience with cost optimization for large-scale cloud AI systems
  • Contributions to open-source AI or cloud infrastructure projects
  • Startup experience (Seed to Series A stage)

What We Offer

  • Equity: Early-stage ownership in a fast-growing AI company
  • Impact: Your work powers systems used by 1M+ users worldwide
  • Flexibility: Remote-first with optional NYC collaboration
  • Growth: Fast-paced environment with rapid learning opportunities
  • Team: Small, high-ownership, AI-native engineering team

Why Join LockedIn AI?

  • You’ll build the infrastructure behind a category-defining AI product
  • Your systems directly power real-time AI used in high-stakes interviews
  • You’ll work at the frontier of cloud + AI systems at massive scale
  • You’ll solve real-world problems where performance, latency, and cost matter every day

How to Apply

Please submit:

  • Resume or CV
  • Short note covering:
    • Why you want to join LockedIn AI
    • Whether you’ve used the product
    • What improvements you would suggest
  • Optional: GitHub, portfolio, or technical work

Equal Opportunity

LockedIn AI is committed to building a diverse and inclusive team. We welcome applicants from all backgrounds and experiences. Hiring decisions are based on merit, skills, and business needs.