Serverless AI Model Execution on AWS Fargate Spot
Moved an AI/ML compute workload from an always-on server to an on-demand setup using ECS Fargate Spot, so compute runs only when a job is submitted.
A FastAPI API records each job in DynamoDB, launches a short-lived Fargate task, and tracks progress until completion. Results are written to S3, and the API returns a download link when ready.
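The submit path above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the cluster name, task definition, container name, subnet, and DynamoDB field names are all placeholder assumptions.

```python
import time
import uuid


def new_job_record(job_id: str, params: dict) -> dict:
    """Build the DynamoDB item that tracks one job (field names are illustrative)."""
    return {
        "job_id": job_id,       # partition key
        "status": "SUBMITTED",  # SUBMITTED -> RUNNING -> COMPLETED | FAILED
        "params": params,
        "created_at": int(time.time()),
    }


def submit_job(ddb_table, ecs_client, params: dict) -> str:
    """Record the job, then launch a one-off Fargate Spot task to process it."""
    job_id = str(uuid.uuid4())
    ddb_table.put_item(Item=new_job_record(job_id, params))
    ecs_client.run_task(
        cluster="ml-jobs",               # assumed cluster name
        taskDefinition="ml-model-task",  # assumed task definition family
        capacityProviderStrategy=[{"capacityProvider": "FARGATE_SPOT", "weight": 1}],
        networkConfiguration={"awsvpcConfiguration": {
            "subnets": ["subnet-placeholder"],
            "assignPublicIp": "ENABLED",
        }},
        # Pass the job ID into the container so it knows which job to run
        overrides={"containerOverrides": [{
            "name": "model",
            "environment": [{"name": "JOB_ID", "value": job_id}],
        }]},
    )
    return job_id
```

Writing the DynamoDB record before calling `run_task` means a job is never running without a tracking row, even if the launch fails.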
The container outputs a small completion payload to CloudWatch Logs, which the API reads after the task ends — keeping the system simple and loosely coupled.
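Reading the completion payload back out of the logs might look like the sketch below. The `JOB_RESULT` sentinel and the payload fields are assumptions; the real container may use a different marker or log format.

```python
import json
from typing import Optional

COMPLETION_MARKER = "JOB_RESULT"  # assumed sentinel the container prints before its JSON payload


def find_completion_payload(log_messages: list) -> Optional[dict]:
    """Scan task log lines (newest last) for the JSON completion signal."""
    for message in reversed(log_messages):
        if message.startswith(COMPLETION_MARKER):
            return json.loads(message[len(COMPLETION_MARKER):].strip())
    return None


def read_task_result(logs_client, log_group: str, log_stream: str) -> Optional[dict]:
    """Fetch the stopped task's log stream and extract its completion payload."""
    resp = logs_client.get_log_events(
        logGroupName=log_group,
        logStreamName=log_stream,
        startFromHead=True,
    )
    return find_completion_payload([e["message"] for e in resp["events"]])
```

Because the API only reads logs after the task exits, neither side needs a queue, callback URL, or shared socket, which is what keeps the coupling loose.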
This change reduced compute costs by roughly 70% compared to on-demand Fargate (and far more compared to an always-on server), while keeping the user experience intact.
What this covers
Always-On to On-Demand
Replaced an idle-heavy server with per-job Fargate Spot tasks that start, run, and terminate automatically.
Containerized ML Model
Packaged the model + dependencies into a single Docker image so it runs the same locally and in Fargate.
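A representative Dockerfile for this kind of setup is sketched below; the base image, file layout, and entrypoint script name are assumptions, not the project's actual build.

```dockerfile
# Illustrative Dockerfile -- base image, paths, and entrypoint are assumptions
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Bake model weights into the image so the task needs no startup download
COPY model/ ./model/
COPY run_job.py .

# The task reads JOB_ID from its environment (set via ECS container overrides)
ENTRYPOINT ["python", "run_job.py"]
```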
Async Job Orchestration
API tracks job states in DynamoDB and manages task lifecycle (start → poll → complete/fail) with timeouts.
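The poll step can be isolated into a small loop like the one below. It is written against a generic status callable rather than ECS directly (in practice the status would come from `ecs_client.describe_tasks`, whose tasks carry a `lastStatus` field); the timeout and interval values are illustrative.

```python
import time
from typing import Callable


def poll_until_stopped(
    get_status: Callable[[], str],
    timeout_s: float = 900.0,
    interval_s: float = 5.0,
    sleep=time.sleep,  # injectable for testing
) -> str:
    """Poll a task's status until it reaches a terminal state or the timeout elapses.

    In the real system get_status would wrap ecs_client.describe_tasks and
    return the task's lastStatus (e.g. PROVISIONING, RUNNING, STOPPED).
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status in ("STOPPED", "FAILED"):
            return status
        sleep(interval_s)
    return "TIMEOUT"
```

Returning an explicit `"TIMEOUT"` state lets the API mark the DynamoDB record failed instead of polling a stuck task forever.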
CloudWatch as Communication Channel
Task writes outputs to S3 and logs a small JSON completion signal; API reads it after the task finishes.
S3 Result Pipeline
Results stored in S3 and delivered via presigned URLs; avoids pushing large files through the API server.
Full-Stack Job Lifecycle
Frontend polls job status and shows progress + results; users can download or clean up completed jobs.
ECS Task Definition as Infrastructure
Task resources, logging, and networking are codified and versioned so runs are repeatable and auditable.
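A trimmed task-definition fragment in the shape ECS expects is shown below; the family name, sizes, image URI, and role ARN are placeholders, not the project's real values.

```json
{
  "family": "ml-model-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "2048",
  "memory": "8192",
  "executionRoleArn": "arn:aws:iam::<account>:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "model",
      "image": "<account>.dkr.ecr.us-east-1.amazonaws.com/ml-model:latest",
      "essential": true,
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/ml-model-task",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "job"
        }
      }
    }
  ]
}
```

Keeping this JSON in version control means every run is traceable to an exact CPU/memory/logging configuration.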