In this step, we’ll build a translation microservice using FastAPI, containerize it using Docker, and deploy it on Kubernetes for scalability. The translation service will use a Hugging Face model (like MarianMT) to perform real-time translations.
Set Up the Environment:
Install the fastapi and transformers libraries to work with the Hugging Face MarianMT model. You will also need torch for model inference and uvicorn to serve the app:
pip install fastapi transformers torch uvicorn
Implement the Translation Service in Python:
The following FastAPI service uses MarianMT from Hugging Face for translation.
Service Code:
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from transformers import MarianMTModel, MarianTokenizer

# Initialize the FastAPI app
app = FastAPI()

# Load the MarianMT model and tokenizer
model_name = 'Helsinki-NLP/opus-mt-es-en'  # Spanish to English model, adjust as needed
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Define a request model for structured API requests
class TranslationRequest(BaseModel):
    text: str
    source_language: str
    target_language: str

# Define the translation endpoint
@app.post("/translate")
def translate(request: TranslationRequest):
    try:
        # Prepare input for the model
        inputs = tokenizer(request.text, return_tensors="pt", truncation=True)
        # Perform translation
        translated_tokens = model.generate(**inputs)
        translated_text = tokenizer.decode(translated_tokens[0], skip_special_tokens=True)
        return {"translated_text": translated_text}
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Translation error: {str(e)}")
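The service above hard-codes one language pair, so the source_language and target_language fields are accepted but not used. If you want the request fields to actually select a model, one hedged sketch is to derive the MarianMT checkpoint name from the pair and cache loaded models (the helpers model_name_for and load_model are illustrative names, not part of the original service):

```python
from functools import lru_cache

def model_name_for(source_language: str, target_language: str) -> str:
    """Map a language pair to a Helsinki-NLP MarianMT checkpoint name.

    Note: not every pair exists on the Hugging Face Hub; an unknown
    pair will fail at download time, not here.
    """
    return f"Helsinki-NLP/opus-mt-{source_language}-{target_language}"

@lru_cache(maxsize=4)
def load_model(source_language: str, target_language: str):
    """Load (and cache) the tokenizer and model for a language pair."""
    # Imported lazily so model_name_for stays dependency-free
    from transformers import MarianMTModel, MarianTokenizer
    name = model_name_for(source_language, target_language)
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)
```

Inside the endpoint you would then call load_model(request.source_language, request.target_language) instead of using the module-level model; the lru_cache keeps a few recently used pairs in memory rather than reloading on every request.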
Explanation:
MarianMTModel and MarianTokenizer are loaded from Hugging Face, specifically for Spanish-to-English translation in this example. Change model_name to target other languages.
The /translate endpoint accepts a POST request with the fields text, source_language, and target_language. It tokenizes the input text, performs translation, and returns the translated text. Note that with a single hard-coded model, source_language and target_language are accepted but not used; the loaded model fixes the language pair.
Test the Service Locally:
Run the FastAPI app to test locally:
uvicorn translation_service:app --host 0.0.0.0 --port 8000
Send a POST request to test:
curl -X POST "http://localhost:8000/translate" -H "Content-Type: application/json" -d '{"text": "Hola, ¿cómo estás?", "source_language": "es", "target_language": "en"}'
This should return the translated text: {"translated_text": "Hello, how are you?"}.
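If you prefer testing from Python instead of curl, here is a small standard-library client sketch. The helper names build_translate_request and translate_via_service are illustrative, and it assumes the service is listening on localhost:8000:

```python
import json
from urllib.request import Request, urlopen

def build_translate_request(base_url: str, text: str, source: str, target: str) -> Request:
    """Build a POST request matching the /translate endpoint's schema."""
    payload = json.dumps({
        "text": text,
        "source_language": source,
        "target_language": target,
    }).encode("utf-8")
    return Request(
        f"{base_url}/translate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def translate_via_service(base_url: str, text: str, source: str, target: str) -> dict:
    """Send the request and decode the JSON response (service must be running)."""
    with urlopen(build_translate_request(base_url, text, source, target)) as resp:
        return json.load(resp)
```

With the service running, translate_via_service("http://localhost:8000", "Hola, ¿cómo estás?", "es", "en") returns the same JSON body as the curl command above.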
Create a Dockerfile:
The Dockerfile describes how to build the container for the FastAPI service.
Dockerfile:
# Use a FastAPI-compatible base image
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8
# Set the working directory
WORKDIR /app
# Copy and install requirements
COPY ./requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
# Copy the application code
COPY . /app
# Start the service (the base image defaults to looking for main.py,
# so point it explicitly at translation_service.py)
CMD ["uvicorn", "translation_service:app", "--host", "0.0.0.0", "--port", "8000"]
Create requirements.txt for Python Dependencies:
requirements.txt:
fastapi
transformers
torch
uvicorn
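Unpinned dependencies can drift between your local environment and the container build. A safer requirements.txt pins versions; the exact pins below are illustrative, so pin whatever versions you actually tested with. Note also that MarianTokenizer depends on the sentencepiece package, which transformers does not install by default:

```
fastapi==0.110.0
transformers==4.40.0
torch==2.2.2
uvicorn==0.29.0
sentencepiece==0.2.0
```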
Build and Test the Docker Image Locally:
Build the Docker image:
docker build -t translation_service .
Run the container to ensure it works as expected:
docker run -p 8000:8000 translation_service
Access the translation service at http://localhost:8000/translate.
Push the Docker Image to a Container Registry:
If you are using Docker Hub, tag and push the image:
docker tag translation_service your_dockerhub_username/translation_service:latest
docker push your_dockerhub_username/translation_service:latest
Replace your_dockerhub_username with your actual Docker Hub username.
Set Up Kubernetes Deployment and Service Files:
Kubernetes uses YAML files to define resources. Create two YAML files: one for the deployment and one for the service.
Deployment YAML (deployment.yaml):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: translation-service
spec:
  replicas: 3  # Number of pods
  selector:
    matchLabels:
      app: translation-service
  template:
    metadata:
      labels:
        app: translation-service
    spec:
      containers:
        - name: translation-service
          image: your_dockerhub_username/translation_service:latest
          ports:
            - containerPort: 8000
          resources:
            limits:
              memory: "512Mi"
              cpu: "500m"
Service YAML (service.yaml):
apiVersion: v1
kind: Service
metadata:
  name: translation-service
spec:
  type: LoadBalancer
  selector:
    app: translation-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8000
Explanation:
The Deployment runs three replicas of the container, each limited to 512Mi of memory and 500m of CPU. A loaded MarianMT model plus torch can exceed 512Mi, so raise the limits if pods are OOM-killed. The Service exposes the pods behind a LoadBalancer, mapping external port 80 to container port 8000.
Apply the YAML Files to Deploy on Kubernetes:
Run the following commands to deploy on your Kubernetes cluster:
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
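The fixed replicas: 3 in the Deployment can instead be adjusted automatically under load. A hedged sketch of a HorizontalPodAutoscaler follows; the file name hpa.yaml is arbitrary, the thresholds are illustrative, and it requires the Kubernetes metrics server to be running in the cluster. (Because the Deployment sets CPU limits without explicit requests, Kubernetes defaults requests to the limits, which the utilization target below is measured against.)

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: translation-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: translation-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

Apply it with kubectl apply -f hpa.yaml, then watch scaling decisions with kubectl get hpa translation-service.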
Verify the Deployment and Service:
Check if the pods are running:
kubectl get pods
Check the external IP for the translation service:
kubectl get service translation-service
Once you have the external IP, you can access the translation service via http://<EXTERNAL_IP>/translate.