Models
Guide to supported models and their requirements
Supported Models
Currently, Magemaker supports deployment of Hugging Face models only. Support for cloud provider marketplace models is coming soon!
Hugging Face Models
Future Support
We plan to add support for the following model sources:
AWS SageMaker
Models from AWS Marketplace and SageMaker built-in algorithms
GCP Vertex AI
Models from Vertex AI Model Garden and Foundation Models
Azure ML
Models from Azure ML Model Catalog and Azure OpenAI
Model Requirements
Instance Type Recommendations by Cloud Provider
AWS SageMaker
- Small Models (ml.m5.xlarge)
- Medium Models (ml.g4dn.xlarge)
- Large Models (ml.g5.12xlarge)
GCP Vertex AI
- Small Models (n1-standard-4)
- Medium Models (n1-standard-8 + GPU)
- Large Models (a2-highgpu-1g)
Azure ML
- Small Models (Standard_DS3_v2)
- Medium Models (Standard_NC6s_v3)
- Large Models (Standard_ND40rs_v2)
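The recommendations above can be kept as a small lookup table for scripting deployments. The following is a minimal sketch, not part of Magemaker: the function name `recommend_instance` and the `small`/`medium`/`large` keys are assumptions made for illustration, and the instance names are copied from the lists above.

```python
# Instance recommendations copied from the lists above.
# The provider and size keys are hypothetical labels chosen for this sketch.
INSTANCE_RECOMMENDATIONS = {
    "aws": {
        "small": "ml.m5.xlarge",
        "medium": "ml.g4dn.xlarge",
        "large": "ml.g5.12xlarge",
    },
    "gcp": {
        "small": "n1-standard-4",
        # The source lists this as a machine type plus an attached GPU,
        # so it is a description rather than a literal machine type name.
        "medium": "n1-standard-8 + GPU",
        "large": "a2-highgpu-1g",
    },
    "azure": {
        "small": "Standard_DS3_v2",
        "medium": "Standard_NC6s_v3",
        "large": "Standard_ND40rs_v2",
    },
}


def recommend_instance(provider: str, size: str) -> str:
    """Return the recommended instance type for a provider and model size."""
    provider_key = provider.strip().lower()
    size_key = size.strip().lower()
    if provider_key not in INSTANCE_RECOMMENDATIONS:
        raise ValueError(f"Unknown provider: {provider!r}")
    sizes = INSTANCE_RECOMMENDATIONS[provider_key]
    if size_key not in sizes:
        raise ValueError(f"Unknown model size: {size!r}")
    return sizes[size_key]


if __name__ == "__main__":
    for name in INSTANCE_RECOMMENDATIONS:
        print(name, recommend_instance(name, "small"))
```

Treat the table as a starting point only; actual sizing depends on the model's memory footprint and on the quota available in your account.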
Example Deployments
Example Hugging Face Model Deployment
Deploy the same Hugging Face model to different cloud providers:
AWS SageMaker:
GCP Vertex AI:
Azure ML:
Model IDs on Azure differ from those used on AWS and GCP. Make sure to use the ID listed in the Azure Model Catalog.
To find the relevant model ID, follow these steps:
Go to your workspace studio
Find the workspace in the Azure portal and click the studio URL provided. Then click Model Catalog in the left sidebar.
Select Hugging Face in the Collections list
Select Hugging Face from the Collections list. The ID shown on the model card is the ID you need to use in the YAML file.
Model Configuration
Basic Parameters
Advanced Parameters
Best Practices
- Model Selection
  - Compare pricing across cloud providers
  - Consider data residency requirements
  - Test latency from different regions
- Cost Management
  - Compare instance pricing
  - Set up the relevant cost alerts
Troubleshooting
Common model-related issues:
- Cloud-Specific Issues
  - Check quota limits
  - Verify regional availability
  - Review cloud-specific logs
- Performance Issues
  - Compare cross-cloud latencies
  - Check network connectivity
  - Monitor resource utilization
- Authentication Issues
  - Verify cloud credentials
  - Check model access permissions
  - Validate API keys
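When comparing cross-cloud latencies, it helps to time the same request against each endpoint and look at the median and tail rather than a single call. The sketch below is provider-agnostic and uses only the Python standard library. The names `measure_latency` and `compare_endpoints` are hypothetical, and the callables are stand-ins that you would replace with your own code for invoking each deployed endpoint.

```python
import math
import statistics
import time
from typing import Callable, Dict


def measure_latency(call: Callable[[], object], trials: int = 20) -> Dict[str, float]:
    """Time `call` repeatedly and summarize wall-clock latency in seconds."""
    if trials < 1:
        raise ValueError("trials must be at least 1")
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    samples.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "median_s": statistics.median(samples),
        "p95_s": samples[p95_index],
        "max_s": samples[-1],
    }


def compare_endpoints(
    calls: Dict[str, Callable[[], object]], trials: int = 20
) -> Dict[str, Dict[str, float]]:
    """Run the same measurement for each named endpoint callable."""
    return {name: measure_latency(call, trials) for name, call in calls.items()}


if __name__ == "__main__":
    # Stand-in callables; replace each with a real request to your endpoint.
    fake_calls = {
        "aws": lambda: time.sleep(0.01),
        "gcp": lambda: time.sleep(0.02),
    }
    for name, stats in compare_endpoints(fake_calls, trials=5).items():
        print(name, {key: round(value, 4) for key, value in stats.items()})
```

Run the comparison from the region where your clients are located, since network distance often accounts for much of the difference between providers.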