Deploying a Fine-Tuned Model on Chutes AI
A step-by-step guide to deploying a fine-tuned model on Chutes AI, Bittensor's subnet 64, using the chutes SDK and CLI.
Abdullah Muhammad
Published on May 17, 2026 • 5 min read • 7 views
Introduction
In past articles, we covered Bittensor and the many intricate components that make up its network. We also explored subnet 64, one of the most prominent subnets on Bittensor.
In this short article, we will cover deploying a fine-tuned model on Chutes AI (subnet 64).
If you are a developer with extensive Python experience, this will be easy to follow. That said, I wrote this short article as a guide for readers with limited Python experience, so you too can deploy a fine-tuned model on Chutes AI.
Bittensor Subnets and Subnet 64: Chutes AI
This section will serve as a refresher on Bittensor and Chutes AI. Bittensor is essentially a decentralized AI network/economy which allows users and developers alike to offer incentivized AI services.
Each subnet on Bittensor offers its own unique AI service with a unique economy of its own (alpha tokens).
Each subnet is numbered and Chutes AI is a prominent subnet on Bittensor (subnet 64).
Chutes AI lets users deploy models of their own and makes those models accessible to others via decentralized GPU infrastructure.
Rather than relying on a single provider to serve these models, miners supply the compute and host the models for inference via API endpoints exposed through the Chutes API.
In fact, we previously covered Chutes AI model inference using the Vercel AI SDK, which offers a package built specifically for working with the Chutes API provider.
Step-by-Step Guide to Model Deployment
Now we will proceed to creating and deploying a fine-tuned model. Keep in mind that Chutes AI acts as a provider through which you access these models.
You can follow along by cloning this GitHub repository. The directory we will focus on is demos/Demo78_Chutes_AI_Model_Deployment.
Model inference with Chutes AI relies on miners (those who provide the compute and serve inference), who are evaluated by validators based on their performance and rewarded accordingly (we touched on this key incentive mechanism when covering Bittensor).
However, from a developer's perspective, these blockchain mechanics are largely abstracted away, making the deployment workflow feel similar to working with a traditional AI inference provider.
Ensure you have a Python environment set up for this article, including the pip package manager, as we will need to install the chutes Python package:
```bash
pip install chutes
```

After that, we will register using the chutes CLI:
```bash
chutes register
```

You need to register yourself first before you can deploy any model to Chutes AI. Once registration is complete, we can proceed to model deployment.
The following Python script, /codebase/finetuned_chutes_model.py in the repository, details how you can deploy a fine-tuned model on Chutes:
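Since the full deployment script lives in the repository, here is a minimal sketch of what it can look like, based on the parameters discussed in the notes below. The import paths follow the chutes SDK's vLLM template, but verify them against the version you have installed; the username and model name values are placeholders.

```python
# Minimal sketch of a Chutes deployment script (see the repository for the
# full version). Import paths follow the chutes SDK's vLLM template; verify
# them against your installed version. The username and model_name values
# are placeholders to replace with your own.
from chutes.chute import NodeSelector
from chutes.chute.template.vllm import build_vllm_chute

chute = build_vllm_chute(
    username="your-chutes-username",  # ties the chute to your Chutes AI account
    model_name="your-hf-username/your-finetuned-model",  # HuggingFace repo path
    node_selector=NodeSelector(gpu_count=1),  # provision a single GPU node
    concurrency=4,  # max simultaneous inference requests
    readme="A fine-tuned model deployed on Chutes AI.",  # shown on the platform
)
```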
Key things to note:
- NodeSelector: a configuration class from the Chutes SDK that defines the hardware requirements for your deployment. In this case, gpu_count=1 tells the Chutes platform to provision a single GPU node to run the model.
- build_vllm_chute: the core builder function from the Chutes SDK's vLLM template. vLLM is a high-throughput inference engine optimized for LLMs, and build_vllm_chute wraps it into a deployable "chute".
- Username and model name parameters: these identify who owns the deployment and which model to serve. The username ties the chute to your Chutes AI account, and model_name points to the HuggingFace model repository path.
- Concurrency parameter: controls how many simultaneous inference requests the deployed chute can handle. Higher concurrency means more parallel requests, but also more GPU memory consumption.
- Readme parameter: a markdown string that documents the chute on the Chutes AI platform.
You can deploy this as a chute using the following bash command:
```bash
chutes deploy finetuned_chute:chute --accept-fee
```

That is all there is to it. Chutes AI is a neat subnet that is primarily used for deploying models.
The fine-tuning happens elsewhere, but what Chutes allows you to do is deploy these models ("chutes") so they are accessible to anyone through the Chutes AI provider.
Running Calls to the Deployed Chutes Model
Recall that the Chutes SDK is primarily used for deploying and serving models ("chutes"), whereas the Chutes API is used for running inference against those models.
The following bash script details how you can run inference against your deployed models using the Chutes API:
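Here is a minimal sketch of such a call, assuming Chutes' OpenAI-compatible chat completions endpoint; the endpoint URL, model identifier, and API key below are placeholders to adapt to your own deployment.

```bash
# Minimal sketch of an inference call against a deployed chute, assuming
# Chutes' OpenAI-compatible chat completions endpoint. The endpoint URL,
# model identifier, and CHUTES_API_KEY are placeholders for your deployment.
curl https://llm.chutes.ai/v1/chat/completions \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-hf-username/your-finetuned-model",
        "messages": [{"role": "user", "content": "Hello from my fine-tuned model!"}],
        "max_tokens": 256
      }'
```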
Of course, you can also use the Vercel AI SDK along with the Chutes AI provider for more advanced inference workflows with these models.
We covered that in a separate article here.
Conclusion
In this short article, we covered how easy it is to deploy a fine-tuned model on Chutes AI.
You do not need to be a Python expert to do this; the codebase provided in this article is enough to get you started with the deployment process.
In the list below, you will find links to the GitHub repository (used in this article), the Chutes AI documentation/provider package, and the Bittensor docs:
I hope you found this article helpful and look forward to more in the future.
Thank you!