Welcome to the Hugging Face track! Here we provide you with Hugging Face credits.
You can spend them through two main services:
Hugging Face Jobs — On-demand GPU compute for running workloads directly on Hugging Face infrastructure. You submit a job (via the hf CLI, the Python client, or the HTTP API), it runs on hardware ranging from CPUs to A100s, and your credits are spent only for the seconds the instance is used. This is where your training runs, data processing, and heavy compute happen.
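Per-second billing is easy to sanity-check before launching a run. A minimal sketch, assuming illustrative flavor names and hourly rates (these are placeholders, not official Hugging Face pricing):

```python
# Back-of-the-envelope cost check for a Job.
# Flavor names and hourly rates below are ASSUMED for illustration,
# not official Hugging Face pricing.
HOURLY_RATE_USD = {
    "cpu-basic": 0.05,   # assumed rate
    "a10g-small": 1.00,  # assumed rate
    "a100-large": 4.00,  # assumed rate
}

def job_cost(flavor: str, seconds: int) -> float:
    """Per-second billing: the used fraction of an hour times the hourly rate."""
    return round(HOURLY_RATE_USD[flavor] * seconds / 3600, 4)

print(job_cost("a10g-small", 900))  # 15 minutes on an A10G → 0.25
```

The point is that a short job on a big GPU can cost less than a long job on a small one, so it pays to estimate before picking hardware.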
Inference Providers — Serverless API access to hundreds of models hosted by partners like Cerebras, Groq and more. You call a model through a single consistent API and pay per request. This is useful for generating synthetic data or any task where you need a large hosted model.
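The "single consistent API" is OpenAI-compatible chat completions. A stdlib-only sketch of building one request against the Inference Providers router (the model name is an example; `HF_TOKEN` must hold a valid token for the call to actually succeed):

```python
# Sketch: one chat-completion request through the Inference Providers
# OpenAI-compatible router. Stdlib only; the model name is an example.
import json
import os
import urllib.request

API_URL = "https://router.huggingface.co/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Assemble the POST request; sending it (urllopen) is left to the caller."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("meta-llama/Llama-3.1-8B-Instruct", "Say hi in five words.")
print(req.full_url)
```

The same request shape works across providers; you switch models (and hence providers) by changing the `model` string, not the calling code.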
Docs: HF Jobs · Inference Providers · Zero GPU Spaces · Spaces GPU Hardware
Fine-tuning is the most direct use of your credits. Use TRL (Transformer Reinforcement Learning) to fine-tune models. TRL ships ready-made training scripts for alignment methods such as Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), which you can run directly on Hugging Face Jobs infrastructure.
# Example: SFT via the TRL CLI, launched as a Job
hf jobs run --image trl-latest --command \
"trl sft --model_name Qwen/Qwen3-0.6B --dataset_name trl-lib/Capybara --output_dir ./output"
TRL docs · Training via Jobs guide
You can just do things. Do all of the above with text prompts using agent skills. Install the Hugging Face Agent Skills in your coding agent (Claude Code, Cursor, Codex, Gemini CLI) and describe what you want in natural language. Skills like hf-trainer, hf-jobs, and hf-datasets guide the agent through setting up and launching your full training pipeline, from dataset prep to fine-tuning.
# In Claude Code
/plugin marketplace add huggingface/skills
/plugin install hf-trainer@huggingface/skills
/plugin install hf-jobs@huggingface/skills
Agent Skills docs · Skills repo
Use Inference Providers to call powerful LLMs and VLMs to generate training data: instruction-response pairs, image captions, or whatever your task needs. You can also run synthetic data generation as a Job, using vLLM for fast inference on the job's GPUs.
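Whatever model produces the responses, the output you want is training-ready records. A minimal sketch of packaging instruction-response pairs as JSONL in the conversational format TRL's SFT script accepts (the actual model call is elided; the placeholder response below stands in for generated text):

```python
# Sketch: packaging synthetic instruction-response pairs as JSONL
# training data. The model call itself is elided; the response string
# here is a placeholder for generated text.
import json

def to_record(instruction: str, response: str) -> dict:
    """One chat-format training example (TRL-style conversational data)."""
    return {"messages": [
        {"role": "user", "content": instruction},
        {"role": "assistant", "content": response},
    ]}

def write_jsonl(records: list[dict], path: str) -> None:
    """One JSON object per line, the usual format for HF dataset uploads."""
    with open(path, "w") as f:
        for rec in records:
            f.write(json.dumps(rec) + "\n")

rec = to_record("Summarize GPUs in one line.",
                "GPUs run many simple operations in parallel.")
print(rec["messages"][0]["role"])  # → user
```

Push the resulting file to a dataset repo on the Hub and it plugs straight into the SFT example above.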