Changelog

2024-09-26

Workers AI Birthday Week 2024 announcements

Meta Llama 3.2 1B, 3B, and 11B vision is now available on Workers AI
@cf/black-forest-labs/flux-1-schnell is now available on Workers AI
Workers AI is fast! Powered by new GPUs and optimizations, you can expect faster inference on Llama 3.1, Llama 3.2, and FLUX models.
No more neurons. Workers AI is moving towards unit-based pricing
Model pages get a refresh with better documentation on parameters, pricing, and model capabilities
Closed beta for our Run Any* Model feature, sign up here
Check out the product announcements blog post for more information
And the technical blog post if you want to learn about how we made Workers AI fast

Meta Llama 3.1 now available on Workers AI

Workers AI now suppoorts Meta Llama 3.1.

New community-contributed tutorial

Introducing embedded function calling

Added support for traditional function calling

Native support for AI Gateways

Workers AI now natively supports AI Gateway.

Deprecation announcement for `@cf/meta/llama-2-7b-chat-int8`

We will be deprecating @cf/meta/llama-2-7b-chat-int8 on 2024-06-30.

Replace the model ID in your code with a new model of your choice:

@cf/meta/llama-3-8b-instruct is the newest model in the Llama family (and is currently free for a limited time on Workers AI).
@cf/meta/llama-3-8b-instruct-awq is the new Llama 3 in a similar precision to your currently selected model. This model is also currently free for a limited time.

If you do not switch to a different model by June 30th, we will automatically start returning inference from @cf/meta/llama-3-8b-instruct-awq.

Add new public LoRAs and note on LoRA routing

Added documentation on new public LoRAs.
Noted that you can now run LoRA inference with the base model rather than explicitly calling the -lora version

Add OpenAI compatible API endpoints

Added OpenAI compatible API endpoints for /v1/chat/completions and /v1/embeddings. For more details, refer to Configurations.

Add AI native binding

Added new AI native binding, you can now run models with const resp = await env.AI.run(modelName, inputs)
Deprecated @cloudflare/ai npm package. While existing solutions using the @cloudflare/ai package will continue to work, no new Workers AI features will be supported. Moving to native AI bindings is highly recommended