We’re excited to announce that Amazon Bedrock Custom Model Import now helps Qwen fashions. You’ll be able to now import customized weights for Qwen2, Qwen2_VL, and Qwen2_5_VL architectures, together with fashions like Qwen 2, 2.5 Coder, Qwen 2.5 VL, and QwQ 32B. You’ll be able to convey your personal custom designed Qwen fashions into Amazon Bedrock and deploy them in a completely controlled, serverless setting—with no need to control infrastructure or style serving.
On this put up, we quilt how you can deploy Qwen 2.5 fashions with Amazon Bedrock Customized Fashion Import, making them available to organizations taking a look to make use of cutting-edge AI functions throughout the AWS infrastructure at an efficient charge.
Review of Qwen fashions
Qwen 2 and a pair of.5 are households of huge language fashions, to be had in quite a lot of sizes and specialised variants to fit numerous wishes:
- Common language fashions: Fashions starting from 0.5B to 72B parameters, with each base and instruct variations for general-purpose duties
- Qwen 2.5-Coder: Specialised for code era and final touch
- Qwen 2.5-Math: All in favour of complex mathematical reasoning
- Qwen 2.5-VL (vision-language): Symbol and video processing functions, enabling multimodal packages
Review of Amazon Bedrock Customized Fashion Import
Amazon Bedrock Customized Fashion Import permits the import and use of your custom designed fashions along present basis fashions (FMs) via a unmarried serverless, unified API. You’ll be able to get right of entry to your imported customized fashions on-demand and with out the want to set up the underlying infrastructure. Boost up your generative AI utility construction by way of integrating your supported customized fashions with local Amazon Bedrock gear and lines like Amazon Bedrock Wisdom Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Brokers. Amazon Bedrock Customized Fashion Import is in most cases to be had within the US-East (N. Virginia), US-West (Oregon), and Europe (Frankfurt) AWS Areas. Now, we’ll discover how you’ll use Qwen 2.5 fashions for 2 not unusual use instances: as a coding assistant and for symbol working out. Qwen2.5-Coder is a cutting-edge code style, matching functions of proprietary fashions like GPT-4o. It helps over 90 programming languages and excels at code era, debugging, and reasoning. Qwen 2.5-VL brings complex multimodal functions. Consistent with Qwen, Qwen 2.5-VL isn’t just gifted at spotting items comparable to plants and animals, but in addition at examining charts, extracting textual content from photographs, decoding record layouts, and processing lengthy movies.
Necessities
Earlier than uploading the Qwen style with Amazon Bedrock Customized Fashion Import, just remember to have the next in position:
- An energetic AWS account
- An Amazon Simple Storage Service (Amazon S3) bucket to retailer the Qwen style recordsdata
- Sufficient permissions to create Amazon Bedrock style import jobs
- Verified that your Region supports Amazon Bedrock Custom Model Import
Use case 1: Qwen coding assistant
On this instance, we will be able to exhibit how you can construct a coding assistant the usage of the Qwen2.5-Coder-7B-Instruct style
- Move to to Hugging Face and seek for and replica the Fashion ID Qwen/Qwen2.5-Coder-7B-Instruct:
You are going to use Qwen/Qwen2.5-Coder-7B-Instruct
for the remainder of the walkthrough. We don’t exhibit fine-tuning steps, however you’ll additionally fine-tune prior to uploading.
- Use the next command to obtain a snapshot of the style in the community. The Python library for Hugging Face supplies a software referred to as snapshot obtain for this:
Relying to your style measurement, this is able to take a couple of mins. When finished, your Qwen Coder 7B style folder will comprise the next recordsdata.
- Configuration recordsdata: Together with
config.json
,generation_config.json
,tokenizer_config.json
,tokenizer.json
, andvocab.json
- Fashion recordsdata: 4
safetensor
recordsdata andstyle.safetensors.index.json
- Documentation:
LICENSE
,README.md
, andmerges.txt
- Add the style to Amazon S3, the usage of
boto3
or the command line:
aws s3 cp ./extractedfolder s3://yourbucket/trail/ --recursive
- Get started the import style activity the usage of the next API name:
You’ll be able to additionally do that the usage of the AWS Control Console for Amazon Bedrock.
- Within the Amazon Bedrock console, select Imported fashions within the navigation pane.
- Select Import a style.
- Input the main points, together with a Fashion title, Import activity title, and style S3 location.
- Create a brand new provider position or use an present provider position. Then select Import style
- After you select Import at the console, you must see standing as uploading when style is being imported:
In case you’re the usage of your personal position, you’ll want to upload the next believe dating as describes in Create a service role for model import.
After your style is imported, watch for style inference to be waiting, after which chat with the style at the playground or throughout the API. Within the following instance, we append Python
to suggested the style to without delay output Python code to record pieces in an S3 bucket. Keep in mind to make use of the best chat template to enter activates within the structure required. For instance, you’ll get the best chat template for any suitable style on Hugging Face the usage of beneath code:
Word that after the usage of the invoke_model
APIs, you will have to use the total Amazon Useful resource Identify (ARN) for the imported style. You’ll be able to in finding the Fashion ARN within the Bedrock console, by way of navigating to the Imported fashions segment after which viewing the Fashion main points web page, as proven within the following determine
After the style is waiting for inference, you’ll use Chat Playground in Bedrock console or APIs to invoke the style.
Use case 2: Qwen 2.5 VL symbol working out
Qwen2.5-VL-* gives multimodal functions, combining imaginative and prescient and language working out in one style. This segment demonstrates how you can deploy Qwen2.5-VL the usage of Amazon Bedrock Customized Fashion Import and check its symbol working out functions.
Import Qwen2.5-VL-7B to Amazon Bedrock
Obtain the style from Huggingface Face and add it to Amazon S3:
Subsequent, import the style to Amazon Bedrock (both by way of Console or API):
Take a look at the imaginative and prescient functions
After the import is entire, check the style with a picture enter. The Qwen2.5-VL-* style calls for correct formatting of multimodal inputs:
When supplied with an instance symbol of a cat (such the next symbol), the style correctly describes key options such because the cat’s place, fur colour, eye colour, and overall look. This demonstrates Qwen2.5-VL-* style’s skill to procedure visible knowledge and generate related textual content descriptions.
The style’s reaction:
Pricing
You’ll be able to use Amazon Bedrock Customized Fashion Import to make use of your customized style weights inside Amazon Bedrock for supported architectures, serving them along Amazon Bedrock hosted FMs in a completely controlled approach via On-Call for mode. Customized Fashion Import doesn’t rate for style import. You might be charged for inference in accordance with two elements: the collection of energetic style copies and their length of process. Billing happens in 5-minute increments, ranging from the primary a hit invocation of every style replica. The pricing in keeping with style replica in keeping with minute varies in accordance with elements together with structure, context duration, Area, and compute unit model, and is tiered by way of style replica measurement. The customized style unites required for webhosting is dependent upon the style’s structure, parameter rely, and context duration. Amazon Bedrock robotically manages scaling in accordance with your utilization patterns. If there are not any invocations for five mins, it scales to 0 and scales up when wanted, despite the fact that this would possibly contain cold-start latency of as much as a minute. Further copies are added if inference quantity persistently exceeds single-copy concurrency limits. The utmost throughput and concurrency in keeping with replica is made up our minds throughout import, in accordance with elements comparable to enter/output token combine, {hardware} sort, style measurement, structure, and inference optimizations.
For more info, see Amazon Bedrock pricing.
Blank up
To keep away from ongoing fees after finishing the experiments:
- Delete your imported Qwen fashions from Amazon Bedrock Customized Fashion Import the usage of the console or the API.
- Optionally, delete the style recordsdata out of your S3 bucket in case you not want them.
Keep in mind that whilst Amazon Bedrock Customized Fashion Import doesn’t rate for the import procedure itself, you’re billed for style inference utilization and garage.
Conclusion
Amazon Bedrock Customized Fashion Import empowers organizations to make use of tough publicly to be had fashions like Qwen 2.5, amongst others, whilst making the most of enterprise-grade infrastructure. The serverless nature of Amazon Bedrock gets rid of the complexity of managing style deployments and operations, permitting groups to concentrate on development packages relatively than infrastructure. With options like auto scaling, pay-per-use pricing, and seamless integration with AWS products and services, Amazon Bedrock supplies a production-ready setting for AI workloads. The combo of Qwen 2.5’s complex AI functions and Amazon Bedrock controlled infrastructure gives an optimum steadiness of functionality, charge, and operational potency. Organizations can delivery with smaller fashions and scale up as wanted, whilst keeping up complete regulate over their style deployments and making the most of AWS safety and compliance functions.
For more info, discuss with the Amazon Bedrock User Guide.
In regards to the Authors
Ajit Mahareddy is an skilled Product and Move-To-Marketplace (GTM) chief with over twenty years of enjoy in Product Control, Engineering, and Move-To-Marketplace. Previous to his present position, Ajit led product control development AI/ML merchandise at main generation firms, together with Uber, Turing, and eHealth. He’s advancing Generative AI applied sciences and riding real-world have an effect on with Generative AI.
Shreyas Subramanian is a Major Information Scientist and is helping shoppers by way of the usage of generative AI and deep studying to unravel their trade demanding situations the usage of AWS products and services. Shreyas has a background in large-scale optimization and ML and in the usage of ML and reinforcement studying for accelerating optimization duties.
Yanyan Zhang is a Senior Generative AI Information Scientist at Amazon Internet Products and services, the place she has been operating on state of the art AI/ML applied sciences as a Generative AI Specialist, serving to shoppers use generative AI to succeed in their desired results. Yanyan graduated from Texas A&M College with a PhD in Electric Engineering. Outdoor of labor, she loves touring, figuring out, and exploring new issues.
Dharinee Gupta is an Engineering Supervisor at AWS Bedrock, the place she makes a speciality of enabling shoppers to seamlessly make the most of open supply fashions via serverless answers. Her staff makes a speciality of optimizing those fashions to ship the most efficient cost-performance steadiness for purchasers. Previous to her present position, she won in depth enjoy in authentication and authorization programs at Amazon, creating protected get right of entry to answers for Amazon choices. Dharinee is making complex AI applied sciences available and environment friendly for AWS shoppers.
Lokeshwaran Ravi is a Senior Deep Finding out Compiler Engineer at AWS, focusing on ML optimization, style acceleration, and AI safety. He makes a speciality of improving potency, lowering prices, and development protected ecosystems to democratize AI applied sciences, making state of the art ML available and impactful throughout industries.
June Won is a Major Product Supervisor with Amazon SageMaker JumpStart. He makes a speciality of making basis fashions simply discoverable and usable to assist shoppers construct generative AI packages. His enjoy at Amazon additionally comprises cellular buying groceries packages and final mile supply.
Source link