Models

Models

Claude

Anthropic

Claude is an AI assistant created by Anthropic and built to be helpful, harmless, and honest. It can understand and respond to nuanced instructions, provide thoughtful explanations, and assist with a wide variety of tasks.

Conversational AI, LLM, Text Generation, Claude

Models

Claude 3.5 Sonnet

Anthropic

Anthropic's latest mid-size model with exceptional reasoning, instruction following, and content creation capabilities. Designed to be more helpful, harmless, and honest than previous models.

LLM, Text Generation, Multi-modal, Anthropic

Models

Command R+

Cohere

Cohere's enterprise-focused 104B open-weight LLM optimized for RAG, tool use, and multi-step agentic tasks in production environments.

LLM, RAG, Enterprise, Agentic

Models

DALL-E 3

OpenAI

DALL-E 3 is OpenAI's advanced text-to-image model that generates detailed and creative images from natural language descriptions. It can create almost any visual concept in various artistic styles with remarkable accuracy.

Image Generation, Text-to-Image, Generative AI, DALL-E

Models

DeepSeek-V3

DeepSeek

State-of-the-art open-source MoE model with 671B total parameters (37B active). Matches frontier closed models on coding, math, and reasoning benchmarks.

LLM, MoE, Open Source, Coding
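
The parameter counts in this entry can be turned into a quick ratio. The sketch below is only arithmetic on the two figures quoted above (671B total, 37B active); the function name and constants are illustrative and do not come from DeepSeek.

```python
# Figures quoted in the DeepSeek-V3 entry above, in billions of parameters.
TOTAL_PARAMS_B = 671   # all parameters across every expert
ACTIVE_PARAMS_B = 37   # parameters used for any single token


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters that are active for one token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


fraction = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
print(f"Active per token: {fraction:.1%}")  # prints "Active per token: 5.5%"
```

In a mixture-of-experts model only the routed experts run for each token, so roughly one parameter in eighteen is used per token here, even though all 671B must still be stored.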

Models

Gemini 1.5 Pro

Google

Google's most capable multimodal AI model with a 1 million token context window, enabling unprecedented long-context understanding across text, code, audio, image, and video inputs.

LLM, Multi-modal, Long Context, Google

Models

Gemini 2.0 Flash

Google

Google's fastest and most efficient frontier model. Supports a 1M token context window, native multimodality, and real-time streaming at low cost.

LLM, Multimodal, Closed Source, Google

Models

Gemma 3

Google

Google's lightweight open-weight model family (1B–27B). Multimodal, multilingual across 140+ languages, and optimized to run on a single GPU or TPU.

LLM, Multimodal, Open Source, Multilingual

Models

Google Gemini

Google

Gemini is Google's most capable AI model, built to be multimodal from the ground up. It can understand and reason about text, images, code, audio, and video, and generate content across these modalities.

Multimodal, LLM, Content Generation, Gemini

Models

GPT-4o

OpenAI

OpenAI's flagship multimodal model, processing text, audio, and images in real time. Powers ChatGPT and the OpenAI API with the best overall performance.

LLM, Multimodal, Closed Source, OpenAI

Models

Grok

xAI

Grok is an AI assistant by xAI designed to answer questions with humor and personality. It features powerful reasoning capabilities and real-time knowledge about the world through web browsing functionality.

Conversational AI, LLM, Text Generation, Multimodal

Models

Hugging Face Transformers

huggingface/transformers

Transformers provides thousands of pretrained models for tasks across modalities such as text, vision, and audio.

NLP, Computer Vision, Audio, Transformers

Models

Llama 2

Meta AI

Llama 2 is Meta's next-generation open-source large language model, designed for dialogue and text generation. It is available in sizes from 7B to 70B parameters.

LLM, Text Generation, Open Source, Meta AI

Models

Midjourney

Midjourney is an AI image generation model that creates images from natural language descriptions. It excels at creating detailed artistic renderings, realistic images, and conceptual art based on text prompts.

Image Generation, Text-to-Image, Generative AI, Midjourney

Models

Mistral 7B

Mistral AI

Mistral 7B is a powerful open-source language model that matches or outperforms other models of similar size. It uses grouped-query attention and sliding-window attention for efficient inference.

LLM, Text Generation, Open Source, Mistral AI

Models

Perplexity AI

An AI-powered search engine and conversational answer engine that combines multiple LLMs with real-time web search to provide accurate, up-to-date information with cited sources.

Search, LLM, Information Retrieval, Search Engine

Models

Phi-4

Microsoft

Microsoft's 14B-parameter model that punches far above its weight on reasoning and STEM benchmarks, outperforming much larger models through a focus on data quality.

LLM, Reasoning, STEM, Small

Models

Qwen 2.5

Alibaba

Alibaba's open-weight model family (0.5B–72B) with strong multilingual, coding, and math capabilities. Top-performing open model in its size class.

LLM, Multilingual, Open Source, Coding

Models

Stable Diffusion

Stability AI

Stable Diffusion is an open-source text-to-image model that generates realistic images from text descriptions. It can be run locally on consumer hardware and supports various applications including inpainting, outpainting, and style transfer.

Image Generation, Open Source, Generative AI, Stable Diffusion

Models

Stable Diffusion 3

Stability AI

Open-source text-to-image generation model with significantly improved photorealism, prompt following, and text rendering, and a better understanding of complex scenes and instructions.

Image Generation, Open Source, Generative AI, Stability AI

Models

Whisper

OpenAI

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual data, capable of transcription, translation, and language identification.

Speech Recognition, Audio Processing, Multilingual, Audio

Models

YOLOv8

Ultralytics

YOLOv8 is a version of the YOLO object detection model from Ultralytics, offering improved accuracy and speed over earlier versions. It supports object detection, segmentation, classification, and tracking.

Computer VisionObject DetectionReal-timeYOLO