Resources
191 curated AI & robotics resources
Agent Development Kit (ADK)
Google's Agent Development Kit (ADK) provides a comprehensive framework for building, testing, and deploying AI agents. It includes tooling for agent development with built-in functionality for reasoning, memory management, and multimodal interactions.
Amazon SageMaker
Amazon Web Services
A fully managed service that enables data scientists and developers to build, train, and deploy machine learning models quickly and easily. Includes hosted Jupyter notebooks, distributed training, and model monitoring.
Andrej Karpathy's Neural Networks: Zero to Hero
Andrej Karpathy
Free YouTube series building neural networks from scratch in Python, from micrograd to GPT-2. One of the most widely recommended courses on LLM fundamentals.
Apache MXNet
Apache
Scalable deep learning framework that supports both imperative and symbolic programming, enabling flexible development and efficient deployment across devices from cloud infrastructure to mobile devices.
Attention Is All You Need
Google Research
This paper introduces the Transformer, a neural network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions. The Transformer outperforms previous approaches on translation tasks while being more parallelizable and faster to train.
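The paper's core operation, scaled dot-product attention, can be sketched in plain Python. This is a toy illustration with invented 2-dimensional inputs, without the batching, masking, or multi-head projections of the full architecture:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V,
    with Q, K, V given as lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

# Two queries attending over three key/value pairs (toy numbers).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(attention(Q, K, V))
```

Because every query attends to every key independently, the loop over queries parallelizes trivially, which is the property the paper exploits.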
BERT: Pre-training of Deep Bidirectional Transformers
Google's 2018 paper introducing BERT, the bidirectional encoder that revolutionized NLP transfer learning and dominated benchmarks for years.
Chain-of-Thought Prompting Elicits Reasoning in LLMs
Google Brain
Google's paper showing that prompting LLMs to show step-by-step reasoning dramatically improves performance on math, logic, and commonsense tasks.
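The technique is purely a prompting change: the few-shot exemplar spells out intermediate reasoning steps, which the model then imitates before giving its final answer. A minimal sketch, with an exemplar written in the style of the paper's arithmetic examples:

```python
# Standard few-shot exemplar: the answer alone.
standard_exemplar = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A: 9"
)

# Chain-of-thought exemplar: the same question, but the answer shows its work.
cot_exemplar = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A: They started with 23 apples. After using 20, they had 23 - 20 = 3. "
    "Buying 6 more gives 3 + 6 = 9. The answer is 9."
)

question = ("Q: A shelf holds 4 boxes of 12 pens each. 15 pens are removed. "
            "How many pens remain?\nA:")
cot_prompt = cot_exemplar + "\n\n" + question
print(cot_prompt)
```

The paper's finding is that the chain-of-thought version of the prompt elicits step-by-step reasoning on the new question, while the standard version tends to produce a direct (and more often wrong) answer.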
Command R+
Cohere
Cohere's enterprise-focused 104B open-weight LLM optimized for RAG, tool use, and multi-step agentic tasks in production environments.
Common Crawl
Common Crawl
Petabyte-scale open web crawl dataset updated monthly. The foundation for training most large language models including GPT and LLaMA.
DALL-E 3
OpenAI
DALL-E 3 is OpenAI's advanced text-to-image model that generates detailed, creative images from natural language descriptions, following prompt instructions closely across a wide range of visual concepts and artistic styles.
Deep Learning Specialization by Andrew Ng
DeepLearning.AI
A comprehensive course series that teaches the foundations of deep learning, how to build neural networks, and how to lead machine learning projects.
Deep Learning with PyTorch
fast.ai
A comprehensive course on deep learning with PyTorch, covering neural network fundamentals through advanced topics in computer vision and natural language processing.
DeepLearning.AI Short Courses
DeepLearning.AI
Collection of 1-2 hour practical AI courses taught by industry leaders (Andrew Ng, Harrison Chase, etc.) covering LLMs, RAG, agents, fine-tuning, and more.
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek
Introduces DeepSeek-R1, a reasoning model whose capabilities emerge primarily from reinforcement learning (the R1-Zero variant is trained with RL alone), matching OpenAI o1 performance with openly released weights.
Fast.ai Practical Deep Learning
fast.ai
Top-down, practical deep learning course by Jeremy Howard. Covers vision, NLP, tabular data, and diffusion models with minimal math prerequisites.
FineWeb
Hugging Face
Hugging Face's 15-trillion token high-quality web dataset derived from Common Crawl with aggressive deduplication and filtering. Outperforms other web datasets on benchmarks.
FlashAttention: Fast and Memory-Efficient Exact Attention
Stanford
Introduces IO-aware exact attention that is 2-4x faster and uses 5-20x less memory than standard attention, enabling longer context windows in practice.
Gazebo Simulation
Open Robotics
A powerful 3D simulation environment for autonomous robots that generates realistic sensor feedback, physically plausible interactions, and accurate dynamics. Ideal for testing robotics algorithms before real-world deployment.
Gemini 2.0 Flash
Google's fastest and most efficient frontier model. Supports a 1M token context window, native multimodality, and real-time streaming at low cost.
Gemma 3
Google's lightweight open-weight model family (1B–27B). Multimodal, multilingual across 140+ languages, and optimized to run on a single GPU or TPU.
Generative AI with Large Language Models
DeepLearning.AI
A hands-on course covering the fundamentals of how generative AI works, and how to deploy LLMs responsibly.
GPT-4o
OpenAI
OpenAI's flagship multimodal model processing text, audio, and images in real time. Powers ChatGPT and the OpenAI API with strong overall performance.
Hugging Face NLP Course
Hugging Face
Free course covering Transformers, tokenizers, fine-tuning, and the wider Hugging Face ecosystem. Official and kept up to date.
LAION-5B
LAION
Large-scale open dataset of 5.85 billion image-text pairs scraped from the internet, used to train Stable Diffusion and other vision-language models.
LangChain
LangChain
LangChain is a framework for developing applications powered by language models. It enables applications that are context-aware, can reason, connect to external data sources, and orchestrate agents for complex tasks.
Language Models are Few-Shot Learners (GPT-3)
OpenAI
OpenAI's landmark 2020 paper introducing GPT-3 (175B parameters) and demonstrating emergent in-context learning and few-shot prompting at scale.
LLM Bootcamp (Full Stack Deep Learning)
Full Stack Deep Learning
Practical course on building LLM-powered applications covering prompting, RAG, fine-tuning, evaluation, and deployment.
LoRA: Low-Rank Adaptation of Large Language Models
Microsoft
Introduces LoRA, a parameter-efficient fine-tuning method that reduces trainable parameters by 10,000x while matching full fine-tuning quality.
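The parameter savings are easy to see: instead of updating a full d×k weight matrix, LoRA freezes the weights and trains two low-rank factors B (d×r) and A (r×k), adding BA to the frozen matrix. A sketch with illustrative dimensions (a 4096×4096 projection at rank r = 8; the paper's 10,000x figure refers to GPT-3 as a whole, not a single matrix):

```python
def lora_param_counts(d, k, r):
    """Trainable parameters for full fine-tuning vs. a rank-r LoRA update W + B @ A."""
    full = d * k          # update every entry of the d x k matrix
    lora = r * (d + k)    # B contributes d*r entries, A contributes r*k
    return full, lora

full, lora = lora_param_counts(d=4096, k=4096, r=8)
print(full, lora, full / lora)  # → 16777216 65536 256.0
```

Since BA is added to (not merged into) the frozen weights during training, many tasks can share one base model, each with its own tiny adapter.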
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Carnegie Mellon University
Introduces Mamba, a state space model (SSM) architecture that matches Transformer quality while scaling linearly with sequence length.
Microsoft Cognitive Toolkit (CNTK)
Microsoft
Commercial-grade distributed deep learning toolkit. CNTK allows users to easily realize and combine popular model types and enables efficient implementation and execution of RNNs, CNNs, and feed-forward DNNs.
Mixtral of Experts
Mistral AI
Presents Mixtral 8x7B, a sparse mixture-of-experts LLM that matches GPT-3.5 quality while only using 2 of 8 expert networks per token.
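The routing idea can be sketched in a few lines: a gate scores all experts, only the top two run, and their outputs are combined with renormalized gate weights. The scalar "experts" below are toy stand-ins for Mixtral's feed-forward blocks:

```python
import math

def top2_moe(x, gate_logits, experts):
    """Sparse MoE layer: route input x to the top-2 of len(experts) experts,
    weighting their outputs by softmax over the two selected gate logits."""
    top2 = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)[:2]
    m = max(gate_logits[i] for i in top2)
    exps = {i: math.exp(gate_logits[i] - m) for i in top2}
    z = sum(exps.values())
    return sum(exps[i] / z * experts[i](x) for i in top2)

# Eight toy "experts", each just scaling its input; only two run per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
gate_logits = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.1, 0.4]
print(top2_moe(10.0, gate_logits, experts))
```

This is why a sparse MoE can hold many more total parameters than it spends compute on: per token, only the two selected experts are evaluated.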
Model Context Protocol (MCP) Specification
MCP Working Group
Comprehensive guide to the Model Context Protocol (MCP), a standard for communication between AI models and applications. Learn about the protocol specification, implementation details, and best practices.
Model Context Protocol Announcement
Anthropic
Anthropic official announcement introducing the Model Context Protocol (MCP), an open standard for connecting AI assistants to external data sources.
ONNX
ONNX
Open Neural Network Exchange is an open format for representing machine learning models. ONNX defines a common set of operators and a common file format to enable model interoperability between frameworks.
Phi-4
Microsoft
Microsoft's 14B parameter model that punches far above its weight on reasoning and STEM benchmarks, outperforming much larger models through data quality focus.
Quadruped Locomotion on Rough Terrain
UC Berkeley
This paper presents a framework for learning agile legged locomotion skills for quadrupedal robots over challenging terrains.
Scaling Laws for Neural Language Models
OpenAI
OpenAI's 2020 paper establishing power-law scaling relationships between model size, compute, data, and loss — the empirical foundation for scaling LLMs.
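The paper's parameter scaling law has the form L(N) = (N_c / N)^α_N. The constants below (α_N ≈ 0.076, N_c ≈ 8.8e13) are the paper's reported fits quoted from memory, so treat them as illustrative rather than authoritative:

```python
def loss_from_params(n_params, alpha_n=0.076, n_c=8.8e13):
    """Power-law loss as a function of parameter count N, in the regime
    where data and compute are not the bottleneck: L(N) = (N_c / N)^alpha_N."""
    return (n_c / n_params) ** alpha_n

# Each 10x increase in parameters multiplies loss by 10**-0.076, roughly 0.84.
for n in (1e8, 1e9, 1e10):
    print(f"N={n:.0e}  L={loss_from_params(n):.3f}")
```

The practical reading is that loss falls smoothly and predictably with scale, which is what made multi-hundred-billion-parameter training runs a calculated bet rather than a gamble.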
Segment Anything
Meta AI Research
Introduces the Segment Anything Model (SAM), a promptable segmentation system trained on the largest segmentation dataset to date, capable of zero-shot transfer to new image distributions and tasks.
Stable Diffusion
Stability AI
Stable Diffusion is an open-source text-to-image model that generates realistic images from text descriptions. It can be run locally on consumer hardware and supports various applications including inpainting, outpainting, and style transfer.
Stanford CS224N: NLP with Deep Learning
Stanford University
Stanford's flagship NLP course covering word vectors, RNNs, Transformers, LLMs, and modern NLP techniques. Free lecture videos and assignments available online.
Stanford CS231N: Deep Learning for Computer Vision
Stanford University
Stanford's computer vision course covering CNNs, object detection, segmentation, and vision-language models. Lecture notes and assignments freely available.
Training Language Models to Follow Instructions with Human Feedback
OpenAI
The InstructGPT paper introducing RLHF for aligning LLMs with human preferences, the technique behind ChatGPT.