LAION AI
@LAION-AI (Organization)
This is the repo of LAION, a non-profit organization to liberate machine learning research, models and datasets.
On the leaderboard
| Rank | Repository | Stars |
|---|---|---|
| 669 | LAION-AI/Open-Assistant | 37,422 |
Top repositories by stars
- LAION-AI/Open-Assistant (on leaderboard)
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Python, 37,448 stars
- LAION-AI/CLAP
Contrastive Language-Audio Pretraining
Python, 2,030 stars
- LAION-AI/CLIP_benchmark
CLIP-like model evaluation
Python, 800 stars
- LAION-AI/audio-dataset
Audio Dataset for training CLAP and other models
Python, 729 stars
- LAION-AI/aesthetic-predictor
A linear estimator on top of CLIP to predict the aesthetic quality of pictures
Jupyter Notebook, 672 stars
- LAION-AI/dalle2-laion
Pretrained Dalle2 from laion
Python, 504 stars
- Python, 497 stars
- Python, 460 stars
- LAION-AI/lucidrains-projects
A summary of all lucidrains repositories and links to training / research approaches by LAION or other communities.
Jupyter Notebook, 330 stars
- LAION-AI/laion-3d
Collect large 3d dataset and build models
291 stars
- LAION-AI/laion-datasets
Description and pointers of laion datasets
HTML, 250 stars
- LAION-AI/phenaki
A phenaki reproduction using pytorch.
Python, 220 stars
- LAION-AI/Open-Instruction-Generalist
Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks
Python, 209 stars
- LAION-AI/scaling-laws-openclip
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
Jupyter Notebook, 189 stars
- LAION-AI/ldm-finetune
Home of `erlich` and `ongo`. Finetune latent-diffusion/glid-3-xl text2image on your own data.
Python, 181 stars
- LAION-AI/laion-dreams
Aim for the moon. If you miss, you may hit a star.
164 stars
- LAION-AI/AIW
Alice in Wonderland code base for experiments and raw experiments data
Python, 131 stars
- Python, 130 stars
- HTML, 121 stars
- Python, 107 stars
- LAION-AI/Discord-Scrapers
Implementation of a discord channel scraper to generate datasets.
Python, 102 stars
- LAION-AI/video-clip
Let's make a video clip
96 stars
- LAION-AI/Open-GIA
O-GIA is an umbrella for research, infrastructure and projects ecosystem that should provide open source, reproducible datasets, models, applications & safety tools for Open Generalist Interactive Agents (O-GIA). O-GIA systems will act in collaboration with humans or autonomously, supporting various kinds of validated decision making and assistance.
87 stars
- LAION-AI/watermark-detection
A repository containing datasets and tools to train a watermark classifier.
Python, 74 stars
- Jupyter Notebook, 65 stars
- Python, 61 stars
- LAION-AI/LAION-SAFETY
An open toolbox for NSFW & toxicity detection
Jupyter Notebook, 61 stars
- LAION-AI/Big-Interleaved-Dataset
Big-Interleaved-Dataset
Python, 58 stars
- LAION-AI/riverbed
Tools for content datamining and NLP at scale
Python, 44 stars
- LAION-AI/Desktop_BUD-E
BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the creation and integration of diverse skills for educational and research applications.
Python, 42 stars
- Jupyter Notebook, 42 stars
- LAION-AI/blade2blade
Adversarial Training and SFT for Bot Safety Models
Python, 40 stars
- LAION-AI/laion5B-paper
Building the laion5B paper
36 stars
- LAION-AI/emotional-speech-annotations
This repository contains prompts & best practices to annotate audio clips with a very high degree of detail using Audio-Language-Models
35 stars
- LAION-AI/deep-image-diffusion-prior
Inverts CLIP text embeds to image embeds and visualizes with deep-image-prior.
Jupyter Notebook, 35 stars
- LAION-AI/temporal-embedding-aggregation
Aggregating embeddings over time
Python, 32 stars
- LAION-AI/medical
This repository will be a summary and outlook on all our open, medical, AI advancements.
Jupyter Notebook, 30 stars
- LAION-AI/conditioned-prior
(wip) Use LAION-AI's CLIP "conditioned prior" to generate CLIP image embeds from CLIP text embeds.
Python, 29 stars
- LAION-AI/Anh
Anh - LAION's multilingual assistant datasets and models
Python, 27 stars
- LAION-AI/laion50BU
Un-*** 50 billions multimodality dataset
23 stars
- LAION-AI/school-bud-e-frontend-old
A frontend that is compatible with the school-bud-e-backend.
TypeScript, 22 stars
- LAION-AI/Desktop-BUD-E_V1.0
BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the creation and integration of diverse skills for educational and research applications.
Python, 22 stars
- LAION-AI/math_problems-step-by-step_solutions
Here we provide and collect many functions to generate math problems and step-by-step solutions for LLM training
Python, 18 stars
- Python, 18 stars
- Jupyter Notebook, 17 stars
- LAION-AI/bud-e
A general human-ai interaction platform.
TypeScript, 16 stars
- LAION-AI/opendream
Frontend (and soon also middleware and backend) for a new, open-source image generation platform.
TypeScript, 14 stars
- LAION-AI/LAION-PEOPLE
This project provides a data set with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally it provides clusters based on the poses and face meshes and pose-related captions based on these cluster assignments.
14 stars
- LAION-AI/super-resolution
This is the LAION repository for creating open super-resolution models with the help of LAION-5B subsets.
13 stars
- LAION-AI/notebooks
A collection of generative and training notebooks getting mirrored to google colab.
Jupyter Notebook, 12 stars
- LAION-AI/project-menu
Projects at LAION
12 stars
- LAION-AI/laionide
This repository contains training code and checkpoints for finetuning glide.
Python, 11 stars
- LAION-AI/dataset-spec
Describe the format of image/text datasets
Python, 11 stars
- LAION-AI/model-retrieval
Easily compute model embeddings and save the embeddings.
Jupyter Notebook, 10 stars
- LAION-AI/project-alexandria
Official repo for Project Alexandria
8 stars
- LAION-AI/Vocalino-V0.1-Voice-Acting-Pipeline
Open-weights voice acting pipeline combining zero-shot voice cloning with natural-language direction. Provide a reference voice (or generate one) and describe how the line should be performed. Produces speech that keeps the voice identity while following emotional and stylistic prompts, with no training required.
HTML, 7 stars
- LAION-AI/Megatron-LM-Open-Sci
MegaTron open-sci fork
Python, 6 stars
- LAION-AI/KAISER
Knowledge Acquisition and Interlinking via Semantic Embeddings and Reasoning
6 stars
- LAION-AI/dataset-usage
This repository is a summary of all systems and scientific papers that use LAION datasets.
6 stars
- LAION-AI/laion-ai.github.io
laion github website
Svelte, 6 stars
- LAION-AI/open_clip_mammut
OpenCLIP fork with MaMMUT support
Python, 5 stars
- LAION-AI/safety-pipeline
A collection of safety classifiers and models to process images and text.
Python, 5 stars
- LAION-AI/repository-overview
This repository will give a quick overview of all projects and repositories from LAION.
5 stars
- LAION-AI/LionizeR
Experiments with Summarization, Long Context and Retrieval
Python, 4 stars
- LAION-AI/emonet-face
Official repository for the NeurIPS 2025 paper “EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition.” Includes a 40-category emotion taxonomy, balanced synthetic datasets, expert annotations, and baseline models for fair and reproducible evaluation.
Jupyter Notebook, 3 stars
- LAION-AI/annotate-collection
A repository with data for annotation.
Python, 3 stars
- LAION-AI/dataset-inference
The new repository for the general inference pipeline.
Python, 3 stars
- LAION-AI/GIF
General / Global Inference Framework
Python, 3 stars
- LAION-AI/decentralized-learning
A basic setup for decentralized-learning that can be used for training future DALLE/CLIP/CLAP models.
3 stars
- LAION-AI/balanced-laion5b
This repository shall help find a good distribution for huge datasets like LAION-5B for more efficient training.
3 stars
- LAION-AI/website
This is the development repository of the LAION-AI website.
HTML, 3 stars
- LAION-AI/curiosit-e
File server for curiosit-e content.
TypeScript, 2 stars
- Python, 2 stars
- LAION-AI/django-htmx-llm-streaming
A prototype showing how to stream using Django x htmx.
JavaScript, 2 stars
- LAION-AI/hand-inference
A model to run hand inference on a cluster.
Jupyter Notebook, 2 stars
- LAION-AI/public-domain-images
A collection of public domain images donated for ML training.
2 stars
- LAION-AI/public-relations
All media / publicity on LAION and related stuff!
2 stars
- LAION-AI/human_artifacts
A repo containing images for artifact annotation.
2 stars
- LAION-AI/introduction-resources
Recommended intro resources
2 stars
- LAION-AI/laion5b-subsets
Creating subsets from laion5b via embeddings search
Jupyter Notebook, 2 stars
- LAION-AI/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Jupyter Notebook, 2 stars
- LAION-AI/crawlingathome
A client library for Crawling@Home's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
Python, 2 stars
- LAION-AI/school-bud-e-frontend
School Bud-E is an intelligent and empathetic learning assistant designed to revolutionize the educational experience.
TypeScript, 1 star
- LAION-AI/bud-e-mobile
Mobile app development of all bud-e derivatives.
1 star
- Jupyter Notebook, 1 star
- LAION-AI/AIW_webpage
Alice in Wonderland project and initiative webpage
1 star
- LAION-AI/llm-template
A template for procedural template generation using JSON outputs from LLMs.
TypeScript, 1 star
- LAION-AI/laion5b-bias
This repository is a collection of found biases in the LAION-5B dataset.
1 star
- LAION-AI/dataset-tasks
Datasets that should be downloaded & converted to our standard training format.
1 star
- LAION-AI/Admin_Bud-E
Admin Bud-E is a lightweight, privacy-first control center for AI chat, speech-to-text, and text-to-speech. Manage providers, routing, and costs with a simple Admin Console. Give users per-period credits, prices per model, and a shared Common Pool. EU-friendly via OpenAI-Format endpoints or our optional Google Cloud Vertex proxy.
Python, 0 stars
- TypeScript, 0 stars