DALL·E Mini¶
Generate images based on text prompts for research and personal consumption
| Model Provider | Model License | Model Version | Model Release |
|---|---|---|---|
| Hugging Face | Unknown | DALL·E Mini and DALL·E Mega | June 9, 2022 |
Model Summary¶
This model card covers the DALL·E Mini model, which generates images from text prompts for research and personal consumption. Intended uses include supporting creativity, creating humorous content, and providing generations for people curious about the model's behavior.

The model was trained on unfiltered data from the Internet, limited to pictures with English descriptions, and the developers discuss the resulting limitations further in the DALL·E Mini technical report. Three datasets were used for training. DALL·E Mini is 27 times smaller than the original DALL·E and was trained on a single TPU v3-8 in only 3 days; DALL·E Mega is still training and has been running for about 40-45 days on a TPU v3-256.

The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people, and using it to generate content that is cruel to individuals is a misuse of this model.
Model Resources¶
🤗 Hugging Face | 🐱 Github | 🌐 Website
Info
This model card was generated using the PromptxAI API, which chains recent web content sources with large language model generations. As of February 2023, models like GPT-3 (via applications like ChatGPT) cannot be queried about the latest web content, because they are trained on a static dataset and are not updated as new content appears. PromptxAI works around this by supplying recent web sources to the model at query time.
Model Details¶
Task: Generating images based on text prompts
Model Parameters: Not stated directly; the model is described as 27 times smaller than the original DALL·E
Model Training Data: 2 million images for fine-tuning the image encoder, 15 million images for training the Seq2Seq model
Model Evaluation Data: Unknown
Model Hyperparameters: Unknown
Model Training Procedure: Images and descriptions pass through the system, fine-tuning the image encoder, training the Seq2Seq model
Model Evaluation Procedure: Comparing DALL·E Mini’s results with DALL·E-pytorch, OpenAI’s DALL·E, and models consisting of a generator coupled with the CLIP neural network model
Model Strengths: Generating images from text prompts for research and personal consumption, supporting creativity, creating humorous content, and providing generations for people curious about the model's behavior
Model Limitations: White and Western culture is asserted as a default; generations from non-English prompts are observably lower quality than those from English prompts; the model may also reinforce or exacerbate societal biases
Model Unique Features: Compact size (27 times smaller than the original DALL·E), allowing training on a single TPU v3-8 in only 3 days
Model Comparison with Similar Models: DALL·E-pytorch, OpenAI’s DALL·E, and models consisting of a generator coupled with the CLIP neural network model
Model Use Cases: Generating images from text prompts for research and personal consumption, supporting creativity, creating humorous content, and providing generations for people curious about the model's behavior
Model Compute Infrastructure Required: TPU v3-8 and TPU v3-256
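The evaluation procedure above compares generators coupled with the CLIP neural network model, which scores candidate images by their similarity to the text prompt. A minimal sketch of that ranking step, using placeholder embeddings in place of real CLIP encoder outputs (the vectors and dimensions here are illustrative assumptions, not values from the actual evaluation):

```python
import numpy as np

def rank_by_clip_score(text_emb, image_embs):
    """Rank candidate images by cosine similarity to the text embedding,
    as a CLIP-style scorer would. Returns (indices best-first, scores)."""
    text_emb = text_emb / np.linalg.norm(text_emb)
    image_embs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    scores = image_embs @ text_emb  # cosine similarity per candidate
    return np.argsort(-scores), scores

# Placeholder embeddings standing in for CLIP text/image encoder outputs.
rng = np.random.default_rng(0)
text = rng.normal(size=8)
candidates = rng.normal(size=(4, 8))
candidates[2] = text + 0.1 * rng.normal(size=8)  # one candidate clearly closest

order, scores = rank_by_clip_score(text, candidates)
print(order[0])  # index of the best-matching generation
```

In practice, the same ranking is applied to real CLIP embeddings of the prompt and each generated image, so that the highest-scoring generations are the ones shown to the user.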