OpenAI has today announced GPT-4, its next-generation AI language model, which can read photos and explain what’s in them.
ChatGPT has taken the world by storm, but until now its underlying deep learning language model accepted only text inputs. GPT-4 changes that. A Virtual Volunteer feature, powered by OpenAI’s GPT-4 and its image-to-text capability, is now being integrated into an existing app: users can send images through the app to the AI-powered Virtual Volunteer, which provides instantaneous visual assistance for a wide range of tasks and answers questions about the image.
“It generates text outputs given inputs consisting of interspersed text and images,” OpenAI writes today. “Over a range of domains — including documents with text and photographs, diagrams, or screenshots — GPT-4 exhibits similar capabilities as it does on text-only inputs.”
What this means in practice is that the AI chatbot will be able to analyse what is in an image.
For example, it can tell the user what is unusual about a photo of a man ironing his clothes while attached to a moving taxi.
GPT-4 is the successor to GPT-3. Launched on March 14, the latest version can, OpenAI says, process up to 25,000 words (about eight times as many as GPT-3), interpret images, and handle much more nuanced instructions than GPT-3.5.
What new things can you do with GPT-4?
OpenAI says “GPT-4 excels at tasks that require advanced reasoning, complex instruction understanding and more creativity”.
In the few hours since its launch, users have reported several creative uses of GPT-4, including:
Describe images and generate recipes
GPT-4’s multimodal feature allows users to upload an image, such as a photo of flour, butter, eggs and milk, and GPT-4 will recommend tasty treats that can be made with those ingredients. This feature, however, does not yet appear to be available to public GPT-4 subscribers.
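For developers, the kind of request described above pairs a text prompt with an image in a single message to OpenAI's chat completions API. The sketch below is illustrative only: it builds the request payload without sending it, and the model name, prompt, and image URL are assumptions, not the app's actual code.

```python
# Illustrative sketch: build a mixed text + image chat-completion payload.
# Model name, prompt wording, and URL are hypothetical placeholders.

def build_recipe_request(image_url: str, model: str = "gpt-4-vision-preview") -> dict:
    """Return a chat-completion payload pairing a text prompt with an image."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                # A single message can interleave text and image content parts.
                "content": [
                    {"type": "text",
                     "text": "What could I cook with the ingredients in this photo?"},
                    {"type": "image_url",
                     "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }

payload = build_recipe_request("https://example.com/ingredients.jpg")
```

The payload would then be sent with the official client, e.g. `client.chat.completions.create(**payload)`, which returns the model's text answer.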