OpenAI Introduces Voice, Image Features In ChatGPT

Great news is in for users: OpenAI has introduced voice and image features to ChatGPT, marking a significant leap in AI technology.

This means that users can now communicate with ChatGPT not just through text but also via voice and images, making interactions with this AI assistant more dynamic and versatile.

Picture this: Instead of just typing, you can have a real conversation with ChatGPT using your voice. You can ask questions, seek advice, or even request a bedtime story, and ChatGPT will respond in spoken words. This takes the AI experience to a whole new level, making it feel much more like a genuine conversation.

But that’s not all! The ability to engage ChatGPT with images is a game-changer. You can upload pictures and ask the AI to identify objects, explain concepts, or provide step-by-step instructions. This opens up a world of possibilities for learning and problem-solving.

According to OpenAI, the new voice feature in ChatGPT is made possible by a new text-to-speech model that can generate remarkably human-like audio from just text and a few seconds of sample speech.

Use ChatGPT For Voice

To kickstart voice interactions, follow these steps: First, navigate to “Settings” in the mobile app and opt into voice conversations. Once that’s done, tap the headphone icon in the top-right corner of the home screen, and there you can select your preferred voice from a choice of five distinct options.

The voices themselves come from the text-to-speech model described above, and OpenAI collaborated with professional voice actors to create each of them. OpenAI also uses Whisper, its open-source speech recognition system, to transcribe your spoken words into text, which is what makes the back-and-forth voice conversation possible.
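For readers curious how the pieces fit together, the voice flow described above (speech in, transcription, a text reply, then synthesized speech out) can be pictured as a simple three-stage pipeline. The Python sketch below is illustrative only and is not OpenAI's implementation: the names run_voice_turn, VoiceTurn, and the fake_* stand-in functions are hypothetical, and the stand-ins simply pass text around so the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # what the speech recognizer heard
    reply_text: str     # what the chat model answered
    reply_audio: bytes  # synthesized speech for the reply


def run_voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],
    chat: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str = "voice-1",  # hypothetical voice name, not a real option
) -> VoiceTurn:
    """Run one spoken exchange: speech -> text -> reply -> speech."""
    transcript = transcribe(audio_in)            # speech recognition stage
    reply_text = chat(transcript)                # language model stage
    reply_audio = synthesize(reply_text, voice)  # text-to-speech stage
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages (assumptions for illustration, not real model calls).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = run_voice_turn(
    b"tell me a bedtime story", fake_transcribe, fake_chat, fake_synthesize
)
print(turn.reply_text)
```

In a real app, each stand-in would be replaced by an actual speech recognizer, chat model, and text-to-speech engine; keeping the stages as separate functions is simply one way to show where each of them sits.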

Use ChatGPT For Images

Getting started with images is just as simple. Tap the photo button to either capture a new image or select an existing one. If you’re using the iOS or Android app, you’ll need to tap the plus button first. You can also discuss multiple images in one conversation or use the drawing tool to point your AI assistant to a specific part of a picture.

The AI’s ability to understand images is powered by multimodal GPT-3.5 and GPT-4. These models apply their language reasoning skills to a wide range of visual content, including photographs, screenshots, and documents that contain both text and images.

The new voice and image features in ChatGPT will roll out to Plus and Enterprise users over the next two weeks. Voice is coming to iOS and Android (opt-in in your settings), and images will be available on all platforms.