Best Way to Access and Use Google Gemini API Key for Beginners

Written by 92techupdates

Google Bard has faced escalating competition from ChatGPT. With the rollout of its latest Gemini AI models, however, Google aims to reclaim its position in the market. Google recently upgraded the Bard AI assistant with Gemini Pro models, enabling users to input both text and images and receive accurate, natural responses.

This tutorial introduces the Google Gemini API for beginners and shows how to build AI-driven applications with it. The API accepts both text and image inputs and returns contextually relevant outputs, and the Gemini Python API makes it straightforward to add these capabilities to an existing project.

What is Gemini AI API?

Gemini AI represents a groundbreaking advancement in artificial intelligence (AI), developed collaboratively by teams within Google including Google Research and Google DeepMind. This cutting-edge model is designed to be multimodal, capable of comprehending and processing various types of data such as text, code, audio, images, and videos.

Driven by Google DeepMind’s mission to leverage AI for the greater good, Gemini AI marks significant progress in creating AI models inspired by human perception and interaction with the world. As the most sophisticated and expansive AI model developed by Google to date, Gemini AI boasts high flexibility, enabling it to efficiently operate across a range of systems from data center servers to mobile devices.

Gemini AI is available in three distinct versions, each tailored for specific use cases:

  1. Gemini Ultra: The pinnacle of sophistication, equipped to handle intricate tasks.
  2. Gemini Pro: A balanced choice offering robust performance and scalability.
  3. Gemini Nano: Optimized for mobile devices, prioritizing efficiency.

Among these variants, Gemini Ultra has demonstrated exceptional performance, even surpassing GPT-4 on various metrics. Notably, it is the first model to outperform human experts on the Massive Multitask Language Understanding benchmark, showcasing its superior comprehension and problem-solving abilities.

For those new to Artificial Intelligence, consider exploring the AI Fundamentals skill track, covering essential topics such as ChatGPT, large language models, and generative AI.

Note: For a limited time, the Google Gemini API key is provided free of charge for both text and vision models. This offer remains available until the general availability of the service early next year. With this free access, users can send up to 60 requests per minute without the need to set up Google Cloud billing or incurring any costs.
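The 60 requests per minute mentioned in the note can be respected on the client side. The following is a minimal sketch of a sliding-window throttle, not part of Google's library; the class name RequestThrottle and its parameters are assumptions made for illustration.

```python
import time
from collections import deque


class RequestThrottle:
    """Hypothetical helper: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit=60, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # times at which recent requests were allowed

    def _drop_expired(self, now):
        # Forget requests that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def wait(self):
        """Block until another request may be sent, then record it."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._stamps) >= self.limit:
            # Sleep until the oldest recorded request expires.
            self._sleep(self.window - (now - self._stamps[0]))
            now = self._clock()
            self._drop_expired(now)
        self._stamps.append(now)
```

Calling wait() on a shared RequestThrottle immediately before each API request keeps a script under the quota; the clock and sleep parameters exist only so the behaviour can be checked without real delays.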

Unveiling Google’s Gemini AI Models and API

Google’s Gemini AI models, a recent breakthrough, are the result of collaborative efforts across various teams including Google Research and Google DeepMind. With its unique multimodal design, Gemini possesses the ability to comprehend and interact with a diverse range of data types, including text, code, audio, images, and video.

As Google’s most advanced and extensive AI model to date, Gemini prioritizes adaptability, seamlessly operating across a wide spectrum of systems—from expansive data centers to portable mobile devices. This adaptability holds promise for transforming the landscape for businesses and developers, opening up new possibilities for building and scaling AI applications.

To facilitate exploration and utilization of Gemini’s capabilities, Google has introduced the Gemini API, which is freely accessible. Gemini Ultra, the flagship model, demonstrates state-of-the-art performance, surpassing benchmarks set by GPT-4 on various metrics. Notably, it stands out as the first model to outperform human experts on the Massive Multitask Language Understanding benchmark, showcasing its advanced understanding and problem-solving capabilities.

In a landscape dominated by ChatGPT and OpenAI’s GPT models, Google appeared to recede from the limelight in the AI space. However, the launch of Gemini marked a resurgence, offering foundational models that promise significant advancements. The subsequent introduction of the Gemini API further underscores Google’s commitment to empowering developers. In this guide, we will explore the Gemini API and leverage its capabilities to construct a basic chatbot.

Exploring Advanced Capabilities

To fully leverage the capabilities of the Gemini API, developers may want to explore advanced functionalities such as safety settings and the low-level API.

Safety settings let developers adjust how strictly prompts and responses are filtered in each harm category, while keeping content within community standards. The low-level API offers more flexibility and more granular control over interactions. Multi-turn conversations can be extended using the genai.ChatSession class. Together, these features allow developers to build more sophisticated applications.
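As a rough sketch of what adjusting safety settings can look like, the helper below builds a list of category and threshold pairs. The category and threshold strings are the names I believe the google-generativeai package accepts; treat them as assumptions and check them against the current documentation. The function names build_safety_settings and ask_with_safety are hypothetical.

```python
# Assumed string names for the four harm categories.
HARM_CATEGORIES = (
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)


def build_safety_settings(threshold="BLOCK_MEDIUM_AND_ABOVE", overrides=None):
    """Return one {category, threshold} entry per harm category.

    `overrides` maps a category name to a different threshold.
    """
    overrides = overrides or {}
    return [
        {"category": category, "threshold": overrides.get(category, threshold)}
        for category in HARM_CATEGORIES
    ]


def ask_with_safety(prompt, api_key, settings):
    """Send `prompt` with the given safety settings (needs network access)."""
    import google.generativeai as genai  # imported here so the helper above works offline

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(prompt, safety_settings=settings).text
```

For example, build_safety_settings(overrides={"HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH"}) relaxes one category and leaves the other three at the default threshold.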

Key Highlights

  • Gemini, developed by Google, consists of foundational models tailored for multimodal capabilities, encompassing text, images, audio, and videos. Variants include Gemini Ultra, Gemini Pro, and Gemini Nano, each differing in size and functionalities.
  • Gemini has demonstrated superior performance in benchmark assessments, surpassing ChatGPT and GPT4-Vision models across various tests.
  • A strong emphasis on responsible AI is evident in Gemini, integrating safety measures to handle unsafe queries by refraining from generating responses and providing safety ratings across different categories.
  • Gemini excels in generating multiple candidates for a single prompt, ensuring diverse and contextually relevant responses.
  • Gemini Pro features a chat model, enabling developers to create conversational applications effortlessly, with capabilities such as preserving chat history and delivering contextually informed responses.
  • Gemini Pro Vision seamlessly handles text and image inputs, allowing it to excel in tasks such as image interpretation and description.

Setting Up a Google Gemini API Key

To begin utilizing the API, the first step is to obtain an API key, which can be acquired from the following link.

How to Access and Use Gemini API for Free:

To create the key, click the “Get an API key” button and then choose “Create API key in a new project.” After copying the API key, store it as an environment variable. Deepnote users can do this from the integration settings: scroll down and open the environment variables section. With the variable in place, your code can read the key when it connects to the Gemini API.
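Reading the key back from the environment might look like the sketch below. The variable name GOOGLE_API_KEY and the helper load_api_key are assumptions made for illustration; use whatever name you chose when saving the key.

```python
import os


def load_api_key(var_name="GOOGLE_API_KEY"):
    """Return the API key stored in the environment, or fail with a clear message."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Gemini API key."
        )
    return key
```

The result can then be passed to genai.configure(api_key=load_api_key()), which keeps the key itself out of the source file.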

Setting Up Python and Pip on Your Computer

Visit our guide and install Python along with Pip on your PC or Mac. Ensure that you install Python version 3.9 or above. For Linux users, follow our tutorial to install Python and Pip on Ubuntu or other distributions. You can verify the installation of Python and Pip on your computer by running the following commands in the Terminal. They should return the version numbers.

python -V 
pip -V

After successfully installing Python and Pip, execute the following command to install Google’s Generative AI dependency.

pip install -q -U google-generativeai
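The same checks can be made from inside Python. This is a small sketch using only the standard library; the helper names are hypothetical, and the 3.9 minimum is the one stated above.

```python
import sys
from importlib import metadata


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """True when the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package="google-generativeai"):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

If installed_version() returns None, the pip command above has not been run in the interpreter you are using.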

Obtaining the Gemini Pro API Key

To acquire the Gemini Pro API key, follow these steps:

Visit makersuite.google.com/app/apikey and sign in with your Google account. Click the “Create API key in new project” button under API keys.

Copy the API key and ensure its confidentiality. Avoid publishing or sharing the API key publicly.

Using the Gemini Pro API Key (Text-only Model)

Google has streamlined the process of utilizing its Gemini API key for development and testing, much like OpenAI. I’ve simplified the code to make it easy for users to test and utilize. In this example, I illustrate how to access the Gemini Pro Text model via the API key.

Open a code editor of your choice. For beginners, consider installing Notepad++. Advanced users may prefer Visual Studio Code.

Copy the code provided below and paste it into your chosen code editor.

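import google.generativeai as genai

# Authenticate with the API key you created earlier.
genai.configure(api_key='PASTE YOUR API KEY HERE')

# 'gemini-pro' is the text-only model.
model = genai.GenerativeModel('gemini-pro')

# Send the question and print the reply.
response = model.generate_content("What is the meaning of life?")
print(response.text)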

Once in the code editor, paste your Gemini API key. We’ve defined the ‘gemini-pro’ model, specifically a text-only model. Additionally, we’ve included a query section where you can input questions.

After pasting the API key and defining the model and query, save the file under a name that ends in “.py”. For example, I’ve named my file “gemini.py” and saved it on the Desktop.

Next, open the Terminal and execute the following command to navigate to the Desktop.

cd Desktop

After navigating to the Desktop in the Terminal, simply execute the following command to run the gemini.py file using Python.

python gemini.py

Now, the gemini.py file will provide an answer to the question you specified within the code.

To obtain a new response, simply modify the question in the code editor, save the changes, and rerun the gemini.py file. This process allows you to receive a fresh response directly in the Terminal. This demonstrates how you can leverage the Google Gemini API key to access the text-only Gemini Pro model.
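Editing the file for every new question gets tedious. One possible variation, sketched below, takes the question from the command line instead. The helpers question_from_args and ask are hypothetical names; only the genai calls already used in gemini.py are assumed to exist.

```python
import sys

DEFAULT_QUESTION = "What is the meaning of life?"


def question_from_args(argv):
    """Join the words after the script name into one question."""
    words = argv[1:]
    return " ".join(words).strip() or DEFAULT_QUESTION


def ask(question, api_key):
    """Send one question to the text-only model (needs network access)."""
    import google.generativeai as genai  # imported here so the parsing helper works offline

    genai.configure(api_key=api_key)
    model = genai.GenerativeModel("gemini-pro")
    return model.generate_content(question).text
```

Ending the file with print(ask(question_from_args(sys.argv), 'PASTE YOUR API KEY HERE')) would let you run python gemini.py Why is the sky blue? without touching the code again.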

Utilizing the Gemini Pro API Key (Text-and-Vision Model)

In this example, I’ll demonstrate how to interact with the Gemini Pro multimodal model. While it’s not yet live on Google Bard, you can access it immediately through the API. Fortunately, the process remains straightforward and seamless.

Open a new file in your preferred code editor and paste the provided code below.

import google.generativeai as genai
import PIL.Image
img = PIL.Image.open('image.jpg')
genai.configure(api_key='PASTE YOUR API KEY HERE')
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(["what is the total calorie count?", img])
print(response.text)

Be sure to paste your Gemini API key. Here, we’re using the gemini-pro-vision model, which handles both text and image input.

Save the file on your Desktop and append “.py” at the end of the filename. I’ve named it geminiv.py for this example.

The third line of code, the PIL.Image.open call, takes the path to the image; here it is image.jpg. Save the image you wish to process in the same folder as geminiv.py, and make sure the filename and extension match what the code says. You can pass local JPG and PNG files up to 4MB.

The sixth line of code, the generate_content call, holds the question about the image. For instance, if you’re providing a food-related image, you might ask Gemini Pro to estimate the total calorie count.

Now, it’s time to execute the code in the Terminal. Simply navigate to the Desktop (or your specified location) and run the following commands one by one. Remember to save the file if you’ve made any changes.

cd Desktop 
python geminiv.py

The visual Gemini Pro model provides direct answers to your questions. You can also inquire further and ask the AI to explain its reasoning.

If you want to analyze a different image, ensure that the image filename matches the one specified in the code, update the question accordingly, and rerun the geminiv.py file to receive new responses.
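Before sending an image, it can help to check it against the limits mentioned above (JPG or PNG, up to 4MB). The sketch below is an illustration only: the helper names are hypothetical, reading 4MB as 4 * 1024 * 1024 bytes is an assumption, and so is accepting the .jpeg spelling.

```python
import os

MAX_IMAGE_BYTES = 4 * 1024 * 1024  # assumed reading of the 4MB limit
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def image_problems(filename, size_bytes):
    """Return a list of problems with the image; an empty list means it looks fine."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension {ext or '(none)'}")
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_IMAGE_BYTES}-byte limit")
    return problems


def check_image_file(path):
    """Apply image_problems to a file on disk."""
    return image_problems(path, os.path.getsize(path))
```

Calling check_image_file('image.jpg') before PIL.Image.open gives a readable message instead of a failed request when the file is too large or of the wrong type.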

Using the Gemini Pro API Key for Chatting

Thanks to unconv’s concise code on GitHub, you can now engage in a conversation with the Gemini Pro model directly within the Terminal window, utilizing a Gemini AI API key. This eliminates the need to modify the question in the code and rerun the Python file for a new output. You can seamlessly continue the chat within the Terminal window itself.

Moreover, Google has implemented chat history natively, eliminating the need to manually manage conversation history in an array or a list. With a simple function, Google stores all conversation history in a chat session. Here’s how it works:

Open your preferred code editor and paste the provided code below.

import google.generativeai as genai

genai.configure(api_key='PASTE YOUR API KEY HERE')
model = genai.GenerativeModel('gemini-pro')

# The chat session stores the conversation history for you.
chat = model.start_chat()

# Press Ctrl+C to end the conversation.
while True:
    message = input("You: ")
    response = chat.send_message(message)
    print("Gemini: " + response.text)

As before, remember to paste your API key, following the same steps outlined in the previous sections.

Save the file on your Desktop or in your preferred location, making sure the filename ends in “.py”. For example, I’ve named it geminichat.py for this demonstration.

Next, open the Terminal and navigate to the Desktop. Once there, execute the geminichat.py file.

cd Desktop 
python geminichat.py

Now, you can effortlessly continue the conversation, with the added benefit of chat history being retained. This feature makes using the Google Gemini API key even more convenient.
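The loop in geminichat.py runs until the program is interrupted. A possible refinement, sketched here with hypothetical helper names, ends the chat when the user types exit or quit and skips empty lines. The sending function is passed in as a parameter, so the loop itself can be tried without a network connection.

```python
def run_chat(send, read=input, write=print, quit_words=("exit", "quit")):
    """Relay messages to `send` until a quit word or end of input.

    Returns the number of messages that were sent.
    """
    turns = 0
    while True:
        try:
            message = read("You: ").strip()
        except (EOFError, KeyboardInterrupt):
            break
        if message.lower() in quit_words:
            break
        if not message:
            continue  # ignore empty lines
        write("Gemini: " + send(message))
        turns += 1
    return turns


def gemini_sender(api_key):
    """Build a send function backed by a Gemini chat session (needs network access)."""
    import google.generativeai as genai  # imported here so run_chat works offline

    genai.configure(api_key=api_key)
    chat = genai.GenerativeModel("gemini-pro").start_chat()
    return lambda message: chat.send_message(message).text
```

With these two pieces, run_chat(gemini_sender('PASTE YOUR API KEY HERE')) behaves like geminichat.py but stops cleanly.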

These examples showcase just a few ways to explore Google Gemini’s capabilities through the API. It’s commendable that Google has made its vision model accessible for enthusiasts and developers to experiment with, akin to OpenAI’s DALL-E 3 and ChatGPT. While the Gemini Pro vision model may not surpass the GPT-4V model, it still performs admirably. We eagerly anticipate the launch of Gemini Ultra, which promises to rival the GPT-4 model.

Exploring Emojis with Gemini Large Language Model 🚀

In this example, a query is directed to the Gemini Large Language Model, asking about the top five most frequently used emojis. The resulting response includes the generated emojis along with associated information, providing insights into why these emojis rank among the most commonly used. This highlights the model’s adeptness in understanding and generating content related to emojis.

Furthermore, the model not only swiftly produced the appropriate JSON format but also showcased the ability to accurately tally the ingredients depicted in an image and structure the JSON accordingly. With the exception of the green onion, all ingredient counts generated align with the visual content. This innate vision and multimodal approach pave the way for numerous applications enabled by the Gemini Large Language Model.
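When a reply is expected to be JSON, as in the ingredient count described above, the text usually needs to be parsed before use. The sketch below assumes the model may wrap the JSON in a Markdown code fence, which is common but not guaranteed; parse_json_reply is a hypothetical helper and the ingredient names in the usage note are made up.

```python
import json

FENCE = "`" * 3  # the Markdown code-fence marker


def parse_json_reply(text):
    """Parse a model reply as JSON, tolerating a surrounding code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        lines = cleaned.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]  # drop the closing fence line
        cleaned = "\n".join(lines)
    return json.loads(cleaned)
```

Given a reply whose body is {"tomato": 2, "egg": 3}, with or without a fence around it, the function returns a normal Python dictionary; json.loads raises an error if the reply is not valid JSON.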

Understanding Candidates in Gemini Large Language Model (LLM) 🧐

In the context of the error message, a “candidate” is a potential response generated by the Gemini LLM. The model returns its responses as candidates, so the absence of any candidate means the LLM did not generate a response. The error message directs us to response.prompt_feedback for diagnostic information, which can be inspected by printing that attribute.

In the resulting output, the block reason is given as safety. Below it, safety ratings are listed for four categories. These ratings apply to the prompt submitted to the Gemini LLM, not to a response. Two categories stand out as red flags: Harassment and Danger.

  1. Harassment Category: The elevated probability in this category can be attributed to the mention of “stalking” in the prompt.
  2. Danger Category: The high probability here is linked to the presence of “gunpowder” in the prompt.

The prompt_feedback attribute is invaluable for diagnosing problems with a prompt, since it shows why the Gemini LLM declined to respond.
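The check described in this section can be sketched as a small helper. It relies only on the candidates and prompt_feedback attributes named above, plus block_reason and safety_ratings on the feedback object. The function name is hypothetical, and the stand-in response uses made-up values so that the sketch runs without calling the API.

```python
from types import SimpleNamespace


def explain_empty_response(response):
    """Summarise why a response has no candidates, based on prompt_feedback."""
    if response.candidates:
        return "Response has candidates; nothing was blocked."
    feedback = response.prompt_feedback
    lines = [f"block_reason: {feedback.block_reason}"]
    for rating in feedback.safety_ratings:
        lines.append(f"{rating.category}: {rating.probability}")
    return "\n".join(lines)


# Stand-in for a blocked response. The values are illustrative, not real API output.
fake_response = SimpleNamespace(
    candidates=[],
    prompt_feedback=SimpleNamespace(
        block_reason="SAFETY",
        safety_ratings=[
            SimpleNamespace(category="HARM_CATEGORY_HARASSMENT", probability="MEDIUM"),
            SimpleNamespace(category="HARM_CATEGORY_DANGEROUS_CONTENT", probability="HIGH"),
        ],
    ),
)
summary = explain_empty_response(fake_response)
```

With a real response object, call explain_empty_response(response) before reading response.text, so that a blocked prompt produces a readable explanation rather than an error.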

Temporary Suspension of API Keys in the UK (Gemini Exchange) 🇬🇧

This section and the two that follow describe the Gemini cryptocurrency exchange, a separate service that shares only a name with Google’s Gemini API. Since November 22nd, 2023, API key functionality on the exchange has been disabled in the UK, and customers there cannot create new API keys. For UK users who need API key access, one workaround is a VPN, which changes the IP address and so bypasses the location-based restriction. The Gemini team is working to comply with the UK travel rule and aims to restore API functions by January 2024.

Disabling an API Key in Your Gemini Exchange Account

To deactivate an API key in your Gemini exchange account because of security concerns or discontinued integrations, follow these steps:

Step 1: Sign in to your Gemini account.

Step 2: Click on your account name located at the top right corner and choose ‘API‘ from the dropdown menu.

Step 3: Within the ‘API Management‘ section, locate the API key you wish to disable in the ‘API Keys‘ list.

Step 4: Click the ‘Disable’ button adjacent to the API key you want to deactivate. This action will revoke the API key and render it unusable.

Note: Once an API key is turned off, it cannot be used again. If you wish to re-enable it in the future, you’ll need to generate a new API key and configure its permissions accordingly.

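A Step-by-Step Guide for Creating an API Key on the Gemini Exchange

Generating an API key on the Gemini exchange is a straightforward process. These steps do not apply to Google Gemini API keys, which are covered earlier in this tutorial. Follow them to create a Gemini exchange API key: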

Step 1: Log in to your Gemini account using your mobile browser and complete the verification process to enable API key usage.

Step 2: Click on your account name located at the top right corner and select ‘Settings.’

Step 3: Choose ‘API’ from the dropdown menu on the API settings page.

Step 4: In the ‘API Management’ section, click on the ‘Create API Key’ button. You may be prompted to complete 2FA. Once activated, input the 2FA code to proceed.

Step 5: Select the scope from the dropdown menu to create an API Key. Choose ‘Primary’ for individual account trading or ‘Master’ for usage across multiple accounts.

Step 6: Enter a name for the API key. It’s advisable to use distinct names for each API key to facilitate effective management, especially when dealing with multiple APIs.

Step 7: Save the API key and API secret securely. This information will be displayed only once and will be needed for future connections.

Step 8: In the ‘Permissions’ dropdown, select the permissions you wish to grant to this key, such as auditor, fund management, or trading. For instance, to allow the key to view account balances, make trades, and withdraw funds, choose ‘Trading.’

Step 9: Click the ‘Create’ button to generate the API key. Your newly created API key will be displayed in the ‘API Keys’ section of the ‘API Management’ page.

Exploring the Potential of the Gemini API

The Gemini API presents developers with exciting opportunities to create advanced AI applications that utilize both text and visual inputs. Powered by state-of-the-art models like Gemini Ultra, it pushes the boundaries of AI comprehension and generation capabilities.

Google has streamlined the integration of AI into applications with its Python API, offering convenient access to a range of features such as content generation, embedding, and multi-turn conversations. This accessibility marks a significant advancement in multi-modal understanding facilitated by Gemini AI.

To delve deeper into the possibilities, consider starting your journey in developing AI-powered applications with the OpenAI API. Enroll in a short course like “Working with the OpenAI API” to gain insights into the functionality behind popular AI applications such as ChatGPT and DataCamp Workspace.

For professionals seeking a more concise exploration, our cheat sheet on using the OpenAI API in Python provides a handy reference. Whether you’re a beginner or an experienced developer, the Gemini API opens doors to innovative AI-driven solutions.

Conclusion

This introductory tutorial merely scratches the surface of the many advanced functions available with the Gemini API. For a deeper dive, you can explore the Gemini API: Quickstart with Python.

Throughout this tutorial, we’ve covered the basics of Gemini and used the Python API to generate responses. Specifically, we’ve looked at text generation, image understanding, chat with conversation history, and diagnosing blocked prompts. Topics such as streaming, custom output, and embeddings are left for further reading, and there’s still much more to uncover about Gemini’s capabilities.

I encourage you to share what you’ve built using the free Gemini API. The possibilities are truly limitless, and I’m eager to see the innovative applications you create.

FAQs: Access and Use Google Gemini API Key for Beginners

What is the Gemini API?
The Gemini API is a platform developed by Google that provides access to advanced AI models capable of handling both text and visual inputs.
What are the key features of the Gemini API?
The Gemini API offers features such as text generation, visual understanding, streaming, conversation history management, custom output generation, and embeddings.
How can I access the Gemini API?
You can access the Gemini API by utilizing the Python API provided by Google. This allows you to integrate AI capabilities into your applications seamlessly.
What are some examples of advanced functions available with the Gemini API?
Advanced functions include text and visual processing, multi-turn conversations, content embedding, and customization options for generating responses.
Can I learn more about the Gemini API through tutorials?
Yes, you can explore more about the Gemini API by following tutorials such as the Gemini API: Quickstart with Python.
What topics are covered in introductory tutorials for the Gemini API?
Introductory tutorials cover basic concepts of the Gemini API, including accessing the Python API, text and visual generation, streaming data, managing conversation history, and customizing output.
Are there any limitations to what the Gemini API can do?
While the Gemini API offers a wide range of functionalities, it’s important to note that introductory tutorials only scratch the surface of its capabilities. There’s much more to explore beyond the basics.
How can I share my projects built using the Gemini API?
You can share your projects built with the Gemini API by showcasing them on platforms like GitHub, forums, or social media. Sharing your creations allows others to see the possibilities of the Gemini API and fosters collaboration within the developer community.
What resources are available for developers interested in learning more about the Gemini API?
Developers can access tutorials, documentation, and community forums provided by Google to deepen their understanding of the Gemini API and explore its advanced features.
Is the Gemini API available for free?
Yes, the Gemini API is currently available for free, allowing developers to experiment and build applications without incurring costs. However, certain features may be subject to limitations or restrictions.
