Add individual README.md to each sample #133
@@ -4,15 +4,15 @@ This sample is part of the [AI Sample Catalog](../../). To build and run this sa

## Description

-This sample demonstrates a basic chat bot using the `Gemini 2.5 Flash` model. Users can send text-based messages, and the generative model will respond, creating an interactive chat experience. This showcases how to build a simple, yet powerful, conversational AI with the Gemini API.
+This sample demonstrates a basic chat bot using the Gemini Flash model. Users can send text-based messages, and the generative model will respond, creating an interactive chat experience. This showcases how to build a simple, yet powerful, conversational AI with the Gemini API.

<div style="text-align: center;">
  <img width="320" alt="Gemini Chatbot in action" src="gemini_chatbot.png" />
</div>

## How it works

-The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with the `Gemini 2.5 Flash` model. The core logic is in the `GeminiChatbotViewModel.kt` file. A `generativeModel` is initialized, and then a `chat` session is started from it. When a user sends a message, it's passed to the model, which then generates a text response.
+The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with the Gemini Flash model. The core logic is in the `GeminiChatbotViewModel.kt` file. A `generativeModel` is initialized, and then a `chat` session is started from it. When a user sends a message, it's passed to the model, which then generates a text response.

Here is the key snippet of code that calls the generative model:

@@ -31,4 +31,4 @@ fun sendMessage(message: String) {
}
```
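The body of `sendMessage` is unchanged and collapsed in this diff. As a hedged sketch only (not the sample's actual code; the model name, backend, and state handling below are assumptions), a chat turn with the Firebase AI SDK generally looks like this:

```kotlin
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import com.google.firebase.Firebase
import com.google.firebase.ai.ai
import com.google.firebase.ai.type.GenerativeBackend
import kotlinx.coroutines.launch

// Sketch of a minimal chat ViewModel; names and model choice are illustrative.
class ChatSketchViewModel : ViewModel() {

    private val generativeModel = Firebase.ai(backend = GenerativeBackend.googleAI())
        .generativeModel("gemini-2.5-flash")

    // The chat session keeps the conversation history between turns.
    private val chat = generativeModel.startChat()

    fun sendMessage(message: String) {
        viewModelScope.launch {
            // sendMessage() is a suspend call; text can be null if the response was blocked.
            val response = chat.sendMessage(message)
            val reply = response.text ?: ""
            // Expose `reply` to the UI (e.g., via a StateFlow); omitted here.
        }
    }
}
```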

-Read more about [getting started with Gemini](https://developer.android.com/ai/gemini/get-started) in the Android Documentation.
+Read more about the [Gemini API](https://developer.android.com/ai/gemini) in the Android Documentation.
Collaborator

nit: "Gemini API" can refer to a lot of things. Would you consider adding a reference to "cloud" here somewhere?

Author

The "Gemini API" is the name of the product to enable access to Flash and Pro to developers. So I think we should keep it.
@@ -0,0 +1,39 @@

# Gemini Live Todo Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to use the Gemini Live API for real-time, voice-based interactions in a simple ToDo application. Users can add, remove, and update tasks by speaking to the app, showcasing a hands-free, conversational user experience powered by the Gemini API.

<div style="text-align: center;">
  <img width="320" alt="Gemini Live Todo in action" src="gemini_live_todo.png" />
</div>

## How it works

The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Gemini Flash. The core logic is in the [`TodoScreenViewModel.kt`](./src/main/java/com/android/ai/samples/geminilivetodo/ui/TodoScreenViewModel.kt) file. A `liveModel` is initialized with a set of function declarations (`addTodo`, `removeTodo`, `toggleTodoStatus`, `getTodoList`) that allow the model to interact with the ToDo list. When the user starts a voice conversation, the model processes the spoken commands and executes the corresponding functions to manage the tasks.

Here is the key snippet of code that initializes the model and connects to a live session:

```kotlin
val generativeModel = Firebase.ai(backend = GenerativeBackend.vertexAI()).liveModel(
    "gemini-2.0-flash-live-preview-04-09",
    generationConfig = liveGenerationConfig,
    systemInstruction = systemInstruction,
    tools = listOf(
        Tool.functionDeclarations(
            listOf(getTodoList, addTodo, removeTodo, toggleTodoStatus),
        ),
    ),
)

try {
    session = generativeModel.connect()
} catch (e: Exception) {
    Log.e(TAG, "Error connecting to the model", e)
    liveSessionState.value = LiveSessionState.Error
}
```
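The function declarations passed to the model above (`addTodo`, `removeTodo`, `toggleTodoStatus`, `getTodoList`) are not shown in this diff. A hedged sketch, assuming the Firebase AI SDK's function-calling types (the descriptions and parameter schemas below are illustrative, not the sample's exact definitions):

```kotlin
import com.google.firebase.ai.type.FunctionDeclaration
import com.google.firebase.ai.type.Schema

// Sketch: a function the model is allowed to call to add a task.
val addTodo = FunctionDeclaration(
    "addTodo",
    "Add a new task to the todo list",
    mapOf("title" to Schema.string("Short label of the task to add")),
)

// Sketch: a function the model can call to remove a task.
val removeTodo = FunctionDeclaration(
    "removeTodo",
    "Remove a task from the todo list",
    mapOf("id" to Schema.string("Identifier of the task to remove")),
)
```

When the model emits a matching function call during the live session, the app runs the corresponding ToDo operation and returns the result to the model.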

Read more about the [Gemini Live API](https://developer.android.com/ai/gemini/live) in the Android Documentation.
@@ -0,0 +1,53 @@

# Gemini Multimodal Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates a multimodal (image and text) prompt, using the Gemini Flash model. Users can select an image and provide a text prompt, and the generative model will respond based on both inputs. This showcases how to build a simple, yet powerful, multimodal AI with the Gemini API.

<div style="text-align: center;">
  <img width="320" alt="Gemini Multimodal in action" src="gemini_multimodal.png" />
</div>

## How it works

The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Gemini Flash. The core logic is in the [`GeminiDataSource.kt`](./src/main/java/com/android/ai/samples/geminimultimodal/data/GeminiDataSource.kt) file. A `generativeModel` is initialized. When a user provides an image and a text prompt, they are combined into a multimodal prompt and sent to the model, which then generates a text response.

Here is the key snippet of code that initializes the generative model:

```kotlin
private val generativeModel by lazy {
    Firebase.ai(backend = GenerativeBackend.googleAI()).generativeModel(
        "gemini-2.5-flash",
        generationConfig = generationConfig {
            temperature = 0.9f
            topK = 32
            topP = 1f
            maxOutputTokens = 4096
        },
        safetySettings = listOf(
            SafetySetting(HarmCategory.HARASSMENT, HarmBlockThreshold.MEDIUM_AND_ABOVE),
            SafetySetting(HarmCategory.HATE_SPEECH, HarmBlockThreshold.MEDIUM_AND_ABOVE),
            SafetySetting(HarmCategory.SEXUALLY_EXPLICIT, HarmBlockThreshold.MEDIUM_AND_ABOVE),
            SafetySetting(HarmCategory.DANGEROUS_CONTENT, HarmBlockThreshold.MEDIUM_AND_ABOVE),
        ),
    )
}
```

Here is the key snippet of code that implements the [`generateText`](./src/main/java/com/android/ai/samples/geminimultimodal/data/GeminiDataSource.kt) function:

```kotlin
suspend fun generateText(bitmap: Bitmap, prompt: String): String {
    val multimodalPrompt = content {
        image(bitmap)
        text(prompt)
    }
    val result = generativeModel.generateContent(multimodalPrompt)
    return result.text ?: ""
}
```
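A hedged usage sketch (the ViewModel and its wiring are assumptions, not the sample's code) of calling `generateText` from a coroutine:

```kotlin
import android.graphics.Bitmap
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.launch

// Sketch: wiring the sample's GeminiDataSource call into observable UI state.
class MultimodalSketchViewModel(
    private val geminiDataSource: GeminiDataSource,
) : ViewModel() {

    val resultText = MutableStateFlow("")

    fun describeImage(bitmap: Bitmap, prompt: String) {
        viewModelScope.launch {
            // generateText suspends while the multimodal prompt is processed.
            resultText.value = geminiDataSource.generateText(bitmap, prompt)
        }
    }
}
```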

Read more about [the Gemini API](https://developer.android.com/ai/gemini) in the Android Documentation.
@@ -0,0 +1,50 @@

# Gemini Video Metadata Creation Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to generate various types of video metadata (description, hashtags, chapters, account tags, links, and thumbnails) using Gemini Flash. Users can select a video, and the generative model will analyze its content to provide relevant metadata, showcasing how to enrich video content with AI-powered insights.

<div style="text-align: center;">
  <img width="320" alt="Gemini Video Metadata Creation in action" src="gemini_video_metadata.png" />
</div>

## How it works

The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Gemini Flash. The core logic involves several functions (e.g., [`generateDescription`](./src/main/java/com/android/ai/samples/geminivideometadatacreation/GenerateDescription.kt), `generateHashtags`, `generateChapters`, `generateAccountTags`, `generateLinks`, `generateThumbnails`) that send video content to the Gemini API for analysis. The model processes the video and returns structured metadata based on the specific prompt.

Here is a key snippet of code that generates a video description:

```kotlin
suspend fun generateDescription(videoUri: Uri): @Composable () -> Unit {
    val response = Firebase.ai(backend = GenerativeBackend.vertexAI())
        .generativeModel(modelName = "gemini-2.5-flash")
        .generateContent(
            content {
                fileData(videoUri.toString(), "video/mp4")
                text(
                    """
                    Provide a compelling and concise description for this video in less than 100 words.
                    Don't assume if you don't know.
                    The description should be engaging and accurately reflect the video's content.
                    You should output your responses in HTML format. Use styling sparingly. You can use the following tags:
                    * Bold: <b>
                    * Italic: <i>
                    * Underline: <u>
                    * Bullet points: <ul>, <li>
                    """.trimIndent(),
                )
            },
        )

    val responseText = response.text
    return if (responseText != null) {
        { DescriptionUi(responseText) }
    } else {
        { ErrorUi(response.promptFeedback?.blockReasonMessage) }
    }
}
```

Contributor (comment on lines +20 to +47):

A better approach would be for this function to reside in a data layer (e.g., a repository) and return only the data (e.g., the generated description text).
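A minimal sketch of that suggestion (all names below are hypothetical, not the sample's code): the data layer returns plain data and leaves rendering to the UI layer.

```kotlin
import android.net.Uri
import com.google.firebase.Firebase
import com.google.firebase.ai.ai
import com.google.firebase.ai.type.GenerativeBackend
import com.google.firebase.ai.type.content

// Sketch: the repository returns only data; no composables in the data layer.
class VideoMetadataRepository {

    suspend fun generateDescription(videoUri: Uri): Result<String> {
        val response = Firebase.ai(backend = GenerativeBackend.vertexAI())
            .generativeModel(modelName = "gemini-2.5-flash")
            .generateContent(
                content {
                    fileData(videoUri.toString(), "video/mp4")
                    text("Provide a compelling and concise description for this video in less than 100 words.")
                },
            )
        val text = response.text
        return if (text != null) {
            Result.success(text)
        } else {
            Result.failure(IllegalStateException(response.promptFeedback?.blockReasonMessage ?: "Response was blocked"))
        }
    }
}
```

The ViewModel or composable that observes this result can then choose between `DescriptionUi(...)` and `ErrorUi(...)`.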

Read more about [the Gemini API](https://developer.android.com/ai/gemini) in the Android Documentation.
@@ -0,0 +1,34 @@

# Gemini Video Summarization Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to generate a text summary of a video using Gemini Flash. Users can select a video, and the generative model will analyze its content to provide a concise summary, showcasing how to extract key information from video content with the Gemini API.

<div style="text-align: center;">
  <img width="320" alt="Gemini Video Summarization in action" src="gemini_video_summarization.png" />
</div>

## How it works

The application uses the Firebase AI SDK (see [How to run](../../#how-to-run)) for Android to interact with Gemini Flash. The core logic is in the [`VideoSummarizationViewModel.kt`](./src/main/java/com/android/ai/samples/geminivideosummary/viewmodel/VideoSummarizationViewModel.kt) file. A `generativeModel` is initialized. When a user requests a summary, the video content and a text prompt are sent to the model, which then streams back a text summary.

Here is the key snippet of code that calls the generative model:

```kotlin
val generativeModel =
    Firebase.ai(backend = GenerativeBackend.vertexAI())
        .generativeModel("gemini-2.5-flash")

val requestContent = content {
    fileData(videoSource.toString(), "video/mp4")
    text(promptData)
}
val outputStringBuilder = StringBuilder()
generativeModel.generateContentStream(requestContent).collect { response ->
    outputStringBuilder.append(response.text)
}
```
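How the streamed text reaches the UI is not shown in this diff. One hedged approach (names are assumptions, not the sample's code) is to append each chunk into observable state as it arrives:

```kotlin
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow

// Sketch: exposing the partial summary as state the UI can collect.
private val _summary = MutableStateFlow("")
val summary: StateFlow<String> = _summary.asStateFlow()

// Reuses generativeModel and requestContent from the snippet above.
suspend fun summarize() {
    _summary.value = ""
    generativeModel.generateContentStream(requestContent).collect { response ->
        // Append each streamed chunk so the UI updates incrementally.
        _summary.value += response.text.orEmpty()
    }
}
```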

Read more about [getting started with Gemini](https://developer.android.com/ai/gemini) in the Android Documentation.
@@ -0,0 +1,39 @@

# Image Description with On-Device Gemini Nano Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to generate short descriptions of images on-device using the ML Kit GenAI API powered by Gemini Nano. Users can select an image, and the model will generate a short descriptive text, showcasing the power of on-device multimodal AI.

<div style="text-align: center;">
  <img width="320" alt="Image Description with Nano in action" src="nano_image_description.png" />
</div>

## How it works

The application uses the ML Kit GenAI Image Description API to interact with the on-device Gemini Nano model. The core logic is in the [`GenAIImageDescriptionViewModel.kt`](https://github.com/android/ai-samples/blob/main/samples/genai-image-description/src/main/java/com/android/ai/samples/genai_image_description/GenAIImageDescriptionViewModel.kt) file. An `ImageDescriber` client is initialized. When a user provides an image, it's converted to a bitmap and sent to the `runInference` method, which streams back the generated description.

Contributor

For better maintainability, it's recommended to use relative paths for links to files within the repository instead of hardcoded GitHub URLs. This prevents links from breaking if the repository is forked, renamed, or if the default branch name changes.

Here is the key snippet of code that calls the generative model:

```kotlin
private var imageDescriber: ImageDescriber = ImageDescription.getClient(
    ImageDescriberOptions.builder(context).build(),
)
// ...

private suspend fun generateImageDescription(imageUri: Uri) {
    _uiState.value = GenAIImageDescriptionUiState.Generating("")
    val bitmap = MediaStore.Images.Media.getBitmap(context.contentResolver, imageUri)
    val request = ImageDescriptionRequest.builder(bitmap).build()

    imageDescriber.runInference(request) { newText ->
        _uiState.update {
            (it as? GenAIImageDescriptionUiState.Generating)?.copy(partialOutput = it.partialOutput + newText) ?: it
        }
    }.await()
    // ...
}
```

Read more about [GenAI Image Description API](https://developers.google.com/ml-kit/genai/image-description/android) in the documentation.
@@ -0,0 +1,38 @@

# Summarization with On-Device Gemini Nano Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to summarize articles and conversations on-device using the GenAI API powered by Gemini Nano. Users can input text, and the model will generate a summary in 1-3 bullet points, showcasing the power of on-device text processing with AI.

<div style="text-align: center;">
  <img width="320" alt="Summarization with Nano in action" src="nano_summarization.png" />
</div>

## How it works

The application uses the ML Kit GenAI Summarization API to interact with the on-device Gemini Nano model. The core logic is in the `GenAISummarizationViewModel.kt` file. A `Summarizer` client is initialized. When a user provides text, it's passed to the `runInference` method, which streams back the generated summary.

Here is the key snippet of code that calls the generative model from [`GenAISummarizationViewModel.kt`](./src/main/java/com/android/ai/samples/genai_summarization/GenAISummarizationViewModel.kt):

```kotlin
private suspend fun generateSummarization(summarizer: Summarizer, textToSummarize: String) {
    _uiState.value = GenAISummarizationUiState.Generating("")
    val summarizationRequest = SummarizationRequest.builder(textToSummarize).build()

    try {
        // Instead of using await() here, alternatively you can attach a FutureCallback<SummarizationResult>
        summarizer.runInference(summarizationRequest) { newText ->
            (_uiState.value as? GenAISummarizationUiState.Generating)?.let { generatingState ->
                _uiState.value = generatingState.copy(generatedOutput = generatingState.generatedOutput + newText)
            }
        }.await()
    } catch (genAiException: GenAiException) {
        // ...
    }
    // ...
}
```
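The comment in the snippet mentions attaching a `FutureCallback<SummarizationResult>` instead of calling `await()`. A hedged sketch of that alternative (the executor choice and the availability of a `context` here are assumptions):

```kotlin
import androidx.core.content.ContextCompat
import com.google.common.util.concurrent.FutureCallback
import com.google.common.util.concurrent.Futures

// Sketch: callback-style handling of the same inference call.
Futures.addCallback(
    summarizer.runInference(summarizationRequest) { newText ->
        // Streaming callback, same as in the snippet above.
    },
    object : FutureCallback<SummarizationResult> {
        override fun onSuccess(result: SummarizationResult) {
            // The complete summary is available here.
        }

        override fun onFailure(t: Throwable) {
            // Surface the error to the UI.
        }
    },
    ContextCompat.getMainExecutor(context),
)
```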

Read more about [GenAI Summarization API](https://developers.google.com/ml-kit/genai/summarization/android) in the documentation.
@@ -0,0 +1,31 @@

# Writing Assistance with On-Device Gemini Nano Sample

This sample is part of the [AI Sample Catalog](../../). To build and run this sample, you should clone the entire repository.

## Description

This sample demonstrates how to proofread and rewrite short content on-device using the ML Kit GenAI APIs powered by Gemini Nano. Users can input text and choose to either proofread it for grammar and spelling errors or rewrite it in various styles, showcasing on-device text manipulation with AI.

<div style="text-align: center;">
  <img width="320" alt="Writing Assistance with Nano in action" src="nano_rewrite.png" />
</div>

## How it works

The application uses the ML Kit GenAI Proofreading and Rewriting APIs to interact with the on-device Gemini Nano model. The core logic is in the [`GenAIWritingAssistanceViewModel.kt`](https://github.com/android/ai-samples/blob/main/samples/genai-writing-assistance/src/main/java/com/android/ai/samples/genai_writing_assistance/GenAIWritingAssistanceViewModel.kt) file. `Proofreader` and `Rewriter` clients are initialized. When a user provides text, it's passed to either the `runProofreadingInference` or `runRewritingInference` method, which then returns the polished text.

Contributor

For better maintainability, it's recommended to use relative paths for links to files within the repository instead of hardcoded GitHub URLs. This prevents links from breaking if the repository is forked, renamed, or if the default branch name changes.

Here is the key snippet of code that runs the proofreading inference from [`GenAIWritingAssistanceViewModel.kt`](./src/main/java/com/android/ai/samples/genai_writing_assistance/GenAIWritingAssistanceViewModel.kt):

```kotlin
private suspend fun runProofreadingInference(textToProofread: String) {
    val proofreadRequest = ProofreadingRequest.builder(textToProofread).build()
    // More than 1 result may be generated. Results are returned in descending order of
    // quality of confidence. Here we use the first result which has the highest quality
    // of confidence.
    _uiState.value = GenAIWritingAssistanceUiState.Generating
    val results = proofreader.runInference(proofreadRequest).await()
    _uiState.value = GenAIWritingAssistanceUiState.Success(results.results[0].text)
}
```
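The rewriting path is not shown in this diff. Assuming the ML Kit GenAI Rewriting API mirrors the Proofreading call above (a hedged sketch; the request builder and result handling below are assumptions):

```kotlin
// Sketch: rewriting inference mirroring the proofreading call above.
private suspend fun runRewritingInference(textToRewrite: String) {
    val rewritingRequest = RewritingRequest.builder(textToRewrite).build()
    _uiState.value = GenAIWritingAssistanceUiState.Generating
    val results = rewriter.runInference(rewritingRequest).await()
    // As with proofreading, take the first (highest-confidence) result.
    _uiState.value = GenAIWritingAssistanceUiState.Success(results.results[0].text)
}
```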

Read more in the [GenAI Proofreading](https://developers.google.com/ml-kit/genai/proofreading/android) and [GenAI Rewriting](https://developers.google.com/ml-kit/genai/rewriting/android) documentation.
Hoping this works. Maybe we should also add "if model version is updated in the sample make sure it's updated in the readme as well", wdyt?
WDYT about adding a separate section for the Readme? Adding it to the view model section implies that the business logic will live in that file. Instead we could say something like:
Those are good suggestions in principle, but we should test them to see if they work.
In my (limited) testing, I wasn't able to have Gemini Code Assist specifically pick up an isolated model change and suggest a README update.
So I will default to removing the model version from the README files for now.