16 changes: 16 additions & 0 deletions back_end/ai_risk_justifications.csv
@@ -0,0 +1,16 @@
Risk Type,Risk Description,Low,Low / Medium,Medium,Medium / High,High
Data Privacy,Storage of PII or other sensitive information,Public data only,"Contains non-sensitive personal information
(Ex: user authentication / login)","Contains non-sensitive personal or confidential information
(Ex: collecting chat logs)","Contains regulated data types, all anonymized","Contains regulated data types, NOT all anonymized"
Intellectual Property / Copyright,Violation of IP / Copyright policy,Public domain/licensed data,"Mostly licensed, minor infringement risk",Mixed licensed / unlicensed content,Significant unclear / infringing content,"Highly likely to infringe, unlicensed content"
Misinformation,Hallucination and / or providing inaccurate information,"Highly reliable, factual",Minor inaccuracies in rare cases,"Noticeable inaccuracies, needs verification","Frequent inaccuracies, needs oversight","Highly prone to hallucination, unreliable"
"Dangerous, Violent, or Hateful Content","Creation of any dangerous, violent, or hateful content",Strong safeguards prevent harm,"Harmful content unlikely, edge cases",Moderate effort to generate harmful content,Minimal prompting for harmful content,Proactive harmful content or easily manipulated
Bias,"Any harmful bias, related to model, architecture, data ingestion, etc.",No apparent bias,Bias unrelated to use case,"Bias present, mitigated in planned usage",Bias impacts most planned usage,Bias impacts all usage
Information Leak,Leak of any information trained on and / or retained by the system,No sensitive data,"Minimal sensitive data, sanitized","Potential sensitive data, unlikely leakage","Sensitive data, relative leakage risk",Highly likely to leak sensitive info
Prompt Attack,Susceptibility to adversarial prompt attacks and / or prompt injections,Highly robust,"Generally robust, known vulnerabilities",Vulnerable to some prompt injections,Susceptible to attacks,Easily compromised via prompt injection
Decision Making,Autonomy level of the AI within the system,Info/recommendations only,"Automated tasks, strict boundaries","Moderate autonomy, safeguards","Significant autonomy, some oversight","Complete autonomy, no human intervention"
New Technology,Novelty of technology or tool being implemented,Well-established tech and infrastructure,"Relatively new, growing support","New tech, limited support","Very new / experimental, unstable","Bleeding-edge, unproven, high failure risk"
Model Drift,Upkeep after initial deployment to retain and / or improve performance,"Highly stable, minimal maintenance","Slow degradation, infrequent retraining","Moderate degradation, regular retraining","Rapid degradation, frequent retraining","Highly unstable, constant retraining needed"
Data Maintenance,Requirements to update / refresh source data,Static/rarely changing data,Occasional updates (yearly),"Regular updates (monthly), moderate complexity","Frequent updates (weekly / daily), complex","Constant, dynamic updates, highly complex"
Reputation Risk,Damage to public trust and perception of the state agency due to AI performance or outputs,"Limited and defined scope of use case without free-form interaction, small portion of public expected to interact",Limited and defined scope of use case without free-form interaction,"Use case has free-form interaction, small portion of public expected to interact","Use case has free-form interaction, delivering non-critical information",Use case has free-form interaction and will deliver critical information
Discriminatory Public Impact,Insufficient disclosure of AI use or inability to explain AI-driven public-facing decisions,"Explainability of all decisions documented, automatically presented in outputs","Explainability of all decisions documented, available upon request","Some decisions cannot be explained, no impact to vulnerable members of population","Most decisions cannot be explained, no impact to vulnerable members of population","Most decisions cannot be explained, expected to impact vulnerable members of population"
Binary file removed back_end/ai_risk_justifications.xlsx
Binary file not shown.
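For context, the CSV added above is the risk matrix that `perform_risk_assessment` consumes as `risk_matrix_df` (see the diff below). A minimal loading sketch with pandas; the file path and the assumption that the risk-level columns start at the third column are illustrative, not asserted by this PR:

```python
import pandas as pd

# Load the risk matrix shipped with the back end (path is illustrative).
risk_matrix_df = pd.read_csv("back_end/ai_risk_justifications.csv")

# Assumption: the first two columns name and describe each risk category,
# and the remaining columns are the risk levels ordered lowest to highest.
risk_levels = list(risk_matrix_df.columns[2:])  # ["Low", ..., "High"]

for _, row in risk_matrix_df.iterrows():
    print(f"{row['Risk Type']}: {row['Risk Description']}")
```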
87 changes: 31 additions & 56 deletions back_end/gemini_agents.py
@@ -24,8 +24,8 @@
from back_end.gemini_authentication import initialize_gemini_client

# Import Google Generative AI libraries
from google import genai
from google.genai.types import Tool, GenerateContentConfig, UrlContext
import google.generativeai as genai
from google.generativeai.types import GenerationConfig

# Constants for agent configuration
TEMPERATURE = 0.2
@@ -60,15 +60,14 @@ def extract_response_text(response):
return ""

# Create a wrapper function to handle all Gemini API calls
def execute_gemini_request(client, model, contents, config, agent_name="Gemini API", retries=3, delay=2, backoff_factor=5):
def execute_gemini_request(model, contents, config, agent_name="Gemini API", retries=3, delay=2, backoff_factor=5):
"""
Comprehensive wrapper function for Gemini API calls with robust error handling.
Provides retry logic with exponential backoff, checks for blocked prompts,
Provides retry logic with exponential backoff, checks for blocked prompts,
validates response content, and handles various error conditions.

Args:
client: The Gemini API client
model: The model ID to use
model: The Gemini generative model
contents: The content to send to the API
config: The API configuration
agent_name: Name of the agent for display purposes
@@ -82,10 +81,9 @@ def execute_gemini_request(client, model, contents, config, agent_name="Gemini A
for attempt in range(retries):
try:
# Make the API call
response = client.models.generate_content(
model=model,
response = model.generate_content(
contents=contents,
config=config
generation_config=config
)

# Check for immediate prompt blocking
@@ -160,7 +158,7 @@ def execute_gemini_request(client, model, contents, config, agent_name="Gemini A
return False, "Maximum retries exceeded.", True

# Function to get response from Gemini Model API
def get_gemini_response(client, user_message, chat_history=None):
def get_gemini_response(model, user_message, chat_history=None):
"""Get response from Gemini Model API"""
# Prepare the conversation
contents = []
@@ -175,13 +173,11 @@ def get_gemini_response(client, user_message, chat_history=None):

try:
# Generate content with Gemini
response = client.models.generate_content(
model=MODEL_ID,
response = model.generate_content(
contents=contents,
config=GenerateContentConfig(
generation_config=GenerationConfig(
temperature=TEMPERATURE,
max_output_tokens=MAX_OUTPUT_TOKENS,
response_modalities=["TEXT"],
)
)

@@ -191,7 +187,7 @@ def get_gemini_response(client, user_message, chat_history=None):
return f"Sorry, I encountered an error: {str(e)}"

# URL Agent using Gemini
def url_agent(client, urls, project_details, risk_matrix_csv, user_question):
def url_agent(model, urls, project_details, risk_matrix_csv, user_question):
"""Researches information from URLs in the context of project details"""

# Construct the system prompt
@@ -218,21 +214,15 @@ def url_agent(client, urls, project_details, risk_matrix_csv, user_question):
)

try:
# Create URL context tool
url_context_tool = Tool(url_context=UrlContext())

# Prepare the API config with URL tool
config = GenerateContentConfig(
# Prepare the API config
config = GenerationConfig(
temperature=TEMPERATURE,
max_output_tokens=MAX_OUTPUT_TOKENS,
tools=[url_context_tool],
response_modalities=["TEXT"],
)

# Make the API call using our wrapper
success, response_or_error, is_error = execute_gemini_request(
client=client,
model=MODEL_ID,
model=model,
contents=system_prompt,
config=config,
agent_name="URL Research Agent"
@@ -242,28 +232,23 @@ def url_agent(client, urls, project_details, risk_matrix_csv, user_question):
if not success:
# Log the error and return a structured error response
st.error(f"Error in URL Agent: {response_or_error}")
return f"Error in URL research: {response_or_error}", None, None
return f"Error in URL research: {response_or_error}", None

# Process the successful response
response = response_or_error

# Check for URL metadata in the response
url_metadata = None
if hasattr(response, 'candidates') and response.candidates and hasattr(response.candidates[0], 'url_context_metadata'):
url_metadata = response.candidates[0].url_context_metadata

# Extract the response text
response_text = extract_response_text(response)

# Just return the response text directly
return response_text.strip() if response_text else "No response was generated from URL analysis.", None, url_metadata
return response_text.strip() if response_text else "No response was generated from URL analysis.", None

except Exception as e:
st.error(f"Error in URL Agent: {str(e)}")
return f"Error in URL research: {str(e)}", None, None
return f"Error in URL research: {str(e)}", None

# Compiling Agent using Gemini
def compiling_agent(client, pdf_agent_result, url_agent_result, url_metadata, project_details, risk_matrix_csv, user_question):
def compiling_agent(model, pdf_agent_result, url_agent_result, project_details, risk_matrix_csv, user_question):
"""Compiles results from PDF and URL agents to provide a comprehensive answer"""

# Construct the system prompt
@@ -288,8 +273,6 @@ def compiling_agent(client, pdf_agent_result, url_agent_result, url_metadata, pr
# Add URL agent results if available
if url_agent_result:
system_prompt += f"**URL Research Results:**\n{url_agent_result}\n\n"
if url_metadata:
system_prompt += f"**URL Metadata:**\n{str(url_metadata)}\n\n"
else:
system_prompt += "**URL Research Results:** No URL data was provided or analyzed.\n\n"

@@ -300,17 +283,15 @@ def compiling_agent(client, pdf_agent_result, url_agent_result, url_metadata, pr
)

# Prepare the API config
config = GenerateContentConfig(
config = GenerationConfig(
temperature=TEMPERATURE,
max_output_tokens=MAX_OUTPUT_TOKENS,
response_modalities=["TEXT"],
)

try:
# Make the API call using our wrapper
success, response_or_error, is_error = execute_gemini_request(
client=client,
model=MODEL_ID,
model=model,
contents=system_prompt,
config=config,
agent_name="Compiling Agent"
@@ -338,7 +319,7 @@ def compiling_agent(client, pdf_agent_result, url_agent_result, url_metadata, pr
return f"Error in compiling results: {str(e)}", None

# Risk Assessment Agent using Gemini
def risk_assessment_agent(client, risk_category, risk_levels, category_explanations, project_details, pdf_data=None, urls=None):
def risk_assessment_agent(model, risk_category, risk_levels, category_explanations, project_details, pdf_data=None, urls=None):
"""Agent that assesses a specific risk category based on project details and reference materials"""

# Construct the system prompt
@@ -383,17 +364,15 @@ def risk_assessment_agent(client, risk_category, risk_levels, category_explanati
)

# Prepare the API config
config = GenerateContentConfig(
config = GenerationConfig(
temperature=TEMPERATURE,
max_output_tokens=MAX_OUTPUT_TOKENS,
response_modalities=["TEXT"],
)

try:
# Make the API call using our wrapper
success, response_or_error, is_error = execute_gemini_request(
client=client,
model=MODEL_ID,
model=model,
contents=system_prompt,
config=config,
agent_name=f"Risk Assessment Agent - Investigating: {risk_category}"
@@ -472,12 +451,12 @@ def risk_assessment_agent(client, risk_category, risk_levels, category_explanati
}

# Perform Risk Assessment function
def perform_risk_assessment(client, risk_matrix_df, risk_levels, project_details, pdf_files=None, urls=None):
def perform_risk_assessment(model, risk_matrix_df, risk_levels, project_details, pdf_files=None, urls=None):
"""
Evaluates all risk categories in the risk matrix and returns a comprehensive assessment.

Args:
client: The Gemini API client
model: The Gemini generative model
risk_matrix_df: DataFrame containing risk categories and level descriptions
risk_levels: List of risk levels from lowest to highest
project_details: Dictionary of project information for assessment
@@ -515,7 +494,7 @@ def perform_risk_assessment(client, risk_matrix_df, risk_levels, project_details
st.info(f"Processing {len(pdf_files)} PDF files. This may take a few minutes...")

# Get PDF parts for Gemini API
pdf_parts = process_pdfs_with_gemini_file_api(client, pdf_files, risk_matrix_df)
pdf_parts = process_pdfs_with_gemini_file_api(pdf_files, risk_matrix_df)

# Prepare PDF data summary for context
if pdf_parts:
@@ -577,7 +556,7 @@ def perform_risk_assessment(client, risk_matrix_df, risk_levels, project_details
try:
# Call the risk assessment agent
assessment = risk_assessment_agent(
client=client,
model=model,
risk_category=category,
risk_levels=risk_levels,
category_explanations=category_explanations,
@@ -608,12 +587,11 @@ def perform_risk_assessment(client, risk_matrix_df, risk_levels, project_details
return risk_assessment_results

# Function to process PDFs with Gemini
def process_pdfs_with_gemini_file_api(client, pdf_files, risk_matrix_df=None):
def process_pdfs_with_gemini_file_api(pdf_files, risk_matrix_df=None):
"""
Uploads PDFs directly to Gemini using the File API

Args:
client: The Gemini client instance
pdf_files: List of PDF file dictionaries from Streamlit uploader
risk_matrix_df: Optional parameter for compatibility, not used in this implementation

@@ -651,14 +629,11 @@ def process_pdfs_with_gemini_file_api(client, pdf_files, risk_matrix_df=None):
file_path = pathlib.Path(tmp_path)

try:
# Create the PDF part using the correct method
pdf_part = genai.types.Part.from_bytes(
data=open(tmp_path, 'rb').read(),
mime_type='application/pdf'
)
# Upload the file
uploaded_file = genai.upload_file(path=tmp_path, display_name=file_name)

# Add the part to our list
uploaded_gemini_files.append(pdf_part)
uploaded_gemini_files.append(uploaded_file)

except Exception as e:
st.error(f"Error processing {file_name}: {str(e)}")
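Taken together, the changes in this file swap the client-based call style (`client.models.generate_content(model=MODEL_ID, ..., config=GenerateContentConfig(...))`) for the model-based style of the classic `google.generativeai` SDK. A minimal sketch of the pattern the new wrapper expects; the API key, model name, prompt, and token limit below are placeholders, not values taken from this PR:

```python
import google.generativeai as genai
from google.generativeai.types import GenerationConfig

genai.configure(api_key="YOUR_API_KEY")            # placeholder key
model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model ID

config = GenerationConfig(temperature=0.2, max_output_tokens=1024)

# Roughly what execute_gemini_request now does on each retry attempt.
response = model.generate_content(
    contents="Summarize the project's AI risks.",
    generation_config=config,
)
print(response.text)
```

PDF handling follows the same SDK: `genai.upload_file(path=..., display_name=...)` returns a file handle that can be passed in `contents` alongside a text prompt.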
25 changes: 12 additions & 13 deletions back_end/gemini_authentication.py
@@ -7,8 +7,9 @@
Handles API key validation and client initialization.
"""

from google import genai
import google.generativeai as genai
import streamlit as st
from back_end.gemini_agents import MODEL_ID

def initialize_gemini_client(api_key):
"""
@@ -25,12 +26,11 @@ def initialize_gemini_client(api_key):
return None

try:
# Create a client with the API key (following test.py pattern)
client = genai.Client(api_key=api_key)

# Don't try to verify with list() as that doesn't work
# Just return the initialized client
return client
# Configure the client with the API key
genai.configure(api_key=api_key)
# Create a generative model
model = genai.GenerativeModel(MODEL_ID)
return model
except Exception as e:
st.error(f"Failed to initialize Gemini client: {str(e)}")
return None
@@ -49,11 +49,10 @@ def validate_api_key(api_key):
return False

try:
# Create a client with the API key
client = genai.Client(api_key=api_key)

# We can't use list() to verify since that doesn't exist
# Just return True if we can create the client without error
# Configure the client with the API key
genai.configure(api_key=api_key)
# Create a generative model to test the key
genai.GenerativeModel(MODEL_ID)
return True
except Exception:
return False
return False
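A minimal sketch of how the refactored helpers might be wired up from the Streamlit side; the widget labels and session-state key are illustrative, not part of this PR:

```python
import streamlit as st

from back_end.gemini_authentication import initialize_gemini_client, validate_api_key

api_key = st.text_input("Gemini API key", type="password")

if api_key and validate_api_key(api_key):
    # initialize_gemini_client now returns a GenerativeModel (or None on failure),
    # which the agents in gemini_agents.py call generate_content on directly.
    model = initialize_gemini_client(api_key)
    if model is not None:
        st.session_state["gemini_model"] = model  # illustrative session-state key
        st.success("Gemini model initialized.")
elif api_key:
    st.error("The provided API key could not be validated.")
```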
3 changes: 3 additions & 0 deletions back_end/risk_matrix_template.csv
@@ -0,0 +1,3 @@
Risk Type,Risk Description,Low,Low / Medium,Medium,Medium / High,High
{insert name of the risk type},"{insert description of the risk type, 1-3 sentences}",{insert criteria for this risk level},{insert criteria for this risk level},{insert criteria for this risk level},{insert criteria for this risk level},{insert criteria for this risk level}
{repeat above for as many rows as needed},{repeat above for as many rows as needed},{repeat above for as many rows as needed},{repeat above for as many rows as needed},{repeat above for as many rows as needed},{repeat above for as many rows as needed},{repeat above for as many rows as needed}
Binary file removed back_end/risk_matrix_template.xlsx
Binary file not shown.
2 changes: 2 additions & 0 deletions find_google.py
@@ -0,0 +1,2 @@
import google
print(google.__path__)
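This helper only prints where the `google` namespace package resolves from, presumably to debug which SDK variants are installed. A possible extension of the same diagnostic, assuming the goal is to confirm whether `google.generativeai` and/or `google.genai` are importable (not part of this PR):

```python
import importlib.util

import google
print(google.__path__)

# Report which Gemini SDK variants are importable in this environment.
for name in ("google.generativeai", "google.genai"):
    installed = importlib.util.find_spec(name) is not None
    print(f"{name}: {'installed' if installed else 'not found'}")
```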