Description
I noticed a large difference in confidence scores between the base and large models on identical input: the base model flags the same hallucinated span at ~0.99 confidence, while the large model reports only ~0.76. I understand that model size can affect scores, but I'm curious why the gap is this substantial.
Base:
```python
from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedect-base-modernbert-en-v1",
)

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)
```
Output:
```
Predictions: [{'start': 31, 'end': 71, 'confidence': 0.9891987442970276, 'text': ' The population of France is 69 million.'}]
```
Large:
```python
from lettucedetect.models.inference import HallucinationDetector

detector = HallucinationDetector(
    method="transformer",
    model_path="KRLabsOrg/lettucedect-large-modernbert-en-v1",
)

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

predictions = detector.predict(context=contexts, question=question, answer=answer, output_format="spans")
print("Predictions:", predictions)
```
Output:
```
Predictions: [{'start': 31, 'end': 71, 'confidence': 0.7649378180503845, 'text': ' The population of France is 69 million.'}]
```
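
For completeness, here is a minimal repro sketch that runs both checkpoints over the same input and prints the span confidences side by side. It only uses the `predict` call shown above; nothing beyond the documented API is assumed:

```python
from lettucedetect.models.inference import HallucinationDetector

contexts = ["France is a country in Europe. The capital of France is Paris. The population of France is 67 million."]
question = "What is the capital of France? What is the population of France?"
answer = "The capital of France is Paris. The population of France is 69 million."

# Run both checkpoints on identical input and compare span-level confidences.
for model_path in (
    "KRLabsOrg/lettucedect-base-modernbert-en-v1",
    "KRLabsOrg/lettucedect-large-modernbert-en-v1",
):
    detector = HallucinationDetector(method="transformer", model_path=model_path)
    predictions = detector.predict(
        context=contexts, question=question, answer=answer, output_format="spans"
    )
    for span in predictions:
        print(f"{model_path}: confidence={span['confidence']:.4f} text={span['text']!r}")
```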