Arne van Elk’s Post

Arne van Elk, SEO consultant @ Kinesso (formerly Reprise Digital), specialist in search & information architecture

This is a nice & short explanation of the concept of ‘grounding’ of large language models. Grounding is important for reducing AI hallucinations.

Gary Illyes, Analyst at Google

I need a place to point to, so let's talk about grounding in generative AI.

Plain old generative AI makes predictions about what concepts are likely to follow each other given a prompt, and then gives the prompter the most likely string of concepts for that prompt. A is followed by B, which is followed by C. Based on the training data, it's unlikely that the next in line will be E; rather, it's likely it'll be D. Thus you get "A, B, C, D" for a prompt like "what are the first 4 letters of the Latin alphabet." It doesn't "know" that; it just knows that there's a 98.7% chance that D is the letter following C, and only a 0.7% chance that E is.

If the likelihoods of D and E following C are close to each other, the LLM might hallucinate and tell you the first four letters of the Latin alphabet are A, B, C, E. This can be caused by a bunch of things, but it's usually because of ambiguity in or staleness of the training data. You can ramp up the quality of your training data to solve this, but usually that's not feasible for one reason or another.

However, there's a quick hack (?) to decrease the chance of hallucinations: grounding. We could add another data source to our system, say an elementary school textbook that actually contains the whole Latin alphabet, and essentially verify the output of the LLM against that source. Now the output of the LLM is grounded.
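If it helps to make that verification step concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the toy probability table, the generate() stand-in for the LLM, and the hard-coded alphabet playing the role of the textbook.

    # Toy next-token probabilities the model might assign after "A, B, C,"
    next_token_probs = {"D": 0.987, "E": 0.007, "F": 0.006}

    def generate(prompt: str) -> str:
        # Stand-in for an LLM: return the continuation with the highest probability.
        best = max(next_token_probs, key=next_token_probs.get)
        return f"A, B, C, {best}"

    # Grounding source: an authoritative reference (the "elementary school textbook").
    REFERENCE = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

    def grounded_answer(prompt: str) -> str:
        answer = generate(prompt)
        letters = [t.strip() for t in answer.split(",")]
        # Verify the model's output against the reference before returning it.
        if "".join(letters) == REFERENCE[: len(letters)]:
            return answer
        # If the check fails, answer from the reference instead of the model's guess.
        return ", ".join(REFERENCE[: len(letters)])

    print(grounded_answer("what are the first 4 letters of the Latin alphabet"))
    # -> A, B, C, D

The point is only the shape of the flow: generate first, check the output against a trusted source, and only then return (or correct) the answer.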
