Over the last few weeks, we’ve discussed the importance of applying redaction securely and how failing to do so can lead to some embarrassing oversights. This week, we’re diving into how artificial intelligence (AI) can actually find the text to be redacted – but not just any AI. Let’s unpack this.
The Power of Machine Learning and Large Language Models
Not all AI is created equal. Some might call regex a form of AI, but it has serious limitations for redaction: it matches fixed patterns of characters and takes no account of what the surrounding text means. Whilst regex and search techniques do have their place (see our previous blog on The Best Approach to AI Redaction), for reliable, accurate results we turn to machine learning, specifically a cutting-edge approach called Large Language Models (LLMs) – the same technology behind tools like ChatGPT.
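To make the regex limitation concrete, here is a minimal sketch in Python. The pattern and the sample sentences are our own illustrative assumptions (a deliberately naive "two capitalised words in a row" rule for spotting names), not a pattern taken from any real redaction product.

```python
import re

# Hypothetical, deliberately naive rule: two capitalised words in a row
# are treated as a person's name. Real rule sets are larger, but they
# share the same weakness: they look at characters, not at meaning.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")


def redact_names(text: str) -> str:
    """Replace every match of the naive name pattern with a placeholder."""
    return NAME_PATTERN.sub("[REDACTED]", text)


# A name written as expected: the pattern catches it.
caught = redact_names("The contract was signed by John Smith on Monday.")

# The same name in lower case: the pattern misses it entirely.
missed = redact_names("The contract was signed by john smith on Monday.")

# A place name with the same shape: the pattern redacts it by mistake.
false_hit = redact_names("The office is located in New York.")

print(caught)
print(missed)
print(false_hit)
```

The first sentence is redacted as intended, the second leaks the name because the capitalisation changed, and the third redacts a city that merely looks like a name. Tightening the pattern to fix one case tends to break another, which is why pattern matching alone struggles with redaction.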
What Exactly Is a Large Language Model?
Think of an LLM as a super-smart neural network trained on massive amounts of text, like Wikipedia or public web data. Its job? To predict the next word in a sentence (or fill in missing words) based on the surrounding context. It’s like training for a “fill-in-the-blank” test, such as completing the phrase “monkeys like to ___ bananas.” LLMs are designed to ace this!
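The “fill-in-the-blank” idea can be sketched in a few lines of Python. To be clear, this is a toy: it simply counts which word appeared between two neighbouring words in a tiny made-up corpus, whereas a real LLM is a neural network trained on vastly more text and far more context. The corpus and the function names (`build_context_counts`, `fill_blank`) are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up training corpus (an assumption for illustration only).
corpus = [
    "monkeys like to eat bananas",
    "children like to eat bananas",
    "monkeys like to eat fruit",
    "monkeys like to climb trees",
]


def build_context_counts(sentences):
    """Count which word appears between each (left, right) pair of neighbours."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(1, len(words) - 1):
            counts[(words[i - 1], words[i + 1])][words[i]] += 1
    return counts


def fill_blank(counts, left, right):
    """Return the most frequently seen word between `left` and `right`, or None."""
    candidates = counts.get((left, right))
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


counts = build_context_counts(corpus)

# "monkeys like to ___ bananas"
guess = fill_blank(counts, "to", "bananas")
print(guess)
```

Notice that nobody had to label anything: the sentences themselves supply the correct answers. The toy model also shows its limits quickly, since it returns nothing for a context it has never seen, which is exactly the gap that large neural networks trained on massive datasets are built to close.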
Through this training, LLMs learn the patterns of language, grammar, style, and even meaning. This is why ChatGPT can sound so eloquent and informative (though it’s not always 100% accurate – check out this article for a fun take: ChatGPT Is No Stochastic Parrot, But It Thinks 1 Is Greater Than 1). Best of all, this initial training doesn’t need humans to manually label the data, because the text itself supplies the missing words. That means LLMs can learn from vast datasets with minimal human effort (just some brilliant data scientists and powerful computers!).
How Imprima Uses LLMs for Redaction
At Imprima, we’ve tailored an LLM specifically for redaction, training it on the types of documents commonly found in due diligence. Because we use LLMs, our technology works across multiple languages, using data from one language to boost accuracy in others. Instead of searching for exact words or phrases like traditional methods, the LLM understands the context and meaning of the text. This allows it to identify and redact sensitive information with far greater precision.
The result? Unmatched accuracy and reliability that blow other automated redaction tools out of the water. We’ve demonstrated this in a previous post (Smart VDRs: It’s All About Accuracy), and we’ll dive deeper into the results next week.
Why This Matters for Due Diligence
At Imprima, we believe LLMs can revolutionise due diligence. Beyond redaction, they can extract critical insights from documents (like our Smart Summaries tool) to streamline the DD process. Using AI in due diligence allows users to gather information faster and gives them a “second pair of eyes”, so they can focus on what matters most. We’re focused on building trustworthy, reliable AI and delivering its practical benefits for M&A due diligence.
What’s Next?
Coming soon, we’ll share real-world results from our LLM-based redaction tool, showing just how accurate and reliable it is when applied to actual DD data.
Stay tuned!
Are you looking for a VDR with fully integrated redaction software which leverages AI? Speak to our sales team or check out our Smart Redaction page here.