There is a question that every organisation should ask before deploying an AI system that handles sensitive data: does the AI see everything first, or does it only see what it should?
The answer to this question defines the security architecture of the entire system. And the difference between the two approaches — post-retrieval filtering and pre-retrieval enforcement — is not a matter of degree. It is a fundamentally different way of thinking about data protection.
The Core Question
When an AI system needs to answer a question using your organisation's documents and data, it must retrieve relevant information. The security question is simple: at what point do you enforce access controls?
- Post-retrieval filtering: The AI searches across all data, retrieves everything potentially relevant, and then filters out anything the user should not see — before showing results.
- Pre-retrieval enforcement: The AI is structurally prevented from searching data the user is not authorised to access. Unauthorised data is never retrieved, never processed, never touched.
Both approaches aim to show the user only the data they are allowed to see. The critical difference is what happens behind the scenes.
How Post-Retrieval Filtering Works
Post-retrieval filtering is the most common approach in AI platforms today. It is used by most enterprise AI tools, most RAG (Retrieval-Augmented Generation) frameworks, and most AI assistants that connect to company data.
The process works like this:
1. A user asks the AI a question.
2. The AI searches across all available data — every document, every file, every database it has access to.
3. The system retrieves a set of relevant results.
4. A filtering layer checks each result against the user's access permissions.
5. Results the user should not see are removed.
6. The remaining results are shown to the user.
On the surface, this looks secure. The user only sees what they are allowed to see. But look at steps 2 and 3 again. The AI has already searched through and processed data that the user is not authorised to access. The data has been retrieved into memory. It has been analysed for relevance. It has been processed by the AI model.
The filtering happens after the fact. The data has already been seen by the system.
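The pipeline above can be sketched in a few lines of Python. Everything here is illustrative: `Doc`, `Index`, and the `ACL` map are hypothetical stand-ins for a real vector store and permission system, not any actual product's code.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str

class Index:
    """Toy keyword index standing in for a real vector store."""
    def __init__(self, docs):
        self.docs = docs

    def search(self, query):
        return [d for d in self.docs if query.lower() in d.text.lower()]

DOCS = [
    Doc("hr-1", "Salary review policy for all staff"),
    Doc("eng-1", "Salary bands for the engineering team"),
]
ACL = {"alice": {"eng-1"}}  # Alice is authorised for engineering docs only

def post_retrieval_answer(query, user, index, acl):
    # Steps 2-3: search the ENTIRE corpus; every document is a candidate.
    candidates = index.search(query)
    touched = {d.doc_id for d in candidates}  # everything the system retrieved
    # Steps 4-5: filter only AFTER retrieval, against the user's permissions.
    allowed = [d for d in candidates if d.doc_id in acl.get(user, set())]
    return allowed, touched

allowed, touched = post_retrieval_answer("salary", "alice", Index(DOCS), ACL)
# "hr-1" appears in `touched` even though it never appears in `allowed`:
# the unauthorised document was retrieved and scored before being filtered.
```

The point of tracking `touched` is that it is non-empty in ways `allowed` never reveals: the unauthorised document was in memory, regardless of what the user ultimately saw.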
How Pre-Retrieval Enforcement Works
Pre-retrieval enforcement takes a different approach entirely. Instead of searching everything and filtering afterwards, the system restricts the search space before any retrieval begins.
The process works like this:
1. A user asks the AI a question.
2. The system checks the user's access permissions and determines which data sources they are authorised to access.
3. The AI searches only within the authorised data.
4. Results are verified against permissions a second time before being shown.
5. The user sees the results.
At no point does the AI search, retrieve, or process data that the user is not authorised to access. The unauthorised data is not filtered out — it is never reached in the first place.
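Restructured so that access control runs before retrieval, the same idea can be sketched as follows. This is a minimal toy (hypothetical names throughout), not a real implementation:

```python
DOCS = {
    "hr-1": "Salary review policy for all staff",
    "eng-1": "Salary bands for the engineering team",
}
ACL = {"alice": {"eng-1"}}  # Alice is authorised for engineering docs only

def pre_retrieval_answer(query, user, docs, acl):
    # Scope first: the search space contains ONLY authorised documents.
    authorised = {i: t for i, t in docs.items() if i in acl.get(user, set())}
    # Retrieval runs inside that scope; "hr-1" is never read, scored, or cached.
    return [i for i, t in authorised.items() if query.lower() in t.lower()]

results = pre_retrieval_answer("salary", "alice", DOCS, ACL)
# results == ["eng-1"]; "hr-1" was never part of the search at all.
```

Note there is nothing to filter out at the end: the unauthorised document was excluded before any matching took place.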
This is the approach used by Other Me's patent-pending SCRS (Secure Context Retrieval System, UK Patent Application No. 2602911.6). The Dual-Gate architecture enforces this in two stages:
- Gate 1 — Block Before Search: Before the AI begins any retrieval, the system scopes the search to only include data sources the user is permitted to access. Unauthorised data is excluded from the search entirely.
- Gate 2 — Verify Before Showing: After retrieval, each result is verified against the user's permissions before being included in the AI's response. This second gate catches edge cases and provides defence in depth.
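Under the same toy assumptions (hypothetical names, not the SCRS codebase), the two gates might be arranged like this:

```python
DOCS = {
    "hr-1": "Salary review policy for all staff",
    "eng-1": "Salary bands for the engineering team",
}
SCOPE_ACL = {"alice": {"eng-1"}}  # consulted by Gate 1 to scope the search

def verify(user, doc_id):
    """Gate 2: an independent per-result permission check."""
    return doc_id in SCOPE_ACL.get(user, set())

def dual_gate_answer(query, user, docs, acl):
    # Gate 1: block before search. Unauthorised sources never enter the index.
    scope = {i: t for i, t in docs.items() if i in acl.get(user, set())}
    hits = [i for i, t in scope.items() if query.lower() in t.lower()]
    # Gate 2: verify before showing. Each hit is re-checked independently.
    return [i for i in hits if verify(user, i)]

print(dual_gate_answer("salary", "alice", DOCS, SCOPE_ACL))  # ['eng-1']
```

In a real system the two gates would consult the permission source through independent paths, so that a defect in one check does not silently disable the other.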
The key difference: Post-retrieval filtering removes data after the AI has already seen it. Pre-retrieval enforcement prevents the AI from seeing it at all.
The Building Analogy
Imagine your organisation's data as a large office building with many rooms. Each room contains different types of sensitive information. Different employees should have access to different rooms.
Post-Retrieval: The Guard Who Checks Afterwards
Post-retrieval filtering is like giving every employee a master key that opens every room in the building. They walk through every room, look at everything inside, and collect anything that seems relevant to their task. On their way out, a security guard at the exit checks their badge and takes away any documents they should not have.
The employee only leaves with the right documents. But they have already walked through every room. They have already seen what is inside. The guard cannot erase what they saw.
Pre-Retrieval: The Building Where Doors Do Not Exist
Pre-retrieval enforcement is like a building where the corridors themselves change based on who walks in. When you enter, you only see the doors to rooms you are authorised to access. The other rooms are not locked — they simply are not on your map. You cannot walk into a room you do not know exists. You cannot see documents in a room you cannot reach.
There is no guard needed at the exit because you never had access to the wrong rooms in the first place.
Why the Distinction Matters
If both approaches end up showing the same results to the user, why does the distinction matter? There are four critical reasons.
1. Data Has Already Been Processed
In a post-retrieval system, the AI model has already processed unauthorised data as part of generating its response. Even if the final output is filtered, the model's internal state has been influenced by data the user should not have access to. This can lead to subtle information leakage — the AI's response may be shaped by data it should never have seen.
2. Caching Creates Persistent Risk
AI systems use caching to improve performance. When the AI retrieves and processes unauthorised data, that data may be cached in memory, in search indices, or in intermediate processing layers. Post-retrieval filtering removes the data from the final output, but it may persist in caches that could be accessed by other processes or users.
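One way this risk shows up can be sketched with Python's standard `functools.lru_cache`, used here as an illustrative stand-in for a real retrieval cache:

```python
from functools import lru_cache

DOCS = {"hr-1": "Salary review policy", "eng-1": "Salary bands for engineering"}

@lru_cache(maxsize=128)
def retrieve(query):
    # Cached by query alone: whoever asked, and whatever an output filter
    # later removes, the full retrieved set persists in process memory.
    return tuple(i for i, t in DOCS.items() if query.lower() in t.lower())

retrieve("salary")                   # retrieval populates the cache
assert "hr-1" in retrieve("salary")  # the unauthorised doc is still cached
```

A post-retrieval filter applied to the return value would not empty this cache. A pre-retrieval system never places "hr-1" into it on behalf of an unauthorised user in the first place.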
3. No Proof of Non-Access
With post-retrieval filtering, you cannot prove to a regulator or auditor that unauthorised data was not accessed. You can only show that it was filtered from the output. The AI did access it — it searched it, retrieved it, and processed it. The filter simply prevented it from appearing in the response.
With pre-retrieval enforcement, you can demonstrate that the AI's search space was restricted before retrieval began. The audit trail shows that unauthorised data was never queried, never retrieved, and never processed. This is a materially different compliance position.
4. Filter Failures Are Invisible
Every filter can fail. Post-retrieval filters rely on correct permission mapping, accurate metadata, and proper implementation at the filtering layer. If the filter misses a document — due to a misconfigured permission, a metadata error, or an edge case — the unauthorised data flows straight to the user. And because the data was already retrieved, there is no second chance.
Pre-retrieval enforcement provides defence in depth. Gate 1 restricts the search. Gate 2 verifies the results. Both must fail for unauthorised data to reach the user. And because the data was never in the AI's processing context, a Gate 2 verification failure does not mean the AI has already been influenced by unauthorised information.
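A toy illustration of these failure modes (all names hypothetical): the same permission-map bug leaks data through a single post-retrieval filter, but not through two gates that check independently.

```python
DOCS = {"hr-1": "Salary review policy", "eng-1": "Salary bands for engineering"}
CORRECT_ACL = {"alice": {"eng-1"}}
BUGGY_ACL = {"alice": {"eng-1", "hr-1"}}  # misconfiguration grants too much

def search(docs, query):
    return [i for i, t in docs.items() if query.lower() in t.lower()]

def post_filter(query, user, acl):
    # Single filter after retrieval: the buggy map is the only line of defence.
    return [i for i in search(DOCS, query) if i in acl.get(user, set())]

def dual_gate(query, user, gate1_acl, gate2_acl):
    # Gate 1 scopes the search; Gate 2 re-checks each hit.
    scoped = {i: t for i, t in DOCS.items() if i in gate1_acl.get(user, set())}
    return [i for i in search(scoped, query) if i in gate2_acl.get(user, set())]

leak = post_filter("salary", "alice", BUGGY_ACL)             # contains "hr-1"
safe = dual_gate("salary", "alice", CORRECT_ACL, BUGGY_ACL)  # Gate 1 still blocks
```

With a single filter, one wrong entry in the permission map is a breach. With two independent gates, the same bug in either gate alone changes nothing the user sees.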
The fundamental question is not "can the user see unauthorised data?" It is "did the AI process unauthorised data?" Post-retrieval filtering answers only the first question. Pre-retrieval enforcement answers both.
Real-World Implications
Consider these practical scenarios:
A law firm with multiple clients: A solicitor asks the AI to summarise the legal position on a contract dispute. In a post-retrieval system, the AI may search across all client files — including confidential documents from opposing parties — and filter them out afterwards. In a pre-retrieval system, the AI only searches within the authorised client's files. The opposing party's data is never touched.
A bank with Chinese walls: An analyst in the advisory division asks the AI about a company. In a post-retrieval system, the AI may process price-sensitive information from the trading division before filtering it. In a pre-retrieval system, the advisory division's AI cannot reach the trading division's data at all.
An insurance firm with customer data: A claims handler asks the AI to review a claim. In a post-retrieval system, the AI may access medical records from unrelated claims during retrieval. In a pre-retrieval system, only the relevant claim's data is within the AI's search scope.
The Industry Landscape
Today, the vast majority of enterprise AI platforms use post-retrieval filtering. This includes most major AI assistants, most RAG frameworks, and most enterprise search tools that have added AI capabilities. Post-retrieval filtering is easier to build. It works with existing search infrastructure. It does not require rethinking the data architecture.
Pre-retrieval enforcement is harder to implement. It requires the security model to be integrated at the search layer, not bolted on after retrieval. It requires knowing user permissions before the search begins, not after it ends. It is a fundamentally different architecture.
This is why Other Me built SCRS from the ground up as a pre-retrieval system. Retrofitting pre-retrieval enforcement onto a post-retrieval architecture is extremely difficult. The security model must be native to the retrieval process, not added as a layer on top.
Where the Industry Is Heading
As regulators increase scrutiny of AI data handling — particularly in sectors like financial services and legal — the distinction between post-retrieval filtering and pre-retrieval enforcement will become a defining factor in platform selection.
Organisations that must demonstrate data minimisation under UK GDPR, maintain Chinese walls under FCA rules, or prove client confidentiality under SRA regulations will increasingly need to show not just that users did not see unauthorised data, but that the AI did not process it.
Post-retrieval filtering will remain adequate for many use cases — internal knowledge bases with low sensitivity, general corporate information, public-facing content. But for any environment where data access is a compliance requirement, pre-retrieval enforcement is the stronger architecture.
The bottom line: Post-retrieval filtering protects the user from seeing unauthorised data. Pre-retrieval enforcement protects the data from being accessed at all. In regulated industries, that distinction is the difference between "we filtered it" and "it was never touched."
Other Me's SCRS, with its Dual-Gate architecture, represents the pre-retrieval approach. It is built by Pop Hasta Labs Ltd (UK Companies House 16742039), with patent-pending protection under UK Patent Application No. 2602911.6. If your organisation handles sensitive data and needs to demonstrate that AI never accesses what it should not, the architecture matters more than the marketing.
Choose accordingly.