Anyone can find almost anything about almost anyone online. The harder question is whether what they found will hold up. A screenshot is not proof. A profile match is not an identification. A video is not authenticated because it is public. The distance between "found online" and "stands up to scrutiny" is the evidence standard problem, and most open-source work never crosses it.
The distinction that gets skipped
Open-source intelligence and evidence are not the same thing, and treating them as if they were is where investigations fail. Intelligence informs a decision. Evidence survives someone trying to discredit it. Raw open-source material is not automatically reliable, authenticated, or defensible simply because it is public.
A finding becomes evidence not because of where it came from, but because of what was done to it after it was found: how it was documented, preserved, verified, and handled. Skip that work and you do not have a fact. You have a lead.
The moment open source became evidence
On 15 August 2017 the International Criminal Court issued its first arrest warrant based largely on open-source evidence. The case against Mahmoud al-Werfalli, a commander in Libya, rested in large part on seven videos posted to social media showing executions near Benghazi between 2016 and 2017.
What made those videos evidence was not that they existed online. It was the work done to them. Investigators geolocated the scenes, established chronology, cross-referenced uniforms and insignia, and preserved the material before it could be deleted. The same footage, screenshotted into a report without that work, would have proven nothing. The case is often cited as the start of "open-source justice," but its real lesson is narrower and more useful: open-source content can meet an evidentiary standard, and meeting it is a discipline, not a download.
The standard has a name: the Berkeley Protocol
In 2020, UC Berkeley's Human Rights Center and the UN Office of the High Commissioner for Human Rights published the Berkeley Protocol on Digital Open Source Investigations (updated in 2022) — the first global standard for this kind of work. It sets minimum professional standards across five processes: identification, collection, preservation, verification, and analysis of digital open-source information.
It exists for a simple reason. The same investigation can be done rigorously or carelessly, and only one version holds up. The Protocol was written for war-crimes and human-rights investigators, but its logic applies to any investigation whose findings someone will act on. Admissibility rules still differ by jurisdiction — material gathered across several countries must satisfy whichever court hears the case, a problem we examine in how OSINT tracks transnational smuggling networks — but the Protocol gives courts a common baseline to evaluate against. The standard is what separates a defensible conclusion from a confident guess.
What raises a finding from intelligence to evidence
The work the Protocol describes is concrete. A handful of disciplines do most of it.
Provenance. Where the information was found, when, by whom, and how — recorded at the moment of collection, not reconstructed afterwards. A finding with no documented origin cannot be trusted by the person relying on it or challenged by the person it concerns.
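As a minimal sketch of recording provenance at the moment of collection, the Python below stamps each item when it is captured. The `ProvenanceRecord` schema, its field names, and the example URL and analyst ID are all assumptions made for illustration; the Berkeley Protocol does not prescribe this structure.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class ProvenanceRecord:
    """Where, when, by whom, and how an item was collected (hypothetical schema)."""
    source_url: str
    collected_at: str   # ISO 8601 timestamp in UTC
    collected_by: str
    method: str         # e.g. "browser capture", "platform export"


def record_provenance(source_url: str, collected_by: str, method: str) -> ProvenanceRecord:
    # The timestamp is taken when the item is collected, not filled in later.
    now = datetime.now(timezone.utc).isoformat()
    return ProvenanceRecord(source_url, now, collected_by, method)


# Illustrative values only.
record = record_provenance("https://example.com/post/123", "analyst-01", "browser capture")
print(json.dumps(asdict(record), indent=2))
```

The record is frozen so that it cannot be quietly edited after the fact; a correction would be a new record, not a change to the old one.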
Preservation. Public content disappears. Accounts are deleted, posts are edited, pages change. Sound practice captures material as it appeared, increasingly with two tools: cryptographic hashing, which fingerprints a file so that any later alteration is detectable, and web-archiving formats such as WARC, which preserve a page's full state so it can be replayed as captured. A screenshot shows what you say you saw. A hashed, archived capture can be checked independently of your word.
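To make the fingerprinting idea concrete, here is a short sketch using SHA-256 from Python's standard `hashlib` module. The captured bytes are a stand-in for a real file, and the variable names are ours.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the captured bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the bytes of a captured file (hypothetical content).
captured = b"post text as captured on collection"
digest_at_capture = sha256_hex(captured)

# Later: re-hash the stored copy and compare it with the recorded digest.
unchanged = sha256_hex(captured) == digest_at_capture
tampered = sha256_hex(captured + b" ") == digest_at_capture

print(unchanged)  # True
print(tampered)   # False
```

A matching digest shows only that the file has not changed since it was hashed. It says nothing about whether the content was genuine when it was captured, which is a question for verification.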
Chain of custody. An unbroken, documented record of who handled the material and how, from collection to final report. Forensic standards such as the ACPO guidelines and NIST's digital-evidence guidance exist precisely so that handling does not become the weak point an opponent attacks.
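One way to make a custody record tamper-evident is to chain its entries by hash, so that editing an earlier entry breaks every check after it. The sketch below is an illustration under our own assumptions (the field names, handlers, and timestamps are hypothetical); it is not a procedure taken from the ACPO or NIST guidance.

```python
import hashlib
import json


def entry_hash(entry: dict) -> str:
    # Canonical JSON (sorted keys) so the same entry always hashes the same way.
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, handler: str, action: str, timestamp: str) -> None:
    # Each entry stores the hash of the previous one, linking the log together.
    prev = log[-1]["hash"] if log else ""
    body = {"handler": handler, "action": action, "timestamp": timestamp, "prev": prev}
    log.append({**body, "hash": entry_hash(body)})


def verify_log(log: list) -> bool:
    prev = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev"] != prev or entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, "analyst-01", "collected", "2024-01-01T10:00:00Z")
append_entry(log, "analyst-02", "reviewed", "2024-01-02T09:30:00Z")
print(verify_log(log))  # True

log[0]["handler"] = "someone-else"  # simulate a retroactive edit
print(verify_log(log))  # False
```

A log like this detects edits; it does not prevent them. In practice the log itself would need to be stored somewhere the handlers cannot rewrite wholesale.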
Verification. Corroboration across independent sources, geolocation, chronolocation: the same techniques that turned the al-Werfalli videos into evidence. One source is a claim. Several that agree, independently confirmed, are a finding. This is also where the discipline of verifying information in an age of synthetic media does its hardest work.
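The word "independent" carries most of the weight here: ten accounts reposting one upload are still one source. A rough sketch of that counting rule follows, assuming each report has already been traced to an `origin` by an analyst. The report fields, the sample data, and the threshold of two origins are illustrative assumptions, not a standard.

```python
from collections import defaultdict


def independent_origins(reports: list) -> dict:
    """Map each claim to the set of distinct origins that support it."""
    origins = defaultdict(set)
    for report in reports:
        origins[report["claim"]].add(report["origin"])
    return dict(origins)


def is_corroborated(reports: list, claim: str, minimum: int = 2) -> bool:
    # Reposts share an origin, so they collapse into a single supporting source.
    return len(independent_origins(reports).get(claim, set())) >= minimum


# Hypothetical reports for illustration.
reports = [
    {"claim": "video filmed at site A", "source": "account-1", "origin": "uploader"},
    {"claim": "video filmed at site A", "source": "account-2", "origin": "uploader"},  # repost
    {"claim": "video filmed at site A", "source": "satellite imagery", "origin": "imagery"},
    {"claim": "video filmed on 1 June", "source": "account-1", "origin": "uploader"},
]

print(is_corroborated(reports, "video filmed at site A"))  # True
print(is_corroborated(reports, "video filmed on 1 June"))  # False
```

The code only counts; deciding whether two sources really are independent remains a judgment the analyst has to make and document.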
Separating observation from inference. Stating what was actually seen apart from what it suggests. "This account posted this photograph" is an observation. "This person was in this place" is an inference, and it has to be labelled as one. Most overreach in open-source work is an inference quietly promoted to a fact.
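One way to keep the two apart in a working file is to force every statement to carry its label, and to refuse an inference that cites no observation. This is a sketch of that idea only; the `Statement` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    OBSERVATION = "observation"
    INFERENCE = "inference"


@dataclass(frozen=True)
class Statement:
    text: str
    kind: Kind
    basis: tuple = ()  # for inferences: the observations they rest on

    def __post_init__(self):
        # An inference with nothing underneath it is rejected outright.
        if self.kind is Kind.INFERENCE and not self.basis:
            raise ValueError("an inference must cite at least one observation")


def render(statement: Statement) -> str:
    # The label is always printed so an inference cannot read as a fact.
    return f"[{statement.kind.value.upper()}] {statement.text}"


seen = Statement("This account posted this photograph.", Kind.OBSERVATION)
suggested = Statement("This person was in this place.", Kind.INFERENCE, basis=(seen,))

print(render(seen))       # [OBSERVATION] This account posted this photograph.
print(render(suggested))  # [INFERENCE] This person was in this place.
```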
None of this is exotic. All of it is skippable, which is why so much open-source output is intelligence wearing the costume of evidence.
The other half of the standard: legitimate collection
A finding can be technically flawless and still unusable. Provenance, preservation, and chain of custody establish that a finding is what it claims to be. They say nothing about whether collecting it was legitimate, and that is the second half of the standard. Material gathered through disproportionate intrusion can be authentic, well-preserved, and still excluded — or worse, taint the investigation around it. The Berkeley Protocol builds this in: alongside preservation and verification it requires proportionality, privacy, and a do-no-harm posture.
The working test is plain. If the subject, a court, and a regulator could all see exactly what you did, why, and what you intend to do with the result, would it survive their scrutiny? If the honest answer is no, the technical quality of the finding is beside the point. Where that line actually falls — public figures exercising public power versus private individuals, a scoped search versus behavioural surveillance — is its own subject, which we map in when investigation becomes surveillance.
Why it matters when you are not going to court
Most investigations never reach a courtroom, and that is exactly where the standard tends to get dropped. A report on an executive's exposure, a due-diligence file, an assessment of someone threatening a client — none of these will be argued before the ICC. But each one will be acted on. A board will make a decision. A lawyer will advise. An insurer will price a risk. A family will change how it lives.
The moment a finding informs a real decision, it is being treated as evidence, whether or not anyone uses the word. If it cannot be traced, preserved, or verified, it is a liability dressed as insight. We do not produce courtroom evidence, and none of this is legal advice — admissibility is a question for courts and counsel. But the discipline behind admissible evidence is the same discipline that makes any finding trustworthy, and applying it to private work is the difference between a professional report and a folder of screenshots.
Where findings collapse
The failure points are predictable. The screenshot with no source, date, or capture method behind it. The synthetic image or cloned voice that was never authenticated and now cannot be. The crucial post that was seen but not preserved and is now gone. The confident identification built on a name and a city that match thousands of people. The inference — "they must have known" — written as if it were established. Each is the same error: treating the discovery of information as the end of the work rather than the start of it.
The standard is the product
The evidence standard problem is not about where information lives. It is about the discipline applied in the space between finding something and relying on it. Anyone can find. The standard is what happens next: documenting, preserving, verifying, and stating plainly what is known and what is only supposed.
For an investigation that someone is going to act on, that discipline is not an extra layer. It is the thing being bought.
A finding is only as good as the discipline behind it. Every investigation we run is built to be traced, preserved, and verified, so that what we hand you holds up when it is questioned.
Sources
- Berkeley Protocol on Digital Open Source Investigations. UC Berkeley Human Rights Center & UN OHCHR (launched 2020, updated 2022). Five processes: identification, collection, preservation, verification, analysis.
- International Criminal Court, Prosecutor v. Al-Werfalli, arrest warrant of 15 August 2017: the first ICC warrant based largely on open-source (social-media) evidence. See also Bellingcat, "Geolocating Libya's Social Media Executioner" (2017) and Harvard Human Rights Journal, "Open Source Evidence and the International Criminal Court" (2019).
- Chain-of-custody and digital-evidence handling: ACPO Good Practice Guide for Digital Evidence; NIST SP 800-series digital-evidence guidance. Tamper-evident capture via cryptographic hashing and WARC web-archiving.