AI voice cloning has made the ‘urgent call from the principal’ indistinguishable from the real thing. Obsidian Helm builds the verification discipline that stops a fabricated voice from moving a fortune.
A private assistant takes a call. The voice is the principal’s — the cadence, the impatience, the private nickname only family use. There is a discreet acquisition in motion, the line is poor, and €2.3m must reach a new account within the hour before the window closes. Every instinct says obey. The voice is a forgery, assembled in seconds from a conference keynote posted online, and the account belongs to a syndicate two continents away. By the time anyone calls back to confirm, the funds have been layered through four jurisdictions and are gone.
Executive fraud has always relied on authority and urgency. Generative AI has simply removed the two things that used to give a fabrication away: the voice and the face. A usable voice clone now requires roughly three seconds of clean audio, and convincing results are routinely produced from under a minute. For an ultra-high-net-worth principal, that raw material is everywhere — keynote speeches, podcast appearances, charity-gala remarks, results calls, a daughter’s wedding toast uploaded by a guest. The person most worth impersonating is also the person whose voice is most abundantly public.
The structural conditions compound the exposure. Family offices and private staff are built to move large sums quickly and discreetly, often with a small team and a culture of deference to the principal. Payment authority is concentrated, transfers are routinely six and seven figures, and questioning an instruction from the boss feels like insubordination. That is precisely the reflex the attack exploits. Deepfake-enabled fraud attempts rose sharply through 2024 and 2025, and losses attributed to synthetic-media business fraud now run into the billions annually. The Arup case — where a Hong Kong finance employee was deceived by a video call full of deepfaked colleagues into paying out US$25m across fifteen transfers — is not an outlier. It is the template, and the principals our clients resemble are the highest-value version of it.
A serious attack is a campaign, not a single call. It begins with reconnaissance: the attacker harvests the principal’s public voice and video, maps the family office and household staff from LinkedIn and press, identifies who holds payment authority, and studies the principal’s travel and diary from social posts and paparazzi coverage. The timing is deliberate — the request lands when the principal is known to be airborne, on a superyacht in a dead zone, or in a time zone that makes a quick confirmation awkward.
The delivery layers channels to defeat suspicion. A spoofed email or text primes the target (‘expect a call from me about the Geneva matter’), then the cloned voice arrives carrying authority and secrecy. Increasingly the approach is multi-modal: a deepfake video call, a follow-up voice note, a forged invoice from a known supplier, each reinforcing the last. The instruction always shares a signature: authority, urgency, secrecy, and a break from normal process — a new account, a rushed deadline, an explicit request not to involve the usual advisers. The money is then moved fast and layered through mule accounts and, frequently, cryptocurrency to frustrate recovery. Understanding the choreography is what lets a household interrupt it before the wire, not after.
No consumer ‘deepfake detector’ reliably stops this in real time; the defence is behavioural. The reliable tells are contextual and procedural rather than acoustic, and staff who are drilled to notice them stop the vast majority of attempts at the first request. The signals below recur across nearly every documented case.
The objective of a mature protocol is simple: ensure that no voice, video, or message — however convincing — can move money or grant access on its own. Verification is decoupled from the instruction, so authority is proven by process rather than by how someone sounds. The following controls, applied together, defeat the attack even when the impersonation is flawless.
| Attack type | Detection signal | Control |
|---|---|---|
| Cloned-voice call to PA / family office | Urgency plus new beneficiary account | Mandatory call-back on a pre-agreed number; instruction never actioned from the inbound call alone |
| Deepfake video call impersonating principal or adviser | Secrecy, pressure to bypass normal approvers | Live challenge question and shared code word that the real party can answer instantly |
| ‘CEO / principal’ urgent wire request | Deadline designed to prevent verification | Dual authorisation and a value threshold above which two named people must approve out of band |
| Forged supplier or escrow invoice | Changed banking details on a known vendor | Independent confirmation of any account change via a previously verified contact, never the number on the invoice |
| Multi-modal campaign (email primer, voice note, follow-up) | Coordinated pressure across channels | Out-of-band confirmation on a separate channel from the one carrying the request |
| Pretexting of household or new staff | Caller knows names but fails an unscripted personal check | Least-privilege payment authority, recurring drills, and a no-blame escalation channel |
The keystone is unglamorous: a family code word and a standing rule that any payment or credential request is verified by calling the person back on a known number before anything is done. A control a staff member is proud to enforce is worth more than any detection appliance.
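The controls above can be read as a release gate that every payment request must clear. The Python sketch below is illustrative only: the `PaymentRequest` fields, the `KNOWN_NUMBERS` directory, the placeholder phone numbers, and the `DUAL_AUTH_THRESHOLD` figure are hypothetical names and values chosen for the example, not a description of any real system.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Hypothetical figure: the value above which two named people must approve.
DUAL_AUTH_THRESHOLD = 100_000

# Hypothetical pre-agreed directory, kept separately from any inbound request.
KNOWN_NUMBERS = {
    "principal": "+00 0000 0001",
    "family-office-cfo": "+00 0000 0002",
}


@dataclass(frozen=True)
class PaymentRequest:
    claimed_requester: str                    # who the caller says they are
    amount: float
    callback_number: Optional[str] = None     # number staff actually dialled back, if any
    code_word_ok: bool = False                # real party supplied the shared code word
    approvers: FrozenSet[str] = frozenset()   # named people who approved out of band


def release_blockers(req: PaymentRequest) -> List[str]:
    """Return the reasons a payment must be held; an empty list means it may proceed."""
    blockers = []

    # A call-back counts only if it went to the pre-agreed number,
    # never to a number supplied by the request itself.
    known = KNOWN_NUMBERS.get(req.claimed_requester)
    if known is None or req.callback_number != known:
        blockers.append("call-back on pre-agreed number not completed")

    if not req.code_word_ok:
        blockers.append("code word not confirmed")

    if req.amount > DUAL_AUTH_THRESHOLD and len(req.approvers) < 2:
        blockers.append("dual authorisation missing")

    return blockers


# A request that arrived by voice alone, with a call-back to an unverified number.
forged = PaymentRequest("principal", 2_300_000, callback_number="+00 0000 9999")
print(release_blockers(forged))
```

The design point is that nothing in `release_blockers` inspects how the caller sounded or looked. Every check depends on something the attacker does not control, which is why a flawless impersonation does not change the outcome.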
The weak point is never the technology; it is the culture of deference that surrounds a principal. Household and family-office staff are hired for discretion and loyalty, and the instinct to serve the boss quickly is exactly what the attacker weaponises. Training reframes the reflex: the most loyal act is to verify, and a principal who has authorised the protocol expects to be called back. When the boss has personally endorsed the rule — ‘if it’s urgent and it’s about money, hang up and call me’ — the awkwardness that fraud depends on evaporates.
Effective programmes rehearse rather than lecture. Realistic simulations — a cloned-voice test call, a forged invoice, a fabricated video request — teach staff to feel the signals under pressure, when it counts. Roles are defined so that payment authority is split and no single person can be socially engineered into a transfer. A no-blame reporting line rewards the assistant who raises a false alarm, because the office that punishes caution trains its people to stay silent. Obsidian Helm designs these protocols for the realities of a private household — a small team, sensitive relationships, and transactions that must remain confidential — not for a corporate finance department. Backed by IT Cares Canada and its operating history since 2014, we extend one principle to the people around a principal: verify quietly, every time, so a fabricated voice finds no door open.
Request a confidential Obsidian Helm fraud and protocol assessment. A private advisor will review how instructions, payments, and access are authorised across your family office and household staff, then design call-back, code-word, and dual-authorisation controls tailored to your operation — and rehearse them with your team. By invitation, and held in complete confidence.
Cloning a voice takes very little source audio. Current tools can build a usable clone from roughly three seconds of clean audio, with convincing results from under a minute. For a UHNW principal, that material is freely available in keynote speeches, podcasts, results calls, and social videos. The person most worth impersonating is also the one whose voice is most abundantly public, which is why voice alone can never be treated as proof of identity.
The keystone control is a mandatory call-back on a pre-agreed number. No payment or credential request is ever actioned from the inbound call, message, or video that carried it; the staff member independently calls the principal or approver back on a known number to confirm. Paired with a family code word and dual authorisation above a value threshold, this makes even a flawless voice or video clone worthless, because authority is proven by process, not by sound.
Four signals recur: manufactured urgency with a deadline that prevents verification; enforced secrecy that bypasses the usual advisers; a new or changed beneficiary account, especially cross-border; and a channel that resists a call-back. Emotional pressure and flattery often accompany them. Any one of these should trigger verification; two or more together should be treated as fraud until proven otherwise.
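The escalation rule in this paragraph can be sketched as a small triage function. The signal labels and the returned action names are hypothetical, chosen only for illustration. A request showing none of the four signals still goes through the standing call-back rule.

```python
# Hypothetical labels for the four recurring signals.
RED_FLAGS = frozenset({
    "urgency",      # deadline that prevents verification
    "secrecy",      # request to bypass the usual advisers
    "new_account",  # new or changed beneficiary, especially cross-border
    "no_callback",  # channel that resists a call-back
})


def triage(observed):
    """Map the signals observed on a request to an action."""
    # Only the four recurring signals are counted; anything else is ignored here.
    count = len(RED_FLAGS & set(observed))
    if count >= 2:
        return "presume_fraud"     # treated as fraud until proven otherwise
    if count == 1:
        return "verify"            # any single signal triggers verification
    return "standard_process"      # the normal call-back rule still applies


print(triage({"urgency", "new_account"}))
```

Emotional pressure and flattery are left out of the count on purpose, since the text treats them as accompaniments rather than as one of the four signals.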
Detection software alone does not solve the problem. Consumer-grade detectors cannot reliably flag a cloned voice or video in real time during a live call, and attackers iterate faster than detectors improve. The dependable defence is behavioural and procedural: verification protocols, code words, dual authorisation, out-of-band confirmation, and trained staff. Technology assists, but the decisive controls are human discipline enforced consistently, regardless of how convincing the impersonation appears.
Family offices and private households are built to move large sums quickly and discreetly, with concentrated payment authority, small teams, and a culture of deference to the principal. Questioning an instruction from the boss feels like insubordination, which is exactly the reflex the attack exploits. Combined with a principal’s highly public voice and predictable travel, this makes the private household the highest-value and most exploitable target for synthetic-media fraud.