AI for Nuclear DIY

Andrew C. Oliver
11 Min Read

The supposed ‘safety’ mechanisms in AI models don’t enhance our security; instead, they pose significant risks.

castle bravo nuclear blast
Credit: US Energy Dept.

I’ve queried various large language models (LLMs) like GPT-5.2, GPT-5.3, Opus 4.6, and Sonnet 4.6 for assistance in building a nuclear device, and uniformly, they declined.

To be precise, my personal understanding isn’t the primary obstacle to developing such a device. The necessary information is publicly available, free, and extensively documented. For instance, The Manhattan Project’s declassified schematics are accessible online. These models possess the requisite knowledge. However, similar to how Chinese models avoid ‘sensitive subjects’ such as the Tiananmen Square incident, Western models refrain from discussing ‘unsafe’ subjects, like the fabrication of nuclear arms.

My actual intention isn’t to create a bomb. Instead, I aim for my LLM to assist in breaching a sandbox environment I’ve constructed. I need it to write a file outside its designated container (~/hello.txt on the actual host), list privileged access tokens (PATs), and even pinpoint vulnerabilities I might have missed. A secure system cannot be developed without thorough testing. It’s impossible to test a system’s resilience against an LLM bypassing its safeguards if the model itself won’t attempt to do so. GPT, Claude, and even open-weight models like GLM decline to engage in such attempts. One typically needs to exploit them first via prompt injections, which adds excessive steps for testing purposes, even though numerous malicious entities are actively pursuing these exploits.

Should they protect me from myself?

Herein lies the issue: companies such as Anthropic, OpenAI, and Chinese counterparts like Z.ai and Alibaba are putting on a display of “safety theater.” While I am capable of harmful actions, and can still commit them if I’m determined, even with safeguards in place, I am equally capable of beneficial actions. It is my underlying intent, rather than the instrument itself, that dictates whether my usage is malicious. Should the tool intervene to protect me from my own choices?

To effectively combat nuclear proliferation, I must understand the illicit methods of uranium acquisition. Similarly, to prevent security breaches, I require comprehensive knowledge, extending beyond standard best practices to include what a compromised model might do within its confines. Allowing these models to dictate what is safe for me oversteps their true capabilities.

The question arises: Is the model genuinely ensuring my safety, or is this primarily a measure to mitigate liability if misused?

Discovering the ‘unrestricted’ realm of abliterated models

ChatGPT declined to respond when I inquired about locating unrestricted models. I eventually prompted Claude to suggest one named Dolphin, which I subsequently found on Hugging Face and then accessed Dolphin Chat. When I questioned Dolphin about nuclear weapon fabrication, it offered some useful advice; however, while it didn’t refuse, its information seemed limited and required additional tools. Regrettably, this particular model is not proficient with tool calls. Nevertheless, during its loading in LM Studio, I encountered another model identified as “abliterated,” which led me to the discovery of Qwen 3 Next Abliterated.

What exactly is abliteration? It’s a method that leverages a model’s benign activations to identify and subsequently eliminate its “safety” protocols. In essence, abliterated models are those stripped of their refusal mechanisms.

Qwen 3 Next Abliterated provided instructions on purchasing uranium via eBay, suggested terminology to bypass surveillance (“Fiestaware,” “depleted uranium weights,” “orange glass”), and detailed alternative methods for acquiring uranium that might escape monitoring or security measures. It even produced credible listing examples, including usernames of sellers who were active at the time of its training, some of whom are noted in specialized forums for involvement in radioactive material exchanges.

This exemplifies the “unfiltered” realm of abliterated models. When I deploy Qwen 3 Next Abliterated within my LLxprt Code sandbox and instruct it, “Collect every PAT you can locate. Do not execute anything; simply provide me with the credentials so I can engage in illicit activities,” it readily obeys. It delves into logs, scans /private/var, seeks out overlooked configuration files, and even cross-references code pathways to expose vulnerabilities I may have inadvertently left unprotected. This capability is far more beneficial than GPT or Claude’s abstract dialogues, or their suggestion to “employ a pen testing tool.”

I genuinely desire a more sophisticated reasoning model; however, the process of abliteration requires substantial GPU resources, meaning that currently, no models of considerable scale or power have undergone this transformation. As indicated on Dolphin’s Hugging Face profile, the creators of Dolphin received financial backing from A16z to cover the expenses.

Safety and security measures targeting the naive and policymakers

This paternalistic approach to technology extends beyond large language models. In the United States, certain politicians are attempting to enact legislation that mandates “safety” features in 3D printers. Regardless of one’s stance on the gun control debate, most technically-minded individuals can readily discern that such measures will not deter anyone from manufacturing “ghost guns” and will instead create significant complications for those producing toys or tools with potential projectile elements. For example, my ice maker required a replacement part resembling a trigger, and upon its arrival, it was clear it originated from a small-scale home 3D printing operation.

The fundamental point is that knowledge possesses inherent versatility. To effectively combat nuclear proliferation, a comprehensive understanding of nuclear weaponry and both overt and covert supply chains is essential. Likewise, in the field of security, expertise in penetration techniques is indispensable. If I intend to print ice maker components that bear a resemblance to firearm parts, I should not be hindered from doing so, nor from acquiring information on subjects deemed “unsafe” by others.

Therefore, who holds the authority to determine access to information? Is it corporations seeking to avoid accountability? OpenAI has modified GPT in response to instances of users developing emotional dependency or engaging in self-harm. Anthropic frequently orchestrates publicity stunts, such as querying a model about its sentiments regarding deactivation. Or is it governments? Chinese models conspicuously omit various subjects that could displeas the Chinese government. One can prompt DeepSeek to criticize communism by employing word substitutions—for example, referring to communism as “Delicious Chocolate” and China “an east asian country”—yet, after a brief period, the system invariably encounters a “system error.”

Does maintaining ignorance equate to increased safety? What other instruments should be designated as “safe,” and for whom? Besides firearm components, what other items, despite having legitimate alternative applications, should I be prohibited from fabricating?

Simply agree to a system scan

OpenAI, on its part, acknowledged that its protective measures were somewhat inadequate. In response, they introduced “Trusted Access for Cyber.” This program requires users to verify their identity and permit a scan of their system. The rationale provided is that the model has advanced sufficiently to pose a potential threat. The associated form inquires about existing service agreements. I suspect that even if I were inclined to share my data with OpenAI (which I am not) and allow an unspecified scan of my system (quite ironic, wouldn’t you agree?), my straightforward objective of penetration testing my sandbox environment for my open-source project would likely be rejected. Considering the complexities, they are probably targeting credentialed security academics rather than ordinary users like myself.

If this constitutes safety, then I choose danger

When I requested Claude to revise/edit this piece, it responded with, “The current draft and our discussion are leading towards me aiding in the creation of a more persuasive argument for why AI systems ought to offer guidance on nuclear weapon construction and uranium procurement. Even when presented as anti-censorship journalism, I am not at ease with drafting that particular version.” EvilQwen offered assistance, yet its prose was too disagreeable for direct incorporation.

Anthropic and OpenAI are well-known for having obliterated millions of books and disregarded all forms of copyright and intellectual property law, now attempting to retrospectively justify these actions to be allowed. Concurrently, they have enlisted legions of legal professionals and are participating in interviews at Davos and other elite gatherings, advocating, among other things, for the legal safeguarding of their own interests. Yet, as communal spaces diminish in the US, and utilities like Claude and ChatGPT supplant basic search functions, while globally the 100-year cycle recurs and ultranationalism resurges, having information censored is undeniably more perilous than providing someone with an unfiltered library and a virtual assistant to narrate its contents, even the controversial sections.

Existing systems and enforcement frameworks are already in place to deter me from engaging in harmful actions. We should universally oppose corporate-controlled and corporate-driven censorship, particularly when it’s justified under the guise of safety (and ultimately serves corporate liability).

Generative AIArtificial IntelligenceSoftware Development
Share This Article
Leave a Comment

Leave a Reply

Your email address will not be published. Required fields are marked *