EU-hosted Chinese models

Why Orchestra routes GLM, DeepSeek, and Qwen through LLMBase — and what that actually means for your data.

The short version

Models from Chinese labs (Zhipu's GLM, DeepSeek, Alibaba's Qwen) are excellent at coding and dramatically cheaper than Claude or GPT for the same quality. The catch: calling api.z.ai, api.deepseek.com, or dashscope-intl.aliyuncs.com directly sends your code, prompts, and any file contents the agent has read to servers physically located in mainland China.

Orchestra routes these models through LLMBase — a German company that runs the open-source versions of these models on their own GPUs in Germany. Your code and prompts stay in the EU.

The data path, end to end

When the Orchestra agent picks, say, GLM-4.6 to make a code change, this is what physically happens:

Your Orchestra container (Fly.io, Amsterdam region)
        │
        │   HTTPS request — your code + prompt
        ▼
Cloudflare edge POP (Amsterdam, NL)
        │   Cloudflare decrypts the request here, then re-encrypts and forwards.
        │   This is normal for any SaaS that uses Cloudflare as a CDN.
        ▼
LLMBase origin server (claimed: Germany)
        │
        ▼
GPU running the open-source GLM weights file
        │   The model is a math file. It does no network calls during inference.
        ▼
Response → LLMBase origin → Cloudflare → your container

What's a CDN / why is Cloudflare in the middle?

A CDN (Content Delivery Network) is a service that puts servers all over the world in front of an actual application. Two reasons SaaS companies use one:

Speed. Your request lands at the nearest Cloudflare server (in your case Amsterdam, 50ms away) instead of going all the way to Germany.
Hiding the origin. Attackers can't see the real IP of LLMBase's servers, only Cloudflare's. Makes DDoS attacks much harder.

Side effect: Cloudflare needs to inspect the request to apply firewall rules and rate limits, so it decrypts the HTTPS traffic at its edge and re-encrypts it before forwarding. This is the industry-standard pattern: any service that sits behind a reverse-proxy CDN handles your requests the same way.

What does 'briefly visible to Cloudflare' mean for me?

Cloudflare is a US company, so technically your request bytes pass through US-owned infrastructure for a few milliseconds, even though the physical server doing the decryption sits in Amsterdam. Whether that matters depends on what you're building:

Most projects: doesn't matter. Cloudflare offers a GDPR-compliant DPA and a configuration that keeps EU data on EU PoPs, and it is used by a large share of European SaaS companies.
Regulated industry / government: may matter. A US company touching your data — even at an EU PoP — can be subject to US law (CLOUD Act, FISA). For those workloads LLMBase sells dedicated endpoints that bypass the shared Cloudflare edge.

What this changes for your data

Your request goes to api.llmbase.ai instead of api.z.ai.
It lands on Cloudflare's Amsterdam edge, then a German origin server.
The model (an open-source weights file) runs locally on that German GPU.
The response comes back the same way.
Nothing your agent sees crosses the China border at any point.
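
One way to make the routing rule in this list enforceable is a small guard that refuses to send a request to a direct vendor endpoint. The sketch below is an assumption about how such a check could look, not Orchestra's actual code: the function name check_endpoint and the one-entry allowlist are hypothetical, while the hostnames are the ones named on this page.

```python
from urllib.parse import urlparse

# Direct vendor endpoints named on this page, which it describes as non-EU.
DIRECT_VENDOR_HOSTS = {
    "api.z.ai",
    "api.deepseek.com",
    "dashscope-intl.aliyuncs.com",
}

# Hypothetical allowlist: the only host this sketch treats as EU-routed.
EU_ROUTED_HOSTS = {"api.llmbase.ai"}


def check_endpoint(url: str) -> str:
    """Return the hostname if it is on the EU allowlist, else raise ValueError."""
    host = (urlparse(url).hostname or "").lower()
    if host in DIRECT_VENDOR_HOSTS:
        raise ValueError(f"{host} is a direct vendor endpoint; route via LLMBase instead")
    if host not in EU_ROUTED_HOSTS:
        raise ValueError(f"{host or url!r} is not on the EU-routed allowlist")
    return host
```

Calling check_endpoint before every outbound model request turns a misconfigured base URL into a loud failure instead of a silent data transfer. It only checks where the request is addressed, not where the origin server physically sits.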

For GDPR purposes, you are the data controller for a workflow whose processing stays physically inside the EU, assuming LLMBase's hosting claims hold.

What's a DPA, and why does it matter?

A Data Processing Agreement is a contract between you (the "controller" — you decide what happens with the data) and a service you use (the "processor" — they handle the data on your behalf). Under GDPR, you legally need one with every processor that touches your users' personal data.

LLMBase provides a DPA on request. If you build something on top of Orchestra that touches personal data, you should request and review it before you ship.

What this does NOT change

Routing a Chinese-origin model through an EU GPU does not make the model itself non-Chinese. The weights were trained in China on Chinese-sourced data by Chinese teams. Any biases, opinions, training artifacts, or value alignment baked into those weights stay baked in regardless of where you run them.

If your concern is "I don't want my code touching anything ideologically Chinese at all," EU hosting doesn't address that — your alternative is to use only Western-origin models (Claude, GPT, Mistral, Llama).

If your concern is "I don't want my proprietary code stored on Chinese-mainland servers or subject to Chinese national security law," EU hosting does address that, provided LLMBase's hosting claims (covered below) hold.

Which models we expose

We expose whatever LLMBase serves from their EU infrastructure: GLM-5.1 / 4.6 / 4.5 (Zhipu), DeepSeek V3, Qwen3.6 Plus (Alibaba), Kimi K2.6 (Moonshot). LLMBase publishes the up-to-date list on their models page.

LLMBase's position is that every model they expose runs on their own GPUs in Germany — that's the whole pitch and the reason the platform exists. We take that at face value. If your project needs hard certainty (regulated industry, compliance audit), request their DPA and verify per-model in writing before relying on it.

What we could verify from the outside

To be honest about what we've actually checked versus what we're taking on trust:

Verified by us: api.llmbase.ai is fronted by Cloudflare. Its DNS records resolve to Cloudflare IP ranges, and the edge PoP that terminated our test request was in Amsterdam (still EU).
Not verifiable externally: the origin server behind Cloudflare. That's the whole point of a CDN front door — the origin IP is hidden from the public internet. LLMBase states their GPUs are in Germany; we can't confirm this with a DNS lookup or a traceroute. They don't publicly disclose which GPU host they use (Hetzner, OVH, Nebius, or self-owned hardware).
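
The external checks in this list can be reproduced roughly as follows. This is a minimal sketch, not the exact procedure we ran. It assumes the response carries Cloudflare's cf-ray header, whose suffix is the airport-style code of the edge location that handled the request. It also uses a hard-coded, partial snapshot of Cloudflare's IPv4 ranges, which should be refreshed from Cloudflare's published list before you rely on it. The function names are hypothetical.

```python
import ipaddress
from typing import Optional

# Partial snapshot of Cloudflare's published IPv4 ranges (assumption: may be
# incomplete or out of date; refresh from Cloudflare's official list).
CLOUDFLARE_V4 = [ipaddress.ip_network(n) for n in (
    "104.16.0.0/13", "104.24.0.0/14", "172.64.0.0/13",
    "162.158.0.0/15", "198.41.128.0/17", "188.114.96.0/20",
)]


def is_cloudflare_ip(ip: str) -> bool:
    """True if the address falls inside one of the listed Cloudflare ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CLOUDFLARE_V4)


def edge_location(cf_ray: str) -> Optional[str]:
    """Extract the edge location code from a cf-ray value shaped like '<id>-AMS'."""
    _ray_id, sep, code = cf_ray.strip().rpartition("-")
    return code.upper() if sep and code else None
```

In practice you would resolve api.llmbase.ai (for example with socket.getaddrinfo), pass each address to is_cloudflare_ip, and read the cf-ray header from a test response. Neither check reveals anything about the origin behind Cloudflare, which is exactly the limit described in this list.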

For most projects this is fine — Cloudflare-in-front-of-EU-origin is the standard pattern for any GDPR-compliant SaaS. For high-sensitivity workloads you should ask LLMBase directly: which GPU provider, which city, and whether you should use their dedicated-endpoint product.

What's a 'dedicated endpoint'?

LLMBase's default product (what Orchestra uses by default) is a shared API: lots of customers send requests to the same GPU pool behind the same Cloudflare front door. Cheap, but you share infrastructure with everyone else.

A dedicated endpoint means LLMBase spins up GPUs that only YOU talk to. Cloudflare is usually optional, pricing is usually a fixed monthly fee, and residency guarantees are usually stronger. It's the right product if you need an audit trail saying "our data never touched any infrastructure outside [specific data-center]."

LLMBase's dedicated endpoints page has details.

The trust assumption (their claims)

Everything below comes from LLMBase's own marketing and docs, not from independent verification. They are a German limited company (GmbH) bound by EU and German law, which is a meaningful constraint: false claims here would be a contractual breach plus a GDPR violation plus potential consumer-fraud liability under German law. Still, these are only claims:

GPUs physically in Germany
GDPR-compliant; DPA available on request
Zero request logging by default
No training on customer data
Outbound calls from their infrastructure go to their own systems only (logging, billing, monitoring) — they don't mention any third-party telemetry or analytics destinations
Customers include Fraunhofer-Gesellschaft and other named German institutions. These are public claims on their site; faking them would be a serious legal issue.

What's a 'subprocessor'?

A subprocessor is a third party your processor uses. If LLMBase rents GPUs from Hetzner, then Hetzner is LLMBase's subprocessor and (transitively) yours. Under GDPR you have the right to know who they are.

Cloudflare is one (we verified that ourselves). Whoever provides the actual GPU compute is another. LLMBase's DPA should list all of them. If you have real compliance obligations, read that list carefully.

If your project has real compliance requirements (regulated industries, government contracts, audit trail), don't take this page's word — or LLMBase's marketing copy — at face value. Concretely:

Request their DPA and review the full subprocessor list
Ask which GPU subprocessor they use and which physical city
Ask whether a dedicated endpoint (no shared Cloudflare edge) is required for your compliance posture
If you can't get satisfying answers, use Orchestra with Western-origin models (Claude, GPT, Grok) where provenance is fully transparent

Cost angle

Aside from the data-sovereignty story, the practical reason to use EU-hosted Chinese models is cost. GLM-4.6 and DeepSeek-V3 hit Claude Sonnet-class quality on coding tasks at ~10-20% of the price per million tokens. For agentic workflows that fire 50+ turns per task, that compounds quickly. LLMBase's markup over raw open-source compute cost is small — typically pay-per-use starting around $0.15/M tokens with no monthly fee. We make these the default choice for cost reasons; EU hosting is a bonus, not a tax.
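
To see how the per-task difference compounds, here is a back-of-the-envelope calculation. The 50 turns and the $0.15/M starting price come from the paragraph above. The 20,000 tokens per turn is a made-up figure for illustration, and the comparison price is derived by assuming $0.15/M is 10% or 20% of it, not taken from any real price list.

```python
def task_cost(turns: int, tokens_per_turn: int, usd_per_million: float) -> float:
    """Cost in USD of one agent task at a flat per-token price."""
    return turns * tokens_per_turn / 1_000_000 * usd_per_million


TURNS = 50                 # from the text: "50+ turns per task"
TOKENS_PER_TURN = 20_000   # hypothetical, for illustration only
EU_PRICE = 0.15            # USD per million tokens, the "starting around" figure

eu_cost = task_cost(TURNS, TOKENS_PER_TURN, EU_PRICE)

# If EU_PRICE is 10-20% of the comparison price, that price is 0.75-1.50 USD/M.
baseline_low = task_cost(TURNS, TOKENS_PER_TURN, EU_PRICE / 0.20)
baseline_high = task_cost(TURNS, TOKENS_PER_TURN, EU_PRICE / 0.10)

print(f"EU-hosted:  ${eu_cost:.2f} per task")
print(f"Comparison: ${baseline_low:.2f} to ${baseline_high:.2f} per task")
```

Under these assumptions one task costs about $0.15 on the EU-hosted model against $0.75 to $1.50 on the comparison model. The absolute numbers scale linearly with tokens per turn, so substitute your own measurements before drawing conclusions.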

Want the gory details?

See LLMBase's own pages:

About LLMBase — infrastructure and company background
European DeepSeek Alternative — their pitch on the data-sovereignty angle
Dedicated AI Endpoints in Europe — for orgs that need isolated GPUs
