The most common question after a first Ragen demo: “OK, so where does this actually run?” And the second one, right behind it: “Can we host it ourselves?”
The answer is: yes, you can host it yourself. And yes, you can run it in our cloud. These are two different scenarios — each for a different kind of company, a different IT team, a different budget. In this article we’ll show you when each choice makes sense, so you aren’t picking blind.
Two worlds, one platform
Ragen is the same platform regardless of how you deploy it. Same features. Same interface. Same admin panel. Same knowledge base, assistants, integrations, chatbot.
The difference is one question: who runs the infrastructure it sits on.
In Ragen Cloud — we do. In on-premise — you do.
Every other difference flows from that single question: costs, time to start, compliance requirements, depth of data control, the load on your IT team.
Ragen Cloud — fast start, predictable bill
In the cloud model, you go to ragen.ai, log in, upload documents, ask your first question. All the infrastructure — databases, servers, AI models, integrations — sits on our side. One account, one invoice, end of story.
What that means in practice:
Fast start. Instead of weeks preparing the environment yourself — a day, maybe two, to roll out your first assistant for the team.
AI model costs included in the subscription. You don’t set up an OpenAI account, negotiate an Anthropic agreement, or watch quotas in Google Cloud. No separate invoices from five AI vendors. One budget line the CFO understands from day one.
Automatic updates. When we ship a new feature or a better model becomes available, you get it immediately. Your team doesn’t update anything, test anything, or migrate anything. New things simply appear.
Zero headcount requirements on your side. You don’t need a DevOps engineer to watch the servers. You don’t need a database administrator. You don’t need anyone who knows Kubernetes — because nobody has to.
Integrations ready out of the box. Google Drive, Gmail, Calendar, HubSpot, ClickUp, Slack, Notion — everything enabled with one click. No OAuth configuration, no token wrangling, no IT team involvement.
Who is this for? Companies that want to start using AI quickly and focus on the business, not the infrastructure. Scale-ups, professional services firms, product teams, agencies, companies up to ~200 people without a dedicated DevOps team. And every company for which “data in an encrypted EU cloud” is a sufficient level of security.
On-premise — full control, your rules
On-premise means we deliver the software and you run it in your own environment. In your data centre. In your private cloud. In your Kubernetes cluster. In your network, under your security policy.
What you gain:
Data stays with you. Documents, conversations, customer content, search results — all inside your infrastructure. They don’t pass through any external service. They don’t land in a third party’s logs. They don’t leave your network for a millisecond.
Choice of AI vendors without an intermediary. You decide which models to work with. Azure OpenAI for data that requires contracts with Microsoft. AWS Bedrock if you already have a contract. Google Vertex AI if your cloud is GCP. European providers like Scaleway and OVH if you want open-source models hosted in the EU at a fraction of the price. Or all of the above — each model for a different task.
Encryption and security on your terms. Encryption keys on your side. Integration with your identity management system (Active Directory, Okta, Keycloak). Integration with your HSM, if you have one. Encryption at rest and in transit, with keys we don’t have access to.
Compliance without compromise. GDPR, national financial regulators, sector-specific regulation, your internal company policies — all under your control. When an auditor asks “where is the data?”, the answer is “with us, on our servers, encrypted with our keys”. End of conversation.
Full configurability. Your own models, your own limits, your own retrieval pipeline tuning parameters, your own integrations with internal systems. If you have specific requirements — you don’t negotiate them. You just deploy them.
Who is this for? Banks, insurers, healthcare, law firms, consultancies serving competing clients, public institutions, companies in regulated industries. Anyone whose only possible answer to “can we send this data outside?” is “no”.
Costs — what you actually pay
Ragen Cloud
One monthly subscription priced by number of users, knowledge base size and number of queries. AI models — included. Infrastructure — included. Updates — included. Support — included.
One invoice. You know upfront what you’ll spend per month and per year. The CFO can plan, the board can approve, and you don’t wake up to a surprise on the company account.
On-premise
Here you pay separately for each piece of the puzzle:
Infrastructure. Servers, databases, storage, monitoring — what Ragen runs on. For a simple deployment in your own cloud (AWS/Azure/GCP), expect a few hundred euros per month, up to about €350. For large deployments with high availability, several multiples of that.
AI models. You pay vendors directly per token. For a thousand queries a month that’s typically €15–17 (with a full pipeline using Claude Sonnet and Cohere). At ten thousand queries the model cost scales linearly, but the infrastructure stays roughly the same — so cost per query falls.
Work out exact numbers for your case in our AI cost control guide.
Your team’s time. This is the cost most companies forget — because it isn’t on an invoice. Running on-premise means monitoring, updates, backups, incident response. Typically 4–16 engineer-hours per month, depending on the scale of the deployment. Multiply by your internal rate and add it to the bill.
When does it pay off? When you already have the infrastructure and team, when your scale exceeds 10,000 queries per month, when regulations mandate on-premise, or when you use European open-source providers (Scaleway, OVH) with models at a fraction of frontier pricing.
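The cost components above can be pulled together into a quick back-of-the-envelope model. A minimal sketch using the article's figures (roughly €350/month infrastructure, €15–17 per 1,000 queries, 4–16 engineer-hours per month); the €60 hourly engineering rate is an assumption you should replace with your own internal rate:

```python
# Back-of-the-envelope on-premise cost model, built from the figures above.
# The engineer hourly rate is an ASSUMPTION, not a number from the article.

def onprem_monthly_cost(queries_per_month: int,
                        infra_eur: float = 350.0,        # fixed infrastructure (article: up to ~€350)
                        model_eur_per_1k: float = 16.0,  # article: €15-17 per 1,000 queries
                        eng_hours: float = 8.0,          # article: 4-16 h/month; midpoint used here
                        eng_rate_eur: float = 60.0       # ASSUMED internal hourly rate
                        ) -> float:
    """Total monthly cost: fixed infra + per-query model spend + team time."""
    model_cost = queries_per_month / 1000 * model_eur_per_1k
    return infra_eur + model_cost + eng_hours * eng_rate_eur

for q in (1_000, 10_000, 50_000):
    total = onprem_monthly_cost(q)
    print(f"{q:>6} queries: €{total:,.0f}/month, €{total / q:.3f} per query")
```

With these assumptions, total spend grows slowly while per-query cost falls steeply between 1,000 and 10,000 queries per month — the crossover effect described above. Swap in your own infrastructure quote and rate to find your break-even point.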
When to choose cloud — checklist
Pick Ragen Cloud if at least three of the following are true for your company:
- You want to move fast. First assistant for the team in a week, not a quarter.
- You don’t have a DevOps team or they’re too busy to take on more infrastructure.
- You need a predictable budget. One invoice, zero surprises.
- Your scale is up to ~5,000 queries per month. In that range, cloud is cheaper than running it yourself.
- You want to use ready integrations (Gmail, Drive, HubSpot, Slack) without configuration.
- Your data isn’t subject to regulations that mandate locality.
When to choose on-premise — checklist
Pick on-premise if at least one of the following is true:
- Regulation mandates it. Banking, healthcare, public sector, law firms — data has to stay in your network, full stop.
- You serve competing clients. You’re a consultancy, agency, or law firm whose clients compete with each other. Data isolation isn’t optional; it’s a contract requirement.
- You’re running at 10,000+ queries per month. Your own infrastructure stops being a fixed cost and becomes a margin lever.
- You already have a DevOps team and infrastructure. The marginal cost of adding Ragen to an existing ecosystem is small.
- You want to use open-source models from Scaleway or OVH and pay a fraction of frontier-model prices.
- Your security policy rules out sending company data to SaaS services — even encrypted ones.
The hybrid scenario — for larger organisations
Not every company has to pick one. In many larger organisations a mixed model works:
- Ragen Cloud for marketing, sales, customer service — fast start, ready integrations, low operational costs.
- On-premise for legal, finance, HR, compliance — anywhere data can’t leave the company.
It’s the same platform, the same assistants, the same interface. The difference is only at the infrastructure layer — invisible to the end user.
For larger organisations this is often the most sensible approach: use cloud where you can and save; keep full control where you can’t. End of dilemma.
Summary — what you’re actually choosing
The choice between cloud and on-premise isn’t a choice between “good” and “bad”. It’s a choice between two paths, each right for a different context.
Cloud is for those who want to use AI quickly and focus on the business. On-premise is for those who must (or want to) keep full control over data and infrastructure.
Most of the companies we talk to work out which path is theirs within the first 15 minutes of the call. Three questions are usually enough: who manages data in your company, what compliance requirements apply, and what scale of usage you expect.
How to find out what fits your company
Two ways:
Start by asking whether Ragen fits your company at all. Check who Ragen is for — customer profiles, a readiness checklist, and fit boundaries.
Talk to us. Book a 30-minute fit call: we’ll analyse your case and recommend a deployment model that makes sense for your company. No commitment, no pressure.
Pick well once — this is a decision you’ll live with for the next several years.
If you’re not yet sure the “cloud vs on-premise” discussion applies to you at all, start with the pillar post: 4 levels of AI sovereignty — which one fits your European company. It sets the frame in which this choice makes sense.
Once you’ve picked a deployment model, the practical path for a first pilot is in RAG in 4 weeks — a playbook for your first knowledge assistant. The same four-week schedule works in cloud and on-premise — only the infrastructure layer changes.
