A security architecture for putting AI to work on Protected Health Information and other critical data.
Knowledge Spaces lets regulated organizations build AI applications on sensitive data without sending whole records, whole corpora, or uncontrolled source systems to the model, and lets partners do the same for every client they serve.
Organizations that hold regulated data keep reaching the same conclusion: the data with the most analytic value is the data they are least allowed to expose. Protected Health Information is the clearest case. Claims, enrollment, and quality data are where the value sits, and all of it is PHI.
The usual workaround is to strip records down until they are no longer regulated, then send what is left to a model. It is lawful, but it is slow and it removes the dates, geography, and linkage that made the analysis worth doing. The value leaves with the identifiers.
Knowledge Spaces helps turn regulated data from a protected asset that sits unused into a controlled source for analysis, decision support, and custom applications. It places a governed boundary between your data and any AI model, so the model receives only the scoped context the application is allowed to send for that user, workflow, and question while the full, linked, sensitive data stays inside a line you control. The controls a regulated deployment needs (tenancy, least privilege, retrieval scoping, audit logging, and configurable data minimization) already exist and run, so a scoped first deployment can focus on integration, workflow fit, and evidence rather than rebuilding the control layer from scratch.
Knowledge Spaces rests on three ideas:
Knowledge Spaces is a control layer, not a model and not an application.
It scopes data before inference. The model sees the prompt, the current exchange, and the retrieved context allowed for that bot and user.
It produces records a compliance officer can inspect: configuration changes, sharing events, administrative actions, document activity, bot activity, and conversations when logging is enabled.
AI is no longer a place you visit. It is becoming the way knowledge work gets done. The most capable models are moving off their own websites and into the tools your teams already use: chat, documents, email, shared channels, internal systems. Instead of opening an app to ask a question, people now expect AI to be present where the work is, aware of context, and able to act. This shift is real, it is fast, and for most companies it will be a genuine gain in speed and output.
It also changes what your AI systems can see. To be useful inside a company, these systems read widely: conversations, files, processes, customer records, the working memory of the business. The more value they create, the more they reach into your most sensitive data, and the more of that reach happens quietly, in the background, without a deliberate decision each time. More systems, more access, less visibility. That is the trade most organizations are making by default, often without a record of who saw what.
The answer is not to slow AI down. Companies that hesitate will fall behind the ones that adopt. The answer is to decide the terms of access before the data moves. Put a governed control layer between your data and any model, so each AI application is limited to the sources, roles, and context it is allowed to use, platform actions are recorded, and sensitive fields can be minimized or removed before they reach a model. Adoption stays fast. The data exposure does not have to grow with it.
Models change fast. Governance, data boundaries, and policy should not have to be rebuilt every time the best model changes.
There is a strategic reason to own this layer, not just a security one. As frontier models keep improving and competing, model choice will keep changing. What should not change every quarter is the place where your governance, your data boundaries, and your policy live. If that context sits inside a single model vendor's stack, your switching cost rises every month. If it sits in a control layer you own, model choice stays a procurement decision, not a rebuild.
That layer is what Knowledge Spaces provides. It is model-agnostic by design and works with the strongest frontier models rather than against them. The result is the upside of AI inside your business, on terms you set and can prove, independent of which model is in fashion next quarter.
The risk in putting AI on regulated data is specific, and it is disclosure.
Sending regulated data to a third-party model does three things at once. It may disclose the data to a vendor, which under HIPAA can make that vendor a business associate and require a written Business Associate Agreement before anything moves. It can send more than the task needs, which runs against the minimum-necessary standard. Consumer and default commercial endpoints may also retain inputs, route them through subprocessors, or reserve rights the customer has not reviewed. Enterprise model terms can solve part of that contractually, but contract terms do not decide which internal source, user, or record should be used for a given answer.
A public chatbot is the wrong control point for this work. Even when an enterprise AI tool offers better terms, it usually does not carry the client's source map, workflow permissions, partner-client segmentation, retrieval rules, and application-specific audit trail.
HIPAA does not ban AI, cloud, or large language models. It is technology-neutral. The task is not to avoid AI. It is to put a governed boundary between the data and the model, so that what crosses the boundary is scoped, minimized, contractually governed, and recorded.
HIPAA governs how protected information is used and disclosed and who it may be disclosed to. Knowledge Spaces is built around that distinction.
Knowledge Spaces reduces the tradeoff between using your data and protecting it. The result is a working AI capability on the data that matters most, under controls your compliance team can inspect.
The records, claims, and case files where the value sits stay usable inside the governed boundary instead of being pushed into an uncontrolled model path.
It stays inside your boundary, and only scoped context reaches a model for the allowed workflow.
Tenancy, role scoping, audit logging, and retrieval controls already exist. The work is to connect sources, map roles, define allowed uses, and test the workflow against real data.
Administrative, configuration, and sharing actions are logged and exportable, so the answer is in the record, not in someone's memory.
Add or switch models per workload without rebuilding governance, because the controls live in the layer, not the model.
Each client sits in its own organization, so a partner can build on the platform while client boundaries remain explicit and testable.
Knowledge Spaces is the control layer between an organization's data and the AI applications built on top of it. It is not a single model, and it is not a finished application. It is the governed middle: where the organization decides what data exists, who can reach it, which model sees what, and what gets written to the record.
Data is not bulk-loaded into the model. Source material is ingested into the governed Knowledge Spaces boundary, indexed for retrieval, and then scoped excerpts are sent to the model when the application asks a question. What reaches the model is the configured prompt, current exchange, and retrieved context allowed for that bot and user. Retrieval scoped this way is the technical control that supports minimum-necessary policies at inference time.
Knowledge Spaces is model-agnostic. The model is chosen per workload from commercial and hosted open models, and that choice can change without rebuilding anything, because the governance lives in the layer, not in the model.
A secure environment is not a single feature. It is a set of choices in the platform and a way of working that backs them, and both are built to give the same confidence to an organization trusting us with its own data and to a partner trusting us with its clients' data.
Data minimization starts in the retrieval path. A bot can search only its bound spaces, and the model receives only the prompt, current exchange, and retrieved context allowed for that request.
Least privilege runs throughout. Access to data, configuration, and keys is scoped to role and purpose; keys can be bound to specific bots and spaces and then expired or revoked; and model validation is designed to reject a missing, unauthorized, or unavailable configuration instead of silently falling back to an unintended provider.
Isolation is enforced in the application layer on request paths. Every object belongs to an organization, and authorization checks gate access before retrieval or configuration changes proceed. For engagements that need stronger infrastructure isolation, scope a dedicated deployment.
Change is governed. Every change to how a bot behaves is versioned and recorded, and restoring an earlier version writes forward without erasing history, so the full lineage stays recoverable.
Shared responsibility is stated plainly. The platform carries the controls it can carry, we are clear about what remains the organization's responsibility, and we do not claim certifications that do not exist.
Platform controls are built into Knowledge Spaces: organization tenancy, role scoping, bot-to-space binding, scoped API keys, model selection, versioned prompts, activity logging, and exportable records.
Deployment controls are scoped for the engagement: masking, tokenization, Business Associate Agreement terms, no-retention model terms, dedicated infrastructure, customer-specific retention, and any single-tenant or self-managed pattern.
Customer controls remain with the organization: permitted purpose, workforce policy, legal basis, approval workflows, downstream use, and the final compliance program. A serious deployment names all three before sensitive data moves.
Senior Sprinklenet engineers and advisors stay involved through architecture, integration, and launch. As deployments mature, support moves into a documented operating model instead of disappearing into a ticket queue.
We work embedded and hands-on, fitting the controls to your environment and to the questions your compliance team will ask.
Transparency is itself a control. The architecture is documented, the audit trail is queryable, and the boundary behavior is specified rather than asserted, so you can verify what the platform does.
We separate platform controls from customer responsibilities. The paper says what Knowledge Spaces controls directly, what must be configured during the engagement, and what remains the customer's legal, policy, and operational responsibility.
Partners who build on Knowledge Spaces face a sharper version of the same problem: they hold data for many clients at once, and every client expects its data to stay its own. The platform is built for this. A partner operates as a principal organization and holds each client in its own child organization. Records, bots, and conversations belong to an organization, and the governed retrieval path checks those boundaries before a bot can reach data, memory, or results. When a client wants the partner's application to use its knowledge, the client shares a space by read-only reference: the partner's bot can ground its answers on that knowledge, while the underlying files, storage, and vectors stay owned by the client and are never copied across.
A partner can build and run a client-specific application on top of Knowledge Spaces while the platform keeps each client's data boundary explicit. The partner brings the methodology, customer relationship, and market. Sprinklenet helps turn that into a client-specific application, connects the data, configures the governed boundary, and supports the system as usage grows.
The build pattern is practical. Start with one client workflow and one source map. Connect the data the workflow actually uses. Define who can ask which questions. Build the front end users will touch. Bind the application to the right spaces, prompts, models, and review path. Launch a narrow version, watch the questions and failures, then expand the boundary as the system proves value.
Knowledge Spaces is built from a small set of bounded objects: organizations, knowledge spaces, documents, and bots, plus the request path that connects them.
The Organization is the tenancy boundary. Every user, bot, knowledge space, document, and conversation is owned by exactly one organization, and access is enforced by a role-based middleware ladder on every request.
A governing organization can be designated a Principal Organization that provisions and manages child organizations beneath it, for example a firm that holds each of its clients in a separate organization. The principal sets inherited limits and curates the knowledge spaces shared to its children by default. Within an organization, members hold one of four roles: owner, admin, member, or viewer. A principal administrator can manage governed child organizations according to configured role and policy boundaries, and sensitive client-data access should be scoped explicitly and audited.
A knowledge space is a named container for an organization's knowledge: documents, structured data, and connected sources such as APIs and websites, configured on the space. When a document is ingested, its text is extracted, embedded into a vector index, and its content stored in object storage, so a bot can retrieve relevant passages without loading whole files. A document can belong to more than one space.
A bot is a configured, stateless surface: a model, a system prompt, branding, retention and memory settings, and the specific knowledge spaces it is allowed to search. Its behavior is governed as much by the system prompt, which can be highly technical, as by the spaces it is bound to, and through those spaces it can draw on connected sources such as APIs and websites. A bot can retrieve only from the spaces it is bound to. It cannot reach another organization's data, another bot's memory, or any space it was not granted.
Organizations often need to let a partner's bot use their knowledge without surrendering the underlying files. Knowledge Spaces does this by reference, not by copy. Sharing a space grants another organization read-only access to it inside the governed retrieval path. The documents, stored files, and vectors stay owned by the source organization. The recipient can ground a bot's answers on the shared space but cannot browse, download, edit, or move the raw files through that share.
A single end-user question moves through the platform as a governed sequence. The request arrives through an authenticated interface and is tied to a session. The bot retrieves scoped context from only the knowledge spaces it is bound to. The configured model is called to generate a response. The answer is returned, and platform events are logged.
What reaches the model is the system prompt, the retrieved context slice, and the current exchange. What does not reach the model is the full corpus, other spaces, or any other tenant's data. A dedicated deployment can add boundary masking, so that identifiers are withheld or replaced before context leaves the boundary.
How a bot behaves is governed and versioned. When a system prompt changes, the platform writes a new prompt version. When a tracked configuration field changes, it writes a new configuration snapshot. Restoring an earlier version never overwrites history; it writes a new version on top, so the full lineage is preserved. Every change is recorded in the activity log with a field-level record of what changed. A system prompt can be highly technical, and it is often where most of a bot's behavior is set. Knowledge Spaces includes built-in AI assistance to draft, improve, or review a prompt before applying it, and Sprinklenet's professional-services team works with you on prompt optimization, one of the highest-value parts of an engagement.
Each bot uses the model chosen for it, drawn from a catalog kept current against the major providers. Validation confirms both that the model is in the catalog and that the organization holds the required provider credentials, so a misconfigured model fails closed rather than calling an unintended endpoint.
Programmatic access is scoped and least-privilege. Customer-facing scoped API keys carry named permissions, can be restricted to specific bots and spaces, support IP allow-lists, per-key rate limits, and expiry, and are stored only as hashes. Provider credentials are governed separately and should be reviewed during security diligence. Sign-in supports multi-factor authentication with one-time recovery codes, and passwords are stored using bcrypt. API defenses include security headers, strict cross-origin rules, request validation, and rate limiting.
The platform keeps activity and audit logs across users, members, organizations, API keys, knowledge spaces, documents, bots, and system events, with a configurable retention policy and export to common formats. For data-subject obligations, an organization can export conversations for a single end user and purge a user's stored memory and sessions. Active stored files and vectors are removed when a document or space is deleted. Backup, log-retention, and legal-hold behavior should be documented for the deployment.
Analytics on PHI is the case that stalls most healthcare and CMS-adjacent projects. It can be done lawfully without gutting the data.
HIPAA offers two ways to remove data from its scope entirely. Safe Harbor de-identification removes eighteen categories of identifiers, including names, most date detail, and geography below the level of a three-digit ZIP (with sparsely populated three-digit ZIPs reduced further). Expert Determination uses a qualified expert to determine and document that re-identification risk is very small. Safe Harbor is cheap and mechanical but blunt: it strips out admission and discharge timing, fine geography, and the linkage keys that longitudinal, cohort, and utilization analytics depend on. Expert Determination can preserve more, but it costs money and expertise and produces a probabilistic result that must be re-justified as data and techniques change. Both methods can buy privacy by destroying information that made the analysis worth doing.
A middle instrument exists: the limited data set. It keeps dates and some geography while removing direct identifiers. It is still PHI, so it cannot be sent to a model freely, but it can be used for analytics under a Data Use Agreement that forbids re-identification and limits further disclosure. For analyses that genuinely need dates or geography, a limited data set inside a governed environment is often the right tool.
The control-layer model changes the tradeoff. The full, linked, date-precise data stays inside the governed boundary. Only scoped input context crosses to the model. The value stays inside the line; the model receives the context the workflow is allowed to send.
Analytics that need that richness run against the full data inside the boundary. What crosses to the model is a retrieval-scoped excerpt, and in a regulated deployment a masked field, a surrogate token, or an aggregate. Disclosure risk is concentrated on the narrow slice that actually leaves the boundary.
Tokenization is easy to get wrong. Replacing an identifier with a token is only safe if the token is not derived from the identifier, cannot be reversed by whoever receives it, and the mapping back to the person never leaves the governed layer. A token that is a hash of a Social Security number is still PHI. Where tokenization is part of the deployment, the design should be confirmed so the model and downstream systems see only non-reversible surrogates while full linkage stays inside the boundary.
Data tied to the Centers for Medicare and Medicaid Services carries a higher bar than HIPAA alone. Claims, enrollment, and quality data handled for or with CMS typically implicate federal information-security requirements: FISMA, the CMS Acceptable Risk Safeguards baseline that implements NIST SP 800-53, and FedRAMP for the cloud layer, usually under a CMS Data Use Agreement. Knowledge Spaces controls can be mapped to NIST 800-53 control families and the platform can be scoped for deployment on FedRAMP-authorized cloud infrastructure. The authority to operate for any given federal system is earned per deployment and per environment. It is not a property of the software.
Each control below supports an obligation a compliance team already enforces. The platform does not replace policies, risk analysis, contracts, workforce training, or legal review.
| Control | What it does | Obligation it serves |
|---|---|---|
| Organization tenancy | Every record belongs to one organization; principal organizations hold their clients in separate child organizations | Separation of one client's data from another's prompts, retrieval, and results |
| Minimum-necessary retrieval | Only context authorized for a user, workflow, and question is retrieved and sent to the model | Technical support for applying the HIPAA minimum-necessary standard at inference time |
| Boundary masking | When configured for the deployment, identifiers can be withheld or replaced before context leaves the boundary | Limiting what is disclosed to a model for a regulated workflow |
| Role-based access control | Access to data is gated by role and scope; in regulated deployments, access to any masking or re-identification capability is gated the same way | The Security Rule access-control standard and least privilege |
| Scoped API access | Keys carry named scopes, bot and space restrictions, IP allow-lists, expiry, and per-key limits, and are stored hashed | Controlled, least-privilege access for integrations and partners |
| Audit logging | Administrative, configuration, sharing, document, bot, member, and key actions are recorded in a queryable activity and audit log with retention and export; end-user conversations are captured per bot when conversation logging is enabled | The Security Rule audit-controls standard and information-system activity review |
| Versioned change control | Prompt and configuration changes are versioned and never overwritten, with a field-level record of each change | Demonstrable control over how an AI system behaves over time |
| Encryption | Data is encrypted in transit and at rest through the underlying managed cloud services; key ownership, backup retention, logs, and subprocessors should be documented during diligence | Transmission security and at-rest protection |
| Model governance | The model is chosen per workload and can be required to carry no-retention, no-training terms | Closing the secondary-use and retention gaps consumer endpoints leave open |
| Deployment flexibility | Managed service today; dedicated, self-managed, or isolated patterns can be scoped for regulated engagements | Matching data residency and isolation to the sensitivity of the workload |
Two controls carry the most weight for regulated data. The first is the audit trail and versioned change control: a compliance officer can answer what changed, who changed it, and when from the record rather than from memory, and any change to how a bot behaves is itself recorded and recoverable. The second is model governance: because the organization chooses the model and the deployment, it can require an enterprise model under no-retention, no-training terms and scope a dedicated deployment pattern for the most sensitive work.
| Capability | Consumer AI tool | Build it yourself | Generic AI vendor | Knowledge Spaces |
|---|---|---|---|---|
| Data boundary before the model | None in consumer use | Yours to build | Varies by product and contract | Governed retrieval boundary before model calls |
| Per-client segmentation | Not designed for it | Yours to build | Often workspace-based, not partner-client native | Principal and child organization pattern |
| Minimum-necessary support | Manual prompt discipline | Yours to build | Varies | Retrieval-scoped by bot, user, space, and workflow |
| Custom application front end | Generic chat surface | Yours to design and maintain | Usually constrained to vendor patterns | Custom apps can be built on top of the governed layer |
| Existing-system integration | Limited | Yours to connect and govern | Often connector-led | Designed for client source maps, workflows, and integrations |
| Audit trail | None or opaque | Yours to build | Limited or product-specific | Queryable, exportable, retained according to policy |
| Model choice and retention control | Usually fixed by the tool | Yours to integrate | Often tied to one vendor | Chosen per bot and contract path |
| Regulated deployment pattern | No | Yours to build | Sometimes | Managed service today; dedicated patterns can be scoped |
A capable engineering team could build much of this. Knowledge Spaces is the governance an organization would otherwise have to build, operate, and keep current itself, delivered as a platform.
There is no such thing as a HIPAA certification. The Department of Health and Human Services does not endorse or recognize private HIPAA-compliant or Security Rule certifications. A vendor claiming to be HIPAA certified is misstating how the law works. HIPAA compliance is a property of an organization's entire safeguards program, not a feature of a piece of software.
Knowledge Spaces supports HIPAA-relevant safeguards: access control, audit controls, transmission protection, data minimization, and documented change control. For PHI workloads, the contracting path must include a Business Associate Agreement with the relevant parties before ePHI is processed, and Sprinklenet reviews that path during scoping. The platform runs on managed cloud infrastructure with role-based access control, organization-level tenant separation, and queryable, exportable audit logging. Sprinklenet holds a GSA Multiple Award Schedule. Knowledge Spaces is not FedRAMP authorized today; for federal cloud use cases, Sprinklenet can scope a deployment on FedRAMP-authorized infrastructure and work with the customer or sponsor agency on the authorization path. Knowledge Spaces does not make any organization HIPAA compliant on its own. That responsibility remains with the organization, across its full administrative, physical, and technical safeguards program. Knowledge Spaces is built to support that work.
If we run AI on claims data through Knowledge Spaces, are we making an unlawful disclosure?
That depends on purpose, contracts, configuration, and deployment pattern. Knowledge Spaces is designed to keep the analysis inside the governed boundary and send only scoped context to a model under appropriate terms. For PHI, the parties must have the right Business Associate Agreement path in place before ePHI is processed.
Does our data get used to train the model?
Not under the deployment pattern Sprinklenet recommends for regulated data. The model path should use enterprise no-retention, no-training terms, and a dedicated deployment can be scoped when the workload requires stronger isolation.
Can one client's data show up in another client's results?
That should not happen through the governed retrieval path. Records belong to an organization, principal organizations hold clients in separate child organizations, and a bot can reach only the knowledge spaces it was granted.
How quickly can we have a governed capability running?
The control layer already includes tenancy, access control, retrieval scoping, audit logging, and model configuration. Timing depends on source connections, role mapping, workflow design, contracting, and validation against real data.
How do you keep one partner's clients separate from another's?
Each client sits in its own organization, and the governed retrieval path checks ownership before data, memory, or results are used. When a client shares knowledge with a partner's application, it shares a read-only reference, and the underlying files remain owned by the source organization.
How do we satisfy minimum necessary when an analyst is asking open-ended questions?
The system constrains what can be retrieved by bot, user, space, and workflow, then sends only the allowed context to the model. It does not send whole records unless the deployment is configured to do so.
Do we still have to de-identify everything up front?
Not always. Some use cases still require de-identification or a limited data set. Knowledge Spaces gives you another pattern: keep richer data inside the governed boundary and minimize or mask the context that leaves for model reasoning.
Is encrypted cloud storage exempt from a Business Associate Agreement?
No. Under HHS guidance, any cloud provider that handles electronic PHI is a business associate and needs a BAA, even for encrypted no-view storage. Encryption is necessary, not sufficient. The boundary is designed to run on BAA-covered infrastructure.
Can we prove who accessed what?
You can produce the platform-recorded events: administrative, configuration, sharing, key, document, bot, and member actions, plus end-user conversations when conversation logging is enabled. Retention and export settings should be set during deployment.
How do we control what partners and integrations can reach?
Programmatic access uses scoped API keys that can be limited to specific bots and spaces, restricted by IP, given an expiry and a rate limit, and revoked. Access to data and configuration is gated by role.
Are you HIPAA certified? Are you FedRAMP authorized?
No vendor is HIPAA certified, because no such certification exists. Knowledge Spaces supports HIPAA-relevant safeguards, and PHI workloads require the right Business Associate Agreement path before ePHI is processed. Knowledge Spaces is not FedRAMP authorized today; federal cloud deployments can be scoped on FedRAMP-authorized infrastructure with the customer or sponsor agency.
Where does our data live, and can we self-host?
Knowledge Spaces runs as a managed service today. For regulated engagements, dedicated, self-managed, or isolated patterns can be scoped so deployment, residency, and operational controls match the workload.
If your team holds regulated data it cannot safely use in AI workflows, that is the problem Knowledge Spaces was built to solve. The next step is a focused conversation about the application you want to build, the sources it must touch, and the controls your compliance team will ask about.
Request a Knowledge Spaces data-protection and application walkthrough. Bring the workflow, the source map, and the questions your compliance team would ask. We will answer directly.