Island Mountain AI server frequently asked questions

Frequently Asked Questions

Straight answers about local AI hardware, data sovereignty, and what it costs.

Is Local AI Right for Us?

What is local AI inference hardware and how is it different from cloud AI?

Local AI inference hardware is a physical server with GPUs that runs AI models on your premises instead of sending your data to someone else's data center. "Inference" means the system processes your prompts and generates responses. It does not train new models. It runs pre-trained models locally.

When you use a cloud AI service (ChatGPT, Claude, Gemini, Azure OpenAI), your prompts leave your network, get processed on the cloud provider's GPUs, and the response comes back over the internet. The provider sees your data, processes it on shared infrastructure, and may use it according to their terms of service.

With local hardware, the entire cycle happens inside your building. Prompt goes to your GPU. Response comes back from your GPU. Nothing leaves your network. You own the hardware, you own the models (MIT licensed, open weights), and you control every variable.

The tradeoff is straightforward: cloud AI gives you access to the latest proprietary models (GPT-4o, Claude Opus) without hardware investment. Local AI gives you data sovereignty, zero recurring costs after purchase, and models you control entirely, but you are limited to open-source models that typically lag cloud releases by weeks or months. For a detailed comparison, see our Why Local AI page.

Does local AI hardware work without an internet connection?

Yes. Every Island Mountain system is designed to operate completely air-gapped. The models, the inference engine (vLLM/Ollama), and the interface (OpenWebUI) all run locally. Once the system is set up and models are loaded, you can disconnect the ethernet cable entirely and everything still works.

Users access the system through a web browser on your local network. As long as the device and the server are on the same network (even an isolated, non-internet-connected network), inference runs normally. No phone-home. No license checks. No cloud dependencies.

The only things that require internet access are downloading new models and applying OS security updates. Both are optional and can be done on a schedule that fits your security posture. Some organizations in high-security environments load models via physical media and apply security patches through an isolated update process. The system supports that workflow. For more on the technology stack, see our Technology page.

Island Mountain applies a documented air-gap configuration before delivery, disabling all outbound features at the environment-variable level: offline mode is enabled, and telemetry, community sharing, and web search are all turned off, then verified during the 72-hour burn-in. Your IT team can independently verify every setting on the delivered system. See our Technology page for the full configuration reference.

Who should NOT buy local AI hardware?

Local AI hardware is the wrong purchase for several types of buyers, and we would rather tell you now than after you have spent $75,000.

If your organization needs the absolute latest proprietary models (GPT-4o, Claude Opus, Gemini Ultra) on the day they release, local hardware will not deliver that. Open-source models lag proprietary releases. They close the gap over time, but there is always a gap.

If you need to serve more than 50 concurrent heavy users, a dual-GPU system at this price point is not the right architecture. You need enterprise cloud infrastructure or multiple local systems.

If your organization has no IT capability at all and cannot manage a server, be aware that after 30 days of included support, your team owns the maintenance. OS updates, network configuration, and troubleshooting fall to you. We offer ongoing support retainers, but the system lives in your server room.

If your data sensitivity does not require local processing and cloud AI is working fine for your team at $200/month, spending $75,000 on hardware does not make financial sense. Cloud is cheaper for low-volume, non-sensitive use cases. We say this on our Why Local AI page because it is true.

Compliance & Regulations

Does running AI locally comply with HIPAA?

Running AI on local hardware eliminates the data transmission vector that creates HIPAA exposure in the first place. When you use a cloud AI service, Protected Health Information (PHI) leaves your network and is processed on shared infrastructure controlled by a third party. Under 45 CFR §164.402, that transmission can constitute a reportable disclosure event. With local inference hardware, PHI never leaves your premises. There is no cloud provider to execute a Business Associate Agreement with because no business associate is involved. The prompts stay on your server. The responses stay on your server. The model weights sit on drives you physically control.

That said, local hardware does not automatically make your entire AI workflow HIPAA-compliant. You still need proper access controls, audit logging, encryption at rest, and staff training. The hardware eliminates the cloud transmission risk; your organization still owns the operational compliance. Island Mountain is not a compliance attorney, and we do not certify HIPAA compliance. What we build is hardware that removes the third-party processing variable from the equation entirely. For a full breakdown of how local infrastructure fits into a compliance strategy, see our Medical Practices page.

Can a law firm use local AI without violating attorney-client privilege?

The core issue with cloud AI for law firms is this: client data sent to a third-party API is data disclosed to a third party. Under ABA Model Rule 1.6, attorneys have a duty to make reasonable efforts to prevent unauthorized disclosure of client information. Courts have found privilege waived when confidential information is processed through systems outside the firm's control.

Local AI hardware keeps every prompt, every document, and every response on a server inside your office. No data travels to an external API. No third party processes your client's information. The model runs on hardware you own, connected to your network, behind your firewall.

This does not mean local AI automatically preserves privilege in every scenario. Your firm still needs internal policies governing who accesses the system, what data can be input, and how outputs are treated in the work product. But the fundamental architectural risk of third-party disclosure is eliminated. The system sits in your server room, not someone else's data center. For more on how different organization types benefit from local infrastructure, see our Law Firms page.

Can a tribal government run AI without cloud providers?

Yes. That is one of the primary use cases Island Mountain hardware is built for. Tribal nations exercise inherent sovereignty over constituent data, enrollment records, health information, and internal governance documents. Routing that data through cloud AI providers means processing sovereign data through jurisdictions and infrastructure outside tribal authority.

Local AI hardware keeps everything on tribal premises, under tribal law, processed by hardware the nation owns outright. No data leaves the reservation. No cloud provider terms of service apply to your data. No federal or state jurisdiction touches the processing.

The models pre-installed on every system (DeepSeek V4-Flash quantized, Llama 3.1 70B, Mixtral 8x22B) are MIT licensed, meaning the tribe owns the models with no usage restrictions. The system works fully air-gapped if your security posture requires it. Disconnect the ethernet cable and it still runs.

Tribal IT infrastructure varies widely. If you have a server room or even a climate-controlled data closet with the right electrical circuit (208V/30A), the system fits. If you don't, we can discuss what site preparation looks like. See our Tribal Nations page for governance, emergency management, and health services workflows, or visit our contact page to start a conversation about your specific setup.

Does local AI hardware meet ITAR requirements for defense contractors?

ITAR restricts the processing and storage of controlled technical data to environments that prevent foreign access. Cloud infrastructure operated by providers with multinational operations, overseas data centers, or foreign-national employees creates structural compliance risk for ITAR-regulated data.

Local AI hardware that you own, operate on US soil, and control with US-person-only access eliminates the cloud processing vector entirely. The data stays on your premises, processed by hardware under your physical control, with no third-party infrastructure involved.

Island Mountain systems run open-source models (MIT licensed, publicly available weights), which are not themselves controlled under ITAR. The controlled element is your data, and local hardware ensures that data never transits infrastructure you do not control.

However, Island Mountain does not provide ITAR compliance certification, documentation, or legal guidance. ITAR compliance is an organizational responsibility that covers physical security, personnel screening, access controls, and documentation well beyond the hardware layer. What we provide is infrastructure that removes the cloud processing variable from your compliance posture. Consult your ITAR compliance officer or legal counsel for guidance specific to your program. For CUI, CMMC, and ITAR-specific workflows, see our Defense Contractors page.

Is DeepSeek V4-Flash safe to use with sensitive organizational data?

The safety question about DeepSeek V4-Flash usually comes from two concerns: the model's Chinese origin and data handling. Here are the facts.

DeepSeek V4-Flash is an open-weights model released under the MIT license. When you run it on Island Mountain hardware, the model executes entirely on your local GPUs. No data is sent to DeepSeek, to any Chinese server, or to any external endpoint. The model weights are publicly available and have been independently audited by the open-source community. The model does not phone home. It cannot phone home. It is a file sitting on your hard drive being processed by your GPUs.

The risk with DeepSeek exists only when you use it through DeepSeek's cloud API, which routes your data through their servers. That is not how Island Mountain systems work. Our systems run the model locally, air-gapped if you want, with zero external communication.

The model itself does not contain backdoors or data exfiltration code. It is a set of numerical weights processed through standard inference engines (vLLM/Ollama). Treat it like any other software asset: verify the download hash, review your organization's acceptable use policy, and run it on infrastructure you control. For more on DeepSeek and the full model stack, see our Technology page.
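Verifying a download hash is straightforward to script. A minimal sketch, assuming you have the SHA-256 checksum published with the weights release; the file path and checksum shown are placeholders:

```python
import hashlib
from pathlib import Path

# Placeholder path and checksum; substitute the real weights file and the
# SHA-256 value published alongside the model release.
WEIGHTS = Path("/models/deepseek-v4-flash-q4.gguf")
PUBLISHED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

digest = hashlib.sha256()
with WEIGHTS.open("rb") as f:
    # Read in chunks so multi-gigabyte weight files never load into RAM at once.
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        digest.update(chunk)

if digest.hexdigest() == PUBLISHED_SHA256:
    print("Checksum matches the published value.")
else:
    print("MISMATCH: do not load this file.")
```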

How the Hardware Works

What VRAM do I need to run DeepSeek V4-Flash?

DeepSeek V4-Flash (284B parameters) requires approximately 160GB of VRAM to run quantized or 282GB to run at full FP16 precision. Running it across two GPUs using tensor parallelism delivers faster inference because the model's weights are split across both cards and every token is processed by both GPUs in parallel.

All Island Mountain systems ship with at least 160GB of combined VRAM (two GPUs). On our Summit Base tier (2x H100 80GB refurbished, pre-made), you can run DeepSeek V4-Flash in quantized form with approximately 60-90 tokens per second for single-user inference thanks to the H100's 3.35 TB/s memory bandwidth. On the Summit Ridge tier (2x H100 80GB, build-to-order), performance is similar with the same quantized version.

If you need to run DeepSeek V4-Flash at full FP16 quality with maximum fidelity, you will need our Summit Pinnacle tier with dual H200 141GB GPUs (282GB VRAM), coming Q3 2026. The Summit Base and Summit Ridge tiers can run quantized versions, but not full FP16. See our full GPU comparison on the Products page.
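For reference, splitting a model across both GPUs is a one-parameter change in vLLM, the inference engine our systems ship with. This is a minimal sketch rather than our exact deployment configuration; the model identifier is a placeholder, and large quantized models may need additional settings:

```python
from vllm import LLM, SamplingParams

# tensor_parallel_size=2 splits the model weights across both GPUs, which is how
# a model too large for one 80GB card can run on a dual-GPU system.
llm = LLM(
    model="path-or-hub-id-of-your-quantized-model",  # placeholder identifier
    tensor_parallel_size=2,
)

outputs = llm.generate(
    ["Summarize the attached contract clause in plain English."],
    SamplingParams(max_tokens=256, temperature=0.2),
)
print(outputs[0].outputs[0].text)
```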

What models come pre-installed on Island Mountain hardware?

Every Summit Base and Summit Ridge system ships with three models installed, configured, and burn-tested:

DeepSeek V4-Flash (quantized) is a reasoning and general-purpose model built for complex analysis, multi-step problem solving, code generation, and structured thinking. The quantized version uses approximately 160GB of VRAM on the Summit Base tier.

Llama 3.1 70B is Meta's general-purpose model, strong across writing, summarization, question answering, and conversational tasks. It uses approximately 40-48GB of VRAM.

Mixtral 8x22B is Mistral's mixture-of-experts model with strong multilingual capability and efficient multi-task inference. It uses approximately 80GB of VRAM.

All three are MIT licensed with no usage restrictions. You own them outright. You can switch between them instantly through the OpenWebUI dropdown menu. Additional open-source models can be downloaded and installed after delivery through Ollama or the OpenWebUI interface. For the first 30 days, we walk you through model management directly. See our Technology page for detailed specifications on each model.
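If your IT team wants to confirm what is installed without opening the browser, Ollama's local REST API lists the models it manages. A minimal sketch, assuming Ollama is running on its default port; the hostname is a placeholder for your server's local address:

```python
import json
from urllib.request import urlopen

# Ollama's default local API port is 11434; replace the hostname with your
# server's address on your local network.
OLLAMA_URL = "http://ai-server.local:11434/api/tags"

with urlopen(OLLAMA_URL) as resp:
    installed = json.load(resp)

# Print each installed model and its on-disk size.
for model in installed.get("models", []):
    size_gb = model.get("size", 0) / 1e9
    print(f"{model['name']:40} {size_gb:6.1f} GB")
```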

What is OpenWebUI and how does it work?

OpenWebUI is a free, open-source, browser-based interface for interacting with local AI models. Think of it as a private version of ChatGPT that runs entirely on your network. No account with any cloud provider. No external data transmission.

You open a web browser on any device connected to your network (Chrome, Firefox, Safari, Edge), navigate to the server's local address, and start prompting. The interface includes a dropdown menu to switch between installed models (DeepSeek V4-Flash quantized, Llama 3.1 70B, Mixtral 8x22B), full conversation history stored locally on the server, and an admin panel for user management and model access controls.

Your administrator can create accounts for each team member, control which models each user can access, and monitor usage. Conversations are stored on the server's local drive, not in any cloud. Search, organize, and reference past conversations just like you would in any chat application.

OpenWebUI requires no command-line knowledge to use. If someone can use ChatGPT, they can use OpenWebUI. The interface is pre-configured on every Island Mountain system before it ships. For more on OpenWebUI and the full software stack, see our Technology page.
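Everyday users stay in the browser, but the same stack can be scripted. vLLM can expose an OpenAI-compatible endpoint on your local network, so in-house tools can call the server without any cloud dependency. A minimal sketch, assuming that endpoint is enabled; the address and model name are placeholders:

```python
from openai import OpenAI

# Point the standard OpenAI client at the local server instead of any cloud API.
# Nothing leaves your network; the API key is unused but required by the client.
client = OpenAI(base_url="http://ai-server.local:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama-3.1-70b",  # placeholder; use the name your server registers
    messages=[{"role": "user", "content": "Draft a two-sentence meeting summary."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```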

What is the difference between H100 and H200 GPUs for AI inference?

The two GPUs differ in VRAM capacity, memory bandwidth, and compute performance. Here is the comparison:

The NVIDIA H100 80GB (PCIe, refurbished on Summit Base tier) uses HBM3 memory with 3.35 TB/s bandwidth and delivers 989 TFLOPS of FP16 compute. Two H100s give you 160GB of total VRAM. The Summit Base tier (pre-made) costs $75,000-$85,000. The Summit Ridge tier (build-to-order H100) costs $150,000-$160,000 and offers comparable performance with new units and custom configuration.

The NVIDIA H200 141GB uses HBM3e memory with 4.8 TB/s bandwidth. Two H200s give you 282GB of total VRAM, which is the minimum required to run DeepSeek V4-Flash at full FP16 quality with its 1M token context window. This is our Summit Pinnacle tier ($350,000-$400,000), coming Q3 2026.

The H100 Summit Base tier is the best value for most organizations needing good performance at lower cost. The H100 Summit Ridge tier (build-to-order) offers custom configuration and new units. The H200 Summit Pinnacle tier is for organizations that need the largest models at full quality. See our Products page for the complete specification table.

How many users can simultaneously access an Island Mountain system?

A dual-GPU Island Mountain system comfortably serves a team of 5-15 simultaneous users for typical business inference tasks like document drafting, research queries, contract review, and summarization.

The exact number depends on the model being used, prompt complexity, and how long responses need to be. Simple queries return fast and queue efficiently. Long-form document generation with high token counts takes more time per request, which means fewer concurrent users before response times stretch.

Both H100 tiers handle concurrent workloads well. The Summit Ridge tier (build-to-order, all-new hardware) and the Summit Base tier (refurbished, pre-made) both benefit from the H100's 3.35 TB/s memory bandwidth, which lets vLLM's tensor parallelism work through queued requests quickly.

This system is not designed to serve 50 or 100 simultaneous heavy users. If your organization has that kind of demand, you either need multiple systems, or enterprise cloud infrastructure is the better fit. Island Mountain is built for the team of 5-15 who need data sovereignty and personal service, not for mass-scale deployment. See our Products page for a full comparison of how each tier handles concurrency.

What power requirements does local AI inference hardware need?

All Island Mountain systems require a dedicated 208V/30A power circuit with a NEMA L6-30R outlet. This is the same kind of circuit found in server rooms, data closets, and commercial facilities. The power supply operates at 200-240V only. It will not run on a standard 120V wall outlet.

If your facility does not already have this circuit, a licensed electrician can install one. Typical installation cost ranges from $500-$2,000 depending on your building's electrical infrastructure and how far the new circuit needs to run from the panel.

Average power draw under typical inference loads is 1.5-2.5 kW. At $0.12/kWh, that translates to roughly $100-$200 per month in electricity. The system runs at standard server room temperatures (64-80°F / 18-27°C) and does not require specialized cooling beyond normal HVAC. The chassis includes redundant 2000W power supplies for reliability. See our Products page for complete hardware specifications.
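The electricity estimate is simple arithmetic you can rerun with your own utility rate. A quick sketch using the figures above:

```python
# Monthly electricity estimate from average draw and utility rate.
avg_draw_kw = 2.0           # typical inference load, per the 1.5-2.5 kW range above
hours_per_month = 24 * 30.4
rate_per_kwh = 0.12         # US national average; substitute your local rate

monthly_kwh = avg_draw_kw * hours_per_month
monthly_cost = monthly_kwh * rate_per_kwh
print(f"{monthly_kwh:.0f} kWh/month  ->  ${monthly_cost:.0f}/month")
# 2.0 kW around the clock works out to roughly 1,459 kWh and ~$175/month,
# inside the $100-$200 range quoted above.
```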

Can I add models after the system ships?

Yes. The system is not locked to the three pre-installed models. You can download and install any compatible open-source model through the Ollama interface or directly through OpenWebUI's model management panel.

The constraint is VRAM. Your system has a fixed amount of GPU memory, and each model consumes VRAM when loaded. On the Summit Base and Summit Ridge tiers (160GB total VRAM), you can run any model that fits within that memory. Most 70B-parameter models need roughly 40-48GB of VRAM in the quantized builds most teams run; full FP16 weights take considerably more. Smaller models (7B, 13B, 30B) use less and can run alongside larger models if you want multiple loaded simultaneously.

You cannot run models that exceed your total VRAM. For example, DeepSeek V4-Flash (284B parameters) at full FP16 precision requires approximately 282GB of VRAM, which only fits on the Summit Pinnacle tier (282GB total). Quantized versions of larger models can sometimes fit on lower tiers, but with reduced quality.
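A rough rule of thumb for whether a model fits: the weights alone take roughly parameter count times bytes per parameter, plus headroom for the KV cache and runtime overhead. A back-of-the-envelope sketch, not a guarantee for any specific model:

```python
def weights_vram_gb(params_billions: float, bits_per_param: float) -> float:
    """Estimate VRAM for model weights alone, in GB (excludes KV cache/overhead)."""
    return params_billions * 1e9 * (bits_per_param / 8) / 1e9

# A 70B model: full FP16 versus typical 4-5 bit quantization.
for bits in (16, 5, 4):
    print(f"70B at {bits:>2}-bit: ~{weights_vram_gb(70, bits):.0f} GB of weights")
# FP16 weights are ~140 GB; 4-5 bit lands around 35-44 GB, consistent with the
# 40-48 GB figure above once cache and runtime overhead are included.
```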

For the first 30 days after delivery, we walk you through the model installation process directly. After that, it is a download-and-configure process your IT team can handle. We publish tested compatibility lists for each hardware tier. See our Technology page for current model details.

How can OpenWebUI be air-gapped if it is a web application?

The "web" in OpenWebUI refers to the browser-based interface, not internet dependency.

OpenWebUI is a self-hosted application that runs entirely on the server hardware. Users access it through a web browser pointed at the server's local network address, the same way you access a router's admin panel. No cloud account required. No external API calls for inference.

Out of the box, OpenWebUI does include features that make outbound network connections: model downloads, version update checks, HuggingFace Hub access for embedding models, optional community sharing, web search integration for RAG, and anonymized telemetry. None of these are required for AI inference.

Island Mountain disables every outbound feature before shipping. The key environment variables (OFFLINE_MODE, HF_HUB_OFFLINE, ENABLE_COMMUNITY_SHARING, ANONYMIZED_TELEMETRY, ENABLE_RAG_WEB_SEARCH, and SAFE_MODE) are all set to their air-gapped states during the build process. The system is tested in this configuration during the 72-hour burn-in.

Your IT team can verify it independently. Inspect the container environment variables on the delivered system. Or connect it to a monitored network segment and confirm zero outbound connections during operation. If your security policy requires belt-and-suspenders, configure host-level firewall rules to block all outbound traffic. The system continues to operate normally. See our Technology page for the full air-gap configuration reference.
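If OpenWebUI is deployed as a container, which is the common setup, the environment check is scriptable. A minimal sketch assuming a Docker deployment; the container name and the expected values are placeholders to confirm against your delivery manifest:

```python
import json
import subprocess

CONTAINER = "open-webui"  # placeholder; use the container name on your system

# Air-gapped settings described above; confirm exact values against the
# delivery manifest for your unit.
EXPECTED = {
    "OFFLINE_MODE": "true",
    "HF_HUB_OFFLINE": "1",
    "ENABLE_COMMUNITY_SHARING": "false",
    "ANONYMIZED_TELEMETRY": "false",
    "ENABLE_RAG_WEB_SEARCH": "false",
}

# Read the container's environment from `docker inspect`.
raw = subprocess.check_output(["docker", "inspect", CONTAINER])
env_pairs = json.loads(raw)[0]["Config"]["Env"]          # ["KEY=value", ...]
env = dict(pair.split("=", 1) for pair in env_pairs if "=" in pair)

for key, want in EXPECTED.items():
    have = env.get(key, "<unset>")
    flag = "OK   " if have.lower() == want.lower() else "CHECK"
    print(f"{flag} {key} = {have} (expected {want})")
```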

Cost & Pricing

How much does it cost to set up a private AI server for a small organization?

Island Mountain systems range from $75,000 to $400,000 depending on GPU configuration. The Summit Base tier with 2x NVIDIA H100 80GB PCIe (refurbished, pre-made) costs $75,000-$85,000. The Summit Ridge tier with 2x H100 80GB (build-to-order) costs $150,000-$160,000. The Summit Pinnacle tier with 2x H200 141GB GPUs costs $350,000-$400,000, coming Q3 2026.

Every system is a one-time purchase. There are no subscription fees, no per-token charges, and no recurring software licensing costs. The models are MIT licensed and free to use. The interface (OpenWebUI) is open-source.

Beyond the hardware purchase, plan for a dedicated 208V/30A electrical circuit if you don't already have one in your server room. A licensed electrician typically installs this for $500-$2,000. Ongoing electricity costs run approximately $100-$200 per month depending on usage and your local power rates. That's it. No hidden fees. For a detailed breakdown of total cost of ownership versus cloud alternatives, see our Pricing page.

What is the total cost of cloud AI versus local AI hardware over five years?

The math depends on your team size and usage, but here is a representative comparison for a 10-user organization:

Cloud AI subscriptions cost $12,000-$120,000 over five years for a 10-user team. An Island Mountain Summit Base system costs $75,000-$85,000 once, with a five-year total cost of ownership of approximately $81,000-$97,000 including electricity.

Cloud options (ChatGPT Enterprise, Azure OpenAI, Anthropic) run $20-$200 per user per month, plus per-token overages. For 10 users, that's $2,400-$24,000 per year, with prices that historically only increase. The Island Mountain Summit Base system's annual electricity runs $1,200-$2,400, and that is your only recurring cost.

The crossover point where local hardware becomes cheaper than cloud depends on your cloud spend. At $500/month in cloud AI costs (modest usage), local hardware pays for itself in roughly 13 years. At $2,000/month, the payback period is under 4 years. At $5,000/month or above, you break even in under two years.
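The payback arithmetic is easy to rerun with your own numbers. A simple sketch using the figures above; it divides hardware cost by current cloud spend, and subtracting the $100-$200 monthly electricity cost stretches the low-spend cases somewhat:

```python
hardware_cost = 80_000          # Summit Base midpoint, one-time purchase

for monthly_cloud_spend in (500, 2_000, 5_000):
    months = hardware_cost / monthly_cloud_spend
    print(f"${monthly_cloud_spend:>5}/month cloud spend -> "
          f"payback in ~{months:.0f} months (~{months / 12:.1f} years)")
# $500/month   -> ~160 months (~13 years)
# $2,000/month -> ~40 months  (~3.3 years)
# $5,000/month -> ~16 months  (~1.3 years)
```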

The cost advantage of local hardware grows every month it runs, because the marginal cost of each additional inference is just electricity. Cloud costs compound. Hardware costs don't. See the full comparison table on our Pricing page.

What are the ongoing costs after purchasing local AI hardware?

After the one-time hardware purchase, your recurring costs are electricity ($100-$200/month) and optional support. There are no software licensing fees.

Electricity runs approximately $100-$200 per month, or $1,200-$2,400 per year. This estimate assumes 1.5-2.5 kW average power draw at $0.12/kWh, which is the US national average. Your actual cost depends on local utility rates and how heavily the system is used.

There are no software licensing fees. The models (DeepSeek V4-Flash quantized, Llama 3.1 70B, Mixtral 8x22B) are MIT licensed. OpenWebUI is open-source. The operating system is Ubuntu Server LTS, also free.

After the included 30-day support period, ongoing support is available on a per-incident basis or through an annual retainer. This covers model configuration, performance tuning, troubleshooting, and remote diagnostics.

The only additional cost you might encounter is if you want a GPU upgrade in the future. GPU upgrades are a hardware swap where your existing cards are credited at secondary market value toward the replacement. This is not an annual cost; it is a one-time decision you make if and when you need more capability. See our Pricing page for the full cost breakdown.

Why should I pay $75,000+ when I can build a similar AI server for $5,000-$10,000?

You can. The guides exist, and the software stack is the same: Ollama, vLLM, OpenWebUI, and open-weight models like DeepSeek V4-Flash and Llama 3.1 70B. If you have the technical staff to build, configure, test, and maintain it, a DIY system running consumer RTX 4090s will run inference.

Here is what that build does not include:

Enterprise GPU provenance documentation. Every A100 and H100 in an Island Mountain system has a documented procurement chain: purchase receipts from authorized NVIDIA channels, serial number registry, and RMA history. If your compliance officer, auditor, or contracting officer asks where your GPU hardware came from and who has touched it, you need that paper trail. Consumer GPUs purchased from Amazon or Newegg do not carry it.

72-hour continuous burn-in testing. DIY guides test whether the system boots. Island Mountain tests whether it sustains full-load inference for 72 continuous hours without thermal throttling, memory errors, or performance degradation. That is the difference between a personal project and production infrastructure that your organization depends on.

NVIDIA enterprise RMA chains. When an A100 or H100 fails, the replacement GPU enters the same provenance documentation chain. Consumer GPU RMAs do not maintain that continuity, which matters for HIPAA, ITAR/DFARS, and CMMC audit trails.

Direct builder support. You talk to the person who built your system. Not a tier-1 support ticket queue, not a chatbot, not a forum post. The person who assembled it and ran the burn-in.

The honest answer: if you are a developer running experiments, build your own. If you are a regulated organization putting AI into production workflows where compliance documentation and hardware provenance matter, that is what the price delta covers. See our Products page for full system specifications.

Purchase & Setup Process

How long does it take to receive and set up an Island Mountain system?

From deposit to delivery, expect 3-5 weeks for the Summit Base tier (pre-made). The process has four phases: component sourcing and verification (GPUs sourced from verified enterprise resellers with documented provenance), assembly and configuration (15-phase build process), 72-hour continuous burn-in testing, and full benchmarking with delivery manifest.

The system arrives pre-configured. Models are installed, OpenWebUI is set up, and the server is ready to run. Setup on your end means racking the server, connecting power (208V/30A circuit), connecting to your network, and opening a browser. Most organizations are running their first prompts within hours of receiving the hardware.

The Summit Ridge tier may take 4-6 weeks because H100 GPUs are sourced on a build-to-order basis with custom configuration. The Summit Pinnacle tier with H200 GPUs is coming Q3 2026. We include 30 days of hands-on setup support with every purchase, so if your IT team hits a snag during network configuration or user setup, you have direct access to the person who built the system. See our Pricing page for the full purchase process.

How does the 50% deposit and payment structure work?

The purchase process starts with a conversation about your workload, followed by a custom quote. Every quote includes a 14-day price lock, meaning the quoted price holds for 14 days from acceptance. If GPU market prices spike more than 10% during that window, you can cancel with a full deposit refund. A 50% deposit initiates the build; the remaining 50% is due upon delivery.

Once you accept the quote, a 50% deposit initiates component sourcing and the build process. We do not build speculatively. Your deposit triggers the purchase of your specific GPUs and components.

The remaining 50% is due upon delivery. You do not pay the balance until the system arrives, burn-tested and benchmarked, with a complete delivery manifest documenting every component serial number, test result, and configuration detail.

Payment methods and financing options can be discussed during the quote process. Some buyers have successfully financed hardware purchases through equipment financing lenders who treat AI servers like any other capital equipment purchase. Island Mountain does not currently offer direct financing, but we can point you toward lenders who work with technology equipment purchases. See our Pricing page for the complete purchase timeline.

What does the 72-hour burn-in test verify?

Every Island Mountain system runs 72 hours of continuous stress testing before it ships. This is not a quick benchmark or a 10-minute smoke test. It is three straight days of sustained GPU compute at high load with automated monitoring.

The test verifies thermal stability under continuous load (GPUs maintain safe operating temperatures without throttling), memory integrity (no VRAM errors across billions of operations), inference consistency (model outputs remain stable and correct across thousands of sequential prompts), power supply reliability under sustained draw, and storage performance under continuous read/write activity.

Automated monitoring tracks temperature, clock speeds, error rates, and performance metrics throughout the entire 72-hour window. Any anomaly triggers an alert. Systems that show instability, thermal throttling, VRAM errors, or any deviation from expected performance do not ship. Components are swapped and the test restarts.
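The monitoring relies on standard NVIDIA tooling. Here is a minimal sketch of the kind of polling loop involved, not our internal test harness; the alert threshold shown is illustrative:

```python
import csv
import subprocess
import time

# Poll GPU temperature, clocks, utilization, and memory usage via nvidia-smi.
QUERY = "temperature.gpu,clocks.sm,utilization.gpu,memory.used"
TEMP_ALERT_C = 85  # illustrative threshold, not the production value

with open("burnin_log.csv", "a", newline="") as log:
    writer = csv.writer(log)
    while True:
        out = subprocess.check_output(
            ["nvidia-smi", f"--query-gpu={QUERY}",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        # One line per GPU, in index order.
        for gpu_index, line in enumerate(out.strip().splitlines()):
            temp, clock, util, mem = [v.strip() for v in line.split(",")]
            writer.writerow([time.time(), gpu_index, temp, clock, util, mem])
            if int(temp) >= TEMP_ALERT_C:
                print(f"ALERT: GPU {gpu_index} at {temp} C")
        time.sleep(10)
```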

You receive a complete benchmark report and delivery manifest with your system, documenting every test result, component serial number, and configuration detail. This is not a certificate of perfection; it is proof that the system ran hard for three days without failing. See our Technology page for more on the build and test process.

What happens if a component fails during the warranty period?

Every Island Mountain system ships with a 1-year hardware warranty covering all components, including GPUs, CPU, RAM, storage drives, and power supplies. If a component fails within the first year, we handle the replacement.

GPU failures are managed through supplier RMA (Return Merchandise Authorization) agreements with documented replacement timelines. We maintain a 20% warranty reserve per unit specifically to ensure we can cover replacements without delay. You are not waiting for us to find budget.

The process works like this: you contact us (direct phone at 1-801-609-1130 or email to the builder), we diagnose the issue remotely, and if a physical component needs replacement, we ship the part or arrange a swap. For GPU replacements, turnaround depends on the specific card and current availability, but our supplier agreements include prioritized RMA processing.

Extended warranty options beyond the first year are available at the time of purchase. After warranty, ongoing support is available on a per-incident or annual retainer basis. For full warranty and support details, see our Pricing page.

Risks & Limitations

Island Mountain is a new company with no shipped units. Why should I trust a first purchase?

This is a fair question and you should ask it. Here is the honest answer.

Island Mountain is a Colorado company co-founded by John Dougherty (hardware engineer, 25-year technology veteran) and Basho Parks (marketing and sales). The company is new. The team is not. Every system is built, tested, and delivered by people with decades of combined experience deploying technology infrastructure in demanding environments.

The specific protections built into the purchase process exist because we know trust has to be earned: the 14-day price lock lets you cancel if GPU prices spike above 10%. The 50% deposit / 50% on delivery structure means you do not pay in full until you have a working, tested system in your hands. The 1-year hardware warranty is backed by a 20% warranty reserve per unit. The 30-day setup support gives you direct access to the builder, not a call center.

The hardware is standard enterprise components (NVIDIA GPUs, AMD EPYC CPUs, Supermicro chassis) with established supplier RMA agreements. The software is open-source (OpenWebUI, vLLM, Ubuntu Server LTS). Nothing is proprietary, nothing is locked, and if Island Mountain disappeared tomorrow, your system would keep running because every piece of it is built on publicly available, open-source infrastructure. See our Why Local AI page for our honest positioning on what we are and what we are not.

What happens if Island Mountain goes out of business after I buy?

Your system keeps running. This is by design, not by accident.

Every component of the system is built on open-source software and standard enterprise hardware. The operating system is Ubuntu Server LTS (free, community-supported). The inference engines are vLLM and Ollama (open-source). The interface is OpenWebUI (open-source). The models are MIT licensed with no usage restrictions. None of it depends on Island Mountain's continued existence.

If Island Mountain were to close, you would lose access to our specific support, warranty coverage, and GPU upgrade program. But the system itself requires no license, no activation, and no connection to any Island Mountain server. It is a standalone server running open-source software on standard hardware.

OS updates come from Ubuntu's repositories. CUDA driver updates come from NVIDIA. Model updates come from Hugging Face and Ollama's public model library. Your IT team or any competent Linux administrator can maintain the system independently.

We built it this way because vendor lock-in is exactly the problem we are solving. If we built a system that depended on us to function, we would be no different from the cloud vendors we are replacing. See our Technology page for details on the full open-source stack.

Can local AI keep up with cloud AI as models improve?

Not on day one. But within weeks or months, yes.

Cloud providers release proprietary models (GPT-4o, Claude Opus, Gemini) immediately because they control the infrastructure. You get access the day they launch. Open-source models lag that release cycle. When OpenAI ships a new flagship, you will not have an equivalent open-source model on your local hardware that afternoon.

What has consistently happened over the past two years is that open-source models close the gap rapidly. DeepSeek V4-Flash, Llama 3.1, and Mixtral all reached or exceeded the performance of the proprietary models they followed within weeks to months of those models' releases. The open-source AI development community is massive, well-funded, and accelerating.

Your Island Mountain system can run any new open-source model that fits within its VRAM. When a new model releases, you download it through Ollama or the OpenWebUI interface, and it is available to your team. No hardware swap needed unless the new model exceeds your VRAM capacity.
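Pulling a new model is a single request to Ollama's local API (or a click in the OpenWebUI model panel). A minimal sketch, assuming temporary internet access and Ollama on its default port; the hostname and model tag are placeholders:

```python
import json
from urllib.request import Request, urlopen

# Ask the local Ollama service to download a newly released open-source model.
# Placeholder hostname and model tag; air-gapped sites load from physical media
# instead, as noted earlier.
req = Request(
    "http://ai-server.local:11434/api/pull",
    data=json.dumps({"model": "some-new-model:latest"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    for line in resp:  # progress is streamed as JSON lines
        print(json.loads(line).get("status", ""))
```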

The risk is that a future model generation requires significantly more VRAM than your system has. That is why we offer a GPU upgrade path. Your existing GPUs are credited at secondary market value toward next-generation cards. See our Technology page for roadmap details, including V4-Flash and the Summit Pinnacle tier timeline.

Summary: Island Mountain builds on-premises AI inference servers with NVIDIA H100 and H200 GPUs, priced from $75K to $400K as a one-time purchase. The hardware ships pre-configured with open-source models and requires no cloud connection, making it suitable for organizations bound by HIPAA, ITAR, attorney-client privilege, tribal data sovereignty (OCAP), and FERPA requirements.

Still Have Questions?

One conversation, no sales pitch. Tell us about your organization and what you are trying to accomplish with local AI. We will give you a straight answer.

Or call directly: 1-801-609-1130