Data Quality & Provenance

Every record returned by every Silicon Analysts tool — REST or MCP — carries a provenance block describing where the data came from, how confident we are in it, and when it was last refreshed. This page is the canonical reference for what those fields mean.

Provenance Taxonomy

Two enums classify every record: source_type (how the record was produced) and confidence_tier (how much to trust it). The type contract lives in lib/tools/types.ts.

source_type — how this record was produced

research

Sourced from a published external report (Morgan Stanley, TrendForce, public earnings, etc.). Direct attribution.

derived

Computed by Silicon Analysts from public inputs via a documented methodology (Monte Carlo cost models, yield equations, etc.).

computed

Pure-function output from user-supplied inputs (e.g. calculate_chip_cost results). No data lookup involved.

estimated

Analyst judgment where data is sparse or unobservable.

confidence_tier — qualitative confidence

high

Multiple independent sources agree; methodology is well-tested.

medium

Single authoritative source or moderate methodology uncertainty.

low

Sparse data, significant assumptions, or rapidly changing conditions.

Why a tier and not a 0–1 number? A spurious 0.85 is worse than a reasoned "high": agents will treat the number as more precise than it is. We will move to a numerical confidence score once the methodology for computing it is documented.
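As a rough illustration of the taxonomy above, the two enums and the provenance block could be modeled as follows. This is a sketch, not a copy of lib/tools/types.ts: the names SOURCE_TYPES, CONFIDENCE_TIERS, Provenance, and isProvenance are hypothetical, and the field list is taken from the sample response shown later on this page.

```typescript
// Hypothetical mirror of the provenance contract. The authoritative types
// live in lib/tools/types.ts and may differ in naming or optionality.
const SOURCE_TYPES = ["research", "derived", "computed", "estimated"] as const;
const CONFIDENCE_TIERS = ["high", "medium", "low"] as const;

type SourceType = (typeof SOURCE_TYPES)[number];
type ConfidenceTier = (typeof CONFIDENCE_TIERS)[number];

interface Provenance {
  last_updated: string; // ISO 8601 timestamp
  source_type: SourceType;
  confidence_tier: ConfidenceTier;
  dataset_version: string;
}

// Runtime guard: narrows an unknown JSON value to Provenance by checking
// each field against the enum values documented on this page.
function isProvenance(value: unknown): value is Provenance {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const lastUpdated = v["last_updated"];
  return (
    typeof lastUpdated === "string" &&
    !isNaN(Date.parse(lastUpdated)) &&
    typeof v["dataset_version"] === "string" &&
    (SOURCE_TYPES as readonly unknown[]).indexOf(v["source_type"]) !== -1 &&
    (CONFIDENCE_TIERS as readonly unknown[]).indexOf(v["confidence_tier"]) !== -1
  );
}
```

A guard like this rejects a numeric-looking tier such as "0.85", which matches the decision above to expose only qualitative tiers.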

Update Cadence by Dataset

Last-updated dates below are dataset-level. Per-record overrides are supported in the API; future migrations will populate them as individual chips, nodes, and packaging types refresh asynchronously. Cadence is a target, not a guarantee; the actual refresh history will appear on a planned /changelog page.

Dataset	Description	Last Updated	Target Cadence	Source Types
AI Accelerators (chipSpecs)	Per-chip cost breakdowns derived from Monte Carlo models against public teardown data and analyst reports.	2026-04-07	Monthly	derived, research
Foundry & Wafer Pricing (foundryData)	Wafer price ranges, defect density, NRE/mask-set costs, and node maturity status. Synthesized from TrendForce, Morgan Stanley, CSET, public filings.	2026-04-07	Quarterly	derived, research
Packaging & HBM Specs (packagingData)	Per-tech packaging cost benchmarks (CoWoS-S/L, EMIB, SoIC, FC-BGA, etc.) plus HBM2 → HBM4 cost-per-stack and bandwidth/capacity.	2026-04-07	Monthly	derived, research
HBM Market Analysis (hbmData)	9 sub-tables: accelerators, specs, market share, spot prices, leading indicators, qualification feed, revenue forecast, supplier revenue, validation checks.	2026-04-07	Monthly	research, derived
Supply-chain Headlines (marketPulse)	Curated supply-chain headlines with trend direction and impact analysis. Per-item dates parsed to ISO 8601 in the API.	-	Weekly	research
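A consumer that wants to flag data older than its target cadence could compare last_updated against a per-cadence day budget, roughly as sketched below. The day counts (7, 31, 92) and the names CADENCE_DAYS and isPastTargetCadence are assumptions made for illustration; this page does not define exact staleness thresholds.

```typescript
type Cadence = "Weekly" | "Monthly" | "Quarterly";

// Assumed day budgets per cadence; not defined by the API documentation.
const CADENCE_DAYS: Record<Cadence, number> = {
  Weekly: 7,
  Monthly: 31,
  Quarterly: 92,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when the data is older than its target cadence allows.
// `now` is passed in explicitly so the check is deterministic and testable.
function isPastTargetCadence(lastUpdated: string, cadence: Cadence, now: Date): boolean {
  const updatedMs = Date.parse(lastUpdated);
  if (isNaN(updatedMs)) {
    throw new Error("Unparseable last_updated: " + lastUpdated);
  }
  const ageDays = (now.getTime() - updatedMs) / MS_PER_DAY;
  return ageDays > CADENCE_DAYS[cadence];
}
```

Because cadence is a target, a true result here is best read as a prompt to double-check the data rather than as proof that it is wrong.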

Methodology Notes by Source Type

research

Records attributed directly to a public external publication. Primary sources include TrendForce quarterly reports, Morgan Stanley semiconductor research, Raymond James analyst notes, CSET, IEDM proceedings, and earnings releases. Each record carries a per-row source string in addition to the structured provenance block.

derived

Records produced by a Silicon Analysts model from public inputs. Examples: per-accelerator cost breakdowns combine Epoch AI Monte Carlo models with TrendForce and Raymond James inputs; wafer price ranges synthesize multiple foundry-pricing sources with the Murphy yield model. See the Semiconductor Cost Guide for the methodology in depth.

computed

Pure-function output from user-supplied inputs. The calculate_chip_cost tool is the only example today: given die dimensions and process parameters, it returns an estimated chip cost. No data lookup is involved at the record level, although input defaults pull from derived wafer-pricing data; under the conservative-pick rule, that data's freshness sets the ceiling on result freshness.
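This page does not spell out the conservative-pick rule. One plausible reading of "sets the ceiling on result freshness" is that a computed result reports the oldest last_updated among the data it depended on. The sketch below shows that reading only; the function name oldestLastUpdated is hypothetical and the real rule may cover more than timestamps.

```typescript
// Assumed interpretation of the conservative-pick rule: a result is only as
// fresh as its stalest input, so pick the oldest ISO 8601 timestamp.
function oldestLastUpdated(timestamps: string[]): string {
  if (timestamps.length === 0) {
    throw new Error("at least one timestamp is required");
  }
  for (const ts of timestamps) {
    if (isNaN(Date.parse(ts))) {
      throw new Error("Unparseable timestamp: " + ts);
    }
  }
  // reduce without an initial value starts from the first element,
  // which is safe here because the array is known to be non-empty.
  return timestamps.reduce((oldest, ts) =>
    Date.parse(ts) < Date.parse(oldest) ? ts : oldest
  );
}
```

Under this reading, a calculate_chip_cost result computed today from wafer-pricing defaults last refreshed on 2026-04-07 would still report 2026-04-07.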

estimated

Analyst judgment where data is sparse or unobservable. Rare today; reserved for fields like packaging cost on early-life tech where no published price exists. Always paired with a confidence_tier of low or medium.
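The pairing rule for estimated records can be expressed as a small check, for example when validating records before they are used downstream. This is a minimal sketch; isAllowedPairing is a hypothetical helper, not part of the API.

```typescript
type SourceType = "research" | "derived" | "computed" | "estimated";
type ConfidenceTier = "high" | "medium" | "low";

// Encodes the one pairing rule stated on this page: estimated records are
// only ever low or medium. All other combinations are treated as allowed.
function isAllowedPairing(source: SourceType, tier: ConfidenceTier): boolean {
  return !(source === "estimated" && tier === "high");
}
```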

Example: Provenance in a Tool Response

Every record across all 6 tools carries a provenance block of the same shape. Below is a representative get_accelerator_costs record (truncated).

{
  "chip": "NVIDIA B200",
  "vendor": "NVIDIA",
  "processNode": "TSMC N4P",
  "estMfgCostUsd": 8500,
  "estSellPriceUsd": 30000,
  "costBreakdown": { "logicDieCostUsd": 220, "hbmCostUsd": 4200, ... },
  "provenance": {
    "last_updated":   "2026-04-07T00:00:00.000Z",
    "source_type":    "derived",
    "confidence_tier": "high",
    "dataset_version": "chipSpecs-v1.0"
  }
}

See the Developer API page for full per-tool response shapes and the /api/v1 manifest for the machine-readable contract.
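Because the provenance block has the same shape on every record, a client can apply one trust filter across all tools. The sketch below keeps only records at or above a minimum tier; TIER_RANK, ProvenancedRecord, and filterByMinTier are hypothetical names, and the ordering low < medium < high is inferred from the tier descriptions rather than stated by the API.

```typescript
type ConfidenceTier = "high" | "medium" | "low";

// Assumed ordinal ranking so the qualitative tiers can be compared.
const TIER_RANK: Record<ConfidenceTier, number> = { low: 0, medium: 1, high: 2 };

// Minimal shape shared by any tool record that carries a provenance block.
interface ProvenancedRecord {
  provenance: {
    last_updated: string;
    source_type: "research" | "derived" | "computed" | "estimated";
    confidence_tier: ConfidenceTier;
    dataset_version: string;
  };
}

// Keeps records whose confidence tier is at least `minTier`.
// Generic over T so tool-specific fields (chip, vendor, ...) are preserved.
function filterByMinTier<T extends ProvenancedRecord>(records: T[], minTier: ConfidenceTier): T[] {
  return records.filter(
    (r) => TIER_RANK[r.provenance.confidence_tier] >= TIER_RANK[minTier]
  );
}
```

An agent might call filterByMinTier(records, "medium") before quoting costs and surface the low-tier records separately with a caveat.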

Developer API

Per-tool response shapes, MCP setup, API key management.

System Status

Live uptime and 90-day history for /api/v1, /api/mcp, and web.

Data Corrections Changelog

Coming soon — historical record of data updates and corrections.