Data Quality & Provenance
Every record returned by every Silicon Analysts tool — REST or MCP — carries a provenance block describing where the data came from, how confident we are in it, and when it was last refreshed. This page is the canonical reference for what those fields mean.
Provenance Taxonomy
Two enums classify every record: source_type (how the record was produced) and confidence_tier (how much to trust it). The type contract lives in lib/tools/types.ts.
source_type — how this record was produced
research: Sourced from a published external report (Morgan Stanley, TrendForce, public earnings, etc.). Direct attribution.
derived: Computed by Silicon Analysts from public inputs via a documented methodology (Monte Carlo cost models, yield equations, etc.).
computed: Pure-function output from user-supplied inputs (e.g. calculate_chip_cost results). No data lookup involved.
estimated: Analyst judgment where data is sparse or unobservable.
confidence_tier — qualitative confidence
high: Multiple independent sources agree; methodology is well-tested.
medium: Single authoritative source or moderate methodology uncertainty.
low: Sparse data, significant assumptions, or rapidly changing.
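To make the two enums concrete, here is a minimal TypeScript sketch of the provenance block together with a runtime guard for untyped JSON. It is an illustration only: the authoritative contract lives in lib/tools/types.ts and may differ. The field names come from the example record on this page; the lowercase spellings of the enum values and the helper names `isOneOf` and `isProvenance` are assumptions made for this sketch.

```typescript
// Hypothetical sketch; the real definitions live in lib/tools/types.ts.
type SourceType = "research" | "derived" | "computed" | "estimated";
type ConfidenceTier = "high" | "medium" | "low";

interface Provenance {
  last_updated: string; // ISO 8601 timestamp
  source_type: SourceType;
  confidence_tier: ConfidenceTier;
  dataset_version: string;
}

const SOURCE_TYPES: readonly SourceType[] = ["research", "derived", "computed", "estimated"];
const CONFIDENCE_TIERS: readonly ConfidenceTier[] = ["high", "medium", "low"];

// True when `value` is a string that appears in `allowed`.
function isOneOf<T extends string>(allowed: readonly T[], value: unknown): value is T {
  return typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1;
}

// Runtime guard for provenance blocks parsed from JSON.
function isProvenance(value: unknown): value is Provenance {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.last_updated === "string" &&
    !isNaN(Date.parse(v.last_updated)) &&
    typeof v.dataset_version === "string" &&
    isOneOf(SOURCE_TYPES, v.source_type) &&
    isOneOf(CONFIDENCE_TIERS, v.confidence_tier)
  );
}
```

A guard like this rejects a numeric confidence such as 0.85, since only the three qualitative tiers are accepted.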
Confidence is qualitative for now. A numeric score such as 0.85 is worse than a reasoned "high" because agents will treat the number as more precise than it is. We will tighten to a numerical confidence once the methodology for computing it is documented.
Update Cadence by Dataset
Last-updated dates below are dataset-level. Per-record overrides are supported in the API; future migrations will populate them as individual chips, nodes, and packaging types refresh asynchronously. Cadence is a target; refresh history will appear on a planned /changelog page.
| Dataset | Last Updated | Target Cadence | Source Types |
|---|---|---|---|
| AI Accelerators (chipSpecs): Per-chip cost breakdowns derived from Monte Carlo models against public teardown data and analyst reports. | 2026-04-07 | Monthly | derived, research |
| Foundry & Wafer Pricing (foundryData): Wafer price ranges, defect density, NRE/mask-set costs, and node maturity status. Synthesized from TrendForce, Morgan Stanley, CSET, public filings. | 2026-04-07 | Quarterly | derived, research |
| Packaging & HBM Specs (packagingData): Per-tech packaging cost benchmarks (CoWoS-S/L, EMIB, SoIC, FC-BGA, etc.) plus HBM2 → HBM4 cost-per-stack and bandwidth/capacity. | 2026-04-07 | Monthly | derived, research |
| HBM Market Analysis (hbmData): 9 sub-tables: accelerators, specs, market share, spot prices, leading indicators, qualification feed, revenue forecast, supplier revenue, validation checks. | 2026-04-07 | Monthly | research, derived |
| Supply-chain Headlines (marketPulse): Curated supply-chain headlines with trend direction and impact analysis. Per-item dates parsed to ISO 8601 in the API. | n/a | Weekly | research |
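A client can compare a dataset's last-updated date against its target cadence to flag data that may be overdue for a refresh. The sketch below is one way to do that in TypeScript. The `Cadence` type, the `isOverdue` helper, and the day counts assigned to each cadence (7, 31, 92) are assumptions for illustration; the page does not define exact refresh windows.

```typescript
type Cadence = "weekly" | "monthly" | "quarterly";

// Assumed upper bounds in days for each cadence; not defined by the API.
const CADENCE_DAYS: Record<Cadence, number> = { weekly: 7, monthly: 31, quarterly: 92 };

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when more days have passed since `lastUpdated`
// than the assumed window for `cadence` allows.
function isOverdue(lastUpdated: string, cadence: Cadence, now: Date = new Date()): boolean {
  const updated = Date.parse(lastUpdated);
  if (isNaN(updated)) throw new Error(`Invalid date: ${lastUpdated}`);
  const ageDays = (now.getTime() - updated) / MS_PER_DAY;
  return ageDays > CADENCE_DAYS[cadence];
}

// Example: a monthly dataset last updated 2026-04-07, checked on 2026-06-01.
const overdue = isOverdue("2026-04-07", "monthly", new Date("2026-06-01T00:00:00Z"));
```

Because cadence is a target rather than a guarantee, an overdue result is better treated as a prompt to check the dataset than as an error.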
Methodology Notes by Source Type
Research
Records attributed directly to a public external publication. Primary sources include TrendForce quarterly reports, Morgan Stanley semiconductor research, Raymond James analyst notes, CSET, IEDM proceedings, and earnings releases. Each record carries a per-row source string in addition to the structured provenance block.
Derived
Records produced by a Silicon Analysts model from public inputs. Examples: per-accelerator cost breakdowns combine Epoch AI Monte Carlo models with TrendForce and Raymond James inputs; wafer price ranges synthesize multiple foundry-pricing sources with the Murphy yield model. See the Semiconductor Cost Guide for the methodology in depth.
Computed
Pure-function output from user-supplied inputs. The calculate_chip_cost tool is the only example today: given die dimensions and process parameters, it returns an estimated chip cost. No data lookup is involved at the record level. Input defaults, however, pull from derived wafer-pricing data, so that data's freshness sets the ceiling on result freshness via the conservative-pick rule.
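The conservative-pick rule is not spelled out on this page. One plausible reading, assumed here, is that a computed result reports the oldest timestamp among the data it depends on. The TypeScript sketch below illustrates that reading only; `conservativeLastUpdated` is a hypothetical helper, not part of the API.

```typescript
// Assumption: "conservative pick" means the oldest timestamp among all inputs wins.
function conservativeLastUpdated(timestamps: string[]): string {
  if (timestamps.length === 0) throw new Error("At least one timestamp is required");
  let oldest = timestamps[0];
  for (const ts of timestamps) {
    const t = Date.parse(ts);
    if (isNaN(t)) throw new Error(`Invalid timestamp: ${ts}`);
    // Keep whichever timestamp is earliest.
    if (t < Date.parse(oldest)) oldest = ts;
  }
  return oldest;
}

// Example: a computation run on 2026-04-10 using wafer-pricing defaults from 2026-04-07.
const resultFreshness = conservativeLastUpdated([
  "2026-04-10T00:00:00.000Z",
  "2026-04-07T00:00:00.000Z",
]);
```

Under this reading the result is never reported as fresher than the wafer-pricing defaults it relied on.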
Estimated
Analyst judgment where data is sparse or unobservable. Rare today; reserved for fields like packaging cost on early-life tech where no published price exists. Always paired with a confidence_tier of low or medium.
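The pairing rule for estimated records can be checked mechanically. The TypeScript sketch below is a minimal illustration; `violatesEstimatedRule` is a hypothetical helper, and the type aliases restate the enum values used on this page rather than importing the real contract.

```typescript
type SourceType = "research" | "derived" | "computed" | "estimated";
type ConfidenceTier = "high" | "medium" | "low";

interface ProvenanceTags {
  source_type: SourceType;
  confidence_tier: ConfidenceTier;
}

// An estimated record must carry low or medium confidence, never high.
function violatesEstimatedRule(p: ProvenanceTags): boolean {
  return p.source_type === "estimated" && p.confidence_tier === "high";
}

const suspicious = violatesEstimatedRule({ source_type: "estimated", confidence_tier: "high" });
const acceptable = violatesEstimatedRule({ source_type: "estimated", confidence_tier: "medium" });
```

A consumer could run a check like this over incoming records and treat any violation as a data-quality bug worth reporting.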
Example: Provenance in a Tool Response
Every record across all 6 tools carries the same shape. Below is a representative get_accelerator_costs record (truncated).
```json
{
  "chip": "NVIDIA B200",
  "vendor": "NVIDIA",
  "processNode": "TSMC N4P",
  "estMfgCostUsd": 8500,
  "estSellPriceUsd": 30000,
  "costBreakdown": { "logicDieCostUsd": 220, "hbmCostUsd": 4200, ... },
  "provenance": {
    "last_updated": "2026-04-07T00:00:00.000Z",
    "source_type": "derived",
    "confidence_tier": "high",
    "dataset_version": "chipSpecs-v1.0"
  }
}
```

See the Developer API page for full per-tool response shapes and the /api/v1 manifest for the machine-readable contract.
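On the consumer side, an agent may want to keep only records at or above a chosen confidence tier. The TypeScript sketch below shows one way to do that against the provenance shape in the example record. The `filterByMinimumTier` helper, the numeric ranks, and the second sample record are hypothetical and exist only for illustration.

```typescript
type ConfidenceTier = "high" | "medium" | "low";

interface WithProvenance {
  provenance: { confidence_tier: ConfidenceTier };
}

// Ranks exist only to compare tiers; they are not confidence scores.
const TIER_RANK: Record<ConfidenceTier, number> = { low: 0, medium: 1, high: 2 };

// Keeps records whose confidence tier is at least `minimum`.
function filterByMinimumTier<T extends WithProvenance>(records: T[], minimum: ConfidenceTier): T[] {
  return records.filter((r) => TIER_RANK[r.provenance.confidence_tier] >= TIER_RANK[minimum]);
}

const sampleRecords = [
  { chip: "NVIDIA B200", provenance: { confidence_tier: "high" as ConfidenceTier } },
  // Hypothetical record, included only to show filtering.
  { chip: "hypothetical-chip", provenance: { confidence_tier: "low" as ConfidenceTier } },
];

const trusted = filterByMinimumTier(sampleRecords, "medium");
```

Comparing by rank keeps the tiers qualitative while still letting callers set a threshold.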