Building a Design System: Component Library, Token System, and Versioning Strategy

We had 4 product teams, 3 different button components, and a modal that looked different on every page. The final straw was a design review where our CPO put two screens side by side and asked, “Are these the same product?” They were. That’s when we got the mandate to build a design system.

Eight months later, we had a system serving 4 teams across 6 applications, with 87% component adoption and a token architecture that survived a complete rebrand. But the path there was messy. We made bad API decisions that forced a major version bump at the worst possible time, picked a token structure that couldn’t handle theming until we rewrote it, and almost killed adoption by shipping breaking changes in a patch release.

Token Architecture: Two Rewrites to Get It Right

Most design system articles start with components. We made that mistake too. Our first attempt was a component library with hardcoded colors — bg-blue-500 scattered across 40 components. When the brand team changed the primary color from #0070f3 to #0055d4, we had a 200-line diff touching every component file.

Tokens should come first. They’re the foundation everything else sits on.

Attempt 1: Flat Tokens (Failed at Theming)

Our first token file looked like every tutorial example:

{
  "color": {
    "primary": { "$value": "#0055d4" },
    "secondary": { "$value": "#6c47ff" },
    "error": { "$value": "#dc2626" },
    "background": { "$value": "#ffffff" },
    "text": { "$value": "#111827" }
  },
  "spacing": {
    "xs": { "$value": "4px" },
    "sm": { "$value": "8px" },
    "md": { "$value": "16px" },
    "lg": { "$value": "24px" }
  }
}

This worked for a week — until someone asked about dark mode. With flat tokens, dark mode means duplicating every color with a -dark suffix and conditionally switching. We tried it. The result was components littered with isDark ? tokens.backgroundDark : tokens.background. Every new component needed to handle this branching, and developers forgot half the time.

Attempt 2: Three-Tier Tokens (What Actually Scaled)

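We restructured into three tiers after studying how Salesforce Lightning and Adobe Spectrum handle this. The excerpt below is abbreviated: further primitive steps (such as blue.400 and the red scale) and the semantic spacing and radius scales are omitted: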

{
  "primitive": {
    "blue": {
      "50":  { "$value": "#eff6ff" },
      "500": { "$value": "#0055d4" },
      "600": { "$value": "#0044aa" },
      "900": { "$value": "#1e3a5f" }
    },
    "neutral": {
      "0":   { "$value": "#ffffff" },
      "50":  { "$value": "#f9fafb" },
      "900": { "$value": "#111827" },
      "1000": { "$value": "#000000" }
    }
  },
  "semantic": {
    "color": {
      "bg-primary":   { "$value": "{primitive.neutral.0}" },
      "bg-secondary": { "$value": "{primitive.neutral.50}" },
      "text-primary": { "$value": "{primitive.neutral.900}" },
      "interactive":  { "$value": "{primitive.blue.500}" },
      "interactive-hover": { "$value": "{primitive.blue.600}" },
      "danger":       { "$value": "{primitive.red.500}" }
    }
  },
  "component": {
    "button": {
      "primary-bg":    { "$value": "{semantic.color.interactive}" },
      "primary-hover":  { "$value": "{semantic.color.interactive-hover}" },
      "primary-text":  { "$value": "{primitive.neutral.0}" },
      "border-radius": { "$value": "{semantic.radius.md}" },
      "padding-x":    { "$value": "{semantic.spacing.lg}" },
      "padding-y":    { "$value": "{semantic.spacing.sm}" }
    }
  }
}

The key insight: primitives never appear in component code. Components reference component or semantic tokens, which ultimately resolve to primitives. Dark mode becomes a semantic-layer swap: the primitive palette stays identical, and you just remap bg-primary from neutral.0 to neutral.1000.

const themes = {
  light: {
    "color-bg-primary": "var(--primitive-neutral-0)",
    "color-bg-secondary": "var(--primitive-neutral-50)",
    "color-text-primary": "var(--primitive-neutral-900)",
    "color-interactive": "var(--primitive-blue-500)",
  },
  dark: {
    "color-bg-primary": "var(--primitive-neutral-1000)",
    "color-bg-secondary": "var(--primitive-neutral-900)",
    "color-text-primary": "var(--primitive-neutral-0)",
    "color-interactive": "var(--primitive-blue-400)",
  },
} as const;

Components don’t know which theme is active. They reference --color-bg-primary and get the right value. When we did the rebrand six months in, we changed 14 primitive values and everything propagated automatically. Zero component changes.
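A minimal sketch of how a theme map like the one above could be turned into CSS, assuming one rule per theme. The themeToCss helper, the :root default, and the [data-theme="dark"] selector are illustrative assumptions, not part of the system described here.

```typescript
// Hypothetical helper: emits one CSS rule that sets every semantic token of a theme.
type Theme = Record<string, string>;

const light: Theme = {
  "color-bg-primary": "var(--primitive-neutral-0)",
  "color-text-primary": "var(--primitive-neutral-900)",
};

const dark: Theme = {
  "color-bg-primary": "var(--primitive-neutral-1000)",
  "color-text-primary": "var(--primitive-neutral-0)",
};

function themeToCss(selector: string, theme: Theme): string {
  const declarations = Object.entries(theme)
    .map(([name, value]) => `  --${name}: ${value};`)
    .join("\n");
  return `${selector} {\n${declarations}\n}`;
}

// Light is the default; dark applies wherever an ancestor carries data-theme="dark".
// The selector choice is an assumption made for this sketch.
const stylesheet = [
  themeToCss(":root", light),
  themeToCss('[data-theme="dark"]', dark),
].join("\n\n");
```

Because components only read the semantic names, switching themes comes down to changing which of these rules matches.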

Style Dictionary Transform Pipeline

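We used Style Dictionary to transform tokens into platform outputs. Our config produced CSS custom properties, TypeScript constants, and a Tailwind theme extension from a single source. Note that custom/tailwind is not a built-in format; it has to be registered with Style Dictionary separately: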

// style-dictionary.config.js
module.exports = {
  source: ["tokens/**/*.json"],
  platforms: {
    css: {
      transformGroup: "css",
      buildPath: "dist/css/",
      files: [{
        destination: "variables.css",
        format: "css/variables",
        options: { outputReferences: true },
      }],
    },
    ts: {
      transformGroup: "js",
      buildPath: "dist/ts/",
      files: [{
        destination: "tokens.ts",
        format: "javascript/es6",
      }],
    },
    tailwind: {
      transformGroup: "js",
      buildPath: "dist/tailwind/",
      files: [{
        destination: "theme.js",
        format: "custom/tailwind",
      }],
    },
  },
};

The outputReferences: true flag is critical — it preserves the alias chain in CSS output, so --button-primary-bg resolves to var(--color-interactive) rather than a hardcoded hex. This means you can inspect token relationships in DevTools, which saved us hours of debugging.
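To make the difference concrete, here is a small sketch of what preserving versus flattening an alias chain means. It is not Style Dictionary's implementation: the flattened token map, resolve, cssName, and cssDeclaration are hypothetical, and the variable names keep the full token path rather than the shorter names used in the prose.

```typescript
// Token graph flattened to dotted paths; values in braces are aliases.
const tokens: Record<string, string> = {
  "primitive.blue.500": "#0055d4",
  "semantic.color.interactive": "{primitive.blue.500}",
  "component.button.primary-bg": "{semantic.color.interactive}",
};

const REFERENCE = /^\{(.+)\}$/;

// Follows the alias chain until it reaches a literal value.
function resolve(path: string, seen: Set<string> = new Set()): string {
  if (seen.has(path)) throw new Error(`Circular reference at ${path}`);
  seen.add(path);
  const raw = tokens[path];
  if (raw === undefined) throw new Error(`Unknown token: ${path}`);
  const target = REFERENCE.exec(raw)?.[1];
  return target === undefined ? raw : resolve(target, seen);
}

// Hypothetical naming scheme: dotted path -> CSS custom property name.
function cssName(path: string): string {
  return `--${path.replace(/\./g, "-")}`;
}

// With outputReferences on, an alias is emitted as var(...) pointing one level up;
// with it off, the chain is flattened to the final literal.
function cssDeclaration(path: string, outputReferences: boolean): string {
  const raw = tokens[path];
  if (raw === undefined) throw new Error(`Unknown token: ${path}`);
  const target = REFERENCE.exec(raw)?.[1];
  const value =
    target !== undefined && outputReferences
      ? `var(${cssName(target)})`
      : resolve(path);
  return `${cssName(path)}: ${value};`;
}

console.log(cssDeclaration("component.button.primary-bg", true));
// --component-button-primary-bg: var(--semantic-color-interactive);
console.log(cssDeclaration("component.button.primary-bg", false));
// --component-button-primary-bg: #0055d4;
```

The first form is the one that shows up as a clickable chain in DevTools; the second loses the relationship between the button token and the semantic token.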

Component API Design: The Decisions That Mattered

We built 34 components over 8 months. The API decisions we made in the first month shaped (and sometimes haunted) us for the rest.

Polymorphic `as` Prop vs. Dedicated Components

Our first Button component had an as prop for rendering as different elements:

// Version 1: Polymorphic — seemed flexible
<Button as="a" href="/dashboard">Go to Dashboard</Button>
<Button as="button" type="submit">Submit</Button>
<Button as={Link} to="/settings">Settings</Button>

The problem: TypeScript inference for polymorphic components is a nightmare. The href prop should only be valid when as="a", to only when as={Link}, type only when as="button". We spent 3 days on the type definition and it still had edge cases where autocompletion broke.

We switched to dedicated components:

// Version 2: Explicit — boring but correct
<ButtonLink href="/dashboard">Go to Dashboard</ButtonLink>
<Button type="submit">Submit</Button>
<ButtonRouterLink to="/settings">Settings</ButtonRouterLink>

Each component has tight, correct types, so there is no way to pass href to a <button>. The trade-off is more exports, but IDE autocompletion works perfectly and wrong usage is a compile error, not a runtime surprise.
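The typing benefit can be shown without React. The sketch below uses hypothetical, framework-free stand-ins (renderButton, renderButtonLink, and their prop interfaces are assumptions for illustration): each function accepts only the props that make sense for its element.

```typescript
// Props shared by every button-like component in this sketch.
interface BaseProps {
  label: string;
  variant?: "primary" | "secondary";
}

// Only a real <button> gets `type`.
interface ButtonProps extends BaseProps {
  type?: "button" | "submit" | "reset";
}

// Only a link gets `href`, and it is required.
interface ButtonLinkProps extends BaseProps {
  href: string;
}

// Note: no HTML escaping here; this is a typing demo, not a renderer.
function renderButton({ label, variant = "primary", type = "button" }: ButtonProps): string {
  return `<button type="${type}" class="btn btn-${variant}">${label}</button>`;
}

function renderButtonLink({ label, variant = "primary", href }: ButtonLinkProps): string {
  return `<a href="${href}" class="btn btn-${variant}">${label}</a>`;
}

renderButton({ label: "Submit", type: "submit" });
renderButtonLink({ label: "Go to Dashboard", href: "/dashboard" });

// The following line would be rejected by the compiler, because
// 'href' does not exist in type 'ButtonProps':
// renderButton({ label: "Oops", href: "/dashboard" });
```

With one prop interface per component there are no conditional types to maintain, which is where the polymorphic version spent its complexity.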

Compound Components for Complex State

For components with shared internal state — Tabs, Accordion, Dropdown — we used compound components with context. But we added one thing most examples skip: controlled and uncontrolled modes.

import {
  createContext,
  useCallback,
  useContext,
  useState,
  type ReactNode,
} from "react";

interface TabsContextValue {
  value: string;
  onChange: (value: string) => void;
}

// null outside a <Tabs> tree, so Tab and TabPanel can detect misuse
const TabsContext = createContext<TabsContextValue | null>(null);

interface TabsProps {
  value?: string;             // controlled
  defaultValue?: string;      // uncontrolled
  onValueChange?: (value: string) => void;
  children: ReactNode;
}

function Tabs({ value, defaultValue, onValueChange, children }: TabsProps) {
  const [internalValue, setInternalValue] = useState(defaultValue ?? "");
  const isControlled = value !== undefined;
  const currentValue = isControlled ? value : internalValue;

  const handleChange = useCallback((newValue: string) => {
    if (!isControlled) setInternalValue(newValue);
    onValueChange?.(newValue);
  }, [isControlled, onValueChange]);

  return (
    <TabsContext.Provider value={{ value: currentValue, onChange: handleChange }}>
      <div>{children}</div>
    </TabsContext.Provider>
  );
}

// role="tablist" may only contain tabs, so panels render outside this wrapper
function TabList({ children }: { children: ReactNode }) {
  return <div role="tablist">{children}</div>;
}

function Tab({ value, children }: { value: string; children: ReactNode }) {
  const ctx = useContext(TabsContext);
  if (!ctx) throw new Error("Tab must be used within Tabs");

  return (
    <button
      role="tab"
      aria-selected={ctx.value === value}
      onClick={() => ctx.onChange(value)}
    >
      {children}
    </button>
  );
}

function TabPanel({ value, children }: { value: string; children: ReactNode }) {
  const ctx = useContext(TabsContext);
  if (!ctx) throw new Error("TabPanel must be used within Tabs");
  if (ctx.value !== value) return null;

  return <div role="tabpanel">{children}</div>;
}

Supporting both modes matters because some consumers manage tab state in URL params (controlled), while others just need a tabbed UI (uncontrolled). If you force one pattern, half your users will build a wrapper around your component.

The Styling Decision: Why We Chose CSS Modules + Token Variables

We evaluated four approaches:

| Approach | Pros | Cons |
| --- | --- | --- |
| Tailwind classes | Fast iteration, familiar | Hard to override, utility explosion in complex components |
| CSS-in-JS (styled-components) | Colocated, dynamic | Runtime cost (~8KB), SSR complexity, React 19 concerns |
| CSS Modules + custom properties | Zero runtime, overridable | Separate files, less dynamic |
| Vanilla Extract | Type-safe, zero runtime | Build tooling requirement, smaller ecosystem |

We chose CSS Modules with token-based custom properties. The deciding factor: consumers need to override styles without fighting specificity wars.

/* Button.module.css */
.root {
  display: inline-flex;
  align-items: center;
  gap: var(--button-gap, var(--spacing-sm));
  padding: var(--button-padding-y) var(--button-padding-x);
  border-radius: var(--button-border-radius);
  font-weight: 600;
  transition: background-color 150ms ease;
}

.primary {
  background: var(--button-primary-bg);
  color: var(--button-primary-text);
}

.primary:hover {
  background: var(--button-primary-hover);
}

Most visual properties reference a CSS custom property, and optional ones such as --button-gap carry a fallback. Consumers can override any specific button token without touching the component:

/* In consuming application */
.checkout-cta {
  --button-primary-bg: var(--color-success);
  --button-primary-hover: var(--color-success-hover);
  --button-padding-x: var(--spacing-xl);
}

No !important, no specificity hacks. This pattern came from studying how Radix UI and Open Props handle customization.

Versioning: How a Patch Release Almost Killed Adoption

We followed semver from day one. Major for breaking changes, minor for new components, patch for bug fixes. Simple, right?

The Incident

In v1.4.2 (a patch release), we fixed an accessibility bug in the Select component — the dropdown wasn’t getting focus when opened via keyboard. The fix added autoFocus to the listbox element. Correct behavior per WAI-ARIA.

What we didn’t anticipate: 3 teams had built custom focus management on top of our Select. Our “fix” meant focus jumped twice — once from their code, once from ours. Two teams had e2e test suites that broke. One team’s form wizard started skipping steps because their focus logic conflicted with ours.

A bug fix in a patch release broke production for 3 teams. Technically we were right — it was a bug fix. Practically, any behavioral change that existing code depends on is a breaking change, regardless of what semver says.

What We Changed

After that incident, we adopted stricter rules:

Any behavioral change gets a minor version, even bug fixes. We tag the PR as fix but release as minor. The only patches are typo fixes in docs, dependency bumps, and build tooling changes.
Changesets over Lerna. We switched from Lerna to Changesets because it forces developers to describe the user-facing impact of every PR:

---
"@acme/design-system": minor
---

Select: dropdown now receives focus when opened via keyboard.

**Migration**: If you have custom focus management on the Select component,
you may need to remove your manual `focus()` call to avoid double-focus.

Canary releases for risky changes. Any PR that touches event handling, focus management, or layout gets a canary release first:

# CI publishes under the canary dist-tag instead of latest
npm publish --tag canary

# Teams can test before it hits latest
npm install @acme/design-system@canary

Codemods for breaking changes. When we did ship v2.0.0 (renamed 12 props for consistency), we shipped a codemod alongside it:

npx @acme/design-system-codemods v2 ./src

The codemod handled 90% of migrations automatically. Teams that would have delayed upgrading for months did it in a day.
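The release rules in this section are simple enough to write down as a policy function. This is a sketch under stated assumptions: the ChangeKind and Area labels, bumpFor, and needsCanary are hypothetical names chosen for illustration, not part of Changesets or of any tooling described above.

```typescript
// Hypothetical labels a PR author might pick when writing a changeset.
type ChangeKind =
  | "breaking"
  | "feature"
  | "behavioral-fix"
  | "docs"
  | "dependency"
  | "build";

type Bump = "major" | "minor" | "patch";

// Any behavioral change, including a bug fix, is at least a minor release.
// Patches are reserved for changes consumers cannot observe at runtime.
function bumpFor(kind: ChangeKind): Bump {
  switch (kind) {
    case "breaking":
      return "major";
    case "feature":
    case "behavioral-fix":
      return "minor";
    case "docs":
    case "dependency":
    case "build":
      return "patch";
  }
}

// Areas where a change goes out as a canary before it reaches latest.
type Area = "event-handling" | "focus-management" | "layout" | "other";

function needsCanary(areas: Area[]): boolean {
  return areas.some((area) => area !== "other");
}
```

Under this policy the Select focus fix would have been bumpFor("behavioral-fix"), a minor release, and needsCanary(["focus-management"]) would have sent it to a canary first.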

Measuring Adoption: Beyond “Who’s Using It”

We tracked three metrics that actually drove decisions:

1. Component Coverage

A script that scans consuming repos for native HTML elements that have design system equivalents:

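// Simplified version of our coverage scanner
interface CoverageIssue {
  file: string;
  line: number;
  htmlTag: string;
  suggestion: string;
}

const COMPONENT_MAP: Record<string, string> = {
  "<button": "Button/ButtonLink",
  "<input": "Input/TextField",
  "<select": "Select",
  "<dialog": "Modal/Dialog",
  "<table": "DataTable",
};

// Assumption for this sketch: design system sources live under this path
const isInsideDesignSystemComponent = (filePath: string): boolean =>
  filePath.includes("/design-system/");

// 1-based line number of a character offset
const getLineNumber = (content: string, index: number): number =>
  content.slice(0, index).split("\n").length;

function scanFile(content: string, filePath: string): CoverageIssue[] {
  // The design system's own files are allowed to use native elements
  if (isInsideDesignSystemComponent(filePath)) return [];

  const issues: CoverageIssue[] = [];
  for (const [htmlTag, dsComponent] of Object.entries(COMPONENT_MAP)) {
    // Require whitespace, ">" or "/" after the tag name so "<input/>" is caught
    const regex = new RegExp(`${htmlTag}[\\s>/]`, "g");
    let match: RegExpExecArray | null;
    while ((match = regex.exec(content)) !== null) {
      issues.push({
        file: filePath,
        line: getLineNumber(content, match.index),
        htmlTag,
        suggestion: dsComponent,
      });
    }
  }
  return issues;
}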

We ran this weekly and published results per team. Starting at 34% coverage, we hit 87% in 6 months. The remaining 13% was mostly edge cases where teams legitimately needed custom elements (canvas-heavy features, third-party widget wrappers).

2. Token Drift

A lint rule that flags hardcoded values where tokens exist:

// ESLint rule: no-hardcoded-colors
// Flags: style={{ color: '#333' }}
// Suggests: style={{ color: 'var(--color-text-primary)' }}

Token drift dropped from 240 instances to 18 over 4 months. The remaining ones were all in test fixtures and storybook decorators — acceptable.
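The detection idea behind such a rule can be sketched as a plain function. This is a simplified stand-in, not the ESLint rule itself: findHardcodedColors and Finding are hypothetical names, it only looks for hex colors, and a text scan like this will also flag hex-like strings that are not colors (for example a URL fragment such as #abc).

```typescript
interface Finding {
  line: number;   // 1-based line number
  value: string;  // the hardcoded color that was found
}

// Matches 3-, 4-, 6-, or 8-digit hex colors such as #333 or #0055d4.
const HEX_COLOR = /#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b/g;

function findHardcodedColors(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    // With the g flag, String.prototype.match returns every match on the line.
    const matches = text.match(HEX_COLOR) ?? [];
    for (const value of matches) {
      findings.push({ line: index + 1, value });
    }
  });
  return findings;
}

const sample = [
  "const a = { color: '#333' };",
  "const b = { color: 'var(--color-text-primary)' };",
  "const c = { background: '#0055d4' };",
].join("\n");

console.log(findHardcodedColors(sample));
// [ { line: 1, value: '#333' }, { line: 3, value: '#0055d4' } ]
```

A real lint rule would work on the AST rather than on raw text, which avoids most of the false positives mentioned above.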

3. Override Frequency

We instrumented our CSS to detect when consumers override design system custom properties. High override frequency on a specific token means the default isn’t working for real use cases — it’s a signal to revisit the token value or add a variant.

function trackOverrides(componentName: string, element: HTMLElement) {
  const computedStyle = getComputedStyle(element);
  const dsTokens = Array.from(document.styleSheets)
    .flatMap(sheet => {
      // Reading cssRules on a cross-origin stylesheet throws a SecurityError
      try {
        return Array.from(sheet.cssRules);
      } catch {
        return [];
      }
    })
    .filter(rule => rule.cssText.includes(`--${componentName}-`));

  // Compare computed values against design system defaults
  // Report significant deviations to analytics
}

When we saw that 60% of consumers overrode --button-padding-x to a larger value, we added a size="lg" variant instead of leaving everyone to hack the token.
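The comparison step can be sketched independently of the DOM. In the sketch below, findOverrides and overrideFrequency are hypothetical helpers, and the observed values are passed in as plain objects; in a browser they could come from getComputedStyle(element).getPropertyValue(name). The default values shown are made up for the example.

```typescript
type TokenValues = Record<string, string>;

// Names of tokens whose observed value differs from the design system default.
function findOverrides(defaults: TokenValues, observed: TokenValues): string[] {
  return Object.keys(defaults).filter((name) => {
    const expected = (defaults[name] ?? "").trim();
    const actual = (observed[name] ?? "").trim();
    // An empty observed value means the property was not set on this element.
    return actual !== "" && actual !== expected;
  });
}

// Share of sampled elements (0 to 1) that override each token.
function overrideFrequency(
  defaults: TokenValues,
  samples: TokenValues[],
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const sample of samples) {
    for (const name of findOverrides(defaults, sample)) {
      counts[name] = (counts[name] ?? 0) + 1;
    }
  }
  const frequency: Record<string, number> = {};
  for (const [name, count] of Object.entries(counts)) {
    frequency[name] = count / samples.length;
  }
  return frequency;
}

// Illustrative defaults; the values are assumptions for this example.
const buttonDefaults: TokenValues = {
  "--button-padding-x": "24px",
  "--button-primary-bg": "#0055d4",
};
```

A token that shows up with a high frequency here is a candidate for a new variant or a different default, as with the padding example above.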

What I’d Do Differently

Start with fewer components. We launched with 20 components. Half of them had design issues that required breaking API changes within the first two months. If we’d launched with 8 well-tested components and added the rest incrementally, we’d have avoided the painful v2.0.0.

Invest in a Figma-to-token pipeline from day one. We manually synced Figma variables and JSON tokens for 4 months. They drifted constantly. When we finally set up Tokens Studio to sync Figma variables directly to our token JSON in Git, the “designer changed it but forgot to tell engineering” category of bugs dropped to zero.

Don’t build what Radix/Headless UI already solved. We built our own headless Combobox, Tooltip, and Dialog from scratch. Each one had accessibility issues that took weeks to fix. For complex interactive patterns (focus trapping, scroll locking, portal rendering), use a headless library and build your styled layer on top. Save your engineering time for the components that are truly unique to your product.

Make the design system a product, not a project. Projects end. Our system started losing momentum around month 5 when the initial mandate energy faded. What saved it was treating the design system like a product: dedicated backlog, sprint reviews with consuming teams, a Slack channel where teams could request components, and a quarterly roadmap. The moment you stop actively maintaining it, teams start building around it instead of with it.