Semantic ID Generator - NPM Package

Introduction

Semantic ID Generator is a Node.js package for minting human-readable, machine-understandable identifiers composed of configurable “compartments.” Each compartment has a semantic meaning and a generation strategy so IDs stay unique, recognizable, and consistent across systems.

Latest Updates (v1.3.0)

Quick Start

Install

npm install semantic-id-generator

Basic usage

import SemanticIDGenerator from 'semantic-id-generator';

const generator = new SemanticIDGenerator();
const id = generator.generateSemanticID('person');
console.log(id);

Runnable Code Samples

Every major feature now has a self-contained sample under code_samples/:

Each folder contains a README.md plus a runnable script, e.g.

node code_samples/domain-presets/generate-contract-id/sample.js
node --import ./scripts/register-ts-node.mjs code_samples/typescript-tooling/builder-pattern/sample.ts

MCP Integration (Server + Client)

Need to let AI copilots such as Cursor or Claude invoke the Semantic ID Generator directly? A dedicated MCP companion package now lives under packages/semantic-id-generator-mcp. It keeps the core library lean (no MCP dependencies) while providing:

Usage

cd packages/semantic-id-generator-mcp
npm install

# Start the stdio server (register this command inside Cursor/Claude)
npx --package semantic-id-generator-mcp semantic-id-generator-mcp-server --default-preset dataset

# Optional: drive the tools from a terminal shell
npx semantic-id-generator-mcp-client --tool generate-semantic-id --args '{"dataConceptName":"contract","preset":"contract"}'

The server follows the latest MCP specification so any compatible AI model client can connect, discover the tools, and receive structured responses.

Testing

# Inside packages/semantic-id-generator-mcp
npm test              # runs in-memory + stdio MCP integration tests

The MCP suite covers protocol-level behavior (tool/resource calls) and also spawns the actual stdio server to verify initialization works exactly the way Cursor or Claude would invoke it.

Key Features

Domain Presets & Schema Export

Pick from 30 presets—Person, Contract, Dataset, Device, FinancialAccount, and more—to bootstrap identifiers and schemas instantly.

| Preset | Schema | Subclass of |
| --- | --- | --- |
| person | Person | schema:Person |
| individual_customer | IndividualCustomer | schema:Person |
| corporate_customer | CorporateCustomer | schema:Organization |
| employee | Employee | schema:Person |
| supplier | Supplier | schema:Organization |
| partner | Partner | schema:Organization |
| organization | Organization | schema:Organization |
| department | Department | schema:Organization |
| role | Role | schema:Role |
| product | Product | schema:Product |
| product_category | ProductCategory | schema:CategoryCodeSet |
| device | Device | schema:Product |
| asset | Asset | schema:Product |
| inventory_item | InventoryItem | schema:Product |
| contract | Contract | schema:Contract |
| order | Order | schema:Order |
| purchase_order | PurchaseOrder | schema:Order |
| invoice | Invoice | schema:Invoice |
| shipment | Shipment | schema:ParcelDelivery |
| payment_transaction | PaymentTransaction | schema:PaymentService |
| financial_account | FinancialAccount | schema:FinancialProduct |
| budget | Budget | schema:FinancialProduct |
| project | Project | schema:Project |
| task | Task | schema:Action |
| support_case | SupportCase | schema:Action |
| document | Document | schema:CreativeWork |
| policy_document | PolicyDocument | schema:CreativeWork |
| location | Location | schema:Place |
| event | Event | schema:Event |
| dataset | Dataset | schema:Dataset |

Programmatic benefits

  1. Single-flag configuration – no repeated separators/compartments across services.
  2. Metadata-aware tooling – retrieve schema names, descriptions, and inheritance on demand.
  3. Graph-ready exports – JSON-LD and OWL stay in lockstep with the IDs you generate.

import SemanticIDGenerator, {
  getPresetMetadata,
  buildSchemaForPreset
} from 'semantic-id-generator';

const generator = new SemanticIDGenerator({ preset: 'contract' });
const id = generator.generateSemanticID('contract');

const metadata = getPresetMetadata('contract');
const { jsonld } = buildSchemaForPreset('contract');

console.log(id);
console.log(metadata.schemaClass); // schema:Contract
console.log(jsonld['sig:entitySchema']); // Contract

👉 Full catalog, metadata APIs, and schema export details live in docs/domain-presets.md.

Semantic ID Inspector

Use SemanticIDInspector when you need to validate or explain IDs that were minted elsewhere (partner systems, historical datasets, governance bots, etc.). The inspector automatically detects domain presets from the data concept prefix, rehydrates the expected configuration, and returns a structured report per compartment.

import SemanticIDGenerator, { SemanticIDInspector } from 'semantic-id-generator';

const generator = new SemanticIDGenerator({ preset: 'contract' });
const inspector = new SemanticIDInspector();

const id = generator.generateSemanticID('contract');
const report = inspector.inspect(id);

console.log(report.isValid);          // true
console.log(report.preset);           // "contract"
console.log(report.metadata.schemaClass); // "schema:Contract"

const [concept, rest] = id.split('|');
const [first, second, third] = rest.split('-');
const tamperedSecond = `${second.slice(0, 2)}|${second.slice(2)}`;
const tamperedId = `${concept}|${first}-${tamperedSecond}-${third}`;

const failure = inspector.inspect(tamperedId);
console.log(failure.isValid); // false
console.log(failure.compartments[1].issues);
// ["Value contains reserved separator \"|\".", "..."]

👉 Run the dedicated sample in code_samples/semantic-id-inspector/basic-validation to see both a passing and failing inspection end-to-end.
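To make the per-compartment report more concrete, here is a simplified, standalone sketch of the kind of checks an inspector can run. It does not reproduce `SemanticIDInspector`: the `inspectCompartments` function, the `expectedLengths` layout argument, and the issue messages are illustrative assumptions.

```javascript
// Hypothetical sketch of per-compartment checks; not the SemanticIDInspector implementation.
function inspectCompartments(id, expectedLengths, nameSeparator = '|', compartmentSeparator = '-') {
  // Split on the first name separator only, so a stray separator later in the ID stays visible.
  const index = id.indexOf(nameSeparator);
  const concept = index === -1 ? '' : id.slice(0, index);
  const rest = index === -1 ? id : id.slice(index + nameSeparator.length);
  const values = rest.split(compartmentSeparator);

  const compartments = values.map((value, i) => {
    const issues = [];
    if (value.includes(nameSeparator)) {
      issues.push(`Value contains reserved separator "${nameSeparator}".`);
    }
    if (expectedLengths[i] !== undefined && value.length !== expectedLengths[i]) {
      issues.push(`Expected length ${expectedLengths[i]}, got ${value.length}.`);
    }
    return { value, issues };
  });

  const isValid =
    concept.length > 0 &&
    values.length === expectedLengths.length &&
    compartments.every((c) => c.issues.length === 0);

  return { concept, isValid, compartments };
}

const layout = [4, 8, 12]; // assumed compartment lengths for this example
console.log(inspectCompartments('person|abcd-abcdefgh-abcdefghijkl', layout).isValid); // true
console.log(inspectCompartments('person|abcd-ab|cdefgh-abcdefghijkl', layout).compartments[1].issues);
// two issues: a reserved separator and an unexpected length
```

Splitting on the first name separator only is a deliberate choice here: it lets a tampered compartment surface as a reported issue instead of silently shifting the concept name.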

Usage & Configuration

👉 See docs/usage.md for the complete guide, plus the runnable sample in examples/typescript-example.ts.

Testing & Performance

👉 Full details in docs/testing-and-performance.md.

Documentation Map

| Topic | Location |
| --- | --- |
| Domain presets & schemas | docs/domain-presets.md |
| Usage & configuration recipes | docs/usage.md |
| Testing & performance | docs/testing-and-performance.md |
| TypeScript example app | examples/typescript-example.ts |
| Test-suite overview | test/Test_Suite.md |

License

This project is licensed under the MIT License.

About the Author

Semantic ID Generator is created by Yannick Huchard (CTO).
More information: yannickhuchard.com · Podcast · Medium · YouTube · amase.io

Introduction

Semantic ID Generator is a Node.js package that generates structured, meaningful unique identifiers called Semantic IDs. Each identifier is composed of “compartments,” and each compartment has a specific semantic meaning and generation strategy.

Latest Updates (v1.3.0):

What is a Semantic Identifier?

A Semantic ID is an identifier that implements the following AMASE data architecture principles:

A Semantic ID follows the pattern:

{data concept name}{name separator}{compartment 1}{compartment separator}{compartment 2}{compartment separator}...{compartment N}
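The pattern above can be illustrated with a small, dependency-free sketch. The helper names `composeSemanticId` and `parseSemanticId` are hypothetical and not part of the package's API, and the sketch assumes that neither separator occurs inside the data concept name or a compartment value.

```javascript
// Hypothetical helpers illustrating the Semantic ID pattern; not the library's API.

// Joins a data concept name and its compartments into one identifier.
function composeSemanticId(conceptName, compartments, nameSeparator = '|', compartmentSeparator = '-') {
  return `${conceptName}${nameSeparator}${compartments.join(compartmentSeparator)}`;
}

// Splits an identifier back into its concept name and compartments.
// Assumes neither separator occurs inside the concept name or a compartment value.
function parseSemanticId(id, nameSeparator = '|', compartmentSeparator = '-') {
  const index = id.indexOf(nameSeparator);
  if (index === -1) {
    throw new Error(`Missing name separator "${nameSeparator}" in "${id}"`);
  }
  return {
    conceptName: id.slice(0, index),
    compartments: id.slice(index + nameSeparator.length).split(compartmentSeparator),
  };
}

const id = composeSemanticId('person', ['abcd', 'abcdefgh', 'abcdefghijkl']);
console.log(id); // person|abcd-abcdefgh-abcdefghijkl
console.log(parseSemanticId(id).compartments.length); // 3
```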

Examples

Here are a few examples of generated semantic identifiers:

Domain Presets & Schema Export

To remove repetitive configuration plumbing, the generator now ships 30 core presets that cover the most common data entities. Each preset pairs a ready-to-use generator configuration with metadata and schemas whose names match the entity while inheriting from a well-known vocabulary:

| Preset | Schema | Subclass of | Description |
| --- | --- | --- | --- |
| person | Person | schema:Person | Individual people |
| individual_customer | IndividualCustomer | schema:Person | Customers who are people |
| corporate_customer | CorporateCustomer | schema:Organization | Customers that are organizations |
| employee | Employee | schema:Person | Workforce members |
| supplier | Supplier | schema:Organization | Vendors or suppliers |
| partner | Partner | schema:Organization | Strategic/channel partners |
| organization | Organization | schema:Organization | Generic legal entities |
| department | Department | schema:Organization | Internal cost centers |
| role | Role | schema:Role | Functional or security roles |
| product | Product | schema:Product | Catalog items |
| product_category | ProductCategory | schema:CategoryCodeSet | Product taxonomies |
| device | Device | schema:Product | Physical/IoT devices |
| asset | Asset | schema:Product | Managed assets |
| inventory_item | InventoryItem | schema:Product | Stock units |
| contract | Contract | schema:Contract | Legal agreements |
| order | Order | schema:Order | Customer orders |
| purchase_order | PurchaseOrder | schema:Order | Procurement POs |
| invoice | Invoice | schema:Invoice | A/R or A/P invoices |
| shipment | Shipment | schema:ParcelDelivery | Logistics movements |
| payment_transaction | PaymentTransaction | schema:PaymentService | Settlements/payments |
| financial_account | FinancialAccount | schema:FinancialProduct | Accounts, wallets, ledgers |
| budget | Budget | schema:FinancialProduct | Budget envelopes |
| project | Project | schema:Project | Initiatives/projects |
| task | Task | schema:Action | Tasks or work items |
| support_case | SupportCase | schema:Action | Support/issue cases |
| document | Document | schema:CreativeWork | Documents/files |
| policy_document | PolicyDocument | schema:CreativeWork | Policies/standards |
| location | Location | schema:Place | Physical/logical locations |
| event | Event | schema:Event | Events or campaigns |
| dataset | Dataset | schema:Dataset | Analytical/operational datasets |

Use the preset field when instantiating the generator:

import SemanticIDGenerator from 'semantic-id-generator';

const generator = new SemanticIDGenerator({ preset: 'person' });
const id = generator.generateSemanticID('person');
console.log(id);

You can also inspect preset definitions programmatically:

import { getDomainPreset, getPresetMetadata, listDomainPresets } from 'semantic-id-generator';

console.log(listDomainPresets()); // ['person', 'individual_customer', ... , 'dataset']
console.log(getDomainPreset('device')); // { dataConceptSeparator: '|', compartmentSeparator: '-', ... }
console.log(getPresetMetadata('device'));
/* {
 *   key: 'device',
 *   schemaName: 'Device',
 *   schemaClass: 'schema:Product',
 *   description: 'Identifiers for physical or IoT devices.'
 * }
 */

Finally, export the accompanying JSON-LD or OWL schemas when you want to feed knowledge graphs:

import { buildSchemaForPreset, exportSchema } from 'semantic-id-generator';

const { jsonld, owl } = buildSchemaForPreset('contract');
console.log(jsonld['sig:entitySchema']); // "Contract"
console.log(jsonld['sig:domainClass']);  // "schema:Contract"

const owlXml = exportSchema('contract', 'owl');
// Persist jsonld / owl artifacts or push them to your graph store.

String Generation Strategies

Semantic ID Generator uses different string generation strategies to generate each compartment of the semantic ID. Here are the currently available strategies:

These strategies can be assigned to each compartment in the Semantic ID Generator configuration. This allows you to customize the generation of each part of the semantic ID according to your requirements.

Installation

Prerequisites

Before you can use the Semantic ID Generator, you must have certain software installed on your computer:

After installing Node.js and NPM, you need to install the dependencies of the Semantic ID Generator:

npm install uuid

Then install the Semantic ID Generator library:

npm install semantic-id-generator

ES Module Support

This package is now a pure ES module. To use it in your project:

For ES Module projects (recommended):

import SemanticIDGenerator from 'semantic-id-generator';

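For CommonJS projects (CommonJS has no top-level await, so wrap the dynamic import in an async function):

(async () => {
  const { default: SemanticIDGenerator } = await import('semantic-id-generator');
  const generator = new SemanticIDGenerator();
})();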

TypeScript Support

The library includes full TypeScript support with comprehensive type definitions. If you’re using TypeScript, you’ll get:

import SemanticIDGenerator, { 
  SemanticIDGeneratorConfig, 
  Compartment, 
  GenerationStrategy, 
  LanguageCode 
} from 'semantic-id-generator';

// Type-safe configuration
const config: SemanticIDGeneratorConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'prefix', length: 8, generationStrategy: 'visible characters' },
    { name: 'base64_part', length: 24, generationStrategy: 'base64' },
    { name: 'suffix', length: 12, generationStrategy: 'hexadecimal' }
  ]
};

const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('document');

Usage

Basic Usage

import SemanticIDGenerator from 'semantic-id-generator';
const generator = new SemanticIDGenerator();
const id = generator.generateSemanticID('person');
console.log(id); // Output looks like 'person|abcd-abcdefgh-abcdefghijkl'

Advanced Usage

import SemanticIDGenerator from 'semantic-id-generator';

const config = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    compartments: [
        { name: 'part1', length: 10, generationStrategy: "visible characters"},
        { name: 'part2', length: 10, generationStrategy: "numbers"},
        { name: 'part3', length: 32, generationStrategy: "hexadecimal"}
    ] 
};

const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('person');
console.log(id);

Example with Base64 strategy:

import SemanticIDGenerator from 'semantic-id-generator';

const config = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    compartments: [
        { name: 'prefix', length: 8, generationStrategy: "visible characters"},
        { name: 'base64_part', length: 24, generationStrategy: "base64"},
        { name: 'suffix', length: 12, generationStrategy: "hexadecimal"}
    ] 
};

const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('document');
console.log(id); // Example shape (illustrative; segment lengths not to scale): 'document|Kj8mNx2-AbCdEfGhIjKlMnOpQrStUv-1a2b3c4d5e6f'

Example with Passphrase strategy:

import SemanticIDGenerator from 'semantic-id-generator';

const config = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    compartments: [
        { name: 'prefix', length: 8, generationStrategy: "visible characters"},
        { name: 'passphrase_part', length: 25, generationStrategy: "passphrase"},
        { name: 'suffix', length: 12, generationStrategy: "hexadecimal"}
    ] 
};

const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('user_session');
console.log(id); // Example shape (illustrative; segment lengths not to scale): 'user_session|Kj8mNx2-applebananacherrydragon-1a2b3c4d5e6f'

**Example with language configuration:**

```javascript
import SemanticIDGenerator from 'semantic-id-generator';

// Default behavior: uses all languages
const defaultConfig = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    compartments: [
        { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
    ] 
};

const defaultGenerator = new SemanticIDGenerator(defaultConfig);
const mixedId = defaultGenerator.generateSemanticID('session');
console.log(mixedId); // Example: 'session|applepommemanzanaapfel'

// Specific language configuration
const englishConfig = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    languageCode: 'eng',
    compartments: [
        { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
    ] 
};

const frenchConfig = { 
    dataConceptSeparator: '|', 
    compartmentSeparator: '-', 
    languageCode: 'fra',
    compartments: [
        { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
    ] 
};

const englishGenerator = new SemanticIDGenerator(englishConfig);
const frenchGenerator = new SemanticIDGenerator(frenchConfig);

const englishId = englishGenerator.generateSemanticID('session'); // English only
const frenchId = frenchGenerator.generateSemanticID('session');   // French only

console.log(englishId); // Example: 'session|applebananacherrydragon'
console.log(frenchId);  // Example: 'session|pommebananejardinmaison'
```

Supported Languages:

Note: By default, the passphrase strategy uses words from all languages. To restrict to a specific language, add languageCode to the configuration.


### TypeScript Examples

For comprehensive TypeScript examples, see the `examples/typescript-example.ts` file. Here are some highlights:

**Type-safe configuration building:**
```typescript
class ConfigurationBuilder {
  private config: SemanticIDGeneratorConfig = {};

  setDataConceptSeparator(separator: string) {
    this.config.dataConceptSeparator = separator;
    return this;
  }

  setCompartmentSeparator(separator: string) {
    this.config.compartmentSeparator = separator;
    return this;
  }

  addCompartment(compartment: Compartment) {
    if (!this.config.compartments) {
      this.config.compartments = [];
    }
    this.config.compartments.push(compartment);
    return this;
  }

  setLanguageCode(languageCode: LanguageCode) {
    this.config.languageCode = languageCode;
    return this;
  }

  build() {
    return { ...this.config };
  }
}

const config = new ConfigurationBuilder()
  .setDataConceptSeparator('|')
  .setCompartmentSeparator('-')
  .addCompartment({ name: 'prefix', length: 6, generationStrategy: 'visible characters' })
  .addCompartment({ name: 'uuid', length: 32, generationStrategy: 'hexadecimal' })
  .setLanguageCode('eng')
  .build();
```

Type validation utilities:

function validateStrategy(strategy: GenerationStrategy): boolean {
  const allowed: GenerationStrategy[] = [
    'all characters',
    'visible characters',
    'numbers',
    'alphanumeric',
    'hexadecimal',
    'base64',
    'passphrase',
  ];
  return allowed.includes(strategy);
}

function validateLanguageCode(code: LanguageCode): boolean {
  const allowed: LanguageCode[] = ['eng', 'fra', 'spa', 'ita', 'deu', 'nld', 'wol'];
  return allowed.includes(code);
}

// Validate generation strategies
console.log(validateStrategy('base64')); // true
console.log(validateStrategy('invalid' as GenerationStrategy)); // false

// Validate language codes
console.log(validateLanguageCode('eng')); // true
console.log(validateLanguageCode('invalid' as LanguageCode)); // false

Error handling with TypeScript:

try {
  const generator = new SemanticIDGenerator();
  const id = generator.generateSemanticID('test');
} catch (error) {
  // Caught values are typed `unknown` (or `any`), so narrow before reading .message
  console.error(error instanceof Error ? error.message : String(error));
}

Graph-friendly Schema Export

Each preset includes a machine-readable specification so you can publish identifier semantics to knowledge graphs:

import { buildSchemaForPreset, exportSchema } from 'semantic-id-generator';

const { jsonld, owl } = buildSchemaForPreset('person');
console.log(jsonld['sig:compartments'][0]['schema:name']); // semantic_prefix

const owlString = exportSchema('person', 'owl');
// => RDF/XML string ready for Neo4j, Neptune, Blazegraph, etc.

Default Values

If you do not specify certain configuration options when creating a new Semantic ID Generator, the library uses the following default values:

Performance

The Semantic ID Generator is designed for high performance and security. Here are the performance benchmarks from our test suite:

Performance Benchmarks

General Performance:

Unicode String Generation:

String Generation Strategies Performance:

Security Features

All string generation uses cryptographically secure random number generation:

Test Coverage

The library includes comprehensive test coverage:

Test Suites:

Test Categories:

Total Tests: 48+ passing tests covering all aspects of the library.

Current Dependencies

Production Dependencies:

Development Dependencies:

All dependencies are updated to their latest secure versions as of August 2025.

Run unit tests

Run from your command line interface (Bash, Ksh, Windows Terminal, etc.):

npm test

For TypeScript-specific tests:

npm run test:typescript
