Semantic ID Generator - NPM Package
Introduction
Semantic ID Generator is a Node.js package that generates structured, meaningful unique identifiers called “Semantic IDs”. Each identifier is composed of “compartments”, each with a specific “semantic meaning” and generation strategy.
Latest Updates (v1.1.0):
- ✅ ES Module Support: Full ES module (ESM) support with `"type": "module"`
- ✅ Security Updates: All dependencies updated to latest secure versions
- ✅ TypeScript Support: Enhanced TypeScript definitions and examples
- ✅ Performance Optimizations: Improved string generation algorithms
- ✅ Comprehensive Testing: 48+ passing tests covering all functionality
What is a Semantic Identifier?
A Semantic ID is an identifier that implements the following AMASE data architecture principles:
- Unique
- Recognizable by humans and artificial intelligence
- Semantically coherent according to semantic rules defined by the data architects/engineers
- Consistent across all data spaces, including historical changes
- Generated from a configurable Factory
A Semantic ID follows the pattern:
{data concept name}{name separator}{compartment 1}{compartment separator}{compartment 2}{compartment separator}...{compartment N}
Examples
Here are a few examples of generated semantic identifiers:
- A person ID:
person|AbCd-12345678-9ABCDEF0123456789ABCDEF0
- An organization ID:
organization|7890-MNO56789-123456789ABCDEF01234
- A multicurrency account:
multicurrency_account|XYZ2-87654321-ABCD5678901234567890
- A device ID:
device_id|A1B2-135792468-EDCBA098765432109876
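Because the pattern is regular, an ID can be split back into its parts. The following is a minimal sketch, not part of the package: `parseSemanticId` is a hypothetical helper, and it assumes the separators (`|` and `-` in the examples above) never occur inside a compartment.

```javascript
// Hypothetical helper: split a semantic ID into its data concept name and compartments.
// Assumes neither separator appears inside a compartment value.
function parseSemanticId(id, { dataConceptSeparator = '|', compartmentSeparator = '-' } = {}) {
  const cut = id.indexOf(dataConceptSeparator);
  if (cut <= 0) {
    throw new Error(`Missing data concept name or separator in "${id}"`);
  }
  const dataConceptName = id.slice(0, cut);
  const body = id.slice(cut + dataConceptSeparator.length);
  return { dataConceptName, compartments: body.split(compartmentSeparator) };
}

const parsed = parseSemanticId('person|AbCd-12345678-9ABCDEF0123456789ABCDEF0');
console.log(parsed.dataConceptName); // person
console.log(parsed.compartments.length); // 3
```

A parser like this is mainly useful for logging or routing on the data concept name; it does not verify that the compartments were produced by a particular strategy.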
String Generation Strategies
Semantic ID Generator uses different string generation strategies to generate each compartment of the semantic ID. Here are the currently available strategies:
- all characters: This strategy generates a string whose characters can come from the full Unicode range.
- visible characters: This strategy generates a string that only includes visible Unicode characters.
- numbers: This strategy generates a string that only includes numeric characters (0-9).
- alphanumeric: This strategy generates a string that includes both alphabetic (A-Z, a-z) and numeric (0-9) characters.
- hexadecimal: This strategy generates a string that includes hexadecimal characters (0-9, a-f).
- base64: This strategy generates a string that includes Base64 characters (A-Z, a-z, 0-9, +, /).
- passphrase: This strategy generates a string using common words from multiple languages (English, French, Spanish, Italian, German, Dutch, Wolof), creating human-readable passphrases.
These strategies can be assigned to each compartment in the Semantic ID Generator configuration. This allows you to customize the generation of each part of the semantic ID according to your requirements.
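For the fixed-alphabet strategies, generation can be pictured as drawing each character uniformly from the strategy's alphabet. The sketch below is an illustration under that assumption, not the package's implementation: `ALPHABETS` and `generateCompartment` are hypothetical names, and only the four strategies whose alphabets are spelled out above are covered. It relies on Node's `crypto.randomInt`, which returns a uniformly distributed integer below the given bound.

```javascript
import { randomInt } from 'node:crypto';

// Alphabets copied from the strategy descriptions above (hypothetical constant).
const ALPHABETS = new Map([
  ['numbers', '0123456789'],
  ['alphanumeric', 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'],
  ['hexadecimal', '0123456789abcdef'],
  ['base64', 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'],
]);

// Hypothetical helper: build a string of `length` characters for one strategy.
function generateCompartment(strategy, length) {
  const alphabet = ALPHABETS.get(strategy);
  if (alphabet === undefined) {
    throw new Error(`Strategy not covered by this sketch: ${strategy}`);
  }
  let out = '';
  for (let i = 0; i < length; i += 1) {
    // randomInt(max) returns a uniform integer in [0, max).
    out += alphabet[randomInt(alphabet.length)];
  }
  return out;
}

console.log(generateCompartment('hexadecimal', 12));
console.log(generateCompartment('base64', 24));
```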
Installation
Prerequisites
Before you can use the Semantic ID Generator, you must have certain software installed on your computer:
- Node.js: This is the JavaScript runtime in which the Semantic ID Generator runs. You can download it from https://nodejs.org.
- NPM: This is the package manager for Node.js. It is included with the Node.js installation.
The Semantic ID Generator depends on the following NPM package:
- uuid: An NPM package for generating UUIDs. It is listed as a production dependency, so NPM normally installs it automatically together with the library. To install it explicitly, run:
npm install uuid
Then install the Semantic ID Generator library:
npm install semantic-id-generator
ES Module Support
This package is now a pure ES module. To use it in your project:
For ES Module projects (recommended):
import SemanticIDGenerator from 'semantic-id-generator';
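For CommonJS projects (CommonJS has no top-level await, so wrap the dynamic import in an async function):
(async () => {
  const { default: SemanticIDGenerator } = await import('semantic-id-generator');
  const generator = new SemanticIDGenerator();
})();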
TypeScript Support
The library includes full TypeScript support with comprehensive type definitions. If you’re using TypeScript, you’ll get:
- Full IntelliSense support in your IDE
- Type checking for all configuration options
- Autocomplete for generation strategies and language codes
- Compile-time error detection
import SemanticIDGenerator, {
  SemanticIDGeneratorConfig,
  Compartment,
  GenerationStrategy,
  LanguageCode
} from 'semantic-id-generator';
// Type-safe configuration
const config: SemanticIDGeneratorConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'prefix', length: 8, generationStrategy: 'visible characters' },
    { name: 'base64_part', length: 24, generationStrategy: 'base64' },
    { name: 'suffix', length: 12, generationStrategy: 'hexadecimal' }
  ]
};
const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('document');
Usage
Basic Usage
import SemanticIDGenerator from 'semantic-id-generator';
const generator = new SemanticIDGenerator();
const id = generator.generateSemanticID('person');
console.log(id); // Output looks like 'person|abcd-abcdefgh-abcdefghijkl'
Advanced Usage
import SemanticIDGenerator from 'semantic-id-generator';
const config = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'part1', length: 10, generationStrategy: "visible characters"},
    { name: 'part2', length: 10, generationStrategy: "numbers"},
    { name: 'part3', length: 32, generationStrategy: "hexadecimal"}
  ]
};
const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('person');
console.log(id);
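With explicit compartment lengths, the total length of an ID is predictable, which helps when sizing a database column or validating input. The sketch below is a rough illustration: `expectedIdLength` is a hypothetical helper, not a package function, and it assumes each compartment contains exactly its configured number of characters, with one UTF-16 code unit per character.

```javascript
// Hypothetical helper: expected total length of an ID generated from `config`.
function expectedIdLength(dataConceptName, config) {
  const { dataConceptSeparator, compartmentSeparator, compartments } = config;
  // Sum of the configured compartment lengths.
  const compartmentChars = compartments.reduce((sum, c) => sum + c.length, 0);
  // One compartment separator between each pair of adjacent compartments.
  const separatorChars = compartmentSeparator.length * Math.max(compartments.length - 1, 0);
  return dataConceptName.length + dataConceptSeparator.length + compartmentChars + separatorChars;
}

const advancedConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'part1', length: 10, generationStrategy: 'visible characters' },
    { name: 'part2', length: 10, generationStrategy: 'numbers' },
    { name: 'part3', length: 32, generationStrategy: 'hexadecimal' },
  ],
};

console.log(expectedIdLength('person', advancedConfig)); // 6 + 1 + 52 + 2 = 61
```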
Example with Base64 strategy:
import SemanticIDGenerator from 'semantic-id-generator';
const config = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'prefix', length: 8, generationStrategy: "visible characters"},
    { name: 'base64_part', length: 24, generationStrategy: "base64"},
    { name: 'suffix', length: 12, generationStrategy: "hexadecimal"}
  ]
};
const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('document');
console.log(id); // Example: 'document|Kj8mNx2P-AbCdEfGhIjKlMnOpQrStUvWx-1a2b3c4d5e6f'
Example with Passphrase strategy:
import SemanticIDGenerator from 'semantic-id-generator';
const config = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'prefix', length: 8, generationStrategy: "visible characters"},
    { name: 'passphrase_part', length: 25, generationStrategy: "passphrase"},
    { name: 'suffix', length: 12, generationStrategy: "hexadecimal"}
  ]
};
const generator = new SemanticIDGenerator(config);
const id = generator.generateSemanticID('user_session');
console.log(id); // Example (illustrative): 'user_session|Kj8mNx2P-applebananacherrydragon-1a2b3c4d5e6f'
Example with language configuration:
```javascript
import SemanticIDGenerator from 'semantic-id-generator';
// Default behavior: uses all languages
const defaultConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  compartments: [
    { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
  ]
};
const defaultGenerator = new SemanticIDGenerator(defaultConfig);
const mixedId = defaultGenerator.generateSemanticID('session');
console.log(mixedId); // Example: 'session|applepommemanzanaapfel'
// Specific language configuration
const englishConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  languageCode: 'eng',
  compartments: [
    { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
  ]
};
const frenchConfig = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  languageCode: 'fra',
  compartments: [
    { name: 'passphrase', length: 25, generationStrategy: "passphrase"}
  ]
};
const englishGenerator = new SemanticIDGenerator(englishConfig);
const frenchGenerator = new SemanticIDGenerator(frenchConfig);
const englishId = englishGenerator.generateSemanticID('session'); // English only
const frenchId = frenchGenerator.generateSemanticID('session'); // French only
console.log(englishId); // Example: 'session|applebananacherrydragon'
console.log(frenchId); // Example: 'session|pommebananejardinmaison'
```
Supported Languages:
- eng: English
- fra: French
- spa: Spanish
- ita: Italian
- deu: German
- nld: Dutch
- wol: Wolof
Note: By default, the passphrase strategy uses words from all languages. To restrict to a specific language, add languageCode to the configuration.
TypeScript Examples
For comprehensive TypeScript examples, see the `examples/typescript-example.ts` file. Here are some highlights:
Type-safe configuration building:
```typescript
import { ConfigurationBuilder } from 'semantic-id-generator';
const config = new ConfigurationBuilder()
  .setDataConceptSeparator('|')
  .setCompartmentSeparator('-')
  .addCompartment({ name: 'prefix', length: 6, generationStrategy: 'visible characters' })
  .addCompartment({ name: 'uuid', length: 32, generationStrategy: 'hexadecimal' })
  .setLanguageCode('eng')
  .build();
```
Type validation utilities:
import { validateStrategy, validateLanguageCode } from 'semantic-id-generator';
// Validate generation strategies
console.log(validateStrategy('base64')); // true
console.log(validateStrategy('invalid')); // false
// Validate language codes
console.log(validateLanguageCode('eng')); // true
console.log(validateLanguageCode('invalid')); // false
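Where importing these helpers is not an option, a comparable membership check can be written by hand. The sketch below is an assumption about what such validation amounts to, not the package's code: `isKnownStrategy` and `isKnownLanguageCode` are hypothetical names, and the accepted values are simply the strategy names and language codes listed in this README.

```javascript
// Strategy names and language codes as listed in this README (hypothetical constants).
const KNOWN_STRATEGIES = new Set([
  'all characters',
  'visible characters',
  'numbers',
  'alphanumeric',
  'hexadecimal',
  'base64',
  'passphrase',
]);
const KNOWN_LANGUAGE_CODES = new Set(['eng', 'fra', 'spa', 'ita', 'deu', 'nld', 'wol']);

// Hypothetical helpers: true only for exact, case-sensitive string matches.
function isKnownStrategy(value) {
  return typeof value === 'string' && KNOWN_STRATEGIES.has(value);
}

function isKnownLanguageCode(value) {
  return typeof value === 'string' && KNOWN_LANGUAGE_CODES.has(value);
}

console.log(isKnownStrategy('base64')); // true
console.log(isKnownLanguageCode('en')); // false
```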
Error handling with TypeScript:
try {
  const generator = new SemanticIDGenerator();
  const id = generator.generateSemanticID('test');
} catch (error) {
  // In strict mode the caught value is typed as unknown, so narrow it before use
  console.error(error instanceof Error ? error.message : String(error));
}
Default Values
If you do not specify certain configuration options when creating a new Semantic ID Generator, the library uses the following default values:
- dataConceptSeparator: `|`. It separates the data concept name from the rest of the semantic ID.
- compartmentSeparator: `-`. It separates the different compartments within the semantic ID.
- compartments: By default, the Semantic ID Generator uses three compartments. The compartments’ names are ‘part1’, ‘part2’, and ‘part3’, and their lengths are 4, 8, and 12 characters, respectively.
- generationStrategy: “visible characters”. This strategy generates strings using visible Unicode characters (excluding separators).
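The fallback behavior can be pictured as a merge of user options over these defaults. The following is a sketch only: `DEFAULTS` and `resolveConfig` are hypothetical names, and treating generationStrategy as a fallback for compartments that do not set their own is an assumption rather than documented behavior.

```javascript
// Default values as described above (hypothetical constant).
const DEFAULTS = {
  dataConceptSeparator: '|',
  compartmentSeparator: '-',
  generationStrategy: 'visible characters',
  compartments: [
    { name: 'part1', length: 4 },
    { name: 'part2', length: 8 },
    { name: 'part3', length: 12 },
  ],
};

// Hypothetical helper: fill in every option the caller left out.
function resolveConfig(userConfig = {}) {
  const strategy = userConfig.generationStrategy ?? DEFAULTS.generationStrategy;
  return {
    dataConceptSeparator: userConfig.dataConceptSeparator ?? DEFAULTS.dataConceptSeparator,
    compartmentSeparator: userConfig.compartmentSeparator ?? DEFAULTS.compartmentSeparator,
    // Assumption: a compartment without its own strategy uses the fallback strategy.
    compartments: (userConfig.compartments ?? DEFAULTS.compartments).map((c) => ({
      generationStrategy: strategy,
      ...c,
    })),
  };
}

console.log(resolveConfig({ compartmentSeparator: '_' }));
```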
Performance
The Semantic ID Generator is designed for high performance and security. Here are the performance benchmarks from our test suite:
Performance Benchmarks
General Performance:
- Generating 100,000 IDs: ~11 seconds (0.11ms per ID)
- Default configuration: 3 compartments with visible characters strategy
Unicode String Generation:
- 10,000 Unicode strings (10 chars each): ~565ms (0.056ms per string)
- 100 large Unicode strings (1000 chars each): ~700ms (7ms per string)
- 1000 Unicode strings (50 chars each): ~300ms (0.3ms per string)
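The per-item figures above are total time divided by the number of items. To reproduce that kind of measurement on your own hardware, a small timing loop is enough. This is a generic sketch, not the project's benchmark suite: `benchmark` is a hypothetical helper, and the workload shown is a stand-in to be replaced with a call such as `() => generator.generateSemanticID('person')`.

```javascript
// Hypothetical helper: time `iterations` calls of `fn` and report the mean per call.
function benchmark(fn, iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i += 1) {
    fn();
  }
  const totalMs = performance.now() - start;
  return { iterations, totalMs, perCallMs: totalMs / iterations };
}

// Stand-in workload so the snippet runs without the package installed.
const result = benchmark(() => Math.random().toString(16), 100_000);
console.log(`${result.totalMs.toFixed(1)} ms total, ${result.perCallMs.toFixed(5)} ms per call`);
```

Results vary with hardware, Node.js version, and warm-up, so treat single runs as indicative only.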
String Generation Strategies Performance:
- Visible characters: Fastest, uses ASCII range (0x0020-0x007E)
- Numbers: Very fast, uses only digits (0-9)
- Alphanumeric: Fast, uses A-Z, a-z, 0-9
- Hexadecimal: Fast, uses 0-9, a-f
- Base64: Fast, uses A-Z, a-z, 0-9, +, /
- Passphrase: Fast, uses common English words (a-z only)
- All characters: Slower due to Unicode complexity, uses full Unicode range
Security Features
All string generation uses cryptographically secure random number generation:
- Uses Node.js `crypto.randomBytes()` and `crypto.randomInt()`
- Provides high entropy and unpredictability
- Suitable for production environments requiring security
- Latest security updates: All dependencies updated to latest secure versions (August 2025)
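The entropy of a compartment can be estimated with simple arithmetic: if each character is drawn uniformly and independently from an alphabet of N symbols, a compartment of length L carries L × log2(N) bits. A quick sketch of that calculation follows; `entropyBits` is a hypothetical helper, and the uniform, independent draw is an assumption about the generator.

```javascript
// Entropy in bits of `length` symbols drawn uniformly and independently
// from an alphabet of `alphabetSize` symbols.
function entropyBits(alphabetSize, length) {
  return length * Math.log2(alphabetSize);
}

// 12 hexadecimal characters: 12 * 4 bits.
console.log(entropyBits(16, 12).toFixed(1)); // "48.0"
// 24 Base64 characters: 24 * 6 bits.
console.log(entropyBits(64, 24).toFixed(1)); // "144.0"
// 8 decimal digits: 8 * log2(10) bits.
console.log(entropyBits(10, 8).toFixed(1)); // "26.6"
```

Under this assumption the compartment length and strategy together set the collision resistance of an ID, so short numeric compartments contribute far less than long hexadecimal or Base64 ones.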
Test Coverage
The library includes comprehensive test coverage:
Test Suites:
- 01-semantic-id-generator.test.js: Basic functionality and ID generation
- 02-check-configuration.test.js: Configuration validation and error handling
- 03-check-performances.test.js: Performance benchmarking
- 04-unicode-string-generation.test.js: Unicode string generation optimization
- 05-base64-strategy.test.js: Base64 string generation strategy testing
- 06-passphrase-strategy.test.js: Passphrase string generation strategy testing
- typescript-compilation.test.ts: TypeScript compilation verification
- typescript-definitions.test.ts: TypeScript definitions testing
Test Categories:
- Functionality tests (ID generation, validation)
- Configuration tests (error handling, edge cases)
- Performance tests (speed benchmarks)
- Unicode optimization tests (security and efficiency)
- TypeScript compilation and type checking tests
Total Tests: 48+ passing tests covering all aspects of the library.
Current Dependencies
Production Dependencies:
- uuid: ^11.1.0 (Latest secure version)
Development Dependencies:
- @types/chai: ^5.2.2
- @types/mocha: ^10.0.10
- @types/node: ^24.2.0
- chai: ^5.2.1
- mocha: ^11.7.1
- ts-node: ^10.9.2
- typescript: ^5.9.2
All dependencies are updated to their latest secure versions as of August 2025.
Run unit tests
Run from your command line interface (Bash, Ksh, Windows Terminal, etc.):
npm test
For TypeScript-specific tests:
npm run test:typescript
License
This project is licensed under the MIT License.
About the Author
Semantic ID Generator is created by Yannick Huchard - CTO. For more information, visit:
- yannickhuchard.com, Podcast, Medium, YouTube
- More about AMASE, enterprise engineering/architecture system for businesses and startups: amase.io.