Things - Complete Linguistic Foundation & Semantic Vocabulary

27,573 Thing definitions forming the complete linguistic and semantic foundation for the .do platform's Business-as-Code capabilities.

Overview

This directory contains a comprehensive vocabulary of Things - the fundamental building blocks of semantic knowledge. Each Thing is defined in MDXLD format with structured metadata, enabling:

  • Natural language understanding through complete English grammar
  • Semantic programming through GraphDL predicates
  • Business-as-Code through natural language → executable code
  • Knowledge graphs through linked data and relationships

Directory Structure

things/
├── tasks/              # 24,622 O*NET task statements (verb.Object.prep.Object.mdx)
├── verbs/              # 901 action verbs from O*NET tasks
├── concepts/           # 1,497 common nouns/objects (frequency >= 5)
├── prepositions/       # 110 English prepositions (simple, compound, complex)
├── adverbs/            # 383 English adverbs (manner, time, place, degree)
├── conjunctions/       # 60 conjunctions & operators (boolean, conditional, looping)
└── README.mdx          # This file

Thing Categories

1. Tasks (24,622 files)

O*NET occupational task statements parsed into full GraphDL syntax.

Format: verb.Object.preposition.Object.mdx

Examples:

oversee.Campaigns.to.promote.Recycling.in.Communities.mdx
develop.Software.to.improve.Efficiency.mdx
advise.Clients.about.FinancialMatters.mdx
accept.MusicRequests.from.EventGuests.mdx

Key Features:

  • One file per task variant (24,622 files)
  • Variants expanded from 17,538 unique task statements (1.40x expansion)
  • Full predicate structure with prepositions
  • Infinitive verb recognition (to.promote not to.PromoteRecycling)
  • Digital score (0.0-1.0) indicating physical vs. digital nature

Source: O*NET Database (onetonline.org)
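The file name format can be read back mechanically. The following is a minimal sketch, assuming only the casing convention described above (PascalCase for objects, lowercase for everything else); `splitTaskFileName` and its types are hypothetical names, not part of any published package.

```typescript
// Hypothetical helper for illustration; not part of tasks.org.ai.
type SegmentKind = 'object' | 'lowercase'

interface Segment {
  kind: SegmentKind
  value: string
}

function splitTaskFileName(fileName: string): Segment[] {
  // Drop the .mdx extension, then split on the dots that separate segments.
  const stem = fileName.replace(/\.mdx$/, '')
  return stem.split('.').map((value): Segment => ({
    // PascalCase segments are objects. Lowercase segments may be verbs,
    // prepositions, adverbs, or conjunctions; telling them apart needs a lexicon.
    kind: /^[A-Z]/.test(value) ? 'object' : 'lowercase',
    value,
  }))
}
```

Casing alone separates objects from function words; distinguishing a verb from a preposition still requires the vocabulary in the other directories.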

2. Verbs (901 files)

Action verbs extracted from O*NET task statements.

Format: {verb}.mdx (camelCase)

Top 10 by frequency:

  1. determine (1,190 occurrences)
  2. prepare (1,061)
  3. ensure (1,024)
  4. develop (937)
  5. provide (800)
  6. perform (628)
  7. maintain (571)
  8. conduct (554)
  9. identify (521)
  10. monitor (469)

Example: oversee.mdx

---
$type: Verb
$id: https://verbs.org.ai/oversee
$context: https://tasks.org.ai
name: oversee
description: The action of overseeing. Used in 56 O*NET task statement(s).
usageCount: 56
---

Source: Extracted from O*NET tasks

3. Concepts (1,497 files)

Common nouns and objects from task predicates (frequency >= 5).

Format: {Concept}.mdx (PascalCase)

Top 10 by frequency:

  1. Equipment (297 occurrences)
  2. Patients (282)
  3. Clients (251)
  4. Information (246)
  5. Use (240)
  6. Customers (233)
  7. Products (203)
  8. Other (195)
  9. Materials (186)
  10. Compliance (177)

Example: Equipment.mdx

---
$type: Noun
$id: https://concepts.org.ai/Equipment
$context: https://tasks.org.ai
name: Equipment
description: An Equipment as referenced in O*NET task statements.
usageCount: 297
---

Filtering: Only concepts with >= 5 occurrences (4.7% of 31,856 total)
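The frequency threshold can be illustrated with a short sketch. It assumes the extracted objects are available as a flat list of strings; `filterByFrequency` is a hypothetical name and this is not the actual generation script.

```typescript
// Hypothetical sketch of a frequency filter; names are illustrative only.
function filterByFrequency(objects: string[], minCount: number): Map<string, number> {
  // Count how often each object occurs.
  const counts = new Map<string, number>()
  for (const object of objects) {
    counts.set(object, (counts.get(object) ?? 0) + 1)
  }
  // Keep only objects that meet the threshold.
  const kept = new Map<string, number>()
  counts.forEach((count, object) => {
    if (count >= minCount) kept.set(object, count)
  })
  return kept
}
```

Calling `filterByFrequency(objects, 5)` would correspond to the `>= 5` rule stated above.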

Source: Extracted from O*NET tasks

4. Prepositions (110 files)

Complete English prepositions for semantic relationships.

Categories:

  • Simple (59): at, in, on, by, for, from, to, with, of, about...
  • Compound (37): according to, because of, in front of, on behalf of...
  • Complex (14): as far as, in accordance with, in the course of...

Format: {preposition}.mdx (lowercase with hyphens for compound)

Examples:

at.mdx
in.mdx
according-to.mdx
in-accordance-with.mdx
on-behalf-of.mdx
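A minimal sketch of the naming rule, assuming a preposition arrives as a space-separated phrase; `prepositionFileName` is a hypothetical helper used only for illustration.

```typescript
// Hypothetical helper: lowercase the phrase and join its words with hyphens.
function prepositionFileName(preposition: string): string {
  const slug = preposition.trim().toLowerCase().split(/\s+/).join('-')
  return `${slug}.mdx`
}
```

For example, `prepositionFileName('in accordance with')` yields `in-accordance-with.mdx`, and a simple preposition such as `at` passes through unchanged as `at.mdx`.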

Usage in GraphDL:

verb.Object.preposition.Object
advise.Clients.about.FinancialMatters
operate.in-accordance-with.Regulations

Source: Standard English grammar (Cambridge, Oxford)

5. Adverbs (383 files)

English adverbs by function/category.

Categories:

  • Manner (140): quickly, carefully, abreast, well, closely...
  • Time (65): now, then, always, often, soon, immediately...
  • Place (67): here, there, everywhere, nearby, abroad...
  • Degree (71): very, quite, too, almost, completely...
  • Frequency (28): always, usually, often, rarely, seldom...
  • Certainty (12): certainly, probably, maybe, possibly...

Format: {adverb}.mdx (lowercase)

Key Improvement: Recognizes adverbs correctly (not as nouns!)

Example: abreast.mdx

---
$type: Adverb
$id: https://adverbs.org.ai/abreast
$context: https://schema.org
name: abreast
description: Abreast - alongside each other; up to date with
category: manner
---

Before linguistic parser:

"Keep abreast of developments"
→ [verb:keep] [object:abreast] [prep:of] [object:developments]
❌ WRONG - "abreast" treated as noun

After linguistic parser:

"Keep abreast of developments"
→ [verb:keep] [adverb:abreast] [prep:of] [object:developments]
✅ CORRECT - recognized as adverb

Source: Standard English grammar

6. Conjunctions (60 files)

Conjunctions and operators bridging natural language with programming.

Categories (a word can belong to more than one, so the counts below sum to 70 across the 60 files):

  • Coordinating (7): and, or, but, nor, yet, so, for
  • Subordinating (26): if, while, when, because, unless, until...
  • Boolean (6): and, or, not, xor, nor, nand
  • Conditional (8): if, then, else, unless, when, otherwise...
  • Looping (10): while, until, for, each, repeat, do...
  • Switching (5): switch, case, default, when, match
  • Control (8): break, continue, return, yield, exit, halt...

Format: {conjunction}.mdx (lowercase or camelCase for compound)

Compound Operator Normalization: Natural language → programming style (camelCase):

"for each" → forEach
"else if" → elseIf
"even if" → evenIf
"even though" → evenThough
"as long as" → asLongAs
"as soon as" → asSoonAs
"so that" → soThat

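The normalization listed above amounts to a camelCase join. A minimal sketch, assuming the input is a space-separated phrase; `normalizeCompoundOperator` is a hypothetical name.

```typescript
// Hypothetical helper: "for each" -> "forEach", "as long as" -> "asLongAs".
function normalizeCompoundOperator(phrase: string): string {
  const words = phrase.trim().toLowerCase().split(/\s+/)
  return words
    // Keep the first word lowercase and capitalize every following word.
    .map((word, index) => (index === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1)))
    .join('')
}
```

Single-word operators such as `if` are returned unchanged.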
Examples:

if.mdx (Conditional):

---
$type: Conjunction
$id: https://conjunctions.org.ai/if
name: if
description: Conditional execution - perform action when condition is true
category: conditional
---

GraphDL Pattern: if.Condition.then.Action
TypeScript: if (condition) { action }

forEach.mdx (Looping):

---
$type: Conjunction
$id: https://conjunctions.org.ai/forEach
name: forEach
description: Iterate over each element in a collection
category: looping
---

GraphDL Pattern: forEach.Item.in.Collection.do.Action
TypeScript: for (const item of collection) { action }

Source: Platform.do (semantic programming language)

Semantic Programming Language

GraphDL Syntax

Complete predicate structure: verb.Object.preposition.Object...

Casing Conventions:

  • verbs: camelCase (oversee, promote)
  • Objects/Nouns: PascalCase (Campaign, User, Equipment)
  • prepositions: lowercase with hyphens (to, according-to)
  • conjunctions: camelCase for compound (forEach, elseIf)
  • adverbs: lowercase (abreast, closely, quickly)

Natural Language → Code Transformation

Boolean Logic:

if.Valid.and.Active.then.Process
→ if (valid && active) { process() }

if.Error.or.Warning.then.Alert
→ if (error || warning) { alert() }

if.not.Complete.then.continue
→ if (!complete) { continue }  // valid only inside a loop

Conditionals:

if.Condition.then.Action.else.Alternative
→ if (condition) { action } else { alternative }

unless.Error.then.continue
→ if (!error) { continue }  // valid only inside a loop

Looping:

while.Active.do.Process
→ while (active) { process() }

forEach.User.in.Users.do.notify.about.Updates
→ for (const user of users) { notify(user, updates) }

until.Complete.do.retry
→ while (!complete) { retry() }

Control Flow:

if.Found.then.return.Result
→ if (found) { return result }

when.Invalid.do.continue
→ if (invalid) { continue }
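To make the mapping concrete, here is a minimal sketch that handles only the `if.<Condition>.then.<Action>` shape with `and`, `or`, and `not`. It is an illustration under those assumptions, not the platform's compiler; `compileIf` and `lowerFirst` are hypothetical names.

```typescript
// Hypothetical sketch covering a single pattern: if.<Condition>.then.<Action>.
const booleanOperators = new Map<string, string>([
  ['and', ' && '],
  ['or', ' || '],
  ['not', '!'],
])

function lowerFirst(word: string): string {
  return word.charAt(0).toLowerCase() + word.slice(1)
}

function compileIf(predicate: string): string {
  const parts = predicate.split('.')
  const thenIndex = parts.indexOf('then')
  // Require "if" first, at least one condition term, and exactly one action after "then".
  if (parts[0] !== 'if' || thenIndex < 2 || thenIndex !== parts.length - 2) {
    throw new Error(`Unsupported predicate: ${predicate}`)
  }
  // Operators map to TypeScript symbols; PascalCase terms become identifiers.
  const condition = parts
    .slice(1, thenIndex)
    .map((part) => booleanOperators.get(part) ?? lowerFirst(part))
    .join('')
  const action = lowerFirst(parts[parts.length - 1])
  return `if (${condition}) { ${action}() }`
}
```

With this sketch, `compileIf('if.Valid.and.Active.then.Process')` returns `if (valid && active) { process() }`. Action names that collide with TypeScript reserved words would need separate handling, which this sketch does not attempt.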

Linguistic Parser

Features

True Part-of-Speech Identification:

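  • Loads 2,951 Thing definitions as knowledge base (every category except tasks: 901 verbs, 1,497 concepts, 110 prepositions, 383 adverbs, 60 conjunctions)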
  • Recognizes verbs, nouns, prepositions, adverbs, conjunctions
  • Handles phrasal verbs (keep abreast of, keep up with)
  • Handles compound prepositions (according to, in accordance with)
  • Handles compound operators (for each → forEach, else if → elseIf)
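One common way to get this behavior is a greedy longest-match lookup against a lexicon. The sketch below assumes that approach and a six-entry sample lexicon; it is not the actual parser, and `tagWords`, `sampleLexicon`, and the types are hypothetical names.

```typescript
type PartOfSpeech = 'verb' | 'adverb' | 'preposition' | 'conjunction' | 'object'

interface Tagged {
  type: PartOfSpeech
  value: string
}

// Tiny illustrative lexicon; the real knowledge base is far larger.
const sampleLexicon = new Map<string, PartOfSpeech>([
  ['keep', 'verb'],
  ['proceed', 'verb'],
  ['abreast', 'adverb'],
  ['of', 'preposition'],
  ['according to', 'preposition'],
  ['for each', 'conjunction'],
])

function tagWords(sentence: string, lexicon: Map<string, PartOfSpeech>, maxPhrase = 3): Tagged[] {
  const words = sentence.toLowerCase().split(/\s+/).filter((w) => w.length > 0)
  const tagged: Tagged[] = []
  let i = 0
  while (i < words.length) {
    let matched = false
    // Try the longest phrase first so "according to" wins over "according".
    for (let size = Math.min(maxPhrase, words.length - i); size >= 1; size--) {
      const phrase = words.slice(i, i + size).join(' ')
      const type = lexicon.get(phrase)
      if (type !== undefined) {
        tagged.push({ type, value: phrase })
        i += size
        matched = true
        break
      }
    }
    if (!matched) {
      // Words missing from the lexicon default to objects in this sketch.
      tagged.push({ type: 'object', value: words[i] })
      i += 1
    }
  }
  return tagged
}
```

Because lookup happens before the object fallback, "abreast" is tagged as an adverb instead of falling through to the object default. Compound matches keep their spaced form here; normalizing them to camelCase or hyphens would be a separate step.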

Pattern-Matching vs. Linguistic Parser:

| Example | Pattern-Matching | Linguistic Parser |
|---------|------------------|-------------------|
| "Keep abreast of developments" | [verb:keep] [object:abreast] ❌ | [verb:keep] [adverb:abreast] ✅ |
| "Proceed according to procedures" | [verb:proceed] [object:according] ❌ | [verb:proceed] [prep:according to] ✅ |
| "For each user notify" | [verb:for] [object:each user] ❌ | [conj:forEach] [object:user] ✅ |
| "Work closely with clients" | [verb:work] [object:closely] ❌ | [verb:work] [adverb:closely] ✅ |

Usage

import { parseGraphDLLinguistic, loadLinguisticKnowledge } from 'tasks.org.ai'

// Load linguistic knowledge base
const knowledge = loadLinguisticKnowledge('/path/to/ai')

// Parse task
const components = parseGraphDLLinguistic(
  'Oversee campaigns to promote recycling in communities',
  knowledge
)

// Result:
// [
//   { type: 'verb', value: 'oversee' },
//   { type: 'object', value: 'campaigns' },
//   { type: 'preposition', value: 'to' },
//   { type: 'verb', value: 'promote' },
//   { type: 'object', value: 'recycling' },
//   { type: 'preposition', value: 'in' },
//   { type: 'object', value: 'communities' }
// ]
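The component list can be turned back into a GraphDL predicate by applying the casing conventions. A minimal sketch, assuming the `{ type, value }` shape shown in the result above; `toGraphDL` and `ParsedComponent` are hypothetical names and not exports of `tasks.org.ai`.

```typescript
// Hypothetical shape mirroring the parser result shown above.
interface ParsedComponent {
  type: 'verb' | 'object' | 'preposition' | 'adverb' | 'conjunction'
  value: string
}

function toGraphDL(components: ParsedComponent[]): string {
  return components
    .map(({ type, value }) => {
      const words = value.split(/\s+/)
      if (type === 'object') {
        // Objects are PascalCase: "financial matters" -> "FinancialMatters".
        return words.map((w) => w.charAt(0).toUpperCase() + w.slice(1)).join('')
      }
      if (type === 'preposition') {
        // Compound prepositions are hyphenated: "according to" -> "according-to".
        return words.join('-')
      }
      // Verbs, adverbs, and conjunctions pass through unchanged in this sketch.
      return value
    })
    .join('.')
}
```

Applied to the seven components above, this produces `oversee.Campaigns.to.promote.Recycling.in.Communities`, which matches the task file name format.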

File Format (MDXLD)

All Things use MDXLD format: MDX + Linked Data

Structure:

---
$type: ThingType
$id: https://namespace.org.ai/ThingName
$context: https://schema.org
name: Thing Name
description: Human-readable description
status: public
license: CC-BY-4.0
source: onetonline.org  # or platform.do
category: optional-category
usageCount: 123  # optional
---

# Thing Name

Markdown content with documentation, examples, usage...

Key Fields:

  • $type: Thing type (Verb, Noun, Preposition, Adverb, Conjunction, Task)
  • $id: Unique identifier (URL)
  • $context: Vocabulary context
  • name: Thing name
  • description: Clear description
  • status: Always public in ai/things/
  • license: CC-BY-4.0 for open source
  • source: Original data source
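For quick inspection, the frontmatter fields can be read with a few lines of code. The sketch below assumes flat `key: value` lines between the two `---` markers and does not handle comments, quoting, or nesting; a real YAML parser would be the safer choice in practice. `parseFrontmatter` is a hypothetical name.

```typescript
// Hypothetical minimal reader for flat frontmatter; not a YAML parser.
function parseFrontmatter(source: string): Record<string, string> {
  const lines = source.split('\n')
  const fields: Record<string, string> = {}
  // Frontmatter must open with a --- line.
  if (lines[0].trim() !== '---') return fields
  for (let i = 1; i < lines.length; i++) {
    const line = lines[i]
    // The closing --- ends the frontmatter block.
    if (line.trim() === '---') break
    // Split on the first colon only, so URL values keep their "https:" prefix.
    const separator = line.indexOf(':')
    if (separator === -1) continue
    fields[line.slice(0, separator).trim()] = line.slice(separator + 1).trim()
  }
  return fields
}
```

All values come back as strings, so a field such as `usageCount` would still need converting with `Number(...)`.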

Statistics

Total Things: 27,573

| Category | Count | Source |
|----------|-------|--------|
| Tasks | 24,622 | O*NET Database |
| Verbs | 901 | Extracted from tasks |
| Concepts | 1,497 | Extracted from tasks (freq >= 5) |
| Prepositions | 110 | English grammar |
| Adverbs | 383 | English grammar |
| Conjunctions | 60 | English grammar + programming |

Data Sources

  • O*NET Database (onetonline.org): 24,622 tasks, 901 verbs, 1,497 concepts
  • English Grammar (Cambridge, Oxford): 110 prepositions, 383 adverbs
  • Platform.do: 60 conjunctions/operators (semantic programming)

Coverage

  • O*NET Tasks: 17,538 unique statements → 24,622 variants (1.40x expansion)
  • Verbs: 901 unique action verbs covering 100% of O*NET tasks
  • Concepts: Top 4.7% of objects (1,497 of 31,856) covering ~85% of task references
  • Prepositions: Complete English preposition vocabulary
  • Adverbs: Comprehensive coverage across 6 categories
  • Conjunctions: All logical/control operators for semantic programming
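The headline figures can be cross-checked with simple arithmetic. The snippet below uses only the counts quoted in this README; the variable names are illustrative.

```typescript
// Counts copied from the Statistics table.
const thingCounts: Record<string, number> = {
  tasks: 24622,
  verbs: 901,
  concepts: 1497,
  prepositions: 110,
  adverbs: 383,
  conjunctions: 60,
}

// Total across all categories: 27573.
const totalThings = Object.values(thingCounts).reduce((sum, n) => sum + n, 0)

// Everything except tasks: 2951.
const nonTaskThings = totalThings - thingCounts.tasks

// Variants per unique statement, roughly 1.40.
const expansionFactor = 24622 / 17538

// Share of extracted objects kept as concepts, roughly 4.7 percent.
const conceptSharePercent = (1497 / 31856) * 100
```

The non-task total of 2,951 equals the number of Thing definitions the linguistic parser is stated to load.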

Regeneration

All files are fully regenerable from source data:

# Generate O*NET tasks with GraphDL syntax
tsx packages/onet.org.ai/src/scripts/generate-tasks-consolidated.ts

# Extract components from tasks
tsx packages/onet.org.ai/src/scripts/extract-components.ts

# Generate verbs
tsx packages/onet.org.ai/src/scripts/generate-verb-things.ts

# Generate concepts
tsx packages/onet.org.ai/src/scripts/generate-concept-things.ts

# Generate prepositions
tsx packages/onet.org.ai/src/scripts/generate-prepositions.ts

# Generate adverbs
tsx packages/onet.org.ai/src/scripts/generate-adverbs.ts

# Generate conjunctions
tsx packages/onet.org.ai/src/scripts/generate-conjunctions.ts

Testing

# Test linguistic parser vs pattern-matching
tsx packages/tasks.org.ai/src/scripts/test-linguistic-parser.ts

Shows improvements in:

  • Adverb recognition (abreast, closely, quickly)
  • Compound preposition handling (according to, in accordance with)
  • Phrasal verb recognition (keep abreast of, work closely with)
  • Boolean/conditional operator identification (if, and, or, forEach, elseIf)

Use Cases

1. Natural Language Understanding

Complete English grammar foundation enables accurate parsing of natural language into semantic structures.

2. Business-as-Code

Non-programmers can write executable code in plain English:

"For each customer in the database send welcome email"
→ forEach.Customer.in.Database.do.send.WelcomeEmail
→ for (const customer of database.customers) { send(customer, welcomeEmail) }

3. Semantic Search & Knowledge Graphs

All Things are linked data with unique IDs, enabling semantic queries and graph traversal.
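As an illustration of how the unique IDs are formed, the sketch below rebuilds `$id` values from a type and a name. It covers only the four namespaces that appear in this README's examples; the hosts for prepositions and tasks are not shown here, so they are deliberately left out. `thingId` and `namespaceByType` are hypothetical names.

```typescript
// Namespaces taken from the example frontmatter in this README.
const namespaceByType = new Map<string, string>([
  ['Verb', 'verbs.org.ai'],
  ['Noun', 'concepts.org.ai'],
  ['Adverb', 'adverbs.org.ai'],
  ['Conjunction', 'conjunctions.org.ai'],
])

function thingId(type: string, thingName: string): string {
  const host = namespaceByType.get(type)
  // Fail loudly for types whose namespace is not documented here.
  if (host === undefined) throw new Error(`No namespace known for type: ${type}`)
  return `https://${host}/${thingName}`
}
```

For example, `thingId('Verb', 'oversee')` gives `https://verbs.org.ai/oversee`, which matches the `$id` in the oversee.mdx example.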

4. AI/LLM Integration

Structured vocabulary provides grounding for AI agents, enabling precise semantic understanding and code generation.

5. GraphDL → TypeScript Compiler

Foundation for compiling natural language business rules into executable TypeScript.

Next Steps

  1. Generate remaining parts of speech: Adjectives, pronouns, articles, determiners
  2. Build GraphDL → TypeScript compiler: Transform semantic predicates into executable code
  3. Create visual GraphDL editor: No-code interface for business users
  4. Expand O*NET coverage: Add industries, tech, tools, occupations
  5. Integrate with sdk.do: Enable $.Subject.predicate.Object patterns with natural language

License

  • O*NET Data: CC-BY-4.0 (U.S. Department of Labor, Employment and Training Administration)
  • English Grammar: CC-BY-4.0 (standard linguistic references)
  • Platform-specific: CC-BY-4.0 (open source)

References

  • O*NET Online: https://www.onetonline.org/
  • Cambridge Grammar: Cambridge University Press
  • Oxford English Dictionary: Oxford University Press
  • Platform.do: https://platform.do
  • GraphDL Specification: https://graphdl.org (coming soon)

Built with Claude Code 🤖