NeuralWeaves — Deep-Tech AI Research

Open Standard

Can you trust a photograph anymore?

AI generates convincing fake photos in seconds. EIF is an open standard that evaluates whether a photograph constitutes reliable evidence for a specific claim — not just "is this real?" but "is this trustworthy?"

EIF Analysis Pipeline (9 probe categories active)

photograph.jpg

JPEG Quantization: 0.92
Noise Consistency: 0.87
Lighting Physics: 0.78
Metadata Integrity: 0.95
Sensor Signature: 0.84
GAN Fingerprint: 0.96
Edge Frequency: 0.81
Color Distribution: 0.89
Compression Chain: 0.91

83 Forensic Metrics

9 Probe Categories

Target Domains

Apache 2.0 License

C2PA tracks provenance — who created it, what device, what edits. Deepfake detectors classify real or fake. Neither evaluates whether a photo is reliable evidence for a specific claim.

EIF analyses the photograph itself: compression artifacts, noise patterns, lighting physics, and statistical signatures that separate real sensors from neural networks.

A standard for truth cannot itself be opaque. EIF is open.

Identity Documents · Vehicle Damage · Property Inspection · Product Authenticity
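To make the "reliable evidence for a specific claim" idea concrete, here is a minimal Python sketch that combines the nine probe scores from the pipeline panel into one domain-specific number. This page does not describe how EIF actually aggregates its metrics, so the weighted mean, the `DOMAIN_WEIGHTS` table, and the function names are all illustrative assumptions rather than the standard's method.

```python
# Probe scores copied from the pipeline panel for photograph.jpg.
PROBE_SCORES = {
    "JPEG Quantization": 0.92,
    "Noise Consistency": 0.87,
    "Lighting Physics": 0.78,
    "Metadata Integrity": 0.95,
    "Sensor Signature": 0.84,
    "GAN Fingerprint": 0.96,
    "Edge Frequency": 0.81,
    "Color Distribution": 0.89,
    "Compression Chain": 0.91,
}

# Hypothetical per-domain weights. These are NOT part of the EIF standard;
# they only illustrate that different claims can stress different probes.
DOMAIN_WEIGHTS = {
    "Vehicle Damage": {"Lighting Physics": 3.0, "Edge Frequency": 2.0},
    "Identity Documents": {"Metadata Integrity": 3.0, "GAN Fingerprint": 2.0},
}


def evidence_score(scores, domain):
    """Weighted mean of probe scores; probes without a listed weight count as 1.0."""
    weights = DOMAIN_WEIGHTS.get(domain, {})
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    weighted_sum = sum(value * weights.get(name, 1.0) for name, value in scores.items())
    return weighted_sum / total_weight


def weakest_probe(scores):
    """Name of the probe with the lowest score, i.e. the first thing to inspect."""
    return min(scores, key=scores.get)


if __name__ == "__main__":
    for domain in ("Vehicle Damage", "Identity Documents"):
        print(domain, round(evidence_score(PROBE_SCORES, domain), 3))
    print("weakest probe:", weakest_probe(PROBE_SCORES))
```

Under these assumed weights the same photograph scores lower as vehicle-damage evidence than as identity-document evidence, because its weakest probe (Lighting Physics) is weighted more heavily for the damage claim.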

Patent pending — EIF multi-dimensional metric framework for domain-specific evidence integrity evaluation.

arXiv Published

Tokenizers weren't built for these languages.

BPE shatters agglutinative words into meaningless bytes. VerChol's grammar-first approach decomposes words at morpheme boundaries — preserving grammatical meaning for 500M+ speakers globally.

VerChol — Grammar-First Tokenizer Morpheme-level decomposition

Tamil · பொறுப்பாளர்களுக்கு

"for the responsible persons" — one word, clause-level meaning

BPE (Standard)

பொறுப்பாள

ர்களுக்கு

6 tokens · no meaning preserved

→

VerChol

பொறுப்புஆள

அர்கள்உக்கு

4 tokens · each morpheme meaningful

Kannada · ಅಭಿವೃದ್ಧಿಹೊಂದುತ್ತಿರುವವರಿಗೆ

"for those who are developing" — single agglutinated word

BPE (Standard)

ಅಭಿವೃದ್ಧ

ಿಹೊಂದುತ್ತ

ಿರುವವರಿಗೆ

9 tokens · grammar destroyed

→

VerChol

ಅಭಿವೃದ್ಧಿ

ಹೊಂದುತ್ತಿರು

ವವರಇಗೆ

5 tokens · grammar preserved

Existing tokenizers such as SentencePiece, BPE, and WordPiece were designed around analytic languages like English. They systematically fail on agglutinative languages, where a single word carries clause-level meaning through grammatical suffixing.

VerChol's grammar-first approach achieves a 3.1% improvement in fertility (the average number of tokens per word) over BPE on agglutinative language benchmarks by decomposing words into grammatically meaningful morphemes instead of statistically frequent byte pairs. Published on arXiv.
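Fertility is usually computed as the mean number of tokens a tokenizer emits per word, so lower is better. The sketch below shows that calculation in Python, using only the token counts quoted for the two example words above (6 vs 4 for the Tamil word, 9 vs 5 for the Kannada word). It assumes "improvement" means a relative reduction in tokens per word; the function names are illustrative and nothing here reproduces the benchmark behind the 3.1% figure.

```python
def fertility(token_counts):
    """Mean number of tokens per word; lower means less fragmentation."""
    if not token_counts:
        raise ValueError("need at least one word")
    return sum(token_counts) / len(token_counts)


def relative_reduction(baseline, candidate):
    """Fraction by which `candidate` fertility is lower than `baseline`."""
    return (baseline - candidate) / baseline


# Token counts for the two example words shown above (Tamil, Kannada).
BPE_COUNTS = [6, 9]
VERCHOL_COUNTS = [4, 5]

if __name__ == "__main__":
    bpe_f = fertility(BPE_COUNTS)
    verchol_f = fertility(VERCHOL_COUNTS)
    print(f"BPE fertility: {bpe_f:.2f}")
    print(f"VerChol fertility: {verchol_f:.2f}")
    print(f"relative reduction: {relative_reduction(bpe_f, verchol_f):.1%}")
```

Two hand-picked long words are not a benchmark, so the reduction this prints is far larger than a corpus-level figure would be; a real evaluation averages over every word in a test set.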

BharatMini — Low-Cost Domain Training

Alongside the tokenizer, we demonstrated narrow-domain model training at a cost of ₹2,700, showing that domain-specific AI for manufacturing and robotics doesn't require massive compute.

Tamil · Kannada · Turkish · Finnish · Korean · Hungarian

Novel Architecture

What's missing is the signal.

Every AI safety framework detects what IS present. SenseAi inverts this: it detects what SHOULD be present but ISN'T. A fundamentally different computational problem.

The architecture uses a four-state processing model that classifies signal streams by the absence of expected patterns. In autism: the absence of expected physiological variability predicts crisis. In manufacturing: a missing sensor reading means failure.

Applications

Autism Meltdown Prediction

On-device wearable detecting absence of expected physiological patterns. nRF52840 + 5 sensors. ₹2,999 with BPL subsidy.

Safety-Critical Monitoring

Industrial systems where missing signals indicate failure — sensor arrays, pipeline monitoring, structural health.

Surveillance Gap Detection

Identifying what camera networks are NOT covering — blind zones, degraded sensors, time windows.

Behavioural Analysis

Detecting omitted disclosures, missing responses — financial compliance, workplace safety, child protection.

SenseAi — Four-State Signal Model

Normal

Expected variability present — all sensors reporting

Reduced

Variability decreasing — pattern flattening detected

Absent

Expected signal missing — predictive window open

Crisis

Absence confirmed — intervention recommended
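The four states above can be read as a small state machine driven by how much of the expected variability is still present. The Python sketch below is one possible reading, not the SenseAi implementation: the thresholds, the use of standard deviation as the variability measure, and the rule that several consecutive absent windows confirm a crisis are all assumptions made for illustration.

```python
from enum import Enum
from statistics import pstdev


class SignalState(Enum):
    NORMAL = "Normal"
    REDUCED = "Reduced"
    ABSENT = "Absent"
    CRISIS = "Crisis"


# Hypothetical thresholds, chosen only for this example.
REDUCED_RATIO = 0.5   # variability below 50% of baseline: pattern flattening
ABSENT_RATIO = 0.1    # variability below 10% of baseline: expected signal missing
CONFIRM_WINDOWS = 3   # consecutive absent windows before absence is confirmed


def classify_window(samples, baseline_std, absent_streak):
    """Classify one window of readings.

    `samples` may contain None for readings that never arrived.
    Returns (state, updated_absent_streak).
    """
    if baseline_std <= 0:
        raise ValueError("baseline_std must be positive")
    present = [s for s in samples if s is not None]
    # With fewer than two readings there is no measurable variability.
    ratio = pstdev(present) / baseline_std if len(present) >= 2 else 0.0

    if ratio < ABSENT_RATIO:
        absent_streak += 1
        if absent_streak >= CONFIRM_WINDOWS:
            return SignalState.CRISIS, absent_streak
        return SignalState.ABSENT, absent_streak
    if ratio < REDUCED_RATIO:
        return SignalState.REDUCED, 0
    return SignalState.NORMAL, 0


if __name__ == "__main__":
    streak = 0
    windows = [[60, 72, 65, 80, 58], [66, 68, 67, 70, 64], [67] * 5, [67] * 5, [None] * 5]
    for window in windows:
        state, streak = classify_window(window, baseline_std=8.0, absent_streak=streak)
        print(state.value)
```

The caller carries `absent_streak` from one window to the next, which is what lets a single flat window open a predictive window (Absent) without immediately recommending intervention (Crisis).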

Patent in preparation — absence-detection signal processing architecture

AI-Native Operating System

Intelligence as a file operation.

Any program, in any language, that can read and write files can now use AI. No SDKs. No API keys. No cloud dependency.

tharai — /dev/ai

# Listen and transcribe

$ cat audio.wav > /dev/ai/hear

→ "Turn off the lights in the living room"

# Understand an image

$ cat photo.jpg > /dev/ai/see

→ "A chest X-ray showing mild pleural effusion"

# Speak

$ echo "Report complete" > /dev/ai/speak

→ [audio plays through system speaker]
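Because the interface is just files, the same calls work from any language. The Python sketch below wraps the pattern from the shell session above in one helper. The session only shows the write side, so reading the reply back from the same path is an assumption, and `ask_device` is a hypothetical helper name, not part of TharAI.

```python
from pathlib import Path


def ask_device(device, payload):
    """Write a request to an AI device file, then read the reply as text.

    Assumption: the reply is read back from the same path that was written.
    The real read protocol may differ.
    """
    with open(device, "wb") as dev:
        dev.write(payload)
    with open(device, "rb") as dev:
        return dev.read().decode("utf-8", errors="replace").strip()


if __name__ == "__main__":
    speak = Path("/dev/ai/speak")
    if speak.exists():
        # Same request as: echo "Report complete" > /dev/ai/speak
        ask_device(speak, b"Report complete\n")
    else:
        print("no /dev/ai/speak on this system; nothing to do")
```

No SDK or API key appears anywhere in the helper: the only dependencies are `open`, `write`, and `read`, which is the point of exposing models as device files.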

/hear

Voice & Audio

Speech-to-text, audio classification, sound event detection. Write audio data to a file — get intelligence back.

/dev/ai/hear

/see

Vision

Image classification, object detection, document understanding. Any camera, any image, any format.

/dev/ai/see

/think

Language & Reasoning

Summarization, Q&A, translation, analysis. Local models by default. Cloud when you choose.

/dev/ai/think

Research & Publications

Published. Open. Cited.

We build from first principles and publish our research. Open standards, open code, open papers.

Standard

Evidence Integrity Framework

Open standard for forensic image verification. 83 metrics, 9 probe categories, Apache 2.0. Patent pending.

eif-format.org →

Paper

VerChol — Grammar-First Tokenization

3.1% fertility improvement over BPE for agglutinative languages. Grammar-first morpheme decomposition. Published on arXiv.

arXiv →

Arch

SenseAi — Absence-Detection Architecture

Four-state signal processing for safety-critical systems. Patent in preparation.

Patent in preparation

TharAI — AI-Native Operating System

POSIX-style /dev/ai file primitives for intelligence. Local-first. Model-agnostic. Open-core.

tharai.dev →

Open standards. Novel architectures. Edge deployment.
