AWA Data Flow: every layer, every action
How a piece of information becomes graph truth. Core Knowledge Contract: source_log is the single entry; enrichment is automatic; agent-origin data never auto-promotes (you approve). Node labels name the real tables / modules.
1 · INGESTION: capture, tag origin
gmail
signal
whatsapp / WAD
imessage
sms
calendar
web fetch
task thinking
integrator decision
autocal / resolver
audit finding
ORCH "remember X"
action: poll / webhook / agent-write → log_source(content, origin)
external = ground truth
agent = LLM-derived (review-only)
user = you asserted
▼ stamped with metadata_origin
2 · SINGLE ENTRY
source_log: origin · source_name · source_ref · content · metadata
action: INSERT-OR-IGNORE dedup on (source_name, source_ref) → triggers maybe_refresh_db_flow_for_source
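The single-entry write can be sketched as follows. This is a minimal illustration, not the actual AWA code: it assumes SQLite as the store, a UNIQUE constraint on (source_name, source_ref), and a log_source signature extended with a connection and the two dedup columns (the diagram only shows log_source(content, origin)). The helper open_db and the column types are hypothetical.

```python
import sqlite3

# The three origins named in the ingestion layer.
VALID_ORIGINS = {"external", "agent", "user"}


def open_db(path=":memory:"):
    """Hypothetical helper: create source_log with the columns from the diagram."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS source_log (
            id          INTEGER PRIMARY KEY,
            origin      TEXT NOT NULL,
            source_name TEXT NOT NULL,
            source_ref  TEXT NOT NULL,
            content     TEXT NOT NULL,
            metadata    TEXT NOT NULL DEFAULT '{}',
            UNIQUE (source_name, source_ref)  -- assumed dedup constraint
        )
        """
    )
    return conn


def log_source(conn, content, origin, source_name, source_ref, metadata="{}"):
    """Insert one row; return True only if the row is new.

    A True result is the point at which a caller would trigger
    maybe_refresh_db_flow_for_source (not implemented in this sketch).
    """
    if origin not in VALID_ORIGINS:
        raise ValueError(f"unknown origin: {origin!r}")
    cur = conn.execute(
        "INSERT OR IGNORE INTO source_log "
        "(origin, source_name, source_ref, content, metadata) "
        "VALUES (?, ?, ?, ?, ?)",
        (origin, source_name, source_ref, content, metadata),
    )
    conn.commit()
    # rowcount is 1 for a fresh insert and 0 when the row was ignored.
    return cur.rowcount == 1
```

Returning a boolean keeps the dedup decision inside the database, so a repeated poll or webhook delivery of the same (source_name, source_ref) pair is a no-op rather than a second enrichment run.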
▼
3 · DB FLOW: enrichment pipeline (rebuild_universe)
a · mechanical dedup: dedup_key, content_fingerprint, evidence_id
↓
b · entity recognition: deterministic_key → entities (status=candidate)
↓
c · enrichment: normalize metadata, scope, artifact_kind, sender_display
↓
d · conversation grouping: source_conversations, conversation_segments
↓
e · semantic threading: semantic_threads, thread_sources, thread_outputs, event_candidates
raw_sources: enriched evidence
entities: candidate
source_entities: links
semantic_threads
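Step a (mechanical dedup) names a dedup_key and a content_fingerprint but does not say how they are computed. The sketch below shows one plausible way, under loud assumptions: the fingerprint is taken to be a SHA-256 hash of whitespace-normalized, lowercased content, and the dedup_key is taken to be the source_name and source_ref joined by a colon. Both function bodies are hypothetical and only illustrate the idea of a key that is stable across re-ingestion.

```python
import hashlib
import re


def content_fingerprint(content: str) -> str:
    """Assumed scheme: hash the content after collapsing whitespace and case."""
    normalized = re.sub(r"\s+", " ", content).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedup_key(source_name: str, source_ref: str) -> str:
    """Assumed scheme: mirror the (source_name, source_ref) pair used by source_log."""
    return f"{source_name}:{source_ref}"
```

Under this scheme the same message forwarded through two channels gets two different dedup_key values but one shared content_fingerprint, which is what lets later stages treat them as the same evidence.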
▼ enrichment feeds BOTH branches
4A · KNOWLEDGE: patterns
observations: dated datapoints (mention · measurement · transaction · relation_start/end · reference)
action: classifier routes into the temporal model
▼ accumulate over time
pattern_detector: scores forming hypotheses, decides promotion
▼
Intuitions: knowledge table
status: forming → promoted / discarded / dormant
type: hard_fact · preference · judgment
idempotent via intuition_key
review: /api/intuitions/proposed → confirm / dismiss
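The intuition lifecycle above can be sketched as a small state machine. This is an in-memory illustration, not the real table: the class names, the statement field, and the choice to reject any transition out of a non-forming status are assumptions (the diagram only shows transitions leaving forming, and does not say whether a dormant intuition can revive).

```python
from dataclasses import dataclass

INTUITION_TYPES = {"hard_fact", "preference", "judgment"}

# Only the transitions shown in the diagram; anything else is rejected here.
ALLOWED_TRANSITIONS = {"forming": {"promoted", "discarded", "dormant"}}


@dataclass
class Intuition:
    intuition_key: str
    type: str
    statement: str  # hypothetical field holding the hypothesis text
    status: str = "forming"


class IntuitionStore:
    """In-memory stand-in for the knowledge table, keyed by intuition_key."""

    def __init__(self):
        self._rows = {}

    def __len__(self):
        return len(self._rows)

    def upsert(self, intuition_key, type_, statement):
        """Idempotent: re-detecting the same key returns the existing row."""
        if type_ not in INTUITION_TYPES:
            raise ValueError(f"unknown type: {type_!r}")
        existing = self._rows.get(intuition_key)
        if existing is not None:
            return existing
        row = Intuition(intuition_key, type_, statement)
        self._rows[intuition_key] = row
        return row

    def transition(self, intuition_key, new_status):
        """Move a row to a new status if the diagram allows it."""
        row = self._rows[intuition_key]
        if new_status not in ALLOWED_TRANSITIONS.get(row.status, set()):
            raise ValueError(f"{row.status} -> {new_status} is not allowed")
        row.status = new_status
        return row
```

Keying on intuition_key means the pattern detector can run repeatedly over the same observations without creating duplicate hypotheses or resetting a status that review has already decided.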
4B · PROPOSALS: transient corrections
issue_type: identity_mismatch · autocal_ambiguity · autocal_policy · entity_link · spam_candidate
+ specialised proposal tables
identity_resolution_actions: merge X=Y
entity_link_proposals: relations
thread_hub_proposals
source_blacklist_proposals
review: /api/proposals → apply / dismiss
dismiss → rejected_proposals (tombstone, dedup)
▼ rule 8: agent-origin NEVER auto-promotes · you approve · safe hard-id links auto only for external/user
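Rule 8 and the dismiss tombstone can be sketched together as a gate in front of promotion. The names ProposalQueue, can_auto_apply, and the string results are hypothetical, and the sketch assumes that a tombstone suppresses every later proposal with the same key, including ones that would otherwise auto-apply; the diagram does not spell that out.

```python
# Origins for which a safe hard-id link may apply without review (rule 8).
AUTO_ELIGIBLE_ORIGINS = {"external", "user"}


def can_auto_apply(origin: str, safe_hard_id_link: bool) -> bool:
    """Agent-origin data never auto-applies, whatever the link type."""
    return safe_hard_id_link and origin in AUTO_ELIGIBLE_ORIGINS


class ProposalQueue:
    """In-memory stand-in for the proposal tables and rejected_proposals."""

    def __init__(self):
        self.pending = {}      # proposal key -> origin
        self.rejected = set()  # tombstones, standing in for rejected_proposals

    def propose(self, key, origin, safe_hard_id_link=False):
        if key in self.rejected:
            return "suppressed"  # dismissed before; do not ask again
        if can_auto_apply(origin, safe_hard_id_link):
            return "auto_applied"
        self.pending[key] = origin
        return "pending_review"

    def dismiss(self, key):
        """Reviewer said no: drop the proposal and leave a tombstone."""
        self.pending.pop(key, None)
        self.rejected.add(key)

    def apply(self, key):
        """Reviewer said yes: the proposal leaves the queue for promotion."""
        del self.pending[key]  # raises KeyError if nothing was pending
        return "applied"
```

The tombstone is what makes a dismissal durable: the next pipeline rebuild will regenerate the same candidate, and without the rejected set the reviewer would see it again on every run.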
5 · PROMOTION (on your approval)
promotion.py: writes the approved attribute / relation onto the canonical entity
▼
6 · CANONICAL GRAPH (truth)
contacts
places
organizations
things
services
comm_channels
events
entity_links: relations
entity_features: tags
knowledge: promoted attributes
Agent-world (separate, NOT in this flow): after a task, the retro (7-question post-mortem) → post_mortems.md + activity_log. That is the agent reflecting on its own work, not data about your world; it never enters the pipeline.
Known gaps (do not yet flow): file / image attachments live on disk (/var/lib/awa/attachments), not in source_log. Calendar also has a direct events-table route alongside source_log.