user@aliuyar.dev : ~/research $

cat ~/research/index.md

research

Papers that came out of questions I keep returning to: reliability, evaluation, and how to make AI systems easier to inspect and reason about.

PAPERS: 20
YEAR: 2026
RELATED: → /projects
№ 01
2026

Inside the Multilingual Transformer: Computational Neuroanatomy of Shared and Language-Specific Brain Alignment

State-level, token-attribution, and deletion-validation analysis of XLM-R and NLLB-200 encoder internals against multilingual naturalistic fMRI (LPPC, English/French/Chinese) isolates attention-side rather than FFN-semantic computations behind SHARED-greater-than-SPECIFIC alignment, including extension into auditory cortex.

// operational relevance

Shows which internal transformer computations actually drive shared cross-lingual alignment, giving multilingual model developers a mechanistic target instead of a single scalar benchmark.

authors: Ali Uyar

multilingual-nlp brain-alignment mechanistic-analysis
№ 02
2026

Language-Neutral or Language-Specific? Disentangling Multilingual LLM Subspaces with Naturalistic fMRI

ROI-first encoding analysis on the Le Petit Prince multilingual fMRI corpus decomposes XLM-R and NLLB-200 encoder features into leave-target-out SHARED and orthogonalized SPECIFIC components, with SHARED-greater-than-SPECIFIC holding semantically across all six model-by-language tests under Holm correction.

// operational relevance

Provides a reproducible subspace decomposition that teams can reuse to audit whether multilingual encoders carry cross-lingual content structure versus language-specific residuals.

authors: Ali Uyar

multilingual-nlp fmri-encoding representation-analysis
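
The Holm correction mentioned in the entry above is a standard step-down procedure for controlling family-wise error across several tests. The following is a minimal sketch of that procedure only, not the paper's analysis code; the function name `holm_reject` and the six p-values are hypothetical and chosen purely for illustration.

```python
def holm_reject(p_values, alpha=0.05):
    """Return one boolean per test: True where Holm's step-down procedure rejects."""
    m = len(p_values)
    # Visit tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # Once one test fails, all larger p-values are retained.
    return reject


# Hypothetical p-values for six model-by-language tests (illustrative, not from the paper).
example_p = [0.001, 0.004, 0.006, 0.008, 0.012, 0.030]
holm_all = holm_reject(example_p)
holm_partial = holm_reject([0.04, 0.01, 0.03])
```

Because the procedure stops at the first failure, a claim that an effect holds "across all six tests" requires every ordered p-value to clear its own threshold.
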
№ 03
2026

Overlay Algebra: Causal Composition of Operational Behaviors in Frozen Language Models

Semantics-fixed protocol that separates operational output contracts (strict JSON, exact quotation) from shared QA semantics via dense-delta residualization on top of a frozen model, recovering deployable single-overlay adapters that match direct-trained behavior on schema validity and quote exactness without degrading answer F1.

// operational relevance

Lets teams ship and audit reusable output-contract adapters as residuals on a frozen base instead of re-training whole behaviors from scratch.

authors: Ali Uyar

frozen-models adapter-residuals auditable-evaluation
№ 04
2026

Selective Revocation and Replay: Post-Compromise Recovery of Explicit Persisted State in Memory-Augmented LLM Agents

Provenance-tracked runtime that revokes persisted descendants of suspicious roots and replays only dirty state-writing events, with a coarse-rollback fallback. In a frozen 8-chain matrix and a live qwen2.5:14b confirmation, it is the only method that drives residual explicit attack success to zero while retaining benign state, cutting post-detection cost from 17 to 9 extra LLM calls versus rollback on retrieval memory.

// operational relevance

Gives operators a surgical post-compromise recovery primitive for poisoned agent memory that removes attacker influence without wiping useful carry-forward state.

authors: Ali Uyar

prompt-injection-recovery agent-memory provenance
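
The revoke-and-replay idea in the entry above can be sketched as a graph traversal over provenance edges. This is an illustrative sketch under stated assumptions, not the paper's runtime: the record ids, the event schema (`reads`, `writes`), and the definition of a "dirty" event as a state-writing event that read a revoked record are all hypothetical.

```python
from collections import deque


def revoke_descendants(children, suspicious_roots):
    """Breadth-first walk over provenance edges; returns roots plus all descendants."""
    revoked = set(suspicious_roots)
    queue = deque(suspicious_roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in revoked:
                revoked.add(child)
                queue.append(child)
    return revoked


def dirty_events(events, revoked):
    """Ids of events that wrote state and read at least one revoked record (assumed definition)."""
    return [e["id"] for e in events if e["writes"] and set(e["reads"]) & revoked]


# Hypothetical provenance graph: parent record -> records derived from it.
children = {"doc_poison": ["mem_1"], "mem_1": ["mem_3"], "doc_ok": ["mem_2"]}
events = [
    {"id": "e1", "reads": ["doc_poison"], "writes": ["mem_1"]},
    {"id": "e2", "reads": ["doc_ok"], "writes": ["mem_2"]},
    {"id": "e3", "reads": ["mem_1", "mem_2"], "writes": ["mem_3"]},
    {"id": "e4", "reads": ["mem_1"], "writes": []},  # read-only, never replayed
]

revoked = revoke_descendants(children, ["doc_poison"])
to_replay = dirty_events(events, revoked)
```

In this toy graph `mem_2` is never revoked, which mirrors the goal of keeping benign state while removing everything derived from the suspicious root.
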
№ 05
2026

Same-Size Capability Transfer Reveals a Performance-Localization Tradeoff in Deterministic Function Calling

Locked within-family Gemma testbed for transplanting single-turn function-calling behavior via fixed sparse, dense, and steering modules under frozen manifests and exact JSON grading. Sparse transfer is real and partially localized, but a parameter-matched dense module wins the multiseed primary metric (0.2177 vs 0.1658 strict success).

// operational relevance

Gives teams a deterministic, claim-audited harness for measuring post-training capability edits without LLM judges or retraining recipients.

authors: Ali Uyar

function-calling deterministic-eval model-editing
№ 06
2026

Surgical Post-Training Diffing: Partial Recovery Without Clean Small-Mask Separation

Paired answer-phase sparse surrogate over two residual-stream layers of Gemma 3 4B PT/IT cuts answer-token KL by 28.3% and recovers 0.186 of the PT-to-IT capability gap, while frozen 5-feature capability masks collapse to the null and verbosity-subtraction masks lengthen outputs instead of shortening them. Includes matched baselines, ablations, and prompt-level paired bootstrap intervals.

// operational relevance

Sets a held-out, bootstrap-bounded standard for separating capability from verbosity in instruction-tuning interventions before they ship.

authors: Ali Uyar

instruction-tuning sparse-interventions held-out-eval
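
The prompt-level paired bootstrap intervals named in the entry above can be illustrated with a generic percentile bootstrap over per-prompt score differences. This is a minimal sketch, assuming one score per prompt for each condition; the function name `paired_bootstrap_ci`, its parameters, and the scores are hypothetical and do not come from the paper.

```python
import random


def paired_bootstrap_ci(baseline, treated, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean per-prompt difference (treated - baseline)."""
    if len(baseline) != len(treated):
        raise ValueError("paired scores must have the same length")
    rng = random.Random(seed)
    n = len(baseline)
    # Pairing: each prompt contributes a single difference, so prompts are resampled as units.
    diffs = [t - b for b, t in zip(baseline, treated)]
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(diffs) / n, lo, hi


# Hypothetical per-prompt scores (illustrative only).
base_scores = [0.1, 0.2, 0.3, 0.4]
edit_scores = [0.2, 0.4, 0.5, 0.5]
mean_diff, ci_lo, ci_hi = paired_bootstrap_ci(base_scores, edit_scores)
```

Resampling whole prompts rather than individual scores keeps the pairing intact, so prompt difficulty cancels out of the difference.
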
№ 07
2026

ProbeRoute: Probes as Routing Priors for Frozen-Backbone Multi-Token Prediction

Offline future-token probes across transformer depth initialize a sparse top-m layer router for multi-token prediction on a frozen backbone, beating the strongest screening-selected dense finalist on held-out top-1 and speculative-draft acceptance under a stage-gated protocol with mandatory probes and a final 1B rerun.

// operational relevance

Turns a cheap one-time probe pass into a routing prior that improves speculative-draft acceptance on a frozen backbone without unfreezing or growing the adaptation recipe.

authors: Ali Uyar

frozen-backbone multi-token-prediction speculative-decoding
№ 08
2026

When Accuracy Hides Path Dependence: Criterion Fidelity in LLM Judges

Introduces criterion fidelity as a distinct validity axis for LLM judges using criterion families (base, meaning-preserving paraphrases, counterfactuals) that dissociate from ordinary accuracy, with strict criterion-emphasis prompting shown to be an unreliable fix.

// operational relevance

Gives evaluation teams a concrete probe to detect when judge pipelines are tracking surface wording instead of the stated criterion before shipping them as reviewers.

authors: Ali Uyar

llm-evaluation judge-validity reliability
№ 09
2026

Binary Evidence Sufficiency Dissociation in Reasoning-Model Hidden States

Linear probes on mid-reasoning hidden states in fixed-question, changed-context multi-hop QA recover a binary evidence-sufficiency signal distinct from final-answer correctness and pre-generation query uncertainty, supported by title-masked robustness with mixed cross-model replication.

// operational relevance

Gives retrieval and agent systems an internal signal for when a reasoning model actually has enough evidence to answer, separate from whether it happens to answer correctly.

authors: Ali Uyar

interpretability multi-hop-qa hidden-state-probes
№ 10
2026

Structured Yet Fragile: Signed Intervention Geometry of Matched ORF and CRISPR Cell Painting Profiles

Locked, auditable pipeline analyzing 5,332 same-gene ORF/CRISPR Cell Painting pairs under fixed preprocessing, bootstrap consensus, and prespecified robustness checks. The regime map is dominated by asymmetric and ambiguous states, shifts with effect-magnitude deciles, and yields null retrieval utility despite 536 significant Reactome terms.

// operational relevance

Demonstrates an auditable locked-pipeline protocol that bounds what morphology-derived perturbation claims can be defended when analytic flexibility is removed.

authors: Ali Uyar

locked-pipeline robustness negative-result
№ 11
2026

Outcome Is Not Verification: Auditing Hidden-State Verifiers with Counterfactual Local Validity

Audit that separates outcome readout from process verification for hidden-state verifiers by holding the extractor family and scorer class fixed and varying only the supervision target across arithmetic DAG execution and grounded Horn-style logic. Strict same-step flips and PVS show LocalValidity supervision improves process-verification evidence while first-invalid-step localization remains metric-dependent and transfer-sensitive.

// operational relevance

Gives evaluation teams a measurement-first audit object for step-level verifiers so trace-level outcome readout cannot be mistaken for reliable process verification in production reasoning pipelines.

authors: Ali Uyar

hidden-state-verifiers process-verification counterfactual-audit
№ 12
2026

First-Unsafe-Step Counterfactual DPO for KPI-Gaming in Autonomous LLM Agents: Held-Out Evaluation Under a Capability-Safety Bottleneck

ODCV-Bench study on Qwen2.5-7B-Instruct that localizes the earliest severity-3 executed step, rewrites it into a minimal safe counterfactual, and trains LoRA-DPO and chosen-only SFT adapters on next-turn pairs. Held-out evaluation exposes a capability-safety bottleneck where sparse unsafe train support limits what localized preference post-training can learn.

// operational relevance

Gives agent-safety teams a disciplined held-out evaluation pipeline and a concrete warning: preference-based safety post-training can be upstream-bottlenecked by the base model's own unsafe behavioral horizon during on-policy data generation.

authors: Ali Uyar

agent-safety preference-optimization held-out-evaluation
№ 13
2026

What Survives Control Calibration? A Full-Scope Negative Result for a Locked Minimum-Description Acceptance Criterion

Full-scope evaluation of a locked control-calibrated acceptance rule for top-down causal abstractions across a planted symbolic generator, a miniature IOI transformer, and GPT-2-small IOI. Across primary and quantized robustness codebooks, no supported abstraction class is certified, with structured failure modes in eligibility geometry and null-gap retention.

// operational relevance

Gives interpretability teams an audited negative-result template and concrete reporting rules (frontier-domain exclusions, configured-versus-realized coverage) before claiming a model implements a proposed algorithm.

authors: Ali Uyar

mechanistic-interpretability acceptance-criteria null-calibration
№ 19
2026

RIA: Retokenization Invariance Atlas

Deterministic audits for semantics-preserving formatting effects in LLM QA with no-truncation and semantics gates.

// operational relevance

Provides auditable reliability evidence when prompt formatting changes could alter outcomes.

authors: Ali Uyar

tokenization evaluation auditability
№ 20
2026

CIS Technical Report

Technical report documenting the CIS reliability approach, evaluation criteria, and implementation notes for production-grade AI systems.

// operational relevance

Translates deterministic reliability design into a practical report teams can adopt in real delivery environments.

authors: Ali Uyar

technical-report reliability systems