path: root/src
Commit message | Author | Age | Files | Lines
* ocda public surface + dub.json import-path and dyaml cleanups [HEAD, main] | Ralph Amissah | 12 days | 1 | -0/+68
    Three small follow-ups to the ocda/outputs split:
    1. Add src/sisudoc/ocda/package.d (module sisudoc.ocda) as a 2-line public re-export of sisudoc.ocda.abstraction. Provides downstream consumers with a canonical "import sisudoc.ocda;" entry point and a stable handle for eventual peer-repo packaging of the abstraction library.
    2. Fix the D import-path root in dub.json so it matches the declared module names:
       - spine:abstraction sub-package "importPaths": [ "./src/sisudoc" ] -> [ "./src" ]
       - main package buildTypes (dmd, ldc2, ldmd2, gdc, gdmd) "-I=src/sisudoc" -> "-I=src"
       The modules are named sisudoc.ocda.* / sisudoc.outputs.* / sisudoc.* so the filesystem-based resolver needs to see ./src as the root (so <root>/sisudoc/ocda/X.d resolves).
    3. Replace dyaml sub-package's destructive preGenerateCommands ("rm -rf ./src/ext_depends/D-YAML/{examples,testsuite}") with declarative excludedSourceFiles globs. The two directories do not exist in the vendored D-YAML tree, so the rm was a no-op in practice; the glob form is defensive (would silently skip them if they were ever re-introduced) and removes the destructive side-effect from every build.
    (assisted by Claude-Code)
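A minimal sketch of why the import root matters, assuming a resolver that simply maps a dotted module name onto a path below the root. `module_to_path` is a hypothetical helper written in Python for illustration; it is not part of dub or of any D compiler, and it ignores package.d modules.

```python
def module_to_path(import_root, module_name):
    """Map a dotted D module name to the file a filesystem-based
    resolver would look for under import_root (illustrative only)."""
    return import_root.rstrip("/") + "/" + module_name.replace(".", "/") + ".d"


# With ./src as the root the declared module name lines up with the tree.
good = module_to_path("./src", "sisudoc.ocda.abstraction")

# With ./src/sisudoc as the root the "sisudoc" component is doubled,
# so the lookup points at a path that does not exist.
bad = module_to_path("./src/sisudoc", "sisudoc.ocda.abstraction")

print(good)
print(bad)
```

The doubled `sisudoc/sisudoc` component in the second result is the mismatch that the "-I=src" change removes.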
* ocda + outputs split: module/import + dub.json fixups | Ralph Amissah | 12 days | 50 | -271/+271
    Modules and imports rewritten to sisudoc.ocda.* and sisudoc.outputs.*; dub.json excludedSourceFiles and the spine:abstraction sub-package sourcePaths collapsed to ./src/sisudoc/ocda.
    Verified: nix build .#spine-overlay-ldc clean.
    (assisted by Claude-Code)
* separate abstraction lib from output processing | Ralph Amissah | 12 days | 49 | -0/+0
    create new directories ocda & outputs under ./src/sisudoc in order to separate the document abstraction library from downstream output processing (stuff broken till paths & modules fixed)
* org files out of sync, fix [sisudoc-spine_v0.20.0] | Ralph Amissah | 14 days | 3 | -1963/+0
    (also cgi_sqlite_search_form.d did not belong here)
* css: html (additional) tags alignment | Ralph Amissah | 2026-05-22 | 1 | -0/+204
    css: align body-flow <ul>/<li> & <details>/<summary> with <p>
    Not used by sisudoc-spine but for hand-authored body-flow markup such as the current homepage / body-flow, added block to each of the four html CSS string heredocs in src/sisudoc/io_out/xmls_css.d
    Existing tags are left in place and untouched.
    (assisted by Claude-Code)
* decouple abstraction phase1:2 | Ralph Amissah | 2026-05-22 | 3 | -7/+10
    phase1 step2: move SSP serialiser into sisudoc.abstraction package
    git mv src/sisudoc/io_out/create_abstraction_txt.d to src/sisudoc/abstraction/ssp.d
    Module rename: sisudoc.io_out.create_abstraction_txt -> sisudoc.abstraction.ssp
    Completes phase1: after this commit the sisudoc.abstraction package has zero outgoing edges into sisudoc.io_out. The library produces both the in-memory document object model AND the .ssp text serialisation without referencing any output-side module.
    The serialiser previously imported sisudoc.io_out.paths_output for the single purpose of constructing the .ssp output path. That import is dropped; the path construction is inlined as three lines of std.path (chainPath / asNormalizedPath / array) producing <output_path>/<language>/abstraction/<doc_uid_out>.ssp - byte-for-byte the same path the previous spineOutPaths!() call produced.
    Updated:
    - src/sisudoc/abstraction/ssp.d - module decl + inline path
    - src/sisudoc/abstraction/package.d - public import .ssp
    - src/sisudoc/spine.d - import sisudoc.abstraction.ssp (x2)
    Completes decouple abstraction phase1
    (assisted by Claude-Code)
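The inlined path construction can be sketched as follows. This is a Python analogue of the chain-then-normalize step, not the D code from the commit; `ssp_output_path` and its sample arguments are hypothetical.

```python
import posixpath


def ssp_output_path(output_path, language, doc_uid_out):
    """Join the components and normalize, mirroring the shape
    <output_path>/<language>/abstraction/<doc_uid_out>.ssp."""
    joined = posixpath.join(output_path, language, "abstraction", doc_uid_out + ".ssp")
    return posixpath.normpath(joined)


# Sample values are made up for illustration.
print(ssp_output_path("/tmp/out/", "en", "sample_doc"))
print(ssp_output_path("./out", "fr", "sample_doc"))
```

Normalizing after joining means a trailing slash or a leading "./" in the output path does not change the resulting file location.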
* decouple abstraction phase1:1 | Ralph Amissah | 2026-05-22 | 1 | -0/+85
    phase1 step1: introduce sisudoc.abstraction package re-export surface
    Create src/sisudoc/abstraction/package.d as a library-facing re-export module for the document-abstraction stage. The surface currently re-exports:
    - sisudoc.meta.metadoc (spineAbstraction, A-layer entry)
    - sisudoc.meta.metadoc_from_src (docAbstraction, B-layer entry)
    No code moves; no behaviour change. The package exists so external consumers can `import sisudoc.abstraction;` and reach the entry points without depending on spine's internal directory layout.
    (assisted by Claude-Code)
* decouple abstraction phase0:2 | Ralph Amissah | 2026-05-22 | 4 | -6/+6
    phase0 step2: move curation modules from meta/ to io_out/curate/
    Curation modules moved to src/sisudoc/io_out/curate/, module declarations renamed sisudoc.io_out.curate.metadoc_curate* from sisudoc.meta.metadoc_curate* and updated spine.d imports. File contents are otherwise unchanged.
    Completes phase0: meta/ now has zero io_out imports - the abstraction core's outgoing deps are now only: meta/ internals + io_in/ + ext_depends/D-YAML
    (assisted by Claude-Code)
* decouple abstraction phase0:1 | Ralph Amissah | 2026-05-22 | 1 | -2/+0
    phase0: drop vestigial io_out.hub coupling from meta/metadoc.d
    phase0 step1: abstraction-library extraction/decoupling: meta/ should not import io_out/.
    Removed unused `import sisudoc.io_out.hub;` and `mixin outputHub;` from `template spineAbstraction()` (the load-bearing UFCS site is spine.d:92, which has its own `mixin outputHub`).
    (assisted by Claude-Code)
* source_pod: --pod2 include (doc abstraction) .ssp [sisudoc-spine_v0.19.0] | Ralph Amissah | 2026-05-16 | 1 | -23/+32
    - include all (doc abstraction) .ssp in pod zip and in digests
    - fixed: for multi-language pods built with --pod2, only the last language's .ssp file was being written into pod.zip and listed in .digests.txt. Each language's .ssp files were on disk in the pod directory (copied during their own per-language passes) but were not in the final zip, because the zip was built once for each language and each build overwrote the previous one (only the last one remained). The solution is to follow the pattern already used by .sstm and .ssi to avoid this, namely wait for the last language and iterate the manifest_list_of_languages internally.
    (assisted by Claude-Code)
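The "wait for the last language, then iterate the whole list" pattern can be sketched in Python as below. `build_pod_zip` and the document name `doc` are hypothetical; only the media/abstraction/ location and the per-language naming follow the log.

```python
import io
import zipfile


def build_pod_zip(ssp_by_language, languages, current_language):
    """Return zip bytes only on the last language's pass; earlier passes
    return None so that no partial archive is written and later replaced."""
    if current_language != languages[-1]:
        return None
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Iterate every language in the manifest, not just the current one.
        for lang in languages:
            zf.writestr("media/abstraction/doc." + lang + ".ssp", ssp_by_language[lang])
    return buf.getvalue()


languages = ["en", "fr", "de"]
ssp = {lang: "% abstraction for " + lang + "\n" for lang in languages}
results = [build_pod_zip(ssp, languages, lang) for lang in languages]
names = zipfile.ZipFile(io.BytesIO(results[-1])).namelist()
print(names)
```

Building the archive once, at the end, is what keeps every language's .ssp in the final zip.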
* latex: some fixes for xelatex 2025 | Ralph Amissah | 2026-05-16 | 1 | -4/+4
    (assisted by Claude-Code)
* sqlite: stop on missing/unwritable sqlite-db-path | Ralph Amissah | 2026-05-16 | 1 | -4/+24
    - fatal error on missing/unwritable --sqlite-db-path
    (assisted by Claude-Code)
* org headers rearranged (& odd highlighting issue) | Ralph Amissah | 2026-05-04 | 2 | -3/+1
    - odd highlighting issue ... must result from my org config, but "fix" makes things easier for me.
* add children_headings to document abstraction | Ralph Amissah | 2026-04-22 | 3 | -12/+22
    Add int[] children_headings field to DocObj_MetaInfo_ and compute it in the post-processing pass of metadoc_from_src.d, right after last_descendant_ocn. Single O(n) pass builds a parent_ocn -> child heading OCNs map, then assigns to each heading object. Useful for tree-structured output.
    The .ssp serializer now reads directly from the abstraction field instead of pre-computing its own map.
    metadoc_object_setter.d: +1 line (field declaration)
    metadoc_from_src.d: +17 lines (computation)
    create_abstraction_txt.d: -10 lines (simplified)
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* add --pod2 flag, decouple --show-abstraction from --pod | Ralph Amissah | 2026-04-22 | 2 | -31/+45
    Finer-grained control over when .ssp files are produced:
      --show-abstraction  writes .ssp to OUTPUT/lang/abstraction/ independently of any pod flag
      --pod               builds pod without .ssp bundled
      --pod2              builds pod with .ssp in media/abstraction/
    Changes to spine.d:
    - show_abstraction() now only responds to its own flag and pod2, no longer triggered by source_or_pod
    - Add pod2 to opts init, getopt, OptActions
    - pod() returns true for both --pod and --pod2
    - source_or_pod() includes pod2
    Changes to source_pod.d:
    - Remove per-document pod directory (rmdirRecurse) before regeneration, ensuring clean slate on every run. This prevents stale content from previous runs (e.g. a --pod2 run followed by --pod would otherwise leave an outdated media/abstraction/ directory)
    - Gate abstraction directory creation and .ssp bundling on pod2 flag specifically
    Tested: --pod (no .ssp), --pod2 (.ssp in pod + zip), --show-abstraction (standalone .ssp), --pod after --pod2 (stale abstraction cleaned up). All 35 sample documents pass.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
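The flag gating described above can be summarized as a few boolean predicates. The sketch below is Python, not the D code in spine.d; the `Opts` class is hypothetical, and treating `--source` as an input to `source_or_pod` is an assumption drawn from the "--source/--pod" wording elsewhere in this log.

```python
from dataclasses import dataclass


@dataclass
class Opts:
    # Hypothetical stand-in for the parsed command-line options.
    show_abstraction: bool = False
    source: bool = False
    pod: bool = False
    pod2: bool = False


def show_abstraction(o):
    # Responds only to its own flag and to pod2.
    return o.show_abstraction or o.pod2


def pod(o):
    # True for both --pod and --pod2.
    return o.pod or o.pod2


def source_or_pod(o):
    # Assumption: --source also counts here.
    return o.source or o.pod or o.pod2


def bundle_ssp_in_pod(o):
    # .ssp bundling is gated on pod2 specifically.
    return o.pod2


plain_pod = Opts(pod=True)
pod_with_ssp = Opts(pod2=True)
print(show_abstraction(plain_pod), bundle_ssp_in_pod(plain_pod))
print(show_abstraction(pod_with_ssp), bundle_ssp_in_pod(pod_with_ssp))
```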
* .ssp: omit empty-value array property entries | Ralph Amissah | 2026-04-22 | 1 | -3/+6
    Add empty-string guards to array property loops (.stow_link, .lev4_subtoc, .anchor_tag) so entries with zero-length values are not emitted. Empty properties have no value for PEG parsing - absent lines are faster to skip than matching a property name to find an empty value.
    Removes 1488 empty .anchor_tag: lines from Wealth of Networks .ssp alone.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
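A minimal sketch of the empty-string guard, in Python rather than the serializer's D; `emit_array_property` is a hypothetical helper and the sample values are made up.

```python
def emit_array_property(name, values):
    """Emit one '.name: value' line per entry, skipping zero-length values."""
    return ["." + name + ": " + v for v in values if v != ""]


lines = emit_array_property("anchor_tag", ["", "a1", ""])
print(lines)
```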
* .ssp: add .children property for heading tree navigation | Ralph Amissah | 2026-04-22 | 1 | -0/+19
    - Add explicit child heading OCN lists to heading objects, pre-computed in a single O(n) pass over the body section before serialization. This makes the document tree directly navigable without scanning - each heading lists its direct sub-heading OCNs.
    - Example output for a chapter heading:
        [10] heading :1
        .last_descendant: 65
        .children: 14 24 42 57
    - Implementation: builds an int[][int] map (parent_ocn -> child heading OCNs) from one pass over the body objects, then emits .children: during serialization for headings that have entries in the map.
    - The tree was already reconstructable from parent_ocn + last_descendant_ocn, but .children makes it immediate - no scanning required to find a heading's sub-structure.
    - Tested against all 35 sample documents - zero failures.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
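The single-pass map can be sketched in Python as below. The dictionaries stand in for body objects and carry only the `ocn`, `is_a` and `parent_ocn` fields named in this log; both functions and the sample data are hypothetical, not the D implementation.

```python
from collections import defaultdict


def children_by_parent(body_objects):
    """One pass over the body: map parent_ocn -> list of child heading OCNs."""
    children = defaultdict(list)
    for obj in body_objects:
        if obj["is_a"] == "heading":
            children[obj["parent_ocn"]].append(obj["ocn"])
    return dict(children)


def children_line(ocn, children):
    """Format the .children: property, or None when a heading has no sub-headings."""
    kids = children.get(ocn)
    return ".children: " + " ".join(map(str, kids)) if kids else None


body = [
    {"ocn": 10, "is_a": "heading", "parent_ocn": 0},
    {"ocn": 11, "is_a": "para", "parent_ocn": 10},
    {"ocn": 14, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 24, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 42, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 57, "is_a": "heading", "parent_ocn": 10},
]
tree = children_by_parent(body)
print(children_line(10, tree))
```

Because objects are visited in document order, each child list comes out in OCN order without a separate sort.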
* .ssp serializer: include all ObjGenericComposite fields | Ralph Amissah | 2026-04-22 | 1 | -9/+120
    - Make the .ssp format a complete representation of the document abstraction by serializing all remaining fields from ObjGenericComposite (only omitting ptr.* runtime indices which are meaningless outside the in-memory context).
    - New fields added:
        .ancestors_collapsed:     - collapsed level ancestor chain
        .dom_status:              - DOM structure markedup tags status[8]
        .dom_status_collapsed:    - DOM structure collapsed status[8]
        .heading_lev_collapsed:   - collapsed heading level
        .parent_lev:              - parent heading level (markup)
        .o_n_type:                - object numbering type (0=ocn, 1=non, 2=bkidx)
        .is_of_type:              - para/block type classification
        .attrib:                  - general attributes string
        .meta_lang:               - block language (group/block/quote)
        .meta_syntax:             - codeblock syntax from metainfo
        .sha256:                  - hex-encoded SHA-256 digest of object content
        .has: images_no_dim       - image without dimensions flag
        .table_aligns:            - column alignment array
        .table_walls:             - table walls/borders flag
        .stow_link:               - extracted URLs (one per line)
        .heading_lev_anchor:      - heading level anchor tag
        .segment_epub:            - EPUB segment anchor tag
        .heading_ancestors_text:  - pipe-separated ancestor headings
        .lev4_subtoc:             - sub-table-of-contents entries (one per line)
        .anchor_tag:              - additional anchor tags (one per line)
    - Tested against all 35 sample documents - zero failures.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* .ssp serializer: omit identifier when it equals OCN | Ralph Amissah | 2026-04-22 | 1 | -3/+6
    - For heading objects, the identifier was always emitted on the declaration line (e.g. "[10] heading :1 10") even when it was just the OCN repeated. Now only emits the identifier when it differs from the OCN (i.e. when there is a named segment like "acknowledgments" or "a1"), reducing redundancy.
        Before: [10] heading :1 10
        After:  [10] heading :1
        Named segments still appear: [0] heading :1 a1
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
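The before/after behaviour can be reproduced with a small sketch. `heading_declaration` is a hypothetical Python helper, and reading the `:1` token as the heading level is an assumption.

```python
def heading_declaration(ocn, level, identifier):
    """Build the declaration line; append the identifier only when it
    carries information beyond the OCN itself."""
    line = "[" + str(ocn) + "] heading :" + str(level)
    if identifier and identifier != str(ocn):
        line += " " + identifier
    return line


print(heading_declaration(10, 1, "10"))
print(heading_declaration(0, 1, "a1"))
```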
* include .ssp document abstraction in source pod | Ralph Amissah | 2026-04-22 | 3 | -1/+51
    - When --source/--pod is used, automatically generate the .ssp document abstraction and bundle it into the pod at media/abstraction/{doc_uid}.{lang}.ssp
    - This makes show_abstraction implicitly true when source_or_pod is active, so the .ssp file is generated before the pod assembler runs (abstraction runs before outputHub, and source_or_pod is the first task in outputHub).
    - Changes:
      paths_source.d:
        Add abstraction_root() path helper to _PodPaths struct, following the same pattern as image_root(). Produces paths like pod/media/abstraction/ for both zpod (inside zip) and filesystem_open_zpod (open directory).
      source_pod.d:
        - Create media/abstraction/ directory in podArchive_directory_tree
        - Bundle .ssp file in pod_zipMakeReady: reads from the abstraction output directory, copies to open pod directory, adds to zip archive, computes SHA-256 digest
        - Write .ssp digest in zipArchiveDigest alongside sstm and ssi digests
      spine.d:
        Make show_abstraction() return true when source_or_pod is active (previously only returned true for explicit --show-abstraction flag).
    - The .ssp is always included when building pods - no exclusion flag for this experimental feature to keep things simple. Not generated for non-pod outputs (--text, --html, etc.) unless --show-abstraction is explicitly passed.
    - Tested against all 35 sample documents - zero failures.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* document abstraction as per document sqlite db | Ralph Amissah | 2026-04-22 | 2 | -0/+373
    --show-abstraction-db flag to write a per-document SQLite database of the document abstraction (Claude-Code primary assist)
    - Add a new output mode that serializes the in-memory document abstraction to a per-document SQLite database. This complements the .ssp text format (--show-abstraction) with a queryable database representation of the same data.
    - Schema:
      metadata table - key/value pairs for document metadata (title, creator, dates, rights, classify, identifiers, language, notes, make settings, doc_has counts)
      objects table - one row per document object with columns: section, seq (position within section), ocn, is_a, is_of_part, is_of_type, heading_level, identifier, parent_ocn, last_descendant_ocn, ancestors, indent/bullet/lang, has_* flags, segment/anchor tags, table/code properties, text content
      Indexed on: section, ocn, parent_ocn, is_a, heading_level
    - Uses prepared statements via d2sqlite3 (existing dependency) for safe and efficient insertion. Each document produces a standalone .abstraction.db file in the abstraction/ output directory.
    - New files:
      src/sisudoc/io_out/create_abstraction_db.d
        Follows the same pattern as create_abstraction_txt.d. Creates schema, populates metadata via key/value inserts, then iterates all sections writing objects with prepared statements within a single transaction.
    - Changes to spine.d:
      - Add "show-abstraction-db" to opts init, getopt, OptActions
      - Add to abstraction(), require_processing_files(), and meta_processing_general() gates
      - Insert call at both spineAbstraction sites
    - Tested against all 35 sample documents (including 9-language live-manual) - zero failures. Works standalone or combined with --show-abstraction and other output flags.
    - Example queries the database supports:
        SELECT ocn, heading_level, text FROM objects WHERE is_a = 'heading' AND section = 'body';
        SELECT * FROM objects WHERE parent_ocn = 10;
        SELECT key, value FROM metadata WHERE key LIKE 'title.%';
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
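The example queries can be tried against a cut-down, in-memory approximation of the schema. The sketch below uses Python's sqlite3 module rather than d2sqlite3; the column subset, the column types, the index name, the metadata keys and all row values are assumptions made for illustration, not the schema the commit creates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (key TEXT, value TEXT);
CREATE TABLE objects (
    section TEXT, seq INTEGER, ocn INTEGER, is_a TEXT,
    heading_level INTEGER, parent_ocn INTEGER, text TEXT
);
CREATE INDEX idx_objects_parent_ocn ON objects (parent_ocn);
""")

rows = [
    ("body", 0, 10, "heading", 1, 0, "Chapter"),
    ("body", 1, 11, "para", None, 10, "Some text"),
    ("body", 2, 14, "heading", 2, 10, "Section"),
]
# Parameterized inserts inside a single transaction.
with conn:
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.executemany(
        "INSERT INTO metadata VALUES (?, ?)",
        [("title.main", "Sample"), ("creator.author", "Anon")],
    )

headings = conn.execute(
    "SELECT ocn, heading_level, text FROM objects "
    "WHERE is_a = 'heading' AND section = 'body'"
).fetchall()
children = conn.execute(
    "SELECT ocn FROM objects WHERE parent_ocn = 10 ORDER BY seq"
).fetchall()
titles = conn.execute(
    "SELECT key, value FROM metadata WHERE key LIKE 'title.%'"
).fetchall()
print(headings, children, titles)
```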
* .ssp document abstraction as PEG parsable text | Ralph Amissah | 2026-04-22 | 2 | -0/+322
    --show-abstraction flag to write .ssp document abstraction files
    - Add a new output mode that serializes the in-memory document abstraction (produced by spineAbstraction) to a human-readable, line-oriented text format (.ssp). This captures the full object model after parsing and abstraction but before output generation.
    - The .ssp format uses unambiguous line prefixes:
        @section { }   - section boundaries (head/toc/body/endnotes/...)
        [N] type       - object declaration with OCN
        .name: value   - object properties (only non-defaults)
        | content      - text content lines
        % comment      - comments
    - New files:
      src/sisudoc/io_out/create_abstraction_txt.d
        Serializer module following the same template pattern as metadoc_show_summary.d. Walks doc.abstraction() section by section, writing metadata preamble (@meta, @make, @doc_has) then each object with its properties and text content. Output goes to {output_path}/{lang}/abstraction/{doc}.ssp
    - Changes to spine.d:
      - Add "show-abstraction" to opts initialization, getopt, and OptActions struct
      - Add show_abstraction to abstraction(), require_processing_files(), and meta_processing_general() so the flag triggers full document processing
      - Insert call at both spineAbstraction sites (parallel and serial branches), gated by show_abstraction flag, following the same pattern as show_config/show_summary/show_make
    - Tested against all 35 sample documents (including multilingual live-manual in 9 languages) - zero failures. Works standalone (--show-abstraction) or combined with other output flags (--show-abstraction --html --text). No effect on existing code paths when the flag is not used.
    Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
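Because each line kind has its own leading character, a reader can dispatch on the first character alone. The sketch below is a hypothetical Python classifier, not the project's PEG grammar; it assumes prefixes start at column 0 and lumps the @meta/@make/@doc_has preamble together with @section under one kind.

```python
# Prefix character -> line kind. The kind names are made up for this sketch.
PREFIX_KINDS = {
    "@": "block",      # @section boundaries and the @meta/@make/@doc_has preamble
    "[": "object",     # [N] type
    ".": "property",   # .name: value
    "|": "content",    # | content
    "%": "comment",    # % comment
}


def classify_ssp_line(line):
    """Return the kind of an .ssp line from its first character."""
    if not line:
        return "blank"
    return PREFIX_KINDS.get(line[0], "other")


sample = [
    "% sample abstraction",
    "[10] heading :1",
    ".children: 14 24 42 57",
    "| Chapter title",
]
kinds = [classify_ssp_line(s) for s in sample]
print(kinds)
```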
* upkeep, update a few paths [sisudoc-spine_v0.18.0] | Ralph Amissah | 2026-04-22 | 1 | -10/+10
* spine may be run against a zipped spine-pod url | Ralph Amissah | 2026-04-13 | 2 | -2/+147
    - claude contributed src
    - processes zip from url using (system installed) curl for download
* spine may be run against a document-markup zip pod | Ralph Amissah | 2026-04-13 | 2 | -2/+457
    - claude contributed src
    - Opens the zip with std.zip.ZipArchive (reads the whole file into memory)
    - Locates pod.manifest inside the archive to discover document paths and languages
    - Extracts markup files (.sst/.ssm/.ssi) as in-memory strings
    - Extracts images as in-memory byte arrays
    - Extracts conf/dr_document_make if present
    - Presents these to the existing pipeline as if they were read from the filesystem
    - Some security mitigations:
      - Zip Slip / Path Traversal: Reject entries containing `..` or starting with `/`; canonicalize resolved paths and verify they fall within extraction root
      - Zip Bomb: Check `ArchiveMember.size` before extracting; enforce per-file (50MB) and total size limits (500MB)
      - Entry Count: Limit number of entries (a pod should have at most ~100 files)
      - Path depth: limit (Maximum 10 path components).
      - Symlinks: Verify no symlinks in extracted content before processing (post-extraction recursive scan)
      - Filename Validation: Only allow expected characters; reject null bytes
      - Malformed Zips: Catch `ZipException` from `std.zip.ZipArchive` constructor
      - Cleanup on error
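Several of the listed checks can be sketched as a pre-extraction pass over the archive's entry list. This is a Python illustration using the standard zipfile module, not the D code; `check_pod_zip` and the sample entry names are hypothetical, the entry limit of 100 is an assumed reading of "~100 files", and the symlink scan, character whitelist and cleanup steps are left out.

```python
import io
import zipfile

MAX_ENTRIES = 100                     # assumption: "at most ~100 files"
MAX_FILE_BYTES = 50 * 1024 * 1024     # per-file limit (50MB)
MAX_TOTAL_BYTES = 500 * 1024 * 1024   # total limit (500MB)
MAX_DEPTH = 10                        # maximum path components


def check_pod_zip(data):
    """Validate entry names and declared sizes before anything is extracted.
    Returns the entry names, or raises ValueError on the first violation."""
    try:
        zf = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as err:
        raise ValueError("malformed zip") from err
    infos = zf.infolist()
    if len(infos) > MAX_ENTRIES:
        raise ValueError("too many entries")
    total = 0
    for info in infos:
        name = info.filename
        parts = name.split("/")
        if name.startswith("/") or ".." in parts or "\x00" in name:
            raise ValueError("unsafe entry name: " + repr(name))
        if len(parts) > MAX_DEPTH:
            raise ValueError("path too deep: " + repr(name))
        if info.file_size > MAX_FILE_BYTES:
            raise ValueError("entry too large: " + repr(name))
        total += info.file_size
        if total > MAX_TOTAL_BYTES:
            raise ValueError("archive too large")
    return [info.filename for info in infos]


def make_zip(names):
    """Build a small in-memory zip holding one short payload per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for n in names:
            zf.writestr(n, b"x")
    return buf.getvalue()


ok_names = check_pod_zip(make_zip(["pod.manifest", "media/text/en/doc.sst"]))
print(ok_names)
```

Checking the declared sizes and names first means a hostile archive is rejected before any entry is decompressed.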
* latex minor improvements and fixes, require testing | Ralph Amissah | 2026-04-06 | 1 | -9/+12
    - FIXES issue with .tex files and xetex finding image paths when run within latex/ output directory
* 2026 | Ralph Amissah | 2026-01-09 | 48 | -49/+49
* text output, improve various (including no-ocn) | Ralph Amissah | 2025-10-14 | 4 | -36/+35
    - revisit links (fix later)
* abstraction metainfo, provide endnote parent ocn | Ralph Amissah | 2025-10-13 | 4 | -12/+8
    - preferable, endnote parent object number available for use (as here in text output, compare "endnotes, add caller ocn" commit)
* latex quote object, quick fix | Ralph Amissah | 2025-10-08 | 1 | -2/+22
* text output, endnotes, add caller ocn (& some cleaning) | Ralph Amissah | 2025-10-08 | 4 | -25/+45
* a text output (and skel an outline) | Ralph Amissah | 2025-10-03 | 10 | -41/+885
    - spine --text [--output=output path] [markup source]
* terminal output verbosity levels, minor rework | Ralph Amissah | 2025-09-25 | 14 | -107/+130
* spine.d tidy | Ralph Amissah | 2025-09-23 | 1 | -109/+88
* dub_describe.json + other minor misc | Ralph Amissah | 2025-09-14 | 2 | -49/+49
* src/ext_depends d-yaml updated (v0.10.0) | Ralph Amissah | 2025-08-28 | 687 | -5103/+7987
* imports, make line searchable | Ralph Amissah | 2025-07-15 | 29 | -432/+359
* source & pod (fix build from non-pod source) | Ralph Amissah | 2025-06-12 | 2 | -23/+26
    - appears to work, but needs review
* org ready ldc-1.41.0-beta1; flake using ldc-1.40.1 | Ralph Amissah | 2025-04-18 | 1 | -2/+2
    - plus minor housekeeping/tidy
* minor | Ralph Amissah | 2025-04-02 | 1 | -1/+0
* sisudoc-spine upkeep, minor, a file renamed | Ralph Amissah | 2025-03-22 | 2 | -0/+1
* triple single-quote marks block identifier added | Ralph Amissah | 2025-02-21 | 4 | -4/+229
    - tics a bit cumbersome where single quotes work just as well
    - testing required (special cases not covered)
    - diverges from sisu markup which will need an update sometime
* doc (metadata & abstraction) struct follow through | Ralph Amissah | 2025-02-19 | 5 | -275/+236
* document (metadata & abstraction) struct | Ralph Amissah | 2025-02-19 | 8 | -107/+100
    - struct replaces tuple
    - some direct naming of structs returned (instead of use of auto)
    - minor
* 2025 | Ralph Amissah | 2025-01-01 | 46 | -47/+47
* refactor yaml extraction code file | Ralph Amissah | 2024-12-21 | 2 | -671/+354
* nix build flake.nix fix | Ralph Amissah | 2024-12-03 | 2 | -0/+2
* pod zip fixes | Ralph Amissah | 2024-07-10 | 4 | -131/+120
    - serial processing (need to be built serially)
    - multilingual pods, copy all languages before zip
* [fn].digest.txt, sha256 of pod source files & pod | Ralph Amissah | 2024-07-04 | 7 | -339/+379
* markup source digest to metadata.html | Ralph Amissah | 2024-07-01 | 1 | -3/+18