Metadata-Version: 2.4
Name: ssc_codegen
Version: 0.17.0a0
Summary: Python-dsl code converter to html parser for web scraping 
Project-URL: Documentation, https://github.com/vypivshiy/selector_schema_codegen#readme
Project-URL: Issues, https://github.com/vypivshiy/selector_schema_codegen/issues
Project-URL: Source, https://github.com/vypivshiy/selector_schema_codegen
Project-URL: Examples, https://github.com/vypivshiy/selector_schema_codegen/examples
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Utilities
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: bs4>=0.0.2
Requires-Dist: colorama>=0.4.6; sys_platform == "win32"
Requires-Dist: cssselect>=1.2.0
Requires-Dist: lxml>=5.3.0
Requires-Dist: soupsieve>=2.6
Requires-Dist: typer>=0.15.1
Requires-Dist: typing_extensions; python_version < "3.11"
Requires-Dist: click<8.2.0
Requires-Dist: tree-sitter>=0.25.2
Dynamic: license-file

# Selector Schema codegen

Experimental PoC implementation of a code generator driven by a DSL that uses KDL 2.0 syntax.

## install

### From git (requires C/C++ compiler)

```bash
# Clone with submodules
git clone --recursive https://github.com/vypivshiy/selector_schema_codegen
cd selector_schema_codegen

# Install with pip (builds tree-sitter-kdl extension)
pip install .
```

**Requirements:**
- Linux/macOS: `gcc` or `clang`
- Windows: MSVC (Visual Studio Build Tools or Visual Studio)

### Via uv tool

```bash
uv tool install git+https://github.com/vypivshiy/selector_schema_codegen@features-kdl
```

## usage

### generate modules

```
ssc-gen generate examples/ -t js-pure -o .
```

### lint syntax

```
ssc-gen check examples/
```

### test a schema against HTML input

from file:
```
python main.py run .\examples\booksToScrape.kdl:MainCatalogue -t py-bs4 -i index.html
```

from stdin:
```
curl https://books.toscrape.com/ | python main.py run .\examples\booksToScrape.kdl:MainCatalogue -t py-bs4
```

### test selectors

from file:
```
python main.py health .\examples\booksToScrape.kdl:MainCatalogue -i index.html
```

from stdin:
```
curl https://books.toscrape.com/catalogue/page-2.html | python main.py health .\examples\booksToScrape.kdl:MainCatalogue
```


## syntax

See [docs](docs) and [examples](examples) for how to use the syntax.

## generate DSL config with an LLM (experimental, not ready)

### prompt

Use [SYSTEM_PROMPT](SYSTEM_PROMPT.md) as the system prompt in API pipelines or chats. Before accepting a generated config, run the linter with `ssc-gen check [FILES...] -f json` and, if it reports any errors, send that output back to the model.
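
The lint-and-retry step can be scripted. The sketch below is only an illustration: the helper names (`build_check_command`, `run_check`, `feedback_message`) are hypothetical and not part of ssc-gen, the structure of the JSON report is not assumed (the text is forwarded as-is), and treating empty output as "no errors" is an assumption that should be checked against the real linter behaviour.

```python
import subprocess
from typing import Callable, List, Optional, Sequence, Tuple


def build_check_command(files: Sequence[str]) -> List[str]:
    """Build the linter call from this README: ssc-gen check [FILES...] -f json."""
    return ["ssc-gen", "check", *files, "-f", "json"]


def run_check(files: Sequence[str], run: Callable = subprocess.run) -> Tuple[int, str]:
    """Run the linter and return (exit code, raw stdout).

    `run` is injectable so the function can be exercised without ssc-gen installed.
    """
    proc = run(build_check_command(files), capture_output=True, text=True)
    return proc.returncode, proc.stdout


def feedback_message(report: str) -> Optional[str]:
    """Turn raw linter output into a follow-up message for the model.

    Assumption: empty output means there is nothing to report.
    """
    report = report.strip()
    if not report:
        return None
    return "The linter reported problems, please fix the config:\n" + report
```

A pipeline could call `run_check(["schema.kdl"])` after each generation round (the file name is a placeholder) and append `feedback_message(...)` to the conversation until it returns `None`.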

### skill
Use the [kdl-schema-dsl](.agents/skills/kdl-schema-dsl) skill to generate the config.
