convert-scripts

Various scripts to convert editions of the Tipiṭaka into Markdown and HTML. Use at your own risk.

python scan_sc_html.py | sed "s/id='[^']*'/id='X'/g" | sed "s/data-counter='[^']*'/data-counter='X'/g" | sed "s/value='[^']*'/value='X'/g" | sort | uniq
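
The pipeline above appears to mask the variable attribute values (id, data-counter, value) in the output of scan_sc_html.py so that otherwise identical lines collapse under sort | uniq. A rough Python equivalent of the sed, sort and uniq stages is sketched below. The function name normalize_lines is made up for illustration and is not part of this repository, and Python's sorted() orders by code point, which may differ from a locale-aware sort.

```python
import re

# Attributes whose values the sed stages replace with a placeholder.
_ATTRS = ("id", "data-counter", "value")
_PATTERNS = [re.compile(rf"{attr}='[^']*'") for attr in _ATTRS]


def normalize_lines(lines):
    """Mask attribute values, then return the sorted unique lines."""
    seen = set()
    for line in lines:
        # Apply the substitutions in the same order as the sed stages.
        for attr, pattern in zip(_ATTRS, _PATTERNS):
            line = pattern.sub(f"{attr}='X'", line)
        seen.add(line)
    return sorted(seen)
```

Feeding it the lines printed by scan_sc_html.py (for example via sys.stdin) should give a list comparable to the shell output.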

Main Scripts

None of these take any command-line arguments.

wt_genlinks.py: Create links.py, which maps file numbers to semantic paths
tipitaka2500.py: Create tipitaka2500.github.io/tipitaka from World Tipitaka
wt2md.py: Create Markdown files in tipitaka2500 from World Tipitaka

Experimental translation and summarization using an LLM. These read folders/files from the command line.

wt2eng.py: Translate Markdown files into English using Llama 3.3 70b.

Utilities

These are mainly used for experimentation and testing.

wt2html.py: Convert World Tipitaka from XML to HTML body fragments (excludes everything outside <body>) with no further processing (e.g. fixing inline JavaScript links); writes to wt-html
wt2semantic.py: Convert World Tipitaka from XML to HTML body fragments (excludes everything outside <body>), arranged in a semantic tree structure; writes to wt-semantic
bs4.ipynb: Jupyter notebook for playing with BeautifulSoup
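
The "body fragment" idea described for wt2html.py and wt2semantic.py can be illustrated with the standard library alone. This is only a sketch of the "keep what is inside the body element" step; it does not convert XML tags to HTML, and it is not the code these scripts use. The function name body_fragment is hypothetical, and the sketch assumes an un-namespaced body element, which may not match the actual World Tipitaka files.

```python
import xml.etree.ElementTree as ET


def body_fragment(xml_text):
    """Return the markup inside the first <body> element, without the wrapper."""
    root = ET.fromstring(xml_text)
    body = root if root.tag == "body" else root.find(".//body")
    if body is None:
        raise ValueError("no <body> element found")
    # Leading text of <body>, then each child serialized with its tail text.
    parts = [body.text or ""]
    parts.extend(ET.tostring(child, encoding="unicode") for child in body)
    return "".join(parts).strip()
```

Everything outside the body element (for example a header with metadata) is dropped, which is what makes the result suitable for pasting into a separate page template.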

Support files

html_template.py: HTML template used by tipitaka2500.py
get_data.py: Convert a single XML file to raw HTML; used by various utilities
links.py: Generated by wt_genlinks.py; used by tipitaka2500.py
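
Because links.py is generated by one script and read by another, wt_genlinks.py presumably has to run before tipitaka2500.py. The general pattern of writing a mapping out as an importable Python module might look like the sketch below. The real layout of links.py is not documented here, so the variable name links, both function names, and the dict shape are assumptions for illustration only.

```python
import pprint


def render_links_module(mapping):
    """Render a dict as the source text of a small Python module."""
    body = pprint.pformat(mapping, sort_dicts=True)
    return f"# Generated file, do not edit by hand.\nlinks = {body}\n"


def load_links(source):
    """Evaluate generated module source and return its `links` dict."""
    namespace = {}
    exec(source, namespace)  # only ever run on source we generated ourselves
    return namespace["links"]
```

In practice the rendered text would be written to a .py file and imported by the consumer; load_links exists only so the round trip can be checked without touching the file system.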

