Another Droid bbs
XQTRs lair...
i don't know what got into me... but i want to build a local ansi database!
i want to have all available ansi art on my hdd, categorized with tags,
and even a search engine to search the text inside the ansis.
to make this happen... i am gonna build an sqlite3 database that will
contain the text from the ansi images. this way, with a single sql query,
i could make an easy search engine. so... how do you put all the ansis
into an sqlite database? by getting rid of all the non usable parts of
the ansi, which are a lot!
we don't need:
+ escape codes
+ ascii symbols that are not letters/numbers
+ sauce data
+ duplicate text
+ strings of e.g. 20 chars or more, because
these may be part of the graphics and not actual
text.
the simplest way is to use bash. with this oneliner we can turn a
60KB ansi into a 100 byte text file that contains only the usable
/ searchable text.
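# $1 = input ansi, $2 = output text file (needs GNU sed + binutils strings)
# awk '!seen[$0]++' keeps the first copy of each word, even for non adjacent repeats
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$1" | strings -e s | tr " " "\n" \
| sed 's/[^A-Z0-9]//ig' | sed 's/SAUCE//g' | sed -r 's/\b\w{20,}\b//g' \
| sed -r '/^.{,3}$/d' | awk '!seen[$0]++' | tr "\n" "," > "$2"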
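to see what the filters do, here is a small sketch that runs the same kind of steps on a fake ansi string. the function name ansi_words and the sample text are made up for illustration. it assumes GNU sed, and it splits on every non letter/number instead of going through strings, so it is a simplification and not the exact oneliner.

```shell
#!/usr/bin/env bash
# ansi_words is a hypothetical helper name, used only for this illustration.
# it reads ansi data on stdin and prints one candidate word per line.
ansi_words() {
  # 1. drop escape sequences like ESC[1;31m
  # 2. turn every char that is not a letter/number into a line break
  # 3. keep words of 4..19 chars, and only the first copy of each
  sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' \
    | tr -c 'A-Za-z0-9' '\n' \
    | awk 'length($0) > 3 && length($0) < 20 && !seen[$0]++'
}

# made-up sample: colored text, some "graphics" chars, a repeat and a short word
printf '\033[1;31mHello\033[0m ### World ### Hello ab\n' | ansi_words
# prints:
# Hello
# World
```

the repeated Hello and the short ab are gone, and only the words worth searching are left.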
it will remove all the unnecessary chars and keep only the readable
strings, which will form our database... more on the search engine... in
the next issues.
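until then, a minimal sketch of how the word list could go into sqlite3 and be searched with a plain sql string. the database file ansi.db, the table ansis, its columns and the helper names are all assumptions for illustration, not the final design. it needs the sqlite3 command line tool.

```shell
#!/usr/bin/env bash
# hypothetical names: ansi.db, table "ansis", columns "filename" and "words"
db="$(mktemp -d)/ansi.db"

sqlite3 "$db" 'CREATE TABLE ansis (filename TEXT PRIMARY KEY, words TEXT);'

# sql_quote <text>: double every single quote so the text is safe in an sql literal
sql_quote() {
  local q="'"
  printf '%s' "${1//$q/$q$q}"
}

# add_ansi <filename> <comma separated words>
add_ansi() {
  sqlite3 "$db" "INSERT OR REPLACE INTO ansis VALUES ('$(sql_quote "$1")', '$(sql_quote "$2")');"
}

# search_ansi <word>: list the files whose text contains the word
# (LIKE ignores case for plain ascii letters in sqlite)
search_ansi() {
  sqlite3 "$db" "SELECT filename FROM ansis WHERE words LIKE '%$(sql_quote "$1")%';"
}

# made-up rows, in the same comma separated form the oneliner writes
add_ansi "demo1.ans" "Hello,World,"
add_ansi "demo2.ans" "Another,Droid,"
search_ansi "droid"
```

with the two demo rows above, the search should list only demo2.ans. a real setup would probably want tags in their own table, but one row per ansi is enough to try the idea.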