    _                _   _                 ____            _     _
   / \   _ __   ___ | |_| |__   ___ _ __  |  _ \ _ __ ___ (_) __| |	
  / _ \ | '_ \ / _ \| __| '_ \ / _ \ '__| | | | | '__/ _ \| |/ _` |
 / ___ \| | | | (_) | |_| | | |  __/ |    | |_| | | | (_) | | (_| |	
/_/   \_\_| |_|\___/ \__|_| |_|\___|_|    |____/|_|  \___/|_|\__,_|	
                                                                bbs
  XQTRs lair...
                                                                                
                                                                                
                                                                                
   i don't know what got into me... but i want to build a local ansi database!
   i want to have all available ansi art on my hdd, categorized with tags,
   and even a search engine that can search the text inside the ansis.
                                                                                
   to make this... i am gonna build an sqlite3 database that will contain
   the text from the ansi images. this way, with a simple sql query, i could
   make an easy search engine. so... how do you put all the ansis into an
   sqlite database? by getting rid of all the unusable parts of the ansi,
   which are a lot!
                                                                                
   we don't need:                                                               
                    + escape codes                                              
                    + ascii symbols that are not letters/numbers
                    + sauce data                                                
                    + duplicate text                                            
                    + strings that are longer than e.g. 20 chars, cause
                      these may be part of the graphics and not actual          
                      text.                                                     
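
   a side note on the sauce item: a sauce record sits at the very end of
   the file, is 128 bytes long and starts with the id "SAUCE". so one way
   to drop it before any other filtering is to cut those bytes off. this
   is only a sketch: the function name strip_sauce is made up for this
   example, it assumes GNU head/tail, and it leaves an optional sauce
   comment block (if the ansi has one) in place.

```shell
#!/usr/bin/env bash
# strip_sauce FILE: print FILE to stdout without its trailing SAUCE record.
# assumption: the record is the last 128 bytes and starts with "SAUCE".
# an optional comment block in front of the record is NOT removed.
strip_sauce() {
    local file="$1" id
    # the first 5 bytes of the last 128 bytes hold the record id
    id=$(tail -c 128 -- "$file" | head -c 5 | tr -d '\0')
    if [ "$id" = "SAUCE" ]; then
        # GNU head: a negative count prints everything but the last N bytes
        head -c -128 -- "$file"
    else
        # no record found, pass the file through untouched
        cat -- "$file"
    fi
}
```

   usage would be something like: strip_sauce art.ans > art.nosauce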
                                                                                
   the simplest way is to use bash. with this oneliner we can transform a
   60KB ansi into a 100 byte text file that contains only the usable /
   searchable text.
                                                                                
   sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$1" | \
   strings -e s | tr " " "\n" | sed '/^$/d' | sed 's/[^A-Z0-9]//ig' \
   | awk '!seen[$0]++' | sed 's/SAUCE//g' | sed -r 's/\b\w{20,}\s?\b//g' \
   | sed -r '/^.{,3}$/d' | tr "\n" "," > "$2"
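
   to run this over a whole collection, the pipeline can be wrapped in a
   function and called in a loop. a minimal sketch follows. the names
   ansi2words and convert_dir, and the .ans / .txt extensions, are
   assumptions for this example. it needs GNU sed and strings, and it
   drops duplicate words with awk, which keeps the first copy of each
   word in its original order.

```shell
#!/usr/bin/env bash
# ansi2words IN OUT: write the searchable words of ansi IN to OUT,
# comma separated. a variant of the pipeline above (GNU sed/strings assumed).
ansi2words() {
    sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' "$1" \
    | strings -e s | tr " " "\n" | sed '/^$/d' \
    | sed 's/[^A-Z0-9]//ig' \
    | awk '!seen[$0]++' \
    | sed 's/SAUCE//g' \
    | sed -r 's/\b\w{20,}\s?\b//g' \
    | sed -r '/^.{,3}$/d' \
    | tr "\n" "," > "$2"
}

# convert_dir SRC DST: run ansi2words on every .ans file found in SRC
# and write the result to DST/<name>.txt
convert_dir() {
    local src="$1" dst="$2" f
    mkdir -p "$dst"
    for f in "$src"/*.ans; do
        [ -e "$f" ] || continue    # nothing matched: skip the literal pattern
        ansi2words "$f" "$dst/$(basename "$f" .ans).txt"
    done
}
```

   usage would be something like: convert_dir ./ansis ./words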
                                                                                
   it will remove all unnecessary chars and keep only the readable
   strings, which will form our database... more on the search engine... in
   the next issues.
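
   until then, here is a rough sketch of how the word files could go into
   sqlite3 and be searched with plain sql. everything in it is an
   assumption made for illustration: the table name ansis, its two
   columns, and the function names db_init, db_add and db_search. it
   needs the sqlite3 command line tool. the search term is used as is, so
   % and _ in it act as LIKE wildcards.

```shell
#!/usr/bin/env bash
# hypothetical schema: one row per ansi, "words" holds the comma separated text
db_init() {
    sqlite3 "$1" 'CREATE TABLE IF NOT EXISTS ansis (path TEXT PRIMARY KEY, words TEXT);'
}

# sql_quote STRING: double every single quote so STRING is safe inside '...'
sql_quote() {
    local q="'"
    printf '%s' "${1//$q/$q$q}"
}

# db_add DB ANSI_PATH WORDS_FILE: store (or replace) the words of one ansi
db_add() {
    local db="$1" path words
    path=$(sql_quote "$2")
    words=$(sql_quote "$(cat -- "$3")")
    sqlite3 "$db" "INSERT OR REPLACE INTO ansis (path, words) VALUES ('$path', '$words');"
}

# db_search DB TERM: print the paths of all ansis whose words contain TERM
# (LIKE ignores case for plain ascii letters)
db_search() {
    local term
    term=$(sql_quote "$2")
    sqlite3 "$1" "SELECT path FROM ansis WHERE words LIKE '%$term%' ORDER BY path;"
}
```

   usage would be something like: db_init ansi.db, then
   db_add ansi.db art.ans art.txt, then db_search ansi.db hello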