(C) 2020 Kriss Blank and released under the MIT license, see for full license text.
local wtxtdiff=require("wetgenes.txt.diff")
Given two tables of strings, return the length, starta, startb of the longest common subsequence in table indexes, or nil if not similar.
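A minimal sketch of this kind of search, assuming the result is a contiguous run of equal strings (a single start index per table suggests this) and that "not similar" means no string is shared. The name longest_common_run is hypothetical and this is not the wetgenes implementation.

```lua
-- Hypothetical helper, not the wetgenes implementation.
-- Returns length, starta, startb of the longest run of equal strings
-- shared by tables a and b, or nil when they share nothing.
local function longest_common_run(a, b)
    local best_len, best_a, best_b = 0, nil, nil
    local prev = {} -- prev[j] = length of the common run ending at a[i-1] and b[j]
    for i = 1, #a do
        local cur = {}
        for j = 1, #b do
            if a[i] == b[j] then
                -- extend the run that ended one step earlier in both tables
                local len = (prev[j - 1] or 0) + 1
                cur[j] = len
                if len > best_len then
                    best_len, best_a, best_b = len, i - len + 1, j - len + 1
                end
            end
        end
        prev = cur
    end
    if best_len == 0 then return nil end
    return best_len, best_a, best_b
end
```

This compares every pair of entries, so it is only meant to show the shape of the result, not to be fast on large inputs.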
Given two tables of strings, return two tables of strings of the same length where as many strings as possible match.
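One way to picture this alignment is the sketch below. It uses a textbook longest common subsequence table, which may well differ from what the library does, and it assumes gaps are padded with empty strings since the text above does not say how they are filled. The name align is hypothetical.

```lua
-- Hypothetical helper: align two tables of strings to the same length,
-- padding gaps with "" (an assumption, the padding value is not documented).
local function align(a, b)
    local n, m = #a, #b
    -- lcs[i][j] = length of the longest common subsequence of a[i..n] and b[j..m]
    local lcs = {}
    for i = n + 1, 1, -1 do
        lcs[i] = {}
        for j = m + 1, 1, -1 do
            if i > n or j > m then
                lcs[i][j] = 0
            elseif a[i] == b[j] then
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else
                lcs[i][j] = math.max(lcs[i + 1][j], lcs[i][j + 1])
            end
        end
    end
    local outa, outb = {}, {}
    local i, j = 1, 1
    while i <= n or j <= m do
        if i <= n and j <= m and a[i] == b[j] then
            -- matching strings share a row
            outa[#outa + 1], outb[#outb + 1] = a[i], b[j]
            i, j = i + 1, j + 1
        elseif j > m or (i <= n and lcs[i + 1][j] >= lcs[i][j + 1]) then
            -- string only in a, pad b
            outa[#outa + 1], outb[#outb + 1] = a[i], ""
            i = i + 1
        else
            -- string only in b, pad a
            outa[#outa + 1], outb[#outb + 1] = "", b[j]
            j = j + 1
        end
    end
    return outa, outb
end
```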
Use the delimiter to split a string into a table of strings such that each string ends in the delimiter (except for possibly the final string) and a table.concat on the result will recreate the input string exactly.
table = wtxtdiff.split(string,delimiter)
string is the string to split and delimiter is a Lua pattern, so any special characters should be escaped.
For example:
st = wtxtdiff.split(s) -- split on newline (default)
st = wtxtdiff.split(s,"\n") -- split on newline (explicit)
st = wtxtdiff.split(s,"%s+") -- split on white space
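The round trip property described above (each piece keeps its delimiter, so table.concat rebuilds the input) can be sketched in plain Lua as follows. This is a stand-in written for illustration, not the library source, and the guard against empty matches is an assumption.

```lua
-- Illustrative stand-in for a delimiter-keeping split, not the wetgenes source.
local function split(s, delimiter)
    delimiter = delimiter or "\n" -- newline by default, as documented
    local out = {}
    local pos = 1
    while pos <= #s do
        local from, to = string.find(s, delimiter, pos)
        if not from or to < from then
            -- no more delimiters (or an empty match): keep the rest as the final piece
            out[#out + 1] = string.sub(s, pos)
            break
        end
        out[#out + 1] = string.sub(s, pos, to) -- piece includes its delimiter
        pos = to + 1
    end
    return out
end

local lines = split("a\nb\nc")
assert(table.concat(lines) == "a\nb\nc") -- concat recreates the input exactly
```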
Given two tables of strings, return the length at the start and at the end that are the same. This tends to be a good first step when comparing two chunks of text.
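A small sketch of that first step, under the assumption that the matching start and end are never allowed to overlap. The name trim_lengths is hypothetical.

```lua
-- Hypothetical helper: count equal strings at the start and at the end of a and b.
local function trim_lengths(a, b)
    local max = math.min(#a, #b)
    local head = 0
    while head < max and a[head + 1] == b[head + 1] do
        head = head + 1
    end
    local tail = 0
    -- stop before the tail would reuse strings already counted in the head
    while tail < max - head and a[#a - tail] == b[#b - tail] do
        tail = tail + 1
    end
    return head, tail
end
```

Only the strings between the head and the tail then need a more expensive comparison.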
(C) 2023 Kriss Blank under the
Generic text modifying functions.
Some useful lex files for other editors to be used as starting points and checking we did not miss anything.
undo / redo code for a text editor with persistence to disk
Persistence to disk is in tsv format, in filename.txt.undo files where .undo is added to the end of the file name.
See for tsv format. The first column is always a command and the other columns are the data needed to apply or reverse this command. In theory a .undo file is a total history, and as we should only ever be appending data (even a line at a time), a file can be recovered from it and it should have limited corruption possibilities when things go wrong.
For security reasons this file may have undos removed as a separate step. That is to say, things that were done or pasted accidentally and then removed instantly will be purged from its history when performing a file save.
You can be safe in the knowledge that any information you undo will not be saved, except in temporary crash-safe buffers.
The following need to be escaped with a \ when used in each column.
\n for newline,
\t for tab,
\r for carriage return,
\\ for backslash.
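The escaping rules above can be sketched as a pair of helpers. The names escape_column and unescape_column are hypothetical, and leaving an unknown escape untouched when reading is an assumption.

```lua
-- Hypothetical helpers for the four escapes listed above.
local escapes = { ["\n"] = "\\n", ["\t"] = "\\t", ["\r"] = "\\r", ["\\"] = "\\\\" }
local unescapes = { n = "\n", t = "\t", r = "\r", ["\\"] = "\\" }

-- Make a value safe to store in one tab separated column.
local function escape_column(s)
    return (string.gsub(s, "[\n\t\r\\]", escapes))
end

-- Reverse escape_column; unknown escapes are left as they are.
local function unescape_column(s)
    return (string.gsub(s, "\\(.)", function(c)
        return unescapes[c] or ("\\" .. c)
    end))
end

local raw = "tab\there\nnew line and a \\ backslash"
assert(unescape_column(escape_column(raw)) == raw)
```

Because the backslash itself is escaped, a literal backslash followed by n in the data survives the round trip instead of turning into a newline.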
local wutf = require("wetgenes.txt.utf")
helper functions to help manage a string as a stream of utf8 tokens.
string = wutf.char(number)
convert a single unicode value to a utf8 string of 1-4 bytes
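For reference, the standard utf8 byte layout can be sketched in plain Lua arithmetic as below. The name utf8_char is hypothetical, this is not the library source, and the sketch does no range checking (surrogates and values above 0x10FFFF are not rejected).

```lua
-- Hypothetical stand-in: encode one unicode value as 1-4 utf8 bytes.
local function utf8_char(code)
    local floor = math.floor
    if code < 0x80 then
        return string.char(code)
    elseif code < 0x800 then
        return string.char(0xC0 + floor(code / 0x40),
            0x80 + code % 0x40)
    elseif code < 0x10000 then
        return string.char(0xE0 + floor(code / 0x1000),
            0x80 + floor(code / 0x40) % 0x40,
            0x80 + code % 0x40)
    else
        return string.char(0xF0 + floor(code / 0x40000),
            0x80 + floor(code / 0x1000) % 0x40,
            0x80 + floor(code / 0x40) % 0x40,
            0x80 + code % 0x40)
    end
end

assert(utf8_char(0x20AC) == "\226\130\172") -- euro sign is E2 82 AC
```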
lua pattern to match each utf8 character in a string
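A pattern of this kind could look like the sketch below: one byte that is not a continuation byte, followed by any number of continuation bytes. The exact pattern the library exposes is not shown here, so charpattern and utf8_chars are hypothetical names, and the input is assumed to be valid utf8.

```lua
-- Assumed pattern: a lead (or ascii) byte followed by its continuation bytes.
local charpattern = "[^\128-\191][\128-\191]*"

-- Hypothetical helper: break a string into a table of utf8 characters.
local function utf8_chars(s)
    local out = {}
    for ch in string.gmatch(s, charpattern) do
        out[#out + 1] = ch
    end
    return out
end

local chars = utf8_chars("a\195\169\226\130\172") -- "a", e acute, euro sign
assert(#chars == 3)
```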
string = wutf.chars(number,number,...)
string = wutf.chars({number,number,...})
convert one or more unicode values into a utf8 string
unicode = wutf.ncode(string,index)
get the utf8 value at the given code index.
Note that this is slower than wutf.code as we must search the string to find the byte index of the code.
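The search that makes this slower can be sketched as a scan that counts the bytes where a code starts. The name byte_index_of_code is hypothetical and the input is assumed to be valid utf8.

```lua
-- Hypothetical helper: find the byte index where the n-th utf8 code starts.
local function byte_index_of_code(s, n)
    local count = 0
    for i = 1, #s do
        local b = string.byte(s, i)
        if b < 0x80 or b >= 0xC0 then
            -- not a continuation byte, so a new code starts here
            count = count + 1
            if count == n then return i end
        end
    end
    return nil -- the string has fewer than n codes
end
```

The cost grows with the position of the code in the string, whereas a lookup by byte index can read the bytes directly.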
unicode = wutf.map_latin0_to_unicode[latin0] or latin0
latin0 = wutf.map_unicode_to_latin0[unicode] or unicode
I prefer the coverage of latin0 (ISO/IEC 8859-15) for font layout, as it takes just a few small differences from unicode to get most of the glyphs needed for western European languages into the first 256 codes.
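The "or latin0" fallback in the lines above suggests the maps only hold the codes that differ. A sketch of such tables, built from the eight positions where ISO/IEC 8859-15 differs from the first 256 unicode values, might look like this. The actual contents of the library tables are an assumption.

```lua
-- The eight codes where ISO/IEC 8859-15 differs from unicode below 256.
local map_latin0_to_unicode = {
    [0xA4] = 0x20AC, -- euro sign
    [0xA6] = 0x0160, -- S with caron
    [0xA8] = 0x0161, -- s with caron
    [0xB4] = 0x017D, -- Z with caron
    [0xB8] = 0x017E, -- z with caron
    [0xBC] = 0x0152, -- OE ligature
    [0xBD] = 0x0153, -- oe ligature
    [0xBE] = 0x0178, -- Y with diaeresis
}

-- Reverse map, built by swapping keys and values.
local map_unicode_to_latin0 = {}
for latin0, unicode in pairs(map_latin0_to_unicode) do
    map_unicode_to_latin0[unicode] = latin0
end

local unicode = map_latin0_to_unicode[0xA4] or 0xA4 -- 0x20AC
local same = map_latin0_to_unicode[0x41] or 0x41    -- "A" is unchanged
```

With only these entries, the eight unicode characters displaced by latin0 (for example U+00A4) would fall through the reverse lookup unchanged and collide with the new glyphs, so a fuller reverse table may treat them separately.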
size = wutf.size(string,index)
get the size in bytes of the utf8 value at the given byte index.
size = wutf.size(string)
get the size in bytes of the utf8 value at the start of this string
The return value will be 1-4 as 4 is the biggest utf8 code size.
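The size can be read from the first byte alone, as in this sketch. The name utf8_size is hypothetical, and the index is assumed to point at the first byte of a code inside the string, not at a continuation byte.

```lua
-- Hypothetical stand-in: size in bytes of the utf8 code starting at index.
local function utf8_size(s, index)
    local b = string.byte(s, index or 1)
    if b < 0x80 then
        return 1 -- plain ascii
    elseif b < 0xE0 then
        return 2 -- lead byte 110xxxxx
    elseif b < 0xF0 then
        return 3 -- lead byte 1110xxxx
    else
        return 4 -- lead byte 11110xxx
    end
end
```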
unicode = wutf.code(string,index)
get the utf8 value at the given byte index.
unicode = wutf.code(string)
get the utf8 value at the start of this string
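Decoding is the reverse of the byte layout: strip the marker bits from each byte and combine what is left. A sketch assuming valid utf8 and an index that points at the first byte of a code follows. The name utf8_code is hypothetical.

```lua
-- Hypothetical stand-in: decode the unicode value starting at a byte index.
local function utf8_code(s, index)
    index = index or 1
    local b1, b2, b3, b4 = string.byte(s, index, index + 3)
    if b1 < 0x80 then
        return b1
    elseif b1 < 0xE0 then
        return (b1 - 0xC0) * 0x40 + (b2 - 0x80)
    elseif b1 < 0xF0 then
        return (b1 - 0xE0) * 0x1000 + (b2 - 0x80) * 0x40 + (b3 - 0x80)
    else
        return (b1 - 0xF0) * 0x40000 + (b2 - 0x80) * 0x1000
            + (b3 - 0x80) * 0x40 + (b4 - 0x80)
    end
end

assert(utf8_code("\226\130\172") == 0x20AC) -- euro sign
```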
(C) 2023 Kriss Blank and released under the MIT license, see for full license text.
local wtxtwords=require("wetgenes.txt.words")
See for source of words and possible alternative licenses.
yes = wtxtwords.check(word)
This is a fast check if the word exists.
May call wtxtwords.load() to auto load data.
list = wtxtwords.transform(word,count,addletters,subletters)
Returns a table of up to count correctly spelled words that you may have misspelled as the input word, ordered by probability.
If the input word is spelled correctly then it will probably be the first word in this list but that is not guaranteed.
addletters is the maximum number of additive transforms, the higher this number the slower this function and it defaults to 4.
subletters is the maximum number of subtractive transforms and will not have much impact on speed, this defaults to the same value as addletters.
We run subletters subtractive transforms on our starting word and then we scan all possible words and perform addletters number of subtractive transforms on them and see if they match any of the transforms we built from our starting word. A match then means we can add up the number of transforms on both sides and that is how many steps it would take to get from one word to another by adding and subtracting letters.
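The matching idea can be sketched for a single pair of words as below. The names deletions and steps_between are hypothetical, the sketch works on bytes, not utf8 characters, and a real implementation would scan a whole word list and rank the results, which is not shown.

```lua
-- Hypothetical helper: every string reachable from word by deleting up to
-- max letters, mapped to the smallest number of deletions needed.
local function deletions(word, max)
    local found = { [word] = 0 }
    local frontier = { word }
    for depth = 1, max do
        local next_frontier = {}
        for _, w in ipairs(frontier) do
            for i = 1, #w do
                local shorter = string.sub(w, 1, i - 1) .. string.sub(w, i + 1)
                if found[shorter] == nil then
                    found[shorter] = depth
                    next_frontier[#next_frontier + 1] = shorter
                end
            end
        end
        frontier = next_frontier
    end
    return found
end

-- Steps from typed to candidate: deletions on the typed word plus deletions
-- on the candidate (the latter stand for additions to the typed word).
local function steps_between(typed, candidate, subletters, addletters)
    local from_typed = deletions(typed, subletters)
    local from_candidate = deletions(candidate, addletters)
    local best
    for form, cost in pairs(from_typed) do
        local other = from_candidate[form]
        if other and (not best or cost + other < best) then
            best = cost + other
        end
    end
    return best -- nil when no shared form exists within the limits
end

assert(steps_between("helo", "hello", 1, 1) == 1) -- add one letter
```

The number of forms grows quickly with the deletion depth, which fits the note above that a higher addletters makes the function slower, since that depth is applied to every candidate word.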