Dao Help: Dao string methods

[NAME]
ALL.dao.type.string.method

[TITLE]
Dao string methods

[DESCRIPTION]

Here is the full list of string methods: 
     
   string( count: int, char = 0 ) => string
   string( count: int )[index: int => string] => string
   size( invar self: string, utf8 = false ) => int
   insert( invar self: string, str: string, pos = 0 ) => string
   erase( invar self: string, pos = 0, count = -1 ) => string
   chop( invar self: string, utf8 = 0 ) => string
   trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string
   find( invar self: string, str: string, from = 0, reverse = 0 ) => int
   convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string
   replace( invar self: string, str1: string, str2: string, index = 0 ) => string
   expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
           spec = "$", keep = 1 ) => string
   split( invar self: string, sep = "" ) => list<string>
   iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]
   fetch( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
       => string
   match( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
       => tuple<start:int,end:int>|none
   change( invar self: string, pattern: string, target: string, index = 0,
       start = 0, end = -1 ) => string
   capture( invar self: string, pattern: string, start = 0, end = -1 ) => list<string>
   extract( invar self: string, pattern: string,
       mtype: enum<both,matched,unmatched> = $matched ) => list<string>
   scan( invar self: string, pattern: string, start = 0, end = -1 )
       [start: int, end: int, state: enum<unmatched,matched> => none|@V]
       => list<@V>

   offset( invar self: string, charIndex: int ) => int
   char( invar self: string, charIndex: int ) => string
     


 0.1   Initialization Methods  

In addition to creating strings with string literals, the following two initialization
methods are provided for convenient and flexible construction of strings.
     
   string( count: int, char = 0 ) => string
   string( count: int )[index: int => string] => string
     


 0.1.1   string(count:int,char=0)=>string  
     
   string( count: int, char = 0 ) => string
     
Create and return a string composed of "count" copies of the character whose code is "char".

Examples, 
     
   var S1 = string( 5, 'S'[0] )   # SSSSS
   var S2 = string( 3, 0x9053 )   # 道道道
     


 0.1.2   string(count:int)[index:int=>string]=>string  
     
   string( count: int )[index: int => string] => string
     
Create and return a string that is the concatenation of the strings returned by "count"
executions of the code section.

Examples, 
     
   var S1 = string( 5 ){ "S" }       # SSSSS
   var S2 = string( 2 ){ "道语言" }  # 道语言道语言
   var S3 = string( 6 ){ [index] (string) index }  # 012345
     


 0.2   Methods  

The following string methods are provided for convenient manipulation of strings. 
     
   size( invar self: string, utf8 = false ) => int
   insert( invar self: string, str: string, pos = 0 ) => string
   erase( invar self: string, pos = 0, count = -1 ) => string
   chop( invar self: string, utf8 = 0 ) => string
   trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string
   find( invar self: string, str: string, from = 0, reverse = 0 ) => int
   convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string
   replace( invar self: string, str1: string, str2: string, index = 0 ) => string
   expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
           spec = "$", keep = 1 ) => string
   split( invar self: string, sep = "" ) => list<string>
   iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]
   offset( invar self: string, charIndex: int ) => int
   char( invar self: string, charIndex: int ) => string
     


 0.2.1   size(invar self:string,utf8=false)=>int  
     
   size( invar self: string, utf8 = false ) => int
     
Return the number of bytes in the string, or the number of characters if "utf8" is true.

Examples, 
     
   var S = "ABCDE"
   var len1 = S.size()
   var len2 = "XYZ".size()
     

For efficiency, consider using the size operator %: %S gives the length of the string S.
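
As a minimal sketch of the "utf8" parameter, the following contrasts the two counts for
a multi-byte string. It assumes the string content is stored as UTF-8 and that a true
"utf8" argument selects character counting, as the signature suggests; the variable
names are illustrative only.

```dao
var S = "道语言"
var nbytes = S.size()        # number of bytes
var nchars = S.size( true )  # number of characters, assuming UTF-8 content
```

Under UTF-8 each of these three characters occupies three bytes, so the two results
should differ.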

 0.2.2   insert(invar self:string,str:string,pos=0)=>string  
     
   insert( invar self: string, str: string, pos = 0 ) => string
     
Insert "str" at "pos";
Return a new string;
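
A minimal sketch of typical calls, assuming "pos" is a zero-based index and that "str"
is placed before the content currently at that index. The strings and variable names
are illustrative only.

```dao
var S1 = "dao language"
var S2 = S1.insert( "the " )             # default pos = 0: insert at the beginning
var S3 = S1.insert( "programming ", 4 )  # insert at index 4, just after "dao "
```

Since "self" is declared invar and a new string is returned, S1 is expected to be
unchanged by both calls.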

 0.2.3   erase(invar self:string,pos=0,count=-1)=>string  
     
   erase( invar self: string, pos = 0, count = -1 ) => string
     
Erase "count" bytes starting from "pos";
Return a new string;
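
A minimal sketch of typical calls, assuming "pos" is a zero-based byte index and that
the default "count" of -1 means "up to the end of the string" (an assumption; the
description above does not state it). The strings and variable names are illustrative.

```dao
var S1 = "dao programming language"
var S2 = S1.erase( 3, 12 )  # erase 12 bytes starting from index 3
var S3 = S1.erase( 3 )      # count = -1, assumed to erase everything from index 3 on
```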

 0.2.4   chop(invar self:string,utf8=0)=>string  
     
   chop( invar self: string, utf8 = 0 ) => string
     
Chop EOF, '\n' and/or '\r' off the end of the string;
-- EOF is first checked and removed if found;
-- '\n' is then checked and removed if found;
-- '\r' is last checked and removed if found;
If "utf8" is not zero, all bytes that do not constitute a valid UTF-8 encoding sequence
are removed from the end.

Examples, 
     
   var S1 = "line\n"
   var S2 = S1.chop()

   var S3 = "道语言"[:8]
   var S4 = S3.chop( 1 )  # 道语
     


 0.2.5   trim(invar self:string,where:enum<head;tail>=$head+$tail,utf8=0)=>string  
     
   trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string
     
Trim whitespace from the head and/or the tail of the string;
If "utf8" is not zero, all bytes that do not constitute a valid UTF-8 encoding sequence
are trimmed as well.

Examples, 
     
   var S1 = "\tline\n"
   var S2 = S1.trim()         # "line"
   var S3 = S1.trim( $head )  # "line\n"

   var S4 = "\t道语言"[:8]
   var S5 = S4.trim()            # "道语??"
   var S6 = S4.trim( $tail, 1 )  # "\t道语"
     


 0.2.6   find(invar self:string,str:string,from=0,reverse=0)=>int  
     
   find( invar self: string, str: string, from = 0, reverse = 0 ) => int
     
Find the first occurrence of "str" in this string, searching from "from";
If "reverse" is zero, search forward, otherwise backward;
Return -1, if "str" is not found; Otherwise,
Return the index of the first byte of the found substring for forward searching;
Return the index of the last byte of the found substring for backward searching;

Examples, 
     
   var S1 = "dao programming language and dao virtual machine"
   var P1 = S1.find( "dao" )         # Find the first "dao";
   var P2 = S1.find( "dao", -1, 1 )  # Find the last "dao";
     


 0.2.7   convert(invar self:string,to:enum<local,utf8,lower,upper>)=>string  
     
   convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string
     
Convert the string according to "to":
-- $local: to local encoding if the string is encoded in UTF-8;
-- $utf8: to UTF-8 encoding if the string is not encoded in UTF-8;
-- $lower: to lower case;
-- $upper: to upper case;

Examples, 
     
   var S1 = "Dao Language"
   var S2 = S1.convert( $upper )  # DAO LANGUAGE
   var S3 = S1.convert( $lower )  # dao language
     


 0.2.8   replace(invar self:string,str1:string,str2:string,index=0)=>string  
     
   replace( invar self: string, str1: string, str2: string, index = 0 ) => string
     
Replace the substring "str1" in "self" with "str2";
Replace all occurrences of "str1" with "str2" if "index" is zero;
Otherwise, replace only the "index"-th occurrence;
A positive "index" counts from the beginning;
A negative "index" counts from the end;

Examples, 
     
   var S1 = "dao programming language and dao virtual machine"
   var S2 = S1.replace( "dao", "fast", 1 )   # replace the first "dao" with "fast";
   var S3 = S1.replace( "dao", "fast", -1 )  # replace the last  "dao" with "fast";
   var S4 = S1.replace( "dao", "fast", 0 )   # replace all "dao" with "fast";
     


 0.2.9   expand(invar self:string,invar subs:map<...>|tuple<...>,spec="$",keep=1)=>...  
     
   expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
           spec = "$", keep = 1 ) => string
     
Expand this string into a new string in which substrings matching the keys of "subs" are
substituted with the corresponding values of "subs".
If "spec" is not an empty string, each key must occur inside a pair of parentheses
preceded by "spec", and the "spec", the parentheses and the key are together substituted
by the corresponding value from "subs";
If "spec" is not empty and "keep" is zero, any "spec(key)" whose key is not found among
the keys of "subs" is removed; Otherwise it is kept.

Examples, 
     
   var S1 = '<a href="@(url)">@(name)</a>'
   var keyvalues = { "url" => "http://daovm.net", "name" => "Dao Website" }
   var S2 = S1.expand( keyvalues, "@" )  # <a href="http://daovm.net">Dao Website</a>
     


 0.2.10   split(invar self:string,sep="")=>list<string>  
     
   split( invar self: string, sep = "" ) => list<string>
     
Split the string by separator "sep", and return the tokens as a list.
If "sep" is empty, split at character boundaries assuming UTF-8 encoding.

Examples, 
     
   var S1 = "dao::io::stdio"
   var L1 = S1.split( "::" )  # { "dao", "io", "stdio" }

   var S2 = "道语言"
   var L2 = S2.split()  # { "道", "语", "言" }
     


 0.2.11   iterate(invar self:string,unit:enum<byte,char>=$byte)[char:int,index:int]  
     
   iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]
     
Iterate over each unit of the string.
If "unit" is "$byte", iterate per byte;
If "unit" is "$char", iterate per character; Assuming UTF-8 encoding;
Each byte that is not part of a valid UTF-8 encoding unit is iterated once.
For the code section parameters, the first will hold the byte value or character
codepoint for each iteration, and the second will be the byte location in the string.

Examples, 
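
A minimal sketch of both iteration units, assuming the string content is UTF-8 and that
Dao's standard io.writeln is available for output (it is not part of the string methods
described here). The code section parameter names follow the signature above.

```dao
var S = "道语言"

# Per character: "char" holds the codepoint, "index" the byte location.
S.iterate( $char ){ [char, index]
    io.writeln( index, char )
}

# Per byte: "char" holds the byte value, "index" the byte location.
S.iterate( $byte ){ [char, index]
    io.writeln( index, char )
}
```

With UTF-8 content the first loop should run once per character and the second once per
byte.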
     

     


 0.2.12   offset(invar self:string,charIndex:int)=>int  
     
   offset( invar self: string, charIndex: int ) => int
     
Get the byte offset of the character with index "charIndex".

 0.2.13   char(invar self:string,charIndex:int)=>string  
     
   char( invar self: string, charIndex: int ) => string
     
Get the character with index "charIndex".
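
A minimal sketch relating character indices to byte offsets with offset() and char().
It assumes UTF-8 string content and zero-based character indices (an assumption; the
descriptions do not state the index base). The variable names are illustrative only.

```dao
var S = "道语言"
var pos = S.offset( 1 )  # byte offset of the character at index 1
var chr = S.char( 1 )    # the character at index 1, returned as a string
```

Under these assumptions index 1 would refer to "语", which begins three bytes into the
string because "道" occupies three bytes in UTF-8.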

 0.3   Pattern Matching Methods  

Please see dao.type.string.pattern for more information.

     
   fetch( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
       => string
   match( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
       => tuple<start:int,end:int>|none
   change( invar self: string, pattern: string, target: string, index = 0,
       start = 0, end = -1 ) => string
   capture( invar self: string, pattern: string, start = 0, end = -1 ) => list<string>
   extract( invar self: string, pattern: string,
       mtype: enum<both,matched,unmatched> = $matched ) => list<string>
   scan( invar self: string, pattern: string, start = 0, end = -1 )
       [start: int, end: int, state: enum<unmatched,matched> => none|@V]
       => list<@V>