[NAME]
ALL.dao.type.string.method

[TITLE]
Dao string methods

[DESCRIPTION]

Here is the full list of string methods:

    string( count: int, char = 0 ) => string
    string( count: int )[index: int => string] => string
    size( invar self: string, utf8 = false ) => int
    insert( invar self: string, str: string, pos = 0 ) => string
    erase( invar self: string, pos = 0, count = -1 ) => string
    chop( invar self: string, utf8 = 0 ) => string
    trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string
    find( invar self: string, str: string, from = 0, reverse = 0 ) => int
    convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string
    replace( invar self: string, str1: string, str2: string, index = 0 ) => string
    expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
        spec = "$", keep = 1 ) => string
    split( invar self: string, sep = "" ) => list<string>
    iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]
    fetch( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
        => string
    match( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
        => tuple<start:int,end:int>|none
    change( invar self: string, pattern: string, target: string, index = 0,
        start = 0, end = -1 ) => string
    capture( invar self: string, pattern: string, start = 0, end = -1 ) => list<string>
    extract( invar self: string, pattern: string,
        mtype: enum<both,matched,unmatched> = $matched ) => list<string>
    scan( invar self: string, pattern: string, start = 0, end = -1 )
        [start: int, end: int, state: enum<unmatched,matched> => none|@V]
        => list<@V>

    offset( invar self: string, charIndex: int ) => int
    char( invar self: string, charIndex: int ) => string

0.1 Initialization Methods

In addition to creating strings with string literals, the following two initialization methods are provided for convenient and flexible construction of strings.

    string( count: int, char = 0 ) => string
    string( count: int )[index: int => string] => string

0.1.1 string(count:int,char=0)=>string

    string( count: int, char = 0 ) => string

Create and return a string composed of "count" copies of "char".

Examples,

    var S1 = string( 5, 'S'[0] )  # SSSSS
    var S2 = string( 3, 0x9053 )  # 道道道

0.1.2 string(count:int)[index:int=>string]=>string

    string( count: int )[index: int => string] => string

Create and return a string that is the concatenation of the strings resulting from the execution of the code section.

Examples,

    var S1 = string( 5 ){ "S" }                     # SSSSS
    var S2 = string( 2 ){ "道语言" }                # 道语言道语言
    var S3 = string( 6 ){ [index] (string) index }  # 012345

0.2 Methods

The following string methods are provided for convenient manipulation of strings.

    size( invar self: string, utf8 = false ) => int
    insert( invar self: string, str: string, pos = 0 ) => string
    erase( invar self: string, pos = 0, count = -1 ) => string
    chop( invar self: string, utf8 = 0 ) => string
    trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string
    find( invar self: string, str: string, from = 0, reverse = 0 ) => int
    convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string
    replace( invar self: string, str1: string, str2: string, index = 0 ) => string
    expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
        spec = "$", keep = 1 ) => string
    split( invar self: string, sep = "" ) => list<string>
    iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]
    offset( invar self: string, charIndex: int ) => int
    char( invar self: string, charIndex: int ) => string

0.2.1 size(invar self:string,utf8=false)=>int

    size( invar self: string, utf8 = false ) => int

Return the number of bytes or characters in the string.

Examples,

    var S = "ABCDE"
    var len1 = S.size()
    var len2 = "XYZ".size()

For efficiency, consider using the size operator %: %S gives the length of the string.

0.2.2 insert(invar self:string,str:string,pos=0)=>string

    insert( invar self: string, str: string, pos = 0 ) => string

Insert "str" at "pos"; return a new string.

0.2.3 erase(invar self:string,pos=0,count=-1)=>string

    erase( invar self: string, pos = 0, count = -1 ) => string

Erase "count" bytes starting from "pos"; return a new string.

0.2.4 chop(invar self:string,utf8=0)=>string

    chop( invar self: string, utf8 = 0 ) => string

Chop EOF, '\n' and/or '\r' off the end of the string:
-- EOF is checked first and removed if found;
-- '\n' is then checked and removed if found;
-- '\r' is checked last and removed if found.

If "utf8" is not zero, all bytes that do not constitute a valid UTF-8 encoding sequence are removed from the end.

Examples,

    var S1 = "line\n"
    var S2 = S1.chop()

    var S3 = "道语言"[:8]
    var S4 = S3.chop( 1 )  # 道语

0.2.5 trim(invar self:string,where:enum<head;tail>=$head+$tail,utf8=0)=>string

    trim( invar self: string, where: enum<head;tail> = $head+$tail, utf8 = 0 ) => string

Trim whitespace from the head and/or the tail of the string. If "utf8" is not zero, all bytes that do not constitute a valid UTF-8 encoding sequence are trimmed as well.

Examples,

    var S1 = "\tline\n"
    var S2 = S1.trim()            # "line"
    var S3 = S1.trim( $head )     # "line\n"

    var S4 = "\t道语言"[:8]
    var S5 = S4.trim()            # "道语??"
    var S6 = S4.trim( $tail, 1 )  # "\t道语"

0.2.6 find(invar self:string,str:string,from=0,reverse=0)=>int

    find( invar self: string, str: string, from = 0, reverse = 0 ) => int

Find the first occurrence of "str" in this string, searching from "from". If "reverse" is zero, search forward; otherwise search backward.

Return value:
-- -1, if "str" is not found;
-- the index of the first byte of the found substring, for forward searching;
-- the index of the last byte of the found substring, for backward searching.

Examples,

    var S1 = "dao programming language and dao virtual machine"
    var P1 = S1.find( "dao" )         # Find the first "dao";
    var P2 = S1.find( "dao", -1, 1 )  # Find the last "dao";

0.2.7 convert(invar self:string,to:enum<local,utf8,lower,upper>)=>string

    convert( invar self: string, to: enum<local,utf8,lower,upper> ) => string

Convert the string:
-- To local encoding if the string is encoded in UTF-8;
-- To UTF-8 encoding if the string is not encoded in UTF-8;
-- To lower case;
-- To upper case.

Examples,

    var S1 = "Dao Language"
    var S2 = S1.convert( $upper )  # DAO LANGUAGE
    var S3 = S1.convert( $lower )  # dao language

0.2.8 replace(invar self:string,str1:string,str2:string,index=0)=>string

    replace( invar self: string, str1: string, str2: string, index = 0 ) => string

Replace the substring "str1" in "self" with "str2". If "index" is zero, all occurrences of "str1" are replaced with "str2"; otherwise, only the "index"-th occurrence is replaced. A positive "index" counts occurrences forward from the start; a negative "index" counts backward from the end.

Examples,

    var S1 = "dao programming language and dao virtual machine"
    var S2 = S1.replace( "dao", "fast", 1 )   # replace the first "dao" with "fast";
    var S3 = S1.replace( "dao", "fast", -1 )  # replace the last "dao" with "fast";
    var S4 = S1.replace( "dao", "fast", 0 )   # replace all "dao" with "fast";

0.2.9 expand(invar self:string,invar subs:map<...>|tuple<...>,spec="$",keep=1)=>...

    expand( invar self: string, invar subs: map<string,string>|tuple<...:string>,
        spec = "$", keep = 1 ) => string

Expand this string into a new string in which substrings matching the keys of "subs" are substituted with the corresponding values of "subs".

If "spec" is not an empty string, each key must occur inside a pair of parentheses preceded by "spec"; the "spec", the parentheses and the key are together substituted by the corresponding value from "subs".

If "spec" is not empty and "keep" is zero, occurrences of "spec(key)" whose key is not found among the keys of "subs" are removed; otherwise they are kept.

Examples,

    var S1 = '<a href="@(url)">@(name)</a>'
    var keyvalues = { "url" => "http://daovm.net", "name" => "Dao Website" }
    var S2 = S1.expand( keyvalues, "@" )  # <a href="http://daovm.net">Dao Website</a>

0.2.10 split(invar self:string,sep="")=>list<string>

    split( invar self: string, sep = "" ) => list<string>

Split the string by the separator "sep", and return the tokens as a list. If "sep" is empty, split at character boundaries, assuming UTF-8 encoding.

Examples,

    var S1 = "dao::io::stdio"
    var L1 = S1.split( "::" )  # { "dao", "io", "stdio" }

    var S2 = "道语言"
    var L2 = S2.split()        # { "道", "语", "言" }

0.2.11 iterate(invar self:string,unit:enum<byte,char>=$byte)[char:int,index:int]

    iterate( invar self: string, unit: enum<byte,char> = $byte )[char: int, index: int]

Iterate over each unit of the string:
-- If "unit" is "$byte", iterate per byte;
-- If "unit" is "$char", iterate per character, assuming UTF-8 encoding; each byte that is not part of a valid UTF-8 encoding unit is iterated once.

For the code section parameters, the first holds the byte value or character codepoint for each iteration, and the second holds the byte location in the string.

Examples,

0.2.12 offset(invar self:string,charIndex:int)=>int

    offset( invar self: string, charIndex: int ) => int

Get the byte offset of the character with index "charIndex".
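
As an illustration of how character indices relate to byte locations, the following sketch combines iterate() with offset(). It assumes that the source file is UTF-8 encoded, that "charIndex" is zero-based, and that io.writeln() is available for printing; none of these is stated in this section. The code section parameter names "cp" and "pos" are arbitrary choices for this example.

```dao
var S = "道语言"

# Iterate per character: "cp" receives the character codepoint,
# "pos" the byte location of that character in the string.
S.iterate( $char ){ [cp, pos]
    io.writeln( cp, pos )
}

# Byte offset of the character with index 1 (assumed zero-based);
# it should agree with the "pos" reported for that character above.
var off = S.offset( 1 )
io.writeln( off )
```

Each of these characters takes three bytes in UTF-8, so consecutive characters would be expected to start three bytes apart.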

0.2.13 char(invar self:string,charIndex:int)=>string

    char( invar self: string, charIndex: int ) => string

Get the character with index "charIndex".

0.3 Pattern Matching Methods

Please see dao.type.string.pattern for more information.

    fetch( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
        => string
    match( invar self: string, pattern: string, group = 0, start = 0, end = -1 )
        => tuple<start:int,end:int>|none
    change( invar self: string, pattern: string, target: string, index = 0,
        start = 0, end = -1 ) => string
    capture( invar self: string, pattern: string, start = 0, end = -1 ) => list<string>
    extract( invar self: string, pattern: string,
        mtype: enum<both,matched,unmatched> = $matched ) => list<string>
    scan( invar self: string, pattern: string, start = 0, end = -1 )
        [start: int, end: int, state: enum<unmatched,matched> => none|@V]
        => list<@V>