Fisheries and Oceans Canada / Pêches et Océans Canada - Government of Canada / Gouvernement du Canada
 

String Functions


The string functions include: Cat, Numstr, Substr, Str, Strfold, Strpos, StrRegExp, StrRep, Trim, Num, Chartonum, Numtochar, Sprintf, Lower, Upper, Len_NDC, Encode_String, and Decode_String.

None of the string functions has an initial state.


Cat

This function catenates 2 or more character strings or character matrices and returns a single character string as the result. Cat also works with numeric data; see Math Cat.

There are 2 or more parameters:
string1 - a character string to be catenated.
stringn - a character string to be catenated.

cat(string,string,[string]);

filename = cat("FILE",str(90));
print(filename);
FILE90

Alternatively, there are 3 or more parameters when catenating character matrices:
dim - the index of the dimension over which the catenation takes place
(1 = catenate into a matrix adding elements as new rows, 2 = catenate into a matrix adding elements as new columns).
value - any character string or character matrix.
valuen - any character string or character matrix.

cat(dim,value,value,[value]);

cat(1,"test","test");
test
test
cat(2,"test","test");
tt
ee
ss
tt
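
As a rough illustration of the two catenation directions, here is a small Python analogue (not ACON code). The function name `cat_chars` and the list-of-row-strings representation of a character matrix are assumptions made for this sketch, and it assumes all inputs have the same length.

```python
def cat_chars(dim, *values):
    """Catenate equal-length strings into a character matrix (a list of row strings)."""
    if dim == 1:
        # Each value becomes a new row.
        return list(values)
    if dim == 2:
        # Each value becomes a new column: row i holds the i-th character of every value.
        return ["".join(chars) for chars in zip(*values)]
    raise ValueError("dim must be 1 or 2")


print("\n".join(cat_chars(1, "test", "test")))
print("\n".join(cat_chars(2, "test", "test")))
```

With dim = 1 the sketch prints two rows of "test"; with dim = 2 it prints the four rows tt, ee, ss, tt, matching the listing above.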

See the example Data Points and Labels


Numstr

This function requires a character string, starting position, and length, and returns the numeric value of the substring as the result.

There are 3 parameters:
string - a character string.
start - the start of the substring in the string (1 is the first element)
length - the length of the substring to extract.

number = numstr(string,start,length);

a = numstr("1 2 3 4 5",5,1);
print(a);
3

Substr

This function requires a character string, starting position, and length, and returns the character substring as the result.

There are 3 parameters:
string - a character string.
start - the start of the substring in the string (1 is the first element)
length - the length of the substring to extract.

substr(string,start,length);

a = substr("test input string",6,5);
print(a);
input
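
Both Numstr and Substr count positions from 1. A minimal Python analogue (not ACON code) of that indexing might look like the sketch below; returning a float from `numstr` is an assumption of the sketch, not something the ACON documentation specifies.

```python
def substr(string, start, length):
    # ACON positions are 1-based; Python slices are 0-based, so shift by one.
    return string[start - 1:start - 1 + length]


def numstr(string, start, length):
    # Extract the substring, then convert it to a number (float is assumed here).
    return float(substr(string, start, length))


print(substr("test input string", 6, 5))
print(numstr("1 2 3 4 5", 5, 1))
```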

See the example Filled Arrows


Str

This function requires a numeric argument, and returns a character string representation of the argument as the result.

There is 1 parameter:
value - any valid integer or floating point number, vector, or matrix.

str(value);

cyear = str(1990);

See the example Multiple Plots


Strfold

This function requires a text string argument and an optional delimiter character. The function returns a character matrix which has been 'folded' where the delimiter character was found. The resultant matrix has one row per delimited substring (in the example below, one delimiter yields 2 rows), and the number of columns is equal to the length of the longest substring encountered.

There are 2 parameters:
string - a character string.
delimiter - an optional character string containing the character at which the string is folded.

strfold("string"[,"delimiter"]);

m = strfold("This is a test/of string folding","/");
len(m)
2 17
m
This is a test
of string folding
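
The folding behaviour can be sketched in Python (not ACON code) as a split followed by padding. Padding the shorter rows with blanks is an assumption, made so that every row has the same number of columns; the delimiter is required here because the ACON default is not stated on this page.

```python
def strfold(string, delimiter):
    # Split at each delimiter; each piece becomes one row of the matrix.
    rows = string.split(delimiter)
    # Pad every row with blanks to the length of the longest piece (assumed behaviour).
    width = max(len(row) for row in rows)
    return [row.ljust(width) for row in rows]


m = strfold("This is a test/of string folding", "/")
print(len(m), len(m[0]))  # number of rows, number of columns
for row in m:
    print(row)
```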

Strpos

This function requires 2 text string arguments, and determines the position of the 2nd text string within the 1st text string. The integer position (1 origin) is returned as the result. If the 2nd string is not a sub-string of the 1st string, 0 is returned.

There are 2 parameters:
source string - the character string to be searched.
search string - the character string to search for.

integer = strpos(source string, search string);

theposition = strpos("test","es");
print(theposition);
2
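
For comparison, a Python analogue (not ACON code) of the 1-origin result with 0 meaning "not found" could be written as:

```python
def strpos(source, search):
    # str.find returns a 0-based index, or -1 when the substring is absent,
    # so adding 1 gives a 1-origin position, or 0 for "not found".
    return source.find(search) + 1


print(strpos("test", "es"))
print(strpos("test", "xyz"))
```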

StrRegExp

This function uses regular expressions for pattern matching to replace text within a character string. The function requires 3 text string arguments, and replaces all occurrences of the 2nd text string (the pattern) within the 1st text string with the 3rd string. A new string is returned with the changes. The original string is unmodified. For simple exact text matching, use the StrRep() command.

Regular expressions are text patterns that are used for string matching. The expressions contain both plain text and special characters to define the type of pattern matching to perform.

Suppose we are looking for any character in upper case; the regular expression to search for is then "[A-Z]". The brackets indicate that the character being compared should match any one of the characters enclosed within the brackets. The dash (-) between A and Z indicates that the pattern includes all characters between A and Z (using the ASCII character sequence).

Some characters are reserved as regular expression special characters to perform pattern matching (in our example the [, -, and ] characters were special characters). To search for a special character, use a backslash before the special character ("\*" matches a single asterisk).

The special characters are:
^Beginning of the string. The expression "^A" will match an 'A' only at the beginning of the string.
^The caret (^) immediately following the left-bracket ([) has a different meaning. It is used to exclude the remaining characters within brackets from matching the target string. The expression "[^0-9]" indicates that the target character should not be a digit.
$The dollar sign ($) will match the end of the string. The expression "abc$" will match the sub-string "abc" only if it is at the end of the string.
|The alternation character (|) allows either expression on its side to match the target string. The expression "a|b" will match 'a' as well as 'b'.
.The dot (.) will match any character.
*The asterix (*) indicates that the character to the left of the asterix in the expression should match 0 or more times.
+The plus (+) is similar to asterix but there should be at least one match of the character to the left of the + sign in the expression.
?The question mark (?) matches the character to its left 0 or 1 times.
()The parenthesis affects the order of pattern evaluation and also serves as a tagged expression that can be used when replacing the matched sub-string with another expression.
[]Brackets ([ and ]) enclosing a set of characters indicates that any of the enclosed characters may match the target character.

Parentheses, besides affecting the evaluation order of the regular expression, also serve as tagged expressions, which act something like a temporary memory. This memory can then be used when we want to replace the found expression with a new expression. The replace expression can contain an & character, which represents the sub-string that was found. So, if the sub-string that matched the regular expression is "abcd", then a replace expression of "xyz&xyz" will change it to "xyzabcdxyz". The replace expression can also be expressed as "xyz\0xyz". The "\0" indicates a tagged expression representing the entire sub-string that was matched. Similarly, other tagged expressions are represented by "\1", "\2", etc. Note that although tagged expression 0 is always defined, tagged expressions 1, 2, etc. are only defined if the regular expression used in the search had enough sets of parentheses.

Here are a few examples.
String    Search       Replace    Result
Mr.       (Mr)(\.)     \1s\2      Mrs.
abc       (a)b(c)      &-\1-\2    abc-a-c
bcd       (a|b)c*d     &-\1       bcd-b
abcde     (.*)c(.*)    &-\1-\2    abcde-ab-de
cde       (ab|cd)e     &-\1       cde-cd

The description of regular expressions above was taken largely from: Anjum, Zafir. 1999. Using Regular Expressions for Search/Replace.

There are 3 parameters:
source string - the character string to be searched.
search string - the regular expression to be used for searching.
replace string - the regular expression to be used for replacement.

newstring = strRegExp(source string, search string, replacement string);

s2 = strRegExp("test","(t).*(t)","\1xxx\2");
print(s2);
txxxt
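
To experiment with tagged expressions outside ACON, the replacement syntax described above can be approximated with Python's `re` module. This is only a sketch, not ACON code: the helper names are hypothetical, Python's regular expression dialect may differ from ACON's in details, and the handling of a stray backslash (kept as a literal) is an assumption.

```python
import re


def to_python_replacement(repl):
    """Translate '&' and '\\0'..'\\9' into Python's '\\g<n>' group references."""
    out = []
    i = 0
    while i < len(repl):
        ch = repl[i]
        if ch == "\\" and i + 1 < len(repl) and repl[i + 1] in "0123456789":
            # Tagged expression \n becomes \g<n>; \g<0> is the whole match.
            out.append("\\g<" + repl[i + 1] + ">")
            i += 2
        elif ch == "&":
            # '&' stands for the entire matched sub-string.
            out.append("\\g<0>")
            i += 1
        elif ch == "\\":
            # Assumption: any other backslash is kept as a literal backslash.
            out.append("\\\\")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def str_regexp(source, pattern, repl):
    # re.sub replaces every non-overlapping match and returns a new string.
    return re.sub(pattern, to_python_replacement(repl), source)


print(str_regexp("test", "(t).*(t)", "\\1xxx\\2"))
print(str_regexp("abcde", "(.*)c(.*)", "&-\\1-\\2"))
```

The two calls print txxxt and abcde-ab-de, agreeing with the example and the table above.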

StrRep

This function requires 3 text string arguments, and replaces all occurrences of the 2nd text string within the 1st text string with the 3rd string. A new string is returned with the changes. The original string is unmodified. This function fully collapses overlapping reduction patterns, e.g. strrep("teeest","ee","e") returns "test".

There are 3 parameters:
source string - the character string to be searched.
search string - the character string to search for.
replace string - the character string that replaces each occurrence of the search string.

newstring = strrep(source string, search string, replacement string);

s2 = strrep("test","es","xxx");
print(s2);
txxxt
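
The collapsing behaviour can be pictured with a Python analogue (not ACON code). The rule used here, re-applying the replacement only when it is shorter than the search string, is an assumption about what "reduction pattern" means; it also guarantees that the loop terminates.

```python
def strrep(source, search, replacement):
    if not search:
        # Nothing to search for; return the source unchanged.
        return source
    result = source.replace(search, replacement)
    # Assumed meaning of "reduction pattern": the replacement is shorter than
    # the search string, so repeating the pass keeps shrinking the result.
    if len(replacement) < len(search):
        while search in result:
            result = result.replace(search, replacement)
    return result


print(strrep("teeest", "ee", "e"))
print(strrep("test", "es", "xxx"))
```

A single pass over "teeest" gives "teest"; the second pass produces "test", as in the description above.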

Trim

This function requires a string argument, and returns a character string with the trailing blanks removed as the result.

There are 3 parameters:
string - a character string.
delimiter string - optional character string to be used as the delimiter character at which trimming occurs (default is " ").
end flag - optional flag indicating which end of the string to trim (0 = trim end of text, 1 = trim start of text).

trim("character string"[,"delimiter string"[,end flag]]);

newstr = trim("test ");
strlen(newstr)
4
x = "1995/10/28"
print(x);
1995/10/28
y = trim(x,"/",1) /* trim the year off the start */
print(y)
10/28
trim(y,"/") /* trim the day off the end */
10

Num

This function requires a character string argument, and returns a numeric value representation of the argument as the result. If the string cannot be converted to a number, the result is the number 1.0E-4915 [see the function Nan].

There is 1 parameter:
string - any valid character string representing one or more integer or floating point numbers.

number = num(string);

year = num("1990");
num("10 20 test 30")
10 20 30

See the example Filled Arrows


Chartonum

This function requires a character string argument, and returns the equivalent decimal value representation of the ASCII character as the result.

There is 1 parameter:
string - any valid character string or character matrix.

integer = chartonum("string");

chartonum("test string");
116 101 115 116 32 115 116 114 105 110 103

Numtochar

This function requires a decimal value representation of an ASCII character as an argument, and returns the equivalent character (string) as the result.

There is 1 parameter:
number - any valid number, vector or integer matrix representing ASCII character(s).

string = numtochar(number);

numtochar(116 101 115 116 32 115 116 114 105 110 103);
test string
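
Chartonum and Numtochar are inverses of each other. A Python analogue (not ACON code) of the round trip, using the built-in `ord` and `chr` and plain lists in place of ACON vectors, might look like this:

```python
def chartonum(string):
    # One decimal character code per character of the string.
    return [ord(ch) for ch in string]


def numtochar(numbers):
    # Rebuild the string from its decimal character codes.
    return "".join(chr(n) for n in numbers)


codes = chartonum("test string")
print(codes)
print(numtochar(codes))
```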

Sprintf

This function requires 2 arguments, a text string, and a numeric argument. The text string argument is used as the C format string to format the numeric argument. This function returns the formatted number as a text string or character matrix as the result. This function differs from the normal C implementation in that only integer or double numeric values are allowed to be formatted. If the format string includes a single format specifier (i.e. a single % specifier), then the entire format will be used for each element of a vector or matrix to be formatted.

There are 2 parameters:
text string - the format text string to use. You normally should include an integer format %d operator as part of the string for integer numbers, and a double format %f operator as part of the string for floating point numbers.
value - the number(s) to be formatted, usually an extalk variable.

sprintf("string",value);

answer = 6321.583423;
line = sprintf("%8.1f",answer);
6321.6
x = cat(2,1 2 3 4,1 10 18 45);
sprintf("Row %2d: %02d",x);
Row 1: 01
Row 2: 10
Row 3: 18
Row 4: 45
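
The format strings follow C conventions, so they can be tried out with Python's `%` operator, which accepts the same `%d` and `%f` specifiers. The sketch below is Python, not ACON code; the helper name `sprintf_rows` and the use of tuples for matrix rows are assumptions. Python pads each field to the requested width, so the spacing it prints may differ from the listing above.

```python
def sprintf_rows(fmt, rows):
    # Apply the whole format string once per row of the matrix.
    return [fmt % tuple(row) for row in rows]


# Two columns, as built by cat(2, ...) in the example above.
rows = list(zip([1, 2, 3, 4], [1, 10, 18, 45]))
for line in sprintf_rows("Row %2d: %02d", rows):
    print(line)

# A single value formatted in a field 8 characters wide with 1 decimal place.
print("%8.1f" % 6321.583423)
```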

Lower

This function converts its character string (or character matrix) argument to lower case.

There is 1 parameter:
text string - the character string or matrix to convert.

lower("string");

line = lower("Test");
test

Upper

This function converts its character string (or character matrix) argument to upper case.

There is 1 parameter:
text string - the character string or matrix to convert.

upper("string");

line = upper("Test");
TEST

Len_NDC

This function returns the length of a character string (or character matrix) argument in NDC units if it were to be plotted. Trailing blanks are not included in the calculation of the length of the string.

Note that the actual length of the string (number of characters) may be calculated using the Len command.

There is 1 parameter:
text string - the character string or matrix to calculate the length of.

Len_NDC("string");

ltest = len_NDC("Test");
0.0344667

Encode_String

This function converts its character string argument to an encoded string using a key to encode the string.

There are 2 parameters:
text string - the character string to convert.
key string - the character string to be used as the encoding key.

Encode_String("string","key");

newstring = Encode_String("Test","dog");

Decode_String

This function converts its character string argument to a decoded string using a key to decode the string.

There are 2 parameters:
text string - the character string to convert.
key string - the character string to be used as the decoding key.

Decode_String("string","key");

newstring = Decode_String("Test","dog");





Last Modified : 2005-11-14