Ruby string
The String object in Ruby is used to store or manipulate a sequence of one or more bytes.
Ruby strings are divided into single-quoted strings (') and double-quoted strings ("). The difference is that double-quoted strings support more escape characters.
Single quotation mark string
The simplest string is the single quote string, which stores the string within the single quote:
'This is a string from a Ruby program'
If you need to use a single quote character within a single-quoted string, escape it with a backslash (\) so that the Ruby interpreter does not treat it as the end of the string:
'Won\'t you read O\'Reilly\'s book?'
A backslash can also escape another backslash so that the second backslash itself is not interpreted as an escape character.
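The escaping rules for single-quoted strings can be sketched as follows. The variable names (book, path, raw) are only illustrative and do not come from the original text.

```ruby
# In a single-quoted string only \' and \\ are treated as escapes.
book = 'Won\'t you read O\'Reilly\'s book?'
path = 'C:\\temp'        # \\ becomes a single backslash
raw  = 'line1\nline2'    # \n is NOT a newline here: the backslash and the n are kept

puts book          # Won't you read O'Reilly's book?
puts path          # C:\temp
puts raw.length    # 12
```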
The following are string-related features in Ruby.
Double quotation mark string
In a double-quoted string, we can use #{} (a pound sign followed by curly braces) to evaluate an expression and insert its value:
Embed variables in the string:
Example
#!/usr/bin/ruby
# -*- coding: UTF-8 -*-
name1 = "Joe"
name2 = "Mary"
puts "Hello #{name1}, where is #{name2}?"
The output of the above example is as follows:
Hello Joe, where is Mary?
Perform mathematical operations in a string:
Example
#!/usr/bin/ruby
# -*- coding: UTF-8 -*-
x, y, z = 12, 36, 72
puts "The value of x is #{ x }"
puts "The value of x+y is #{ x + y }"
puts "The average value of x+y+z is #{ (x + y + z)/3 }"
The output of the above example is as follows:
The value of x is 12
The value of x+y is 48
The average value of x+y+z is 40
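As a small sketch of how the quote style changes interpolation, the following compares the same text in double and single quotes. The names (name, total) are hypothetical.

```ruby
name  = "Joe"
total = 3

double = "Hello #{name}, you have #{total * 2} items"  # expressions are evaluated
single = 'Hello #{name}'                               # kept literally, no evaluation

puts double   # Hello Joe, you have 6 items
puts single   # Hello #{name}
```

Interpolation converts the value of the expression to a string, so numbers such as total * 2 can be embedded directly.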
Ruby also supports %q and %Q for writing string literals: %q follows the single-quote rules, while %Q follows the double-quote rules. Each is followed by an opening delimiter such as (, !, [ or {, and the string ends at the matching closing delimiter such as ), !, ] or }.
The character following q or Q is the delimiter. The delimiter can be any single-byte character that is not alphanumeric, for example [, {, (, < or !. The string is read until the matching terminator is found.
Example
#!/usr/bin/ruby
# -*- coding: UTF-8 -*-
desc1 = %Q{Ruby strings can use '' and "".}
desc2 = %q|Ruby strings can use '' and "".|
puts desc1
puts desc2
The output of the above example is as follows:
Ruby strings can use '' and "".
Ruby strings can use '' and "".
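The difference between the two forms shows up once the string contains an interpolation. A minimal sketch, with illustrative variable names that are not part of the original example:

```ruby
lang = "Ruby"

a = %Q{#{lang} uses "double" and 'single' quotes}  # interpolates, like "..."
b = %q{#{lang} uses "double" and 'single' quotes}  # literal, like '...'
c = %q(nested (parentheses) are fine)              # paired delimiters may nest
d = %Q|pipes work too: #{1 + 1}|                   # an unpaired delimiter closes with the same character

puts a   # Ruby uses "double" and 'single' quotes
puts b   # #{lang} uses "double" and 'single' quotes
puts c   # nested (parentheses) are fine
puts d   # pipes work too: 2
```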
Escape character
The following table lists the escape sequences and non-printing characters that can be written with a backslash.
Note: escape characters are parsed within a string enclosed in double quotes. Within a string enclosed in single quotes, escape characters are not parsed and are output as is (apart from \' and \\).
Backslash notation | Hexadecimal character | Description |
---|---|---|
\a | 0x07 | Bell (alert) |
\b | 0x08 | Backspace |
\cx | | Control-x |
\C-x | | Control-x |
\e | 0x1b | Escape |
\f | 0x0c | Form feed |
\M-\C-x | | Meta-Control-x |
\n | 0x0a | Newline |
\nnn | | Octal notation, where each n is in the range 0-7 |
\r | 0x0d | Carriage return |
\s | 0x20 | Space |
\t | 0x09 | Tab |
\v | 0x0b | Vertical tab |
\x | | The character x |
\xnn | | Hexadecimal notation, where each n is in the range 0-9, a-f, or A-F |
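A few of the escapes in the table can be checked by looking at their character codes. This is only a sketch; the variable names are illustrative.

```ruby
tab    = "\t"
bell   = "\a"
octal  = "\101"   # octal 101 = decimal 65 = "A"
hex    = "\x41"   # hexadecimal 41 = decimal 65 = "A"
ctrl_a = "\C-a"   # Control-a, the same character as "\ca"

puts tab.ord         # 9
puts bell.ord        # 7
puts octal           # A
puts hex             # A
puts ctrl_a.ord      # 1
puts "\s" == " "     # true
puts '\t'.length     # 2 (single quotes keep the backslash and the t)
```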
Character encoding
In Ruby 1.8, the default character set is ASCII, in which each character is represented by a single byte. If you use UTF-8 or another modern character set, a character may be represented by one to four bytes.
In Ruby 1.8 you can use the $KCODE variable to change the character set, as follows (in Ruby 1.9 and later, $KCODE has no effect, because each string carries its own encoding):
$KCODE = 'u'
The following are the possible values of $KCODE.
Code | Description |
---|---|
a | ASCII (same as none). This is the default. |
e | EUC. |
n | None (same as ASCII). |
u | UTF-8. |
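On Ruby 1.9 and later, where $KCODE no longer applies, the point about multi-byte characters can be observed on the string itself. The following is a sketch under that version assumption; the variable names are illustrative.

```ruby
# Assumes Ruby 1.9 or later, where every String carries its own encoding.
s = "h\u00e9llo"                  # "héllo", written with a Unicode escape

puts s.encoding                   # UTF-8
puts s.length                     # 5 (characters)
puts s.bytesize                   # 6 (bytes; é takes two bytes in UTF-8)

latin1 = s.encode("ISO-8859-1")   # transcode to a single-byte encoding
puts latin1.bytesize              # 5
```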
String built-in methods
We need an instance of the String object to call String methods. The following is how to create an instance of a String object:
String.new(str="")
This returns a new String object containing a copy of str. Now, using the str object, we can call any available instance method. For example:
Example
#!/usr/bin/ruby
myStr = String.new("THIS IS TEST")
foo = myStr.downcase
puts "#{foo}"
This will produce the following results:
this is test
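That String.new returns a copy, not the same object, can be sketched as follows. The variable names (original, copy) are hypothetical.

```ruby
original = "THIS IS TEST"
copy     = String.new(original)

puts copy == original         # true  (same content)
puts copy.equal?(original)    # false (a different object)

copy << "!"                   # changes only the copy
puts original                 # THIS IS TEST
puts copy                     # THIS IS TEST!
puts String.new.empty?        # true  (the default argument is "")
```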
The following are the common string methods (assuming str is a String object):
Serial number | Method & description |
---|---|
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | str =~ obj: Matches str against the regular expression pattern obj. Returns the position where the match begins, or nil if there is no match. |
8 | |
9 | str.capitalize: Returns a copy of str with the first letter converted to uppercase and the rest to lowercase. |
10 | str.capitalize!: Same as capitalize, but modifies str in place and returns nil if no changes were made. |
11 | str.casecmp: Case-insensitive string comparison. |
12 | str.center: Centers the string. |
13 | str.chomp: Removes the record separator from the end of the string. |
14 | str.chomp!: Same as chomp, but modifies str in place and returns it; returns nil if nothing was removed. |
15 | str.chop: Removes the last character from str. |
16 | str.chop!: Same as chop, but modifies str in place and returns it; returns nil if str is empty. |
17 | |
18 | |
19 | |
20 | |
21 | |
22 | str.downcase: Returns a copy of str with all uppercase letters replaced by lowercase letters. |
23 | str.downcase!: Same as downcase, but modifies str in place and returns it; returns nil if no changes were made. |
24 | str.dump: Returns a version of str in which all non-printing characters are replaced with escape sequences. |
25 | |
26 | |
27 | |
28 | |
29 | |
30 | |
31 | |
32 | |
33 | |
34 | str.hash: Returns a hash based on the length and content of the string. |
35 | str.hex: Treats the leading characters of str as a string of hexadecimal digits (with an optional sign and an optional 0x) and returns the corresponding number. If the conversion fails, 0 is returned. |
36 | |
37 | |
38 | |
39 | str.inspect: Returns a printable version of str with special characters escaped. |
40 | |
41 | |
42 | |
43 | |
44 | |
45 | |
46 | str.oct: Treats the leading characters of str as a string of octal digits (with an optional sign) and returns the corresponding number. If the conversion fails, 0 is returned. |
47 | |
48 | |
49 | |
50 | |
51 | |
52 | |
53 | |
54 | |
55 | |
56 | str.split(pattern=$;, [limit]): If pattern is a String, it is used as the delimiter when splitting str. If pattern is a single space, str is split on whitespace, ignoring leading whitespace and runs of consecutive whitespace characters. If pattern is a Regexp, str is split where the pattern matches. When pattern matches a zero-length string, str is split into individual characters. If the pattern parameter is omitted, the value of $; is used. If the limit parameter is omitted, trailing null fields are suppressed. If limit is a positive number, at most that number of fields is returned (if limit is 1, the entire string is returned as the only entry in the array). If limit is a negative number, there is no limit to the number of fields returned and trailing null fields are not suppressed. |
57 | |
58 | |
59 | str.strip: Returns a copy of str with leading and trailing whitespace removed. |
60 | str.strip!: Removes leading and trailing whitespace from str in place, and returns nil if there is no change. |
61 | |
62 | |
63 | |
64 | |
65 | str.sum(n=16): Returns the n-bit checksum of the characters in str, where n is an optional Fixnum parameter that defaults to 16. The result is simply the sum of the binary value of each character in str, modulo 2**n - 1. This is not a very good checksum. |
66 | str.swapcase: Returns a copy of str with all uppercase letters converted to lowercase and all lowercase letters converted to uppercase. |
67 | |
68 | |
69 | |
70 | |
71 | |
72 | |
73 | |
74 | |
75 | |
76 | str.upcase: Returns a copy of str with all lowercase letters replaced by uppercase letters. The operation is locale-insensitive, and only the characters a to z are affected. |
77 | str.upcase!: Changes the content of str to uppercase in place, and returns nil if there is no change. |
78 | |
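Several of the methods described in the table can be tried in a short script. This is a sketch only; the sample strings and the variable names s and t are illustrative.

```ruby
s = "  Hello, Ruby World\n"

p "hello" =~ /ll/              # 2   (index where the match starts)
p "hello" =~ /xyz/             # nil (no match)
p "rUBY".capitalize            # "Ruby"
p "Ruby".casecmp("RUBY")       # 0   (equal when case is ignored)
p "hi".center(6, "*")          # "**hi**"
p s.chomp                      # "  Hello, Ruby World"
p "Ruby".chop                  # "Rub"
p s.strip                      # "Hello, Ruby World"
p "Ruby".swapcase              # "rUBY"
p "ff".hex                     # 255
p "17".oct                     # 15
p "ab".sum                     # 195 (97 + 98)

# split: whitespace mode, default limit, negative limit, positive limit
p "a b  c".split(" ")          # ["a", "b", "c"]
p "a,b,,".split(",")           # ["a", "b"]           trailing empty fields dropped
p "a,b,,".split(",", -1)       # ["a", "b", "", ""]   trailing empty fields kept
p "a,b,c".split(",", 2)        # ["a", "b,c"]

# Bang methods change the receiver and return nil when nothing changed.
t = "ruby".dup
p t.upcase!                    # "RUBY"
p t.upcase!                    # nil
```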
String unpack directives
The following table lists the unpacking directives for the String#unpack method.
Directive | Returns | Description |
---|---|---|
A | String | Removes trailing nulls and spaces. |
a | String | String. |
B | String | Extracts bits from each character (most significant bit first). |
b | String | Extracts bits from each character (least significant bit first). |
C | Integer | Extracts a character as an unsigned integer. |
c | Integer | Extracts a character as an integer. |
D, d | Float | Treats sizeof(double) characters as a native double. |
E | Float | Treats sizeof(double) characters as a double in little-endian byte order. |
e | Float | Treats sizeof(float) characters as a float in little-endian byte order. |
F, f | Float | Treats sizeof(float) characters as a native float. |
G | Float | Treats sizeof(double) characters as a double in network byte order. |
g | Float | Treats sizeof(float) characters as a float in network byte order. |
H | String | Extracts hexadecimal digits from each character (most significant bits first). |
h | String | Extracts hexadecimal digits from each character (least significant bits first). |
I | Integer | Treats sizeof(int) consecutive characters (modified by _) as an unsigned native integer. |
i | Integer | Treats sizeof(int) consecutive characters (modified by _) as a signed native integer. |
L | Integer | Treats four consecutive characters (modified by _) as an unsigned native long integer. |
l | Integer | Treats four consecutive characters (modified by _) as a signed native long integer. |
M | String | Quoted-printable. |
m | String | Base64 encoding. |
N | Integer | Treats four characters as an unsigned long in network byte order. |
n | Integer | Treats two characters as an unsigned short in network byte order. |
P | String | Treats sizeof(char *) characters as a pointer and returns len characters from the referenced location. |
p | String | Treats sizeof(char *) characters as a pointer to a null-terminated string. |
Q | Integer | Treats eight characters as an unsigned quad word (64 bits). |
q | Integer | Treats eight characters as a signed quad word (64 bits). |
S | Integer | Treats two consecutive characters (different if _ is used) as an unsigned short in native byte order. |
s | Integer | Treats two consecutive characters (different if _ is used) as a signed short in native byte order. |
U | Integer | UTF-8 characters as unsigned integers. |
u | String | UU encoding. |
V | Integer | Treats four characters as an unsigned long in little-endian byte order. |
v | Integer | Treats two characters as an unsigned short in little-endian byte order. |
w | Integer | BER-compressed integer. |
X | | Skips backward one character. |
x | | Skips forward one character. |
Z | String | Removes trailing nulls up to the first null with *. |
@ | | Skips to the offset given by the length argument. |
Example
Try the following example to extract all kinds of data.
"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
# s and S use native byte order; the result below is for a little-endian machine
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]
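Because String#unpack is the inverse of Array#pack, a round trip is a convenient way to see what a directive does. The following sketch uses only directives whose result does not depend on the machine's byte order; the variable name packed is illustrative.

```ruby
# Pack two values into a binary string, then unpack them again.
packed = [1, 258].pack("Cn")     # C: 8-bit unsigned, n: 16-bit unsigned in network byte order
p packed.bytes                   # [1, 1, 2]   (258 = 0x0102)
p packed.unpack("Cn")            # [1, 258]

p "Ruby".unpack("C*")            # [82, 117, 98, 121]
p "Ruby".unpack("a2a2")          # ["Ru", "by"]
p "h\u00e9".unpack("U*")         # [104, 233]  (UTF-8 code points)
p ["hello"].pack("m")            # "aGVsbG8=\n" (Base64)
p "aGVsbG8=\n".unpack("m")       # ["hello"]
```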