A review of string basics in Python

  • 2020-05-09 18:49:06
  • OfStack

Strings

Creating string objects in Python is very easy. You create a new string by placing the required text inside a pair of quotation marks (see listing 1). If you think about it for a moment, you might be a little confused. After all, there are two kinds of quotation marks you can use: single quotation marks (') and double quotation marks ("). Fortunately, Python handles this gracefully. You can use either kind of quotation mark to delimit a string, as long as the opening and closing marks match. If a string begins with a single quotation mark, it must end with a single quotation mark, and likewise for double quotation marks. If this rule is not followed, a SyntaxError exception is raised.
Listing 1. Create a string in Python


>>> sr="Discover Python"
>>> type(sr)
<type 'str'>
>>> sr='Discover Python'
>>> type(sr)
<type 'str'>
>>> sr="Discover Python: It's Wonderful!"    
>>> sr='Discover Python"
 File "<stdin>", line 1
  sr='Discover Python"
            ^
SyntaxError: EOL while scanning single-quoted string
>>> sr="Discover Python: \
... It's Wonderful!"
>>> print sr
Discover Python: It's Wonderful!

As you can see from listing 1, there are two other important aspects of strings besides matching quotation marks. First, you can mix single and double quotation marks inside a string, as long as the string uses the same type of quotation mark at its beginning and end. This flexibility lets Python easily hold ordinary text, which may need single quotation marks for contractions or possessives and double quotation marks for quoted speech.
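As a quick check of the quoting rules described above, here is a minimal sketch. It is written for Python 3 (an assumption on my part; the listings in this article use the older Python 2 prompt), and the variable names are purely illustrative.

```python
# Both quoting styles produce the same string value.
single = 'Discover Python'
double = "Discover Python"
same_value = single == double

# Pick the outer quotation mark so the inner one needs no escaping.
apostrophe = "It's Wonderful!"
quoted = 'They shouted "WooHoo".'

# A string that opens with ' but closes with " is rejected when the
# source text is compiled, before any of it runs.
try:
    compile("sr = 'Discover Python\"", "<example>", "exec")
    mismatch_rejected = False
except SyntaxError:
    mismatch_rejected = True

print(same_value, apostrophe, quoted, mismatch_rejected)
```

The call to the built-in compile function is only there so the syntax error can be caught inside a running script; at the interactive prompt you would simply see the error message.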

Second, if a string is too long for one line, you can use the Python line-continuation character, the backslash (\), to fold it across lines. Internally, the backslash and the newline that follows it are dropped when the string is created, as you can see when the string is printed. You can combine these two features to create a string containing a longer paragraph, as shown in listing 2.
Listing 2. Creating a long string


>>> passage = 'When using the Python programming language, one must proceed \
... with caution. This is because Python is so easy to use and can be so \
... much fun. Failure to follow this warning may lead to shouts of \
... "WooHoo" or "Yowza".'
>>> print passage
When using the Python programming language, one must proceed with caution. 
This is because Python is so easy to use and can be so much fun. 
Failure to follow this warning may lead to shouts of "WooHoo" or "Yowza".

Editor's note: the example above has been broken to make the page layout more reasonable. In fact, it originally appeared as one long line.
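To confirm that a backslash at the end of a line really does leave no newline behind, you could compare a folded string with the same text written on one line. This is a small sketch in Python 3 syntax (an assumption, since the listings here are Python 2), with shortened text and illustrative names.

```python
# The backslash-newline pair is dropped, so the two literals are equal.
# The continuation line starts in column 0 to avoid adding extra spaces.
folded = 'one must proceed \
with caution.'
unfolded = 'one must proceed with caution.'

no_newline = "\n" not in folded
print(folded == unfolded, no_newline)
```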

Note that when the passage string is printed, all of the line breaks are gone, leaving just one very long string. In general, you can use escape sequences (control characters) to express simple formatting within a string. For example, to start a new line, use the newline escape (\n); to insert a tab character (displayed as a default number of spaces), use the tab escape (\t), as shown in listing 3.
Listing 3. Using a control character in a string


>>> passage='\tWhen using the Python programming language, one must proceed\n\
... \twith caution. This is because Python is so easy to use, and\n\
... \tcan be so much fun. Failure to follow this warning may lead\n\
... \tto shouts of "WooHoo" or "Yowza".'
>>> print passage
    When using the Python programming language, one must proceed
    with caution. This is because Python is so easy to use, and
    can be so much fun. Failure to follow this warning may lead
    to shouts of "WooHoo" or "Yowza".
>>> passage=r'\tWhen using the Python programming language, one must proceed\n\
... \twith caution. This is because Python is so easy to use, and\n\
... \tcan be so much fun. Failure to follow this warning may lead\n\
... \tto shouts of "WooHoo" or "Yowza".'
>>> print passage
\tWhen using the Python programming language, one must proceed\n\
\twith caution. This is because Python is so easy to use, and\n\
\tcan be so much fun. Failure to follow this warning may lead\n\
\tto shouts of "WooHoo" or "Yowza".

The first example in listing 3 uses control characters the way you would expect. The paragraph is well formatted and easy to read. The second example, although laid out the same way, is what is called a raw string: a string in which the control characters are not interpreted. You can always recognize a raw string because its opening quotation mark is preceded by an r character, which is short for raw.
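One way to see the difference between an ordinary string and a raw string is to compare their lengths. The following is a minimal sketch in Python 3 syntax (an assumption; the article's listings are Python 2), using a shortened piece of the passage.

```python
# In an ordinary string, \t and \n are each a single character.
cooked = '\tto shouts\n'

# In a raw string, the backslash and the letter are kept as two characters.
raw = r'\tto shouts\n'

extra_characters = len(raw) - len(cooked)
print(len(cooked), len(raw), extra_characters)
```

Raw strings are often convenient for text that is full of backslashes, such as Windows paths or regular expressions, although that use goes beyond what listing 3 shows.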

I don't know about you, but although this works, creating a paragraph string this way seems rather laborious. Surely there must be a better way. As usual, Python provides a very simple way to create long strings that preserves the formatting used to create them. This method uses three double quotation marks (or three single quotation marks) to start and end a long string. You can use as many single and double quotation marks as you like inside such a string (see listing 4).
Listing 4. A string with three quotation marks


>>> passage = """
...     When using the Python programming language, one must proceed
...     with caution. This is because Python is so easy to use, and
...     can be so much fun. Failure to follow this warning may lead
...     to shouts of "WooHoo" or "Yowza".
... """
>>> print passage
        
    When using the Python programming language, one must proceed
    with caution. This is because Python is so easy to use, and
    can be so much fun. Failure to follow this warning may lead
    to shouts of "WooHoo" or "Yowza".
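The triple-quoted form shown above keeps every line break and every leading space exactly as typed. Here is a short sketch in Python 3 syntax (an assumption; the listings are Python 2) that inspects a shortened version of the passage; splitlines is a standard str method, and the names are illustrative.

```python
passage = """
    When using Python, one must proceed
    with caution, or risk shouts of "WooHoo" or 'Yowza'.
"""

# The newline right after the opening quotes is part of the string,
# so the first element of the split result is an empty line.
lines = passage.splitlines()
newline_count = passage.count("\n")

print(newline_count)
print(lines)
```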

Treat the string as an object

If you have read either of the first two articles in this series, one sentence should immediately come to mind: in Python, everything is an object. So far, I haven't touched on the object nature of strings, but, as usual, strings in Python are objects. In fact, a string object is an instance of the str class. As you saw in part 2 of this series, the Python interpreter includes a built-in help tool (shown in listing 5) that provides information about the str class.
Listing 5. Get help information about strings


>>> help(str)
     
Help on class str in module __builtin__:
          
class str(basestring)
| str(object) -> string
| 
| Return a nice string representation of the object.
| If the argument is a string, the return value is the same object.
| 
| Method resolution order:
|   str
|   basestring
|   object
| 
| Methods defined here:
| 
| __add__(...)
|   x.__add__(y) <==> x+y
| 
...
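The help output above lists __add__ as a method of str, described as equivalent to x+y. A brief sketch of what that means in practice follows, in Python 3 syntax (an assumption; the listings are Python 2) and with illustrative names.

```python
sr = "Discover Python"

# A string literal is an instance of the str class.
is_instance = isinstance(sr, str)

# The + operator and the __add__ method give the same result.
via_operator = "Discover " + "Python"
via_method = "Discover ".__add__("Python")

print(is_instance, via_operator == via_method)
```

In everyday code you would use the + operator; calling __add__ directly is shown here only to connect the operator to the method that help reports.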

Strings created with the single-, double-, and triple-quote syntax are all string objects. But you can also create string objects explicitly by calling the str class constructor, as shown in listing 6. The constructor accepts simple built-in numeric types or character data as arguments. Either way, it converts the input into a new string object.
Listing 6. Creating a string


>>> str("Discover python")
'Discover python'
>>> str(12345)
'12345'
>>> str(123.45)
'123.45'
>>> "Wow," + " that " + "was awesome."
'Wow, that was awesome.'
>>> "Wow,"" that ""was Awesome"
'Wow, that was Awesome'
>>> "Wow! "*5
'Wow! Wow! Wow! Wow! Wow! '
>>> sr = str("Hello ")
>>> id(sr)
5560608
>>> sr += "World"
>>> sr
'Hello World'
>>> id(sr)
3708752

The example in listing 6 also shows several other important aspects of Python strings. First, you can create a new string by joining other strings together, either with the + operator or simply by placing string literals next to each other, each with its own matching quotation marks. Second, if you need to repeat a short string to build a long one, you can use the * operator to repeat the string a given number of times. Finally, as I said at the beginning of this article, a Python string is an immutable sequence of characters. The last few lines of the example illustrate this point by creating a string and then apparently modifying it by appending another string. As the output of the two calls to the id function shows, a new string object is created to hold the result of adding text to the original string.
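The immutability point can also be checked without comparing id values, by keeping a second name bound to the original string. The sketch below uses Python 3 syntax (an assumption; the listings are Python 2), and all variable names are illustrative.

```python
greeting = "Hello "
original = greeting            # a second name for the same object

greeting += "World"            # binds greeting to a new string object
unchanged = original == "Hello "

# Assigning to a single position is not allowed on an immutable sequence.
try:
    greeting[0] = "J"
    mutated = True
except TypeError:
    mutated = False

# Adjacent literals, repetition, and conversion with str().
adjacent = "Wow," " that " "was awesome."
repeated = "Wow! " * 3
from_numbers = str(12345) + " " + str(123.45)

print(original, greeting, mutated)
print(adjacent, repeated, from_numbers)
```

Because original still refers to the untouched "Hello " string after the += line, the append must have produced a separate object rather than changing the existing one.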

The str class contains a large number of useful methods for manipulating strings. Rather than go through them one by one here, you can use the interpreter's help tool to get the details. For now, let's look at four useful methods that demonstrate how str class methods are used. Listing 7 demonstrates the upper, lower, split, and join methods.
Listing 7. String methods


>>> sr = "Discover Python!"
>>> sr.upper()
'DISCOVER PYTHON!'
>>> sr.lower()
'discover python!'
>>> sr = "This is a test!"
>>> sr.split()
['This', 'is', 'a', 'test!']
>>> sr = '0:1:2:3:4:5:6:7:8:9'
>>> sr.split(':')
['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
>>> sr=":"
>>> tp = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9')
>>> sr.join(tp)
'0:1:2:3:4:5:6:7:8:9'

The first two methods, upper and lower, are easy to understand. They simply convert a string entirely to uppercase or lowercase letters, respectively. The split method is useful because it breaks a string into a sequence of smaller strings, using a separator (a single character or a longer string) to mark where the breaks occur. The first split example splits the string "This is a test!" using the default separator, which is any run of whitespace characters (spaces, tabs, and newlines). The second split example shows how to divide a string into a series of strings using a different separator (in this case, a colon). The last example shows how to use the join method, which is the opposite of split and combines a sequence of short strings into one long string. In this case, the colon is used to join the sequence of single-character strings contained in the tuple.
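Since split and join are opposites, splitting on a separator and joining with the same separator should give back the original string. Here is a minimal sketch of that round trip in Python 3 syntax (an assumption; the listings are Python 2), with illustrative names.

```python
sr = "0:1:2:3:4:5:6:7:8:9"

# Split on the colon, then join the pieces with the colon again.
parts = sr.split(":")
rejoined = ":".join(parts)

# With no argument, split treats any run of whitespace as one separator.
words = "This  is\ta\ntest!".split()

# A separator longer than one character is matched as a whole.
items = "a, b, c".split(", ")

print(rejoined == sr, words, items)
```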

Use strings as containers for characters

At the beginning of this article, I emphasized that strings in Python are immutable sequences of characters. Part 2 of this series introduced the tuple, which is also an immutable sequence. A tuple supports accessing elements by index notation, extracting elements using slices, and creating new tuples from specific slices or by adding different slices together. Given this, you might wonder whether the same techniques can be applied to Python strings. As shown in listing 8, the answer is clearly "yes."
Listing 8. Strings as sequences of characters


>>> sr="0123456789"
>>> sr[0]
'0'
>>> sr[1] + sr[0]  
'10'
>>> sr[4:8]   # Give me elements four through seven, inclusive
'4567'
>>> sr[:-1]   # Give me all elements but the last one
'012345678'
>>> sr[1:12]  # Slice more than you can chew, no problem
'123456789'
>>> sr[:-20]  # Go before the start?
''
>>> sr[12:]   # Go past the end?
''
>>> sr[0] + sr[1:5] + sr[5:9] + sr[9]
'0123456789'
>>> sr[10]
Traceback (most recent call last):
 File "<stdin>", line 1, in ?
IndexError: string index out of range
>>> len(sr)   # Sequences have common methods, like get my length
10

In Python, it is very simple to treat a string as a sequence of characters. You can take a single element, add different elements together, slice out several elements, and even add different slices together. One very useful feature of slicing is that a slice that reaches before the start or past the end does not throw an exception; it is simply clipped to the start or end of the sequence. In contrast, if you try to access a single element with an index outside the allowed range, you get an exception. This behavior explains why the len function is so important.
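The contrast between forgiving slices and strict indexing suggests checking an index against len before using it. The sketch below is in Python 3 syntax (an assumption; the listings are Python 2), and char_at is a hypothetical helper written only for this illustration, not part of Python.

```python
sr = "0123456789"

# Slices are clipped to the ends of the string instead of failing.
middle = sr[4:8]
clipped = sr[1:12]
empty = sr[12:]


def char_at(text, index, default=""):
    """Hypothetical helper: return text[index], or default if out of range."""
    if -len(text) <= index < len(text):
        return text[index]
    return default


print(middle, clipped, repr(empty))
print(char_at(sr, 9), repr(char_at(sr, 10)))
```

Whether to guard with len like this or simply catch IndexError is a matter of taste; the guard makes the allowed range explicit.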

String: a powerful tool

In this article, I introduced the Python string, which is an immutable sequence of characters. In Python, you can easily create strings in several ways, using single quotation marks, double quotation marks, or, more flexibly, a set of three quotation marks. Because everything in Python is an object, you can use the underlying str class methods for additional functionality, or use the sequence features of strings directly.

