Strings are amongst the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes.
Creating strings is as simple as assigning a value to a variable.
Example:
text = "This is a string"
A string is a sequence of characters. You can access the characters one at a time with the bracket operator.
If you try to use double quotes inside a string delimited by double
quotes, you’ll get an error:
>>> text = "She said, "What time is it?""
File "<stdin>", line 1
text = "She said, "What time is it?""
^
SyntaxError: invalid syntax
Python throws a SyntaxError because it thinks the string ends after the second ", and it doesn’t know how to interpret the rest of the line. If you need to include a quotation mark that matches the delimiter inside a string, then you can escape the character using a backslash:
>>> text = "She said, \"What time is it?\""
>>> print(text)
She said, "What time is it?"
Note: When you work on a project, it’s a good idea to use only single quotes or only double quotes to delimit every string. Keep in mind that there really isn’t a right or wrong choice! The goal is to be consistent because consistency helps make your code easier to read and understand.
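Another way to avoid escaping is to delimit the string with the other kind of quote. The short sketch below illustrates this; the variable names are only for illustration.

```python
# Delimit with single quotes so the inner double quotes need no escaping.
text_single = 'She said, "What time is it?"'

# The backslash-escaped form builds exactly the same string.
text_escaped = "She said, \"What time is it?\""

print(text_single)
print(text_single == text_escaped)
```

Both spellings produce the same string object contents, so the choice is mostly about readability.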
String Concatenation
You can combine, or concatenate, two strings using the + operator:
>>> string1 = "abra"
>>> string2 = "cadabra"
>>> magic_string = string1 + string2
>>> magic_string
'abracadabra'
In this example, the string concatenation occurs on the third line. You concatenate string1 and string2 using +, and then you assign the result to the variable magic_string. Notice that the two strings are joined without any whitespace between them.
You can use string concatenation to join two related strings, such as joining a first name and a last name into a full name:
>>> first_name = "Arthur"
>>> last_name = "Dent"
>>> full_name = first_name + " " + last_name
>>> full_name
'Arthur Dent'
Here, you use string concatenation twice on the same line. First, you concatenate first_name with " " to ensure a space appears after the first name in the final string. This produces the string "Arthur ", which you then concatenate with last_name to produce the full name "Arthur Dent".
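As a small sketch of the same idea, the snippet below repeats the full-name example and then appends a number. The age value is hypothetical and not part of the example above; it is there only to show that + joins strings with strings, so other values need str() first.

```python
first_name = "Arthur"
last_name = "Dent"

# + joins the strings exactly as written, so the space is added explicitly.
full_name = first_name + " " + last_name

# + only works between two strings; convert other values with str() first.
age = 42  # hypothetical value, used only for illustration
summary = full_name + " is " + str(age)

print(summary)
```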
String Indexing
Each character in a string has a numbered position called an index. You can access the character at the nth position by putting the number n between two square brackets ([]) immediately after the string:
>>> flavor = "fig pie"
>>> flavor[1]
'i'
flavor[1] returns the character at position 1 in "fig pie", which is i. Wait. Isn’t f the first character of "fig pie"? In Python, and in most other programming languages, counting always starts at zero. To get the character at the beginning of a string, you need to access the character at position 0:
>>> flavor[0]
'f'
Important: Forgetting that counting starts with zero and trying to access the first character in a string with the index 1 results in an off-by-one error.
Off-by-one errors are a common source of frustration for beginning and experienced programmers alike!
If you try to access an index beyond the end of a string, then Python raises an IndexError:
>>> flavor[9]
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
flavor[9]
IndexError: string index out of range
The largest index in a string is always one less than the string’s length. Since "fig pie" has a length of seven, the largest index allowed is 6.
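The relationship between length and largest index can be checked directly. This is a minimal sketch; catching the IndexError is just one way to observe the failure, not a recommended pattern for everyday code.

```python
flavor = "fig pie"

# Seven characters, so the largest valid index is 6.
largest_index = len(flavor) - 1
print(flavor[largest_index])

# One step further is out of range and raises IndexError.
try:
    flavor[len(flavor)]
except IndexError as error:
    message = str(error)
    print(message)
```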
Strings also support negative indices:
>>> flavor[-1]
'e'
The last character in a string has index -1, which for "fig pie" is the letter e. The second to last character i has index -2, and so on.
Just like with positive indices, Python raises an IndexError if you try to access a negative index less than the index of the first character in the string:
>>> flavor[-10]
Traceback (most recent call last):
File "<pyshell#5>", line 1, in <module>
flavor[-10]
IndexError: string index out of range
Why would you use a negative index? Suppose a string input by a user is assigned to the variable user_input. If you need to get the last character of the string, how do you know what index to use?
One way to get the last character of a string is to calculate the final
index using len():
final_index = len(user_input) - 1
last_character = user_input[final_index]
Getting the final character with the index -1 takes less typing and doesn’t require an intermediate step to calculate the final index:
last_character = user_input[-1]
Negative indices may not seem useful at first, but sometimes they’re a better choice than a positive index.
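The two approaches can be compared side by side. In this sketch, user_input is given a fixed stand-in value instead of real user input, and the loop simply confirms that each negative index -k refers to the same character as the positive index len(user_input) - k.

```python
user_input = "fig pie"  # stand-in for text typed by a user

# Every valid negative index -k matches the positive index len(...) - k.
for k in range(1, len(user_input) + 1):
    assert user_input[-k] == user_input[len(user_input) - k]

last_character = user_input[-1]
second_to_last = user_input[-2]
print(last_character, second_to_last)

# An empty string has no last character, so guard before indexing.
empty_input = ""
last_or_blank = empty_input[-1] if empty_input else ""
```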
String Slicing
Suppose you need a string containing just the first three letters of the string "fig pie". You could access each character by index and concatenate them like this:
>>> first_three_letters = flavor[0] + flavor[1] + flavor[2]
>>> first_three_letters
'fig'
If you need more than just the first few letters of a string, then getting each character individually and concatenating them together is clumsy and long-winded. Fortunately, Python provides a way to do this with much less typing.
You can extract a portion of a string, called a substring, by inserting a colon between two index numbers set inside square brackets like this:
>>> flavor = "fig pie"
>>> flavor[0:3]
'fig'
flavor[0:3] returns the first three characters of the string assigned to flavor, starting with the character at index 0 and going up to but not including the character at index 3. The [0:3] part of flavor[0:3] is called a slice. In this case, it returns a slice of "fig pie". Yum!
So, for "fig pie", the slice [0:3] returns the string "fig", and the slice [3:7] returns the string " pie".
If you omit the first index in a slice, then Python assumes you want to start at index 0:
>>> flavor[:3]
'fig'
The slice [:3] is equivalent to the slice [0:3], so flavor[:3] returns the first three characters in the string "fig pie". Similarly, if you omit the second index in the slice, then Python assumes you want to return the substring that begins with the character whose index is the first number in the slice and ends with the last character in the string:
>>> flavor[3:]
' pie'
For "fig pie", the slice [3:] is equivalent to the slice [3:7]. Since the character at index 3 is a space, flavor[3:] returns the substring that starts with the space and ends with the last letter: " pie".
If you omit both the first and second numbers in a slice, you get a string that starts with the character at index 0 and ends with the last character. In other words, omitting both numbers in a slice returns the entire string:
>>> flavor[:]
'fig pie'
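One consequence of these rules is that [:n] and [n:] always split a string into two pieces that fit back together. The helper split_at below is a hypothetical name used only for this sketch.

```python
flavor = "fig pie"

def split_at(text, n):
    """Return the part before index n and the part from index n onward."""
    return text[:n], text[n:]

head, tail = split_at(flavor, 3)
print(repr(head), repr(tail))

# The two pieces always rebuild the original string.
assert head + tail == flavor

# For in-range indices, the slice [x:y] contains y - x characters.
assert len(flavor[2:5]) == 5 - 2
```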
It’s important to note that, unlike with string indexing, Python won’t raise an IndexError when you try to slice between boundaries that fall outside the beginning or ending boundaries of a string:
>>> flavor[:14]
'fig pie'
>>> flavor[13:15]
''
In this example, the first line gets the slice from the beginning of the string up to but not including the character at index 14. The string assigned to flavor has a length of seven, so you might expect Python to throw an error. Instead, it ignores any nonexistent indices and returns the entire string "fig pie".
The third line shows what happens when you try to get a slice in which the entire range is out of bounds. flavor[13:15] attempts to get the characters at indices 13 and 14, which don’t exist. Instead of raising an error, Python returns the empty string ("").
Note: The empty string is called empty because it doesn’t contain any characters. You can create it by writing two quotation marks with nothing between them:
empty_string = ""
A string with anything in it—even a space—is not empty. All the following strings are non-empty:
non_empty_string1 = " "
non_empty_string2 = "  "
non_empty_string3 = "   "
Even though these strings don’t contain any visible characters, they are non-empty because they do contain spaces.
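Returning to out-of-range slices, the sketch below contrasts lenient slicing with strict indexing on the same string. The flag index_failed is a hypothetical name used only to record what happened.

```python
flavor = "fig pie"

whole = flavor[:14]     # end is past the string, so the whole string comes back
nothing = flavor[13:15] # entirely out of bounds, so the result is empty

print(repr(whole))
print(repr(nothing))

# Indexing the same out-of-range position is an error.
try:
    flavor[13]
    index_failed = False
except IndexError:
    index_failed = True

print(index_failed)
```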
You can use negative numbers in slices. The rules for slices with negative numbers are exactly the same as the rules for slices with positive numbers.
Just like before, the slice [x:y] returns the substring starting at index x and going up to but not including y. For instance, the slice [-7:-4] returns the first three letters of the string "fig pie":
>>> flavor[-7:-4]
'fig'
Notice, however, that the rightmost boundary of the string does not have a negative index. The logical choice for that boundary would seem to be the number 0, but that doesn’t work.
Instead of returning the entire string, [-7:0] returns the empty string:
>>> flavor[-7:0]
''
This happens because the second number in a slice must correspond to a boundary that is to the right of the boundary corresponding to the first number, but both -7 and 0 correspond to the leftmost boundary of the string.
If you need to include the final character of a string in your slice, then you can omit the second number:
>>> flavor[-7:]
'fig pie'
Of course, using flavor[-7:] to get the entire string is a bit odd considering that you can use the variable flavor without the slice to get the same result!
Slices with negative indices are useful, though, for getting the last few characters in a string. For example, flavor[-3:] is "pie".
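A small sketch of that use follows. The helper last_chars is a hypothetical name, and it assumes n is at least 1, because text[-0:] means text[0:] and would return the entire string.

```python
flavor = "fig pie"

def last_chars(text, n):
    """Return the last n characters of text (assumes n >= 1)."""
    return text[-n:]

ending = last_chars(flavor, 3)
print(ending)

# A negative stop drops characters from the end instead.
without_ending = flavor[:-3]
print(repr(without_ending))
```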
Strings Are Immutable
To wrap this section up, let’s discuss an important property of string objects. Strings are immutable, which means that you can’t change them once you’ve created them. For instance, see what happens when you try to assign a new letter to one particular character of a string:
>>> word = "goal"
>>> word[0] = "f"
Traceback (most recent call last):
File "<pyshell#16>", line 1, in <module>
word[0] = "f"
TypeError: 'str' object does not support item assignment
Python throws a TypeError and tells you that str objects don’t support
item assignment.
If you want to alter a string, then you must create an entirely new
string. To change the string "goal" to the string "foal", you can use a
string slice to concatenate the letter "f" with everything but the first
letter of the word "goal":
>>> word = "goal"
>>> word = "f" + word[1:]
>>> word
'foal'
First, you assign the string "goal" to the variable word. Then you concatenate the slice word[1:], which is the string "oal", with the letter "f" to get the string "foal". If you’re getting a different result here, then make sure you’re including the colon character (:) as part of the string slice.
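The same slice-and-concatenate idea works for any position, not just the first. The helper replace_char below is a hypothetical name for this sketch, and it assumes index is a non-negative index within the string.

```python
def replace_char(text, index, new_char):
    """Return a new string with the character at index replaced.

    Assumes 0 <= index < len(text).
    """
    return text[:index] + new_char + text[index + 1:]

word = "goal"
new_word = replace_char(word, 0, "f")

print(new_word)
print(word)  # the original string is unchanged
```

Because strings are immutable, the function builds and returns a new string; the value assigned to word stays the same until you reassign the variable.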
Accessing Values in Strings
Python has no separate character type; a single character is simply a string of length one, so it also counts as a substring.
To access substrings, use square brackets with an index or a slice:
>>> fruit = 'banana'
>>> letter = fruit[1]
The second statement extracts the character at index position 1 from the fruit variable and assigns it to the letter variable.
The expression in brackets is called an index. The index indicates which character in the sequence you want (hence the name).
But you might not get what you expect:
>>> print(letter)
a
For most people, the first letter of “banana” is “b”, not “a”. But in Python, the index is an offset from the beginning of the string, and the offset of the first letter is zero.
>>> letter = fruit[0]
>>> print(letter)
b
So “b” is the 0th letter (“zero-th”) of “banana”, “a” is the 1th letter (“one-th”), and “n” is the 2th (“two-th”) letter.
Length of a String
The number of characters contained in a string, including spaces, is called the length of the string. For example, the string "abc" has a length of 3, and the string "Don't Panic" has a length of 11.
Python has a built-in len() function that you can use to determine the length of a string.
Example:
>>> len('banana')
6
>>> fruit = 'banana'
>>> len(fruit)
6
First, you assign the string "banana" to the variable fruit. Then you use len() to get the length of fruit, which is 6.
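The lengths quoted earlier in this section can be checked the same way. The dictionary name lengths is just a convenience for this sketch.

```python
samples = ["abc", "Don't Panic", "banana", ""]

# Spaces and punctuation count toward the length; the empty string has length 0.
lengths = {sample: len(sample) for sample in samples}

for sample, length in lengths.items():
    print(repr(sample), length)
```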
Multiline Strings
You can create multiline strings using triple quotes (""" or ''') as delimiters. Here’s how to write a long paragraph using this approach:
paragraph = """Hey everyone,
This is python multi-line string
using triple quotes."""
Triple-quoted strings preserve whitespace, including newlines. This means that running print(paragraph) would display the string on multiple lines, just as it appears in the string literal. This may or may not be what you want, so you’ll need to think about the desired output before you choose how to write a multiline string.
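To make the preserved newlines visible, the sketch below compares the triple-quoted string with a single-line literal that spells out each line break as \n. The names explicit and lines are introduced only for this illustration.

```python
paragraph = """Hey everyone,
This is python multi-line string
using triple quotes."""

# The same text written on one line, with each line break spelled as \n.
explicit = "Hey everyone,\nThis is python multi-line string\nusing triple quotes."

print(paragraph == explicit)

# repr() shows the newline characters stored inside the string.
print(repr(paragraph))

# splitlines() breaks the string apart at those newlines.
lines = paragraph.splitlines()
print(len(lines))
```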
Returning to len() and the fruit example from earlier: to get the last letter of a string, you might be tempted to try something like this:
>>> length = len(fruit)
>>> last = fruit[length]
IndexError: string index out of range
The reason for the IndexError is that there is no letter in “banana” with the index
6. Since we started counting at zero, the six letters are numbered 0 to 5. To get
the last character, you have to subtract 1 from length:
>>> last = fruit[length-1]
>>> print(last)
a
Alternatively, you can use negative indices, which count backward from the end of
the string. The expression fruit[-1] yields the last letter, fruit[-2] yields the
second to last, and so on.
The in Operator
The word in is a boolean operator that takes two strings and returns True if the first appears as a substring in the second:
>>> 'a' in 'banana'
True
>>> 'seed' in 'banana'
False
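A few more cases are sketched below as a quick check of how in behaves: the match is case-sensitive, it looks for the whole substring in order, and it can be negated with not in. The result names are hypothetical.

```python
fruit = "banana"

has_nan = "nan" in fruit            # consecutive characters, so this matches
has_capital_nan = "Nan" in fruit    # the comparison is case-sensitive
has_bnn = "bnn" in fruit            # letters are present but not adjacent
lacks_seed = "seed" not in fruit    # not in is the negated form

print(has_nan, has_capital_nan, has_bnn, lacks_seed)
```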
Format Operator
The format operator, %, allows us to construct strings, replacing parts of the strings with the data stored in variables. When applied to integers, % is the modulus operator. But when the first operand is a string, % is the format operator.
The first operand is the format string, which contains one or more format sequences that specify how the second operand is formatted. The result is a string.
For example, the format sequence %d means that the second operand should be formatted as an integer (“d” stands for “decimal”):
>>> camels = 42
>>> '%d' % camels
'42'
The result is the string ‘42’, which is not to be confused with the integer value 42. A format sequence can appear anywhere in the string, so you can embed a value in a sentence:
>>> camels = 42
>>> 'I have spotted %d camels.' % camels
'I have spotted 42 camels.'
If there is more than one format sequence in the string, the second argument has to be a tuple. Each format sequence is matched with an element of the tuple, in order.
The following example uses %d to format an integer, %g to format a floating-point number (don’t ask why), and %s to format a string:
>>> 'In %d years I have spotted %g %s.' % (3, 0.1, 'camels')
'In 3 years I have spotted 0.1 camels.'
The number of elements in the tuple must match the number of format sequences in the string. The types of the elements also must match the format sequences:
>>> '%d %d %d' % (1, 2)
TypeError: not enough arguments for format string
>>> '%d' % 'dollars'
TypeError: %d format: a number is required, not str
In the first example, there aren’t enough elements; in the second, the element is the wrong type.
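The pieces above can be put together in a short script. This is a minimal sketch: the variable names years and rate are introduced here for illustration, and the exact wording of the error message may vary between Python versions.

```python
years = 3
rate = 0.1

# Several format sequences need a tuple with one value per sequence.
sentence = "In %d years I have spotted %g %s." % (years, rate, "camels")
print(sentence)

# Too few values for the format sequences raises TypeError.
try:
    "%d %d %d" % (1, 2)
except TypeError as error:
    problem = str(error)
    print(problem)
```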