Data types in Python: String
String manipulation and formatting using Python's standard library methods
Table of contents
- String
- Basics
- Concatenate
- Replicate
- Printing strings
- Escape character
- Multi-line string
- Raw string
- Triple quoted string
- Manipulating strings
- Indexing
- Slicing
- IN and NOT
- Case transformation: upper(), lower(), isupper(), islower()
- String validation: isX()
- startswith() and endswith()
- Map to and from lists
- Justify text: center, ljust and rjust
- Removing white spaces and other characters using lstrip, rstrip and strip
- Formatting strings
- str.format
- Formatted string literals
- Conclusion ✍️
Now that we've reviewed variables, let's explore Python's data types.
Python supports the following data types out of the box:
| Description | Type |
| Text | str |
| Numeric | int, float, complex |
| Sequence | list, tuple, range |
| Mapping | dict |
| Set | set, frozenset |
| Boolean | bool |
| Binary | bytes, bytearray, memoryview |
In this article, we will focus on strings (str).
String
A string is simply a sequence of characters.
You can create a string by enclosing a chain of characters in single quotes (') or double quotes (").
my_string = "This is a string"
my_second_string = 'This is also a string'
Let's explore some of the methods that Python provides out of the box for string manipulation and formatting.
This article is a kind of cheat sheet for string manipulation.
I tried to cover the most commonly used methods.
Feel free to only read the bits you are interested in!
Basics
Concatenate
You can concatenate (i.e. 'combine') string literals simply by putting them side by side! This shortcut only works with literals; to combine variables, use the + operator.
string = "Python" " " "is" " " "a nice language"
print(string)
# >> Python is a nice language
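Side-by-side concatenation does not work with variables, so here is a minimal sketch of the + operator doing the same job. The variable names are made up for illustration.

```python
# Hypothetical variables, used only for illustration
language = "Python"
opinion = "is a nice language"

# Variables are combined with the + operator
sentence = language + " " + opinion
print(sentence)
# >> Python is a nice language
```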
Replicate
You can replicate a string (i.e. 'combine' the same string multiple times) by simply using the multiplication operator (*):
print("hip ..." * 2)
# >> hip ...hip ...
print("Hooray! 🎉")
# >> Hooray! 🎉
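One place where replication comes in handy is underlining a title. The sketch below assumes a hypothetical title and builds an underline of the same length.

```python
# Hypothetical title, used only for illustration
title = "Report"

# Repeat "=" once per character in the title
underline = "=" * len(title)
print(title)
print(underline)
# >> Report
# >> ======
```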
Printing strings
Escape character
You can escape special characters by prefixing them with a backslash (\).
For example, to print He said "hi" you would write the following program:
say_hi = "He said \"hi\""
print(say_hi)
# >> He said "hi"
You can find the full list of escape sequences in the Python language reference, in the section on string literals.
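Besides quotes, a few escape sequences show up all the time: \n for a newline, \t for a tab and \\ for a literal backslash. Here is a small sketch with made-up strings.

```python
# \n starts a new line
two_lines = "first line\nsecond line"
print(two_lines)
# >> first line
# >> second line

# \t inserts a tab between the two words
columns = "name\tage"
print(columns)

# \\ produces a single backslash
folder = "C:\\temp"
print(folder)
# >> C:\temp
```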
Multi-line string
A special note on the newline character: if you wish to write a string literal over several lines of source code for formatting reasons, you must escape the newline character with a backslash. The escaped newline does not become part of the string.
multiline = "I am on the first line \
And I am on the first line too"
print(multiline)
# >> I am on the first line And I am on the first line too
wrong_multilines = "This will not work
because Python isn't aware of our new line character."
# >> 🔴 SyntaxError: EOL while scanning string literal (reported as "unterminated string literal" since Python 3.10; either way, you must escape the newline character.)
Raw string
By preceding your string with r, you create a raw string, in which the backslash (\) is kept as a literal character instead of starting an escape sequence.
say_hi = r"He said \"hi\""
print(say_hi)
# >> He said \"hi\"
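Raw strings are mostly useful when the text contains many backslashes, such as Windows-style paths. The path below is hypothetical; the sketch just shows that the raw and escaped spellings produce the same string.

```python
# Hypothetical Windows-style path, used only for illustration
escaped_path = "C:\\new_folder\\table.txt"
raw_path = r"C:\new_folder\table.txt"

print(raw_path)
# >> C:\new_folder\table.txt

# Both spellings produce exactly the same string
print(escaped_path == raw_path)
# >> True
```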
Triple quoted string
By enclosing a string in triple quotes (''' or """), you no longer need to escape single quotes ('), double quotes (") or newline characters.
Because the newline character doesn't have to be escaped, we can print on several lines:
triple_quoted_string = '''This string will print quotes ('), double quotes (")
and won't mind newline characters despite the lack of escape characters (\).
This string will be printed on three lines 🎉'''
print(triple_quoted_string)
# >> This string will print quotes ('), double quotes (")
# >> and won't mind newline characters despite the lack of escape characters (\).
# >> This string will be printed on three lines 🎉
Manipulating strings
Indexing
A string is simply a chain of characters. To access a certain character, we can use its index. Python uses zero-based indexing, which means that the first index is 0.
my_string = "INDEXING"
# Characters I N D E X I N G
# Index 0 1 2 3 4 5 6 7
print(my_string[3])
# >> E
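Two things are worth knowing on top of this: negative indices count from the end, and an index past the last character raises an IndexError. A short sketch reusing the same string:

```python
my_string = "INDEXING"

# -1 is the last character, -2 the one before it, and so on
print(my_string[-1])
# >> G

# The string has 8 characters, so the last valid index is 7
print(len(my_string))
# >> 8

try:
    my_string[8]
except IndexError as error:
    print("IndexError:", error)
# >> IndexError: string index out of range
```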
Slicing
We can use string[start:end] to slice a string and obtain a sub-string.
start is inclusive but end is exclusive.
my_string = "Hello Brisbane"
# H e l l o   B r i s b  a  n  e
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13
# (index 5 is the space)
# Using start and end index
# i.e. give me a string of the characters at index 0,1,2,3 and 4.
print(my_string[0:5])
# >> Hello
# Using only end index
# The start index is considered to be 0
# This is a short hand version of the slice above
print(my_string[:5])
# >> Hello
# Using only the start index
# The slice runs to the end of the string (here, through index 13)
print(my_string[6:])
# >> Brisbane
# Using negative indices, we count from the end of the string
print(my_string[-8:])
# >> Brisbane
print(my_string[-14:-9])
# >> Hello
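Slices also accept an optional third value, the step, written string[start:end:step]. The sketch below reuses the same string to show a step of 2, a reversed copy, and the fact that an end index beyond the string is silently clipped.

```python
my_string = "Hello Brisbane"

# Every second character, starting at index 0
every_second = my_string[::2]
print(every_second)
# >> HloBibn

# A step of -1 walks the string backwards
reversed_string = my_string[::-1]
print(reversed_string)
# >> enabsirB olleH

# Unlike indexing, slicing past the end does not raise an error
clipped = my_string[6:100]
print(clipped)
# >> Brisbane
```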
IN and NOT
in and not in allow you to check whether a sub-string exists within a string:
my_animals = "🐶, 🐱, 🐭, 🐴"
print("🐶" in my_animals)
# >> True
print("🐘" in my_animals)
# >> False
print("🐘" not in my_animals)
# >> True
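Keep in mind that the check is case-sensitive. If you also need to know where the sub-string is, find() returns its index, or -1 when it is absent. A small sketch with a made-up sentence:

```python
sentence = "Python is a nice language"

# The check is case-sensitive
print("python" in sentence)
# >> False

# Lowercasing first makes the check case-insensitive
print("python" in sentence.lower())
# >> True

# find() returns the index of the first match, or -1 if there is none
print(sentence.find("nice"))
# >> 12
print(sentence.find("ugly"))
# >> -1
```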
Case transformation: upper(), lower(), isupper(), islower()
Pretty straightforward: upper() and lower() return a copy of the string with its casing changed.
isupper() and islower() allow you to check the casing of a string.
string = "i am not yelling 📣"
print(string.upper())
# >> I AM NOT YELLING 📣
print(string.islower())
# >> True
print(string.upper().isupper())
# >> True
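Because strings are immutable, these methods never modify the original; they hand back a new string. The sketch below shows that, along with two related methods, capitalize() and title(). The variable names are made up for illustration.

```python
city = "hello brisbane"

shouted = city.upper()
print(shouted)
# >> HELLO BRISBANE

# The original string is left untouched
print(city)
# >> hello brisbane

# capitalize() only uppercases the first character
print(city.capitalize())
# >> Hello brisbane

# title() uppercases the first letter of every word
print(city.title())
# >> Hello Brisbane
```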
String validation: isX()
We have several methods available to validate strings, in a similar fashion to isupper() or islower(). Each of them returns False for an empty string:
- isalpha(): only letters
- isalnum(): only letters and numbers
- isdecimal(): only numbers
- isspace(): only spaces, tabs and newline characters
- istitle(): only title-cased words, i.e. each word starts with an uppercase letter followed by lowercase letters
alpha = "letters"
print(alpha.isalpha())
# >> True
alnum = "42characters"
print(alnum.isalnum())
# >> True
decimal = "42"
print(decimal.isdecimal())
# >> True
space = "\t \n"
print(space.isspace())
# >> True
title = "Batman The Movie"
print(title.istitle())
# >> True
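A typical use for these checks is validating input before converting it. Here is a minimal sketch; parse_age is a hypothetical helper written for this example, not something provided by Python.

```python
def parse_age(text):
    """Hypothetical helper: return the age as an int, or None if text is not a whole number."""
    if text.isdecimal():
        return int(text)
    return None

print(parse_age("42"))
# >> 42
print(parse_age("forty-two"))
# >> None

# An empty string and a minus sign both fail the isdecimal() check
print(parse_age(""))
# >> None
print(parse_age("-5"))
# >> None
```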
startswith() and endswith()
Self-explanatory: these methods allow you to check whether a string starts or ends with a given sub-string.
string = "starts and ends"
print(string.startswith("starts"))
# >> True
print(string.endswith("ends"))
# >> True
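Both methods also accept a tuple of candidates, which saves you from chaining several checks with or. The file name below is made up for illustration.

```python
# Hypothetical file name, used only for illustration
filename = "holiday_photo.png"

# True if the string ends with any of the suffixes in the tuple
is_image = filename.endswith((".png", ".jpg", ".gif"))
print(is_image)
# >> True

# Like most string checks, the comparison is case-sensitive
print(filename.startswith("Holiday"))
# >> False
```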
Map to and from lists
We haven't covered lists yet, but there are two methods to turn a list into a string and vice-versa: join and split.
With join, the str it is called on is used as the separator between the items.
# A list of strings
# We will cover lists in a future article
list_of_words = ["Hello", "my", "name", "is", "Alo"]
list_of_animals = ["🦍", "🐘", "🦁"]
# Spacer is " "
string = " ".join(list_of_words)
print(string)
# >> Hello my name is Alo
# Spacer is ", " (a comma followed by a space)
string = ", ".join(list_of_animals)
print(string)
# >> 🦍, 🐘, 🦁
With split, the str provided is the delimiter.
my_string = "A storm is coming, watch out!"
print(my_string.split(","))
# >> ['A storm is coming', ' watch out!']
# (the space after the comma stays at the start of the second item)
print(my_string.split(" "))
# >> ['A', 'storm', 'is', 'coming,', 'watch', 'out!']
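Two variations of split are worth a quick sketch: calling it without an argument splits on any run of whitespace, and an optional second argument limits the number of splits. The strings below are made up for illustration.

```python
# Hypothetical strings, used only for illustration
messy = "  A   storm is\tcoming  "
record = "name=Alo=developer"

# No argument: split on any whitespace and ignore leading/trailing blanks
words = messy.split()
print(words)
# >> ['A', 'storm', 'is', 'coming']

# Second argument: split at most once, from the left
key_value = record.split("=", 1)
print(key_value)
# >> ['name', 'Alo=developer']

# split() followed by join() is a handy way to normalise spacing
print(" ".join(words))
# >> A storm is coming
```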
Justify text: center, ljust and rjust
These three methods allow you to justify text. The first argument defines the total length of the new string, including the existing string. An optional second argument specifies the fill character (a space by default).
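# The quotes in the outputs below are only there to make the padding visible
print("left".ljust(20))
# >> "left                "
print("right".rjust(20, "-"))
# >> "---------------right"
print("center".center(10, "="))
# >> "==center=="
# note that the result is exactly 10 characters long: 'center' (6) plus 2 fill characters on each side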
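Combining ljust and rjust is a quick way to line up columns of text. The sketch below uses a made-up menu, and also shows that a width smaller than the string leaves the string unchanged.

```python
# Hypothetical menu items, used only for illustration
menu = [("Coffee", "3.50"), ("Sandwich", "8.00")]

lines = []
for name, price in menu:
    # Pad the name on the right and the price on the left with dots
    lines.append(name.ljust(12, ".") + price.rjust(6, "."))

print("\n".join(lines))
# >> Coffee........3.50
# >> Sandwich......8.00

# A width smaller than the string returns the string unchanged
print("center".center(3, "="))
# >> center
```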
Removing white spaces and other characters using lstrip, rstrip and strip
Use these three methods to strip whitespace from the start, the end, or both ends of a string.
left_padded_string = " Hello"
print(left_padded_string.lstrip())
# >> Hello
right_padded_string = "Hello "
print(right_padded_string.rstrip())
# >> Hello
right_and_left_padded_string = " Hello "
print(right_and_left_padded_string.strip())
# >> Hello
You can also strip other characters by giving strip an argument. The argument is treated as a set of characters to remove from both ends, not as a sub-string:
string = "Hellow"
print(string.strip("w"))
# >> Hello
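The "set of characters" behaviour can be surprising, so here is a short sketch with made-up strings. Stripping stops at the first character, from each end, that is not in the set.

```python
# Hypothetical strings, used only for illustration
url = "www.example.com"
fruit = "banana"

# Every leading and trailing w, ., c, o or m is removed
domain = url.strip("w.com")
print(domain)
# >> example

# "ab" is not treated as a sub-string: leading b and a go, and so does the trailing a
trimmed = fruit.strip("ab")
print(trimmed)
# >> nan

# rstrip with an argument only touches the right-hand side
print("Hello!!!".rstrip("!"))
# >> Hello
```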
Formatting strings
str.format
A widely used way to format strings is str.format, available since Python 2.6 and 3.0.
str.format replaces each pair of curly brackets {} with the arguments passed to format.
Note that the position matters when the curly brackets are empty!
You can also give position numbers to the curly brackets to decide the order of the arguments.
name = "Alo"
profession = "developer"
print("Hello 👋 I'm {} and I work as a {}.".format(name, profession))
# >> Hello 👋 I'm Alo and I work as a developer.
print("Hello 👋 I'm {} and I work as a {}.".format(profession, name))
# >> Hello 👋 I'm developer and I work as a Alo.
print("Hello 👋 I'm {1} and I work as a {0}.".format(profession, name))
# >> Hello 👋 I'm Alo and I work as a developer.
You can also name the arguments to avoid any confusion!
name = "Alo"
profession = "developer"
print("Hello 👋 I'm {fname} and I work as a {fprofession}.".format(fprofession=profession, fname=name))
# >> Hello 👋 I'm Alo and I work as a developer.
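The curly brackets can also carry a format specification after a colon, which controls things like decimal places, thousands separators and alignment. A small sketch with made-up values:

```python
# Hypothetical values, used only for illustration
pi = 3.14159
population = 1234567

# .2f keeps two digits after the decimal point
rounded = "Pi is roughly {:.2f}".format(pi)
print(rounded)
# >> Pi is roughly 3.14

# , inserts a thousands separator
grouped = "{:,} inhabitants".format(population)
print(grouped)
# >> 1,234,567 inhabitants

# >8 right-aligns the value in a field of 8 characters
aligned = "[{:>8}]".format("right")
print(aligned)
# >> [   right]
```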
Formatted string literals
Python 3.6 introduced a new way to format strings: formatted string literals, also known as f-strings.
All you have to do is precede your string with f!
word = "cool"
print(f"String literals are very {word}!")
# >> String literals are very cool!
# You can even evaluate expressions, such as basic arithmetic, inside the curly brackets
print(f"Two plus two is {2 + 2}.")
# >> Two plus two is 4.
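The expressions inside the curly brackets can also be method calls, and they accept the same format specifications as str.format. A minimal sketch with made-up values, assuming Python 3.6 or later:

```python
# Hypothetical values, used only for illustration
name = "alo"
balance = 1234.5

# A method call and a format specification inside the same f-string
summary = f"{name.title()} has {balance:,.2f} dollars"
print(summary)
# >> Alo has 1,234.50 dollars

# !r inserts the repr() of the value, which is handy when debugging
debug = f"name is {name!r}"
print(debug)
# >> name is 'alo'
```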
Conclusion ✍️
Phew, that was more than expected wasn't it?
As you can see, Python gives you a lot of tools out of the box to manipulate and format strings.
These are only a few of the tools available; you can find the whole list of string methods in the Python standard library documentation.
In my opinion, it's always worth checking the language documentation when learning a new language. As you can see, Python is pretty similar to other languages when it comes to strings - but sometimes, you can save yourself a bit of time by spending a few minutes reviewing the language's methods.
After all, it's provided with your language - you may as well use it!
Next time, we