1. String Index, Length, Slicing and Traversal

Quick Overview of Day

Explore string operators, index values, length of strings, string slicing, and traversing a string using a for loop (by item).

  • CS20-CP1 Apply various problem-solving strategies to solve programming problems throughout Computer Science 20.
  • CS20-FP1 Utilize different data types, including integer, floating point, Boolean and string, to solve programming problems.
  • CS20-FP2 Investigate how control structures affect program flow.
  • CS20-FP3 Construct and utilize functions to create reusable pieces of code.

Throughout the portion of the course, we have used strings to represent words or phrases that we wanted to print out. Our definition was simple: a string is simply some characters inside quotes. We will now explore strings in much more detail.

1.1. A Collection Data Type

So far we have seen built-in types like: int, float, bool, str and we’ve seen lists. int, float, and bool are considered to be simple or primitive data types because their values are not composed of any smaller parts. They cannot be broken down. On the other hand, strings and lists are different from the others because they are made up of smaller pieces. In the case of strings, they are made up of smaller strings each containing one character.

Types that are comprised of smaller pieces are called collection data types. Depending on what we are doing, we may want to treat a collection data type as a single entity (the whole), or we may want to access its parts. This ambiguity is useful.

Strings can be defined as sequential collections of characters. This means that the individual characters that make up the string are assumed to be in a particular order from left to right.

A string that contains no characters, often referred to as the empty string, is still considered to be a string. It is simply a sequence of zero characters and is represented by ‘’ or “” (two single or two double quotes with nothing in between).

1.2. Concatenation Reminder

As we have seen before, you cannot perform mathematical operations on strings, even if the strings look like numbers. The one exception to this rule is that the + operator does work with strings, but for strings, the + operator represents concatenation, not addition. As we have learned previously, concatenation means joining the two operands by linking them end-to-end. For example:

1.3. Index Operator: Working with the Characters of a String

The indexing operator (Python uses square brackets to enclose the index) selects a single character from a string. The characters are accessed by their position or index value. For example, in the string shown below, the 14 characters are indexed left to right from postion 0 to position 13.

index values

It is also the case that the positions are named from right to left using negative numbers where -1 is the rightmost index and so on. Note that the character at index 9 (or -5) is the blank/space character.

The expression place[4] selects the character at index 4 from place, and creates a new string containing just this one character. The variable some_char refers to the result.

Remember that computer scientists often start counting from zero. The letter at index zero of "Saskatoon Sask" is S. So at position [4] we have the letter a.

If you want the zero-eth letter of a string, you just put 0, or any expression with the value 0, in the brackets. Give it a try.

The expression in brackets is called an index. An index specifies a member of an ordered collection. In this case the collection of characters in the string. The index indicates which character you want. It can be any integer expression so long as it evaluates to a valid index value.

Note that indexing returns a string — Python has no special type for a single character. It is just a string of length 1.

1.3.1. Check Your Understanding

    string-exploration1: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[3])
    
  • t
  • Index locations do not start with 1, they start with 0.
  • h
  • Yes, index locations start with 0.
  • c
  • sentence[-3] would return c, counting from right to left.
  • Error, you cannot use the [ ] operator with a string.
  • [ ] is the index operator

    string-exploration2: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[2] + sentence[-5])
    
  • tr
  • Yes, indexing operator has precedence over concatenation.
  • ps
  • p is at location 0, not 2.
  • nn
  • n is at location 5, not 2.
  • Error, you cannot use the [ ] operator with the + operator.
  • [ ] operator returns a string that can be concatenated with another string.

1.4. Length

The len function, when applied to a string, returns the number of characters in a string.

To get the last letter of a string, you might be tempted to try something like this:

That won’t work. It causes the runtime error IndexError: string index out of range. The reason is that there is no letter at index position 6 in "Banana". Since we started counting at zero, the six indexes are numbered 0 to 5. To get the last character, we have to subtract 1 from the length. Give it a try in the example above.

Alternatively in Python, we can use negative indices, which count backward from the end of the string. The expression fruit[-1] yields the last letter, fruit[-2] yields the second to last, and so on. Try it! Most other languages do not allow the negative indices, but they are a handy feature of Python!

1.4.1. Check Your Understanding

    string-exploration3: What is printed by the following statements?

    sentence = "python rocks"
    print(len(sentence))
    
  • 11
  • The blank counts as a character.
  • 12
  • Yes, there are 12 characters in the string.

    string-exploration4: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[len(sentence)-5])
    
  • o
  • Take a look at the index calculation again, len(sentence)-5.
  • r
  • Yes, len(sentence) is 12 and 12-5 is 7. Use 7 as index and remember to start counting with 0.
  • s
  • sentence is at index 11
  • Error, len(sentence) is 12 and there is no index 12.
  • You subtract 5 before using the index operator so it will work.

    string-exploration5: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[-3])
    
  • c
  • Yes, 3 characters from the end.
  • k
  • Count backward 3 characters.
  • s
  • When expressed with a negative index the last character s is at index -1.
  • Error, negative indices are illegal.
  • Python does use negative indices to count backward from the end.

1.5. The Slice Operator

A substring of a string is called a slice. Selecting a slice is similar to selecting a character:

The slice operator [n:m] returns the part of the string from the n’th character to the m’th character, including the first but excluding the last. In other words, start with the character at index n and go up to but do not include the character at index m. This behavior may seem counter-intuitive but if you recall the range function, it did not include its end point either.

If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string.

There is no Index Out Of Range exception for a slice. A slice is forgiving and shifts any offending index to something legal.

Note

What do you think fruit[:] means?

1.5.1. Check Your Understanding

    string-exploration6: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[3:8])
    
  • python
  • That would be sentence[0:6].
  • rocks
  • That would be sentence[7:].
  • hon r
  • Yes, start with the character at index 3 and go up to but not include the character at index 8.
  • Error, you cannot have two numbers inside the [ ].
  • This is called slicing, not indexing. It requires a start and an end.

    string-exploration7: What is printed by the following statements?

    sentence = "python rocks"
    print(sentence[7:11] * 3)
    
  • rockrockrock
  • Yes, rock starts at 7 and goes through 10. Repeat it 3 times.
  • rock rock rock
  • Repetition does not add a space.
  • rocksrocksrocks
  • Slicing will not include the character at index 11. Just up to it (10 in this case).
  • Error, you cannot use repetition with slicing.
  • The slice will happen first, then the repetition. So it is ok.

1.6. Traversal and the for Loop: By Item

A lot of computations involve processing a collection one item at a time. For strings this means that we would like to process one character at a time. Often we start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a traversal.

We have previously seen that the for statement can iterate over the items of a sequence (a list of names in the case below).

Recall that the loop variable takes on each value in the sequence of names. The body is performed once for each name. The same was true for the sequence of integers created by the range function.

Since a string is simply a sequence of characters, the for loop iterates over each character automatically.

The loop variable a_char is automatically reassigned each character in the string “Go Spot Go”. We will refer to this type of sequence iteration as iteration by item. Note that it is only possible to process the characters one at a time from left to right.

1.6.1. Check Your Understanding

    string-exploration8: How many times is the word HELLO printed by the following statements?

    s = "python rocks"
    for ch in s:
        print("HELLO")
    
  • 10
  • Iteration by item will process once for each item in the sequence.
  • 11
  • The blank is part of the sequence.
  • 12
  • Yes, there are 12 characters, including the blank.
  • Error, the for statement needs to use the range function.
  • The for statement can iterate over a sequence item by item.

    string-exploration9: How many times is the word HELLO printed by the following statements?

    s = "python rocks"
    for ch in s[3:8]:
        print("HELLO")
    
  • 4
  • Slice returns a sequence that can be iterated over.
  • 5
  • Yes, The blank is part of the sequence returned by slice
  • 6
  • Check the result of s[3:8]. It does not include the item at index 8.
  • Error, the for statement cannot use slice.
  • Slice returns a sequence.

1.7. Practice Problems

Try the following practice problems. You can either work directly in the textbook, or use Thonny. Either way, copy/paste your finished code into Thonny and save your solution into your Computer Science 20 folder when you finish!

1.7.1. Con Cat

Create a program that takes in the name of a cat, then prints out a hello message. For example, if the user types in Garfield, your program could print something like Good to see you, Garfield!. Be sure to use concatenation in your solution!

1.7.2. Duckling Names

In Robert McCloskey’s book Make Way for Ducklings, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack, Pack, and Quack. This code below attempts to output these names in order. Unfortunately, the output is not quite right because Ouack and Quack are misspelled. See if you can fix it without changing the value of either the prefixes or suffix variables! You should do this by adding a conditional statement within the for loop.

1.7.3. First Letter of a Word

Note

The only thing you need to do for this question is to complete the function definition! You do not need to call the function, as that will be done automatically for you.

Create a function with a single parameter word that returns True if the word begins with the letter “t” or “c”.

Examples:

starts_with_tc("thing") True

starts_with_tc("concatenation") True

starts_with_tc("warman") False

1.7.4. Password Length

Note

The only thing you need to do for this question is to complete the function definition! You do not need to call the function, as that will be done automatically for you.

Create a function with a single parameter password that returns True if the password is between 8 and 32 characters (inclusive), and False otherwise. Please note there is much more to a strong password than just the length of the string!

Examples:

password_length("123456789") True

password_length("qwerty") False

password_length("cray-topnotch-tampa-anthem-trial") True

1.7.5. Removing the Start and End

Create a program that takes in three inputs from the user:

  • a word/sentence
  • a starting number
  • an ending number

After taking in the input, the program should print out a portion of the word/sentence that was entered. For example, if the word the user entered was Saskatchewan, the starting number was 3 and the ending number was 2, the program should print katchew. Notice that in the output, it is the same as the word, except that the first 3 characters and last 2 characters have been removed.

To be sure you understand the challenge, consider the following example:

  • word = “unimaginatively”
  • starting_number = 3
  • ending_number = 2
  • output should be maginative
Next Section - 2. String Traversal (by index), and the Accumulator Pattern with Strings