Kotlin Training Program

DOWNLOAD APP

FEEDBACK

Search

Index Of

To search a specific char or substring in a string, we have the indexOf() function :

 fun main() {
    val string = "GrandMaster"
    println(string.indexOf('m', ignoreCase = true)) // Prints 5
    println(string.indexOf("master", ignoreCase = true)) // Prints 5
}
 

indexOf() functions either returns index of first occurrence of the query (if found) or -1 (if not found).

Lets implement these functions on our own.

indexOf(char)

Looking for a character is straight forward. We just have to iterate over the string and check if char is equal to string[index]. If found equal, we return the index. If not found, loop is exited and we return null.

Implementation

 fun String.indexOf(char: Char, ignoreCase: Boolean = false): Int? {
    // Iterate
    forEachIndexed { index, c ->
        // If found, return index
        if (c.equals(char, ignoreCase)) return index
    }

    // If we reach here, then char not found
    return null
}
 

indexOf(string)

Finding a substring requires multiple subsequent comparisons depending on the length of query.

For example, to search for the query ili in the string ability, we have to check entire substrings of length of 3 at each occurence of i in the string i.e. at index range 2..4 and 4..6.

There are multiple ways to approach this problem. Lets understand the simplest but expensive approach first and then keep improving the approach to make it more efficient.

Using windows

The most naive approach is to check query against all substrings of the string. To get all substrings of a string of given length, we can use the windowed() function.

 fun String.indexOfUsingWindowed(query: String, ignoreCase: Boolean = false): Int? {
    windowed(query.length).forEachIndexed { index, substring ->
        if (substring.equals(query, ignoreCase)) return index
    }
    return null
}
 

Lets analyse the number of comparisons made by this approach.

Let n and m be the length of string and query respectively where n≥m. Number of substrings of length m in a string of length n is n-m+1. In the worst case, for each substring we perform m comparisons against the query. In total, we perform (n-m+1) * m = m(n+1) - m^2 comparisons.

Example :

 Input = "Ability", Query = "lit"
No. of substrings of length 3 in Input = 7 - 3 + 1 = 5
Substrings of length 3 = ["Abi", "bil", "ili", "lit", "ity"]
Worst case comparisons = 5 * 3 = 15
 

Can we perform better than this? Of course we can! Instead of comparing each and every substring, its better to compare only the substrings that have the same first character as that of query.

Using substring

Instead of getting all the substrings using windowed() function, we are interested only in those that start with the first character of query.

Algorithm

To find out and compare against those substrings, we follow the algorithm :

Iterate over the string and for each character,

  • If equal to query.first(),
    • find the possible end of query in the string by index + query.length
    • if possibleEnd ≤ string.length, get the substring string[index until possibleEnd] & compare with query
    • if found equal, return index
  • If not found, exit the loop and return null

Example

 Input = "Ability", Query = "ity"

Iterating over the input string :
[0] = 'A' != 'i' -> continue
[1] = 'b' != 'i' -> continue
[2] = 'i' == 'i' -> [2 until 5] = "ili" != "ity" -> continue
[3] = 'l' != 'i' -> continue
[4] = 'i' == 'i' -> [4 until 7] = "ity" == "ity" -> return 4

Number of comparisons = 7 + 3 = 10
 

Implementation

 fun String.indexOfUsingSubstringAlt(query: String, ignoreCase: Boolean = false): Int? {
    forEachIndexed { index, c ->

        // If matches to first char of query
        if (c.equals(query.first(), ignoreCase)) {

            // Evaluate possible end of query in string
            val possibleEnd = index + query.length

            // Check only if in bounds
            if (possibleEnd <= length) {
                if (query.equals(substring(index, possibleEnd), ignoreCase)) {
                    return index
                }
            }
        }
    }

    // If we reach here, then query not found
    return null
}
 

Through this algorithm, we have reduced the number of comparisons. We proceed to find and compare the substring only if first characters match.

The total number of comparisons depend on the number of occurrence of first character of query in string. However in best case, we just need to make n+m comparisons.

There is one more problem though. This approach requires extracting the substring of query length even if only the first characters match and rest of the characters do not. In the above Ability example, extracting out the [2 until 5] substring ili was an overhead because only first characters i.e. i match and rest do not when comparing with query ity. This makes it memory inefficient.

Can we further improve our program? Yes, we can! Looping over the string instead of using windowed() function, we gained control over which substrings to extract and compare. Now we need control over substring matching process, so that we don’t extract the substring.

Using pointers

In this approach, we won’t be using the substring() function because it led us to extract out the entire substring even on partial matches. Instead we will use pointers here.

The string loop variable index can be thought of as a pointer. On each iteration it points to a character in the string. To keep track of which characters to match, we need two pointers (variables) - one for input string i and one for query j. Both will start from 0. i will be incremented on each iteration but j will be incremented based on the comparison result.

Algorithm

  • Initialize pointer j to 0
  • For each character c at index i in input string,
    • if c equals query[j], increment j. One character is matched, we need to match the next character of query on next iteration.
    • else (found unequal) and if j>0, reset j to 0. j>0 indicates we have matched 1 or more characters of query, but now unequal characters are found, so we have to restart matching query.
    • if j==query.length, then return the index (i - query.length + 1). j pointing to query.length indicates all characters of query being matched. We simply return the start of substring as the index.
  • if we exit the loop, return null (not found)

Example

 Input = "Ramanujan", Query = "anuj"

i=0, j=0
'R' != 'a' -> i=1
'a' == 'a' -> i=2, j=1
'm' != 'n' -> i=3, j=0
'a' == 'a' -> i=4, j=1
'n' == 'n' -> i=5, j=2
'u' == 'u' -> i=6, j=3
'j' == 'j' -> i=7, j=4 
j == Query.length -> return (7-4+1) = 4
 

Implementation

 fun String.indexOf(query: String, ignoreCase: Boolean = false): Int? {
    // Pointer for query
    var j = 0

    // Iterate over this string
    forEachIndexed { i, c ->

        // If chars at both pointers are equal (partial match), increment query pointer
        if (c.equals(query[j], ignoreCase)) j++

        // If unequal and already matched >1 chars, restart
        else if (j > 0) j = 0

        // If all chars of query matched, return true
        if (j == query.length) return (i - query.length + 1)
    }

    // If we reach here, then query not found
    return null
}
 

This approach is much better than the previous two. It doesn’t require forming substrings. It simply uses iteration over input string. The best part is - it requires only n comparisons in the worst case (n being length of input string).

Contains

To check whether a string contains a Char or subtring, we have the contains() function :

 fun main() {
    val string = "GrandMaster"
    println(string.contains('m', ignoreCase = true)) // Prints true
    println(string.contains("master", ignoreCase = true)) // Prints true
}
 

Now that we have seen the implementation of indexOf() funtion, similar is that of contains() function. The only difference being, we have to return Boolean instead of index/null. We can reuese the indexOf() function for this.

 // Char search
fun String.contains(char: Char, ignoreCase: Boolean = false): Boolean {
    return indexOf(char, ignoreCase) != null
}

// Substring search
fun String.contains(query: String, ignoreCase: Boolean = false): Boolean {
    return indexOf(query, ignoreCase) != null
}